You are a SPECIAL Librarian! You are a Military Librarian!

Tag Archive | "big data"

Predictive Modeling With Big Data: Is Bigger Really Better?

Source: Big Data

With the increasingly widespread collection and processing of “big data,” there is natural interest in using these data assets to improve decision making. One of the best understood ways to use data to improve decision making is via predictive analytics. An important, open question is: to what extent do larger data actually lead to better predictive models? In this article we empirically demonstrate that when predictive models are built from sparse, fine-grained data—such as data on low-level human behavior—we continue to see marginal increases in predictive performance even to very large scale. The empirical results are based on data drawn from nine different predictive modeling applications, from book reviews to banking transactions. This study provides a clear illustration that larger data indeed can be more valuable assets for predictive analytics. This implies that institutions with larger data assets—plus the skill to take advantage of them—potentially can obtain substantial competitive advantage over institutions without such access or skill. Moreover, the results suggest that it is worthwhile for companies with access to such fine-grained data, in the context of a key predictive task, to gather both more data instances and more possible data features. As an additional contribution, we introduce an implementation of the multivariate Bernoulli Naïve Bayes algorithm that can scale to massive, sparse data.

Posted in Links of Interest

Occupational Outlook Quarterly: Working with big data

Working with big data
Source: Bureau of Labor Statistics (Occupational Outlook Quarterly)

This year, 2013, is The International Year of Statistics. It’s a designation intended to highlight the role that data and statistical analysis have in society. To further that goal, this article describes work with big data. The first section outlines what big data is. The second section provides an overview of big data work. The third section explains some of the challenges that big data work entails. The fourth section describes how to prepare for this work. Sources of information are provided at the end.

Posted in Links of Interest

Big Data, H.P. Lovecraft, and common sense


“The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.”  — H.P. Lovecraft, “The Call of Cthulhu,” Weird Tales, 11, No. 2 (February 1928), 159–78, 287.

While H.P. Lovecraft obviously could not have known about big data, this quote is very relevant.  Alexis Madrigal of Nextgov’s Big Data blog thinks so as well; that is where I first saw this quote linked to big data.  Many things about big data and how it can be used are very scary.  We have seen recently that it is actually difficult to limit how much data you end up collecting, because it is so hard to separate what you need from what is found (e.g., NSA and FISA).  It is important not to blame the technology.  The problem is its application and use by human beings.

There are many “Big Benefits” from the use and application of “Big Data.”  Look at the growth and maturation of the field of bioinformatics and its use in medicine.  Sequencing of the human genome is an application of big data.  Genomics will change how we are treated for disease.

Big Data will help us in the battle to overcome global warming.  Increasingly accurate weather forecasts and improved computer models of the effects of global warming are all applications of big data.

Big data is now showing up in all the hard sciences and in the “soft” sciences, such as the social sciences.  It is impossible to get away from it.

All of us in special libraries, especially in business, technical and research libraries, have seen our jobs change because of interest in big data and because many of us are directly involved in the exploration, analysis, and manipulation of big data sets. 

A bit of common sense will help us avoid the fate suggested by H.P. Lovecraft: to “go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.”  As with all technologies, big data is not in itself good or bad.  What matters is how it is used.  As librarians we can help steer its use in positive directions.

Note: These are my own opinions and not the opinions of SLA, Military Libraries Division of SLA, my employer, or the U.S. Air Force or DoD.  — Bill Drew

Posted in Guest Posts

Big Data – a working definition

Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.

Edd Dumbill. Big Data. March 2013, 1(1): 1-2. doi:10.1089/big.2012.1503.

Edd Dumbill
Editor-in-Chief
Big Data

Posted in Web/Tech

#SLAtalk: Let’s Get Technical

Today’s Twitter chat: “Let’s Get Technical: Providing Access to the Changing Nature of Information” will examine such topics as Open Access, Big Data, evolving trends in discovering and sharing information, and credibility assessment.

We are pleased to have IET as our vendor-partner co-host lending their expertise to these key topics.

The full details are below. We hope you can join!

—————————————————–

#SLAtalk: Let’s Get Technical

September’s #SLAtalk will feature your ideas on how information professionals will effectively provide access to the increasingly changing nature of information.

We’re pleased to be joined by IET (the Institution of Engineering and Technology), who will be co-hosting and participating in #SLAtalk: Let’s Get Technical. IET is a valued exhibitor and sponsor of SLA events, so let’s show them a warm welcome to the #SLAtalk world! (follow their new handle: @IETInnovates)

When:

Tuesday, September 10th, 17:30 UTC (13:30 EDT / 10:30 PDT)

 

Questions to answer:

1. Open Access: How is open access influencing and revolutionizing the way users access information? (first 15 minutes)
2. Analytical Skills: How do you as an information professional assess the credibility of new information and/or resources? What trends do you see in information needs? (second 15 minutes)
3. Information Discovery Trends: How do student researchers or young professionals discover, use and share information? Are there telling differences between new & veteran researchers? (third 15 minutes)
4. Big Data: Do you play a role in sharing, organizing, and analyzing ‘Big Data’? Do you think it will live up to its hype in the coming years? (last 15 minutes)

 

Prepare with some light reading:

New here?

Check out How to #SLAtalk and the latest #SLAtalk Roundups.

Posted in News
