A Timeline of Big Data Analytics

The world seems to run on big data nowadays. In fact, it’s sometimes difficult to remember a time when businesses weren’t intensely focused on big data analytics. It’s equally easy to forget that big data is still relatively new to the mainstream. The amount of data being generated in the world today is immense, and it’s only getting bigger. Some experts predict that by the year 2020, the volume of digital data in the world will reach as high as 40 trillion gigabytes (40 zettabytes). Though much attention is placed on what the future holds in the big data realm, it’s sometimes just as important to take a step back and see how far we’ve come in such a short time. Consider the following a brief timeline of the big data analytics phenomenon.

It’s difficult to pinpoint a precise moment when the concept of “big data” was officially conceived. Data, after all, has played a key role in scientific pursuits for centuries if not millennia, but big data is a different beast. Most scholars point to 1944 as the year the seed for the idea was planted. That year, Fremont Rider of Wesleyan University published The Scholar and the Future of the Research Library, in which he examined the exponential growth of university libraries across the country. His main concern was how much physical space would be needed as the amount of information collected continued to increase.

The mid-1950s were a time when data was first being put to use for analytics purposes. Small teams at various organizations worked with data to develop predictive analytics and reporting. They relied on structured internal data, and their findings helped organizations make decisions, not unlike what many businesses do now.

Over the next decade, researchers discussed the need for better storage media as the amount of data generated continued to expand. Greater emphasis was also placed on the need to transmit data more quickly, and computers were seen as instrumental to this effort. In 1970, Edgar F. Codd of IBM presented the first framework for a relational database. This was the foundation upon which many databases were built, and it’s a framework that many data services still use or reference.

Through all this, no one really spoke in terms of “big data”; one of the earliest known uses of the term came in 1989 from author Erik Larson. Around the same time, business intelligence was growing in popularity among enterprises, spurred by the greater use of computers in the workplace along with new software capabilities.

1996 is often cited as a historic year, as it is believed to be the point at which digital storage became more cost-effective than paper for storing data. With this monumental shift, more companies could switch to digital files, allowing analytics to take hold much more quickly. Equally influential was the growth of the internet, which multiplied data generation at a rate that only seems to increase each year.

In 1999, the term “big data” as we understand it today first appeared in a paper in the Communications of the ACM. The paper examined the growing volume of data along with the importance of analyzing it. As the authors emphasized, the focus of big data should be on finding hidden insights, not simply on how much data exists.

In 2000, researchers released the first comprehensive study to quantify how much digital data existed in the world and how fast it was growing. This helped scientists and businesses get a better grasp of data consumption and generation. In 2001, analyst Doug Laney (then at META Group, which Gartner later acquired) released a paper on data management that described what would later be considered the three defining characteristics of big data: volume, velocity, and variety. 2005 was the year Hadoop was created, a framework designed for storing and analyzing big data. The mid-2000s also saw a rise in unstructured data generation, which led to the establishment of the data scientist position.

By 2007, the term “big data” reached mainstream audiences through Wired magazine. Two years after that, research from McKinsey found that companies with more than 1,000 employees were on average storing more than 200 terabytes of digital data. Big data had truly become a popular concept. McKinsey also predicted in 2011 that there would be a major demand for data scientists by 2018 but not enough supply, a prediction that is coming true.

The future is bright for big data as well. Concepts like the Internet of Things, cloud spot instances, and more have opened up new possibilities for how to use big data. Analytics has come a long way, but there’s plenty more to uncover in the years and decades ahead.

Rick Delgado

Rick Delgado is a freelance technology writer and commentator.
