Commercial corporations in the Internet Age face an ever-growing data management challenge – but traditional business technology isn’t the answer, argues Neo Technology’s Emil Eifrem
Digital consumers are generating data at an exponential rate, via social networking, emails, blogs and smartphones. One telling metric: trend spotter Mary Meeker has found that we now generate 2.5 quintillion bytes of new data globally every day (a quintillion is 1 followed by 18 zeroes).
By far the majority of this is semi-structured or unstructured data. How are we going to manage this new data?
As the volume, velocity and variety of the data in our interconnected world increase, a new breed of database, generally referred to as NoSQL (Not only SQL), has emerged that’s creating a post-relational database landscape. The NoSQL family includes the key-value store, the column family database, the document database and the graph database. Hadoop, geared towards large-scale batch analytics, has also emerged from this approach.
Each variant offers particular strengths, but what links them all is their ability to process large volumes of data. None of this makes the RDBMS obsolete: it still has an important role to play in an organizational context, as do even earlier database formats like flat-file/network (CICS, etc.). Relational databases are also still great for managing the transactional and analytical processing requirements associated with heterogeneous datasets like CRM or HR data.
But those 2.5 quintillion-plus bytes have to be addressed – and, more pressingly, put to work helping leaders better manage organizations, people and resources. Firms need to be able to connect the dots to finally create what traditional enterprise Business Intelligence (BI) has long been striving for: the ‘360-degree’ view of the customer – or now, the digital consumer.
A variety of use cases
That’s not to say CIOs haven’t already been taking advantage of NoSQL. For example, graph databases were used by web giants like Google, which exploited the connections between web documents to rank search results as part of its proprietary algorithm.
These early custom-built graph technologies were available only in-house. The good news is that equivalent tools are now available to the wider marketplace, as graph databases have won support in the developer and open source communities. As a result, a growing number of enterprises are turning to graph databases to power highly personalized product and service recommendations using huge volumes of data, all in real time.
For example, there are telecom providers diagnosing network issues and enterprises re-imagining their master data management as well as their identity and access models – all with graph databases. There’s also a lot of work taking place around fraud detection. Increasingly, Fortune 500 firms are starting to view graph databases as the best way to model, store and query data.
That’s because graph databases are so good at finding connections between people, places or things – relationships – making them a natural fit for many business problems. Even better, understanding the connections between data and the meaning of those links doesn’t require new data: you can pull new insights from the data you already hold by reframing the problem and looking at it in a graph database.
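To make the idea of querying relationships concrete, here is a minimal sketch in plain Python. It is not a graph database and not any vendor’s API: the `purchases` data, the customer names and the `recommend` function are all hypothetical, invented purely for illustration. It treats customers and products as nodes, purchases as the links between them, and suggests products by following those links outward from one customer.

```python
from collections import Counter

# Hypothetical data: each customer node is linked to the product nodes they bought.
purchases = {
    "alice": {"laptop", "mouse", "monitor"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"laptop", "keyboard", "webcam"},
    "dave": {"phone"},
}


def recommend(customer, purchases, top_n=3):
    """Suggest products bought by customers who share a purchase with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        shared = owned & items
        if not shared:
            continue  # no link between the two customers, so ignore
        # Weight each unseen product by how strongly the two customers overlap.
        for product in items - owned:
            scores[product] += len(shared)
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:top_n]]


print(recommend("alice", purchases))
```

No new data is collected here: the suggestions come entirely from relationships already present in the purchase records, which is the reframing described above. A real graph database would express the same traversal as a query over stored nodes and relationships rather than as a loop over an in-memory dictionary.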
Healthcare, media and government are likely to be the three biggest new users of graph databases. Networks are inherent to healthcare, after all, thanks to the relationships between doctors and their many patients, which involve large, complicated and related datasets. Disease research is a particularly good fit for graph databases as well.
Media data, meanwhile, has a complex structure in which no single individual or asset exists in isolation, a level of interconnectedness that also fits perfectly with graphs. In the public sector, the international security community is a prime growth sector for graph databases, while politics involves lots of networks, from donors to voters, that graph databases are great at mapping.
There’s clearly enormous market growth taking place. Forrester Research estimates that one in four enterprises will be using such technology by 2017 while Gartner reports 70% of leading companies will pilot a graph database project of some significance by 2018.
Graph databases aren’t applicable or helpful for all problems; there are transactional and analytical processing needs for which relational technology will probably always be the correct option, and there are NoSQL database alternatives that handle other types of large dataset well.
But graphs make sense for any organisation seeking to make the most of its connected data. That is why I would recommend that any CIO looks to NoSQL, including graph databases, as a powerful new tool to supplement their RDBMS investment and deal with the growing data tsunami.