When do you pick HBase instead of MySQL?

Facebook works its data magic at scales others only dream of, and it does so for over 600,000,000 people, in real time (see: Facebook’s New Real-Time Messaging System: HBase To Store 135+ Billion Messages A Month). Cade Metz just wrote a piece diving deeper into this at The Register, titled “HBase: Shops swap MySQL for open source Google mimic.” This great reporting underscores something we already knew: Facebook is a pioneer in fast, real-time read/write access to big data. It also underscores that when Facebook makes a move like swapping MySQL for HBase, it is yet another reason to study what is going on here. This is especially important since Facebook is not the only firm swapping out MySQL for HBase.

So here is a bit more on HBase:

HBase is an Apache Software Foundation project. Here is more from Apache.org:

HBase is an open-source, distributed, versioned, column-oriented store modeled after Google’s Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase includes:

Here is more from the Cade Metz article:

HBase is part of the Apache Hadoop project, a sweeping effort to mimic Google’s proprietary infrastructure with open source code. It dovetails with HDFS, the Hadoop distributed file system, and Hadoop MapReduce, the distributed number-crunching platform. HBase is essentially a low-latency layer that sits atop HDFS, letting you rapidly store and retrieve data. It’s fashioned after Google’s BigTable platform, which Mountain View publicly described in a 2006 research paper.
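To make "versioned, column-oriented" a little more concrete, here is a toy in-memory sketch in Python. It is not the HBase API and it is not distributed; the class name `ToyColumnStore`, its methods, and the sample row and column names are all made up for illustration. It only shows the shape of the data model: each row key maps to columns, and each column keeps several timestamped versions of a value.

```python
from collections import defaultdict


class ToyColumnStore:
    """Hypothetical toy model: row key -> column -> [(timestamp, value), ...]."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, ts):
        # Keep every version of a cell, newest first.
        cells = self._rows[row][column]
        cells.append((ts, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row, column, versions=1):
        # Return up to `versions` values for one cell, newest first.
        cells = self._rows.get(row, {}).get(column, [])
        return [value for _, value in cells[:versions]]

    def scan(self, start_row, stop_row):
        # Walk rows in sorted key order, yielding the newest value per column.
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                latest = {col: cells[0][1] for col, cells in self._rows[row].items()}
                yield row, latest


store = ToyColumnStore()
store.put("user42", "msg:body", "hello", ts=1)
store.put("user42", "msg:body", "hello again", ts=2)
newest = store.get("user42", "msg:body")
history = store.get("user42", "msg:body", versions=2)
```

Note what is missing compared with a relational table: there is no fixed schema, no joins, and lookups go through the row key. That trade is part of what lets stores of this style spread rows across many machines.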

Now back to the title of this post. How do you know if you should pick MySQL or HBase for a solution?

If you are designing systems to operate at huge scale, or integrating with other Hadoop-related projects, select HBase. HBase, together with Hadoop and related capabilities, also supports analysis at scale. So if your goal is making sense of big data, pick HBase.

MySQL is not going away, but it will remain optimized for traditional RDBMS workloads. If you can comfortably store your data in tables with rows and columns, and you don’t have that much of it or don’t need fast analysis over it, MySQL may be your best pick. MySQL is widely used, highly reliable, and well understood. So if your future growth and business model indicate you will never run into scale problems or have challenges conducting analysis over your data, MySQL may well be the best choice.
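The rule of thumb above can be written down as a tiny Python sketch. The `Workload` fields and the `pick_store` function are hypothetical names invented here, and the yes/no inputs are a deliberate simplification: the post gives no numeric thresholds, so none are assumed.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical yes/no answers to the questions raised in this post."""
    huge_scale: bool          # Will the system operate at huge scale?
    hadoop_integration: bool  # Must it integrate with other Hadoop projects?
    analysis_at_scale: bool   # Do you need analysis over big data?


def pick_store(w: Workload) -> str:
    # Any one of the three conditions tips the choice toward HBase;
    # otherwise the traditional RDBMS remains the default.
    if w.huge_scale or w.hadoop_integration or w.analysis_at_scale:
        return "HBase"
    return "MySQL"


small_shop = pick_store(Workload(False, False, False))
messaging = pick_store(Workload(True, False, False))
```

Real decisions involve more than three booleans, of course, but the sketch captures the default: start from MySQL and move to HBase only when scale, Hadoop integration, or large-scale analysis demands it.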


About Bob Gourley

Bob Gourley is the publisher of CTOvision.com and DelphiBrief.com and the new analysis-focused Analyst One. Bob's background is as an all-source intelligence analyst and an enterprise CTO. Find him on Twitter at @BobGourley

Comments

  1. Mat Keep says:

    Facebook are still massive MySQL users. For the particular application above they used HBase, but their core data management is run with MySQL, and that isn't going to change any time soon, however much articles like this suggest it is. Take a look at their recent stats and webcast on how they use MySQL: http://highscalability.com/blog/2010/11/4/faceboo

  2. Bob,

    An interesting follow up to this would be some insight on when an organization should be considering an OLTP database (MySQL) or alternative (HBase) vs when you would use an OLAP or analytic database (Aster Data) vs when you would use an ELT Batch processing platform (Hadoop). I think there is a lot of misconception in the market about when to pick which solution.

    You may also want to check out some of the recent posts by Curt Monash over on DBMS2. He's recently been posting about requirements for a good data analytic system.

  3. Thanks Ian, very good point. I need to get writing on that.

    I'll see you at the Big Data event tomorrow. I have high hopes for that event, and would appreciate your thoughts on what the federal enterprises should be thinking through after the event.

    Bob

