Cleversafe provides dispersed storage solutions that give infinite scale and cost-effective data storage/protection/access. Apache Hadoop and the CDH4 distribution provide all the required software for implementing MapReduce and the other chores associated with analysis over massive quantities of data. What if these two capabilities could be combined in a smart, well-engineered way? The potential impact on Big Data analysis would be huge.
Cleversafe engineers have been working on this very thing and today are announcing plans to build a dispersed storage solution that can run Hadoop MapReduce. This is way cool.
I had an opportunity to speak with Cleversafe's VP of Product Strategy, Russ Kennedy, and their Director of Federal Solutions, Bobby Caudill, about this announcement. Here is what I took away from the conversation:
- This new ability to combine computation and dispersed storage has several key benefits including enhanced ease of use for enterprise-scale Hadoop systems, highly secure object storage (this method is far more secure than traditional implementations), plus cost-effective scalability.
- Another really cool thing, from my point of view, is that Cleversafe has already been teaming with Lockheed Martin to ensure the Cleversafe Dispersed Compute Storage Solution is ready for the federal market.
- For application developers who build capabilities that interface with Hadoop, little changes. They still write the same code, and it accesses data through an API that requires no change to that code.
- The resulting solution can scale, and it does so with built-in reliability, security and speed.
Here is more from their press release:
Cleversafe First to Deliver Breakthrough Capabilities for Combined Storage and Massive Computation
First System to Support Storage and Analysis of Datasets at Previously Unattainable Scale with Unparalleled Reliability and Efficiency
CHICAGO, July 10, 2012 – Cleversafe Inc., the solution for limitless data storage, today announced plans to build the first Dispersed Compute Storage solution by combining the power of Hadoop MapReduce with Cleversafe’s highly scalable Object-based Dispersed Storage System. This solution will significantly alter the Big Data landscape by decreasing infrastructure costs for separate servers dedicated to analytical processes, reducing required storage capacity, and simultaneously improving data integrity. In addition, the company’s solution will reduce network bottlenecks by bringing together computation and storage at any scale, petabytes to exabytes and beyond.
Traditional storage systems are not designed for large-scale distributed computation and data analysis. Present implementations treat data storage and analysis of that data separately, transferring data from Storage Area Networks (SANs) or Network Attached Storage (NASs) across the network to perform the computations used to gather insight. In this manner the network quickly becomes the bottleneck, making multi-site computation over the WAN particularly challenging. Cleversafe solves this problem by combining Hadoop MapReduce alongside its Dispersed Storage Network (dsNet) system on the same platform and replacing the Hadoop Distributed File System (HDFS) which relies on 3 copies to protect data with Information Dispersal Algorithms thereby significantly improving reliability and allowing analytics at a scale previously unattainable through traditional HDFS configurations.
“For any company, the movement, management and storage of massive data stores for analytical purposes is already unmanageable,” said Chris Gladwin, CEO and President of Cleversafe. “Many companies have had to invest significant resources in both CAPEX and OPEX to manage the challenge of Big Data and to try and capitalize on the opportunity to gather insights from that data,” said Gladwin. “The key to reducing both cost and complexity is to combine computation with dispersed storage,” said Gladwin. “Cleversafe’s solution will provide infinitely scalable, reliable, and cost effective storage for data to support massive computation while enhancing the analysis workflow.”
Hadoop MapReduce, which is already being used broadly throughout the industry, represents only a partial solution to this problem. While it lends itself naturally to enabling computations where the data exists rather than transferring data to computation nodes, it has inherent scalability and reliability limitations. Current HDFS deployments utilize a single server for all metadata operations and 3 copies of the data for protection. Failure of the single metadata node could render stored data inaccessible or result in a permanent loss of data. Maintaining 3 copies of data at massive scale for protection leads to skyrocketing overhead and management costs.
Cleversafe’s dsNet system protects both data and metadata equally and is inherently more reliable. By applying the company’s unique Information Dispersal technology to slice and disperse data, single points of failure are eliminated. As data is distributed evenly across all Slicestor nodes metadata can scale linearly and infinitely as new nodes are added, thus reducing any scalability bottlenecks and increasing performance. Cleversafe’s unique approach delivers the powerful combination of analytics and storage in a geographically distributed single system allowing organizations to efficiently scale their Big Data environments to hundreds of petabytes and even exabytes today.
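To make the comparison between 3-copy replication and information dispersal concrete, here is a minimal Python sketch of the storage arithmetic only. It does not implement an Information Dispersal Algorithm, and the slice parameters used (a width of 16 slices with any 10 sufficient to rebuild the data) are hypothetical values chosen for illustration, not figures from Cleversafe or the release.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def dispersal_overhead(width: int, threshold: int) -> float:
    """Raw bytes stored per byte of user data for a threshold-of-width scheme.

    Data is cut into `width` slices, any `threshold` of which can rebuild it,
    so each slice holds roughly 1/threshold of the original data.
    """
    if not 0 < threshold <= width:
        raise ValueError("need 0 < threshold <= width")
    return width / threshold


def tolerated_losses_replication(copies: int) -> int:
    """Copies that can be lost while one readable copy still remains."""
    return copies - 1


def tolerated_losses_dispersal(width: int, threshold: int) -> int:
    """Slices that can be lost while `threshold` slices still remain."""
    return width - threshold


if __name__ == "__main__":
    # HDFS-style protection as described above: 3 full copies.
    print("replication x3:", replication_overhead(3),
          "raw bytes per byte, survives",
          tolerated_losses_replication(3), "lost copies")

    # Hypothetical dispersal parameters (assumption, not from the release).
    width, threshold = 16, 10
    print("dispersal 10-of-16:", dispersal_overhead(width, threshold),
          "raw bytes per byte, survives",
          tolerated_losses_dispersal(width, threshold), "lost slices")
```

Under these assumed parameters, dispersal stores less raw data per byte than three full copies while tolerating more simultaneous losses; real deployments would choose their own width and threshold.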
“There isn’t an industry today that’s untouched by Big Data or a company that wouldn’t benefit from the intrinsic value of that data if they could collect, organize, store and analyze it in a cost-effective manner,” said John Webster, Senior Partner at Evaluator Group. “Cleversafe’s approach to combining dispersed storage and Hadoop for analytics is a groundbreaking step for the industry and for any company to effectively bridge storage and large-scale computation,” said Webster.
No market segment has a more critical need to harness Big Data than the Government sector. Lockheed Martin is partnering with Cleversafe to develop a federal version of the Cleversafe Dispersed Compute Storage solution designed for the unique needs of federal government agencies.
“By combining the power of Hadoop analytics with Cleversafe’s Object-based Dispersed Storage solution, government entities will be able to significantly reduce their total cost of infrastructure as the amount of their mission critical data grows,” said Tom Gordon, CTO & VP of Engineering of Lockheed Martin’s Information Systems and Global Solutions-National. “The Federal community has been out in front of Big Data, well ahead of many other market segments, and needs technology solutions today that are well suited for Exabyte scale storage as well as massive computation,” said Gordon. “Taken Cleversafe’s approach with Hadoop across commodity hardware, these features deliver a new approach to bring the true potential of Big Data analytics into reach.”
Cleversafe’s object-based storage solution is 100 million times more reliable than traditional RAID-based systems and it doesn’t rely on replication to protect information. Its information dispersal capabilities reduce storage costs up to 90 percent while meeting compliance requirements and ensuring protection against data loss, whether it’s latent hardware errors, data corruption or malicious threats. With the combination of limitless scale, highly reliable storage and efficient analytics in the same platform, Cleversafe is solving the most challenging Big Data problems for customers in a very efficient manner.
Tweet this: @Cleversafe to build first storage-based compute solution based on its dsNet solution and Hadoop MapReduce.
About Cleversafe Inc.
Cleversafe has created a breakthrough technology that solves petabyte and beyond big data storage problems. This solution drives up to 90 percent of the storage cost out of the business while enabling secure and reliable global access and collaboration. The world's largest data repositories rely on Cleversafe. To learn more about Cleversafe and its solutions, please visit www.cleversafe.com, call 312-423-6640 or email us at email@example.com.
Cleversafe Products: http://www.cleversafe.com/products/flexibility-of-choice
Cleversafe's Technology - How it Works: http://www.cleversafe.com/overview/how-cleversafe-works
Cleversafe on Twitter: http://twitter.com/Cleversafe
Lois Paul & Partners, for Cleversafe