Hadoop World was held 15-17 Oct in NYC. This post provides insights into some of the best technologies demonstrated there.
CTOvision has attended Hadoop World since the beginning. Attending has helped us better track tech trends and assess the potential business impacts of some of the greatest technologies created for the enterprise. It is also a great place to interact with business-focused architects, engineers, planners and of course CTOs.
One of the greatest things about the Strata Conference and Hadoop World is the ability to discover new technologies. It is also a great way to get updates on well-known capability providers who continue to enhance their offerings. The things we learn will drive our assessments here at CTOvision and help us continue to create business-focused analysis for our readers.
By our count there were over 135 technology vendors on the expo floor. It would be great to spend a full hour with every vendor. Let me see: 135 vendors at one hour each would be over three weeks of work, assuming 40-hour work weeks. OK, that does not scale! With this post we can reduce those 135 hours to a few minutes. Please look this over and give us your thoughts. With that, we present:
The CTOvision Must See Tech List for Hadoop World
Category One: Enterprise Data Hub Providers
- Cloudera: Delivering a framework of capabilities that enable the enterprise data hub concept.
- IBM: Big and expensive but clearly they perform. Ensure you benchmark what they can do.
- EMC: From roots in storage they now provide analytics.
- Oracle: With both software and hardware they can deliver well engineered solutions.
Category Two: Infrastructure Management and Data Tools
- Cloudera: CDH is 100% open source with great management tools. Cloudera Manager adds more functionality.
- Databricks: Mastery over Apache Spark
- MongoDB: New style data storage, retrieval and analysis including document focus.
Category Three: Analytics
- ClearStory: Good combination of known data with your holdings
- Platfora: Great focus on users, plus a terrific back end and an incredibly fast ability to iterate on data
- Pentaho: Open source platform focused on business analytics.
- NGRAIN: 3D interactive and augmented reality technologies.
- Revolution Analytics: Everyone knows R. Revolution makes R ready for the enterprise.
- Skytree: Machine Learning platform
Category Four: Adjacencies
- Intel: Chips with capabilities to accelerate analytics and secure big data
- MemSQL: Distributed database for real-time analytics.
- Cisco: Smart data movement
- Mellanox: Supplying end to end InfiniBand and Ethernet interconnect.
Category Five: Consulting, Training, Integrating, Teaching
- CSC: Proven past performance
- Caserta: Tech innovation consulting
- Koverse: Awesome team delivering a platform that separates signal from noise.
- Syracuse University iSchool: Curriculum available online.
- Texas A&M University: Focus on analytics including MS in analytics.
It is also very important that you speed read the entire list of sponsors and firms on the expo floor. I could have left out of my assessment someone who has just what your enterprise mission needs. The entire list is below:
The Technologies of Hadoop World
Cloudera is revolutionizing enterprise data management with the first unified Platform for Big Data: The Enterprise Data Hub. Cloudera offers enterprises one place to store, process and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Founded in 2008, Cloudera was the first and still is the leading provider and supporter of Hadoop for the enterprise. Cloudera also offers software for business critical data challenges including storage, access, management, analysis, security and search. Cloudera works with over 800 hardware, software and services partners to meet customers’ big data goals.
MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease-of-use, and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified Big Data platform. MapR is used across financial services, retail, media, healthcare, manufacturing, telecommunications and government organizations as well as by leading Fortune 100 and Web 2.0 companies. Amazon, Cisco and Google are part of MapR’s broad partner ecosystem. Investors include Lightspeed Venture Partners, Mayfield Fund, NEA and Redpoint Ventures. Connect with MapR on Facebook, LinkedIn and Twitter.
Microsoft believes anyone should be able to get insights from Big Data. So, we bring the power of the cloud to Big Data making it easier than ever to work with all data types. With Microsoft data solutions, everyone can bring Big Data business insights to life through advanced analytics and stunning visualizations – all powered by our enterprise-grade, flexible and open cloud. Explore our solutions at www.microsoft.com/bigdata.
ClearStory Data provides the first Data Intelligence solution that delivers a fast and easy way to access, discover, harmonize, interactively analyze, and collaboratively explore data, from diverse internal and external data sources. ClearStory’s solution is an integrated Application and Platform that radically changes how people consume data from corporate and external sources, to accelerate the pace of informed and intelligent decision-making. Anyone and any organization can use ClearStory to speed the constant cycle of question to data-driven answers. The company is backed by investments from Andreessen Horowitz, Google Ventures, Kleiner Perkins Caufield & Byers, Khosla Ventures, DAG Ventures and Silicon Valley industry leaders.
IBM offers enterprise ready Hadoop for managing big data and extends it with innovative features, including built-in analytics, visualization and security. InfoSphere BigInsights takes the complexity out of big data and is an integral part of IBM Watson Foundations, a big data and analytics platform designed to complement your existing information infrastructure. This means you can get started quickly today and confidently expand to address more complex problems tomorrow.
Intel, the world leader in silicon innovation, delivers hardware and software technologies to continually advance how people work and live. For over two decades, Intel’s contributions to open-source projects, from one end of the solution stack to the other, have helped ensure that a breadth of solutions run exceptionally well on Intel® architecture. As a result, open-source-based solutions, running on Intel® architecture, help unlock business opportunities, power businesses, connect people, and enhance lives. Open source is bringing amazing experiences to life, and Intel is helping power these experiences as a Sponsor of Tomorrow.
Platfora is the #1 native Big Data Analytics platform for Hadoop. Platfora puts big data directly into the hands of line-of-business people through self-service analytics that help them uncover new opportunities that were once impossible or impractical across transaction, customer interaction and machine data. An interactive and visual full-stack platform delivered as subscription software in the cloud or on-premises, Platfora Big Data Analytics is creating data-driven competitive advantages in the areas of security, marketing, finance, operations and the Internet of Things. Leading organizations such as Citi, Comcast, DirecTV, Disney, Edmunds.com, Opower, Riot Games, Vivint and The Washington Post use Platfora.
SAP helps companies of all sizes and industries run better. From back office to boardroom, warehouse to storefront, desktop to mobile device, SAP empowers people and organizations to work together more efficiently and use business insight more effectively to stay ahead of the competition. We do this by extending the availability of software across on-premise installations, on-demand deployments, and mobile devices. We believe that the power of our people, products, and partners unleashes growth and creates significant new value for our customers, SAP, and, ultimately, entire industries and the economy at large. Our mission is to help companies of all sizes and industries to run better. Our vision is to help the world run better. www.sap.com
The leader in business analytics software and services, SAS is known for analyzing big data. But did you know that SAS provides everything you need to derive insights from data stored in Hadoop? With SAS®, you get simpler data preparation processes, deeper analyses and faster answers. Create seamless access to the Pig and Hive languages and the MapReduce framework. Explore and visualize data stored in Hadoop to discover patterns and publish reports. Model data with domain-specific, high-performance analytics. Automatically deploy and execute models to score data inside Hadoop. From data to decision, we’ve got you covered. Learn more at sas.com/hadoop.
Cisco is the worldwide IT leader that enables amazing things to happen when you connect the previously unconnected. Cisco brings together people, process, data and things to transform how organizations meet next-generation demands. Our customers rely on the power of Cisco intelligent servers and networks to be competitive. Our collective strengths enable us to solve our customers’ important business challenges. Cisco’s true value isn’t just in what we make; it’s what we make possible. www.cisco.com/go/bigdata
Pentaho is building the future of business analytics. Pentaho’s open source heritage drives our continued innovation in a modern, integrated, embeddable platform built for accessing all data sources. With support for all of the leading Hadoop distributions, NoSQL databases and high performance analytic databases, Pentaho provides the broadest support for big data analytics, as well as integration and orchestration of big data and traditional sources. For more information visit pentahobigdata.com or call +1 866-660-7555.
WANdisco (LSE: WAND) is a provider of enterprise-ready, non-stop software solutions that enable globally distributed organizations to meet today’s data challenges of secure storage, scalability and availability. WANdisco’s products are differentiated by the company’s patented, active-active data replication technology, serving crucial high availability (HA) requirements, including Hadoop Big Data and Application Lifecycle Management (ALM), including Apache Subversion and Git. Fortune Global 1000 companies, including Juniper Networks, Motorola, and Halliburton, rely on WANdisco for performance, reliability, security and availability. For additional information, please visit www.wandisco.com.
Actian transforms Big Data into business value for any organization – not just the privileged few. Actian provides transformational business value by delivering actionable insights into new sources of revenue, business opportunities, and ways of mitigating risk with high-performance in-database analytics complemented with extensive connectivity and data preparation. The 21st century software architecture of the Actian Analytics Platform delivers extreme performance on off-the-shelf hardware, overcoming key technical and economic barriers to broad adoption of Big Data. Actian also makes Hadoop enterprise-grade by providing high-performance data enrichment, visual design and SQL analytics on Hadoop without the need for MapReduce skills. Among tens of thousands of organizations using Actian are innovators using analytics for competitive advantage in industries like financial services, telecommunications, digital media, healthcare and retail. The company is headquartered in Silicon Valley and has offices worldwide. Stay connected with Actian Corporation at www.actian.com or on Facebook, Twitter and LinkedIn.
Informatica is the world’s number one independent provider of data integration software. Organizations rely on Informatica to realize their information potential and drive their top business imperatives. Informatica Vibe, the industry’s first and only embeddable virtual data machine (VDM), powers the unique “Map Once. Deploy Anywhere” capabilities of the Informatica Platform. Vibe harnesses data in every application, every process, for every person and in every device in the world. Organizations around the globe depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on-premise, in the Cloud and across social networks.
Founded in 1989, MicroStrategy (Nasdaq: MSTR) is a leading worldwide provider of enterprise software platforms. The Company’s mission is to provide the most flexible, powerful, scalable and user-friendly platforms for analytics, mobile, identity and loyalty, offered either on premises or in the cloud. The MicroStrategy Analytics Platform™ enables leading organizations to analyze vast amounts of data and distribute actionable business insight throughout the enterprise. The MicroStrategy Mobile App Platform™ lets organizations rapidly build information-rich applications that combine multimedia, transactions, analytics, and custom workflows. To learn more, visit www.microstrategy.com and follow us on Facebook (www.facebook.com/microstrategy) and Twitter (www.twitter.com/microstrategy).
Oracle provides the world’s most complete, open, and integrated business software and hardware systems representing a variety of sizes and industries in more than 145 countries around the globe. Big data is revolutionizing the way businesses and government operate virtually overnight. As you explore how to leverage the power of big data in your business, let the experts at Oracle show you how to maximize value from big data and transform your world. Oracle and big data — transform your business with big data today. Learn about our Big Data Solutions at http://www.oracle.com/us/technologies/big-data/index.html
Every data analyst will admit they spend more time cleaning, merging and shaping data than on discovery or analytics. While traditional ETL processes are fine for data that is constant and controlled, 50% or more of the information being used by the business comes from ever-changing data, derived from a set of highly dynamic sources. Paxata is the first Adaptive Data Preparation™ platform built from the ground up to address IT and business requirements for data integration, quality, enrichment, collaboration and governance. Paxata makes it possible for every analyst to rapidly get ALL the data they need ready for analytics in minutes…not hours or days. Paxata is a cloud-based, self-service solution powered by Intellifusion™, the proprietary semantic fusion and machine learning engine which proactively and automatically detects data types, relationships, patterns, anomalies and errors, via a highly interactive visual experience. Built on the Hadoop stack, and in partnership with Cloudera, Paxata delivers data prep @scale with a seamless connection to any BI tool like Tableau, Qlik and Excel, so analysts have total flexibility to use the visualization and discovery solutions they prefer. Stop by Booth No 230 to see Paxata in action. www.paxata.com, @paxata
Splunk is a provider of the industry leading software platform for real-time operational intelligence. Splunk® Enterprise collects, indexes and harnesses machine-generated big data from the websites, applications, servers, networks, sensors and mobile devices that power a business. More than 6,400 customers in 90+ countries use Splunk software to gain Operational Intelligence to deepen business and customer understanding, improve service and uptime, reduce costs and mitigate cybersecurity risks. Splunk Cloud™ delivers Splunk Enterprise as a cloud service. Hunk™: Splunk Analytics for Hadoop is a fully integrated analytics platform for Hadoop to interactively explore, analyze and visualize data in Hadoop.
Syncsort provides fast, secure, enterprise-grade software spanning Big Data solutions in Hadoop to Big Iron on mainframes. We help customers around the world to collect, process and distribute more data in less time, with fewer resources and lower costs. 87 of the Fortune 100 companies are Syncsort customers, and Syncsort’s products are used in more than 85 countries to offload expensive and inefficient legacy data workloads, speed data warehouse and mainframe processing, and optimize cloud data integration. Experience Syncsort at www.syncsort.com.
Teradata is a global leader in analytic data platforms, marketing and analytic applications, and consulting services. Teradata helps organizations collect, integrate, and analyze all of their data so they can know more about their customers and business and do more of what’s really important. With 10,000+ professionals in 77 countries, Teradata serves more than 2,500 customers, including the top companies across all major industries: consumer goods, financial services, healthcare, automotive, communications, travel, hospitality, and more. An ethical and future-focused company, Teradata is recognized by the business media and industry analysts for technological excellence, sustainability, and business value. Visit teradata.com for details.
Accenture is a global management consulting, technology services and outsourcing company, with approximately 289,000 people serving clients in more than 120 countries. Combining unparalleled experience, comprehensive capabilities across all industries and business functions, and extensive research on the world’s most successful companies, Accenture collaborates with clients to help them become high-performance businesses and governments. The company generated net revenues of US$28.6 billion for the fiscal year ended Aug. 31, 2013. Its home page is www.accenture.com.
GE is a global infrastructure, finance and media company taking on the world’s toughest challenges. From everyday light bulbs to fuel cell technology to cleaner, more efficient jet engines, GE has continually shaped our world with groundbreaking innovations for over 130 years.
Pivotal, committed to open source and open standards, recently introduced Pivotal One, the world’s first comprehensive multi-cloud Enterprise PaaS. The company is also a leading provider of application and data infrastructure software, agile development services, and data science consulting. Learn more at www.gopivotal.com.
Launched in 2006, Amazon Web Services, Inc. began exposing key infrastructure services to businesses in the form of web services, now widely known as cloud computing. Today, Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of enterprise, government and startup customers in 190 countries around the world. Amazon Web Services offers over 30 different services, including big data services such as Amazon Redshift (data warehousing), Amazon Elastic MapReduce (hosted Hadoop service), Amazon Kinesis (real time streaming), and Amazon DynamoDB (managed NoSQL database service).
Attivio makes information meaningful, accessible, and actionable in ways that were never before possible. Our patented Active Intelligence Engine® (AIE®) brings together information from any source or format and enriches it to expose the relationships, patterns, and insights that are hidden within. AIE’s flexible design enables business and technology leaders to speed innovation through rapid prototyping and deployment, while dramatically lowering risk. Systems integrators, independent software vendors, corporations and government agencies partner with Attivio to automate information-driven processes and gain competitive advantage. For more information visit www.attivio.com.
Couchbase is a leading provider of NoSQL database technology and the company behind the Couchbase open source project. Couchbase Server, the company’s flagship product, is a NoSQL document-oriented database with production deployments at AOL, Cisco, Concur, LinkedIn, Orbitz, Salesforce.com, Shuffle Master, Zynga and hundreds of other household names worldwide. It is particularly well suited for interactive applications, providing easy scalability, consistent high performance, 24×365 availability, and a flexible data model for ease of development.
Dell today is a global leader in delivering customer-centric technology solutions for businesses and consumers. Its scalable products, services and solutions enable customers to drive results, create competitive advantage and expand their opportunities. As one of the most successful technology solutions providers in the world, Dell ranks among the top 40 U.S. and the top 100 global companies. The company is growing through a combination of internal investments and business partnerships, as well as technology and solutions acquisitions, to provide the most value for its customers. For more information, visit Dell.com or Dell.com/Hadoop.
Hortonworks is a leading commercial vendor of Apache Hadoop, the preeminent open source platform for storing, managing and analyzing big data. Our distribution, Hortonworks Data Platform powered by Apache Hadoop, provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy big data solutions. Hortonworks is the trusted source for information on Hadoop, and together with the Apache community, Hortonworks is making Hadoop more robust and easier to install, manage and use. Hortonworks provides unmatched technical support, training and certification programs for enterprises, systems integrators, and technology vendors. For more information, visit www.hortonworks.com.
HP creates new possibilities for technology to have a meaningful impact on people, businesses, governments and society. With the broadest technology portfolio spanning printing, personal systems, software, services and IT infrastructure, HP delivers solutions for customers’ most complex challenges in every region of the world. More information about HP (NYSE: HPQ) is available at http://www.hp.com
MemSQL is a distributed database for real-time analytics. Data scientists, analysts, and developers can query high velocity workloads and historical data simultaneously, all through a convenient SQL interface. By combining significant speed and throughput advantages with complex analytics, an enterprise can gain instant insight to their business and stay competitive in a fast-moving environment.
Qlik® (NASDAQ: QLIK) is committed to changing the world by making it easier to make more insightful decisions and act on them. The QlikView® Business Discovery software platform and Qlik Customer Success Framework™ provide people, technology and service, helping organizations optimize data as a strategic resource. QlikView uses Natural Analytics™ to support the way people naturally analyze information. It enables users to see associations, make comparisons and anticipate outcomes in a natural way, rather than forcing them down inflexible drill paths. QlikView gives the immediate insights businesses need with the enterprise governance IT requires. Qlik serves approximately 30,000 customers in 10+ countries.
Red Hat is the world’s leading provider of open source solutions, taking a community-powered approach to provide reliable and high-performing cloud, virtualization, storage, Linux, and middleware technologies. Red Hat also offers award-winning support, training, and consulting services. And as the connective hub in a global network of enterprises, partners, and open source communities, Red Hat enables the creation of relevant, innovative technologies that liberate resources for growth and prepare customers for the future of IT. Red Hat is an S&P company with more than 70 offices spanning the globe, empowering our customers’ businesses.
RedPoint Global develops software that helps companies cost-effectively manage all their data, derive actionable insights from it, and take the most appropriate actions to monetize it. RedPoint is the only company to offer comprehensive ETL, data quality and data integration capabilities that operate across both traditional and Hadoop 2.0 / YARN environments, all without the need for MapReduce skills, allowing more companies to reap the benefits of Hadoop. For more information, visit www.redpoint.net.
Whether it’s optimizing inventory, cross-selling products, or averting a crisis before it happens, TIBCO uniquely delivers the Two-Second Advantage® – the ability to capture the right information at the right time and act on it preemptively for a competitive advantage. TIBCO Software Inc. is a provider of infrastructure software for companies to use on-premise, or as part of cloud computing environments. TIBCO Spotfire® is the company’s enterprise-class, in-memory or in-database analytics platform. Spotfire’s analytic apps and metrics provide an analytic platform to instantly turn insight into action by enabling anyone to rapidly discover hidden insights and quickly collaborate in context.
Trifacta, the pioneer in data transformation, significantly enhances the value of an enterprise’s Big Data by enabling users to easily transform raw, complex data into clean and structured inputs for analysis. Leveraging decades of innovative work in human-computer interaction, scalable data management and machine learning, Trifacta’s unique technology creates a bi-directional partnership between user and machine, with each component learning from the other and becoming smarter through use. Trifacta is backed by venture capital firms Greylock and Accel and is headquartered in San Francisco. Its founders and technical advisors include global leaders in data science, interaction design and big data.
Waterline Data Science is an early-stage Big Data software company, founded in December 2013, backed by Menlo Ventures and Sigma West. The inspiration for the name “Waterline” came from the water metaphor of the Big Data Lake. Waterline solves the challenges of data self-service for the Hadoop data lake. It’s easy to get data into Hadoop, but it’s not easy to get data out of Hadoop in a self-service manner to get business value from Hadoop. The idea behind “Waterline” is that data self-service for Hadoop should be like finding the data you need, at the Waterline, without having to dive for it.
CSC, a global leader in next-generation IT services and solutions, enables superior returns on clients’ technology investments through best-in-class industry solutions, domain expertise and global scale. CSC provides a fully integrated and managed Big Data Platform as a Service that utilizes advanced web-scale technologies to enable application developers to quickly develop, test and deploy applications that require any combination of ad hoc, batch and real-time analytics. To learn more, go to www.csc.com/big_data.
For more than a decade, MarkLogic has delivered a powerful, agile and trusted Enterprise NoSQL database platform that enables organizations to turn all data into valuable and actionable information. Organizations around the world rely on MarkLogic’s enterprise-grade technology to power the new generation of information applications. MarkLogic is headquartered in Silicon Valley with offices in Washington D.C., New York, London, Frankfurt, Utrecht, and Tokyo. For more information, please visit www.marklogic.com.
VMware, the industry-leading provider of virtualization and cloud infrastructure solutions, empowers organizations to innovate and thrive in the cloud era. By virtualizing, automating, and managing all aspects of IT – from the data center to the cloud to mobile devices – VMware enables organizations to deliver services on demand from any device, anytime, anywhere. VMware delivers value to more than 500,000 customers through virtualization software, professional services and a robust ecosystem of more than 55,000 partners that drives application interoperability and customer choice. See more at www.vmware.com
Alpine is the world’s first collaborative, code-free solution for advanced analytics on Big Data. Alpine’s end-to-end approach makes predictive analytics approachable across the organization, regardless of technical skill. Alpine’s unique “in-cluster analytics” technology leverages existing enterprise data platforms, including MPP databases and Hadoop. Alpine Chorus enables Big Data agility for your data science team. The first solution of its kind, Chorus provides an analytic productivity platform that enables the team to search, explore, visualize, and import data from anywhere in the organization. It provides rich social network features that revolve around datasets, insights, methods, and workflows, allowing data analysts, data scientists, IT staff, DBAs, executives, and other stakeholders to participate and collaborate on Big Data. Customers deploy Chorus to create an agile, analytic infrastructure; teams can create workspaces on the fly with self-service provisioning and then instantly start creating and sharing insights. Alpine Chorus cuts weeks and months from the analytics process by eliminating unnecessary data movement. Models run faster and are deployed directly within the database or Hadoop. Its workflow structure provides a transparent history of all transformations for auditing and data governance.
Basho Technologies is the creator and developer of Riak, a distributed database (sometimes categorized as a NoSQL database) that provides extreme high availability, fault tolerance, and operational simplicity, and Riak CS, a cloud-based object storage system that sits on top of Riak. Riak has rapidly gained adoption throughout the Fortune 100 and has become foundational to many of the world’s fastest-growing Web-based, mobile and social networking applications, as well as cloud service providers offering public, private and hybrid solutions. Basho offers open-source and commercial editions of Riak and Riak CS. Riak Enterprise extends Riak with additional features, including multi-datacenter replication, and provides customers with 24×7 customer support.
Bloomberg technology helps drive the world’s financial markets. We provide communications platforms, data, analytics, trading platforms, news and information for the world’s leading financial market participants. We deliver through our unrivaled software, digital platforms, mobile applications and state of the art hardware developed by Bloomberg technologists for Bloomberg customers. Our 3,000+ technologists work to define, architect, build and deploy complete systems and solutions that anticipate and fulfill our clients’ needs and market demands. We offer critical enterprise computing solutions for the financial services industry to help organizations deliver, decipher and manage data to meet their needs and growing regulatory requirements.
BlueData is transforming how enterprises provision and manage their big data applications. The BlueData software platform lets enterprises instantly create private individualized clusters for each of their users, using any storage and any server, and run any Hadoop application. With BlueData, enterprises can now run all their big data applications unmodified, extracting value from their data in a fraction of the time and at a lower cost as compared to current solutions. Based in Mountain View, CA, BlueData was founded by a highly experienced team from VMware, Akamai, Intel and SGI and is backed by industry luminaries from Silicon Valley.
CA Technologies helps customers succeed in a future where every business, from apparel to energy, is being rewritten by software. From planning to development to management to security, at CA we create software that fuels transformation for companies in the application economy. With CA software at the center of their IT strategy, organizations can leverage the technology that changes the way we live, from the data center to the mobile device. Our software and solutions help our customers thrive in the new application economy by delivering the means to deploy, monitor and secure their applications and infrastructure.
Continuuity makes it easy for any Java developer to build and manage data applications in the cloud or on-premise. Continuuity Reactor, its flagship product, is the industry’s first Big Data Application Server for Apache Hadoop™. Based in Palo Alto, Calif., the company is backed by leading investors including Battery Ventures, Andreessen Horowitz and Ignition Partners. Learn more at www.continuuity.com.
NGRAIN is creating a world transformed with interactive 3D and augmented reality technologies. With NGRAIN products, companies enhance the performance of people, machines, and the interactions between them. NGRAIN’s integrated, versatile platform combines enterprise 3D assets and data to provide operational intelligence for mission critical equipment and accelerate decision-making across the organization. From the factory floor to the field, get information how you need it, when you need it – with NGRAIN interactive 3D and augmented reality.
Rackspace® (NYSE: RAX) is the global leader in hybrid cloud and founder of OpenStack®, the open-source operating system for the cloud. Hundreds of thousands of customers look to Rackspace to deliver the best-fit infrastructure for their IT needs, leveraging a product portfolio that allows workloads to run where they perform best, whether on the public cloud, private cloud, dedicated servers, or a combination of platforms. The company’s award-winning Fanatical Support® helps customers successfully architect, deploy and run their most critical applications. Headquartered in San Antonio, TX, Rackspace operates data centers on four continents. Rackspace is featured on Fortune’s list of 100 Best Companies to Work For. For more information, visit www.rackspace.com.
Software AG helps organizations achieve their business objectives faster. The company’s big data, integration and business process technologies enable customers to drive operational efficiency, modernize their systems and optimize processes for smarter decisions and better service. Building on over 40 years of customer-centric innovation, the company is among the top 10 fastest-growing technology companies in the world and is ranked as a leader in 15 market categories, fueled by core products Adabas and Natural, Alfabet, ARIS, Apama, Terracotta and webMethods. For more information please visit www.softwareag.com/na
Treasure Data is the first managed service in the cloud for big data analytics. By simplifying data acquisition, storage and analysis, businesses get value from big data in days, not months. Economically process enormous data volumes in real-time with no infrastructure to manage. 100+ customers include Toyota and NTT Docomo.
World Wide Technology (WWT) is a leading systems integrator and supply chain solutions provider that brings an innovative and proven approach to how organizations around the globe evaluate, architect and implement technology. Our customers have hands-on access to cutting-edge data center, networking, security and collaboration products in our Advanced Technology Center; technical expertise from our expansive team of engineering resources; and accelerated global product delivery, powered by a sophisticated supply chain management infrastructure. By working with a financially strong, privately held systems integrator that ranks as a top tier of Cisco, HP, EMC, NetApp, VMware and Citrix, our customers realize the benefits of saving time and money while significantly minimizing risk.
Caserta Concepts is a New York-based technology innovation consulting services firm that specializes in big data analytics, data warehousing, governance, visualization and business intelligence. With a worldwide network of professionals, Caserta Concepts collaborates with CIOs and their IT organizations to help them gain new business insights through a better understanding of their data. Internationally recognized big data analytics authority and author Joe Caserta founded the company in 2001. For more information, please visit www.casertaconcepts.com. Connect with Caserta Concepts on Twitter (@casertaconcepts) and LinkedIn at www.linkedin.com/company/caserta-concepts. You can also follow Joe Caserta on Twitter at @joe_caserta.
Seattle-based Context Relevant provides the next generation of predictive analytics capabilities for all industries, with a current focus on the financial sector. Wall Street banks and insurance companies are leveraging the software to run analytics on vast amounts of data and provide real-time recommendations across use cases ranging from the turbulent bond market to preventing fraud caused by major security breaches or stolen credit cards. Context Relevant’s solution has significantly reduced the time and manual effort typical of big data projects and is arming customers with unparalleled competitive intelligence.
Founded in 2004, Facebook’s mission is to make the world more open and connected. People use Facebook to stay connected with friends and family, to discover what’s going on in the world, and to share and express what matters to them.
H2O by 0xdata brings better algorithms to big data. H2O is the fast open source in-memory prediction engine & machine learning platform. With H2O, enterprises can use all of their data (instead of sampling) in real-time for better predictions. Data Scientists can take both simple & sophisticated models to production from the same interactive platform used for modeling, within R and JSON. H2O is also used as an algorithms library for Making Hadoop Do Math. Our earliest customers have built powerful domain-specific predictive engines for Recommendations, Pricing and Outlier detection in Fraud & Insurance. 0xdata is the maker of H2O and is nurturing a grassroots movement of math, systems and data scientists to herald the new wave of Discovery with Big Data Science.
Aerospike delivers a next-generation NoSQL database that powers some of the world’s leading Web-scale real-time big data driven platforms in digital advertising and omni-channel marketing, including AppNexus, BlueKai, Chango, The Trade Desk and [x +1]. The first flash-optimized, in-memory, operational NoSQL database with ACID properties, Aerospike is used by revenue-critical applications to personalize the user experience by predictably processing billions of user profiles and terabytes of current contextual data with sub-millisecond response times. Developers in search, mobile, video, gaming, social, ecommerce, retail, banking, telecom and more are choosing Aerospike to gain 10x better price/performance across multiple data centers with zero-touch, zero-downtime operations.
Altiscale offers the first cloud service that is purpose-built to run Apache Hadoop. Altiscale’s optimized infrastructure is faster, more reliable, easier to use, and more affordable than alternatives. Without the distractions of Hadoop operations, and with access to practically unlimited resources, Altiscale customers can drive more business value from Hadoop than ever before. Altiscale’s founding team has been at the forefront of Apache Hadoop, from its incubation at Yahoo! to operating more than 40,000 Hadoop nodes. Altiscale is backed by General Catalyst and Sequoia Capital, with additional investment from Accel Partners, Jerry Yang, and other individual investors.
Appfluent provides IT organizations with unprecedented visibility into their Big Data systems to reduce costs. Appfluent helps companies put the right workload on the right system, across data warehouses, business intelligence, and Hadoop. With Appfluent, enterprises can address exploding data growth, proactively manage performance of BI and data warehouse systems, and realize the tremendous economies of Hadoop.
Ataccama Corporation combines data quality, master data management, and data governance in a single technology platform, ready for operational, analytical and BigData deployments. Ataccama Data Integration & Processing Platform for Hadoop offers an easy-to-use development interface (GUI), shared metadata, and rich data integration layer – often replacing specialized ETL technologies. It accommodates the key Hadoop features, such as massive parallel processing, scale-out nature, fault tolerance, memory management, etc., and is available to anyone who needs to profile, map, model, process, transform, cleanse, enrich, and integrate data with Hadoop. Kick off your Hadoop initiative and request a complimentary Big Data Test Drive.
Attunity is a leading provider of information availability software solutions that enable access, management, sharing and distribution of data, including Big Data, across heterogeneous enterprise platforms, organizations, and the cloud. Our software solutions include data replication, data management, test data management, change data capture (CDC), data connectivity, enterprise file replication (EFR), managed-file-transfer (MFT), and cloud data delivery. Using Attunity’s software solutions, our customers enjoy significant business benefits by enabling real-time access and availability of data and files where and when needed, across the maze of heterogeneous systems making up today’s IT environment. www.attunity.com
Azul Zing™ is essential technology for Big Data applications that are critical to business results. Zing is the only Java performance solution that delivers both very low latency and high sustained throughput for real-time analytics and self-service business intelligence. With Zing your Big Data applications can utilize massive in-memory datasets while delivering predictable performance, allowing reports to be run on more live data with faster results. Zing even reduces or eliminates the need for extra caching applications. For more information visit our website www.azulsystems.com/analytics
Booz Allen Hamilton is a leading provider of management consulting, technology, and engineering services to the US government in defense, intelligence, and civil markets, and to major corporations, institutions, and not-for-profit organizations. Booz Allen is headquartered in McLean, Virginia, employs approximately 23,000 people, and had revenue of $5.76 billion for the 12 months ended March 31, 2013. In 2014, Booz Allen celebrates its 100th anniversary year. To learn more, visit www.boozallen.com. (NYSE: BAH)
Bright Computing delivers on the promise of advanced cluster management, made easy. Bright Cluster Manager is enterprise-grade software that makes it easy to deploy and manage Hadoop clusters of all sizes. It significantly reduces the cost of deploying, managing and using servers, compute clusters, and Hadoop clusters while increasing system availability and throughput. From its bare-metal provisioning of the entire software stack to its beautiful graphical user interface, Bright provides the most advanced cluster management solution for Hadoop available. Dell, Cisco, Amazon and Intel are part of Bright’s partner ecosystem, and our customers include Fortune 100 companies.
Cloudwick is the leading big data service provider in North America providing the Fortune 1000 with scale-out production services for Cloudera, Hortonworks, MapR and DataStax big data clusters. Its trained and certified Hadoop and NoSQL administrators, developers and data scientists have more than 125,000 hours of enterprise production experience with leading enterprises including 3M, Bank of America, Comcast, Home Depot, Intuit, JP Morgan, NetApp, Nike, Target, Visa, Walmart, and more than 50 other enterprises. As a strategic partner to Cloudera, Hortonworks and DataStax, Cloudwick builds, operates, monitors and manages many of North America’s leading big data systems. For more information, please visit www.cloudwick.com.
Comcast Technology & Product is transforming how Comcast customers access and enjoy entertainment, communications and home management services across all mediums–online, mobile and TV. By assembling the country’s top developers and thinkers, we are rapidly becoming the premier environment for building and launching one-of-a-kind experiences that make everyday life easier and more entertaining for millions of customers.
Databricks was founded out of the UC Berkeley AMPLab by the creators of Apache Spark. We’ve been working for the past six years on cutting-edge systems to extract value from Big Data. We believe that Big Data is a huge opportunity that is still largely untapped, and we’re working to revolutionize what you can do with it.
DataFactZ is a highly specialized BI/DW consulting services firm with a competency in Big Data Analytics solutions. It aims to generate new value by providing the skills and strategic advice companies are searching for, helping business leaders use data and technology to drive greater insights. With 600+ technical experts and solution consultants on staff, DataFactZ offers a specialized team with deep knowledge of, and tactics for, discovering and driving business value in Big Data, as businesses realize significant operational savings and productivity benefits when managing large repositories of structured, semi-structured and unstructured data.
Dataguise is the leading provider of data privacy protection and compliance intelligence for sensitive data stored in Big Data and traditional repositories. The comprehensive Dataguise DgSecure suite of quick-to-deploy solutions enables businesses to discover and maintain a 360-degree view of their sensitive data, evaluate their compliance exposure risks, and intelligently mask or encrypt the data, no matter how or where that data is stored. Dataguise was named a Visionary in the Gartner Magic Quadrant for Data Masking Technology, December 2013. For more information on Dataguise, visit http://www.dataguise.com.
Datameer is the only end-to-end big data analytics application purpose-built for Hadoop, designed to make big data easy for everyone. Hundreds of companies, such as British Telecom, Citibank, Kabam, Trustev, Visa, Vivint, and Workday use Datameer to integrate, analyze and visualize all of their data to get new insights faster than ever. Datameer is available for all major Hadoop distributions including Apache, Cloudera, EMC, Hortonworks, IBM, MapR and Amazon.
DataStax powers the online applications that transform businesses for more than 400 customers, including startups and more than 20 of the Fortune 100. DataStax delivers a massively scalable, flexible and continuously available big data platform built on Apache Cassandra™. DataStax integrates enterprise-ready Cassandra, Apache Hadoop™ for analytics and Apache Solr™ for search across multi-datacenters and in the cloud. Companies such as Adobe, Healthcare Anytime, eBay and Netflix rely on DataStax to transform their businesses. Based in San Mateo, Calif., DataStax is backed by industry-leading investors: Lightspeed Venture Partners, Crosslink Capital, Meritech Capital Partners, Scale Venture Partners, DFJ Growth and Next World Capital. For more information, visit DataStax or follow us @DataStax and @DataStaxEU
DataTorrent is the first data & action platform in the world that can instantaneously process streaming data on a massive scale. Built exclusively on Hadoop 2.0, it lets enterprises process, monitor, analyze, and act on massive amounts of unstructured or structured data in real-time. DataTorrent runs directly in your Hadoop cluster in memory and handles the processing and transformation of your data instantaneously, with built-in fault tolerance and elasticity. Unlike traditional batch processing that can take hours, DataTorrent enables immediate “NowTime” decision making. Download our platform today at www.datatorrent.com.
EMC Corporation is a global leader in enabling businesses and service providers to transform their operations and deliver IT as a service. Fundamental to this transformation is cloud computing. Through innovative products and services, EMC accelerates the journey to cloud computing, helping IT departments to store, manage, protect and analyze their most valuable asset – information – in a more agile, trusted and cost-efficient way.
Exar will demonstrate the newly announced hardware-assisted Hadoop Acceleration Solution. Exar’s solution has been certified by Cloudera, seamlessly integrates into Hadoop clusters without any software code changes and has been benchmarked on numerous OEM platforms. The Hadoop Acceleration solution significantly reduces the Data Analytics time and the Cluster TCO. This solution is part of Exar’s Data Compression and Security product line, which offers high-performance solutions for structured and unstructured data. Exar’s product portfolio includes power management and connectivity components, communications products, and network security and storage solutions. Exar has locations worldwide providing real-time customer support. Visit www.exar.com for more information.
GridGain develops the leading open source In-Memory Computing Platform, enabling organizations to conquer challenges that traditional technology can’t approach. From high-performance data management and real-time streaming to an industry first in-memory Hadoop accelerator, GridGain provides the most complete end-to-end stack for low-latency, in-memory computing that allows customers to easily innovate ahead of the accelerating pace of business. Fortune 500 companies, top government agencies and innovative mobile and web companies use GridGain to achieve unprecedented computing performance and business insights. GridGain is headquartered in Foster City, California. To download the GridGain In-Memory Computing Platform, please visit www.gridgain.org.
Trafodion is an open source initiative from HP, incubated at HP Labs and HP-IT, to develop an enterprise-class SQL-on-HBase solution targeted for big data transactional or operational workloads. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
Impetus is a provider of innovative Big Data solutions and services. We empower enterprises in the financial services, healthcare, digital media, travel and entertainment, and manufacturing industries to gain big business impact from Big Data. We use proven methodologies spanning the full life-cycle of architecture advisory, proof of value, data science, application development and implementation services. We are experts in the complete Big Data ecosystem, including Hadoop, Cassandra, NoSQL, MPP systems, real-time and predictive analytics, machine learning, visualization, cloud computing and enterprise mobility. For more information, visit bigdata.impetus.com.
Jaspersoft empowers millions of people every day to make faster decisions by bringing them timely, actionable data inside their apps and business processes. Its embeddable, cost-effective reporting and analytics platform allows anyone to quickly self-serve and get the answers they need and scales architecturally and economically to reach everyone.
JethroData is a unique SQL and Indexing engine for Hadoop. It works by automatically indexing data as it is written into Hadoop. Queries use indexes to access only the data they need instead of performing a full-scan of the entire dataset, leading to both a dramatic improvement in speed and a substantial reduction in computing resource usage. Jethro is optimal for use cases such as interactive ad-hoc queries, live dashboards and rapid reports where queries typically access a portion of the data. With Jethro, you enjoy the scalability of Hadoop with the performance of an analytical database, in one system.
Kyvos is committed to unlocking the power of Big Data Analytics with its unique “OLAP on Hadoop” technology. This allows you to build cubes in-place on Hadoop with linear scalability, eliminating the limitations of traditional OLAP solutions, and enabling interactive multi-dimensional analytics on your Big Data. Users can visualize, explore and analyze their data interactively on Hadoop with no programming required. Come and explore Kyvos to experience OLAP on Hadoop at unprecedented scale. See you at Booth #106.
MathWorks is the leading developer of mathematical computing software. MATLAB, the language of technical computing, is a programming environment for algorithm development, data analysis, visualization, and numeric computation. Engineers and scientists worldwide rely on MATLAB for a range of applications, including signal processing and communications, image and video processing, control systems, test and measurement, computational finance, and computational biology. Visit us to see how MATLAB can help you explore big data, develop analytics, and integrate MATLAB based analytics into Hadoop and other production IT environments.
MBI Solutions is a leading Managed Services firm in the area of Capacity Planning, Chargeback Accounting and Remote Administration. Our experts are dedicated to understanding the utilization of your Hadoop environment via the accumulated Cloudera metrics. For Capacity Planning, this means MBI experts run predictive analytics, what-if scenarios and cost models to inform capacity and performance issues now and into the future (http://blog.cloudera.com/blog/2014/06/capacity-planning-with-big-data-and-cloudera-manager/). Chargeback Accounting gathers Cloudera metrics to provide simple or complex cost-based models that allow organizations to share the cost. For example, the accounting method can be fixed rate, allocation based, utilization based or a mixture.
Mellanox Technologies (NASDAQ: MLNX) is a leading supplier of end-to-end InfiniBand and Ethernet interconnect solutions and services for servers and storage. Mellanox interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance capability. Mellanox offers a choice of fast interconnect products: adapters, switches, software and silicon that accelerate application runtime and maximize business results for a wide range of markets including high performance computing, enterprise data centers, Web 2.0, cloud, storage and financial services. More information is available at www.mellanox.com.
An early adopter of big data and legacy modernization initiatives, MetaScale provides cutting-edge technologies, Hadoop training and technology solutions to its customers. As a subsidiary of Sears Holdings Corporation, we understand the value of heritage and the need for constant innovation to drive growth. Through this heritage, we offer a deep understanding of employing complex big data tools to solve traditional business problems in the enterprise. Our team brings extensive experience in the migration of workloads off mainframe, large-scale private open-source cloud computing, Hadoop for big data BI and legacy infrastructure modernization.
MongoDB makes development simple and beautiful. For tens of thousands of organizations, MongoDB provides agility and the freedom to scale. Fortune 500 enterprises, startups, hospitals, governments and organizations of all kinds use MongoDB because it is the best database for modern applications. Through simplicity, MongoDB changes what it means to build. Through openness, MongoDB elevates what it means to work with a software company. Please visit www.MongoDB.com for more.
Novetta delivers agile big data analytics that empower our customers to quickly extract value from massive amounts of data and make confident, data-driven decisions. Novetta Identity Analytics is the first to offer high-speed, high quality, entity resolution natively on Hadoop. Novetta Identity Analytics provides unified views of people, locations, organizations, events, and relationships across multiple systems and sources so you can connect the dots and start solving your business problems. Whether it is business intelligence, customer and marketing analytics, or fraud and risk analysis, Novetta can help you quickly tear down silos to uncover the insights that drive better decisions.
Headquartered in Sunnyvale, California, Pepperdata creates software that monitors all facets of Hadoop performance, including CPU, memory, disk I/O, and network by user, job, and task, in real time. It dynamically adjusts cluster utilization based on your policies and priorities so that your jobs run faster, more reliably, and more efficiently. Stop by and see us at booth 506. For more information, visit www.pepperdata.com and follow us on Twitter @pepperdata.
Plexxi makes datacenter networking infrastructure explicitly designed for Big Data and other distributed applications. Plexxi’s switch fabric combines three mature networking technologies–photonic switching, merchant Ethernet, and SDN control–to deliver networks that are flat, cost effective, and simple to manage. The combination of central control and photonic interconnect allows Plexxi to provide high-capacity, low-latency connectivity between Big Data nodes–whether they are 1m or 1000km apart. And the network that provides that connectivity is programmable and responsive to application inputs. By integrating with OpenStack and Cloudera, Plexxi delivers a robust piece of the Big Data puzzle.
Predixion was founded on the belief that predictive analytics has the power to create a smarter, safer and healthier world – and access to that power should not be limited. To achieve this vision, Predixion developed Predixion Insight™, a self-service predictive analytics platform that simplifies the entire predictive process. Predixion Insight is designed for business analysts and non-technical users to enable broader adoption of predictive analytics, but is powerful enough for data scientists. Predixion expedites the “Last Mile of Analytics™” – deployment of powerful predictions directly to those who need them to take action – so the value of being predictive is realized immediately. www.predixionsoftware.com.
Protegrity, the innovative leader of groundbreaking enterprise data security software, provides high performance, infinitely scalable end-to-end data security solutions for organizations worldwide. Protegrity helps its customers secure all of their sensitive data in Hadoop and across the enterprise, ensuring compliance with all PCI, PHI and Privacy regulations. Protegrity’s solutions give corporations the ability to implement a variety of data protection methods, including vaultless tokenization, strong encryption, masking and monitoring to ensure the protection of their sensitive data. For more information, visit www.protegrity.com
Pioneering advanced analytics vendor RapidMiner is redefining how business analysts use Big Data to predict the future. With an open source heritage, RapidMiner is one of today’s most widely known and used predictive analytics platforms, providing powerful solutions for a wide variety of industries. For more information, visit www.rapidminer.com.
Revolution Analytics delivers advanced analytics software at half the cost of existing solutions. The company brings high performance, productivity, and enterprise readiness to open source R, the most powerful statistics software in the world. To equip R for the demands and requirements of the modern data-driven business, Revolution Analytics builds on open source R with innovations in big data analysis, integration and enterprise deployment. Leading organizations including Merck, Bank of America and Mu Sigma rely on Revolution R Enterprise for their data analysis, development and mission-critical production needs. Revolution Analytics is committed to fostering the growth of the R community, and offers free licenses of Revolution R Enterprise to academia. Revolution Analytics is headquartered in Palo Alto, Calif. and backed by North Bridge Venture Partners and Intel Capital.
Saama Technologies is one of the largest pure-play data science solutions and services companies focused on solving the data management and advanced analytics challenges of the world’s leading brands. Based in Campbell, CA, Saama has over 15 years of history in implementing business analytics, big data, predictive analytics and data management solutions for global clients in industries such as life sciences, healthcare, insurance, financial services, high technology, media and public sector. Saama serves these clients from its offices in the US, Europe and India. Saama can be reached at www.saama.com.
ScaleOut Software puts the power of in-memory computing into an elegant platform. Our in-memory computing server delivers continuous, data-parallel computation on your live data to capture perishable business opportunities. Now, the power of ScaleOut’s platform can be accessed using standard Hadoop MapReduce. With ScaleOut hServer®, use the same Hadoop MapReduce code you’re familiar with to continuously analyze, live, fast-changing data. Deploy your batch models on live data, generate real-time alerts, ETL data into HDFS and much more. There are many possibilities when you combine your batch and live data analysis under the same familiar framework.
Simba is the world expert in data access and analytics. Simba’s products power business intelligence, connecting front-end applications with back-end data sources. Microsoft and Simba developed ODBC in 1992. SimbaEngine SDK is the foundation for developing ODBC 3.8 and JDBC 4.0 drivers for any data source. Simba ODBC 3.8 Drivers for Hadoop/Hive, Cassandra, MongoDB and SalesForce connect products like Excel, MicroStrategy and Tableau to cloud and Big Data sources. No other Big Data ODBC drivers are as advanced or BI ready. Simba’s data access and analytics solutions are chosen by ISVs such as MapR, Microsoft, SAP and Teradata. www.simba.com
SiSense Prism is a Big Data Analytics Solution that provides the benefits of In-Memory without its disadvantages. SiSense In-Memory Columnar Datastore analyzes 100 times more data at 10 times the speed of comparable solutions. No need to set up complex data warehouse systems or OLAP cubes. No need for programming either, regardless where data comes from or how big it is. Download a free trial version of the software at http://www.sisense.com/prism-free-trial
Skytree’s Machine Learning platform gives organizations the power to discover deep analytic insights, predict future trends, make recommendations and reveal untapped markets and customers. Predictive Analytics and Machine Learning are quickly becoming must-have technologies in the age of Big Data, and Skytree provides the Enterprise-grade foundation. Skytree’s flagship product – Skytree Server – is the only general purpose scalable Machine Learning system on the market, built for the highest accuracy at unprecedented speed and scale.
SoftLayer, an IBM Company, operates a global cloud infrastructure platform built for Internet scale with 13 data centers in the United States, Asia, and Europe and a global footprint of network points of presence. SoftLayer provides Infrastructure-as-a-Service to leading-edge customers ranging from Web startups to global enterprises. For more information, please visit www.softlayer.com or call 866.398.7638.
Based in San Francisco, Splice Machine provides the ANSI SQL database designed for Big Data applications. The Splice SQL Engine™ provides all the benefits of NoSQL databases such as auto-sharding, scalability, fault tolerance and high availability, while retaining the strengths of the industry standard – SQL. It optimizes complex queries to power real-time Big Data apps and enable interactive analytics without rewriting existing SQL-based apps and front-end BI tools such as MicroStrategy® and Tableau®.
Sqrrl is a Big Data software company whose employees have dealt with the world’s largest, most complex, and most sensitive datasets for the last decade. Sqrrl’s software product, Sqrrl Enterprise, is the most secure, scalable, and flexible NoSQL database for building real-time applications and is powered by Apache Accumulo™ and Hadoop. Sqrrl Enterprise extends the capabilities of Accumulo with additional data ingest, security, and real-time analytical features that help unlock the power of Big Data.
StackIQ offers a complete Hadoop solution featuring the most powerful, automated deployment and management capabilities available. We make it easy to implement Hadoop clusters of any size – from bare-metal to the applications layer – quickly and consistently. Our cluster manager provides consistent, reliable management of the Hadoop cluster through its integrated, database-driven design. What’s more, StackIQ’s modular architecture lets you customize your cluster environment, tailoring it to fit your unique software requirements.
Supermicro, the leader in server technology innovation and green computing, provides customers around the world with application-optimized server, workstation, blade, storage and GPU systems. Based on its advanced Server Building Block Solutions, Supermicro offers the most optimized selection for IT, datacenter and HPC deployments. The company’s system architecture innovations include Twin server, double-sided storage and SuperBlade product families. Offering the most comprehensive product lines in the industry, Supermicro delivers energy-efficient solutions with unmatched performance and value. Founded in 1993, Supermicro is headquartered in Silicon Valley with worldwide operations and manufacturing centers in Europe and Asia. For more information, visit http://www.supermicro.com/hadoop .
SUSE, a pioneer in open source software, provides reliable, interoperable Linux and cloud-based solutions that give enterprises greater control and flexibility. Decades of engineering excellence, industry leadership, and an unrivaled partner ecosystem power the products and support that help our customers manage complexity, reduce cost, and confidently deliver mission-critical services. The lasting relationships we build with them allow us to adapt and deliver the smarter innovation they need to succeed, today and tomorrow.
Tableau Software helps people see and understand data. Used by more than 15,000 customer accounts worldwide, Tableau’s award-winning software delivers fast analytics and rapid-fire business intelligence. Create visualizations and dashboards in minutes, then share in seconds. The result? You get answers from data quickly, with no programming required.
Talend provides integration solutions that truly scale for any type of integration challenge, any volume of data, and any scope of project, no matter how simple or complex. Talend’s highly scalable data, application and business process integration platform leverages all information assets and dramatically accelerates the time-to-value of integration. Ready for big data, Talend’s flexible architecture easily adapts to future IT platforms. A common set of easy-to-use tools implemented across all Talend products maximizes the skills of integration teams, too. Unlike vendors offering closed and disjointed solutions, Talend offers an open and flexible platform, supported by a predictable subscription model.
Tamr, Inc., connects and enriches the vast reserves of underutilized internal and external data, so enterprises can use all their data for analytics and downstream applications. Tamr’s data-unification platform combines machine learning algorithms with collective human insight to identify sources, understand relationships and curate the massive variety of siloed data. The Tamr platform is deployed in production at information services providers, pharmaceutical firms, retailers and other customers. Based in Cambridge, Mass., Tamr was founded in 2013 by database industry veterans Andy Palmer, Mike Stonebraker and Ihab Ilyas with George Beskales, Daniel Bruckner and Alex Pagan.
Voltage Security®, Inc. is the leading data protection provider, delivering secure, scalable, and proven data-centric encryption, tokenization and key management solutions, enabling our customers to effectively combat new and emerging security threats. Our powerful data protection solutions allow any company to seamlessly secure all types of sensitive corporate and customer information, wherever it resides, while efficiently meeting regulatory compliance and privacy requirements.
WebAction’s Real-time Enterprise App Platform allows you to process your data-in-motion before it lands on storage, enabling you to make decisions while they still matter. Scaling out on commodity hardware, WebAction offers a library of pre-built Apps and a development environment where data driven Apps are built in days. The end-to-end platform allows data sources to be acquired from any structured or unstructured source, correlated, filtered, processed, and enriched with other streams, historical, and context data.
Whitepages Pro is the leading provider of high-quality data in the U.S. and Canada. With a data organization approach we call the Contact Graph, names, phone numbers, and addresses are linked to form an intricate web of information. This unique structure allows us to update our data at unprecedented speeds, and ensures accuracy by double- or triple-verifying each data point. With the best mobile coverage of any data vendor in North America and numerous ways to use our data at enterprise scale, we have become a key data partner in many industries.
Founded by Stephen Wolfram in 1987, Wolfram Research is one of the world’s most respected software companies, as well as a powerhouse of scientific and technical innovation. As pioneers in computational science and the computational paradigm, we have pursued a long-term vision to develop the science, technology, and tools to make computation an ever-more-potent force in the world. At the center is Mathematica, our ever-advancing core product that launched modern technical computing and has become the world’s most powerful global computation system. Mathematica represents a unique blend of major research breakthroughs, outstanding user-oriented design, and world-class software engineering.
YarcData, a Cray company, delivers a Big Data appliance for real-time data discovery, enabling enterprises to gain game-changing business insights by surfacing unknown relationships and non-obvious patterns. Adopters include the Swiss National Supercomputing Centre (CSCS), Institute of Systems Biology, the Mayo Clinic, Noblis, Oak Ridge National Laboratory, QinetiQ, Pittsburgh Supercomputing Center and Sandia National Laboratories, as well as leading government and intelligence organizations, financial services firms, life sciences companies, and telecommunications providers. YarcData is based in the San Francisco Bay Area; more information is at www.yarcdata.com.
Zaloni is a leading provider of peerless Hadoop software and services solutions. Zaloni’s Bedrock Data Management Platform™ is the company’s unique foundation for building and deploying comprehensive, agile, Hadoop-based production implementations that solve the most complex large-scale data analytic challenges. Fortune 100 companies in Telecom, Healthcare and Financials depend on Zaloni to provide Big Data solutions that deliver speed and performance in an efficient, cost-effective manner.
Zettaset Orchestrator™ is the only Big Data management solution designed to address enterprise requirements for comprehensive enterprise-class security, multi-service high availability, and cluster manageability in Hadoop’s distributed computing environment. And now users of business intelligence and analytics applications can also benefit from our latest software release. Orchestrator 6 extends data encryption, fine-grained access control, and security policy enforcement to analytics applications without impacting performance, while safely and securely enabling analytics apps and their users to take full advantage of the scalability and flexibility of Hadoop.
Designed to support Big Data, Zoomdata’s Stream Processing technology delivers real time data feeds to tablet and browser based devices. Through the use of touch screen devices, users are able to interact with data in real time, rewind the data, compare the data and share views with their colleagues.
Apervi develops big data integration products and solutions for the data-driven and data-intensive enterprises of today. Apervi’s flagship product, “Conflux”, is a unified orchestration platform for big data technologies including Hadoop, Storm and Spark, making it easy and intuitive to leverage the power of these emerging technologies from a common interface. Conflux makes big data application development simple, secure and fast. Apervi’s team consists of technology and data engineers with proven results in enterprise software, data management and analytics. Apervi is headquartered in Irving, TX with a development center in Hyderabad, India.
AtScale delivers the power of Hadoop into the hands of business analysts. AtScale turns your Hive warehouse into an OLAP server, allowing you to use tools like Tableau or Excel to interactively query billions or trillions of rows of data. AtScale Dynamic Cubes access the data directly in your Hadoop cluster without requiring ETL or data movement, and support modern data formats like arrays, structs, and non-scalars. With AtScale, customers can now benefit from the affordable scale that Hadoop delivers coupled with the interactive response times and business friendly interfaces that have historically been limited to scale-constrained OLAP solutions.
Caspida is a predictive cyber-security and threat intelligence company that detects & prevents hidden threats across corporate, SaaS and mobile environments. Our new class of product is necessary to counter today’s organized cyber attacks, state sponsored espionage and insider attacks. Caspida’s disruptive approach thwarts adversaries with new threat detection and prevention capabilities using advanced user, application, device and data-aware behavior models.
Concurrent, Inc. is the leader in Big Data application infrastructure, delivering products that help enterprises create, deploy, run and manage data applications at scale. The company’s flagship enterprise solution, Driven, was designed to accelerate the development and management of enterprise data applications. Concurrent is the team behind Cascading™, the most widely deployed technology for data applications with more than 150,000 user downloads a month. Used by thousands of businesses including eBay, Etsy, The Climate Corp and Twitter, Cascading is the de facto standard in open source application infrastructure technology. Concurrent is headquartered in San Francisco and online at http://concurrentinc.com.
Crate Data is an open source data store for any data. It is massively scalable, supports real time SQL search and queries and requires zero administration. Crate is a shared-nothing, fully searchable, document-oriented cluster store and can be installed on commodity hardware or the cloud: a super simple backend for any data. Crate Data was founded by Jodok Batlogg, Christian Lutz and Bernd Dorn and is backed by Sunstone Capital and DFJ Esprit. To download Crate Data go to crate.io/download
Founded in 2009, CrowdFlower is the leading data enrichment platform for data scientists. Our quality-control technology is the most accurate and fastest way to collect, label, and clean data from an on-demand workforce. Our platform automates the management of the online workforce to tackle tasks that require human intelligence, such as search relevance tuning, data categorization, image annotation, metadata creation, sentiment analysis, transcription, and de-duplication. Backed by Trinity Ventures, Bessemer Venture Partners and Canvas Venture Fund, CrowdFlower’s customers include LinkedIn, Intuit, Flickr, General Electric, The Home Depot, Edelman, and eBay. People-powered data enrichment: www.crowdflower.com.
Designers have Photoshop, Web Analysts have GA, but where is the go-to tool for people who work with data? To fill that gap and to create a tool that really encompasses what both beginner and advanced data scientists need, Dataiku created Data Science Studio (DSS). DSS is a software platform that aggregates all the steps and big data tools necessary to get from raw data to production-ready applications. It significantly shortens the load-prepare-test-deploy cycles required to create data-driven applications. Thanks to its visual and interactive workspace, it is accessible to both Data Scientists and Business Analysts. Try DSS out for free at http://www.dataiku.com/dss/trynow/
DataRPM is an industry pioneer in cognitive data discovery, the next generation of big data technology, delivering hyper-fast results to organizations challenged by the volume, velocity and variety of their big data. Our patent-pending Computational Graph Search technology enables automatic data modeling from disparate sources using semantic algorithms, eliminating the need to manually build complex data warehouses. DataRPM also creates hyper-fluid data lakes, enabling a more agile way to access and manage big data from multiple sources. Using a hyper-aware, natural language search interface to analyze and visualize data, DataRPM provides an easy, Google-like user experience for Big Data Discovery.
Elasticsearch is on a mission to make massive amounts of data usable for businesses everywhere by delivering the world’s most advanced search and analytics engine available. With a laser focus on achieving the best user experience imaginable, the Elasticsearch ELK stack – comprised of Elasticsearch, Logstash, and Kibana – has become one of the most popular and rapidly growing open source solutions in the market. Used by thousands of enterprises in virtually every industry today, Elasticsearch, Inc. provides production support, development support and training for the ELK stack. To learn more, visit elasticsearch.com
Found delivers a hosted and fully managed search service built on top of Elasticsearch. Our hosted Elasticsearch suits a range of uses, from full text search on websites to Big Data analytics. Customers get their own dedicated Elasticsearch cluster with reserved memory and storage. Our search solution is secure and stable, and we’ve made it really easy to perform upgrades and downgrades – without any downtime. For production and mission critical environments we provide replication and automatic failover, protecting clusters against unplanned downtime.
GraphLab is the recognized pioneer of large-scale machine learning. The company’s software products and services help organizations of all sizes unleash the potential of data science to optimize internal processes, personalize customer experience, and even create new revenue streams. GraphLab is making this possible by applying the power of machine learning to the imperatives of data scientists, software engineers, product managers and IT architects tasked with making businesses more productive. Well-known firms spanning every vertical use GraphLab software to build the data products that make item recommendations, predict customer churn, detect fraud, analyze social networks and provide customer insights, among other uses.
Koverse separates signal from noise for data-driven organizations. We deliver meaningful insights critical for organizations to increase effectiveness in every aspect of business. The Koverse platform consolidates the elements required for deriving actionable insights from data — Collect, Analyze, Act — in a secure, scalable multi-tenant environment built on established technologies such as Hadoop.
NFLabs is an enterprise analytics company focusing on simplifying big data. Our main product, Peloton, removes current barriers to entry for using big data by abstracting the two most difficult tasks required for big data analytics today: 1. Data Pipeline: ingestion, integration, and formatting of data, yielding a ready-to-analyze dataset immediately upon connection with a data source. 2. Application Framework: an API-based, application-level framework that enables plug-and-play of algorithms, visualizations, and other libraries on your big data platform. Peloton is already in use by some of the largest companies in Korea and around the world, including LG and Korea Telecom.
ParStream, the fastest real-time database for Big Data Analytics, provides immediate insights from massive, continuously growing amounts of data. Based on unique, patented algorithms, ParStream, “the #1 Big Data Company” (CIO.com), enables new types of applications and business models in ecommerce, telco, finance, retail and many other industries. Based in Cupertino and Cologne (Germany), ParStream is backed by prominent Silicon Valley investors and supported by leading database specialists. For more information, visit www.ParStream.com
RStudio offers open source and enterprise ready professional software for R. Our flagship product is an Integrated Development Environment (IDE) which makes it easy for analysts, scientists, data scientists and quants to perform their analyses. We also offer a web application framework called Shiny that allows you to take those analyses and share them with your team/organization by creating interactive web applications. The RStudio team also contributes code to many R packages and projects. R Markdown, ggvis, dplyr, knitr, and packrat are R packages from RStudio that enhance the value, reproducibility, and appearance of the work of data scientists.
RTTS is the premier software and services firm in the software quality and testing field — assisting over 600 companies since 1996. Our team boasts experts who have implemented solutions for improving the health of many corporations’ data. QuerySurge is enterprise software built by RTTS to give your team an organized, repeatable way to assure your company’s greatest asset — its data. QuerySurge enables you to increase the level of quality within your data warehouse by analyzing and pinpointing any differences to find bad data throughout the ETL process. Major database and non-database (Hadoop/Hive, flat files, XML, Excel/Access) data sources are supported.
Sigmoid provides a platform for Real Time Streaming Analytics. With this platform, Sigmoid is democratizing streaming use cases such as Fraud Detection, Social Media Analytics, Sensor Data Analytics and Log Analytics. The platform leverages Apache Spark and provides a SQL-like language for easily writing streaming pipelines.
SynerScope technology pushes sense making from Big Data to the next level. Our collaborative sense making platform supports the use of all structured and unstructured data, including data from live web searches. We provide domain experts unrivaled new capabilities at much higher efficiencies for active discovery. The value from Big Data comes from better decision making, which is why SynerScope relentlessly focuses on building technology that directly supports human brain processes for analysis and decision making. Highly interactive bulk data visualizations using combined GPU/CPU platforms deliver the user a coherent view at any stage from raw to filtered or aggregated data. Any algorithmic machine process and data transformation is always transparent and aligned with human reasoning. We offer our stack of visual, analytic and MDM software in an appliance integrated with a choice of state of the art middleware and databases including Hadoop. We fulfill instant gratification for our clients’ needs and demands for information from data from any source. Radically driving down the cost of sense making terabyte for terabyte our clients can now think unhindered of how to use data best.
The School of Information Studies (iSchool) at Syracuse University offers the first NYS-approved Certificate of Advanced Study in Data Science. The 15-credit program is flexible in its offerings, including part-time and online study. While technical skills are an important component of the curriculum, graduates are also equipped with highly marketable, theoretical knowledge that carries them beyond the rise and fall of any one technology. Its expertise covers both structured and unstructured data and spans the full data lifecycle from collection to curation.
Tervela was founded in 2004 at the beginning of a massive transformation in the way businesses built and operated high-performance distributed applications. Tervela’s solution called Cloud FastPath is the first fully automated service for fast and secure cloud-to-cloud and on-premise-to-cloud Big Data transfer. It is built on top of Tervela’s enterprise-class Data Fabric, which powers some of the most demanding applications in the world. Cloud FastPath is a simple point-and-click web service that automates massive data movement and streaming to the cloud and between cloud environments, securely and efficiently.
The Department of Statistics, in partnership with Mays Business School at Texas A&M University, offers an MS in Analytics degree which will prepare working managers and professionals to make better informed decisions more quickly to optimize business performance and identify new business opportunities. Courses will be taught at Houston CityCentre and via live video around North America in the evenings, offering a convenient two-year, part-time program for working professionals. Open to individuals with strong quantitative skills, for example bachelor’s degree holders in the sciences, mathematics, business and engineering fields. Accepting pre-questionnaires now for our next fall cohort.
Transwarp Information Technologies (Shanghai) Co., Ltd. (abbr. Transwarp) is a big data startup company in China. Transwarp provides enterprise-ready stable Spark and Hadoop data platforms in telecommunications, financial services, and the public sector. Transwarp’s products include Inceptor and Hyperbase. Transwarp Inceptor is a fast in-memory/on-SSD computation engine on top of Hadoop, providing ANSI SQL’99 compatible query interface and R language support. Inside Inceptor is Apache Spark: the execution framework. Transwarp Hyperbase is a database product on top of Apache HBase, including distributed transactions, concurrent SQL query, graph traversal and full-text search capabilities.
Mainframe transaction, log and enterprise data served in near-real time to your Hadoop projects. Reduce costs by performing batch projects and archival of mainframe data in Hadoop. Integrated with IBM BigInsights, Cloudera, Hortonworks and MapR, vStorm Enterprise is the industry’s most secure software for a lightweight alternative to traditional ETL without staging, costly software development or MIPS charges. Only vStorm Enterprise has SSL encryption of data-in-motion. You can even keep your data on platform with zDoop, Hadoop for Linux on z. Supports: DB2, VSAM, SMF/RMF, IMS and more.
VoltDB is an in-memory NewSQL database that exceeds the performance needs of modern ‘fast and smart’ data-intensive applications in industries including advertising, telecom, financial services, energy and gaming. Unlike traditional RDBMSs and NoSQL offerings, VoltDB is a “no compromise” solution that delivers the performance of in-memory and the scalability of NoSQL with the transactional consistency of traditional RDBMSs. VoltDB achieves high-velocity data ingestion with real-time analytics on a purpose-built scalable architecture designed by our founder, MIT’s Mike Stonebraker. With VoltDB, organizations can act on data at its point of maximum value and narrow the “data-to-decision” gap from minutes to milliseconds.
X15 Software is a revolutionary large-scale machine and log data management company. Our flagship product provides a highly scalable, open and modern platform that combines search and analytic query capabilities. With best-in-class developer productivity and the lowest total cost of ownership, X15 Software is the new global standard for enterprise-wide machine data efforts.