Jesse Robbins of Opscode and O’Reilly Radar shared lessons learned from his experience in cloud computing and scaling. At Amazon, Jesse held the title “Master of Disaster” and was responsible for website availability for every Amazon.com domain. Today, he’s the CEO of Opscode, the company behind the Chef infrastructure automation framework. Jesse is also the co-chair of the Velocity Conference, which focuses on large-scale, high-traffic websites. The lessons he has learned in availability, scalability, and elasticity are valuable to every IT manager who plans to move operations into cloud-based services.
Jesse focused on Infrastructure as a Service (IaaS). Since 1999, infrastructure has been completely revolutionized, to the point where we can deploy and redeploy 10,000 nodes in under five minutes. As a result, infrastructure is easier to get but harder to manage: instead of fighting to install a single server, system administrators now struggle to manage thousands. In today’s infrastructure, developers are crucial to operations. There are many systems that need management, yet few tools to help administrators.
The big cloud providers (Amazon, Google, and Microsoft) have all built their own tools. These tools are proprietary and are not shared with anyone. The other cloud operators are unprepared, poorly equipped, and too inexperienced to take the steps needed for cloud-scale management.
Jesse offered a few pieces of advice for cloud computing:
- Sysadmins need to say “Yes” rather than “No” and try to make requests happen
- Executives need to realize that the cloud is not a magic unicorn, that the benefits come from efficiency rather than raw capex, and that there are real cultural implications across the board. Executive buy-in is essential to success
- The secret of cloud computing is that it requires a huge amount of configuration up front
- If you are going to use cloud computing, you need to work within the constraints of the cloud provider
Jesse noted an important shift: developers are writing more operational code themselves instead of waiting for system administrators. The operations and development worlds are melding.
Jesse touched on load provisioning, which he described as cutting back provisioning after a certain number of hits have passed. Capacity planning is just as important: without it, your website (or cloud service) is at risk. Capacity planning requires math, planning, and a lot of hard work.
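To give a feel for the math involved, here is a minimal sketch of a capacity estimate. It is not from the talk: the function name, the 50% utilization target, the single spare server, and the traffic figures are all illustrative assumptions.

```python
import math


def servers_needed(peak_rps, per_server_rps, target_utilization=0.5, spares=1):
    """Estimate how many servers are needed to handle peak traffic.

    All parameters are hypothetical inputs you would measure yourself:
    peak_rps           -- expected peak requests per second
    per_server_rps     -- requests per second one server sustains in load tests
    target_utilization -- fraction of that capacity you are willing to use
    spares             -- extra servers kept to absorb failures
    """
    if peak_rps < 0 or per_server_rps <= 0:
        raise ValueError("traffic must be >= 0 and per-server capacity > 0")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Capacity each server contributes once headroom is reserved.
    usable_rps = per_server_rps * target_utilization
    # Round up: a fraction of a server still means one more machine.
    return math.ceil(peak_rps / usable_rps) + spares


# Assumed example: 12,000 req/s at peak, 500 req/s per server in load tests.
print(servers_needed(12000, 500))  # 12000 / 250 = 48 servers, plus 1 spare = 49
```

Real capacity planning also has to account for growth, traffic spikes, and dependencies such as databases, so a calculation like this is only a starting point.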
Lastly, Jesse introduced NoSQL, a new way to look at databases. Every distributed database is subject to the CAP theorem, which describes a trade-off between Consistency, Availability, and Partition Tolerance. It is possible to focus on two, but always to the detriment of the third. Traditional relational databases such as MySQL focus on Consistency first and Availability second. Web applications, however, need Availability most and Partition Tolerance second. NoSQL includes many tools that balance CAP differently, including CouchDB, Cassandra, Redis, and MongoDB.
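One common way such systems let you tune this trade-off is quorum replication, where each value is stored on N replicas, a write waits for W acknowledgements, and a read consults R replicas. The sketch below is a toy model of that rule, not something presented in the talk and not the API of any of the databases named above; the function names are hypothetical.

```python
def quorums_overlap(n, r, w):
    """Return True if every read quorum shares a replica with every write quorum.

    With n replicas, a read that contacts r of them is guaranteed to see
    the latest acknowledged write (acknowledged by w replicas) only when
    r + w > n.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n, r, w):
    """Number of unreachable replicas that reads and writes can each survive."""
    return {"reads": n - r, "writes": n - w}


# Favoring consistency: quorum reads and writes on 3 replicas.
print(quorums_overlap(3, 2, 2), tolerated_failures(3, 2, 2))
# Favoring availability: any single replica may answer, so reads can be stale.
print(quorums_overlap(3, 1, 1), tolerated_failures(3, 1, 1))
```

Lowering R and W keeps the service answering when more replicas are down, at the cost of possibly stale reads, which is the kind of availability-first choice the talk attributes to web applications.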
Check out Opscode here, and Jesse @ O’Reilly Radar here.





