Wednesday, October 21, 2009

Cloud computing is not about providing a software architecture for scale... that is what Open Source does.

Recently I heard the comment "WOW, elastic cloud computing is great. I can take on a lot more stress with any load and in a few minutes stand up an instance to accommodate usage on demand and keep the app running without long term cost or even contractual commitments". While this person is right they did not know that by starting another instance you are likely just turning on another problem if the software application was never designed to be distributed.

Cloud computing (and the "elasticity" it can provide with Infrastructure as a Service IaaS) is not about providing a software architecture for scale. Let me repeat this again, cloud computing is not about providing a software architecture for scale. So what is it then you ask?

Cloud computing provides an on-demand infrastructure so that your well designed distributed enterprise software application can quickly scale based on the spikes and valleys of usage and interactions of your system (pay as you go for only what you need). IaaS is about hardware resourcing given to your software to reduce expenses but if your software is not designed to take advantage then the opposite will happen with CPU & Memory running away with a false sense of security.

The issue is often that the internal workings of a software system are designed (to coin the phrase) "cloud monolithic". This means that software is usually designed to execute on a single server with a database (often a cluster) and to scale it you just add more servers and put the clusters together. Over the last 3-5 years many *VERY* large cloud based services have emerged and they have open sourced the solutions for how they scaled.

It is important to understand the inner workings of:

1) a-synchronous processing
2) global caching
3) distributed and parallel processing

Without all three of these patterns working together you will actually compound your stress with load bottleneck for each blocking call inside of your software. Your safety net of cloud computing turns into the proverbial wet blanket faster than it ever did before.

Lets break each of these patterns out and how they apply and what solutions exists. In another post I will explain how these apply when dealing with a ridicules amount of information processing on a large scale with the time of process exponentially reduced (because of using the algorithms for map/reduce). I bring this up now because the way the map/reduce algorithms achieve this ability to handle and process so much information exponentially faster is realized from the software written to implement them which also use the technologies that are explained here.

1) A-Synchronous Processing - Ok well this is not really a new one and often there are too many solutions to choose from all with their own pros & cons ( you have to make this call yourself ). Queuing systems have been around for a long time and have numerous implementations in the marketplace. It is so numerous that often each language has it's own set of queue servers to choose from. This being said they are often NOT used correctly because #2 & #3 are not implemented also. I have seen many systems make use of an a-synchronous process which allows the bottleneck of a blocking synchronous call to more expediently return creating a perceived performance gain. The problem here is that you are just passing the problem for another process to either eat up unnecessary cycles or not utilize unused cycles on other parts of your infrastructure (ultimately requiring you to get more servers or now turn on more instances).

Creating a performant software application is about taking both synchronous and a-synchronous operations and making sure that they are utilizing information that has already "crunched" by other parts of the infrastructure [#2 global caching] and maximizes the hardware so the crunching happens on the parts of the infrastructure that currently has the least "crunching" occurring [#3 parallel distributed processing].

So now maybe you are getting the problem and solution so here is how to implement it.

Global Caching with Memcached "memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.".

What this means is two fold. Instead of your application querying the database for information it firsts checks the cache to see if that information is available. What makes this more powerful than using some static variable or custom solution is that memcached is a server that runs on EVERY machine that an instance of your application is running on. So, if server 3 pulls information from the database and adds it to the cache it is a "global cache" that the memcached server replicates for ALL instances/servers to make use of. This is extremely powerful because now every instance of your application is benefiting when all parts of the infrastructure are being used. In this scenario now you have un-compartmentalized "crunching" to no longer have to repeat some "crunch" of information to get to a result that another instance/server has already gotten to for their request/response.

Now this is a HUGE reduction to what often stresses a system but in of itself will not reduce the process to the degree that we are trying to get to because "at the end of the day" that "crunching" still has to occur. The crunching in the memcached implementation will still happen (hopefully a-synchronously once you find that it is not in your global cache and you have to "crunch" =8^0 ).

Now you need to crunch because your data is not memcached or perhaps you have to crunch for some other reason (that is what software does, right?) Just moving this to happen in the background off and onto another process provides no benefit within a multi-server environment.

A la "distributing the processing" and "executing it in parallel" which is where Gearman comes in "Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates."

Both memcached and Gearman are servers with a great following with multi-language client implementation support. They are written in C so that they will execute better than if the had to deal with an interpreter or virtual machine. They might be over kill but if you find yourself with bottlenecks I hope you think about your design and internal architecture of the system before you throw more hardware at your problem (especially now that this can be done with a few clicks to launch an instance).

Joe Stein

No comments:

Post a Comment