Thursday, December 31, 2009

Linda, Tuples, Rinda, DRb & Parallel Processing For Distributed Computing

When building enterprise solutions, you often end up orchestrating your data and processes when really you just need to distribute your load. Traditionally this has been solved using message queues, the occasional multi-threaded server (filled with pools and queues), and so on. More recently, solutions like Hadoop, Gearman, et al. have sprung up, but another way to solve this problem is with your own solution implementing Linda and tuple spaces.

There is a lot of open source and commercial software that can help with this paradigm, but when you are building your own software you sometimes just need straight-up inter-process communication (without having to build your own socket-based messaging platform) so that your software can literally pass objects off to another process in your architecture and parallelize the "crunching" of your data accordingly.

Now, when this inter-process communication is available across the network and is even self-aware (meaning that clients can find the server on their own and register to get the data they need), we really have a powerful solution... but I digress.

"In computer science, Linda is a model of coordination and communication among several parallel processes operating upon objects stored in and retrieved from shared, virtual, associative memory. This model is implemented as a "coordination language" in which several primitives operating on ordered sequence of typed data objects, "tuples," are added to a sequential language, such as C, and a logically global associative memory, called a tuplespace, in which processes store and retrieve tuples." http://en.wikipedia.org/wiki/Linda_%28coordination_language%29

Ok, Ok, enough of the esoteric academic theory... let me introduce you to Rinda.

Rinda is the Ruby implementation of Linda, and it is a built-in Ruby library (it sits on top of DRb, which is Distributed Ruby). Yes, this comes out of the proverbial box with Ruby 1.8.1 and greater.
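To make the Linda primitives a little more concrete, here is a minimal sketch of a local Rinda tuple space. The tuple shape `[:job, id, payload]` is just something I made up for illustration; `write`, `read`, `take` and `read_all` are roughly Rinda's counterparts to the store and retrieve operations from the quote above.

```ruby
require 'rinda/tuplespace'

# A local, in-process tuple space (no network involved yet).
space = Rinda::TupleSpace.new

# Store two tuples. The [:job, id, payload] layout is a made-up convention.
space.write([:job, 1, "crunch me"])
space.write([:job, 2, "crunch me too"])

# Copy a matching tuple without removing it. nil acts as a wildcard.
peeked = space.read([:job, 1, nil])

# Remove and return a matching tuple. A class in the template matches by type.
taken = space.take([:job, Integer, String])

# Whatever was not taken is still sitting in the space.
remaining = space.read_all([:job, nil, nil])

puts "peeked: #{peeked.inspect}"
puts "taken: #{taken.inspect}"
puts "still waiting: #{remaining.size}"
```

Note that `read` and `take` block until a matching tuple shows up, which is what lets idle workers simply wait on the space for something to do.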

What Rinda elegantly gives you on top of DRb is a RingServer, which is basically a service that manages the tuple spaces and lets clients auto-magically find the server, providing you with all of the inter-process (and network) communication with your tuple spaces.
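As a rough sketch of the network side, the snippet below shares one tuple space over DRb and has a worker pull jobs from it. To keep it short I left the RingServer discovery out and hand the URI over directly. The worker is a thread standing in for what would normally be a separate process or machine, and the `[:job, id, n]` and `[:result, id, value]` tuples are my own invention for the example.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'

# Server side: expose a tuple space over DRb.
# Port 0 lets the OS pick a free port; a real setup would use a known URI
# (or a RingServer, so that clients can discover it instead).
DRb.start_service('druby://127.0.0.1:0', Rinda::TupleSpace.new)
uri = DRb.uri

# Client side: connect to the shared space by URI.
space = Rinda::TupleSpaceProxy.new(DRbObject.new_with_uri(uri))

# Worker (a thread here, a separate process in practice): take a job,
# crunch it, and write the result back into the space.
worker = Thread.new do
  2.times do
    _, id, n = space.take([:job, Integer, Integer])
    space.write([:result, id, n * n])
  end
end

# Producer: drop jobs into the space, then collect results by job id.
space.write([:job, 1, 3])
space.write([:job, 2, 4])
results = [1, 2].map { |id| space.take([:result, id, nil]).last }

worker.join
DRb.stop_service
puts results.inspect
```

The producer never talks to the worker directly. Both only know the tuple space, so adding more workers is just a matter of starting more processes that `take` from the same space.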

Without further ado, I would like to send you to http://segment7.net/projects/ruby/drb/rinda/ringserver.html for your first look at Rinda. I found it especially useful, and within 15 minutes I had read it, understood it and implemented it in my software.

Now, if you do not know Ruby, this might be a good reason to learn it.

Or, if you are like me and do not care what language you use [just using the language to implement solutions for the problems at hand], you can check out some other Linda implementations. I have not used any of these yet, but I am sure I will.

# Linda for C++
# Linda for C
# Erlinda (for Erlang)
# Linda for Java
# Linda for Prolog
# PyLinda (for Python)
# Rinda (for Ruby)
# Linda for Scala (on top of Scala Actors)

As cloud computing continues to evolve, Linda and its various implementations could increasingly become the way software frameworks are built, as multi-threading is a dead end (better than a deadlock) when trying to parallelize processing.

Sometimes all you need are chopsticks to catch a fly =8^)

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
*/