Tuesday, February 27, 2007

Multi-threading in Ruby

Having spent some actual time on a Java based multi-threaded project, I want to reflect.

My reflection is: Don't ever use Java's concurrency libraries in an agile project. At a high level this is for several reasons:

  1. People find multi-threading difficult to understand. If you have to learn a bunch of things on a project, topics like this generate fear, and get pushed onto the little group of "people who know".
  2. Multi-threaded design using locks, synchronization, futures, etc. is inherently unstable - e.g. a small change in requirements can invalidate a carefully put-together architecture. Essentially - if any code needs to know it is multi-threaded, you are going to have a hard time.

The trouble is, the concurrency libraries are so beguiling. They say (silently, in the back of your head) "hmmm - yes we know the synchronized keyword is wrong, so let's use barriers and countdown latches instead".

You're doomed, doomed I tell you.

What's the alternative? Well, the CSP (communicating sequential processes) model of computation is pretty clean. It defines a mechanism of communication that allows each process to be single threaded, and have NO side-effects except on channels that are injected into that process.
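To make that shape concrete, here is a minimal sketch in Ruby. It is an illustration under assumptions, not a real CSP library: Ruby's built-in Queue stands in for a channel (it is buffered, whereas a true CSP channel makes the sender wait for the receiver), and the Doubler class and the DONE marker are made-up names.

```ruby
require 'thread' # Queue lives here on old Rubies; harmless on newer ones

# Hypothetical end-of-stream marker, not part of any library.
DONE = :done

# A "process" in the CSP sense: single-threaded inside, and its only
# side-effects are on the two channels injected through the constructor.
class Doubler
  def initialize(input, output)
    @input = input
    @output = output
  end

  def run
    # pop blocks until a value is available on the input channel
    while (value = @input.pop) != DONE
      @output.push(value * 2)
    end
    @output.push(DONE) # pass the end-of-stream marker downstream
  end
end

numbers = Queue.new
doubled = Queue.new

worker = Thread.new { Doubler.new(numbers, doubled).run }

[1, 2, 3].each { |n| numbers.push(n) }
numbers.push(DONE)

results = []
while (value = doubled.pop) != DONE
  results << value
end
worker.join

puts results.inspect # prints [2, 4, 6]
```

Doubler never sees a lock or a Thread. The only code that knows threads exist is the wiring at the bottom.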

There are implementations for Java, and for .Net. The problem with these, of course, is that the existing (poor) concurrency primitives are still available in the language. So even if you use CSP you'll have to worry about someone not getting it and creating a deadlock in your code.

BUT - in ruby this isn't the case. No-one really uses the threading libraries in ruby because the VM isn't multi-threaded. So we have an opportunity to head off the varmints at the pass.

So - firstly - don't make the situation worse. Secondly, let's get some CSP goodness into the ruby world, and have the ruby VM support CSP in the next few releases. We can think about pi-calculus later.

7 Comments:

At Wednesday, February 28, 2007 11:56:00 am, Anonymous said...

Oh my, now rubists are even talking about weaknesses of the language as if they were strengths!

It looks like VB programmers saying "VB is so much simpler, because it doesn't have all this object-oriented stuff!"

 
At Wednesday, February 28, 2007 3:26:00 pm, Blogger Unknown said...

Hey Nick,

Interesting post this. I was involved in building an app recently that needed to be multi-threaded and ended up using the util.concurrent classes (although Alistair Jones will prolly tell you differently) and fell into the trap that you describe - that of thinking "Oooo, CountDownLatch, I want one of them!".

I have to say that it didn't really turn out that bad. I think in part because we were aggressively test-driving the app. We ended up always injecting a latch or exchanger or whatever in a constructor and then pretty much calling "run". Interestingly from a very quick look at the JCSP spec it seems they basically advocate the same thing, that is:

inject concurrency stuff into constructor.
...
call run

But don't externally mess about with the process's state while it is running. It's uncanny that it's almost exactly the design we ended up with.
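For readers following along in Ruby, a rough sketch of that inject-then-run shape might look like the following. Everything here is assumed for illustration: Ruby's standard library has no latch, so Latch is a small hand-rolled stand-in loosely modelled on Java's CountDownLatch, and Importer is a made-up worker.

```ruby
require 'thread'

# Hypothetical minimal latch built from Mutex and ConditionVariable.
class Latch
  def initialize(count)
    @count = count
    @mutex = Mutex.new
    @zero = ConditionVariable.new
  end

  def count_down
    @mutex.synchronize do
      @count -= 1 if @count > 0
      @zero.broadcast if @count.zero?
    end
  end

  # Blocks the caller until the count reaches zero.
  def wait
    @mutex.synchronize do
      @zero.wait(@mutex) until @count.zero?
    end
  end
end

# The worker gets its concurrency collaborator through the constructor
# and exposes only `run`; nothing touches its state while it is running.
class Importer
  attr_reader :imported

  def initialize(records, finished)
    @records = records
    @finished = finished
    @imported = []
  end

  def run
    @records.each { |record| @imported << record.upcase }
  ensure
    @finished.count_down # signal completion even if run raises
  end
end

finished = Latch.new(1)
importer = Importer.new(%w[a b], finished)
thread = Thread.new { importer.run }

finished.wait # a test can block here instead of sleeping
thread.join

puts importer.imported.inspect # prints ["A", "B"]
```

Because the latch is injected, a test can hand in its own and wait on it, which is what makes this style easy to test-drive.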

I should probably blog about this myself, but thought you might be interested.

 
At Wednesday, February 28, 2007 4:34:00 pm, Anonymous said...

You seem to be implying that multithreading is more of a problem for agile projects than non-agile ones. As one of the 'people who know' (from that project) ;) I don't think that is really the case. Multithreading is too hard in Java (and C# and most imperative languages frankly), project process notwithstanding. Much discipline is needed in any threaded codebase to keep the parts that need to worry about threads separate from the parts that don't.

 
At Thursday, March 01, 2007 1:52:00 am, Blogger Nick Drew said...

In response to anonymous:

I'm not sure I'm a rubist. Sophist, maybe, but not a rubist.

Anyway, it's much more like Cobol programmers saying "Cobol is much simpler because we can rely on being invoked in a batch process environment".

And they'd be right.

The analogy to VB would be like, say, treating everything like a variant, and then saying "Yes, it is object oriented because all our data structures are polymorphic".

So - if you're going to offer multi-threaded capability, try and raise the bar. Ruby is young enough that we can steer its direction.

 
At Thursday, March 01, 2007 1:57:00 am, Blogger Nick Drew said...

To mj:

I agree that statement is a little unclear AND generalist. But you capture the gist - Since the VM is cooperatively multi-threaded, you don't really get many efficiency gains from using the MT capability, but you still have to deal with all the complexity of managing multiple threads.

In my opinion, it's not just that Thread and Mutex classes are a waste, but that they should be removed, and replaced with a much better concurrency model.

 
At Thursday, March 01, 2007 2:03:00 am, Blogger Nick Drew said...

To Darren:

I concur. I'm not saying it's OK for other methodologies to use Thread. They'll still have problems when significant changes are required.

The agile spoke of the argument is related to the rate of change of the codebase. A well-run agile project moves and changes fast, and thus requires the concurrency model to support rapid mutation.

 
At Friday, April 27, 2007 8:30:00 am, Blogger Dave Cameron said...

I would LOVE to see a different concurrency implementation in Ruby.

But, I disagree about why the current concurrency libraries are a problem. Basically, standard strategies for concurrency management (I'll just grab this global mutex!) don't effectively encapsulate the concern anywhere. I think synchronized is actually a good idea because it ties the synchronization to a relevant object. At least in a well designed system. Still, there's room for improvement both from a performance and correctness angle.

I think James' comment supports the idea that it's encapsulation that's the problem: inject your concurrency dependencies, just like all your dependencies, and it becomes more manageable. The problem with unmanageable concurrency is that it's such hell to debug compared to other unmanageable code. Instead of a number coming out wrong, the process just locks up, and that gives you very little information to debug with.

Apparently there are problems with threading in ruby anyway: http://blog.cbcg.net/articles/2007/04/22/python-up-ruby-down-if-that-runtime-dont-work-then-its-bound-to-drizzown

My favourite discussion of the conceptual problems is on c2: http://c2.com/cgi/wiki?ThreadsAreComputationalTasks

The cleanest concurrency that I've actually used was in QNX. It's method-calls-as-blocking-message-passing and is described reasonably here: http://c2.com/cgi/wiki?SendReceiveReply
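A rough Ruby sketch of that send/receive/reply shape, to show the idea rather than QNX's actual API: the Request struct, the Server class, the send_message helper and the :shutdown marker are all hypothetical names, and a plain Queue stands in for the kernel's message passing.

```ruby
require 'thread'

# Hypothetical message: a payload plus a private queue for the reply.
Request = Struct.new(:payload, :reply_to)

class Server
  def initialize(inbox)
    @inbox = inbox
  end

  # Receive a message, handle it, reply; strictly one at a time.
  def run
    while (request = @inbox.pop) != :shutdown
      request.reply_to.push(request.payload.reverse)
    end
  end
end

# Reads like an ordinary method call to the caller, but the caller
# stays blocked until the server has replied.
def send_message(inbox, payload)
  reply_to = Queue.new
  inbox.push(Request.new(payload, reply_to))
  reply_to.pop
end

inbox = Queue.new
server = Thread.new { Server.new(inbox).run }

answer = send_message(inbox, "hello")

inbox.push(:shutdown)
server.join

puts answer # prints olleh
```

The sender can't race ahead of the server, because it has nothing to do until the reply arrives. That is a large part of why the model feels so clean.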

I think Erlang reduces to something similar, on some level.

So, what's your preferred model for Ruby? I think that's discussion for the next beer night.

 
