Practical Tutorial on Ruby Concurrency and Parallelism

Let’s clarify a common misconception among Ruby developers: Concurrency and parallelism are not synonymous (concurrent != parallel).

Concurrency in Ruby means two tasks can start, run, and finish in overlapping timeframes. It doesn’t guarantee they’ll run simultaneously (e.g., multiple threads on a single-core CPU). Conversely, parallelism means two tasks execute simultaneously (e.g., multiple threads on a multicore CPU).

The key takeaway: Concurrent threads/processes aren’t always parallel.

This tutorial is a practical exploration of techniques for concurrency and parallelism in Ruby.

For real-world examples, see our article on Ruby Interpreters and Runtimes.

Our Test Case

We’ll use a Mailer class with a CPU-intensive Fibonacci function (instead of sleep()) for a simple test case:

class Mailer

  def self.deliver(&block)
    mail = MailBuilder.new(&block).mail
    mail.send_mail
  end

  Mail = Struct.new(:from, :to, :subject, :body) do 
    def send_mail
      fib(30)
      puts "Email from: #{from}"
      puts "Email to  : #{to}"
      puts "Subject   : #{subject}"
      puts "Body      : #{body}"
    end

    def fib(n)
      n < 2 ? n : fib(n-1) + fib(n-2)
    end  
  end

  class MailBuilder
    def initialize(&block)
      @mail = Mail.new
      instance_eval(&block)
    end
    
    attr_reader :mail

    %w(from to subject body).each do |m|
      define_method(m) do |val|
        @mail.send("#{m}=", val)
      end
    end
  end
end

We can use this Mailer class to send mail:

Mailer.deliver do 
  from    "eki@eqbalq.com"
  to      "jill@example.com"
  subject "Threading and Forking"
  body    "Some content"
end

(Note: Source code is available here on GitHub.)

For comparison, let’s benchmark by invoking the mailer 100 times:

puts Benchmark.measure{
  100.times do |i|
    Mailer.deliver do 
      from    "eki_#{i}@eqbalq.com"
      to      "jill_#{i}@example.com"
      subject "Threading and Forking (#{i})"
      body    "Some content"
    end
  end
}

This yielded these results on a quad-core processor with MRI Ruby 2.0.0p353:

15.250000   0.020000  15.270000 ( 15.304447)

Multiple Processes vs. Multithreading

There’s no universal answer to choosing between multiple processes or multithreading in Ruby. This table outlines key factors:

| Processes | Threads |
| --- | --- |
| Uses more memory | Uses less memory |
| If the parent dies before its children have exited, children can become zombie processes | All threads die when the process dies (no chance of zombies) |
| More expensive for forked processes to switch context, since the OS needs to save and reload everything | Threads have considerably less overhead since they share address space and memory |
| Forked processes are given a new virtual memory space (process isolation) | Threads share the same memory, so you need to control and deal with concurrent memory issues |
| Requires inter-process communication | Can "communicate" via queues and shared memory |
| Slower to create and destroy | Faster to create and destroy |
| Easier to code and debug | Can be significantly more complex to code and debug |

Ruby solutions using multiple processes:

  • Resque: A Redis-backed library for background jobs, queuing, and processing.
  • Unicorn: An HTTP server for Rack apps optimized for fast clients and low-latency connections on Unix-like systems.

Ruby solutions using multithreading:

  • Sidekiq: A comprehensive background processing framework for Ruby, offering easy Rails integration and high performance.
  • Puma: A Ruby web server built for concurrency.
  • Thin: A fast and simple Ruby web server.

Multiple Processes

Before multithreading, let’s explore spawning multiple processes.

Ruby’s fork() system call creates a process “copy,” scheduled independently by the OS, enabling concurrency. (Note: fork() is POSIX-specific, unavailable on Windows.)
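As a minimal illustration of fork()'s semantics (a standalone sketch; the exit status 42 is arbitrary):

```ruby
# fork returns the child's PID in the parent and nil in the child.
child_pid = fork do
  # This block runs only in the child process, in its own copy
  # of the parent's memory.
  exit!(42)  # exit! skips at_exit handlers and sets the exit status
end

# Block until the child exits, collecting its status.
_, status = Process.wait2(child_pid)
puts "child #{child_pid} exited with status #{status.exitstatus}"
```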

Let’s run our test case with fork():

puts Benchmark.measure{
  100.times do |i|
    fork do     
      Mailer.deliver do 
        from    "eki_#{i}@eqbalq.com"
        to      "jill_#{i}@example.com"
        subject "Threading and Forking (#{i})"
        body    "Some content"
      end
    end
  end
  Process.waitall
}

(Process.waitall waits for all child processes, returning their statuses.)
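As a quick illustration of that return value (a hypothetical standalone sketch, with arbitrary exit statuses):

```ruby
# Spawn three children, each exiting with a distinct status code.
pids = 3.times.map { |i| fork { exit!(i) } }

# Process.waitall blocks until every child has exited and returns
# an array of [pid, Process::Status] pairs.
results = Process.waitall

statuses = results.map { |_pid, status| status.exitstatus }.sort
puts statuses.inspect
```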

Results on a quad-core processor with MRI Ruby 2.0.0p353:

0.000000   0.030000  27.000000 (  3.788106)

Roughly a 4x speedup with just a few code changes!

However, forking’s memory consumption is a major drawback, especially if your Ruby interpreter doesn’t support Copy-on-Write (CoW). Forking a 20MB app 100 times could consume 2GB of memory!

Multithreading has its own complexities, but fork() brings challenges of its own: shared file descriptors and semaphores, and the need for inter-process communication via pipes.
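A minimal sketch of that pipe-based communication (a hypothetical example; the message content is arbitrary):

```ruby
# IO.pipe gives a connected read/write pair that survives fork,
# so parent and child can talk across the process boundary.
reader, writer = IO.pipe

fork do
  reader.close          # the child only writes
  writer.puts "hello from pid #{Process.pid}"
  writer.close
end

writer.close            # the parent only reads
message = reader.read.chomp  # read until the child closes its end
reader.close
Process.wait            # reap the child to avoid a zombie
puts message
```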

Ruby Multithreading

Let’s try making the program faster using Ruby multithreading.

Multiple threads within a process have less overhead than processes due to shared memory.
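Shared memory cuts both ways: threads see each other’s state directly, but concurrent writes need coordination. A minimal sketch (a hypothetical counter, not the mailer) using Mutex:

```ruby
require "thread"

counter = 0
mutex   = Mutex.new

threads = 10.times.map do
  Thread.new do
    1_000.times do
      # Without the mutex, the read-increment-write below could
      # interleave between threads and lose updates (especially on
      # interpreters without a GIL).
      mutex.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```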

Let’s revisit our test case with Ruby’s Thread class:

threads = []

puts Benchmark.measure{
  100.times do |i|
    threads << Thread.new do     
      Mailer.deliver do 
        from    "eki_#{i}@eqbalq.com"
        to      "jill_#{i}@example.com"
        subject "Threading and Forking (#{i})"
        body    "Some content"
      end
    end
  end
  threads.map(&:join)
}

Results on a quad-core processor with MRI Ruby 2.0.0p353:

13.710000   0.040000  13.750000 ( 13.740204)

Disappointing! Why is it similar to synchronous execution?

The culprit is the infamous Global Interpreter Lock (GIL). The GIL in CRuby (MRI) hinders true threading.

A Global Interpreter Lock is a mutex that lets an interpreter execute only one thread at a time. GIL-based interpreters (like Ruby MRI and CPython) therefore can’t run threads in parallel, even on multi-core CPUs.

How can we leverage multithreading in Ruby with the GIL limitation?

Unfortunately, CRuby offers limited multithreading benefits.

However, Ruby concurrency (without parallelism) is useful for IO-bound tasks (e.g., network operations). Threads existed before multi-core systems for a reason.
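A quick sketch of why: sleep (like blocking IO) releases the GIL, so IO-bound work overlaps even in MRI. Here the sleeps stand in for hypothetical network calls:

```ruby
require "benchmark"

# Simulate five IO-bound calls (e.g. network requests) with sleep.
# Blocking operations release the GIL, so these overlap.
elapsed = Benchmark.realtime do
  5.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end

# Run sequentially this would take ~1.0s; with threads it's ~0.2s.
puts format("%.2fs", elapsed)
```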

Consider alternatives like JRuby or Rubinius if possible. They lack a GIL and support true parallel Ruby threading.

Threaded with JRuby

Here are the results of the same threaded code run on JRuby (instead of CRuby):

43.240000   0.140000  43.380000 (  5.655000)

That’s more like it!

However…

Threads Ain’t Free

Improved performance with threads doesn’t mean unlimited scalability. Threads consume resources, leading to limitations.

Let’s run the mailer 10,000 times instead of 100:

threads = []

puts Benchmark.measure{
  10_000.times do |i|
    threads << Thread.new do     
      Mailer.deliver do 
        from    "eki_#{i}@eqbalq.com"
        to      "jill_#{i}@example.com"
        subject "Threading and Forking (#{i})"
        body    "Some content"
      end
    end
  end
  threads.map(&:join)
}

An error occurred on OS X 10.8 after spawning around 2,000 threads:

can't create Thread: Resource temporarily unavailable (ThreadError)

Resource exhaustion is inevitable, limiting scalability.

Thread Pooling

Fortunately, thread pooling offers a solution.

A thread pool is a collection of pre-created, reusable threads for task execution. It’s beneficial for numerous short tasks, minimizing thread creation overhead.

A crucial parameter is the pool size (number of threads). Threads can be instantiated upfront or lazily (as needed).

When assigned a task, the pool uses an idle thread. If none are available (and the maximum is reached), it waits for a thread to become free.


Let’s use Queue (a thread-safe data structure) for a simple thread pool implementation:

require "./lib/mailer"
require "benchmark"
require "thread"

POOL_SIZE = 10

jobs = Queue.new

10_000.times{|i| jobs.push i}

workers = POOL_SIZE.times.map do
  Thread.new do
    begin      
      while x = jobs.pop(true)
        Mailer.deliver do 
          from    "eki_#{x}@eqbalq.com"
          to      "jill_#{x}@example.com"
          subject "Threading and Forking (#{x})"
          body    "Some content"
        end        
      end
    rescue ThreadError
    end
  end
end

workers.map(&:join)

We used Queue for thread-safe job queuing, avoiding complex implementations with a mutex.

We pushed mailer IDs to the queue and created ten worker threads.

Each worker thread pops tasks from the queue, executing them upon availability.

This solution is scalable but relatively complex.
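For contrast, here is roughly what the same worker loop looks like without Queue, guarding a plain array with a Mutex by hand (a hypothetical sketch; the jobs are plain integers rather than mailer calls):

```ruby
require "thread"

POOL_SIZE = 10
jobs      = (1..100).to_a   # a plain Array is NOT thread-safe by itself
jobs_lock = Mutex.new
done      = []
done_lock = Mutex.new

workers = POOL_SIZE.times.map do
  Thread.new do
    loop do
      # Every pop must be wrapped in a mutex ourselves --
      # exactly the bookkeeping Queue handles for us.
      job = jobs_lock.synchronize { jobs.pop }
      break if job.nil?        # empty array: nothing left to do
      done_lock.synchronize { done << job }
    end
  end
end

workers.each(&:join)
puts done.size
```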

Celluloid

The Ruby Gem ecosystem offers gems that simplify multithreading.

Celluloid is an excellent example. It provides a clean way to implement actor-based concurrency in Ruby. Celluloid allows building concurrent programs with concurrent objects as easily as sequential ones.

We’ll focus on the Pools feature, but explore Celluloid further. It helps build multithreaded Ruby programs without deadlocks and offers features like Futures and Promises.

Here’s our mailer using Celluloid:

require "./lib/mailer"
require "benchmark"
require "celluloid"

class MailWorker
  include Celluloid

  def send_email(id)
    Mailer.deliver do 
      from    "eki_#{id}@eqbalq.com"
      to      "jill_#{id}@example.com"
      subject "Threading and Forking (#{id})"
      body    "Some content"
    end       
  end
end

mailer_pool = MailWorker.pool(size: 10)

10_000.times do |i|
  mailer_pool.async.send_email(i)
end

Clean, simple, scalable, and robust.

Background Jobs

Another option, depending on your needs, is background jobs. Several Ruby Gems support background processing (queuing and processing jobs asynchronously). Popular choices include Sidekiq, Resque, Delayed Job, and Beanstalkd.

We’ll use Sidekiq with Redis (a key-value store).

First, install and run Redis locally:

brew install redis
redis-server /usr/local/etc/redis.conf

With Redis running, here’s our mailer (mail_worker.rb) using Sidekiq:

require_relative "../lib/mailer"
require "sidekiq"

class MailWorker
  include Sidekiq::Worker
  
  def perform(id)
    Mailer.deliver do 
      from    "eki_#{id}@eqbalq.com"
      to      "jill_#{id}@example.com"
      subject "Threading and Forking (#{id})"
      body    "Some content"
    end  
  end
end

Trigger Sidekiq with mail_worker.rb:

sidekiq -r ./mail_worker.rb

And from IRB:

⇒  irb
>> require_relative "mail_worker"
=> true
>> 100.times{|i| MailWorker.perform_async(i)}
2014-12-20T02:42:30Z 46549 TID-ouh10w8gw INFO: Sidekiq client with redis options {}
=> 100

Incredibly simple and easily scalable by adjusting the worker count.

Sucker Punch, an asynchronous processing library for Ruby, is another option. The implementation with Sucker Punch is similar: replace Sidekiq::Worker with SuckerPunch::Job and MailWorker.perform_async() with MailWorker.new.async.perform().

Conclusion

Achieving high concurrency in Ruby is simpler than you might think.

Forking multiplies processing power, while multithreading offers a lighter-weight approach but requires managing resources. Thread pools address resource limitations. Gems like Celluloid simplify multithreading with its Actor model.

Background processing provides an alternative for time-consuming tasks, with various libraries and services available. Popular options include database-backed job frameworks and message queues.

Forking, threading, and background processing are all valid options. Choose the best fit for your application, environment, and requirements. Hopefully, this tutorial has provided a helpful overview.

Licensed under CC BY-NC-SA 4.0