Initializing Shared Resources With Clojure And Ruby

While writing ringleader, I needed to initialize a shared resource, an app instance, on-demand. Because the app initialization could take many seconds and more than one connection could request it during that time, I needed a way to coordinate the app startup between requests.

Clojure

My first attempt at ringleader was written in clojure. During this experiment, the friendly programmers in #clojure on freenode pointed me towards a neat trick using atoms and promises.

First, to demonstrate the problem, define a not-yet-initialized ref as a shared app “instance” and use a naive check-or-set to decide whether to initialize it:

(defn thread-id []
  (.getId (Thread/currentThread)))

; because stdout isn't synchronized:
(def logger (agent nil))
(defn log [& msgs]
  (send-off logger (fn [_] (apply println (cons ";" msgs)))))

; define the shared app
(def app (ref nil))

; naive check-or-set
(defn get-app []
  (let [tid (thread-id)]
    (if @app
      (do
        (log "thread " tid " got running app")
        @app)
      (do
        (log "thread " tid " starting app")
        (Thread/sleep (rand-int 500)) ; whole milliseconds; Thread/sleep takes a long
        (log "thread " tid " started")
        (dosync
          (ref-set
            app
            (fn [n]
              (str "app started in thread " tid " handled request " n))))))))

(defn make-request
  "start and call the shared application"
  [request-number]
  (log "request " request-number " retrieving app from " (thread-id))
  (let [instance (get-app)]
    (log (instance request-number))))

And issue several requests to it:

(->>
  (range 1 4)
  (map (fn [n] (future-call #(make-request n))))
  (map deref)
  doall) ; force the lazy seq so the futures also start outside a REPL
(Thread/sleep 600)
(make-request 4)

; request  3  retrieving app from  159
; thread  159  starting app
; request  1  retrieving app from  157
; thread  157  starting app
; request  2  retrieving app from  158
; thread  158  starting app
; thread  158  started
; app started in thread 158 handled request 2
; thread  157  started
; thread  159  started
; app started in thread 159 handled request 3
; app started in thread 157 handled request 1
; request  4  retrieving app from  20
; thread  20  got running app
; app started in thread 157 handled request 4

As expected, each of the three parallel requests starts its own app instance. Only the fourth request, issued after the wait, finds an app already running.

To do this correctly, there are three cases to coordinate. First, if the app isn’t running and no one else is starting it, start it. Second, if the app isn’t running but someone else is starting it, wait for them to finish. Lastly, if it’s already running, just return it.

To handle each of these situations, change the app to be a map holding a promise and an atom:

(def app {:started (promise)
          :starting (atom nil)})

The :started promise is the end goal: a delivered value containing the initialized app.

The :starting atom coordinates between threads so only one starts the app.

(defn start-app
  "simulate a slow startup and return the app"
  []
  (let [tid (thread-id)]
    (Thread/sleep (rand-int 500))
    (fn [n]
      (str "app started in thread " tid " handled request " n))))

(defn start-or-wait-for-app []
  (let [me (thread-id)]
    ; swap! returns the atom's new value, which is our id only if we set it
    (if (= me (swap! (:starting app) (fn [id] (or id me))))
      (do
        (log me " starting app")
        ; deliver the started app exactly once, from the winning thread
        (deliver (:started app) (start-app))
        (deref (:started app)))
      (do
        (log me " waiting")
        (deref (:started app))))))

(defn get-app []
  (if (realized? (:started app))
    (deref (:started app))
    (start-or-wait-for-app)))

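The trick is in the swap! call. The function (or id me) keeps the existing value if one is set and otherwise stores the calling thread’s id, so the first thread to get there “wins”. Because swap! returns the atom’s new value, each thread can compare the result against its own id to find out whether it won. A thread that wasn’t able to set the atom to its own thread id waits for the :started promise to be delivered instead.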

Issuing several requests in parallel, as well as a request after it should have been initialized, gives the expected result. Only one of the parallel requests initializes the app while the rest wait.

(->>
  (range 1 4)
  (map (fn [n] (future-call #(make-request n))))
  (map deref)
  doall) ; force the lazy seq so the futures also start outside a REPL
(Thread/sleep 600)
(make-request 4)

; request  3  retrieving app from  159
; request  1  retrieving app from  158
; request  2  retrieving app from  156
; 156  starting app
; 158  waiting
; 159  waiting
; app started in thread 156 handled request 3
; app started in thread 156 handled request 2
; app started in thread 156 handled request 1
; request  4  retrieving app from  20
; app started in thread 156 handled request 4

Ruby and Celluloid

The approach to synchronization in ruby, using the celluloid actor framework, proved to be fairly similar to the clojure version, even though the underlying models differ: clojure’s reference types (refs, atoms, promises) on one side and actors on the other.

First, a naive version using check-or-initialize, with the same pattern of three simultaneous calls and a fourth after waiting a second:

require "celluloid"

class App
  def initialize(instance)
    @instance = instance
  end

  def call(n)
    "response #{n} from app #{@instance}"
  end
end

class Server
  include Celluloid

  # The thread this actor uses for execution doesn't change, so pass the calling
  # thread id in so it can be logged instead.
  def get_app(thread_id)
    if @app
      puts "thread #{thread_id}: app started already"
    else
      puts "thread #{thread_id}: starting app"
      sleep rand
      puts "thread #{thread_id}: started app"
      @app = App.new thread_id
    end

    @app
  end
end

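server = Server.new

# three simultaneous requests
3.times.map do |n|
  Thread.new do
    puts server.get_app(Thread.current.object_id % 1000).call(n)
  end
end.map(&:join)

# the fourth request, from the main thread, after waiting a second
sleep 1
puts server.get_app("main").call(3)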

And the output:

$ ruby naive.rb
thread 860: starting app
thread 100: starting app
thread 920: starting app
thread 100: started app
response 2 from app 100
thread 920: started app
response 1 from app 920
thread 860: started app
response 0 from app 860
thread main: app started already
response 3 from app 860

As before, each of the three initial requests creates a new instance of the app. This may be surprising, since celluloid dictates that only one thread of execution is active in an actor at a time. The sleep call is what gets us: inside an actor, sleeping suspends the current call and lets the actor pick up the next one, so a second call runs the @app check while the first is still “starting”.

Fortunately, “one-active-thread-only” makes a lock easier to reason about. It only needs to be an instance variable since only one thing can change it at a time, and we can use celluloid’s signaling mechanism to let threads wait for notification:

class Server
  include Celluloid

  def get_app(thread_id)
    if @app
      puts "thread #{thread_id}: app started already"
    else
      if @starting
        puts "thread #{thread_id}: waiting for app to start"
        wait :started
      else
        @starting = true
        puts "thread #{thread_id}: starting app"
        sleep rand
        puts "thread #{thread_id}: started app"
        @app = App.new thread_id
        signal :started
      end
    end

    @app
  end
end

And indeed, this works:

$ ruby coordinated.rb
thread 620: starting app
thread 880: waiting for app to start
thread 660: waiting for app to start
thread 620: started app
response 2 from app 620
response 1 from app 620
response 0 from app 620
thread main: app started already
response 3 from app 620
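
For comparison, here is a minimal sketch of the same start-or-wait logic using only ruby's standard library. It is an illustration under assumptions rather than anything from ringleader: SharedApp, get, and starts are hypothetical names, and a Mutex plus a ConditionVariable stand in for celluloid's wait and signal.

```ruby
# Hypothetical stdlib-only variant; SharedApp and its method names are
# illustrative and not part of ringleader or celluloid.
class SharedApp
  attr_reader :starts

  def initialize(&starter)
    @starter = starter             # block that builds the app (may be slow)
    @lock = Mutex.new
    @started = ConditionVariable.new
    @starting = false
    @app = nil
    @starts = 0                    # how many times the starter block ran
  end

  def get
    @lock.synchronize do
      return @app if @app          # already running: just return it

      if @starting
        # Someone else is starting it: sleep until they broadcast.
        @started.wait(@lock) until @app
        return @app
      end

      @starting = true             # we won: claim the startup
      @starts += 1
    end

    app = @starter.call            # slow startup runs outside the lock

    @lock.synchronize do
      @app = app
      @started.broadcast           # wake every waiting thread
    end
    app
  end
end

shared = SharedApp.new do
  sleep 0.1                        # stand-in for a slow startup
  ->(n) { "response #{n} from shared app" }
end

# Three simultaneous requests, as in the examples above.
responses = 3.times.map { |n| Thread.new { shared.get.call(n) } }.map(&:value)
puts responses
puts "starter ran #{shared.starts} time(s)"
```

The shape mirrors the celluloid version: a flag claims the startup, and everyone else waits for a signal. One caveat of this sketch is that if the starter block raises, @starting stays set and waiting threads are never woken.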

You can view the code for this post in this gist. The clojure version of ringleader is here, and the final ruby version is here.

I’m sure there are better or simpler ways of handling this example in both clojure and ruby, and I’d love to hear about them.