Retrofitting the Reloaded pattern into Clojure projects

2013-09-07 clojure

Stuart Sierra has done a great job with clojure.tools.namespace and the reloaded Leiningen template. If you haven’t heard about this before, please have a look at c.t.n’s readme and watch this presentation.

I have retrofitted this pattern into two rather large Clojure projects (20,000 and 5,000 lines) with several modules, and here are some of my findings.

Removing global state

The first step is to find all resources that need to be “lifecycled.” Typical examples are Jetty servers, database/message bus clients, etc. It’s common that these resources are in a (defonce server (atom ...)) form. I tend to grep for defonce and atom to find these items.

Please note that not all defonces/atoms need to be lifecycled. Some of them can be safely “dropped” when you un/reload the namespace. In this case, you can leave them as (def thing (atom ...)). The rule of thumb is to lifecycle the ones that create system-wide resources, like network ports, message queue channels, etc.

After you find your candidates, you should replace them with defrecords implementing some kind of LifeCycle protocol. Here’s a version I use:

(ns lifecycle)

(defprotocol LifeCycle
  (start [this])
  (stop [this]))

(defn start-system [system]
  (doseq [s (->> system :order (map system))]
    (start s)))

(defn stop-system [system]
  (doseq [s (->> system :order (map system) reverse)]
    (stop s)))

The records themselves hold their state (typically in an atom) which gets updated by the start and stop functions. Here’s an example:

(defrecord JettyServer [cfg state]
  LifeCycle
  (start [_]
    (reset! state (jetty/run-jetty my-site {:port (:port cfg) :join? false})))
  (stop [_]
    (when @state
      (.stop @state)
      (reset! state nil))))

(defn create-jetty [cfg]
  (->JettyServer cfg (atom nil)))
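To see the ordering behaviour without starting a real server, here is a minimal sketch. It repeats the protocol and helpers from above so it runs on its own. FakeResource, run-demo and the log atom are hypothetical names made up for this illustration.

```clojure
(ns lifecycle-demo)

;; Same protocol and helpers as above, repeated so the sketch is self-contained.
(defprotocol LifeCycle
  (start [this])
  (stop [this]))

(defn start-system [system]
  (doseq [s (->> system :order (map system))]
    (start s)))

(defn stop-system [system]
  (doseq [s (->> system :order (map system) reverse)]
    (stop s)))

;; Hypothetical stand-in for a real resource: it only records what happened.
(defrecord FakeResource [label log state]
  LifeCycle
  (start [_]
    (swap! log conj [:start label])
    (reset! state :running))
  (stop [_]
    (when @state
      (swap! log conj [:stop label])
      (reset! state nil))))

(defn run-demo []
  (let [log    (atom [])
        system {:rabbit (->FakeResource :rabbit log (atom nil))
                :jetty  (->FakeResource :jetty log (atom nil))
                :order  [:rabbit :jetty]}]
    (start-system system)
    (stop-system system)
    @log))

(run-demo)
;; => [[:start :rabbit] [:start :jetty] [:stop :jetty] [:stop :rabbit]]
```

Resources start in :order and stop in reverse, so anything that depends on an earlier resource is shut down before the thing it depends on.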

Turning your global state into lifecycle records is the most intrusive part of this whole operation; expect to touch quite a few files.

Creating the system

After you create these records, you need to create (and start) your system (the collection of the lifecycled records). This will most likely be done in two places: in your app’s “main” function and in the user namespace (more on REPL usage below).

(defn create-system [cfg]
  {:jetty (create-jetty cfg)
   :rabbit (create-rabbitmq-channel cfg)
   :order [:rabbit :jetty]})

(defn app-main []
  (let [cfg (config/create-config)
        system (create-system cfg)]
    (start-system system)
    ;; (.await) / (.join) / (deref) etc...
    (stop-system system)))
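One way to fill in the blocking placeholder is sketched below, assuming you want the stop step to run even if startup or the wait throws. The run-system helper and the release promise are hypothetical names, not part of the reloaded template.

```clojure
(defn run-system
  "Calls start!, blocks until the release promise is delivered,
  and always calls stop! afterwards (even if start! throws)."
  [start! stop! release]
  (try
    (start!)
    @release
    (finally
      (stop!))))

;; Demo with fake start/stop functions instead of a real system.
(let [events  (atom [])
      release (promise)]
  (deliver release :done) ; pretend the shutdown signal already arrived
  (run-system #(swap! events conj :started)
              #(swap! events conj :stopped)
              release)
  @events)
;; => [:started :stopped]
```

In app-main this would be called as (run-system #(start-system system) #(stop-system system) release). Because the stop functions shown earlier check their state first, stopping a half-started system is safe.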

Simple, huh?

Dealing with omnipresent / implicit databases

One thing you’ll find while removing these atoms is all the places in your code where a database / connection is “assumed”. This leads to quite brittle code, which is also hard to test since you have to set everything up in the correct order. Sometimes the database library itself holds a global connection; in other cases lots of your own code reads the global atom you just removed!

When retrofitting this pattern into existing codebases, having the system passed around everywhere can be a big change. Fortunately you can cheat a little here, and still get all the REPL benefits. A compromise I’ve settled on is this: after the resource has been lifecycled in a record as described above, I put a dynamic var where the atom used to be. Then the start/stop functions do an alter-var-root on this var, and test fixtures can (safely) bind it. This doesn’t solve the fundamental problem of implied resources, but it does let me move on and get the REPL environment I want (without terrifying my co-workers with a monumental pull request).
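A minimal sketch of that compromise follows. The *db* var, the Database record, current-db-url and with-test-db are hypothetical names, and a plain map stands in for a real connection.

```clojure
(ns myapp.db-demo)

(defprotocol LifeCycle
  (start [this])
  (stop [this]))

;; Hypothetical dynamic var sitting where (defonce db (atom ...)) used to be.
(def ^:dynamic *db* nil)

(defrecord Database [cfg]
  LifeCycle
  (start [_]
    ;; A real implementation would open a connection here;
    ;; this sketch just stores a map as the var's root value.
    (alter-var-root #'*db* (constantly {:url (:url cfg)})))
  (stop [_]
    (alter-var-root #'*db* (constantly nil))))

;; Legacy code keeps reading the implicit resource, unchanged.
(defn current-db-url []
  (:url *db*))

;; Fixture-shaped function: rebinds the var only for the duration of f.
(defn with-test-db [f]
  (binding [*db* {:url "jdbc:test"}]
    (f)))
```

With clojure.test, with-test-db can be registered through (use-fixtures :each with-test-db). The binding is thread-local and is undone when the fixture returns, so the root value set by start is left alone.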

Removing class files from the jar

If you have a :main (and :aot) key in your project.clj you might have noticed that you have quite a few .class files in your jar. This is usually not a big deal, but it causes problems for clojure.tools.namespace since it can’t unload these namespaces correctly.

One method to minimize the class files in your jar is a namespace containing your new entry point:

(ns app
  (:gen-class))

(defn -main [& args]
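  ;; Load the real entry point at runtime so it is not AOT-compiled with this namespace.
  (require 'myapp.core)
  ;; resolve instead of eval: eval would try to evaluate the args list as a function call.
  (apply (resolve 'myapp.core/old-main) args))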

Note that this namespace doesn’t require any other namespace in the ns macro. This means you’ll only get class files for this namespace in your jar.

Dealing with dependent services

Now, your application probably consists of more than one service, so you’ll have to apply the steps described above to all of them. Then, in order to maximize your REPL productivity, you want to include and control all the services your current project interacts with. This is only done in the :dev profile, since you wouldn’t do this in the “real” entry point of your service.

To make this work you need two things: a Leiningen dependency on each of these services (under the :dev profile) and soft links to their folders in the current project’s checkouts folder. The reason for the dependency is that we want to pull in all of the sub-projects’ dependencies (in the REPL), and checkouts is where we will do our edits.

This means that you are probably going to need a local Nexus / Archiva / Clojars, so that the service artifacts can be resolved as dependencies. Then have your CI system do a lein deploy after each successful build.
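As a rough sketch, the :dev profile described above could look something like this. Every project name and coordinate (myapp/frontend, myapp/billing, myapp/search) is hypothetical, and the version numbers are only illustrative.

```clojure
;; project.clj (sketch; all coordinates below are hypothetical)
(defproject myapp/frontend "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]]
  :profiles
  {:dev {:source-paths ["dev"]
         :dependencies [[org.clojure/tools.namespace "0.2.4"]
                        ;; Sibling services, pulled in only for REPL work.
                        ;; These must be resolvable from your local repository.
                        [myapp/billing "0.1.0-SNAPSHOT"]
                        [myapp/search "0.1.0-SNAPSHOT"]]}})

;; Expected layout on disk (the soft links are created by hand):
;;   checkouts/billing -> soft link to the billing project root
;;   checkouts/search  -> soft link to the search project root
```

The dependency entries bring each service's transitive dependencies onto the REPL classpath, while the checkouts links make Leiningen use the live source of those services instead of the deployed jars.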

user.clj

The final piece is to add the reloaded template’s user.clj to your project. Simply copy reloaded’s user.clj into dev/user.clj and make some modifications to it. You want to require the namespace with your create-system function, and do a (alter-var-root #'system (fn [_] (system/create-system cfg))) in the init function, where cfg is your config map. Then add the (lifecycle/start-system system) in start (and the equivalent for stop).

That takes care of managing the lifecycle of the current service. If you are dependent on others (as described above) you should create and start/stop them here as well. In this case change the user/system to a map with a key / value for each of the sub-systems you have.
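Put together, a modified dev/user.clj for a single service might look roughly like the sketch below. It assumes tools.namespace is on the :dev classpath, and that myapp.system and myapp.config are the (hypothetical) namespaces holding create-system and create-config. Treat it as an outline rather than a copy of the template.

```clojure
;; dev/user.clj (sketch; myapp.* namespaces are hypothetical)
(ns user
  (:require [clojure.tools.namespace.repl :refer [refresh]]
            [lifecycle]
            [myapp.config :as config]
            [myapp.system :as system]))

;; Holds the current system; nil until init has been called.
(def system nil)

(defn init
  "Creates a fresh system without starting it."
  []
  (alter-var-root #'system
                  (constantly (system/create-system (config/create-config)))))

(defn start []
  (lifecycle/start-system system))

(defn stop []
  (when system
    (lifecycle/stop-system system)))

(defn go
  "Creates and starts the system."
  []
  (init)
  (start))

(defn reset
  "Stops the system, reloads changed namespaces, then starts again."
  []
  (stop)
  (refresh :after 'user/go))
```

With several services, system would instead hold a map with one entry per sub-system, and start and stop would loop over its values.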

Finally add :repl-options {:init (user/go)} to project.clj (once again under :dev) to launch the entire system when you “jack-in”.

Your new workflow

For maximum flexibility I tend to always “jack-into” the project at the “top” of the hierarchy. This means that I will have control over all other services from the REPL, and I can work on any of them without ever bouncing the REPL. I’ve found this to improve my productivity and remove a lot of the annoyance of slow startup times, so it’s definitely worth the effort.

Good luck refactoring!