Testing asynchronous workers

Testing is an important part of any infrastructure. Knowing that your code keeps working the way it should from one iteration to the next is critical for success, and it makes developers happier too. Writing tests can be a daunting task, especially for systems with complex (often asynchronous) interactions, or where test coverage is lower than it should be and you are playing catch-up.

One of the largest challenges I’ve encountered when dealing with ‘workers’ is testing end-to-end functionality. As asynchronous processes, they are disconnected from the rest of the system and often rely on external APIs to get and send data. This means they are not necessarily operating within the framework of your app, so you can’t always test them the same way you would test other components. As a result, testing API controllers may be easy using RSpec, but testing external components that use that API can be a little more challenging.

(N.B.: Since I am doing a lot of Ruby programming right now, this is a bit Ruby-based. That doesn’t mean there aren’t similar libraries that could be used with your language of choice.)

Separate logic into libraries

Moving the core logic out of your workers and into a library makes it easier to test the functionality of your worker without actually having to run it. Writing spec tests (or other tests) for workers that need to run is more tedious: often you have to monkey-patch or otherwise mock your worker’s connection to the queue and to any other services it talks to, such as proxies or directory services.

Libraries, on the other hand, are by nature self-contained, independent packages, which makes them easier to control and less dependent on external services. And when they do depend on external services, that dependence tends to be limited in scope to a single library or function, which translates to less plumbing you have to fake at the same time.
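To make the split concrete, here is a minimal sketch of what it could look like. Everything in it is made up for illustration: OrderSummary, OrderWorker and the payload format are hypothetical names, not code from our system, and the "queue" is just anything that responds to pop.

```ruby
require 'json'

# Hypothetical library: all of the real logic lives here.
# It knows nothing about queues, sockets or job servers.
module OrderSummary
  # JSON string in, plain Hash out.
  def self.build(json)
    order = JSON.parse(json)
    total = order.fetch('items').sum do |item|
      item.fetch('price') * item.fetch('qty')
    end
    { 'id' => order.fetch('id'), 'total' => total }
  end
end

# Hypothetical worker shell: the only code that touches the queue.
# In a real worker the queue would be a job-server client; for this
# sketch any object with a #pop method will do.
class OrderWorker
  def initialize(queue)
    @queue = queue
  end

  # Take one job off the queue and hand it to the library.
  def run_once
    OrderSummary.build(@queue.pop)
  end
end

payload = '{"id": 7, "items": [{"price": 5, "qty": 2}, {"price": 1, "qty": 3}]}'
summary = OrderSummary.build(payload)   # no worker, no queue needed
```

The spec for OrderSummary never has to start a worker or fake a queue connection, and the worker itself is thin enough that there is very little left in it to get wrong.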

Feed your API test output into your worker as input

One of the more helpful libraries for testing I’ve come across is FakeWeb. FakeWeb basically monkey-patches Ruby’s built-in Net::HTTP library so that requests to registered URLs skip opening network sockets. Instead, it provides a response you supply for a given URL: a string, or the contents of a file or any other IO object. This allows you to test your HTTP-dependent components in isolation with repeatable, dependable results.

Unfortunately, my first thought was that I was going to have to maintain a bunch of text files with valid API responses. After a bit of thinking, I realized that I could just store the output from the API spec tests in a variety of files, based on the fixtures being used, and use those files as the input for FakeWeb. Then I wrote a class that simply opens a file based on the URL for FakeWeb to use, so the faked responses stay consistent with the API’s output (as validated by our spec tests). This way, as the API is tested and validated, its output is used to feed into the workers that depend on the API. Since the library for connecting to our job queue system (Gearman) already has test coverage and can be tested independently, I only need to validate that the worker handles data fed from our API properly, and things go much more smoothly.
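A rough sketch of the idea follows. This is not the class I actually wrote: FixtureStore, its method names and the host/path directory layout are assumptions made for the example, and it ignores query strings entirely. It only uses Ruby’s standard library, so the FakeWeb call is shown as a comment.

```ruby
require 'uri'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper that maps an API URL to a file on disk.
class FixtureStore
  def initialize(root)
    @root = root
  end

  # "http://api.example.com/users/1.json"
  #   -> "<root>/api.example.com/users/1.json"
  # (Query strings are ignored in this sketch.)
  def path_for(url)
    uri = URI.parse(url)
    File.join(@root, uri.host, uri.path)
  end

  # Called from the API spec run: save the response body that the
  # spec has just validated.
  def record(url, body)
    path = path_for(url)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, body)
    path
  end

  # Called from the worker test: read back what the API spec produced.
  def body_for(url)
    File.read(path_for(url))
  end
end

store = FixtureStore.new(Dir.mktmpdir)
url   = 'http://api.example.com/users/1.json'

# Step 1, in the API spec: store the validated output.
store.record(url, '{"id": 1, "name": "alice"}')

# Step 2, in the worker test, with the fakeweb gem loaded, the stored
# body could be registered for that URL along these lines:
#   FakeWeb.register_uri(:get, url, :body => store.body_for(url))
```

The nice property is that the fixture files are never edited by hand. If the API’s output changes, the next run of the API specs rewrites them, and the worker tests pick up the new format automatically.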