Here's What I Think

Scalable design is within GRASP

Turberfield is just about ready for release now. I'll put a link to the demo at the end of this post. There are a few things I discovered along the way which I thought I'd write up for you.

I have been hacking away on this for nearly six months, pausing several times to refactor. Now that I'm finally happy with how it goes together, it occurs to me that I don't know why; after all that work I can't articulate what makes the design right.

So for safety, I look about for some way to validate all these ad-hoc decisions. I come across a book I've owned for many years, which has been very useful in the past, though I've not looked at it in a while. I find in there the concepts which match the code I've written. Those concepts might be useful to you too if you're writing a modern Python application.

For the purposes of this article, let's just say that Turberfield has to scale. Scale to more users, scale to larger deployments, scale to more developers.

Question is, how do I design something I'm confident will scale in these ways? The answer lies in a set of principles called GRASP.

Got to be a loose fit

How to scale to more developers? I'm going to skate over this today, since I cover these aspects elsewhere.

Suffice to say you should use Python namespace packages as the means to componentise your project. Entry points are the way to provide a plugin mechanism for customised behaviour.

What you're doing here is creating what GRASP calls Low Coupling between the components in your application. By agreeing to polymorphic behaviour in those components (GRASP's Polymorphism), you achieve Protected Variations in the deployment of your software.
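As a minimal sketch of the plugin side, the standard library's importlib.metadata can discover whatever entry points the installed packages have registered under a group. The helper name load_plugins and the group name "myapp.plugins" are hypothetical, chosen for illustration; they are not taken from Turberfield.

```python
from importlib.metadata import entry_points


def load_plugins(group):
    """Return {name: loaded object} for every entry point in `group`."""
    found = entry_points()
    if hasattr(found, "select"):
        # Python 3.10 and later offer a select() method
        selected = found.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name
        selected = found.get(group, [])
    return {ep.name: ep.load() for ep in selected}


# "myapp.plugins" is a hypothetical group name; with no plugins
# installed under it, the result is simply an empty dict.
plugins = load_plugins("myapp.plugins")
```

The application never imports a plugin by name. It only knows the group, which is what keeps the coupling low.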

OK, that's covered; let's skip on to the new stuff!

Too much too young

While I was pondering scalability, I became aware of a phenomenon I'd seen in some other projects which claim to be 'distributed' or 'designed to scale'.

These projects often base themselves around a message queue technology like RabbitMQ or ZeroMQ. Those technologies may be good, but they are not fairy dust.

If your software project cannot work without a message queue infrastructure, it's not designed to scale; it's irreducibly complex. In fact, to be truly scalable, software must first exist as a tiny thing, but contain within it a pattern for growth.

GRASP mentions a design pattern called Controller. It's not a new concept: it simply says that the business rules and logic of your program should be kept separate from GUI code, database access, and so on.

In Python, the simplest executable unit is a single module which runs from the command line. I organised Turberfield so that it consists of small programs which can either run by themselves or get invoked from a central point.

So when you launch the GUI (it's a Bottle web app), the code invokes one or more controllers, passing to them the command line arguments you gave it:

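import concurrent.futures
import bottle
import turberfield.machina.demo.simulation

# The controller runs in a child process; `app` and `args` come from the surrounding code.
with concurrent.futures.ProcessPoolExecutor() as executor:
    future = executor.submit(
        turberfield.machina.demo.simulation.main, args
    )
    bottle.run(app, host="localhost", port=8080)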

Or else, if you're debugging or working in batch mode, you can do away with the GUI and just run one controller on its own:

$ python -m turberfield.machina.demo.simulation -v
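To make the shape of such a controller concrete, here is a rough sketch of a module which works both ways. It is not Turberfield's code: the names parser, main and run, and the single -v option, are assumptions for illustration.

```python
import argparse
import logging


def parser():
    """Build the command line interface for this (hypothetical) controller."""
    p = argparse.ArgumentParser(description="A hypothetical controller.")
    p.add_argument("-v", "--verbose", action="store_true", help="Log more detail")
    return p


def main(args):
    """Business logic only: no GUI code and no web framework in here."""
    log = logging.getLogger("controller")
    log.setLevel(logging.DEBUG if args.verbose else logging.INFO)
    log.debug("Starting controller")
    return 0  # exit status


def run(argv=None):
    """Parse the arguments (sys.argv by default), then hand over to main."""
    return main(parser().parse_args(argv))


# A real module would finish with the usual guard:
#
#     if __name__ == "__main__":
#         import sys
#         sys.exit(run())
```

Because main is an ordinary top-level function which takes parsed arguments, it can be pickled and submitted to a process pool, or reached from the shell through the guard at the bottom.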

Inspectable

So if Turberfield consists of many collaborating controllers, how do they communicate?

Firstly, there must be a shared understanding of what information they are passing. A useful piece of Indirection here is the concept of events. Each controller generates (and may consume) a series of time-ordered events.

Secondly, the data format must be flexible, and friendly to human inspection. For this I selected RSON, which has a number of attractive features. It's essentially a superset of JSON, with the useful property that if you concatenate RSON files the result remains a valid sequence of objects.
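The concatenation property is easier to see in code. I won't guess at RSON's own API here; as a stand-in, this sketch uses only the standard library's json module to read a run of objects written one after another. The event fields ts and event are made up for the example.

```python
import json


def iter_objects(text):
    """Yield each top-level JSON object found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Two hypothetical event files, each written by a different controller
file_a = '{"ts": 1, "event": "door opened"}\n'
file_b = '{"ts": 2, "event": "lamp lit"}\n{"ts": 3, "event": "door closed"}\n'

# Concatenating the files still gives a readable sequence of events
events = sorted(iter_objects(file_a + file_b), key=lambda e: e["ts"])
```

Sorting on the timestamp restores the time order, whichever controller wrote which event.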

This deserves a picture, and so there's a diagram of it all below. The square boxes are our controllers (W is the web tier). One controller can broadcast to others by writing an RSON file of events.

[Diagram: the components of a scalable system.]

What do you mean, you need a break from the coroutine?

To promote High Cohesion within a controller, we separate the functionality out into classes, each with a specific behaviour. They are the numbered ellipses in the diagram above.

We distribute the desired behaviour of the controller among classes according to which class keeps the data necessary for that particular piece of logic. This is a GRASP principle too. It means our classes are Information Experts.

After working on Turberfield for a while I discovered that these experts must:

  • operate autonomously within an event loop
  • publish certain attributes to other experts within the same controller
  • publish certain attributes to other controllers via RSON
  • listen on a queue for messages from another expert
  • listen on a queue for messages from another controller
  • be configured with global settings like file paths and host names

This is quite a shopping list of features and it took me a while to work out how to do it in a Pythonic way.

Turberfield embraces asyncio wholeheartedly, to the degree that each Expert is a callable object and a coroutine. Each also has a formalised public interface to publish its attributes, and a static method for configuring options. I also defined a convention for initialisation which lets you connect the Expert to an arbitrary number of message queues.
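Here is one way those requirements might fit together in a single class. Treat it as a sketch under assumptions rather than Turberfield's actual Expert: the class name Counter, the public dictionary, the options signature and the use of None as a stop signal are all invented for illustration, and it uses the async def syntax of recent Python versions.

```python
import asyncio


class Counter:
    """Hypothetical Expert which counts the messages it receives."""

    @staticmethod
    def options(output_dir="."):
        # Global settings such as file paths and host names
        return {"output_dir": output_dir}

    def __init__(self, *queues, **options):
        self.queues = queues    # any number of inbound message queues
        self.options = options
        self.public = {}        # attributes published to other experts

    async def watch(self, queue):
        # Listen on one queue until the (assumed) stop signal arrives
        while True:
            msg = await queue.get()
            if msg is None:
                return
            self.public["count"] = self.public.get("count", 0) + 1

    async def __call__(self):
        # Calling the object gives a coroutine for the event loop to run
        await asyncio.gather(*(self.watch(q) for q in self.queues))
        return self.public


async def demo():
    q1, q2 = asyncio.Queue(), asyncio.Queue()
    expert = Counter(q1, q2, **Counter.options())
    for queue, msgs in ((q1, ["a", None]), (q2, ["b", "c", None])):
        for msg in msgs:
            await queue.put(msg)
    return await expert()
```

Running asyncio.run(demo()) drives the expert to completion and returns its published attributes.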

If you propagate this, then your children will speak REST

A web server is a very handy thing to have around. Whether or not your application is web-based, when you scale it out so that it's deployed to more than one server, HTTP becomes a credible option for host-to-host communications.

In Turberfield, controllers have the ability to publish JSON data which is served outward by the web tier. That web tier accepts REST API calls and feeds them back to the controller as messages over a POSIX named pipe.
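The return path over a named pipe can be sketched with nothing but the standard library. This is an illustration under assumptions, not Turberfield's implementation: the function names, the pipe name commands.fifo and the one-JSON-object-per-line framing are all hypothetical. It needs a POSIX system, since os.mkfifo is not available on Windows.

```python
import json
import os
import tempfile
import threading


def send(path, message):
    """Web tier side: write one JSON message per line to the pipe."""
    with open(path, "w") as pipe:  # blocks until a reader opens the pipe
        pipe.write(json.dumps(message) + "\n")


def receive(path):
    """Controller side: read messages until the writer closes the pipe."""
    with open(path) as pipe:  # blocks until a writer opens the pipe
        return [json.loads(line) for line in pipe]


def round_trip(message):
    """Send one message through a temporary named pipe and read it back."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "commands.fifo")  # hypothetical name
        os.mkfifo(path)  # POSIX only
        writer = threading.Thread(target=send, args=(path, message))
        writer.start()
        received = receive(path)
        writer.join()
    return received
```

In this sketch a thread stands in for the web tier; in a real deployment the writer and the reader would be separate processes sharing only the path to the pipe.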

GRASP calls this evolution of microservices Pure Fabrication. So a concrete example is clearly required. You can download turberfield-utils (GPL) and turberfield-machina (AGPL) to try it for yourself.

Happy

Turberfield still has a long way to go, but the current demo proves the concept. You run it like this:

$ pip install turberfield-machina
$ python -m turberfield.machina.demo.main

Point your browser to localhost:8080 and you'll see a simple web-based adventure game.

What I'm particularly happy about is that there were only nine lines of JavaScript to write. All the rest is HTML5, SVG, and scalable asynchronous Python 3. And that's the future of the distributed web.

Now if there's one thing worse than working to a deadline, it's not working to a deadline. So I'm bashing out the documentation for Turberfield to make it a suitable library for the next PyWeek. I guess that'll be in May.

Oh, and I'll be available for professional engagements from the end of April. You can get in touch via the Contact page.
