JBrisbin.com

Eventing Data with RabbitMQ and Riak

20 Apr

There's been lots of interest in my new Riak exchange type for RabbitMQ and its cousin, the Riak postcommit hook for RabbitMQ. They're two sides of the same coin but can be used independently.

I've had some questions about whether they will do this, that, or the other, so I thought a blog post was in order to go into a little more detail about how to use these utilities.

Installation instructions are in their respective READMEs (here and here), so I won't repeat that information in this post.

RabbitMQ Custom Exchange

The purpose of the RabbitMQ Riak custom exchange is to send messages to a Riak server. I'm currently working on expanding its functionality to be much more robust, but for the time being, think of the "x-riak" exchange as being simply a topic exchange that logs every message sent to it into Riak before routing the message with the normal topic routing machinery.
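As a rough mental model of that store-then-route behaviour, here is a minimal Python sketch. It is not the plugin's code: the dict standing in for Riak, the bindings list, and both function names are made up for illustration, and the topic matcher is a simple reimplementation of the usual AMQP rules ("*" matches exactly one word, "#" matches zero or more).

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # Try letting '#' swallow 0, 1, 2, ... of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def store_then_route(store, bindings, exchange, routing_key, body):
    """Illustrative only: log the message first, then route it by topic."""
    # Step 1: store the message (here a dict plays the role of Riak).
    store[(exchange, routing_key)] = body
    # Step 2: ordinary topic routing over (pattern, queue) bindings.
    return [queue for pattern, queue in bindings
            if topic_matches(pattern, routing_key)]
```

Calling store_then_route with a handful of bindings returns the queues a plain topic exchange would have delivered to, with the side effect that the message body is now in the store.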

By default, the exchange uses the exchange name as the Riak bucket name and the routing key as the Riak key. To control where the message goes in Riak yourself, set the X-Riak-Exchange and X-Riak-Key AMQP message headers on your message.
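The defaulting rule is easy to express in a few lines. This Python sketch only illustrates the rule as described here; the function name is hypothetical, the header names are copied from the paragraph above (check the README for the exact spelling), and treating them as case-sensitive dict keys is an assumption.

```python
def riak_location(exchange, routing_key, headers=None):
    """Return the (bucket, key) a message would be stored under.

    Headers win when present; otherwise fall back to the exchange name
    and the routing key.
    """
    headers = headers or {}
    bucket = headers.get("X-Riak-Exchange", exchange)
    key = headers.get("X-Riak-Key", routing_key)
    return bucket, key
```

With no headers, a message published to exchange "events" with routing key "orders.created" resolves to bucket "events" and key "orders.created"; setting either header overrides just that half of the location.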

Once the message has been delivered to Riak, the exchange simply calls RabbitMQ's topic routing logic and passes the result along to the broker as if it were a normal topic exchange. I'm currently working on an improvement that would allow you to specify what exchange type to pretend to be. Then you could set an argument on your exchange like type=direct and, after storing your message in Riak, the exchange would route your message using the type of exchange you've specified. That should be much more flexible.

To choose which Riak server to use, pass arguments in your exchange declaration call: an argument named host sets the IP address or hostname, and an argument named port sets the port. Underneath, the Riak exchange uses the Protocol Buffers (PB) client, so the port you give it should be the PB port on which your Riak server is listening.
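Here is a hedged sketch of what such a declaration might look like from Python. It assumes a channel object with a pika-style exchange_declare(exchange=..., exchange_type=..., arguments=...) method; the helper name and the default values are assumptions (8087 is Riak's usual PB port), and whether the plugin expects the port as a number or a string is not confirmed here.

```python
def declare_riak_exchange(channel, name, host="127.0.0.1", port=8087):
    """Declare an 'x-riak' exchange pointing at a given Riak PB endpoint.

    `channel` is anything with a pika-style exchange_declare method.
    The defaults are illustrative, not the plugin's documented defaults.
    """
    channel.exchange_declare(
        exchange=name,
        exchange_type="x-riak",
        # 'host' and 'port' are the argument names described in the post.
        arguments={"host": host, "port": port},
    )
```

With pika, the channel would typically come from pika.BlockingConnection(...).channel(); the plugin must already be installed on the broker, or the declaration will be rejected as an unknown exchange type.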

To control the number of Riak clients in the pool, set an argument named maxclients on your exchange. The exchange opens that many client connections to your Riak server up front and, when delivering a message, picks one of them at random. The default is 5 clients.
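The pooling strategy described above (create a fixed number of clients up front, then pick one at random per delivery) can be sketched like this. The class and the connect callable are hypothetical stand-ins, not the plugin's implementation.

```python
import random


class ClientPool:
    """Illustrative pool: pre-create clients, pick one at random per use."""

    def __init__(self, connect, maxclients=5):
        # `connect` is any zero-argument callable returning a new client.
        self.clients = [connect() for _ in range(maxclients)]

    def pick(self):
        # Random selection spreads deliveries across the pool without
        # needing any checkout/checkin bookkeeping.
        return random.choice(self.clients)
```

Random selection keeps the pool trivial, at the cost that two deliveries may occasionally land on the same client at the same time.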

Riak postcommit Hook

The other side of the coin is the postcommit hook you can install into your Riak server that will send any Riak objects inserted, saved, or deleted to a RabbitMQ server.

By attaching this postcommit hook to your bucket, your object will be sent to a RabbitMQ server whenever it is modified.

To specify where to send this message, you can include a number of different metadata headers of the X-Riak-Meta- variety on your object. I'm currently working on making this more flexible by allowing you to store per-bucket broker settings so you don't have to put this information on each Riak entry.
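To make the idea of per-object broker settings concrete, here is a small Python sketch that pulls settings out of an object's metadata. The specific header names (X-Riak-Meta-Amqp-Host and friends) are hypothetical and chosen only for illustration; the real names are in the hook's README.

```python
# Hypothetical sub-prefix; only the X-Riak-Meta- part comes from the post.
AMQP_META_PREFIX = "x-riak-meta-amqp-"


def broker_settings(metadata, defaults=None):
    """Collect broker settings from X-Riak-Meta-* style metadata headers.

    Headers that do not start with the prefix are ignored; anything not
    supplied on the object falls back to `defaults`.
    """
    settings = dict(defaults or {})
    for name, value in metadata.items():
        lowered = name.lower()
        if lowered.startswith(AMQP_META_PREFIX):
            settings[lowered[len(AMQP_META_PREFIX):]] = value
    return settings
```

The defaults argument hints at how per-bucket settings could work: bucket-level values would supply the defaults and per-object headers would override them.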

What can I do with it?

The concept is simple, of course, but the implications are profound.

One of the problems when dealing with RabbitMQ's clustering is that it uses mnesia under the covers. In many distributed setups this is not always ideal. Sometimes-connected WAN nodes in particular can suffer from not having a solid connection to the other brokers.

By specifying the RabbitMQ server to send Riak updates to, it's actually possible to set up a scenario like the one in the following diagram:

[Diagram: Riak Shovel Diagram]

The two RabbitMQ servers in this diagram are not clustered. Using a combination of a Riak exchange type for RabbitMQ and the RabbitMQ postcommit hook for Riak, consumers on both servers will receive the messages in a manner much like what the shovel plugin does.

UPDATE: Please keep in mind this does not address the underlying problem mnesia has of communicating with nodes across a WAN (or, for that matter, in any scenario where nodes come and go ad hoc, like in a dynamic scaling situation). This simply circumvents this issue entirely by not relying on a clustered broker scenario. There are trade-offs in everything, of course, so your mileage may vary. Don't say you weren't warned!

One of the nice things about using a Riak-backed message exchange is that your messages are all stored. Since Riak is a fantastic large-scale data store, you can store every message your exchange receives without worrying about the storage overhead (just add more Riak servers for greater capacity). This also means you can write a dead-simple web interface to display those messages and, by simply updating a message, resend any (or all) of them. This could be great for replaying a set of messages.
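The replay idea works because re-saving an object fires the postcommit hook again. A toy in-memory model in Python shows the mechanics; the Bucket class and replay function are invented for this sketch and do not talk to Riak or RabbitMQ.

```python
class Bucket:
    """Toy stand-in for a Riak bucket with a postcommit hook attached."""

    def __init__(self, postcommit):
        self.objects = {}
        self.postcommit = postcommit  # called as postcommit(key, value)

    def put(self, key, value):
        self.objects[key] = value
        # Every save triggers the hook, which would publish to RabbitMQ.
        self.postcommit(key, value)


def replay(bucket, keys=None):
    """Re-save stored objects so the hook re-sends them.

    With keys=None every stored object is replayed, in insertion order.
    """
    for key in list(keys if keys is not None else bucket.objects):
        bucket.put(key, bucket.objects[key])
```

A web interface for replaying messages would amount to listing the bucket's keys and calling the equivalent of replay on whichever ones the user selects.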

That's just a smattering of ideas I've got kicking around in my head about how this killer combo could be leveraged in your architecture. Tweet me if you've got an idea.
