Fubaredness Is Contagious

Dmitriy Samovskiy’s Blog

Full Data vs Incremental Data in Messaging

June 25th, 2009 · 1 Comment

My recent experiments with messaging for a distributed application led to a realization that I would like to share in this post. It’s not an earth-shaking discovery, but you may still find it interesting.

Do you remember dump, the old Unix command for creating tape backups? Remember its concept of levels? To refresh your memory: in a nutshell, level 0 (a full backup) includes all files on the filesystem, and any higher level is an incremental backup that includes only files modified since the most recent backup at a lower level.

It turns out that a somewhat similar concept applies to messaging, specifically to the contents of the messages themselves.

A message, in general, is a piece of information that one system passes to another. On one hand, a publisher may make an observation, extract information from it, package the entire current state into a blob, and send it out as a message, repeating the same sequence at regular intervals. Examples of this model include messages listing the processes currently running on a system, the clients currently connected to a server, current RAM usage, etc. This model roughly corresponds to dump’s level 0 - the consumer needs just a single message to obtain all the information the publisher sent, and has no need to accumulate and merge a series of messages to get the full picture.

On the other hand, a publisher can send a message that contains information about a single event. For example: a new client connected, a new job was submitted to the backend, a hard disk failed. This model is more like an incremental backup - a message contains only a delta, and its payload doesn’t carry the entire state.

Each of these models has its good and bad sides. In the full data model, a single message is sufficient to transfer all knowledge about the current state from producer to consumer, and a consumer can start reading messages at any point in the queue - by design it will catch up once it receives and processes at least one message. The downsides of this model are wasted bandwidth and processing power (if there are no changes, the same contents are transferred over and over again) and the fact that any delta must be calculated by the consumer (for example, having received 2 “ps auxww” outputs, the consumer would have to diff them and parse the result).

The incremental data model clearly provides an easy delta and is less wasteful of resources, but it requires the consumer to merge multiple messages to get the entire picture and, as a result, is sensitive to the point from which a consumer starts reading the queue.

A potential solution is to do what dump does - send full data once in a while, followed by deltas. This way a consumer will catch up eventually, once it gets a full data message (which will come sooner or later). Another caveat is that a consumer does not always need the full picture - in the classic scalability scenario of a supervisor-workers model, workers rarely need more than the contents of their current job, which is contained in an incremental message.
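
As a rough illustration of the dump-like approach, here is a minimal Python sketch of a consumer that waits for a full message and then applies deltas on top of it. The message layout (a dict with a "type" key plus "clients" or "client" fields) and the names ClientTracker and replay are hypothetical, made up for this example rather than taken from any real system.

```python
from dataclasses import dataclass, field


@dataclass
class ClientTracker:
    """Consumer-side view of which clients are connected to a server."""

    clients: set = field(default_factory=set)
    synced: bool = False  # True once at least one full message has been seen

    def handle(self, message: dict) -> None:
        kind = message["type"]
        if kind == "full":
            # "Level 0": replace whatever we had with the complete state.
            self.clients = set(message["clients"])
            self.synced = True
        elif not self.synced:
            # A delta without a baseline cannot be applied safely; skip it.
            return
        elif kind == "connected":
            self.clients.add(message["client"])
        elif kind == "disconnected":
            self.clients.discard(message["client"])


def replay(messages):
    """Feed a sequence of messages to a fresh tracker and return it."""
    tracker = ClientTracker()
    for message in messages:
        tracker.handle(message)
    return tracker
```

A consumer that joins mid-stream simply ignores deltas until the next full message arrives, which is the "catch up eventually" behaviour described above. How often to send full messages is a trade-off between bandwidth and how long a new consumer may have to wait.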

But that’s not the end of it. While working on a problem, I realized that usually I as a developer don’t even get to choose which model to use - it’s dictated by the nature of the information I am trying to pass from one system to another. Some data is easy to obtain in full and very difficult to obtain incrementally, some vice versa. For example, a list of current processes on Linux is trivial to obtain in full (ps auxww) and quite difficult to obtain incrementally (I would need a notification each time a process starts or dies). In the case of incoming jobs it’s the opposite - it’s easy to obtain a delta (one job) but quite difficult to know the current status of all jobs.

My conclusion here is that there are 2 main factors to think about:

  1. can my publisher get data in full or incremental form?
  2. does my consumer need data in full or incremental form?

If the answers to the above questions are the same, you are good to go. But if they differ, you need to understand the potential issues discussed above and analyze further. I hope to be able to provide more practical thoughts on this in the future - stay tuned.

Tags: rabbitmq · software engineering

Why I Want Google To Have a URL Shortener Service

June 24th, 2009 · No Comments

When I click a shortened link which points to a blog that I am subscribed to in my Google Reader, I want to be taken straight to my Reader instead of to the original blog site. I am pretty sure I don’t want to give access to my Reader to a third party. Hence, I want Google to run a URL Shortener service. Pretty please?

Tags: Internet

Why I Sometimes Prefer Shell To Ruby or Python

June 11th, 2009 · No Comments

Shell was among the first things I got familiar with when I was introduced to Linux. It’s not a typical programming language, primarily due to lack of easy-to-use high-level data structures such as hashes and arrays (anticipating your objection to this - note I said “easy-to-use”). This may explain why I often get funny looks from folks when I mention that I use shell quite a bit, often in quite non-trivial systems.

And here are my reasons.

Memory Management

Shell scripts are excellent at managing their memory, and one has to try really hard to cause a shell script to leak memory. This makes shell a very convenient tool for long-running processes, supervisors in multiple-workers models, daemons and so on. There is an easy explanation for this. In shell, there are only a handful of built-in primitives - everything else is an external command, which gets started and then finishes before giving control back to your script. If there is a memory leak in that command, it won’t damage your calling script and will usually be insignificant because the command returns quickly.

No Exceptions

This is a double-edged sword, and you need to be careful how you exploit this “weakness.” This feature allows me to write compact code which is easy to understand without enclosing every single command in “try… except”. For naysayers, I would like to point out that a strict mode exists (set -e), in which a command that fails with a non-zero exit status causes the script to exit, apart from a few exceptions such as commands tested in an if condition.

In general, not all unforeseen error conditions warrant a crash, like you get in Python or Ruby when an unhandled exception gets propagated all the way to the top. If a problem is transient, it may be better to ignore it temporarily.

To ensure a Ruby or Python script doesn’t crash on some unforeseen transient problem, many people end up enclosing their entire program in a wildcard try… except block to catch any exception - but to me this approach is dangerous, even though I sometimes end up using it myself.

If you are writing a daemon process to perform some action in a loop, shell is often by far the most stable alternative.
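
For comparison, here is a minimal Python sketch of a loop that tolerates only the errors it expects to be transient, rather than wrapping everything in a wildcard try… except. The names run_loop and TRANSIENT_ERRORS, and the choice of exception types, are assumptions for illustration; which errors count as transient depends entirely on your application.

```python
import time

# Assumption: these are the errors this particular daemon treats as transient.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def run_loop(action, iterations, delay=1.0, sleep=time.sleep):
    """Call action() `iterations` times, skipping over transient failures.

    Returns the number of transient failures that were ignored. Any other
    exception propagates and stops the loop.
    """
    ignored = 0
    for _ in range(iterations):
        try:
            action()
        except TRANSIENT_ERRORS:
            # Ignore the problem for now and try again on the next pass.
            ignored += 1
        sleep(delay)
    return ignored
```

The sleep function is a parameter only so that the loop can be exercised without real delays. A real daemon would probably loop indefinitely and log what it ignored.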

When Not To Use Shell

My personal rule of thumb is: don’t use shell when you expect to need high-level data structures like hashes or arrays beyond what a for loop can give you, or when you can see potential for code reuse following OOP patterns like inheritance, or when your program needs to participate in orchestration schemes that go beyond creating and removing files on the filesystem.

Conclusion

I wouldn’t overlook shell if I were you.

Tags: python · ruby · software engineering

How Long Ago Was This EC2 Instance Started?

June 4th, 2009 · No Comments

By accident, today I discovered an easy way to determine how long ago your EC2 instance was started. Note that uptime shows the time since the last reboot, so it’s not what we want here. Here is a bash implementation.

Tags: cloud computing

Branching In Git When Working On Big New Features

May 31st, 2009 · No Comments

A note to self.

When starting to work on a big new feature, always set up 2 branches for it, say FEATURE_work and FEATURE_integration. Do your regular development in FEATURE_work, committing as often as you want. When you reach certain milestones (while the entire feature is not ready yet), squash merge FEATURE_work into FEATURE_integration. When the entire feature is finished, merge FEATURE_integration into master.

This gives you a much nicer history of commits, lets you group changes by milestone, and lets you keep a big feature as multiple commits in master.

Tags: software engineering

Graphite RabbitMQ Integration

May 21st, 2009 · 4 Comments

I started a new project on github - http://github.com/somic/graphite-rabbitmq. It currently includes a couple of tools written in Python which facilitate sending data to Graphite via RabbitMQ instead of connecting directly to the service using TCP.

[Figure: Graphite CLI screenshot]

Graphite is a flexible and powerful tool for building charts. It’s also a data series analytics framework. It was developed inside Orbitz by my former colleague, originally for use within a single group (of which I was a part). However, its power did not remain a secret for long - it quickly spread to the entire organization and became an irreplaceable tool for both development and engineering/operations. Graphite was then open-sourced under the Apache license. It currently lives at http://graphite.wikidot.com/

The key to Graphite’s power lies in more than its dynamic web UI, its improved RRD implementation called “whisper” (read this FAQ - highly recommend!), and its web-based command line with auto-completion, which allows you to overlay any metrics on a single chart. IMHO it is the fact that you are in control of what kind of data to send to it, how often, and how to set up hierarchies of your metrics - by environment, by machine type, by datacenter, etc. Graphite doesn’t do its own polling, which wouldn’t scale to hundreds or thousands of metrics. Nor does it enforce anything except that your metrics are dot-separated hierarchies (as in routing keys of AMQP topic exchanges - my.metric.name) and that their values are numeric (int or float).
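
To make the two constraints concrete, here is a small Python sketch that checks a metric name and value and formats them as a “name value timestamp” line, the layout carbon accepts on its plaintext port. Whether the graphite-rabbitmq tools expect exactly this payload is an assumption on my part, as are the function name format_metric and the set of characters allowed in each name segment.

```python
import re
import time

# Assumption: each dot-separated segment is limited to these characters.
_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def format_metric(name, value, timestamp=None):
    """Return a 'name value timestamp' line for a single data point."""
    # The name must be a dot-separated hierarchy with no empty segments.
    if not all(_SEGMENT.fullmatch(part) for part in name.split(".")):
        raise ValueError(f"not a dot-separated metric name: {name!r}")
    # Values must be numeric; bool is excluded even though it subclasses int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"metric value must be int or float: {value!r}")
    if timestamp is None:
        timestamp = time.time()
    return f"{name} {value} {int(timestamp)}"
```

A name such as prod.web01.load groups the metric by environment and machine, which is the kind of hierarchy described above. Publishing the resulting line to a RabbitMQ exchange is left out here, since it depends on the AMQP client library you use.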

If you are still reading but are still not convinced that it’s the way to go, I’ve got one last argument. If you already use RabbitMQ to publish and consume data, wouldn’t it be nice to get powerful charts without touching your application AND without installing agents on your publishers or consumers? Recall the duplication pattern of RabbitMQ - you can fork the incoming stream of messages into another queue (without impacting your original consumers and the queues to which they attach) and set up Graphite+RabbitMQ off of this new queue.

If you are planning to run multiple carbon instances, remember that the heavy lifting (writing to disk) is actually performed by another process called carbon-persister.py (it’s started by carbon-agent, with communications over a pipe) - try to avoid multiple persisters writing data within the same hierarchy, to avoid slowdowns and possible data corruption. RabbitMQ can help you sort out which messages go where, thus minimizing this risk.

I am very excited about the future opportunities that a Graphite-RabbitMQ combination can deliver, and I hope someone finds my scripts useful. Both tools bring a lot of awesomeness to the table and nicely complement each other, forming the charting and data series analytics solution you have been searching for. Check it out!

Tags: python · rabbitmq

The Power of Knowing “Why?” in Software Engineering

May 12th, 2009 · No Comments

I am currently reading “How Life Imitates Chess” by Garry Kasparov, after I saw a great review of the book by Baron Schwartz. Great book and I highly recommend it.

It’s got many lessons for software engineers as well. For example, in chapter 9 “Phases of the game” Kasparov talks about inexperienced players blindly following openings by famous grandmasters and how this can carry one only so far and ultimately is a trap.

Players, even club amateurs, dedicate hours to studying and memorizing the lines of the preferred opening. This knowledge is invaluable, but it can also be a trap. Many make the mistake of believing that if they know what a famous Grandmaster played in this exact position back in 1962, they don’t have to think for themselves. [...] Without knowing why all the moves are made, he’ll have little idea of how to continue when play inevitably advances beyond the moves he was able to store in his memory.

In software engineering, we have many conferences, online tutorials and blogs where our own Grandmasters talk about how they tackled a particular problem or resolved a particular outage. Sharing experiences is invaluable, but as Kasparov says, it can only carry you so far. Many people will blindly follow solutions described during conference talks, without understanding why things were done this way and not another. Some people base their selection of a certain technology on the opinion of a guru. Again - without fully understanding the context and reasons behind the decision.

What I am trying to say is: learn from other people’s experiences, but don’t forget to understand their context and their reasons. Your ability as a software engineer is based on your ability to adapt a solution to your needs, not simply copy it. Or if you copy, you need to know exactly why it will work for you.

Tags: technology

Don’t Use OpenDNS On Servers

April 17th, 2009 · 1 Comment

Are you thinking about using OpenDNS in your servers’ /etc/resolv.conf? Don’t. Why? Because when OpenDNS receives a query for a non-existent name, instead of returning NXDOMAIN (essentially, the name you’re looking for does not exist), it returns some IP, which is probably meant to catch typos, misspelt URLs or phishing attempts. This works great for humans and their browsers, not so much for your applications. NXDOMAIN is a valid result after all, and an application’s logic may depend on it.

$ dig @208.67.222.222 doesnotexist---doesnt.com

; <<>> DiG 9.4.2-P2 <<>> @208.67.222.222 doesnotexist---doesnt.com
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46259
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;doesnotexist---doesnt.com.	IN	A

;; ANSWER SECTION:
doesnotexist---doesnt.com. 0	IN	A	208.69.36.132

;; Query time: 14 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Fri Apr 17 14:14:49 2009
;; MSG SIZE  rcvd: 59
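
If you want to check what the resolver configured on a server does, a minimal Python sketch along these lines may help. It looks up a name that should not exist and reports whether an address came back anyway. The function name resolver_hides_nxdomain and the randomly generated probe name under the reserved .invalid TLD are assumptions for illustration; the lookup goes through the system resolver, so it tests whatever is in /etc/resolv.conf rather than one specific DNS server.

```python
import socket
import uuid


def resolver_hides_nxdomain(resolve=socket.gethostbyname, probe=None):
    """Return True if a lookup of a non-existent name still yields an address."""
    if probe is None:
        # Hypothetical probe name; .invalid is reserved and should never resolve.
        probe = f"nxdomain-probe-{uuid.uuid4().hex}.invalid"
    try:
        resolve(probe)
    except socket.gaierror:
        # The lookup failed, which is the expected outcome for a missing name.
        return False
    return True
```

The resolve parameter exists only so that the check can be exercised without network access. A True result means applications on that box cannot rely on failed lookups to detect missing names.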

Tags: linux

Compiling Erlang On Linux With Old Glibc

April 14th, 2009 · No Comments

I recently wanted to compile Erlang (in order to install RabbitMQ) on a Linux box with old glibc (2.3.2, from days of Red Hat Linux 7.0). It was the only out-of-date component, everything else was quite fresh - GCC 4.3.3, binutils 2.19.1.

The version of Erlang I used was R12B-5. I configured it with ./configure --disable-x --enable-threads --disable-hipe.

But it wouldn’t build, giving me the following error:

Fatal, could not get clock_monotonic value!, errno = 22

This was strange because I had no problems building this version of Erlang on Debian Etch, even with an older compiler.

The solution was to edit all instances of config.h in the build tree (in my case, there were 2 - lib/erl_interface/src/i686-pc-linux-gnu/config.h and erts/i686-pc-linux-gnu/config.h) after running ./configure but before starting make and comment out this line:

/* Define if you want to use clock_gettime to simulate gethrtime */
/* #define GETHRTIME_WITH_CLOCK_GETTIME 1 */
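
If you need to repeat this on several boxes, the edit can be scripted. Below is a minimal Python sketch that comments out the define in the text of a config.h; the function name disable_clock_gettime is hypothetical, and the sketch assumes the define sits on a line of its own without a trailing comment.

```python
import re

# Matches an active (not yet commented out) define on a line of its own.
_DEFINE = re.compile(
    r"^[ \t]*(#[ \t]*define[ \t]+GETHRTIME_WITH_CLOCK_GETTIME\b.*)$",
    re.MULTILINE,
)


def disable_clock_gettime(config_text):
    """Return config_text with the GETHRTIME_WITH_CLOCK_GETTIME define commented out."""
    return _DEFINE.sub(lambda m: "/* " + m.group(1).strip() + " */", config_text)
```

You would read each config.h found under the build tree after running ./configure, pass its contents through this function, and write the result back before starting make. Running it twice is harmless, because a line that is already commented out no longer matches.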

Tags: erlang · linux

Eliminating Single Points of Failure - One, Two, Many

April 9th, 2009 · No Comments

I recently reached an interesting conclusion. When you are trying to eliminate a single point of failure from your architecture, it’s almost always beneficial to first go with a 2-way redundant solution (an active-passive or active-active pair, whichever is easiest to implement) and move to N-way (N > 2) only afterwards, and only if necessary.

One huge difference between a pair and N-way (N>2) is how difficult it is to detect partitioning (of CAP Theorem fame - you can simultaneously achieve only two of the following three properties: data Consistency, high Availability and Partition tolerance). Assuming symmetrical communications (A can talk to B if and only if B can talk to A), partitioning detection in a pair is trivial, because there is only one possibility - system A can’t talk to system B. With N>2, however, there are many more scenarios to deal with: A can’t talk to B while both A and B can talk to C, A can’t talk to either B or C, etc. Additionally, communications may be restored in some random order - A may first be able to talk to B, and only some time later get its visibility to C back.
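
To get a feel for how quickly the number of scenarios grows, here is a small Python sketch that enumerates every combination of broken links between nodes, under the same assumption of symmetrical communications. The function name link_failure_scenarios is made up for this example, and counting link combinations is just one simple way to model the problem.

```python
from itertools import combinations


def link_failure_scenarios(nodes):
    """List every non-empty set of broken links between the given nodes."""
    # With symmetrical communications, each unordered pair is a single link.
    links = list(combinations(nodes, 2))
    scenarios = []
    # Each bit of mask says whether the corresponding link is down;
    # mask 0 (everything healthy) is skipped.
    for mask in range(1, 2 ** len(links)):
        broken = [link for i, link in enumerate(links) if (mask >> i) & 1]
        scenarios.append(broken)
    return scenarios
```

For a pair this yields a single scenario, for three nodes it yields 7, and for four nodes 63, because N nodes have N(N-1)/2 links and each link can independently be up or down. This does not even account for the order in which links come back.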

Interestingly, also from personal experience, once you manage to build 3-way redundancy, building 4-way or even 5-way is comparatively easy.

There are also a couple of purely practical aspects that make 2-way redundancy an attractive option, even if it’s going to be an intermediate step before N-way is achieved. First, 2-way can serve as a working prototype - you can observe it, learn and analyze its failure scenarios, and make sure your response to each is optimal. This can validate your approach before you sink time into partitioning detection for N-way.

Secondly, after you build the easier 2-way, you may well discover that you don’t need N-way redundancy. If a pair meets your goal (say a given percentage of service availability), you can save a lot of time and effort.

My advice - don’t skip two on your way from one to many.

Tags: distributed