Seek patterns of Elasticsearch

You need to know an application's disk seek patterns before you can optimize its storage layer. So when I was working on the performance analysis of Elasticsearch (ES), one of the first things I did was determine its disk seek patterns. I used blktrace and seekwatcher for that purpose.

blktrace is a utility that traces block I/O requests issued to a given block device. To quote from the blktrace manual:

…blktrace receives data from the kernel in buffers passed up through the debug file system (relay). Each device being traced has a file created in the mounted directory for the debugfs, which defaults to /sys/kernel/debug

To use blktrace, the first step is to mount debugfs, a memory-based file system used for debugging Linux kernel code.

# mount -t debugfs debugfs /sys/kernel/debug
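On many systems debugfs is already mounted, so it can be worth checking before mounting it again. A minimal sketch, assuming the usual /proc/mounts layout where the third field is the file system type; is_debugfs_mounted is a hypothetical helper name, not part of blktrace:

```shell
# Succeeds (exit 0) if the mounts table lists a debugfs file system.
# The table path defaults to /proc/mounts and can be overridden for testing.
is_debugfs_mounted() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "$mounts_file"
}

# Usage (as root), mounting only when needed:
#   is_debugfs_mounted || mount -t debugfs debugfs /sys/kernel/debug
```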

The next step is to set up the disk on which we want to do the tracing. This should be a separate disk, as we do not want other OS activity to influence our readings. For this, I used /dev/sdc to hold the ES data directory. Now, start ES and wait for a few seconds after the cluster reaches green state, so that we trace only the disk activity caused by searching and not by startup. Before firing the ES query, start blktrace on /dev/sdc with the following command:

# blktrace -d /dev/sdc -o es-on-ssd

This will start tracing the block requests on the disk and write the output to es-on-ssd.blktrace.0 .. es-on-ssd.blktrace.n-1, where n is the number of CPU cores and each file holds the requests from one core. Since I was using a quad-core CPU, I got the following files:

es-on-ssd.blktrace.0
es-on-ssd.blktrace.1
es-on-ssd.blktrace.2
es-on-ssd.blktrace.3
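The per-core naming can be checked against the core count of the machine. A small sketch that prints the file names to expect for a given output base name; expected_trace_files is a hypothetical helper, and the core count is passed in explicitly rather than detected:

```shell
# Print the per-core trace file names for an output base name and core count.
expected_trace_files() {
    base="$1"
    ncpus="$2"
    cpu=0
    while [ "$cpu" -lt "$ncpus" ]; do
        printf '%s.blktrace.%d\n' "$base" "$cpu"
        cpu=$((cpu + 1))
    done
}

# Usage, assuming getconf reports the online core count on this system:
#   expected_trace_files es-on-ssd "$(getconf _NPROCESSORS_ONLN)"
```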

Now that we have data from blktrace, the next step is to make sense of it. The blkparse utility formats the blktrace output into human-readable form. But since a picture is worth a thousand words, there is another tool, seekwatcher, that can produce plots and movies from the output of blktrace. I used it to visualize the seek patterns. To encode movies with seekwatcher, we also need to install MEncoder. Once seekwatcher and MEncoder are installed, run the following command to generate the movie:

# seekwatcher --movie-frames=10 -m -t es-on-ssd.blktrace.0 -o es-on-ssd.movie

It will produce es-on-ssd.movie that can be played with MPlayer. Following is the output that I got:

As can be seen, apart from a few random reads, most of the reads are sequential, so I went on to optimize the storage for sequential reads.
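The same observation can be roughly cross-checked in text form from blkparse output. The sketch below is only an approximation and rests on assumptions: it assumes the default blkparse line layout (device, CPU, sequence, time, PID, action, RWBS, start sector, "+", sector count), it looks only at issued requests (action D) whose RWBS field contains R, and it calls a read sequential when it starts exactly where the previous read ended. classify_reads is a hypothetical helper name.

```shell
# Count sequential vs. random reads from blkparse-style text on stdin.
# The first read has no predecessor and is counted as neither.
classify_reads() {
    awk '
        $6 == "D" && $7 ~ /R/ && $9 == "+" {
            start = $8; len = $10
            if (seen && start == prev_end) seq++
            else if (seen) rnd++
            prev_end = start + len
            seen = 1
        }
        END { printf "sequential=%d random=%d\n", seq, rnd }
    '
}

# Usage, assuming blkparse accepts the trace base name via -i:
#   blkparse -i es-on-ssd | classify_reads
```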

Bonus video: ES startup seeks

ES Start Seek Patterns from Anand Nalya on Vimeo.

Minifying JavaScript/CSS without changing file references in your source

Rule 10 of Steve Souders' High Performance Web Sites: Minify JavaScript

The most common problem faced while implementing this is how to handle the full and minified versions of each file, and how to change their references in the documents that include them. One of the easier ways to do this is to make minification part of the deployment process.

Here are the relevant steps involved.

I’m using YUI Compressor.

#!/bin/bash

# Execute this script after checking out the latest source from the repository.

# Minify all JavaScript files
cd /path/to/javascript || exit 1
for x in *.js
do
        # ${x%.js} strips only the trailing extension, so a.b.js becomes a.b-min.js
        java -jar /path/to/compressor/yuicompressor-2.4.2.jar -o "${x%.js}-min.js" --preserve-semi "$x"
done

# Minify all CSS files
cd /path/to/css || exit 1
for x in *.css
do
        java -jar /path/to/compressor/yuicompressor-2.4.2.jar -o "${x%.css}-min.css" "$x"
done
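The output name in the loop comes from shell suffix stripping, and the form used matters for file names that contain more than one dot. A minimal sketch of the two behaviours; min_name and first_dot_name are hypothetical helpers for illustration only:

```shell
# Strip only the trailing ".<ext>" before appending "-min.<ext>".
min_name() {
    file="$1"
    ext="$2"
    printf '%s-min.%s\n' "${file%.$ext}" "$ext"
}

# Strip everything from the FIRST dot onward (the %%.* form) for comparison.
first_dot_name() {
    file="$1"
    ext="$2"
    printf '%s-min.%s\n' "${file%%.*}" "$ext"
}

# min_name jquery.ui.js js        prints jquery.ui-min.js
# first_dot_name jquery.ui.js js  prints jquery-min.js
```

For single-dot names such as app.js both forms agree; they differ only when the base name itself contains a dot.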

Now you don’t want to replace all references to x.css or x.js in your development code with references to x-min.css and x-min.js respectively. So what you can do is rewrite all those filenames at the web server level.

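For Apache, the following rewrite rule works in the server or virtual host configuration:

#enable rewriting
RewriteEngine on
#leave requests that already name a minified file untouched
RewriteCond %{REQUEST_URI} !-min\.(js|css)$
#anchor the pattern so that paths such as /data.json are not rewritten
RewriteRule ^/(.*)\.(js|css)$ /$1-min.$2 [L]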
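To see which request paths end up pointing at minified files, the mapping can be emulated outside Apache. This is a rough sketch using sed, assuming an anchored pattern and assuming that requests which already name a -min file should pass through unchanged; rewrite_path is a hypothetical helper, and this only illustrates the regular expression, not how mod_rewrite evaluates rules internally.

```shell
# Map a request path to its minified counterpart, leaving other paths alone.
rewrite_path() {
    printf '%s\n' "$1" |
        sed -E '/-min\.(js|css)$/!s/^(.*)\.(js|css)$/\1-min.\2/'
}

# rewrite_path /js/app.js      prints /js/app-min.js
# rewrite_path /js/app-min.js  prints /js/app-min.js
# rewrite_path /data.json      prints /data.json
```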

Caution: Remember to delete existing minified CSS/JS files before running the minifying script, or you will end up with file names like x-min-min-min.js and so on. One way to do this is to clear the JS/CSS folders before checking out files from your source repository.
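An alternative to clearing the folders is to have the minifying loop skip files that are already minified. A minimal sketch, assuming the -min naming convention used above; needs_minify is a hypothetical helper name:

```shell
# Succeeds (exit 0) only for .js/.css files that are not already "-min" files.
needs_minify() {
    case "$1" in
        *-min.js|*-min.css) return 1 ;;
        *.js|*.css) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the minifying loop:
#   needs_minify "$x" || continue
```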