Quick GC Histograms

Sometimes it’s hard to find a way of explaining to the less technically literate just how much power the Unix shell hands you. The original ideas behind it, a combined word processor and programmer’s workbench, together with the power of pipes, make text processing easy.

So, to today’s problem: analysing GC logs for latency spikes. We run the JVM with the following properties:

-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime 
-XX:+PrintTenuringDistribution -Xloggc:/path/to/gc.logfile
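
(For context, those flags simply go on the java command line when the process is launched; a minimal sketch, where the jar name is made up for illustration, might look like this.)

java -XX:+PrintGCDateStamps -XX:+PrintGCDetails \
     -XX:+PrintGCApplicationStoppedTime -XX:+PrintTenuringDistribution \
     -Xloggc:/path/to/gc.logfile \
     -jar your-app.jar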

The output looks like this:

2013-04-18T13:30:33.491+0000: 97108.543: [GC 97108.543: [ParNew
Desired survivor size 26836992 bytes, new threshold 3 (max 4)
- age 1: 12848568 bytes, 12848568 total
- age 2: 7033104 bytes, 19881672 total
- age 3: 7791880 bytes, 27673552 total
- age 4: 5567656 bytes, 33241208 total
: 459080K->45463K(471872K), 0.0147270 secs] 2571295K->2162244K(4968704K) icms_dc=0 , 0.0148850 secs] [Times: user=0.27 sys=0.00, real=0.02 secs]
Total time for which application threads were stopped: 0.0154030 seconds
Total time for which application threads were stopped: 0.0002950 seconds
Total time for which application threads were stopped: 0.0002120 seconds
Total time for which application threads were stopped: 0.0003010 seconds
Total time for which application threads were stopped: 0.0002510 seconds

(This system is not on the critical path by the way).

So, whether we’re looking for spikes and deviations from normal, or just want a view of how the pauses are distributed and how long our tail is, a histogram is a good start.

We want to see something like this:

 Count Millis
  1301 0.000
    71 0.001
     2 0.002
     1 0.003
     3 0.005
     8 0.006
     8 0.007
    15 0.008
    29 0.009
    33 0.010
    32 0.011
    45 0.012
    47 0.013
    50 0.014
    52 0.015
    54 0.016
    60 0.017
    30 0.018
    39 0.019
    37 0.020
    24 0.021
    24 0.022
    18 0.023
     8 0.024
    10 0.025
     4 0.026
     8 0.027
     5 0.028
     3 0.029
     4 0.030
     3 0.031
     1 0.032

We get that by using this:

grep 'stopped:' ems2-gc.log | cut -d':' -f 2 | cut -d' ' -f 2 | sort -n \
  | awk '{ printf "%1.3f\n", $1 }' | (echo " Count Millis" ; uniq -c )

From left to right we’re:

  • pulling out the stopped lines using grep
  • extracting the seconds using the two cut statements
  • numerically sorting the output
  • using awk to format the output into the buckets we’re interested in, in this case millisecond resolution
  • and lastly creating a histogram using uniq
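
If you want to sanity-check the first few stages before the awk bucketing, running just the grep and cut parts prints the raw pause values in seconds:

grep 'stopped:' ems2-gc.log | cut -d':' -f 2 | cut -d' ' -f 2 | head -5

which, against the excerpt above, prints

0.0154030
0.0002950
0.0002120
0.0003010
0.0002510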

There are much cleverer bash one-liners out there, and each Linux admin has their own favourite set of tools, so no doubt the above can be done in many different ways. There are also those who reach for perl/python for anything more complex than grepping the output of ps. I think they’re missing out, as almost every day (and there have been many) there’s a new flag or command in bash to surprise and delight me.
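
As one example of doing it differently, the same histogram could be built in a single awk pass instead of sort and uniq -c. This is just a sketch, not how we actually run it, and it assumes the pause value is always the second-to-last field on the stopped line:

grep 'stopped:' ems2-gc.log \
  | awk '{ buckets[sprintf("%1.3f", $(NF-1))]++ }
         END { for (b in buckets) printf "%6d %s\n", buckets[b], b }' \
  | sort -k2,2n | (echo " Count Millis" ; cat)

awk’s for (b in buckets) hands the buckets back in arbitrary order, hence the trailing sort on the second column.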
