Distributed Tracing
OK, what the hell is "distributed tracing"?
In a microservice architecture, a user request typically spans multiple
services across different servers before the response is stitched together and
sent back to the user. The trade-off is that monitoring and debugging get
harder, and global visibility into the system is reduced.
Distributed tracing, also called distributed request tracing, is a method used to profile and monitor applications, especially those built using a microservices architecture. It refers to methods of observing requests as they propagate through distributed systems. It’s a diagnostic technique that reveals how a set of services coordinate to handle individual user requests. Distributed tracing requires that software developers add instrumentation to the code of an application.
OpenTracing provides an API specification that lets you add instrumentation to application code in a vendor-neutral manner.
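To make "adding instrumentation" concrete, here is a minimal sketch of what wrapping a unit of work in a span looks like. It deliberately does not use the OpenTracing API: `SketchTracer`, `SketchSpan` and `InstrumentedService` are hypothetical names invented for this illustration, and a real tracer would also propagate context between services and report spans to a backend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing library; NOT the OpenTracing API.
final class SketchTracer {
    // Finished spans, recorded as "operationName:durationNanos".
    final List<String> finished = new ArrayList<>();

    // Starts timing an operation; closing the returned span records it.
    SketchSpan start(String operationName) {
        return new SketchSpan(this, operationName, System.nanoTime());
    }
}

// Hypothetical span: measures wall-clock time between start and close.
final class SketchSpan implements AutoCloseable {
    private final SketchTracer tracer;
    private final String operationName;
    private final long startNanos;

    SketchSpan(SketchTracer tracer, String operationName, long startNanos) {
        this.tracer = tracer;
        this.operationName = operationName;
        this.startNanos = startNanos;
    }

    @Override
    public void close() {
        long durationNanos = System.nanoTime() - startNanos;
        tracer.finished.add(operationName + ":" + durationNanos);
    }
}

// Hypothetical service whose request handler is instrumented.
final class InstrumentedService {
    private final SketchTracer tracer;

    InstrumentedService(SketchTracer tracer) {
        this.tracer = tracer;
    }

    long handleRequest(long input) {
        // The span covers exactly the work inside the try block.
        try (SketchSpan span = tracer.start("handleRequest")) {
            return input * 2; // placeholder for real business logic
        }
    }
}
```

The point of the sketch is only the shape: instrumentation is extra code that runs on every call, which is why its cost is worth measuring.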
Cost Of Instrumentation
Services usually talk to each other through some form of IPC, and there are many frameworks for such communication. Many of them provide built-in support for instrumentation, which makes it simple to enable distributed tracing. The usage and performance of these frameworks have been studied extensively; the OpenTracing Blog is probably a good place to start.
So now let's come to the significant question: what is the cost of
instrumentation? In other words, what is the performance impact of adding
instrumentation to the application code?
To measure the cost, we will use the Jaeger tracer, with JMH to capture the metrics.
JMH provides an API that consumes CPU cycles in proportion to a given token
value. We will use it to mock a long-running job, which is the code that gets
instrumented.
// Blackhole is org.openjdk.jmh.infra.Blackhole.
// We need to consume or return the result so the JVM does not
// eliminate the call as dead code.
public long processLongJob(long token) {
    Blackhole.consumeCPU(token);
    return token;
}
We will measure three variants: no instrumentation, NoOpTracer, and JaegerTracer with default initialization values.
Benchmark | Param: token | AverageTime (ns/op) | Error (99.9%) | Cost (Impact %) |
---|---|---|---|---|
NoInstrumentation | 1000 | 1664.84177 | 49.940273 | 0 |
NoInstrumentation | 5000 | 8324.19976 | 107.233129 | 0 |
NoInstrumentation | 10000 | 16606.7095 | 60.983405 | 0 |
NoInstrumentation | 50000 | 83249.7561 | 2113.50206 | 0 |
NoInstrumentation | 100000 | 166090.738 | 1693.40512 | 0 |
NoInstrumentation | 500000 | 829908.745 | 6188.7579 | 0 |
NoOpTracer | 1000 | 1648.86056 | 4.521151 | -0.960 |
NoOpTracer | 5000 | 8280.19528 | 86.68164 | -0.529 |
NoOpTracer | 10000 | 16569.8697 | 134.783752 | -0.222 |
NoOpTracer | 50000 | 83254.7547 | 645.070768 | 0.006 |
NoOpTracer | 100000 | 166211.527 | 979.291791 | 0.073 |
NoOpTracer | 500000 | 833341.912 | 14473.6203 | 0.414 |
JaegerTracer | 1000 | 2373.87521 | 2.188993 | 42.589 |
JaegerTracer | 5000 | 10059.0192 | 18.01787 | 20.841 |
JaegerTracer | 10000 | 18803.8006 | 63.222248 | 13.230 |
JaegerTracer | 50000 | 87511.7573 | 651.314285 | 5.120 |
JaegerTracer | 100000 | 172132.755 | 959.72657 | 3.638 |
JaegerTracer | 500000 | 846543.298 | 8904.63822 | 2.004 |
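As seen above, the NoInstrumentation and NoOpTracer scores are comparable, with virtually no impact on performance. JaegerTracer is about 1.4x slower at the smallest token value (roughly 43% overhead), but its relative cost drops steadily as the CPU cycles consumed increase, down to about 2% at the largest token value.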
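The Cost (Impact %) column appears to be the relative difference against the NoInstrumentation baseline for the same token. A minimal sketch of that arithmetic, assuming that definition (the class and method names are made up for this example; the numbers are copied from the table):

```java
// Sketch of the presumed "Cost (Impact %)" calculation; names are illustrative.
final class OverheadCalc {
    // Relative overhead in percent: (instrumented - baseline) / baseline * 100.
    static double overheadPercent(double baselineNs, double instrumentedNs) {
        return (instrumentedNs - baselineNs) / baselineNs * 100.0;
    }

    public static void main(String[] args) {
        long[] tokens = {1000, 5000, 10000, 50000, 100000, 500000};
        // AverageTime values (ns/op) taken from the table above.
        double[] baseline = {1664.84177, 8324.19976, 16606.7095,
                             83249.7561, 166090.738, 829908.745};
        double[] jaeger = {2373.87521, 10059.0192, 18803.8006,
                           87511.7573, 172132.755, 846543.298};

        for (int i = 0; i < tokens.length; i++) {
            System.out.printf("token=%d overhead=%.3f%%%n",
                    tokens[i], overheadPercent(baseline[i], jaeger[i]));
        }
    }
}
```

A negative value, as in some NoOpTracer rows, simply means the instrumented run measured slightly faster than the baseline, which is within the reported error.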
Let's look at the average time per fixed number of CPU cycles. Since the CPU cycles consumed vary linearly with the token value, we can divide the average time by the token. The table below condenses the table above into per-token scores for each instrumentation variant.
Token | NoInstrumentation Score (AverageTime / Token) | NoOpTracer Score (AverageTime / Token) | JaegerTracer Score (AverageTime / Token) |
---|---|---|---|
1000 | 1.664841774 | 1.648860562 | 2.373875211 |
5000 | 1.664839952 | 1.656039056 | 2.011803834 |
10000 | 1.660670949 | 1.656986968 | 1.880380062 |
50000 | 1.664995121 | 1.665095094 | 1.750235147 |
100000 | 1.660907376 | 1.662115265 | 1.721327553 |
500000 | 1.65981749 | 1.666683825 | 1.693086597 |
Plotting the table with the token on the X axis and the scores on the Y axis
makes it clear that the more CPU-intensive the instrumented code is, the
smaller the relative impact of instrumentation on performance.
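The same trend can be read directly from the numbers. The sketch below computes the per-token score and the extra time per token that tracing adds; the class and method names are made up for this example, and the inputs are the measured averages from the first table.

```java
// Sketch of the per-token score arithmetic; names are illustrative.
final class PerTokenScore {
    // Score as used in the table: average time (ns) divided by the token value.
    static double score(double averageTimeNs, long token) {
        return averageTimeNs / token;
    }

    // Extra nanoseconds per token attributable to tracing.
    static double extraPerToken(double baselineNs, double tracedNs, long token) {
        return score(tracedNs, token) - score(baselineNs, token);
    }

    public static void main(String[] args) {
        // Smallest and largest token values from the benchmark.
        System.out.println(extraPerToken(1664.84177, 2373.87521, 1000));
        System.out.println(extraPerToken(829908.745, 846543.298, 500000));
    }
}
```

The extra cost per token at token 500000 is a small fraction of what it is at token 1000, which is why the JaegerTracer score approaches the uninstrumented score as the job gets longer.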
Based on these measurements, the cost of instrumenting any function whose total execution time exceeds about 1 ms is negligible: the largest run here, at roughly 0.83 ms per call, showed only about 2% overhead.
That's all, folks. Till next time.