Introducing Performance Lab
Performance Lab is a new initiative dedicated to exploring the multifaceted world of software performance. Launched just a few weeks ago, our project delves deep into various performance domains, including system design, benchmarking, profiling, optimization techniques, and much more. Each of these areas represents its own rich discipline worthy of focused exploration.
About Me
I'm MZ, a technology veteran with experience dating back to the dot-com bubble era. My passion for software optimization began during my early career developing ISAPI filters and Apache modules designed to intercept server requests. These components demanded exceptional efficiency and reliability, as any bugs could potentially bring down entire servers.
Throughout my career, I've focused on building highly efficient software that maximizes hardware capabilities while reducing costs. My experience spans analytics projects aimed at identifying system bottlenecks and designing comprehensive end-to-end solutions for data processing, storage, and reporting systems.
Performance Lab: Our Journey to Efficiency
Our journey begins from the bottom up. When a new programmer starts developing, whether it's an app or a service, they're working on code that needs to be as efficient as possible while still delivering the expected outcomes.
A Simple Example: sum(a,b)
Consider a basic example: creating functionality that calculates the sum of integers from 1 to n. When writing a code block for this functionality, there's a natural progression to efficiency:
Logic and flow of the block - Getting the core logic right
Efficient algorithms - Finding optimal solutions
Correct data structures - Choosing appropriate ways to store and access data
Code-level optimization - Fine-tuning implementation details
About a decade ago, I took an algorithms course on Coursera taught by Kevin Wayne and Robert Sedgewick. One phrase of Robert's has echoed in my head ever since: "Can we do it better?" This question challenges us to continuously improve our solutions. I also recommend their excellent book, Algorithms (4th Edition), which inspired me to think this way.
But this raises an important question: how do we measure and prove that our implementation is truly efficient? In the Java world, the answer is JMH (Java Microbenchmark Harness), which will be the topic of our next articles.
Thanks for joining us on this journey to performance excellence!
Play With JMH
In the example below, we will use Maven.
Create a Maven project if you do not already have one, and add the following properties:
<properties>
<java.version>17</java.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
<jmh.version>1.37</jmh.version>
</properties>
Add the JMH dependencies:
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>${jmh.version}</version>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>${jmh.version}</version>
</dependency>
Add a benchmark runner for easy runs from the IDE:
package io.performancelab;

import org.openjdk.jmh.Main;

public class BenchmarkRunner {
    // Convenience entry point so benchmarks can be launched from the IDE
    public static void main(String[] args) throws Exception {
        Main.main(args);
    }
}
Add the benchmarks class:
package io.performancelab;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.Throughput)
@Fork(1)
@Threads(1)
public class PlayWithJMH {

    @Benchmark
    public void sum_range(Blackhole blackhole) {
        int x = Sum.sum_range(1, 1000);
        blackhole.consume(x);
    }

    @Benchmark
    public void sum_stream(Blackhole blackhole) {
        int x = Sum.sum_stream(1, 1000);
        blackhole.consume(x);
    }

    @Benchmark
    public void sum_algo(Blackhole blackhole) {
        int x = Sum.sum_algo(1, 1000);
        blackhole.consume(x);
    }
}
@BenchmarkMode defines the benchmarking metric. There are several modes; in this example, we use Throughput. Annotating a method with @Benchmark tells JMH to benchmark that method individually. We will expand on the other modes in the next article.
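JMH handles warmup, forking, and statistics properly, so its numbers are the ones to trust. But to build rough intuition for what the Throughput mode reports (invocations per unit of time), here is a crude manual sketch in plain Java. This is not JMH and not a substitute for it; the numbers it prints are illustrative only:

```java
// Rough, manual illustration of the Throughput metric (NOT JMH):
// count how many times a method can be invoked in a fixed time window.
public class MetricsSketch {

    public static int sum_range(int a, int b) {
        int tmp = 0;
        for (int i = a; i <= b; i++) {
            tmp += i;
        }
        return tmp;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long invocations = 0;
        int sink = 0; // crude stand-in for JMH's Blackhole
        while (System.nanoTime() - start < 100_000_000L) { // ~100 ms window
            sink += sum_range(1, 1000);
            invocations++;
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        // Throughput: invocations per second (bigger is better)
        System.out.printf("throughput ~ %.0f ops/s%n", invocations / seconds);
        // The same data, seen as average time per invocation
        System.out.printf("average time ~ %.1f ns/op%n", seconds * 1e9 / invocations);
        // Use sink so the JIT cannot treat the loop as dead code
        System.out.printf("(sink=%d)%n", sink);
    }
}
```

This sketch ignores everything JMH exists to solve: JIT warmup, dead-code elimination, and run-to-run variance. That is exactly why we use a harness instead of hand-rolled timing loops.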
The current example is simple, but it illustrates a common optimization scenario:
You start with some method or flow that you want to optimize
The current implementation is a naive one
Some developers take the stream approach (more elegant, but does it perform well?)
Others reach for the simple high-school algorithm
package io.performancelab;

import java.util.stream.IntStream;

public class Sum {

    public static int sum_range(int a, int b) {
        int tmp = 0;
        for (int i = a; i <= b; i++) {
            tmp += i;
        }
        return tmp;
    }

    public static int sum_stream(int a, int b) {
        // rangeClosed includes the upper bound, matching the loop above
        return IntStream.rangeClosed(a, b).reduce(0, Integer::sum);
    }

    public static int sum_algo(int a, int b) {
        int n = b - a + 1;        // number of terms
        return (n * (a + b)) / 2; // arithmetic-series formula: n * (first + last) / 2
    }
}
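Before comparing the speed of these implementations, it is worth a quick sanity check that they all return the same answer for the same inputs; a fast benchmark of a wrong result is meaningless. Here is a minimal standalone sketch (plain Java, no JMH; the class name SumCheck is just for this illustration):

```java
// Sanity check: all three implementations must agree before we benchmark them.
public class SumCheck {

    public static int sum_range(int a, int b) {
        int tmp = 0;
        for (int i = a; i <= b; i++) {
            tmp += i;
        }
        return tmp;
    }

    public static int sum_stream(int a, int b) {
        // rangeClosed so the upper bound is included, like the loop
        return java.util.stream.IntStream.rangeClosed(a, b).reduce(0, Integer::sum);
    }

    public static int sum_algo(int a, int b) {
        int n = b - a + 1;        // number of terms
        return (n * (a + b)) / 2; // n * (first + last) / 2
    }

    public static void main(String[] args) {
        System.out.println(sum_range(1, 1000));                     // 500500
        System.out.println(sum_range(1, 1000) == sum_stream(1, 1000)); // true
        System.out.println(sum_range(1, 1000) == sum_algo(1, 1000));   // true
    }
}
```

In a real project this would live in a unit test next to the benchmarks.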
To run the benchmark outside the IDE, let's add the following to the pom.xml file:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.13.0</version>
<configuration>
<source>17</source>
<target>17</target>
<annotationProcessorPaths>
<path>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>1.37</version>
</path>
</annotationProcessorPaths>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.4.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>org.openjdk.jmh.Main</Main-Class>
</manifestEntries>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
Run Maven to create the package:
mvn clean package
Run the jar file:
java -jar target/play-with-jmh-1.0-SNAPSHOT.jar
Results
Let’s explore the results
Benchmark Mode Cnt Score Error Units
PlayWithJMH.sum_algo thrpt 5 1771912740.903 ± 76827725.410 ops/s
PlayWithJMH.sum_range thrpt 5 3243665.358 ± 54706.748 ops/s
PlayWithJMH.sum_stream thrpt 5 2983338.638 ± 169515.222 ops/s
Looking at the results: we defined the BenchmarkMode as Throughput, which counts the number of invocations of each method per unit of time (bigger is better).
sum_stream creates a stream of integers over the defined range and reduces it with a sum. The Stream API is elegant, but here it performs slightly worse than the naive approach of a for loop that accumulates the values. sum_algo wins by a wide margin: its throughput is roughly 500x higher, which is consistent with the loop performing about 1,000 additions per call while the closed-form version does only a handful of operations. If you can switch to a more optimal algorithm, that is the biggest improvement you can achieve.
You can find the complete running benchmark on GitHub.
See you in the following article when we dive deep into JMH.