I think every programmer has had a bad moment while debugging a program. Improving a program's performance, however, can be even trickier than debugging it. Sometimes we need to know which part (e.g. which line of code or which function) drags down the overall performance, and for that we need to profile the program.

According to Wikipedia's definition, profiling means analyzing a program at runtime to measure its space (memory) or time consumption. The usage of particular instructions and the frequency and duration of function calls can be recorded as well. Profiling is of vital importance in performance engineering.

Profiling is done with a tool called a profiler (or code profiler), which instruments either the program's source code or its binary executable. Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.

On Linux, perf, gprof, and valgrind are the most common profiling tools. In my daily work I use perf or valgrind, depending on the task. Today I am going to show how to combine perf with a flamegraph so that the details can be inspected more intuitively!

Perf (sometimes called perf_events or perf tools, originally Performance Counters for Linux, PCL) is a performance analysis tool for Linux, available since kernel version 2.6.31, released in 2009. As I am using RHEL 8, I install perf with

sudo yum install perf

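For Debian-based distributions, it is just as straightforward:

sudo apt update
sudo apt install linux-tools-common

On Ubuntu, the perf binary itself usually comes from the kernel-specific package linux-tools-$(uname -r), so you may need to install that as well.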
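Before going further, it is worth confirming that the install actually put perf on your PATH. Here is a minimal sketch of such a check; the function name check_tool is my own invention for illustration and is not part of perf.

```shell
# Hypothetical helper (not part of perf): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        # command -v prints the resolved path of the tool
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found" >&2
        return 1
    fi
}
```

You would call it as `check_tool perf && perf --version`, which prints the perf version only when the binary is really there.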

After installation, you can check the documentation with “perf help”. I will not introduce perf and its commands in detail, as there is plenty of material covering them. Just google it and you can pick up perf in a short time.

OK, let us get to our topic: how do we draw a flamegraph from the data collected with perf? First, install the flamegraph package with

sudo yum install js-d3-flame-graph

Then,

sudo perf record -g PROGRAM
sudo perf script report flamegraph (params)

where PROGRAM should be replaced with the name of the program to be profiled, for example “./main”. The params in the parentheses are optional and can be used to customize the flamegraph:

-f, --format     output format: json or html
-o, --output     output file name
--template       HTML template; the default is d3-flamegraph-base.html
--colorscheme    color scheme: blue-green or orange
-i, --input      input file (suppressed in the help text)

See, you can customize the format or the colour of the flamegraph.

After the two commands, a file named “flamegraph.html” will appear in the same directory. As the “html” extension suggests, you can open it in Chrome or Firefox. Then, with the cursor, you can inspect the details of the perf data, such as how much of the sampled time a function call accounts for.
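To put the two commands and a couple of the options together, here is a minimal sketch of a wrapper. The function name flame_profile and the DRY_RUN switch are hypothetical, made up for this post; only the perf commands and the -o and --colorscheme options come from the steps above. I also assume that perf accepts “--” to separate its own options from the profiled command.

```shell
# Hypothetical wrapper: record PROGRAM with call graphs, then render a flamegraph.
# Usage: flame_profile OUTPUT.html PROGRAM [ARGS...]
# Set DRY_RUN=1 to print the commands instead of running them.
flame_profile() {
    if [ "$#" -lt 2 ]; then
        echo "usage: flame_profile OUTPUT.html PROGRAM [ARGS...]" >&2
        return 2
    fi
    out=$1
    shift
    # When DRY_RUN is set, every command below is prefixed with echo.
    run=${DRY_RUN:+echo}
    # Step 1: run the program under perf and record call graphs into perf.data.
    $run sudo perf record -g -- "$@" || return
    # Step 2: turn perf.data into an HTML flamegraph using the orange scheme.
    $run sudo perf script report flamegraph -o "$out" --colorscheme orange
}
```

For example, `DRY_RUN=1 flame_profile main.html ./main` only prints the two commands, which is handy for checking them before anything runs under sudo. Without DRY_RUN, the flamegraph ends up in main.html instead of the default flamegraph.html.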

The two commands can even be merged into one, according to a thread:

perf record -a -g -F 99 sleep 60
perf script report flamegraph

is equivalent to

perf script flamegraph -a -F 99 sleep 60

I have omitted “sudo” here for simplicity.
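The flags in those commands deserve a quick note, and the numbers give a feel for how much data to expect. Below is a rough sketch; the helper max_samples is hypothetical, and the result is only an approximate upper bound, since idle CPUs contribute few samples and perf treats the frequency as a target.

```shell
# -a      sample all CPUs (system-wide) rather than a single program
# -g      record call graphs (stack traces), which the flamegraph is built from
# -F 99   sample at a frequency of 99 Hz, i.e. about 99 times per second per CPU
# sleep 60 only serves as a timer: perf records for as long as it runs.

# Hypothetical helper: rough upper bound on samples = frequency * seconds * CPUs.
max_samples() {
    echo $(( $1 * $2 * $3 ))
}

max_samples 99 60 1    # prints 5940
max_samples 99 60 4    # prints 23760
```

So a 60-second system-wide recording on a 4-CPU machine yields at most about 23760 samples, which is plenty for a readable flamegraph.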

For more detailed documentation, you can refer to Red Hat’s documentation on flamegraphs. But I strongly suggest trying my guide first and then diving into the official documentation, as that is the more user-friendly route.