[perf] Improve README

2025-04-13 08:42:59 +00:00 · 2025-03-07 13:33:08 -07:00 · 2025-03-07 13:33:08 -07:00 · 460a8225ba
commit 460a8225ba
parent 575e70b063
1 changed files with 40 additions and 10 deletions
--- a/perf/README.md
+++ b/perf/README.md
@@ -4,17 +4,21 @@ Benchmarks are implemented using [Google Benchmark](https://github.com/google/be

 To build the benchmarks in this directory you need to set the benchmark
 option while configuring the build with meson:
-
-```
-meson build -Dbenchmark=enabled --buildtype=release
-```
-or:
 ```
 meson build -Dbenchmark=enabled --buildtype=debugoptimized
 ```

+The default build type is `debugoptimized`, which is good enough for
+benchmarking, but you can get the fastest code with the `release`
+build type:
+```
+meson build -Dbenchmark=enabled --buildtype=release
+```

-Then build a specific benchmark binaries with ninja:
+You should, of course, enable features you want to benchmark, like
+`-Dfreetype`, `-Dfontations`, `-Dcoretext`, etc.
+
+Then build a specific benchmark binary with ninja, e.g.:
 ```
 ninja -Cbuild perf/benchmark-set
 ```
@@ -33,22 +37,48 @@ It's possible to filter the benchmarks being run and customize the output
 via flags to the benchmark binary. See the
 [Google Benchmark User Guide](https://github.com/google/benchmark/blob/main/docs/user_guide.md#user-guide) for more details.

+The most useful benchmark is `benchmark-font`. You can provide custom fonts to it too.
+For example, to run only the "paint" benchmarks, against a given font, five times for
+each font backend, you can do:
+```
+./build/perf/benchmark-font NotoColorEmoji-Regular.ttf --benchmark_filter="paint" --benchmark_repetitions=5
+```
+
+Some useful options are: `--benchmark_repetitions=5` to run the benchmark 5 times,
+`--benchmark_min_time=.1s` to run the benchmark for at least .1 seconds (defaults
+to .5s), and `--benchmark_filter=...` to filter the benchmarks by regular expression.
+
+To compare before/after benchmarks, you need to save the benchmark results in files
+for both runs. Use `--benchmark_out=results.json` to output the results in JSON format.
+Then you can use:
+```
+./subprojects/benchmark-1.8.4/tools/compare.py benchmarks before.json after.json
+```
+Replace 1.8.4 with the version of benchmark found in your `subprojects` directory.
+
 # Profiling

-Configure the build to include debug information for profiling:
-
+If you would like to disable optimizations and enable frame pointers for better profiling output,
+you can do so with the following meson command:
 ```
 CXXFLAGS="-fno-omit-frame-pointer" meson --reconfigure build -Dbenchmark=enabled --buildtype=debug
 ninja -Cbuild
 ```
+However, this will slow down the benchmarks significantly and might give you inaccurate
+information as to where to optimize. It's better to profile the `debugoptimized` build (the default).

 Then run the benchmark with perf:
-
 ```
 perf record -g build/perf/benchmark-subset --benchmark_filter="BM_subset_codepoints/subset_notocjk/100000" --benchmark_repetitions=5
 ```
-You probably want to filter to a specific benchmark of interest and set the number of repititions high enough to get a good sampling of profile data.
+You probably want to filter to a specific benchmark of interest and set the number of
+repetitions high enough to get a good sampling of profile data.

 Finally view the profile with:

+```
 perf report
+```
+
+Another useful `perf` tool is the `perf stat` command, which can give you a quick overview
+of the performance of a benchmark, as well as stalled cycles, cache misses, and mispredicted branches.
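
As an illustration of the before/after comparison the updated README describes, here is a minimal Python sketch that reads two files written with `--benchmark_out=...` and reports the relative change per benchmark. It is not part of this commit and does not replace `compare.py`. The function names `load_times` and `relative_change` are hypothetical, and the sketch assumes both files use the same `time_unit` and contain the standard `benchmarks` list with `name`, `run_type`, and `real_time` fields.

```python
import json


def load_times(path, field="real_time"):
    """Map benchmark name -> mean time from a Google Benchmark JSON file.

    Hypothetical helper: assumes the standard "benchmarks" list layout.
    """
    with open(path) as f:
        data = json.load(f)
    samples = {}
    for entry in data["benchmarks"]:
        # Skip aggregate rows (mean, median, stddev) emitted when
        # --benchmark_repetitions is used; keep the individual runs.
        if entry.get("run_type") == "aggregate":
            continue
        samples.setdefault(entry["name"], []).append(entry[field])
    # Average the repeated runs of each benchmark.
    return {name: sum(v) / len(v) for name, v in samples.items()}


def relative_change(before, after):
    """Return {name: (after - before) / before} for benchmarks in both runs."""
    return {
        name: (after[name] - before[name]) / before[name]
        for name in before
        if name in after and before[name] > 0
    }
```

A negative value means the "after" run was faster; for example, `relative_change(load_times("before.json"), load_times("after.json"))` yields `-0.5` for a benchmark whose time was halved.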