Where we left off
Welcome back to our ongoing series on measuring test coverage for binary programs!
In part 1 we used Go’s built-in -cover flag: clean and accurate, but it only works if you own the source and can recompile. In part 2 we used valgrind and gdb to trace gzip without touching its source. In part 3 we explored Intel PIN, a proper dynamic binary instrumentation framework: powerful, but it came with a ~100MB proprietary C++ SDK and was limited to x86_64.
At the end of that post I promised we’d go further: full automation, any binary, no recompilation. Today we make good on that promise with a native eBPF approach, and the result is a tool called funkoverage.
(image courtesy of https://www.pexels.com/@tanfeez/)
Why eBPF changes everything
eBPF is a Linux kernel technology that lets you run small sandboxed programs inside the kernel in response to events, without loading kernel modules or patching the kernel itself. For tracing purposes, this means we can hook function entry points with uprobes and receive notifications in userspace via a ring buffer, all with negligible overhead.
Two eBPF features make this particularly attractive for coverage measurement:
uprobe_multi (available since Linux 6.6) lets you attach uprobes for an entire binary or library in a single syscall, passing all symbol names and “cookies” at once. Previously you needed one syscall per function: at 8,000 functions that’s 8,000 syscalls just to set up. Now it’s one per binary or library.
Kernel-side first-call deduplication: inside the BPF program, we use an atomic compare-and-swap operation on a per-function flag stored in a kernel map. This means each function fires exactly one event to userspace, no matter how many times it is called during the program’s lifetime. For coverage purposes this is exactly what we want: a clean yes/no signal with no noise.
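To make the first-call deduplication concrete, here is a userspace analogue in plain Go. It is not the BPF program itself, only a sketch of the same compare-and-swap idea; the `seen` type and the convention that each function's cookie doubles as its index are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// seen holds one flag per instrumented function. Using the cookie as the
// slice index is an assumption of this sketch, not a documented detail.
type seen []atomic.Uint32

// firstCall returns true only for the single caller that flips 0 to 1.
func (s seen) firstCall(cookie int) bool {
	return s[cookie].CompareAndSwap(0, 1)
}

// countEvents simulates `calls` concurrent passes over every function and
// returns how many events would reach userspace.
func countEvents(nFuncs, calls int) int64 {
	flags := make(seen, nFuncs)
	var events atomic.Int64
	var wg sync.WaitGroup
	for c := 0; c < calls; c++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range flags {
				if flags.firstCall(id) {
					events.Add(1) // stands in for a ring buffer submit
				}
			}
		}()
	}
	wg.Wait()
	return events.Load()
}

func main() {
	fmt.Println(countEvents(80, 1000)) // prints 80
}
```

However many goroutines race on the same flag, exactly one of them wins the swap, so 80 functions produce 80 events regardless of call count.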
Here’s how the approaches compare:
| Approach | Overhead | SDK required | Architecture | First-call dedup |
|---|---|---|---|---|
| valgrind/callgrind | ~10-20× slower | None | Multiple (x86_64, ARM64, others) | No |
| Intel PIN | ~5-10× slower | ~100MB C++ SDK | x86_64 | No |
| eBPF uprobe_multi | ~1-2% overhead | None | x86_64, ARM64 | Yes |
The overhead difference is significant in practice. With valgrind, even a trivial gzip -h takes half a second. With uprobes, it takes milliseconds: the program runs at essentially native speed.
A transparent impostor
funkoverage is a pure-Go tool that uses this eBPF infrastructure to give you function-level coverage on any ELF binary, without source code or recompilation.
The design is built around two cooperating binaries:
┌──────────────────┐     ┌──────────────────────────┐
│ funkoverage CLI  │     │ funkoverage-shim         │
│ install/report   │     │ transparent replacement  │
└──────────────────┘     └──────────────────────────┘
funkoverage is the CLI you interact with: it installs and uninstalls the shim, enumerates functions, and generates coverage reports.
funkoverage-shim is a “tiny” Go binary that gets installed in place of the target binary. It’s completely generic: it doesn’t know anything about gzip or any other program. When invoked, it reads a JSON sidecar file to discover which functions to hook, attaches the BPF probes, and then transparently runs the real binary.
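One way a generic shim can stay ignorant of its target is to derive everything from the name it was invoked as. The sketch below illustrates that idea only; the /var/coverage/bin directory comes from this post, while the `resolve` helper and the assumption that the sidecar sits next to the real binary are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// realDir is the directory mentioned in this post. Keeping the sidecar in
// the same directory is an assumption of this sketch.
const realDir = "/var/coverage/bin"

// resolve maps the path the shim was invoked as (argv[0]) to the real
// binary and its JSON sidecar.
func resolve(argv0 string) (realBin, sidecar string) {
	name := filepath.Base(argv0) // works for "/usr/bin/gzip" and plain "gzip"
	realBin = filepath.Join(realDir, name)
	sidecar = filepath.Join(realDir, name+".funcs.json")
	return realBin, sidecar
}

func main() {
	realBin, sidecar := resolve("/usr/bin/gzip")
	fmt.Println(realBin)
	fmt.Println(sidecar)
}
```

Because nothing here is specific to gzip, the same shim binary could be copied over any number of targets.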
Running sudo funkoverage install /usr/bin/gzip performs these steps:
- Moves the real gzip binary to /var/coverage/bin/gzip
- Enumerates all functions from the symbol table (falls back to DWARF if needed)
- Writes a gzip.funcs.json sidecar with the symbol list
- Copies the shim binary to /usr/bin/gzip
- Runs setcap cap_bpf,cap_perfmon+ep on the shim so it can attach uprobes without running as root
From that point on, every invocation of gzip transparently runs through the shim. The shim’s runtime sequence looks like this:
user runs "gzip -h"
        │
        ▼
/usr/bin/gzip  ← this is now the shim
        │
        ├── read gzip.funcs.json
        ├── fork a child process (paused on a pipe)
        ├── load embedded BPF program
        ├── link.UprobeMulti(all symbols)  ← one syscall per image
        ├── seed kernel "watched" map with child PID
        ├── start ring buffer reader goroutine
        │
        ├── unblock child via pipe → child exec()s real gzip
        │
        ├── BPF fires on first call to each function
        │   └── event → ring buffer → demangle → _called.log
        │
        └── child exits → detach probes → drain buffer → close log
No LD_PRELOAD, no ptrace, no binary patching. The real binary runs unmodified inside the child process; the parent shim just watches what happens at the kernel level.
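The "paused on a pipe" step is what lets the parent learn the child's PID before any target code runs. Here is one way to get that behaviour using only the standard library; it is a sketch, not the shim's real mechanism. The /bin/sh wrapper, the `runPaused` name, and the `attach` callback (standing in for probe attachment and map seeding) are all assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// runPaused starts target under a small shell wrapper that blocks on fd 3.
// attach runs while the child is parked; the PID it receives stays valid
// because exec replaces the process image without changing the PID.
func runPaused(attach func(pid int) error, target string, args ...string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer w.Close()
	// Wait for one line on fd 3, then exec the target with fd 3 closed.
	script := `read line <&3; exec "$@" 3<&-`
	argv := append([]string{"-c", script, "sh", target}, args...)
	cmd := exec.Command("/bin/sh", argv...)
	cmd.ExtraFiles = []*os.File{r} // ExtraFiles[0] becomes fd 3 in the child
	var out bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		r.Close()
		return "", err
	}
	r.Close() // the child holds its own copy of the read end
	if err := attach(cmd.Process.Pid); err != nil {
		cmd.Process.Kill()
		cmd.Wait()
		return "", err
	}
	if _, err := w.Write([]byte("go\n")); err != nil { // unblock the child
		cmd.Process.Kill()
		cmd.Wait()
		return "", err
	}
	err = cmd.Wait()
	return out.String(), err
}

func main() {
	out, err := runPaused(func(pid int) error {
		fmt.Println("child parked with pid", pid)
		return nil
	}, "echo", "hello")
	fmt.Printf("%q %v\n", out, err)
}
```

The ordering is the important part: everything that needs the PID happens inside `attach`, and only then is the child released to exec the real program.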
Hooking gzip, live
We’ll use gzip again, the same target as in part 2, so we can compare the numbers directly.
Build and install funkoverage (you’ll need Go 1.26+ and Linux kernel ≥ 6.6 with BTF enabled):
$ git clone https://github.com/ilmanzo/BinaryCoverage
$ cd BinaryCoverage
$ ./build.sh
$ sudo cp funkoverage funkoverage-shim /usr/local/bin/
Now install the shim over gzip:
$ sudo funkoverage install /usr/bin/gzip
✓ moved /usr/bin/gzip → /var/coverage/bin/gzip
✓ enumerated 80 functions
✓ shim installed at /usr/bin/gzip (cap_bpf,cap_perfmon+ep)
Run our simple smoke test from part 2:
$ gzip -h
Usage: gzip [OPTION]... [FILE]...
Compress or uncompress FILEs (by default, compress FILES in-place).
...
The output is identical: gzip behaves exactly as before. But now we have a log file:
$ tail -5 /var/coverage/data/gzip_*_called.log
CALLED /var/coverage/bin/gzip main
CALLED /var/coverage/bin/gzip try_help
CALLED /var/coverage/bin/gzip license
CALLED /var/coverage/bin/gzip rpl_printf
CALLED /var/coverage/bin/gzip progerror
Generate the coverage report:
$ funkoverage report /var/coverage/data /tmp/report
$ cat /tmp/report/gzip.txt
Functions: 9/80 (11.25%)
11.25%, exactly what valgrind reported in part 2. Reassuring! But this time gzip -h ran in milliseconds, not half a second.
Chasing the dark functions
The shim appends to the log file on each run, and the report accumulates. Let’s follow the same progression as part 2 and watch the coverage grow.
Check the version:
$ gzip -V
$ funkoverage report /var/coverage/data /tmp/report
Functions: 10/80 (12.50%)
Try an error path β a non-existent file:
$ gzip foobar
gzip: foobar: No such file or directory
$ funkoverage report /var/coverage/data /tmp/report
Functions: 19/80 (23.75%)
That jumped: error-handling code exercised functions we hadn’t hit before. Now let’s do some actual compression:
$ echo "hello funkoverage" > /tmp/test.txt
$ gzip /tmp/test.txt
$ gzip -d /tmp/test.txt.gz
$ funkoverage report /var/coverage/data /tmp/report
Functions: 52/80 (65.00%)
Same progression as valgrind: 11% → 23% → 65%. The HTML report also shows the uncalled functions by name, which is handy for knowing exactly where your test suite still has gaps.
One binary, any chip
When we extended funkoverage to support ARM64, we didn’t need to change the BPF program logic at all, because the eBPF instruction set is architecture-independent. What we needed was to compile the BPF C code for each target architecture and ship both objects in the repository.
The bpf2go tool from the cilium/ebpf project generates a Go file for each architecture, and Go’s build tag mechanism selects the right one at compile time:
tracer_x86_bpfel.go ← //go:build 386 || amd64
tracer_arm64_bpfel.go ← //go:build arm64
The pre-generated objects are checked into the repository, so a normal build only needs Go, with no Clang or kernel headers required. On an ARM64 machine (a Raspberry Pi, a Graviton cloud instance, an Apple Silicon VM), the exact same CLI and workflow applies.
The coverage you were owed
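For readers who have not used bpf2go, the sketch below shows roughly what the generation step and the resulting selection look like. The go:generate line uses bpf2go's -target flag, but the C source name tracer.bpf.c is a guess, and `generatedFile` is a hypothetical run-time mirror of what the build tags decide at compile time.

```go
package main

// Hypothetical directive: the identifier "tracer" matches the generated file
// names in this post, while the source file name tracer.bpf.c is assumed.
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -target amd64,arm64 tracer tracer.bpf.c

import (
	"fmt"
	"runtime"
)

// generatedFile mirrors, for illustration only, the choice that the
// //go:build constraints make at compile time.
func generatedFile(goarch string) (string, bool) {
	switch goarch {
	case "386", "amd64":
		return "tracer_x86_bpfel.go", true
	case "arm64":
		return "tracer_arm64_bpfel.go", true
	}
	return "", false
}

func main() {
	name, ok := generatedFile(runtime.GOARCH)
	fmt.Println(runtime.GOARCH, name, ok)
}
```

In a real build no such lookup exists: the compiler simply ignores the file whose build constraint does not match the target architecture.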
We’ve come a long way from the humble go build -cover of part 1. With eBPF and uprobe_multi we now have a tool that:
- Works on any ELF binary (vendor tools, distro packages, daemons) without source code or recompilation
- Adds negligible runtime overhead, making it practical even for longer test suites
- Produces clean, first-call-only coverage data without manual log parsing or glue scripts
- Runs on both x86_64 and ARM64 with no changes to the workflow
If you’re writing integration tests for a binary you don’t control, funkoverage gives you the coverage feedback loop you’ve been missing.
The project is at github.com/ilmanzo/BinaryCoverage; issues and pull requests are very welcome.
If you are interested in this approach, also check out xcover, another eBPF-based test coverage tool. It was recently presented in a lightning talk at FOSDEM 2026 (slides).
Feel free to leave comments and feedback, happy hacking! 👋
