If you’ve read the first article in this series, you’ll know what full stack tracing is and why you definitely want it. This time, we’ll show you how to setup full stack tracing on your Linux system. There are two steps – first get everything configured to capture a trace, and then view and interpret the trace.
Setup full stack tracing with a bit of kernel help
To capture a trace, we’ll be using LTTng (Linux tracing toolkit next generation) in our examples. LTTng captures tracepoints with minimal overhead. This is something you definitely want, as too much extra CPU introduced by tracing can change the system’s behavior, even causing it to fail unpredictably. Another factor in LTTng’s favor is that it’s well supported by the open source community.
LTTng was designed to record kernel level events. However, you’ll also want to use its user space tracepoints to capture application level events. That will give you consistent visibility, regardless of where execution moves throughout the software stack. User space tracepoints is critical to the setup of full stack tracing as it lets you integrate application, Qt, and kernel tracepoints together in a single view.
Configuring the kernel
Start by making sure that your kernel is configured appropriately. Using make menuconfig (or its GUI cousins xconfig and gconfig), look under Kernel Hacking > Tracers and enable the following at a minimum:
You’ll probably also want to add these options, which are highly recommended for more complete coverage:
Note that if you are doing embedded work, you’ll need to invoke menuconfig for cross-compiling (substituting the architecture of your embedded target, of course):
Next thing you’ll want to do is build your kernel. That may vary based on your distribution and target and other options, but again for embedded development it will likely look something like this:
Of course, once your new trace-ready kernel is built, you’ll need to deploy it to your target.
Installing LTTng
You’ll also need the LTTng tool itself. The three main packages you need are lttng-tools for the command-line interface, lttng-modules-dkms for the kernel tracepoint modules, and liblttng-ust-dev to support user space tracing, which includes tracepoints in Qt and your application. (There are a couple of other optional packages if you need tracing from Python or Java.)
If apt-get is your package manager, installing LTTng will look like this:
If you use a different package management system, you might be able to find the equivalent commands in the LTTng docs.
Because LTTng is such a common debugging tool, other distributions may provide simpler alternatives. For example, Yocto enables LTTng by adding the following line to your local.conf:
Enabling Qt tracing
To get traces from Qt APIs and internals, you’ll have to build Qt from source rather than using the pre-built libraries. KDAB has been leading the charge on enabling Qt for full stack tracing, but it’s undergoing active development and is a moving target. While our first pushed changes for Qt tracing support got merged into 5.12, you’ll definitely want 5.13 or later.
If you’re using 5.13, you’ll need some patches. These patches will enable tracing within QtQuick/QML:
If you happen to be using yocto, you can also apply the following patch to get some additional tracepoints:
If you’re using 5.14 or later, you won’t need any of these patches – they’re all merged in. Either way, the next step is to build Qt.
Again consult the docs if you need anything besides the simplest configuration. For example, if you want to keep a trace-enabled and trace-free version, you may want to set up shadow building to keep binaries and build artifacts for both configurations in separate directory trees.
Once you have Qt built, put it on your embedded target with your reconfigured kernel and you should be all set to start tracing.
Configuring the trace
LTTng does need a few options to tell it where to capture a trace and what events you want to catch. So, you’ll need a script to configure LTTng with all the needed options. As a starting point, here’s my LTTng launch script:
A short explanation of this shell script:
- Lines 1-5 : Creates a trace session and supplies a filename for writing the trace data. We use the system seconds to give us a new file each time this script is run.
- Lines 7-19 : Creates a trace channel, and populates it with the kernel events we want to monitor. We use bracket shell expansions to make the list a bit more manageable.
- Lines 21-23 : Creates a separate channel for user events, and enables them so we catch our application and any Qt-generated events.
- Line 26 : Starts the trace!
Size and speed tradeoffs
Properly configuring a trace can require fine-tuning since you are often trading off between two critical factors – the size of the trace data and the speed of the trace. If there is too much trace data, you’ll exhaust your buffers and/or disk space, which will prevent you from seeing the events you’re trying to capture. And if the trace capture is too slow, it can alter the behavior of the system. Slow traces can either mask timing problems or – even more likely – introduce new issues and prevent proper operation of the system.
What you change to impact the size of your trace is often related to its speed and vice versa. Let’s look at a couple ways that you can tinker with the configuration to make sure that your full stack trace does what you want.
Managing the trace file
By telling LTTng where to put its captured trace data, you have some control over the speed of the capture. In the lttng_start.sh script above, we invoke lttng with the “create -o” argument to store the tracing data under /tmp/lttng, which normally resides on the embedded system’s flash drive. Hence, the speed of the trace buffer disk flushes will depend on the speed of your flash. With a slow flash and a lot of data, it may be better to move the buffer onto a RAM disk to make LTTng’s buffer writing as fast as possible. You can do this by replacing the lines of directory management and session creation with this:
You may need to ratchet your buffer above 128 Mb to capture your whole trace or lower it if you need to conserve RAM. This will improve buffer write times, letting you capture more data without overflowing. Clearly though, this strategy only works if you’ve got enough RAM to spare.
Another thing you can do with the trace file is to remotely stream it to your development workstation. LTTng calls this network streaming mode (using lttng-relayd); this technique lets you capture larger and longer traces on your embedded device. Of course, your mileage may vary depending on what you’re tracing and the speed of your connection. The network has to be fast enough to move the trace data off the system before LTTng buffers overflow, so lttng-relayd is probably best reserved for fast connections (like hard-wired Gigabit Ethernet) rather than slow ones (like WiFi).
Managing the trace events
Many kernel events are called nearly constantly by a running system. Capturing tracepoints on those events can capture too much data, bogging down the system or exhausting your disk space.
To more easily locate problems, as well as ensure the trace is small enough to operate on your system properly, it’s best to capture only the essential tracepoints. In our lttng_start.sh script (setting the kernel_events environment variable), we’ve tried to provide a comprehensive event set that’s not too dramatically verbose. However, you may need to remove some of these events or add even more of your own, depending on what you’re attempting to catch – see the LTTng documentation to see how to get the list of possible trace events on your system.
Stopping the trace
Turning off tracing is as simple as running LTTng with the stop command. While it's not completely necessary to setup full stack tracing, it does make it easier if you wrap the stop with a script that allows you to delay for a bit before stopping it. That’s because a trace with a fixed duration is often the easiest way to capture the snapshot you need.
We use this technique to capture the boot process: one of the nicest applications of full stack tracing.
However, sometimes what you’re trying to capture won’t happen within a fixed time window. Maybe it’s a sporadic bug, or it happens much later after the application has started. In those cases, you can also turn off the trace when initiated by the developer. Or alternatively, create a small script to periodically check for the problematic event and turn off tracing once it’s found. Here you’ll want LTTng to capture events using a ring buffer that constantly overwrites old events. That way, you can see the last several seconds prior to when you stop the trace, presumably as soon as the bug shows itself. We won’t go into that use case here; you’ll want to investigate the event record loss modes in the documentation to understand how it works and set things up properly.
Capturing the boot process
You’ve now got everything configured and LTTng is ready to go. How do you capture your first trace? To diagnose problems in the boot process, add some systemd scripts to fire up the trace during the system startup.
Once these scripts are in place, you’ll want to enable the trace start/stop service, so they’ll be called during a reboot:
This will trace 20 seconds of system behavior during the boot, so you can catch whatever is slowing down your boot process.
Next time…
That’s about it to setup full stack tracing! Stay tuned for our next blog where we’ll continue with how to view and interpret your traces.
Trusted software excellence across embedded and desktop platforms
The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.