Using Single Wire Output when parallel trace isn’t available

Your hardware might not have the 20-way connector on board that ORBTrace needs for its TRACE capabilities, but don’t worry, all is not lost: SWO is here to help, with an auto-configured alternative that provides a link at up to 48Mbps.

SWO is Single Wire Output (Arm’s own documentation calls it Serial Wire Output). The clue is in the name: it’s a single-wire, output-only debug data link on Cortex-M3 and above, defined in the ARM CoreSight Components Technical Reference Manual. It can carry Instrumentation Trace Macrocell (ITM) and Data Watchpoint & Trace (DWT) messaging from your target back to the host… which is a complicated way of saying it gives you access to the multi-channel debug streams and PC sampling (‘top’) that you get with full-blown TRACE, but at the reduced expense of only one pin, which is generally there on your 10-pin debug connector already.

On most processors SWO re-purposes the JTAG TDO pin, so it can only be used in Serial Wire Debug (SWD) mode. To be honest, if you’ve got SWD available you’d be a bit crazy to be using JTAG anyway, so that’s no great constraint.

Data over the SWO can be carried in a UART frame or, alternatively, via Manchester encoding. UART is more ubiquitous, and most examples you’ll find across the net use it; Orbuculum, for example, has long supported FTDI-style Serial-to-USB adaptors to collect SWO at up to 12Mbps, giving an effective SoC-to-Application data rate of somewhere just north of 900KBytes/sec.
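
As a rough sanity check on that figure, here is a back-of-envelope sketch. The 8N1 framing (10 bits on the wire per byte) and the assumption that every ITM write is a full 32-bit word behind a one-byte header are mine, not something measured on real hardware, so treat the result as an upper bound for that particular traffic pattern.

```python
# Back-of-envelope SWO/UART throughput. Framing and packet shape are assumptions.
LINE_RATE_BPS = 12_000_000   # UART line rate quoted in the text
BITS_PER_WIRE_BYTE = 10      # assumed 8N1 framing: start + 8 data + stop
ITM_HEADER_BYTES = 1         # one header byte in front of each ITM software packet
ITM_PAYLOAD_BYTES = 4        # assumed: every write is a full 32-bit word


def payload_rate(line_rate_bps, bits_per_wire_byte, header_bytes, payload_bytes):
    """Application bytes per second once framing and ITM headers are paid for."""
    wire_bytes_per_sec = line_rate_bps / bits_per_wire_byte
    return wire_bytes_per_sec * payload_bytes / (header_bytes + payload_bytes)


rate = payload_rate(LINE_RATE_BPS, BITS_PER_WIRE_BYTE,
                    ITM_HEADER_BYTES, ITM_PAYLOAD_BYTES)
print(f"{rate / 1000:.0f} KBytes/sec")
```

Single-byte writes would halve that, since each byte then drags its own header along with it.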

For all of UART’s ubiquity, Manchester encoding has the distinct advantage that it carries timing information along with the payload. That means that you don’t have to configure the link baudrate and, if the speed of communication changes (e.g. your SoC goes into a low clock, low power mode), it remains possible to recover useful data. The gotcha with Manchester is that not too many interfaces support it, so it’s always been the poor relation, even though it seems to be provisioned on the vast majority of SoCs.

ORBTrace, from V1.1.0 onwards, natively supports Manchester encoding and, what’s more, it supports it at bitrates from 16Kbps to 48Mbps, with no requirement for pre-configuration: it’s literally plug-and-go. Here’s the result of an overnight test of an SWO/Manch link running at 48Mbps, transferring some deeply meaningful strings;

$ time orbcat -c 0,"%c" | sort | uniq -c
         1 ABCDEFGHIJKLMNOPQRST
2385153869 ABCDEFGHIJKLMNOPQRSTUVWXYZ_*_abcdefghijklmnopqrstuvwxyz
real 490m37.616s
user 82m15.087s
sys 11m2.847s
$

That represents a SoC to Application data rate of 4.32MBytes/sec… not too shabby for one wire and no config! Link reliability looks pretty good too, which is important given that there’s no native error detection or correction in the SWO protocol.
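
If you want to check the arithmetic, the sketch below reproduces it from the numbers in the test output. It assumes each string is terminated by a single newline and ignores the one truncated line at the start of the capture; both are my assumptions about the test, not something the output states.

```python
# Reproduce the throughput figure from the overnight test output.
LINES_RECEIVED = 2_385_153_869
LINE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ_*_abcdefghijklmnopqrstuvwxyz"
BYTES_PER_LINE = len(LINE) + 1      # assumes a single '\n' terminator per string
ELAPSED_S = 490 * 60 + 37.616       # the 'real' time reported by the shell

total_bytes = LINES_RECEIVED * BYTES_PER_LINE
rate = total_bytes / ELAPSED_S      # bytes per second arriving at orbcat

print(f"{total_bytes} bytes in {ELAPSED_S:.0f}s")
print(f"{rate / 2**20:.2f} MiBytes/sec, {rate / 1e6:.2f} MBytes/sec (decimal)")
```

That lands at roughly 4.3 when a megabyte is counted as 2^20 bytes, in line with the figure quoted above, and a little over 4.5 in decimal megabytes.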

So, how to access these newfound riches? It couldn’t really be easier. All you need to do is configure ORBTrace to collect trace via SWO/Manch (the -T m option to orbtrace) and your application to output SWO/Manch (typically via the gdbtrace.init commands prepareSWO and enableSTM32SWO, or whatever equivalent sets up the pins on your SoC).

Let’s demonstrate. We’ll use the ‘simple’ application from the orbmule repository, and blackmagic probe to connect to ORBTrace, as shown previously. First of all, compile the ‘simple’ application for your target;

$ make
Compiling src/itm_messages.c
Compiling src/main.c
Assembling system/startup_ARMCM4.S
text data bss dec hex filename
1500 1080 10068 12648 3168 ofiles/simple.elf
$

Now, let’s set up a .gdbinit file to automate everything. The -T m on the orbtrace command line is the magic bit…that means perform trace over SWO. Couple that with configuring the SWO port to export Manchester encoded data in the prepareSWO line and you’re good to go;

!orbtrace -e vtref,on -p vtref,3.3 orbtrace -T m
file ofiles/simple.elf
target extended-remote localhost:2000
set mem inaccessible-by-default off
monitor swd
attach 1
load
source Support/gdbtrace.init
enableSTM32SWO 4
prepareSWO 16000000 8000000 0 1
dwtSamplePC 1
dwtSyncTap 3
dwtPostTap 1
dwtPostInit 1
dwtPostReset 10
dwtCycEna 1
ITMTXEna 1
ITMEna 1
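
For the curious, the two big numbers on the prepareSWO line look like a clock and a line rate. The sketch below shows the usual TPIU relationship between the two, where the output clock is the reference clock divided by (prescaler + 1). I am assuming here that the first argument is the CPU (trace reference) clock in Hz and the second is the requested SWO rate in Hz, and that prepareSWO programs the prescaler this way; check gdbtrace.init for what it actually does. The function name is made up for illustration.

```python
def swo_prescaler(reference_hz, swo_hz):
    """Prescaler value such that reference_hz / (prescaler + 1) == swo_hz.

    Hypothetical helper: mirrors the standard TPIU async clock prescaler
    relationship, not necessarily the exact sum done inside prepareSWO.
    """
    if swo_hz <= 0 or reference_hz % swo_hz:
        raise ValueError("SWO rate must divide the reference clock exactly")
    return reference_hz // swo_hz - 1


# Values taken from the prepareSWO line above (argument meaning assumed).
print(swo_prescaler(16_000_000, 8_000_000))
```

The practical upshot is that the SWO rate has to be an integer division of the clock, which is one more reason a receiver that auto-detects the rate is handy.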

You’ll find a copy of this file in the orbmule repository in the file gdbinit_files/gdbinit-bmp-swo. Start blackmagic probe in a new window, and then you can launch gdb to download to your target;

$ arm-none-eabi-gdb -q
Available Targets:
No. Att Driver
1 STM32F40x M4
0x08000638 in ?? ()
Loading section .text, size 0x5d4 lma 0x8000000
Loading section .ARM.exidx, size 0x8 lma 0x80005d4
Loading section .data, size 0x438 lma 0x80005dc
Start address 0x8000298, load size 2580
Transfer rate: 10 KB/sec, 516 bytes/write.
(gdb) c
Continuing.

…at this point you will have noticed the ‘Trace’ LED on ORBTrace has gone green (or possibly red if there’s a _lot_ of data flowing). So let’s collect those data using orbuculum, and monitor how many are arriving;

$ orbuculum -m 1000
57.2 KBits/sec

…and now, in another window, let’s decode the first debug channel;

$ orbcat -c 0,"%c"
FGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMN...(etc)
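
What is orbcat doing to get there? In essence it picks the software (stimulus port) packets for channel 0 out of the ITM stream and prints their payload. Here is a much simplified sketch of that idea, written from the ITM packet layout in the Arm architecture documentation rather than from the orbcat source; the function name is hypothetical and timestamp, sync and hardware packets are merely stepped over.

```python
def decode_itm_channel(stream: bytes, port: int = 0) -> bytes:
    """Collect the payload bytes of ITM software packets for one stimulus port."""
    out = bytearray()
    i = 0
    zero_run = 0
    while i < len(stream):
        header = stream[i]
        i += 1
        if header == 0x00:                      # padding / body of a sync packet
            zero_run += 1
            continue
        if header == 0x80 and zero_run >= 5:    # closing byte of a sync packet
            zero_run = 0
            continue
        zero_run = 0
        size_code = header & 0x03
        if size_code == 0:
            # Protocol packet (overflow, timestamp, extension). If the header
            # has its continuation bit set, skip payload bytes until one has
            # the continuation bit clear.
            if header & 0x80:
                while i < len(stream) and stream[i] & 0x80:
                    i += 1
                i += 1
            continue
        size = (1, 2, 4)[size_code - 1]         # source packets carry 1, 2 or 4 bytes
        payload = stream[i:i + size]
        i += size
        if len(payload) < size:
            break                               # truncated packet at end of capture
        is_software = (header & 0x04) == 0      # bit 2 set means hardware (DWT) source
        if is_software and (header >> 3) == port:
            out += payload
    return bytes(out)


capture = bytes([0x01, ord("H"), 0x01, ord("i"),    # two 1-byte writes on port 0
                 0x09, ord("x"),                    # a write on port 1, ignored
                 0x70,                              # overflow packet
                 0x17, 0x98, 0x02, 0x00, 0x08,      # DWT PC sample, ignored
                 0x03]) + b"ABCD"                   # one 4-byte write on port 0
print(decode_itm_channel(capture))
```

The PC sample packet that gets ignored here is exactly what orbtop feeds on, which is why both tools can share the same stream.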

While we’re at it, we might as well just look at the ‘top’ output… it’s not very exciting for such a simple program, so we’ll split it by line number to pad it out a bit;

$ orbtop -e ofiles/simple.elf -l
24.55% 348 _sieve::37
22.08% 313 _sieve::39
20.95% 297 _sieve
 8.60% 122 _sieve::28
 6.98%  99 _sieve::30
 5.43%  77 _sieve::45
 5.01%  71 _sieve::43
 4.51%  64 _sieve::40
 0.91%  13 _sieve::34
 0.49%   7 _sieve::36
 0.42%   6 _sieve::42
-----------------
99.93% 1417 of 1417 Samples
[-S-H] Interval = 1000mS
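
Where does a figure of about 1400 samples per second come from? A plausible reading, sketched below, is that it falls straight out of the DWT settings in the .gdbinit file. I am assuming that dwtPostTap 1 selects the CYCCNT bit 10 tap (one tick every 1024 cycles), that dwtPostReset 10 is the POSTCNT reload value so that every eleventh tick produces a sample, and that the core is running at the 16MHz given to prepareSWO. None of that is stated in the output itself.

```python
# Estimate the DWT PC sampling rate from the settings used above (all assumed).
CPU_HZ = 16_000_000                    # first argument to prepareSWO
CYCTAP_DIVIDERS = {0: 64, 1: 1024}     # tap on CYCCNT bit 6 or bit 10


def pc_samples_per_second(cpu_hz, cyctap, postpreset):
    """One sample per (postpreset + 1) ticks of the selected cycle-counter tap."""
    return cpu_hz / CYCTAP_DIVIDERS[cyctap] / (postpreset + 1)


rate = pc_samples_per_second(CPU_HZ, cyctap=1, postpreset=10)
print(f"about {rate:.0f} samples/sec expected")
```

That is within a fraction of a percent of the 1417 samples orbtop counted in its 1000mS interval, so the sampling is keeping up with no obvious losses on the link.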

So, there you have it. (Some) tracing with no requirement for parallel trace pins. The only thing that’s different between the TRACE pins, SWO/Manch and SWO/UART is the line protocol; the data are just the same, so you’ve got access to the same tooling when you’re using SWO as you have when you’re using the TRACE pins, modulo bandwidth limits. We’re hardly testing the limits of the channel using the ‘simple’ example, but feel free to stretch its legs.

Some SoCs even provide TRACE data over the SWO, so you might be able to get orbmortem running over it on your device… do let us know.