Cross-bus analysis puts troubleshooting in the fast lane
01 November 2007
Today’s digital systems, from the video game console in the media room to the complex switching elements in a communication network, rely heavily on serial bus technology. Not surprisingly, a host of application-specific serial buses has emerged.
Serial ATA handles communication between chipsets and disk drives. HDMI manages data going from digital A/V sources to display devices. PCI Express (PCIe), designed to connect peripheral devices in the PC environment, now finds itself in a range of applications not served by other specialised interfaces. In a given electronic system, it is not unusual to find all of these buses co-existing, and potentially several parallel buses as well.
This trend has intensified the demand for cross-bus troubleshooting solutions that offer a simple integrated way to view logical activity on several different buses at once. A variety of solutions exists.
The traditional approach is to pair a standard-specific protocol analyser with a logic analyser (LA); the former takes care of the serial acquisition while the LA captures parallel bus data that may pertain to the troubleshooting issue at hand.
Another approach is to use an LA with a bus support package that includes an external interface to convert serial data into the parallel data used by the logic analyser.
New methodology
Tektronix TLA7000 Series logic analysers can be equipped with integrated PCI Express serial acquisition modules that plug directly into the LA mainframe just like their parallel counterparts. Users can mix serial and parallel acquisition modules within a single system. With the addition of this serial capability, the TLA7000 family can capture and display time-correlated parallel and serial data as well as analogue waveforms from an oscilloscope, all on the same LA screen.
Cross-bus analysis often begins with a dual-trace LA display showing an analogue waveform, captured with an oscilloscope and ported to the TLA7000, plotted against a serial or parallel bus data acquisition.
The iView function screen shows whether the analogue behaviour of the serial signal is causing the errors seen on the serial bus. The user can see at a glance whether the analogue waveform is within normal tolerances.
This capability is destined to simplify digital troubleshooting. Using a combination of PCI Express serial and parallel modules, cross-bus analysis can be performed by one logic analyser system.
Analogue troubleshooting
Digital solutions often begin with analogue troubleshooting. The underlying architecture of a PCI Express serial link is well-established. Often embedded as an element within an FPGA, a PCI Express transmitter with a SerDes (serialiser-deserialiser) at its heart sends 8b/10b encoded information to a receiver elsewhere in the system. Transmission impedances, bit rates, and clock characteristics are explicitly specified and controlled to ensure interoperability between different manufacturers’ PCI Express components.
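Because 8b/10b coding keeps every 10-bit code group close to DC balance, captured symbols can be given a quick sanity check before any full decode. The following is a minimal Python sketch of that idea, not part of any Tektronix tool: the function names are hypothetical, and it assumes each symbol is held as an integer with the first-listed bit of the code group as the most significant bit. A disparity check flags only a subset of coding errors; it does not prove that a symbol is a valid code group.

```python
# K28.5, the usual comma symbol, in its two 10-bit forms.
K28_5_RD_MINUS = 0b0011111010  # form sent when running disparity is negative
K28_5_RD_PLUS = 0b1100000101   # form sent when running disparity is positive


def disparity(symbol: int) -> int:
    """Return (number of ones) minus (number of zeros) for a 10-bit group."""
    if not 0 <= symbol < (1 << 10):
        raise ValueError("symbol must fit in 10 bits")
    ones = bin(symbol).count("1")
    return 2 * ones - 10


def has_comma(symbol: int) -> bool:
    """True if the group starts with a comma sequence (0011111 or 1100000)."""
    return (symbol >> 3) in (0b0011111, 0b1100000)


def disparity_violations(symbols):
    """Return indices of symbols that break the 8b/10b disparity rules.

    A code group must have disparity -2, 0 or +2, and successive
    non-zero-disparity groups must alternate in sign.
    """
    violations = []
    last_nonzero = 0  # unknown until the first unbalanced group is seen
    for index, symbol in enumerate(symbols):
        d = disparity(symbol)
        if d not in (-2, 0, 2):
            violations.append(index)
        elif d != 0:
            if d == last_nonzero:
                violations.append(index)
            last_nonzero = d
    return violations
```

Run over a captured symbol stream, a burst of flagged indices would point towards bit errors on the link itself, whereas a clean result shifts suspicion towards the logic driving it.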
Though this link is a digital system, errors may have either digital or analogue origins. Frequently the first step in troubleshooting is to take a ‘snapshot’ of the analogue waveforms at the time of the error.
Analogue integration
Some logic analysers include features that enable them to integrate analogue acquisitions (waveforms) from a connected oscilloscope into the digital LA display. The analogue traces are time-correlated with their digital counterparts. This makes it possible to observe analogue events such as glitches and runt pulses concurrently with the digital events that may be their cause or consequence. A logic analyser equipped with parallel modules and serial modules and this analogue display capability is an unmatched cross-bus analysis platform.
The logic analyser acquires parallel data from the debug ports simultaneously with serial data from the PCI Express link. Acquisition modules marked ‘P’ are parallel while the unit marked with an ‘S’ is a dedicated PCI Express serial module. All data traces are time-correlated when an integrated logic analyser is used.
PCI Express transmitter/receiver pairs, common in consumer electronic products and telecommunication systems, often include not only a serial link but also a built-in ‘debug port’. This parallel output delivers real-time data summarising the transactions occurring within the device. With debug ports on both the transmitter and the receiver, developers can monitor the health of the transmission link and localise many types of problems to either the transmit or the receive side.
Parallel data’s view
The image is of a block diagram showing a test set-up for the PCI Express serial link and its transmitter and receiver state machines. Assume that this is a troubleshooting routine designed to locate the origin of garbled data appearing on the serial link. The debug ports are of course connected to a parallel acquisition module, while the PCI Express link connects to a serial module.
In a serial busform trace, the serial errors are shown to coincide with an incorrect state change in the debug port state machine. This implies a timing problem within the SerDes, which may stem from errors in the FPGA synthesis process.
A screen view from the logic analyser acquisition adds the parallel data stream captured from the receiver’s debug port. The new logic analyser busform trace includes hexadecimal values.
Traces are time-correlated due to the tightly integrated serial and parallel acquisition modules operating within the same logic analyser mainframe. In some cases, the serial bus transition may lag behind the debug port output due to latency, i.e., the time required for the serial buffer to flush its contents after the state has changed. In such instances the timing differential visible in the cross-bus view will reflect this latency accurately.
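Once both acquisitions share a time base, this latency can be read off numerically as well as visually. The sketch below assumes the timestamps have been exported from the analyser as two sorted lists of nanosecond values; the function name and the data layout are hypothetical and serve only to illustrate the measurement.

```python
from bisect import bisect_left


def serial_lag(state_changes_ns, serial_transitions_ns):
    """For each debug-port state change, return the delay in nanoseconds
    until the first serial bus transition at or after it.

    Both inputs must be sorted in ascending order. None is returned for a
    state change that has no later serial transition in the capture.
    """
    lags = []
    for t in state_changes_ns:
        i = bisect_left(serial_transitions_ns, t)
        if i < len(serial_transitions_ns):
            lags.append(serial_transitions_ns[i] - t)
        else:
            lags.append(None)
    return lags
```

A lag that stays roughly constant from one state change to the next is consistent with ordinary buffer latency; one that varies widely would merit a closer look.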
Even when the traffic on the serial bus is so dense that individual cycles cannot be displayed at the current resolution, it is still clear that a portion of the serial trace coincides with the 001 state on the state machine trace. The timing of that portion of the serial trace matches up correctly with the idle state on the debug port. The link is operational and communicating, but it is not following its intended routine.
Detecting time-related issues
As the serial data errors coincide with the overflow state on the debug port, and because the serial data is driven by the SerDes, it is reasonable to assume that the problem is timing-related and originates within the SerDes.
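The same coincidence can be checked over a whole capture rather than by eye. The following sketch tallies serial errors by the debug-port state that was active when each one occurred. It assumes the state changes have been exported as time-sorted (timestamp, state) pairs and the errors as a list of timestamps; the function names and state labels are illustrative assumptions, not analyser output.

```python
from bisect import bisect_right
from collections import Counter


def state_at(changes, t):
    """Return the state in force at time t.

    `changes` is a time-sorted list of (timestamp, state) pairs, each
    marking the moment the debug port entered that state. None is
    returned if t precedes the first recorded change.
    """
    times = [timestamp for timestamp, _ in changes]
    i = bisect_right(times, t) - 1
    return changes[i][1] if i >= 0 else None


def errors_by_state(changes, error_times):
    """Count serial errors per debug-port state active at the error time."""
    return Counter(state_at(changes, t) for t in error_times)


# Hypothetical capture: timestamps in nanoseconds.
changes = [(0, "idle"), (100, "active"), (250, "overflow"), (300, "idle")]
errors = [120, 255, 260, 290]
tally = errors_by_state(changes, errors)
```

If nearly all errors fall under a single state, as they do under "overflow" in this made-up data, that supports the inference that the fault is tied to the state machine rather than to the link in general.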
At this point, there may be several potential troubleshooting strategies, influenced by architectural considerations or other debug findings.
Most commonly, serial link features are incorporated into an FPGA. This device is designed to transform itself into functional elements defined by the programmer. This transformation process is known as synthesis; it literally synthesises the desired functions using its internal gates. The astute designer will troubleshoot the error first by double-checking the FPGA synthesis results to make sure the timing of all state machine transitions is correctly implemented.
If that does not reveal the problem’s source, a second step is to route other signals to the debug connector to trace the device’s behaviour.
For example, after evaluating the current state data, the FPGA might be reprogrammed to deliver the ‘next state’ data to the debug port. This could reveal issues that are not seen in the current state, and there are even more states that can be investigated beyond that.
Frequently, tracing a system problem involves more than just following a glitch back to its source in some logic element. An error on one bus may originate on, and impact, multiple buses in the system. With the advent of integrated tools that bring time-correlated serial, parallel, and even analogue events into view on a logic analyser screen, it is possible to see simultaneous interactions throughout the system, speeding efforts to track down not just errors, but also their root causes.
DAVE IRELAND is EMEA design and manufacturing manager, Tektronix