A new approach to highly parallel wafer-level reliability systems

Author : Joris Donders, Se

12 May 2017

Figure 1: The modular PXI platform provides scalable, high-density solutions for test applications

With semiconductor content increasingly appearing in everything we use today, it is becoming even more important to ensure that semiconductor devices maintain their performance over a given lifetime. Reliability testing has long served as a method for semiconductor manufacturers to ensure this. Not only is the number of semiconductor devices growing, but their complexity is also increasing as innovative processes reduce device geometries and add integrated technologies, such as wireless connecti

IC manufacturers push the boundaries daily and need the long-term reliability of their ICs to remain unaffected, as they focus on applications such as autonomous driving, monitoring in healthcare and even cloud-based data storage. It is easy to understand why assurances on product reliability are demanded by customers in these ‘mission critical’ applications.

To maintain and improve quality while addressing these challenges, semiconductor companies are vastly increasing the amount of reliability data they collect and analyse, while continuing to pursue a lower cost of test. These competing demands have proven difficult to meet with traditional approaches, leading some engineers to turn toward modular, flexible solutions that can scale to fit their needs.

Reliability testing

Device reliability is typically modeled as failure rate over time, with the highest failure rates occurring immediately after manufacturing and again after the product has exceeded its useful lifetime. 
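This shape is often called the bathtub curve. The following is a minimal sketch of one way to model it, assuming the failure rate is the sum of a decreasing Weibull hazard (early failures), a constant random-failure rate and an increasing Weibull hazard (wear-out). The function names and every parameter value are illustrative assumptions, not figures from any real device.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_failure_rate(t_hours, random_rate=1e-5):
    """Illustrative bathtub curve: early failures + random failures + wear-out."""
    # shape < 1 gives a hazard that falls with time (early-life defects)
    early = weibull_hazard(t_hours, shape=0.5, scale=1e4)
    # shape > 1 gives a hazard that rises with time (wear-out)
    wear_out = weibull_hazard(t_hours, shape=5.0, scale=1e5)
    return early + random_rate + wear_out


for hours in (10, 1_000, 20_000, 100_000, 200_000):
    print(f"{hours:>8} h  {bathtub_failure_rate(hours):.3e} failures/h")
```

With these assumed parameters the printed rate starts high, falls to a minimum during the useful life and rises again as wear-out dominates.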

The left side of the graph in Figure 2 shows early failures, often caused by defects in the manufacturing process. These types of failures can be screened during production to minimise the number of defective parts sent to customers. However, the functional tests performed during production cannot identify defects that cause the device to prematurely wear out, and therefore do not offer insight into the product’s usable lifetime. Reliability testing identifies these types of failure mechanisms and estimates the product’s usable lifetime. 

This involves stressing a device at the extreme ends of its specifications (usually voltage and temperature) to accelerate device wear-out and model the usable lifetime against known failure mechanisms. In the semiconductor industry, these tests can be performed on a wafer or on a packaged part. Because of the cost of packaging and the risk of damage during the packaging stage, most companies use wafer-level reliability (WLR) tests, which provide more data earlier in the manufacturing process.
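To illustrate how elevated temperature accelerates wear-out, the sketch below uses the Arrhenius acceleration factor, a widely used model for thermally activated failure mechanisms. The article does not name a specific model, so treat this choice, the activation energy of 0.7 eV and both temperatures as assumptions made for the example. Voltage acceleration is not modelled here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, use_temp_c, stress_temp_c):
    """Acceleration factor AF = exp((Ea/k) * (1/T_use - 1/T_stress)), T in kelvin."""
    t_use_k = use_temp_c + 273.15
    t_stress_k = stress_temp_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


# Assumed values: 0.7 eV activation energy, 55 °C use, 125 °C stress
af = arrhenius_acceleration(ea_ev=0.7, use_temp_c=55.0, stress_temp_c=125.0)
stress_hours = 1_000.0
print(f"Acceleration factor: {af:.1f}")
print(f"{stress_hours:.0f} h at stress is roughly {af * stress_hours:.0f} h at use conditions")
```

Under this model, an hour at the stress temperature stands in for many hours at the use temperature, which is what makes lifetime estimates feasible within a practical test time.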

Wafer-level reliability

WLR is a type of parametric test that typically involves applying a stress, such as voltage or current, and measuring the response of the device to uncover any unexpected signs of degradation. This yields information about the usable lifetime and long-term reliability of the device. Since reliability typically depends on the manufacturing process and not on the device’s functionality, the tests are usually performed on test structures or purpose-built die rather than on the actual IC being developed. These structures consist of transistors, capacitors and resistors that are built into the wafer specifically for this purpose.

Figure 2: A typical model of device reliability

Common failure mechanisms include:

• Bias or negative bias temperature instability (BTI or NBTI)

• Hot-carrier injection (HCI)

• Time-dependent dielectric breakdown (TDDB)

• Electromigration (EM)
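As an illustration of how stress-and-measure data can be turned into a lifetime estimate, the sketch below assumes that a monitored parameter (here a threshold-voltage shift in millivolts) degrades as a power law of stress time, a form commonly used for BTI and HCI data. The data are synthetic, and the function names, the 50 mV failure criterion and the power-law coefficients are all assumptions made for the example.

```python
import math


def fit_power_law(times_s, shifts_mv):
    """Least-squares fit of shift = a * t**n, performed in log-log space."""
    xs = [math.log(t) for t in times_s]
    ys = [math.log(s) for s in shifts_mv]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    n = numerator / denominator
    a = math.exp(y_mean - n * x_mean)
    return a, n


def time_to_criterion_s(a, n, criterion_mv):
    """Stress time at which the fitted shift reaches the failure criterion."""
    return (criterion_mv / a) ** (1.0 / n)


# Synthetic measurements taken at log-spaced stress times (assumed a=2, n=0.2)
times_s = [1, 10, 100, 1_000, 10_000]
shifts_mv = [2.0 * t ** 0.2 for t in times_s]

a, n = fit_power_law(times_s, shifts_mv)
ttf_s = time_to_criterion_s(a, n, criterion_mv=50.0)
print(f"a = {a:.3f} mV, n = {n:.3f}, time to 50 mV shift = {ttf_s:.3e} s under stress")
```

The result is a time to failure under stress conditions; converting it to use conditions requires an acceleration model, which this sketch leaves out.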

Traditional approach to building WLR systems

Most CMOS (Complementary Metal-Oxide-Semiconductor) devices are tested with DC instruments, such as source measure units (SMUs), which supply the necessary stress and measurement capability for collecting parametric data. Excluding specialised systems involving high-frequency AC, most WLR systems follow one of two main approaches: rack-and-stack systems built from traditional box instruments, or purpose-built turnkey systems.

Closer look at rack-and-stack systems vs turnkey systems

Pre-packaged turnkey solutions, which combine instrumentation and software to provide all the required functionality, might look like the fastest way to succeed. However, aligning your test requirements with the vendor-defined functionality of such a solution can prove difficult. Additionally, this approach often requires the largest capital budget, and because the solution is pre-defined, it does not easily scale to changing needs or application requirements. Over time, these solutions typically prove to have a higher total cost of ownership than anticipated at the outset.

Figure 3: Industry analysts predict PXI will continue to be the leading modular platform

Traditional rack-and-stack solutions, by contrast, rely on expensive high-precision DC instruments such as SMUs. These SMUs are typically limited in channel count and are often combined with low-leakage switching matrices, yielding a system limited to 20-40 channels in a full 19-inch test rack. Because the switches need to preserve the high-end specifications of the SMUs, they have a high cost per channel of up to $10,000.

Challenges of traditional WLR systems

Engineers are opting to build parallel test systems using modular instrumentation on the PXI platform mainly because of two challenges with the alternative WLR approaches. Both approaches, buying purpose-built systems and building rack-and-stack systems from box instrumentation, have served their purpose for decades. Today, however, many engineers are finding that these architectures no longer meet their new channel-density and cost requirements.

The first challenge is that turnkey systems do not provide the flexibility needed to modify the test software or hardware. As device requirements change, modifications become expensive.

The second challenge is that rack-and-stack systems are limited by the low channel density of traditional box SMUs. This low density makes it difficult to build high-channel-count systems with small footprints, which often forces engineers to use a switched topology to multiplex each SMU across multiple pins. However, this switched topology quickly becomes a bottleneck because the pins are tested serially instead of in parallel. As a result, implementing advanced algorithms that require constant stress and monitoring proves impossible.
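The cost of serial testing can be estimated with simple arithmetic. The sketch below compares a switched topology, in which each SMU works through its share of pins one at a time, with a topology that has one SMU per pin. The pin count, SMU count, stress time and measurement time are assumed values chosen only to show the scaling, and switch settling time is ignored.

```python
import math


def switched_test_time_s(n_pins, n_smus, stress_s, measure_s):
    """Total time when each SMU is multiplexed across pins, one pin at a time."""
    batches = math.ceil(n_pins / n_smus)
    return batches * (stress_s + measure_s)


def smu_per_pin_test_time_s(stress_s, measure_s):
    """Total time when every pin has its own SMU and all pins run in parallel."""
    return stress_s + measure_s


# Assumed example: 40 pins, 4 box SMUs behind a switch matrix
n_pins, n_smus = 40, 4
stress_s, measure_s = 1_000.0, 1.0

serial_s = switched_test_time_s(n_pins, n_smus, stress_s, measure_s)
parallel_s = smu_per_pin_test_time_s(stress_s, measure_s)
print(f"Switched: {serial_s:.0f} s, SMU per pin: {parallel_s:.0f} s, "
      f"ratio: {serial_s / parallel_s:.1f}x")
```

In this simplified model the test time grows with the number of pins sharing each SMU, and a pin receives no stress while its SMU is connected elsewhere.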

A new approach for building WLR systems

The market for test instrumentation has changed dramatically with the rise of modular platforms such as PXI. Modular platforms have grown increasingly desirable for building automated test systems because of their extensive I/O capability, compact form factor and flexible software. 

Using a modular approach can reduce the footprint of WLR systems without sacrificing measurement quality. The open software architecture allows you to define the functionality of your system, modify tests and add hardware as your requirements change. 

Figure 4: High-uptime PXI chassis with redundant fans and power supplies

High-density source measure units

Modular platforms such as PXI allow you to build systems with hundreds of SMU channels whilst maintaining a reasonable footprint and cost per channel. The high channel density of these instruments makes it possible to connect each test pad directly to a high-precision SMU instead of routing signals through a switching subsystem. This ‘SMU per pin’ architecture avoids the negative impact that switches have on signal integrity and test time, giving you the flexibility to implement advanced stress-measure algorithms.

High uptime and serviceability

Managing system uptime is critical for both inline and offline reliability systems. If an inline system fails, wafer production can stop. Offline reliability tests do not directly influence wafer production, but they do involve experiments that can run for several months. Ensuring a tester stays active and continues to acquire data is essential for the experiment’s success. A failed tester can lead to a failed experiment.

High-uptime applications often require systems with built-in redundancy for high-risk parts. Building a test system with redundant, hot-swappable fans and power supplies allows you to mitigate the failure risk associated with these parts and ensures that the test system continues running after a component failure. If the component is also hot-swappable, you can service the system without powering down the chassis and aborting the experiment. Additionally, you can remotely monitor the health of your system for fan speed, temperature, power consumption and other key parameters that may indicate an upcoming failure.
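A health check of this kind can be as simple as comparing each reading against an allowed range. The sketch below is hypothetical throughout: the parameter names, the limits and the dictionary of readings are invented for the example and do not correspond to any particular chassis or vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical limits; real values depend on the chassis and its environment
LIMITS = {
    "fan_rpm": Limit(2000.0, 6000.0),
    "intake_temp_c": Limit(0.0, 50.0),
    "power_w": Limit(0.0, 800.0),
}


def check_health(readings):
    """Return a warning for every reading that is missing or out of range."""
    warnings = []
    for name, limit in LIMITS.items():
        value = readings.get(name)
        if value is None:
            warnings.append(f"{name}: no reading")
        elif not (limit.low <= value <= limit.high):
            warnings.append(f"{name}: {value} outside [{limit.low}, {limit.high}]")
    return warnings


# Hypothetical snapshot in which one fan is slowing down
snapshot = {"fan_rpm": 1200.0, "intake_temp_c": 31.0, "power_w": 420.0}
for warning in check_health(snapshot):
    print("WARNING:", warning)
```

Running such a check periodically during a long experiment gives an operator time to swap a degrading part before it fails.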

Parallel systems drive competitive advantage

Traditional reliability systems have served their purpose for decades. However, the inability of these systems to provide and analyse massive amounts of reliability data is becoming a bottleneck. To address these needs, many companies are turning to modular platforms such as PXI, to build highly parallel WLR systems with high uptime and the latest commercial processors. Using the software-defined architecture of these systems, companies can maintain control of their intellectual property and scale their systems as requirements change. This approach satisfies their need for more reliability data at lower cost and allows them to address the ever-changing test requirements of the future.

