8/05/2025

Defect and Failure in VLSI, IC Failure & Testing - Ep:2






Understanding CMOS IC Failure:

CMOS technology is the backbone of modern electronic devices, powering everything from smartphones to microprocessors. Defects and failures in CMOS ICs can significantly impact performance, yield, and overall reliability, so it is critical to understand their causes and effects. This presentation covers bridging defects, gate oxide shorts, and open circuit defects. We will explore their causes, detection techniques, and strategies for preventing and mitigating them to improve yield and reliability. Effective defect management is crucial to optimizing semiconductor manufacturing processes and to sustaining Moore's Law as process nodes continue to scale down.

Bridging Defects: Unintended connections between circuit parts causing shorts, leading to malfunctions. Caused by process variations in fabrication.

Gate Oxide Shorts: Breakdown of the gate oxide layer creating conductive paths, impairing transistor performance. Result from deposition or etching defects.

Open Circuit Defects: Breaks in conductive paths preventing current flow, causing non-functionality or failure. Caused by poor bonding or material defects.

These defects degrade performance, reduce yield, and increase production costs. They are detected with advanced testing such as electrical testing and thermal imaging, and minimized through improved process control and inspection. Unresolved defects harm device reliability, which underscores the need for improved fabrication techniques.

Bridging Defects in ICs:

Unintentional connection between two or more circuit nodes. Causes abnormal electrical behavior that depends on circuit parameters & topology.

Major Bridge Defect Variables :

1.  Type: Ohmic or nonlinear

2.  Location: Intragate (i.e., between internal transistor nodes of a single gate), across I/O nodes of separate logic gates, or power rail to ground rail

3. Circuit Topology: Combinational or sequential

4. Interconnect Materials: Metal, polysilicon, diffusion region

5. Critical Resistance: Affected by transistor drive strength & W/L ratios

 Common Ohmic Bridge Defects :

1. Metal slivers bridging two interconnects

2. Large material deposits shorting multiple interconnects

3. Gate oxide shorts: ruptures in the transistor thin oxide that connect the gate to silicon structures

Impact on ICs :

1. Memory cells & flip-flops: May behave differently from combinational circuits

2. Power rail shorts (VDD to GND): Do not affect signal paths but must be controlled. They cause power leakage and reduce product lifespan, which is critical for low-power & battery-operated devices.


Critical Resistance in Bridging Defects :





What is Critical Resistance?

Critical resistance is the defect resistance above which the circuit functions correctly. It determines the impact of a bridging defect on circuit behavior.

Lower resistance → Higher risk of functional failure.



Defect Activation:

If there is no voltage drop across the bridge → No current → Defect is inactive.

If current flows through the bridge → Voltage drop → Defect is activated.

Voltage & Current Response to Defect Resistance:

Higher resistance → Output voltage approaches fault-free behavior.

Low resistance (≤ 5 kΩ) → Incorrect logic state → Functional failure.

Threshold for Logic Error:

If defect resistance is ≤ 5 kΩ, a logic error occurs. If defect resistance is > 5 kΩ, logic is correct but may be weak & noise-sensitive.

Factors Affecting Critical Resistance :

Transistor strength: the W/L ratio determines the driving capability against the defect.

Logic threshold voltage (VTL): defines the functional failure threshold. It is found at Vin = Vout on the transfer curve (~VDD/2).

Critical resistance depends on circuit design & defect location. Functional failure occurs when the defect overpowers the transistor drive strength. Higher resistance means a more tolerable defect; lower resistance leads to logic errors & noise vulnerability.
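As a rough illustration of how drive strength sets the critical resistance, the sketch below treats the driving transistor as a fixed on-resistance pulling the node toward VDD while an Ohmic bridge pulls it toward ground. This linear voltage-divider picture is a simplifying assumption (real transistors are nonlinear), and the names `r_on`, `r_bridge`, and `v_tl` are illustrative rather than taken from any particular process.

```python
def bridged_output_voltage(vdd, r_on, r_bridge):
    """Voltage at a node driven high through r_on and bridged to ground through r_bridge."""
    return vdd * r_bridge / (r_on + r_bridge)


def critical_resistance(vdd, r_on, v_tl):
    """Bridge resistance at which the node voltage equals the logic threshold v_tl.

    Obtained by solving vdd * R / (r_on + R) = v_tl for R.
    """
    return r_on * v_tl / (vdd - v_tl)


def causes_logic_error(vdd, r_on, r_bridge, v_tl):
    """True when the bridge drags the node to or below the logic threshold."""
    return bridged_output_voltage(vdd, r_on, r_bridge) <= v_tl
```

With an assumed `vdd = 1.0`, `v_tl = 0.5`, and `r_on = 5000.0`, `critical_resistance` evaluates to 5 kΩ. Halving `r_on` (a stronger driver, i.e. a larger W/L) halves the critical resistance, so a stronger gate tolerates a lower-resistance defect.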

Fault Models for Bridging Defects : 

Bridge defects cause intermediate voltages at shorted nodes. Analog simulators (e.g., SPICE) can accurately analyze: (1) operating regions of the affected transistors, (2) intermediate voltages at the shorted nodes, (3) the induced increase in power supply current.

Test & diagnosis goal :

1. Predict faulty behavior and generate test vectors to expose bridging faults.

2. Most test pattern generators use a logic-level netlist to derive test vectors.

3. Logic simulators fail to model bridging faults (BFs) accurately because the shorted nodes sit at non-standard logic voltages.

4. Analog simulation is too slow for large circuits and often unnecessary. 

5. Logic fault models bridge the gap between analog behavior and logic-level representation.

6. These models provide a practical way to simulate and detect BFs during testing.

Common logic-level bridging fault models :

1. Stuck-at

2. Pseudo stuck-at 

3. Logic-wired AND/OR

4. Voting

5. Biased voting                  


Logic Fault Models: 

1. Stuck-at Fault Model (SAF) :

A fault model in which a signal line is permanently stuck at logic ‘0’ (stuck-at-0) or logic ‘1’ (stuck-at-1), regardless of actual circuit behavior. It is the simplest and most widely used logic fault model. It originated during the bipolar transistor IC era and was accurate for that technology. It has proven inadequate for CMOS but remains dominant due to computational efficiency, industry acceptance, and established tools.
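To make the stuck-at idea concrete, the sketch below injects a stuck-at fault into a tiny made-up circuit, y = (a AND b) OR c, and lists the input vectors whose output differs from the fault-free output. The circuit, the net names, and the `(net, value)` fault encoding are all illustrative assumptions, not part of any ATPG tool.

```python
from itertools import product


def circuit(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c; `fault` is an optional (net_name, stuck_value) pair."""
    nets = {"a": a, "b": b, "c": c}

    def value(name):
        # A faulty net always reads its stuck value, whatever drives it.
        if fault is not None and fault[0] == name:
            return fault[1]
        return nets[name]

    nets["n1"] = value("a") & value("b")   # internal net: output of the AND gate
    return value("n1") | value("c")


def detecting_vectors(fault):
    """Input vectors (a, b, c) whose output differs from the fault-free circuit."""
    return [v for v in product((0, 1), repeat=3)
            if circuit(*v) != circuit(*v, fault=fault)]
```

For example, `detecting_vectors(("n1", 0))` returns `[(1, 1, 0)]`: the only vector that sets the internal net to 1 and also lets the difference reach the output.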

2. Pseudo Stuck-at Fault Model :

This model leverages the leakage current caused by bridging faults. It simplifies automatic test pattern generation (ATPG) and improves fault coverage. A defect is considered detected as soon as the fault effect reaches the gate output; propagation to the primary outputs is not required, because the fault effect is observed as quiescent current at the power supply pin. Only minimal changes are needed to adapt existing stuck-at ATPG tools.
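The point that current-based detection needs only fault activation, not propagation, can be sketched as follows. The two bridged nodes and the gates driving them are hypothetical; a vector counts as a quiescent-current test whenever it drives the two nodes to opposite fault-free values.

```python
from itertools import product


def node_values(a, b, c):
    """Fault-free values of two internal nodes that are assumed to be bridged."""
    n1 = a & b      # output of a hypothetical AND gate
    n2 = 1 - c      # output of a hypothetical inverter
    return n1, n2


def iddq_detecting_vectors():
    """Vectors that put opposite values on the bridged nodes, so current flows through the bridge."""
    vectors = []
    for v in product((0, 1), repeat=3):
        n1, n2 = node_values(*v)
        if n1 != n2:
            vectors.append(v)
    return vectors
```

In this toy case half of all input vectors activate the bridge, and none of them has to propagate an error to a primary output to count as a test.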


3. Logic-Wired AND/OR Model :

Logic-wired fault models for bridging faults were originally developed for bipolar technologies like ECL and TTL. In these technologies, if two logic gate outputs are accidentally connected/shorted, the result often behaves like a logic OR or AND gate. This is because one logic level is electrically stronger than the other, so when a short occurs, the stronger signal dominates. This logic fault model is more versatile than the stuck-at model. However, it doesn't work well for CMOS technology, where neither logic level is inherently stronger. In CMOS, the voltage at a shorted node depends on several factors, including the relative sizes of the gates, the resistance of the short/bridge, and the input values.
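A minimal sketch of the wired-logic idea: both bridged nodes take the AND (if logic 0 dominates) or the OR (if logic 1 dominates) of their fault-free values. The function name and the model labels are illustrative.

```python
def apply_bridge(x, y, model):
    """Return the faulty values of two bridged nodes under a wired-AND or wired-OR model."""
    if model == "wired-AND":      # logic 0 is the electrically stronger level
        v = x & y
    elif model == "wired-OR":     # logic 1 is the electrically stronger level
        v = x | y
    else:
        raise ValueError(f"unknown model: {model!r}")
    return v, v                   # both nodes are forced to the same value
```

The bridge only changes anything when the nodes disagree: `apply_bridge(1, 0, "wired-AND")` gives `(0, 0)` and `apply_bridge(1, 0, "wired-OR")` gives `(1, 1)`.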


 


4. Voting Model :

The voting model improved the way bridging faults are represented. It considers what happens when shorted nodes are forced to opposite logic levels, so that pMOS and nMOS transistors compete to control the output. The model assumes that the group of transistors with the stronger drive (higher current) determines the final logic value. The voting model doesn't consider the effects of nonzero resistance in the bridge or the logic threshold of the connected gates. If the bridging resistance isn't zero, the outputs of the shorted gates can sit at different voltages due to voltage drops, which might be interpreted as different logic levels by the following gates.
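The "vote" can be sketched as a comparison of total drive strength. Here each entry is assumed to be the equivalent conductance (in arbitrary units) of one conducting pull-up or pull-down path onto the bridged node; how those numbers are derived from W/L is left out and would be a modeling choice.

```python
def voting_model(pull_up_conductances, pull_down_conductances):
    """Logic value at a zero-ohm bridged node: the stronger network wins the vote."""
    up = sum(pull_up_conductances)       # paths pulling the node toward VDD
    down = sum(pull_down_conductances)   # paths pulling the node toward ground
    if up > down:
        return 1
    if down > up:
        return 0
    return None                          # tie: this simple vote cannot decide
```

For instance, one pull-up path of strength 2.0 outvotes two pull-down paths of strengths 1.0 and 0.5, so the node resolves to 1.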

5. Biased Voting : This model addresses the problems that arise when the logic gates have different threshold voltages. Unlike the voting model, which uses a fixed starting voltage to calculate transistor conductance, the biased voting model calculates conductance based on the actual voltage at the bridged node. This way, it better reflects the real, nonlinear behavior of transistors when determining a device’s driving strength. It also considers the varying threshold voltages of the logic gates connected to the shorted outputs.

6. Mixed Description : Because a bridge defect causes analog behavior in the circuit, it requires a mixed modeling approach for accurate analysis. In this method, the entire circuit is modeled using digital logic, except for the fault location, which is simulated using an analog tool. This approach combines the accuracy of analog simulation at the defect site with the speed and efficiency of logic-based simulation for the rest of the circuit.
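The biased voting and mixed description ideas both come down to finding the analog voltage at the bridged node and then letting each downstream gate interpret it against its own threshold. The sketch below is a simplified stand-in, not the published biased voting procedure: it assumes a zero-ohm bridge between one conducting pMOS and one conducting nMOS, uses a textbook square-law transistor model, and finds the node voltage by bisection. The parameters `k_n`, `k_p`, `v_tn`, `v_tp` and the gate thresholds are illustrative.

```python
def nmos_current(v_node, vdd, k_n, v_tn):
    """Square-law current of an nMOS (gate at vdd) pulling v_node toward ground."""
    v_ov = vdd - v_tn
    if v_ov <= 0:
        return 0.0
    if v_node < v_ov:                                   # triode region
        return k_n * (v_ov * v_node - 0.5 * v_node ** 2)
    return 0.5 * k_n * v_ov ** 2                        # saturation


def pmos_current(v_node, vdd, k_p, v_tp):
    """Square-law current of a pMOS (gate at 0 V) pulling v_node toward vdd; v_tp is a magnitude."""
    v_ov = vdd - v_tp
    v_sd = vdd - v_node
    if v_ov <= 0:
        return 0.0
    if v_sd < v_ov:                                     # triode region
        return k_p * (v_ov * v_sd - 0.5 * v_sd ** 2)
    return 0.5 * k_p * v_ov ** 2                        # saturation


def bridged_node_voltage(vdd, k_n, k_p, v_tn, v_tp, iterations=60):
    """Bisection for the voltage where pull-up and pull-down currents balance."""
    lo, hi = 0.0, vdd
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if pmos_current(mid, vdd, k_p, v_tp) > nmos_current(mid, vdd, k_n, v_tn):
            lo = mid        # pull-up still wins, so the node must sit higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interpreted_levels(v_node, gate_thresholds):
    """Logic value each receiving gate reads, given its own threshold voltage."""
    return {name: int(v_node > v_th) for name, v_th in gate_thresholds.items()}
```

With matched devices the node settles near VDD/2, and two receiving gates with assumed thresholds of 0.4 V and 0.6 V (at VDD = 1 V) read it as 1 and 0 respectively, which is exactly the disagreement a single fixed-threshold vote cannot express.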

Feedback and Non Feedback Bridging Faults:

At the circuit level, we talk about two types of bridging faults (BFs): intragate and intergate.

Intragate BF : Unwanted connection/short between parts inside a single logic gate.

Intergate BF : Unwanted connection/short between two or more different logic gates.

This difference is important for ATPG tools, because the way the circuit is described (either using gates or transistors) affects which faults the tools look for. In combinational circuits, there are two types of bridging faults:

1. Non-feedback bridging faults,

2. Feedback bridging faults.


Non-feedback bridging fault :

A non-feedback bridging fault is a short or unwanted connection that doesn’t create a loop or path from a gate’s output back to any of its inputs. This type of fault is the most basic kind of bridging fault. When this kind of defect is triggered, the shorted nodes may settle at a voltage level between 0 and 1 — not fully high or low — depending on the resistance of the short and the threshold voltage of the gates that receive the signal.
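As a rough sketch of how the bridge resistance and the receiving gate's threshold decide the outcome, assume one driver pulls its node high through a resistance `r_up`, the other pulls its node low through `r_down`, and the bridge `r_bridge` connects the two nodes. Treating the drivers as linear resistors is a simplification, and all names and values are illustrative.

```python
def bridged_pair_voltages(vdd, r_up, r_down, r_bridge):
    """Voltages on the high-driven and low-driven nodes of a resistive bridge."""
    current = vdd / (r_up + r_bridge + r_down)   # single series path from VDD to ground
    v_high_side = vdd - current * r_up
    v_low_side = current * r_down
    return v_high_side, v_low_side


def logic_seen(v_node, v_threshold):
    """Logic value a receiving gate reads, given its threshold voltage."""
    return 1 if v_node > v_threshold else 0
```

With `vdd = 1.0`, `r_up = 2000.0`, `r_down = 1000.0`, and `r_bridge = 1000.0`, the two nodes sit at about 0.5 V and 0.25 V; a receiving gate with an assumed 0.4 V threshold reads them as 1 and 0. With `r_bridge = 0` both nodes collapse to the same intermediate voltage of about 0.33 V.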




Feedback bridging fault : 

A feedback bridging fault happens when a short/unwanted connection creates a situation where the output of one gate can affect its own input through a logic path. In other words, it forms a loop. For a bridge between two nodes i and j, and depending on how the shorted gates are connected, three situations can occur when an input is applied to the circuit:

1. Case A: The value at node j does not depend on node i → this behaves like a non-feedback bridging fault.

2. Case B: The value at node j is the same as at node i → this is called a non-inverted feedback bridging fault.

3. Case C: The value at node j is the opposite of node i → this is called an inverted feedback bridging fault.
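The three cases can be told apart by evaluating the fault-free logic path from node i to node j for both values of i, with the rest of the input vector held fixed. The sketch below assumes that path is available as a Python callable `j_of_i`, which is a hypothetical interface chosen for illustration.

```python
def classify_feedback(j_of_i):
    """Classify a bridge between nodes i and j for one applied input vector.

    j_of_i maps the logic value at node i (0 or 1) to the fault-free value at node j.
    """
    j0, j1 = j_of_i(0), j_of_i(1)
    if j0 == j1:
        return "A"      # j does not depend on i: behaves like a non-feedback bridge
    if (j0, j1) == (0, 1):
        return "B"      # j follows i: non-inverted feedback
    return "C"          # j is the complement of i: inverted feedback
```

For example, `classify_feedback(lambda i: 1 - i)` returns `"C"`, the inverted-feedback case.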


Bridging Faults in Sequential Circuit : 

Control Loops in Sequential Circuits : 

- Floating state: Memory state not controlled by logic

- Forced state: Memory state driven by preceding logic

- Detection depends on circuit design and location of defect

BFs in Flip-Flops :

Behavior depends on flip-flop design (CMOS vs. NAND-based). Some BFs do not elevate current but affect timing (setup/hold time). Design modifications can improve current-based detectability.

BFs in Semiconductor Memories:

BFs in SRAMs may not always elevate quiescent current. Floating control loops can prevent detection. BFs in DRAMs cause logic errors but do not elevate current.

Bridging Faults & Technology Scaling : 

Physical defect mechanisms remain unchanged. Impact on IC behavior changes due to scaling effects.

Key scaling effects:

Critical resistance decreases → Makes logic testing less effective.

Higher leakage currents → Reduces effectiveness of traditional quiescent current testing.

Advanced detection needed → Delay-based or leakage variation monitoring.


Gate Oxide Short :

1. Introduction & Model : 


Gate oxide shorts (GOS) have been a persistent issue in MOS technology since the 1960s. They occur due to a rupture in the SiO₂ layer between the polysilicon gate and the silicon. The thin oxide layer is critical for controlling charge flow in the channel. Undamaged oxide regions may still function, allowing charge inversion, so in some cases transistors with a GOS remain functional while exhibiting degraded drain current. The electrical behavior differs based on the type of gate oxide short. There are two types of GOS: (1) Thermal filament growth at the gate edge due to high overvoltage and a strong electric field, causing breakdown and filament formation between gate and source. (2) Particle-induced short between gate and p-well, caused by contamination.

2. Gate Oxide Short Models :

The figure illustrates an inverter cross-section in n-well CMOS technology. A GOS connects the gate polysilicon to the drain, source, or bulk of the transistor, so six distinct parasitic connections can occur where the gate material merges with the underlying silicon. Gate-drain and gate-source oxide shorts are modeled based on the doping types of the connected terminals.

The electrical model depends on doping polarity:

(1) Same doping type → modeled as a resistor.

(2) Opposite doping type → modeled as a pn junction diode.

Electrical behavior of each shorted path varies with the type of contact formed. Detailed modeling helps in accurate fault analysis and reliable simulation.
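The doping rule above is simple enough to tabulate. The sketch below applies it to the six gate-to-silicon connections of an inverter, assuming n-well CMOS (nMOS in the p-type substrate, pMOS in an n-well) and an n-doped polysilicon gate by default. The table and function names are illustrative.

```python
def gos_model(gate_doping, terminal_doping):
    """First-order electrical model of a gate oxide short, chosen by doping polarity."""
    for doping in (gate_doping, terminal_doping):
        if doping not in ("n", "p"):
            raise ValueError(f"doping must be 'n' or 'p', got {doping!r}")
    return "resistor" if gate_doping == terminal_doping else "pn junction diode"


# Assumed terminal doping for an inverter in n-well CMOS.
TERMINAL_DOPING = {
    ("nMOS", "source"): "n",
    ("nMOS", "drain"): "n",
    ("nMOS", "bulk"): "p",    # p-type substrate
    ("pMOS", "source"): "p",
    ("pMOS", "drain"): "p",
    ("pMOS", "bulk"): "n",    # n-well
}


def inverter_gos_models(gate_doping="n"):
    """Model for each of the six possible gate-to-silicon shorts of the inverter."""
    return {site: gos_model(gate_doping, doping)
            for site, doping in TERMINAL_DOPING.items()}
```

With the default n-doped gate, the nMOS gate-drain short comes out as a resistor and the pMOS gate-source short as a pn junction diode.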


nMOS Transistor Gate Oxide Short :

1.  nMOS Transistor Gate–Drain/Source Oxide Shorts :

Sometimes, a direct (Ohmic) connection forms between the gate and the drain or source in an nMOS transistor. This happens when the thin oxide layer breaks between the n-doped polysilicon gate and the n-doped source or drain. It's similar to placing a resistor between the gate and the source or drain.

2. nMOS Transistor Gate–Substrate Oxide Shorts :

Here an oxide rupture links the n-doped polysilicon gate to the p-type substrate (bulk) of the nMOS transistor. Because the two regions have opposite doping, the short is modeled as a pn junction diode rather than an Ohmic resistor, following the doping rule given earlier. Oxide shorts can result from weak oxide layers or from high electric fields at the gate edges (more prone to breakdown than the gate center).






pMOS Transistor Gate Oxide Short:

1. pMOS Transistor Gate–Drain/Source Oxide Shorts:

When a pMOS transistor gate gets shorted, it can form a diode if the short connects an n-doped polysilicon gate to a p-type source or drain. If this diode-like short happens at the source, it limits the gate voltage. If the short is to the drain, things are more complex—the diode acts like a nonlinear feedback loop from the output/drain back to the input/gate. But if the gate is made from p-doped polysilicon and shorted to the source or drain, it creates a direct conductive/Ohmic short, not a diode.

2. pMOS Transistor Gate–Substrate Oxide Shorts :

If a defect occurs between the n-doped polysilicon gate and the substrate of a pMOS transistor, it creates a low-resistance (Ohmic) connection to the substrate, since both have the same doping type. When power is applied, this defect combines with the transistor to create a parasitic pnp bipolar transistor. This effect becomes more serious in single-well CMOS structures, where the parasitic bipolar transistor (caused by the defect) can interact with other parasitic devices in the circuit. This can trigger latchup, a condition where the circuit enters a high-current, unstable state (with a negative slope in the I-V curve), potentially damaging the device.






Open Circuit Defects :

Open circuit defects in ICs are unintended breaks in metal/polysilicon/diffusion interconnects, leading to diverse failures. Unlike bridge defects, they are more complex and harder to detect, often causing partial or intermittent issues. As CMOS scaling pushes metal lines below 130 nm and increases via height-to-width ratios, the risk of open defects rises due to billions of vias and extensive interconnects in modern ICs. 

Challenges in Detecting & Testing Open Defects :

Open defects are breaks or partial discontinuities in conductive paths (like wires or vias) that interrupt current flow. Partial opens can cause intermittent faults, making detection and localization difficult. Standard models like stuck-at or bridging faults often miss open defects, especially partial or resistive ones. Advanced approaches (IDDQ, transistor-level, resistive-open) are more effective but require more resources and time. Open defects may not be active under normal conditions; they often manifest only under specific voltage, temperature, or timing stress, leading to unpredictable behavior. As feature sizes shrink (e.g., 5 nm, 3 nm), narrower wires and vias increase the likelihood of open defects. Process variations can either hide or worsen these defects depending on design tolerances. Opens may escape detection during factory testing but fail during actual use (latent defects). This poses serious risks for high-reliability sectors like automotive, medical, and aerospace.


Modeling Floating Nodes :

1. Capacitor Coupling in Open Circuits :

When an open defect occurs in a metal line, the floating node behaves like a capacitor. Depending on how the broken metal line is positioned over the IC substrate (GND) and the well area (VDD), the node experiences capacitive coupling. This coupling can create a capacitor voltage divider, affecting the  floating node’s voltage.


2. Influence of Surrounding Lines :

In modern ICs, metal lines are closely packed, and adjacent lines create parasitic capacitance. If an open defect occurs in one line, the floating node’s voltage is significantly influenced by adjacent active lines.
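The capacitive divider, including coupling to neighboring lines, can be sketched as a capacitance-weighted average of the surrounding voltages. The sketch assumes the floating node carries no trapped charge and ignores any transistor connected to it; the capacitance values used in the example are made up.

```python
def floating_node_voltage(couplings):
    """Voltage of a floating node from (capacitance, neighbor_voltage) pairs.

    Assumes zero initial charge on the node, so charge conservation gives
    V = sum(C_k * V_k) / sum(C_k).
    """
    couplings = list(couplings)
    total_c = sum(c for c, _ in couplings)
    if total_c <= 0:
        raise ValueError("total coupling capacitance must be positive")
    return sum(c * v for c, v in couplings) / total_c
```

For a node coupled with 1 fF to a 1.2 V well and 3 fF to the grounded substrate, the result is about 0.3 V. Adding a 4 fF coupling to an adjacent line sitting at 1.2 V pulls the floating node up to about 0.75 V, which shows how strongly an active neighbor can shift it.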

3. MOSFET Charge Influence :

If the floating node is connected to a MOSFET gate, the charge stored at the transistor gate and the voltage at the transistor drain play a crucial role in determining the final gate voltage. This could lead to unpredictable logic states. 

4. Tunneling Effects :

For very small open defects (narrow cracks in metal), quantum mechanical electron tunneling can occur, allowing some current to pass through the defect. This can create an unusual failure mechanism where a gate functions at low frequencies but fails at higher frequencies.
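One way to picture the frequency dependence is to treat the tunneling path as a very large series resistance charging the downstream node capacitance. That RC view is an assumption made here for illustration (tunneling current is not actually linear in voltage), and the pass criterion of reaching half the swing within one clock period is likewise a simplification.

```python
import math


def time_to_threshold(r_open, c_load, fraction=0.5):
    """Time for an RC-charged node to reach `fraction` of its final voltage."""
    return -r_open * c_load * math.log(1.0 - fraction)


def passes_at(frequency_hz, r_open, c_load):
    """True if the node crosses half swing within one clock period (simplified criterion)."""
    return time_to_threshold(r_open, c_load) < 1.0 / frequency_hz
```

With an assumed 1 GΩ effective resistance and 1 fF load, the node needs roughly 0.7 µs to cross half swing, so the gate passes at 1 kHz but fails at 100 MHz.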

Classification of Open Defects :

Open defects can manifest in different ways, impacting circuit functionality in various degrees. Based on failure analysis, these defects are categorized into six broad classes:

1. Transistor-On Open Defect : Affects a single transistor gate. The floating gate voltage is influenced by parasitic capacitive coupling. The transistor may still function, but with a shifted threshold and increased power consumption.

2. Transistor Pair-On Defect : Occurs when an open defect affects both the PMOS and NMOS transistors of a logic gate. The floating node may settle to an intermediate voltage, causing both transistors to conduct simultaneously, leading to increased static power dissipation.

3. Transistor Pair On/Off Defect : If the floating node voltage is near VDD or GND, one transistor remains permanently ON while the other is OFF. This can cause a stuck-at logic error, where the output is permanently HIGH or LOW.

4. The Open Delay Defect : In cases where electron tunneling is the only conduction mechanism across an open, the circuit may function at low frequencies but fail at high frequencies. These defects become particularly noticeable in high-speed digital circuits.

5. Memory (Stuck-Open) Defect : A transistor remains OFF due to an open, but previous charge stored in the circuit keeps it functioning correctly. The defect manifests only when certain input sequences cause the stored charge to be lost. This is one of the most difficult defects to detect using standard functional testing.

6. Sequential Open Defects : Occur in sequential logic circuits, such as flip-flops and latches. Can lead to logic race conditions, erroneous clocked behavior, or timing violations. These defects may or may not elevate static power consumption.
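The pair-on and pair on/off classes can be distinguished by where the floating voltage sits relative to the transistor thresholds. The sketch below assumes an inverter whose nMOS and pMOS gates both hang on the floating node; `v_tn` and `v_tp` are threshold magnitudes, and the values in the example are illustrative.

```python
def classify_floating_gate(v_float, vdd, v_tn, v_tp):
    """Classify an inverter whose shared gate node floats at v_float."""
    nmos_on = v_float > v_tn            # nMOS conducts when its gate is above V_tn
    pmos_on = v_float < vdd - v_tp      # pMOS conducts when its gate is below VDD - |V_tp|
    if nmos_on and pmos_on:
        return "pair-on: both conduct, elevated static current"
    if nmos_on:
        return "pair on/off: output stuck LOW"
    if pmos_on:
        return "pair on/off: output stuck HIGH"
    return "both off: output floats"
```

With `vdd = 1.0` and both thresholds at 0.3 V, a floating voltage of 0.5 V gives the pair-on case, 0.95 V leaves only the nMOS on (output stuck LOW), and 0.05 V leaves only the pMOS on (output stuck HIGH).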

