Denser, scalable IC packages require jumps in technology and reliability

IBM Research – Zurich
Dr. Gerd Schlottig
24 June 2013
Overview
Thermal Interfaces: Structures, Materials, Processes
Heat Removal: Chip attached, Cap attached, Embedded
Power Supply: Electrochemical, Architecture
Energy Re-Use: Hot Water Cooling, Space Heating, Adsorption Cooling, Desalination
System Scale Technology Applications: High Concentration Photovoltaic Thermal Systems, Liquid Cooled Supercomputers
© 2013 IBM Corporation
SuperMUC
• Hot-water-cooled iDataPlex cluster with 3.2 / 2.9 PFlop/s peak / Rmax performance
- ca. 20,000 CPUs / 160,000 cores
- ca. 44 million components
- with ca. 263 million pins and > 200 million CPU C4s
• Energy efficient AND direct heat reuse
- 4 MW power, PUE 1.1
- 40% less energy consumption than air-cooled
- 90% of waste heat for reuse
- #6 in Top500 list, #82 in Green500 list
- #1 in reuse list (ERE pending)
• SuperMUC phase II announced
- 3 PFlop/s of new, more efficient compute hardware
- ~3 MW power budget
- Total machine power 7 MW (Phase I + II)
• SuperMUC is based on Aquasar hot water cooling technology
• Largest universal CPU system
• System is part of the Partnership for Advanced Computing in Europe (PRACE) HPC infrastructure for researchers and industrial institutions throughout Europe
[Figure: iDataPlex dx360 M4 board and rack]
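The energy figures on this slide can be related through the usual data-centre metrics. The following is a minimal sketch, not a reported result: it assumes the quoted 4 MW is IT power (the slide does not say), that all IT power ends up as waste heat, and it uses the common definition ERE = (total facility power - reused power) / IT power. The function names are hypothetical. Because the slide lists ERE as pending, the number printed here is illustrative only.

```python
def facility_power(it_power_mw, pue):
    """Total facility power in MW, from PUE = total power / IT power."""
    return it_power_mw * pue


def energy_reuse_effectiveness(it_power_mw, pue, reuse_fraction):
    """ERE = (total facility power - reused power) / IT power.

    ASSUMPTION: all IT power becomes waste heat, and `reuse_fraction`
    of that waste heat is reused.
    """
    total = facility_power(it_power_mw, pue)
    reused = reuse_fraction * it_power_mw
    return (total - reused) / it_power_mw


IT_POWER_MW = 4.0      # ASSUMPTION: the slide's "4 MW" read as IT power
PUE = 1.1              # from the slide
REUSE_FRACTION = 0.9   # from the slide: 90% of waste heat for reuse

print(f"facility power: {facility_power(IT_POWER_MW, PUE):.2f} MW")
print(f"illustrative ERE: "
      f"{energy_reuse_effectiveness(IT_POWER_MW, PUE, REUSE_FRACTION):.2f}")

# Cross-check of the component counts: more than 200 million C4s spread
# over about 20,000 CPUs gives the order of 10,000 C4s per CPU.
print(f"C4s per CPU (lower bound): {200e6 / 20_000:.0f}")
```

With no heat reuse the ERE reduces to the PUE, which is a quick sanity check on the formula.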
Failure Drivers: CTE mismatch
Typical die footprint with >10000 C4s
G. Schlottig et al., EuroSimE, 2012.
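Why CTE mismatch drives failures at large die footprints can be seen from a textbook first-order estimate: the shear strain in a solder joint grows with its distance from the neutral point (DNP), gamma ≈ Δα · ΔT · DNP / h. The sketch below is not the method of the cited paper; it ignores underfill, warpage and plasticity, and every numeric value is an assumed, illustrative input. The function name is hypothetical.

```python
def c4_shear_strain(cte_die, cte_substrate, delta_t, dnp, joint_height):
    """First-order shear strain in a solder joint caused by CTE mismatch.

    gamma = |cte_substrate - cte_die| * delta_t * dnp / joint_height
    CTEs in 1/K, delta_t in K, dnp and joint_height in metres.
    """
    return abs(cte_substrate - cte_die) * delta_t * dnp / joint_height


# ASSUMED illustrative inputs, not taken from the slides
CTE_SI = 2.6e-6         # silicon, 1/K
CTE_LAMINATE = 17e-6    # organic substrate, 1/K
DELTA_T = 100.0         # temperature swing, K
JOINT_HEIGHT = 70e-6    # C4 stand-off height, m

for dnp_mm in (1, 5, 10):
    gamma = c4_shear_strain(CTE_SI, CTE_LAMINATE, DELTA_T,
                            dnp_mm * 1e-3, JOINT_HEIGHT)
    print(f"DNP {dnp_mm:2d} mm: shear strain {gamma:.3f}")
```

The strain scales linearly with DNP, so the corner joints of a large die see the highest load in this simple picture.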
Failure Drivers: temperature gradients
[Figure: Temperature changes and peaks during package processing: exemplary temperature budget of packaging processes]
[Figure: Temperature changes and peaks during lifetime operation: exemplary power map over the footprint of the heat source die (colored pixel squares represent CP core activity) and the resulting temperature distribution]
Cooling Efficiency … air vs. water cooling using microchannels
Liquid cooling results in > 40% reduction in cooling power
→ Free cooling (no chiller required)
Efficient Cooling: Transition Towards Lid-Integral Cold Plates
Today’s cold plate solution: separable cold plate
+ ease of test and assembly
- CTE mismatched copper cap → reliability concern
- need for TIM 1&2 → limits thermal performance
Future demand: minimal Rth solution
→ Heat recovery (“hot-water” cooling applications: Aquasar)
→ Performance scaling (increased hot-spot power density)
→ 3D stacking (potential CP-CP stack)
→ Si-photonics (temperature stability of photonic components)
Thermal performance evolution
Separable cold plate: TIM 1&2
Lid-integral copper cold plate (Cu-cold plate): compliant TIM1: gel
Lid-integral silicon cold plate (Si-cold plate): rigid TIM1: adhesive, solder
Embedded liquid cooling: chip embedded microchannels, elimination of TIM1
Double-side cooling: Si-interposer cavity, double-side heat removal
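The argument that each removed thermal interface lowers Rth can be illustrated with a one-dimensional series model, R = t / (k · A) per layer. This is a rough sketch under stated assumptions: it covers conduction through TIMs and lid only (no heat spreading, no convective resistance of the cold plate itself), and all thicknesses, conductivities and the die area are assumed values, not data from the slides. Names are hypothetical.

```python
def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """1D conduction resistance of one layer in K/W: R = t / (k * A)."""
    return thickness_m / (conductivity_w_mk * area_m2)


def stack_resistance(layers, area_m2):
    """Series sum of layer resistances; `layers` is a list of (t, k) pairs."""
    return sum(layer_resistance(t, k, area_m2) for t, k in layers)


# ASSUMED illustrative values, not taken from the slides
AREA = 4e-4              # 2 cm x 2 cm die, m^2
TIM1 = (30e-6, 3.0)      # (thickness in m, conductivity in W/(m K))
CU_LID = (2e-3, 400.0)
TIM2 = (50e-6, 3.0)

variants = {
    "separable cold plate (TIM1 + Cu lid + TIM2)": [TIM1, CU_LID, TIM2],
    "lid-integral cold plate (TIM1 only)": [TIM1],
    "embedded liquid cooling (no TIM)": [],
}
for name, layers in variants.items():
    print(f"{name}: {stack_resistance(layers, AREA) * 1e3:.1f} mK/W")
```

In this simplified picture the interface contribution shrinks with every step of the evolution and vanishes once the channels sit in the chip itself.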
LiLC and ELC Reliability Aspects
Chip Stack Heat Dissipation … thermal interfaces
[Figure: Heat conduction path in 3D chip stacks]
• Chip stacks constrain heat dissipation
→ accumulation of heat flux and interfaces
• Bottleneck: underfill material. Initially designed to transfer mechanical stress from solder balls
→ now also needs to enhance heat dissipation
Enhanced thermal conductivity using percolating thermal underfill (th.UF)
5x improvement in thermal conductivity
T. Brunschwiler et al., JMEP (2012).
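What a 5x higher underfill conductivity means for a chip stack can be sketched with a simple accumulation model: with the heat sink on top, the underfill layer above tier i carries the heat of all i tiers below it, so the temperature rise across the stack is the sum of i · q'' · t / k over the interfaces. This is an illustrative model only. It neglects the parallel heat path through the solder balls and the silicon itself, and the tier count, heat flux, layer thickness and baseline conductivity are assumed values. The function name is hypothetical.

```python
def stack_temperature_rise(n_tiers, flux_per_tier, underfill_thickness,
                           underfill_k):
    """Temperature rise (K) across the underfill layers of a chip stack.

    Tiers are numbered from the one farthest from the heat sink. The
    interface above tier i carries the accumulated flux of tiers 1..i.
    flux_per_tier in W/m^2, thickness in m, conductivity in W/(m K).
    """
    r_layer = underfill_thickness / underfill_k   # m^2 K / W
    return sum(i * flux_per_tier * r_layer for i in range(1, n_tiers))


# ASSUMED illustrative inputs, not taken from the slides
N_TIERS = 4
FLUX = 1e5          # 10 W/cm^2 per tier, in W/m^2
THICKNESS = 50e-6   # underfill layer thickness, m
K_BASE = 0.5        # baseline underfill conductivity, W/(m K)

baseline = stack_temperature_rise(N_TIERS, FLUX, THICKNESS, K_BASE)
improved = stack_temperature_rise(N_TIERS, FLUX, THICKNESS, 5 * K_BASE)
print(f"baseline underfill: {baseline:.1f} K")
print(f"5x conductivity:    {improved:.1f} K")
```

Because the model is linear in 1/k, the 5x conductivity gain translates directly into a 5x lower underfill temperature rise; in a real stack the benefit is smaller once the parallel solder path is included.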
Neck based electrical interconnect (NEI)
Low-temperature electrical interconnects
Ag, 2 vol%, Ø 20 nm, in TGME
T_sinter = 150 °C
Yu et al., ESTC 2012
NEI performance
Yu et al., ESTC 2012