IBM Research – Zurich
Dr. Gerd Schlottig
24 June 2013
Denser, scalable IC packages require jumps in technology and reliability
© 2013 IBM Corporation

Overview
- Thermal Interfaces: Structures, Materials, Processes
- Heat Removal: Chip attached, Cap attached, Embedded
- Power Supply: Electrochemical, Architecture
- Energy Re-Use: Hot Water Cooling, Space Heating, Adsorption Cooling, Desalination
- System Scale Technology Applications: High Concentration Photovoltaic Thermal Systems, Liquid Cooled Supercomputers

SuperMUC
Hot Water Cooled iDataPlex cluster with 3.2 / 2.9 PFlop/s peak / Rmax performance
- approx. 20,000 CPUs / 160,000 cores
- approx. 44 million components
- with approx. 263 million pins and > 200 million CPU C4s
Energy Efficient AND Direct Heat Reuse
- 4 MW power, PUE 1.1
- 40% less energy consumption than air-cooled
- 90% of waste heat for reuse
- #6 in Top500 list, #82 in Green500 list
- #1 in reuse list (ERE pending)
SuperMUC phase II announced
- 3 PF, new more efficient compute hardware
- ~3 MW power budget
- Total machine power 7 MW (Phase I + II)
SuperMUC is based on Aquasar Hot Water Cooling technology
Largest universal CPU system
System is part of the Partnership for Advanced Computing in Europe (PRACE) HPC infrastructure for researchers and industrial institutions throughout Europe
[Figure: iDataPlex dx360 M4 board and rack]

Failure Drivers: CTE mismatch
[Figure: Typical die footprint with >10000 C4s]
G. Schlottig et al., EuroSimE, 2012.

Failure Drivers: temperature gradients
Temperature Changes and Peaks during Package Processing
- [Figure: Exemplary Temperature Budget of Packaging Processes]
Temperature Changes and Peaks during Lifetime Operation
- [Figure: Exemplary Powermap over footprint of heat source die; color pixeled squares represent CP core activity]
- [Figure: Resulting Temperature distribution]

Cooling Efficiency … air vs. water cooling using microchannels
- Liquid cooling results in > 40% reduction in cooling power
- Free-cooling (no chiller required)

Efficient Cooling: Transition Towards Lid-Integral Cold Plates
Today's cold plate solution: separable cold plate
+ ease of test and assembly
- CTE mismatched copper cap reliability
- need for TIM 1&2 limits thermal performance
Future demand: minimal Rth solution, lid-integral cold plate
- Heat recovery ("hot-water" cooling applications: Aquasar)
- Performance scaling (increased hot-spot power density)
- 3D stacking (potential CP-CP stack)
- Si-photonics (temperature stability of photonic components)
Thermal performance evolution
- Lid-integral copper cold plate: Cu-cold plate, compliant TIM1 (gel)
- Lid-integral silicon cold plate: Si-cold plate, rigid TIM1 (adhesive, solder)
- Embedded liquid cooling: chip embedded microchannels, elimination of TIM1
- Double-side cooling: Si-interposer cavity, double-side heat removal

LiLC and ELC Reliability Aspects

Chip Stack Heat Dissipation … thermal interfaces
[Figure: Heat conduction path, 3D chip stacks]
- Chip stacks constrain heat dissipation: accumulation of heat flux and interfaces
- Bottleneck: underfill material
- Initially designed to transfer mechanical stress from solder balls
- Enhance heat dissipation

Enhanced thermal conductivity using percolating thermal underfill (th. UF)
- 5x improvement in thermal conductivity
T. Brunschwiler et al., JMEP (2012).

Neck based electrical interconnect (NEI)
Low-temperature electrical interconnects
- Ag 2 vol%, Ø 20 nm in TGME
- T_sinter 150 °C
Yu et al., ESTC 2012

NEI performance
Yu et al., ESTC 2012
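
The chip-stack slides name the underfill as the thermal bottleneck and report a 5x improvement in its thermal conductivity. As a rough illustration of why that layer dominates, the sketch below treats a two-die stack as a one-dimensional series of conduction resistances, R = t / (k·A). Everything numeric here is an assumption for illustration only and is not taken from the slides: the layer thicknesses, the 1 cm² footprint, the baseline underfill conductivity of 0.5 W/(m·K), and the silicon value of 130 W/(m·K). The model also ignores the parallel heat path through the solder joints and any lateral spreading, so it overstates the underfill's share.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One planar layer in a 1D conduction model (all values hypothetical)."""
    name: str
    thickness_m: float        # layer thickness in metres
    conductivity_w_mk: float  # thermal conductivity in W/(m*K)


def conduction_resistance(layer: Layer, area_m2: float) -> float:
    """Return the 1D conduction resistance R = t / (k * A) in K/W."""
    return layer.thickness_m / (layer.conductivity_w_mk * area_m2)


def stack_resistance(layers: list[Layer], area_m2: float) -> float:
    """Sum the layer resistances, since the layers are in series."""
    return sum(conduction_resistance(layer, area_m2) for layer in layers)


AREA_M2 = 1e-4      # assumed 1 cm^2 die footprint
K_UNDERFILL = 0.5   # assumed baseline underfill conductivity, W/(m*K)
K_SILICON = 130.0   # approximate bulk silicon conductivity, W/(m*K)


def build_stack(k_underfill: float) -> list[Layer]:
    """Build an assumed two-die stack with an underfill layer below each die."""
    return [
        Layer("die 1 (Si)", 100e-6, K_SILICON),
        Layer("underfill 1", 50e-6, k_underfill),
        Layer("die 2 (Si)", 100e-6, K_SILICON),
        Layer("underfill 2", 50e-6, k_underfill),
    ]


baseline = stack_resistance(build_stack(K_UNDERFILL), AREA_M2)
improved = stack_resistance(build_stack(5 * K_UNDERFILL), AREA_M2)

if __name__ == "__main__":
    print(f"baseline stack resistance: {baseline:.3f} K/W")
    print(f"with 5x underfill conductivity: {improved:.3f} K/W")
    print(f"reduction factor: {baseline / improved:.2f}")
```

Under these assumptions the stack resistance falls by somewhat less than 5x, because the silicon layers contribute a small resistance that the underfill change does not affect.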