The ASTRON / IBM 64 bit µServer for SKA
Compute is free – data is not

Ronald P. Luijten – Data Motion Architect
Andreas Doering – lead designer
[email protected] - IBM Research - Zurich
16 Nov 2014
DISCLAIMER: This presentation is entirely Ronald’s view and not necessarily that of IBM.

Definition µServer:
The integration of an entire server node motherboard* into a single microchip except DRAM, NOR boot flash and power conversion logic.
This does NOT imply low performance!
* no graphics
[Figure: board size comparison. Labelled dimensions: 139 mm x 55 mm, 245 mm, 305 mm]

Ronald P. Luijten – SC14 16-20 Nov 2014

I.D. http://youtu.be/imweQe8NgnI

IBM DOME µServer Motivation & Objectives
• Create the world's highest density 64 bit µServer drawer
• Useful for both SKA radio-astronomy and IBM future business
  – Platform for Business Analytics appliance pre-product research
  – “Datacenter in-a-box”
• Very high energy efficiency / very low cost (radioastronomers…)
• Use commodity components only, HW + SW standards
• Leverage ‘free computing’ paradigm
• Enhance with ‘Value Add’: packaging, system integration, …
• Density and speed of light
• Most efficient cooling using IBM technology (ref: SuperMUC TOP500 machine)
• Must be true 64 bit to enable business applications
• Must run server class OS (SLES11 or RHEL6, or equivalent)
  – Precluded ARM (64-bit silicon was not available)
  – PPC64 is available in SoC from FSL since 2011
  – (I am poor – no $$$ for my own SoC…)
• This is a research project – capability demonstrator only

Compute node board form factor
[Figure: front and back views of the T4240 and P5020/P5040 compute node boards (SoC lid removed; decoupling capacitors area on the back), shown with a standard 240 pin DDR3 memory DIMM board. Labelled dimensions: 139 mm, 133 mm, 55 mm, 30 mm]

IBM / ASTRON compute node board diagram
[Diagram blocks: T4240, DRAM (x3), SPI flash, PSoC, Power converter. Interfaces: I2C, Serial, JTAG, USB, 4x 10 GbE, SDcard, 2 x SATA]

PSoC collapses 6 functions into a small chip to save Area, Power and Cost
1. On/Off and Power up sequencing
2. Provide µServer boot configuration
3. JTAG debug access
4. Serial port access (Linux)
5. Temperature monitoring and protection
6. Management interface and control

Hot Water Cooling
Most Energy Efficient solution:
– Low PUE possible (<=1.1)
– Green IT
– 40% less energy consumption compared to air-cooled systems
– 90% of waste heat can be reused (CO2 neutral according to the Kyoto protocol)
– Allows very high density
– Less thermal cycling - improved reliability
– Lower Tj reduces leakage current – further saving energy
SuperMUC HPC machine at LRZ in Germany demonstrates ZRL hot water cooling
– No 4 on June 2012 TOP500 HPC list
[Figure: SuperMUC node board]

Compute node heat spreader
Functions:
• Electrically and thermally connects the compute node to cooling-power delivery infrastructure
• Allows heat removal laterally
• Allows main power delivery to the board
Heat spreader SHOWN AT OUR BOOTH
[Figure: schematics of board assembly. Labels: heat spreader, processor chip, power inductors, populated processor board, memory chips, processor PCB, capacitors, Gnd, Power, power delivery contacts (rivets), shield]

19” 2U Chassis with Combined Cooling and Power
• 128 compute node boards
• 1536 cores / 3072 Threads
• 6 TB DRAM
• Datacenter-in-a-box
[Figure labels: Compute Nodes; Electrical + Thermal Interface; Water In; Water Out; 3 layer Laminated Copper Plate Transporting Supply Current and Heat; SoC Carrier FR4]

Strawman network for 128 compute nodes with 40G external links
[Diagram: compute nodes (N) attached to Ethernet switches. External uplink labels: 6 x 40G (four times) and 4 x 40G (twice)]
• 32 external 40G ports using Ethernet switches
• 1280 Gbps external BW

SOFTWARE

DEMO at Booth#233
• P5040 running Linux and CPMD: 16GB DRAM, 2x 10 GbE, 2x SATA
• T4240 running U-Boot and memory test: 48GB DRAM, 4x 10 GbE, 2x SATA
A BONUS video: http://t.co/4vEkEVEazO

Published Conference Papers
• “Parallelism and Data Movement Characterization of Contemporary Application Classes”, Victoria Caparros Cabezas, Phillip Stanley-Marbell, ACM SPAA 2011, June 2011
• “Quantitative Analysis of the Berkeley Dwarfs' Parallelism and Data Movement Properties”, Victoria Caparros Cabezas, Phillip Stanley-Marbell, ACM CF 2011, May 2011
• “Performance, Power, and Thermal Analysis of Low-Power Processors for Scale-Out Systems”, Phillip Stanley-Marbell, Victoria Caparros Cabezas, IEEE HPPAC 2011, May 2011
• “Pinned to the Walls – Impact of Packaging and Application Properties on the Memory and Power Walls”, Phillip Stanley-Marbell, Victoria Caparros Cabezas, Ronald P. Luijten, IEEE ISLPED 2011, Aug 2011
• “The DOME embedded 64 bit microserver demonstrator”, R. Luijten and A. Doering, ICICDT 2013, Pavia, Italy, May 2013
• “Dual function heat-spreading and performance of the IBM / Astron DOME 64-bit µServer demonstrator”, R. Luijten, A. Doering and S. Paredes, ICICDT 2014, Austin, TX, May 2014

Questions???
µServer website: www.swissdutch.ch
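
Backup: cross-check of the chassis and network numbers
The totals quoted for the 2U chassis (1536 cores / 3072 threads, 6 TB DRAM) and for the strawman network (32 external 40G ports, 1280 Gbps) can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the per-node values of 12 cores and 2 threads per core are inferred by dividing the chassis totals by 128 nodes, the 48 GB per node is taken from the T4240 demo configuration, and all constant and function names are made up for illustration.

```python
# Sanity check of the chassis-level figures quoted in the slides.
# All names below are illustrative; per-node values are assumptions
# inferred from the slide totals, not an official specification.

NODES_PER_CHASSIS = 128             # "128 compute node boards"
CORES_PER_NODE = 12                 # assumption: 1536 cores / 128 nodes
THREADS_PER_CORE = 2                # assumption: 3072 threads / 1536 cores
DRAM_GB_PER_NODE = 48               # T4240 demo board: "48GB DRAM"
UPLINK_GROUPS = [6, 6, 6, 6, 4, 4]  # "N x 40G" labels on the network diagram
UPLINK_GBPS = 40                    # 40G external links


def chassis_totals():
    """Return the aggregate figures for one 2U chassis as a dict."""
    cores = NODES_PER_CHASSIS * CORES_PER_NODE
    threads = cores * THREADS_PER_CORE
    dram_tb = NODES_PER_CHASSIS * DRAM_GB_PER_NODE / 1024
    external_ports = sum(UPLINK_GROUPS)
    external_gbps = external_ports * UPLINK_GBPS
    return {
        "cores": cores,
        "threads": threads,
        "dram_tb": dram_tb,
        "external_ports": external_ports,
        "external_gbps": external_gbps,
    }


if __name__ == "__main__":
    for name, value in chassis_totals().items():
        print(f"{name}: {value}")
```

Running the script prints one line per aggregate, which can be compared directly against the numbers on the chassis and network slides.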