Sustainable Predictive Storage Management
David Essary
Dept. of Computer Science, University of Pittsburgh
© 2011 David Essary, SYSTOR 2011

Introduction
- Disk vs CPU
- Storage systems’ power becoming more critical
- Rate of data generation is alarming
- No “silver bullet”
- Goal: Dynamic (and adaptive), sustainable predictive grouping engine
- Group = Disk track

Motivating Example
Block Access Pattern: 1,5,9,2,6,1,3,7,11
Original layout:
  Group A: 1 2 3 4
  Group B: 5 6 7 8
  Group C: 9 10 11 12
Remap with Replication:
  Group A: 1 5 9 2
  Group B: 6 1 3 7
  Group C: 11 4 8 10
[Legend: used block, free block]
Group Access Pattern Prior to Remapping: A,B,C,A,B,A,B,C
Group Access Pattern with Remapping and Replication: A,B,C

Challenges
- When are predictions performed?
- How are predictions made?
- How is predictive metadata gathered?
- Where are predictions to be located?
- How are predictions used?

SPORe
[Diagram: Data Request Stream; SPORe components: Control, SESH, LRDU hot lists, Root Monitor, Scribe, Cartographer; Storage Device]

Supergroups
[Figure: weighted graph over nodes A through N; edge weights range from 0.1 to 1.0]
Group 1 (Root: A): A, B, C, G, D, E, F, I
Group 2 (Root: K): K, L, M, F, N, C, G, I
Supergroup (Roots: A and K): A, B, C, G, D, E, F, I, K, L, M, N, ?, ?, ?, ?

Group Scanning
- Does the offending block exist between the current disk head location and the target location?
- If so, we instead switch to the predictive group containing the offending block
[Diagram labels: track distance, disk rotation, disk arm movement]
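The group access patterns on the Motivating Example slide can be reproduced with a short sketch. The function name and the replica-selection rule (stay in the current group whenever it holds a copy of the requested block) are assumptions made for illustration, not the author's implementation, and the contents of Group C in both layouts are a best-effort reading of the slide's figure.

```python
from typing import Dict, List, Optional


def group_access_pattern(layout: Dict[str, List[int]], accesses: List[int]) -> List[str]:
    """Return the groups visited in order, collapsing consecutive repeats."""
    visited: List[str] = []
    current: Optional[str] = None
    for block in accesses:
        # Assumed rule: stay put if the current group holds the block (or a replica).
        if current is not None and block in layout[current]:
            continue
        # Otherwise move to the first group that holds a copy of the block.
        current = next(g for g, blocks in layout.items() if block in blocks)
        visited.append(current)
    return visited


accesses = [1, 5, 9, 2, 6, 1, 3, 7, 11]

# Layouts as read from the slide; Group C's contents are an assumed reconstruction.
original = {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [9, 10, 11, 12]}
remapped = {"A": [1, 5, 9, 2], "B": [6, 1, 3, 7], "C": [11, 4, 8, 10]}

print(group_access_pattern(original, accesses))  # ['A', 'B', 'C', 'A', 'B', 'A', 'B', 'C']
print(group_access_pattern(remapped, accesses))  # ['A', 'B', 'C']
```

Block 1 appears in both Group A and Group B after remapping, which is what lets the second access to block 1 be served without leaving Group B.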
Reducing Commit Overhead
- Commit predictions to device opportunistically
  - Use only items already in main memory
  - Avoid additional seeks
- Avoid updating a group if it already contains at least 75% of the objects that it “should” contain
  - We call this percentage the overlap threshold

Overlap Threshold
Group α: A, B, C, D, E, F, G
Group α’: A, B, C, D, H, I, J
Overlap: 62.5%
Result: Replace group α with α’

Group α: A, B, C, D, E, F, G
Group α’’: A, B, C, D, E, F, J
Overlap: 87.5%
Result: Abort update and keep group α

Hardware-Based Validation
[Diagram: Data Request Stream; Workload Replay System; Network; Firewire 400; Test Drive Enclosure containing Test Drive and Internal Power Supply; 5 Volts and 12 Volts lines with In-line Resistor; Data Acquisition; Data Acquisition Leads; Voltage Measurement Workstation]

Hardware Validation
[Photo: DAQ and External HD]

Validation Results
[Figure: WD Drive, Percentage Latency Reduction (Group Size: 8K). Y-axis: 0% to 70%. Bars: mozart (month, 8KB blocks), mozart (month, 4KB blocks), mozart (month, 512B blocks), mozart (year, 8KB blocks), mozart (year, 4KB blocks), mozart (year, 512B blocks)]

Validation Results
[Figure: WD Drive, Percentage Energy Reduction (Group Size: 8K). Y-axis: 0% to 70%. Bars: mozart (month, 8KB blocks), mozart (month, 4KB blocks), mozart (month, 512B blocks), mozart (year, 8KB blocks), mozart (year, 4KB blocks), mozart (year, 512B blocks)]
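The overlap-threshold rule from the Reducing Commit Overhead and Overlap Threshold slides can be sketched as follows. The slides give the 75% threshold and the two outcomes (62.5% leads to replacement, 87.5% aborts the update) but not the formula, so the definition of overlap used here (shared members divided by the predicted group's size) is an assumption, as are the function names and the eight-member example groups.

```python
from typing import Iterable

OVERLAP_THRESHOLD = 0.75  # threshold stated on the slides


def overlap(on_disk: Iterable[str], predicted: Iterable[str]) -> float:
    """Assumed definition: fraction of the predicted group already stored on disk."""
    predicted_set = set(predicted)
    if not predicted_set:
        return 1.0  # nothing to commit, so treat the group as fully up to date
    return len(set(on_disk) & predicted_set) / len(predicted_set)


def should_commit(on_disk: Iterable[str], predicted: Iterable[str],
                  threshold: float = OVERLAP_THRESHOLD) -> bool:
    """Commit the new group only when the overlap falls below the threshold."""
    return overlap(on_disk, predicted) < threshold


# Hypothetical eight-member groups chosen to reproduce the slide's percentages.
on_disk = list("ABCDEFGH")
large_change = list("ABCDEIJK")  # 5 of 8 members shared
small_change = list("ABCDEFGJ")  # 7 of 8 members shared

print(overlap(on_disk, large_change), should_commit(on_disk, large_change))  # 0.625 True
print(overlap(on_disk, small_change), should_commit(on_disk, small_change))  # 0.875 False
```

Skipping the write when the overlap is already high trades a slightly stale group for fewer commits, which matches the slide's stated aim of avoiding additional seeks.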
Validation Results
[Figure: WD Drive, Percentage Latency Reduction (Group Size: 1K). Y-axis: 0% to 70%. Bars: mozart (month, 8KB blocks), mozart (month, 4KB blocks), mozart (month, 512B blocks), mozart (year, 8KB blocks), mozart (year, 4KB blocks), mozart (year, 512B blocks)]

Validation Results
[Figure: WD Drive, Percentage Energy Reduction (Group Size: 1K). Y-axis: 0% to 70%. Bars: mozart (month, 8KB blocks), mozart (month, 4KB blocks), mozart (month, 512B blocks), mozart (year, 8KB blocks), mozart (year, 4KB blocks), mozart (year, 512B blocks)]

Validation Results
[Figure: Percentage Average Reduction for Time and Energy. Y-axis: 0% to 5%. Categories: Time, Energy. Series: Transition Estimate, HIT, SAM, WD]

Validation Results
[Figure: Percentage Reduction of Energy by Block Size. Y-axis: 0% to 70%. X-axis: 512, 4096, 8192. Series: Transition Estimate, Measured (WD)]

SPORe Conclusions
- Opportunistic, dynamic, sustainable
- Replicates data on the fly (no warm-up period)
- Simultaneously reduces
  - Track distance (up to 80% reduction)
  - Track seeks (up to 65%)
  - Latency due to mechanical movement (up to 63%)
  - Energy due to mechanical movement (up to 61%)
- Strong correlation between seek reduction and energy and latency reduction
- Latency and energy results validated by live hardware
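As a rough illustration of how energy figures might be derived from the setup on the Hardware-Based Validation slide (an in-line resistor on a supply line, sampled by a data acquisition unit), the sketch below converts voltage-drop samples into energy and then into a percentage reduction. The slides do not describe the actual computation; the function names, the use of the nominal rail voltage, the rectangle-rule integration, and all numeric values are assumptions chosen for illustration, not measurements.

```python
from typing import Sequence


def rail_energy_joules(v_drop: Sequence[float], v_rail: float,
                       r_ohms: float, dt_s: float) -> float:
    """Sum P = V_rail * (V_drop / R) over uniformly spaced samples (rectangle rule)."""
    # Current through the in-line resistor follows from Ohm's law: I = V_drop / R.
    return sum(v_rail * (v / r_ohms) * dt_s for v in v_drop)


def percent_reduction(baseline: float, optimized: float) -> float:
    """Percentage by which `optimized` is lower than `baseline`."""
    return 100.0 * (baseline - optimized) / baseline


# Illustrative values only: a 0.1 ohm resistor on the 12 V line, sampled every 1 ms.
baseline_samples = [0.05, 0.05, 0.04, 0.04]   # volts across the resistor
optimized_samples = [0.03, 0.03, 0.02, 0.02]

e_base = rail_energy_joules(baseline_samples, v_rail=12.0, r_ohms=0.1, dt_s=0.001)
e_opt = rail_energy_joules(optimized_samples, v_rail=12.0, r_ohms=0.1, dt_s=0.001)
print(f"baseline: {e_base:.4f} J, optimized: {e_opt:.4f} J")
print(f"reduction: {percent_reduction(e_base, e_opt):.1f}%")
```

A complete measurement would apply the same calculation to the 5 V line and add the two results; this sketch also ignores the small voltage lost across the resistor itself.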