Efficient Power Management Schemes for Dual-Processor Fault-Tolerant Systems

Yifeng Guo, Dakai Zhu (The University of Texas at San Antonio)
Hakan Aydin (George Mason University)
HARSH-2013, Shenzhen, China

Outline
- Background and Motivation
- System Models
- POED Algorithm
- Application to Energy-Efficient Fault-Tolerant Real-Time Systems
- Simulation Results and Discussions
- Conclusions

Energy is Precious: Everywhere!
- Popularity of mobile devices
  - Smart phones: 492 million in 2011
  - Battery operated, with limited capacity: smart phones last a few days; laptops 3-10 hours
- Data centers and servers
  - Excessive heat requires cooling
  - Operation cost: 1.5% of US electricity in 2007, billions of dollars

Power Reduction Techniques
- LCD: brightness control; on/off
- Memory: different power states
- Disks/SSD: spin down
- CPU: voltage/frequency scaling (DVFS); low-power states draw only 1-5% of peak power

A Simple System-Level Power Model
- Example: a real-time task T needs 2 time units at processing speed f; energy E = P * t
- Scaling the frequency from f to f/2 stretches the execution time t, so:
  - energy from Ps: the same (consumed regardless); energy from Pind: increases; energy from Pd: decreases
- A simple system power model:
    P(f) = Ps + Pind + Pd = Ps + Pind + Cef * f^m
- Minimum energy-efficient frequency [Zhu '06]:
    f_ee = (Pind / ((m - 1) * Cef))^(1/m)
  (scaling below f_ee no longer saves energy)

Fault-Tolerant System Design
- Faults
  - Transient faults: temporary, disappear on their own
  - Permanent faults: require component replacement/redundancy
- Techniques
  - Time redundancy: use available system slack for recovery execution
    - tolerates only transient faults, not permanent ones
    - a task with utilization > 50% cannot be managed
  - Hardware redundancy (e.g., duplex, TMR systems)
    - tolerates both transient and permanent faults
    - tremendous energy consumption

Transient Faults/Reliability vs. DVFS
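The power model and the f_ee expression above can be sketched in Python. This is a minimal illustration, not code from the talk; the function names are mine, and the parameter values mirror the simulation settings given later (Ps = 0.01, Pind = 0.1, Cef = 1, m = 3).

```python
def energy_efficient_frequency(p_ind: float, c_ef: float, m: int) -> float:
    """f_ee = (Pind / ((m - 1) * Cef)) ** (1/m) [Zhu '06].

    Below f_ee, the frequency-independent active power Pind dominates the
    stretched execution time, so scaling further DOWN costs more energy.
    """
    return (p_ind / ((m - 1) * c_ef)) ** (1.0 / m)

def energy(p_s: float, p_ind: float, c_ef: float, m: int,
           f: float, work: float) -> float:
    """Energy to finish `work` (time units at f = 1) at frequency f."""
    t = work / f                               # execution time stretches as 1/f
    return (p_s + p_ind + c_ef * f ** m) * t   # P(f) = Ps + Pind + Cef * f^m

f_ee = energy_efficient_frequency(0.1, 1.0, 3)   # (0.05)^(1/3), about 0.37
```

With these parameters, running the 2-unit task at f_ee consumes less energy than running it either at full speed or at half of f_ee, which is exactly the trade-off the slide illustrates.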
- Critical charge Qcrit: the smallest charge needed to flip a circuit state [Ernst-Micro-2004]

Co-Management of Energy and Reliability
- Reliability-Aware Power Management (RAPM) [Zhu '06, Qi '11]
  - low supply voltage (DVFS) + time redundancy
  - lower voltage means more transient faults
- Standby-Sparing [Ejlali '09, Haque '11]
  - dual-processor systems; aperiodic/periodic tasks
  - primary processor: primary tasks, managed with DVFS
  - spare processor: backup tasks, managed with DPM and deallocation
  - minimize the overlap between primary and backup executions
  - tolerates transient faults and one permanent fault
- Secondary Execution Time Shifting (SETS) [Unsal '09]
  - periodic tasks; primary/backup tasks allocated in a mixed manner
  - static scheme to reduce the overlap between primary and backup executions

Application Model
- n periodic real-time tasks Ψ = {T1, ..., Tn}; Ti = (ci, pi)
  - ci: worst-case execution time (WCET) at fmax (fmax = 1)
  - pi: period; at a lower frequency fi, the WCET becomes ci/fi
  - utilization ui = ci/pi; U = Σ ui, i = 1, ..., n
- Bi: backup task for Ti
  - same parameters (ci and pi)
  - no DVFS on backups (to limit transient faults)
  - Ti and Bi run on different CPUs (to survive a permanent fault)

Problem to Solve
- Hardware redundancy: dual-CPU systems
  - tolerate a single permanent fault as well as transient faults
- Each task has a primary and a backup copy; the two copies must run on different CPUs

Standby-Sparing in Dual-CPU Systems [Haque '11]
- Example: T1(1, 5) and T2(2, 10)
  - primary CPU1 runs T1, T2 under EDF at scaled frequency f1,2 = 2/5
  - secondary CPU2 runs B1, B2 under EDL, i.e., as late as possible
- Slack time on the secondary CPU is wasted!

Mixed Allocation of P/B Tasks
- Each CPU gets a mixed set of primary and backup tasks, and primary tasks are scaled down
  - CPU1 runs {T1, B2} at f1 = 1/4; CPU2 runs {T2, B1} at f2 = 1/4
- Problem: backup tasks run concurrently with primary tasks, causing more energy consumption!
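The scaled frequencies quoted in the two examples above follow from EDF utilization: primaries are slowed until the CPU's total utilization (with backups kept at fmax = 1) reaches 1. A hedged sketch, with the function name and interface my own:

```python
def primary_frequency(primaries, backups, f_ee=0.0):
    """Uniform EDF frequency for the primary tasks on one CPU.

    primaries/backups: lists of (wcet_at_fmax, period).
    Solves U_primary / f + U_backup = 1 for f; never scale below f_ee.
    """
    u_p = sum(c / p for c, p in primaries)
    u_b = sum(c / p for c, p in backups)
    assert u_b < 1.0 and u_p / (1.0 - u_b) <= 1.0, "CPU overloaded at fmax"
    return max(u_p / (1.0 - u_b), f_ee)

# Standby-sparing primary CPU: T1(1,5) and T2(2,10), no backups -> 2/5
print(primary_frequency([(1, 5), (2, 10)], []))      # 0.4
# Mixed allocation, CPU1: primary T1(1,5) with backup B2(2,10) -> 1/4
print(primary_frequency([(1, 5)], [(2, 10)]))        # 0.25
```

Both printed values reproduce the slide figures (f1,2 = 2/5 and f1 = 1/4), which is the sanity check this sketch is meant to provide.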
Differentiate Executions of P/B Tasks
- Primary and backup tasks have different preferences
  - primary tasks: as soon as possible (ASAP)
  - backup tasks: as late as possible (ALAP)
- Example (CPU1 runs {T1, B2} at f1 = 1/4; CPU2 runs {T2, B1} at f2 = 1/4): primaries execute early while backups are pushed toward their deadlines
- Problem: how to efficiently schedule real-time tasks with different preferences on each CPU?

RT Tasks with Preferences and Schedules [Guo TR'12]
- A set of n periodic tasks Ψ = {T1, ..., Tn}
- Each task has a preference, ASAP or ALAP
  - ASAP tasks ΨS and ALAP tasks ΨL, with Ψ = ΨS ∪ ΨL
- A feasible schedule S: t -> Ti, 0 ≤ t ≤ LCM, 1 ≤ i ≤ n
  - S(t) = Ti means Ti executes in time slot [t, t+1)
  - no deadline miss

Accumulated ASAP/ALAP Executions
- Accumulated ASAP execution before time t:
    Δ(S, t) = Σ_{z=0}^{t-1} δ(S, z), 0 ≤ t ≤ LCM,
  where δ(S, z) = 1 if S(z) = Ti and Ti ∈ ΨS
- Accumulated ALAP execution after time t:
    Ω(S, t) = Σ_{z=t}^{LCM-1} ω(S, z), 0 ≤ t ≤ LCM,
  where ω(S, z) = 1 if S(z) = Ti and Ti ∈ ΨL

Optimal Preference-Oriented Schedules
- An ASAP-optimal schedule S^opt_asap: feasible, and for any other feasible schedule S, Δ(S^opt_asap, t) ≥ Δ(S, t) for 0 ≤ t ≤ LCM
- An ALAP-optimal schedule S^opt_alap: feasible, and for any other feasible schedule S, Ω(S^opt_alap, t) ≥ Ω(S, t) for 0 ≤ t ≤ LCM
- A PO-optimal schedule S^opt: feasible, and for any other feasible schedule S, both Δ(S^opt, t) ≥ Δ(S, t) and Ω(S^opt, t) ≥ Ω(S, t) for 0 ≤ t ≤ LCM
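The Δ and Ω measures above are easy to state in code for a slot-based schedule. A small sketch (the example schedule is hand-made for illustration and feasible for T1(1,3) ASAP, T2(1,4) and T3(1,6) ALAP over LCM = 12; it is not claimed to be either optimal schedule from the talk):

```python
def delta(schedule, asap_ids, t):
    """Accumulated ASAP execution in slots [0, t): Delta(S, t)."""
    return sum(1 for z in range(t) if schedule[z] in asap_ids)

def omega(schedule, alap_ids, t):
    """Accumulated ALAP execution in slots [t, LCM): Omega(S, t)."""
    return sum(1 for z in range(t, len(schedule)) if schedule[z] in alap_ids)

# One slot per unit of time; None marks an idle slot (U = 0.75 leaves 3 idle).
S = ["T1", "T2", None, "T1", "T3", None,
     "T1", "T2", None, "T1", "T2", "T3"]

print(delta(S, {"T1"}, 12))          # 4: all four T1 jobs count as ASAP work
print(omega(S, {"T2", "T3"}, 8))     # 2: ALAP work remaining after time 8
```

Comparing these two quantities slot by slot, as the optimality definitions require, is exactly how one schedule is judged "more ASAP" or "more ALAP" than another.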
Optimal Schedules vs. System Loads
- U < 1: the optimal schedules are discrepant, with idle time placed differently
  - Example: T1(1, 3), T2(1, 4), T3(1, 6); U = 0.75; ΨS = {T1}, ΨL = {T2, T3}
  - [Figure: over LCM = 12, an ASAP-optimal schedule runs T1's jobs as early as possible but is not ALAP-optimal; an ALAP-optimal schedule pushes T2's and T3's jobs toward their deadlines but is not ASAP-optimal]
- U = 1: the optimal schedules are harmonious (no idle time to place)

Preference-Oriented Earliest Deadline (POED) Heuristic
- ASAP scheduling principle: at any time, if there are ready ASAP tasks, execute them first, provided this does not lead to a deadline miss for ALAP tasks
- ALAP scheduling principle: if there is no ready ASAP task, the CPU should idle, provided this does not lead to a deadline miss for ALAP tasks
- Idle time is managed explicitly with wrapper tasks that carry deadlines [Zhu '09]

POED Scheduling Algorithm (at time t)
- If Tk is a ready ASAP task with the earliest deadline dk, check the look-ahead interval [t, dk]:
  - if there is free time, execute Tk (possibly as wrapped execution)
  - otherwise, urgently execute the earliest-deadline ALAP task
- Else if there is a wrapper task Tx with deadline dx (ASAP), check the look-ahead interval [t, dx]:
  - if there is free time, execute Tx (the CPU idles)
  - otherwise, urgently execute the earliest-deadline ALAP task
- If there are no ASAP or wrapper tasks, execute ALAP tasks under EDF

Look-Ahead Interval
- [Figure: two cases of the look-ahead interval [t, dk]. With Qla = {Tx, Ty}, Tx, Ty ∈ ΨL, a free section exists at the beginning of the interval; with Qla = {Tx, Ty, Tz}, Tx, Ty ∈ ΨL and Tz ∈ ΨS, the pending work leaves no free section at the beginning]

POED-Based EEFT on Duplex Systems
- Steps:
  1. map primary tasks to the two CPUs (e.g., with Worst-Fit Decreasing, WFD)
  2. cross-assign backup tasks to the CPUs
  3. calculate the scaled frequency for primary tasks on each CPU
  4. on each CPU, execute tasks with the POED scheduler
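Mapping steps 1 and 2 above can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: WFD places each primary on the currently least-loaded CPU, and every backup then goes to the opposite CPU so that one permanent fault can never take out both copies of a task.

```python
def map_duplex(tasks):
    """tasks: {name: (wcet_at_fmax, period)}.

    Returns per-CPU primary sets, cross-assigned backup sets, and the
    primary utilization on each CPU.
    """
    prim, load = [set(), set()], [0.0, 0.0]
    # WFD: consider tasks in decreasing utilization, worst fit = least load
    for name, (c, p) in sorted(tasks.items(),
                               key=lambda kv: kv[1][0] / kv[1][1],
                               reverse=True):
        cpu = 0 if load[0] <= load[1] else 1
        prim[cpu].add(name)
        load[cpu] += c / p
    backup = [set(prim[1]), set(prim[0])]   # backups on the opposite CPU
    return prim, backup, load

prim, backup, load = map_duplex({"T1": (1, 5), "T2": (2, 10)})
# One primary per CPU; B1 lands opposite T1, B2 opposite T2.
```

Step 3 would then scale each CPU's primaries based on its mixed primary/backup load, and step 4 hands the resulting task sets to POED.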
- Primary tasks have the ASAP preference; backup tasks have the ALAP preference
- When a task completes successfully on one CPU, notify the other CPU to cancel its backup

Online Extension
- Dynamic slack arises from backup cancellations and from actual execution times far below WCET (AET << WCET)
- Use it to further slow down primaries and delay backups

An Example
- T1(1, 5) and T2(2, 10); U = 0.4
- CPU1 runs {T1, B2} at f1 = 1/4; CPU2 runs {T2, B1} at f2 = 1/4
- [Figure: POED schedules on both CPUs over LCM = 10; primaries execute early at the scaled frequency, backups are delayed and cancelled once their primaries complete]

Simulation Settings
- Power model: Ps = 0.01, Pind = 0.1, Cef = 1, m = 3
- Processor frequency levels: 0.4, 0.6, 0.8, 1.0
- Tasks per task set: 10, 20, ..., 100
- Per-task utilization: UUniFast scheme [Bini '04]
- Task periods: uniform distribution over [pmin, pmax] = [10, 100]
- Task sets per data point: 100
- Static load U: 0.3, 0.4, ..., 1.0
- Dynamic load αi: uniform distribution with average α
- Number of processors: 2 (dual-processor system)

Schemes for Comparison
- Baseline: Basic-SS, basic standby-sparing without frequency scaling
- Existing schemes
  - SS-SPM: standby-sparing with offline scaled frequency
  - SS-DPM (ASSPT [Haque '11]): standby-sparing further scaled online using dynamic slack
- Proposed schemes
  - POED-SPM: POED with offline scaled frequency
  - POED-DPM: POED further scaled online using dynamic slack

Energy Savings: POED vs. Standby-Sparing
- [Figure: normalized energy consumption vs. static load; 20 tasks per set]
- [Figure: normalized energy consumption vs. dynamic load; 20 tasks per set]

Conclusions & Future Work
- POED-based EEFT for dual-processor systems: co-management of energy with reliability
- Results: significant energy savings vs. standby-sparing
- Future work: effects of additional DVFS transitions; multiprocessor systems with more than two processors

Thanks & Questions
http://www.my.cs.utsa.edu/~yguo
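As a closing note on reproducibility: the UUniFast scheme [Bini '04] cited in the simulation settings generates n per-task utilizations that sum to a target U, uniformly over the valid simplex. A standard sketch (seed and interface are my choices, not from the talk):

```python
import random

def uunifast(n, total_u, rng=None):
    """UUniFast [Bini '04]: n utilizations uniformly distributed
    subject to sum(utilizations) == total_u."""
    rng = rng or random.Random(2013)
    utils, remaining = [], total_u
    for i in range(1, n):
        nxt = remaining * rng.random() ** (1.0 / (n - i))
        utils.append(remaining - nxt)
        remaining = nxt
    utils.append(remaining)
    return utils

us = uunifast(20, 0.8)   # e.g., one 20-task set at static load U = 0.8
```

Pairing each utilization with a period drawn uniformly from [10, 100] then yields a task set matching the settings table.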