Presentation

Sustainable Predictive Storage Management
David Essary
Dept. of Computer Science
University of Pittsburgh
© 2011 David Essary
SYSTOR 2011
Dept. of Computer Science
Introduction
Disk vs CPU
Storage systems’ power becoming more critical
Rate of data generation is alarming
No “silver bullet”
Goal: Dynamic (and adaptive), sustainable predictive grouping engine
Group = Disk track
Motivating Example
Block Access Pattern: 1,5,9,2,6,1,3,7,11
Original layout:
Group A: 1, 2, 3, 4
Group B: 5, 6, 7, 8
Group C: 9, 10, 11, 12
Remap with Replication:
Group A: 1, 5, 9, 2
Group B: 6, 1, 3, 7
Group C: 11, 4, 8, 10
Legend: Used block, Free block
Group Access Pattern Prior to Remapping: A,B,C,A,B,A,B,C
Group Access Pattern with Remapping and Replication: A,B,C
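The group access patterns on this slide can be reproduced with a short sketch. It assumes that a request stays in the current group whenever that group holds a copy of the block, and otherwise switches to the first group that does. The function name `group_access_pattern` and the dictionary layouts are illustrative, not part of SPORe.

```python
def group_access_pattern(blocks, layout):
    """Return the sequence of group switches needed to serve `blocks`.

    `layout` maps a group name to the list of blocks stored in it.
    A block may be replicated in more than one group.
    """
    pattern = []
    current = None
    for block in blocks:
        holders = [g for g, members in layout.items() if block in members]
        if not holders:
            raise KeyError(f"block {block} is not stored in any group")
        # Assumption: stay in the current group if it holds a copy.
        if current not in holders:
            current = holders[0]
            pattern.append(current)
    return pattern


accesses = [1, 5, 9, 2, 6, 1, 3, 7, 11]

original = {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [9, 10, 11, 12]}
remapped = {"A": [1, 5, 9, 2], "B": [6, 1, 3, 7], "C": [11, 4, 8, 10]}

before = group_access_pattern(accesses, original)
after = group_access_pattern(accesses, remapped)
print(",".join(before))  # A,B,C,A,B,A,B,C
print(",".join(after))   # A,B,C
```

Replicating block 1 into Group B is what lets the sixth request be served without returning to Group A.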
Challenges
When are predictions performed?
How are predictions made?
How is predictive metadata gathered?
Where are predictions to be located?
How are predictions used?
SPORe
Data Request Stream
SPORe Control
SESH
LRDU hot lists
Root Monitor
Scribe
Cartographer
Storage Device
Supergroups
[Figure: weighted graph over nodes A through N; edge weights range from 0.1 to 1.0]
Group 1 (Root: A)
A, B, C, G, D, E, F, I
Group 2 (Root: K)
K, L, M, F, N, C, G, I
Supergroup (Roots: A and K)
A, B, C, G, D, E, F, I, K, L, M, N, ?, ?, ?, ?
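The supergroup listing on this slide is consistent with merging the two groups in order, dropping members already present, and leaving the remaining slots open. The sketch below works under that reading. The function name `build_supergroup`, the capacity of 16 (two groups of eight), and the interpretation of "?" as an unfilled slot are assumptions, not details confirmed by the slide.

```python
def build_supergroup(groups, capacity, free="?"):
    """Merge groups in order, keeping the first copy of each member.

    Slots left over after deduplication are marked with `free`.
    """
    merged = []
    seen = set()
    for group in groups:
        for item in group:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    if len(merged) > capacity:
        raise ValueError("supergroup capacity exceeded")
    return merged + [free] * (capacity - len(merged))


group1 = list("ABCGDEFI")  # Group 1 (Root: A)
group2 = list("KLMFNCGI")  # Group 2 (Root: K)

supergroup = build_supergroup([group1, group2], capacity=16)
print(", ".join(supergroup))
# A, B, C, G, D, E, F, I, K, L, M, N, ?, ?, ?, ?
```

Because F, C, G, and I appear in both groups, the merge frees four slots that could hold further predictions.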
Group Scanning
Does the offending block lie between the current disk head location and the target location?
If so, we instead switch to the predictive group containing the offending block.
Track distance
Disk rotation
Disk arm movement
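A minimal sketch of the group scanning decision, under simplifying assumptions: positions are modeled as integer track numbers, "between" is taken as inclusive, and rotation is ignored. The slide does not define the position model, so the function `pick_group` and all of its parameters are hypothetical.

```python
def pick_group(head_track, target_track, offending_track,
               target_group, offending_group):
    """Choose which group the head should move to.

    If the offending block's track lies on the path from the current
    head position to the target, switch to the predictive group that
    contains the offending block; otherwise continue to the target.
    """
    low, high = sorted((head_track, target_track))
    if low <= offending_track <= high:
        return offending_group
    return target_group


# The head is at track 10 and was heading for track 50.
print(pick_group(10, 50, 30, "target", "predictive"))  # predictive
print(pick_group(10, 50, 70, "target", "predictive"))  # target
```

Sorting the two endpoints makes the check independent of whether the arm is moving inward or outward.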
Reducing Commit Overhead
Commit predictions to device opportunistically
Use only items already in main memory
Avoid additional seeks
Avoid updating a group if it already contains at least 75% of the objects that it “should” contain
We call this percentage the overlap threshold
Overlap Threshold
Group α
A, B, C, D, E, F, G
Group α’
A, B, C, D, H, I, J
Overlap: 62.5%
Result: Replace group α with α’
Group α
A, B, C, D, E, F, G
Group α’’
A, B, C, D, E, F, J
Overlap: 87.5%
Result: Abort update and keep group α
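The two cases above can be checked with a small sketch of the overlap test. The slide's percentages (62.5% and 87.5%) correspond to 5 of 8 and 7 of 8 shared objects, which would follow if each group holds eight objects and one shared member is not listed. The sketch assumes that unlisted member is the group root, written here as the hypothetical object "R". The names `overlap` and `should_commit` are illustrative.

```python
def overlap(current, proposed):
    """Fraction of the proposed group's objects already in the current group."""
    current, proposed = set(current), set(proposed)
    return len(current & proposed) / len(proposed)


def should_commit(current, proposed, threshold=0.75):
    """Commit the update only when the overlap falls below the threshold."""
    return overlap(current, proposed) < threshold


# "R" stands in for the assumed unlisted root shared by all three groups.
alpha = ["R", "A", "B", "C", "D", "E", "F", "G"]
alpha_prime = ["R", "A", "B", "C", "D", "H", "I", "J"]
alpha_double_prime = ["R", "A", "B", "C", "D", "E", "F", "J"]

print(overlap(alpha, alpha_prime))                # 0.625
print(should_commit(alpha, alpha_prime))          # True: replace the group
print(overlap(alpha, alpha_double_prime))         # 0.875
print(should_commit(alpha, alpha_double_prime))   # False: keep the group
```

Skipping the write when the overlap is already high trades a slightly stale group for one fewer commit to the device.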
Hardware-Based Validation
Data Request Stream
Firewire 400
Test Drive Enclosure
Test Drive
Internal Power Supply
Workload Replay System
Network
5 Volts
In-line Resistor
Data Acquisition
12 Volts
Data Acquisition Leads
Voltage Measurement Workstation
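A setup like this one, with an in-line resistor on each supply rail and a data acquisition unit sampling voltage, allows energy to be derived from Ohm's law. The sketch below is one way to do that calculation, not the authors' measurement code. The function names, the fixed sampling interval, and all sample values are hypothetical.

```python
def rail_energy_joules(shunt_drops_v, rail_v, shunt_ohms, dt_s):
    """Energy delivered to the drive on one rail.

    shunt_drops_v: sampled voltage drops across the in-line resistor (V)
    rail_v:        nominal rail voltage, e.g. 5.0 or 12.0 (V)
    shunt_ohms:    resistance of the in-line resistor (ohm)
    dt_s:          time between samples (s)
    """
    energy = 0.0
    for drop in shunt_drops_v:
        current = drop / shunt_ohms          # I = V / R
        power = (rail_v - drop) * current    # voltage left for the drive
        energy += power * dt_s
    return energy


def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical samples: four readings of 0.5 V across a 1-ohm resistor
# on the 12 V rail, taken 0.5 s apart.
energy_12v = rail_energy_joules([0.5] * 4, rail_v=12.0, shunt_ohms=1.0, dt_s=0.5)
print(energy_12v)                     # 11.5
print(percent_reduction(10.0, 4.0))   # 60.0
```

Total drive energy would be the sum of the 5 V and 12 V rail results over the same replayed workload.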
Hardware Validation
DAQ and External HD
Validation Results
WD Drive
Percentage Latency Reduction (Group Size: 8K)
[Chart: y-axis 0% to 70%; x-axis workloads: mozart (year, 512B blocks), mozart (year, 4KB blocks), mozart (year, 8KB blocks), mozart (month, 512B blocks), mozart (month, 4KB blocks), mozart (month, 8KB blocks)]
Validation Results
WD Drive
Percentage Energy Reduction (Group Size: 8K)
[Chart: y-axis 0% to 70%; x-axis workloads: mozart (year, 512B blocks), mozart (year, 4KB blocks), mozart (year, 8KB blocks), mozart (month, 512B blocks), mozart (month, 4KB blocks), mozart (month, 8KB blocks)]
Validation Results
WD Drive
Percentage Latency Reduction (Group Size: 1K)
[Chart: y-axis 0% to 70%; x-axis workloads: mozart (year, 512B blocks), mozart (year, 4KB blocks), mozart (year, 8KB blocks), mozart (month, 512B blocks), mozart (month, 4KB blocks), mozart (month, 8KB blocks)]
Validation Results
WD Drive
Percentage Energy Reduction (Group Size: 1K)
[Chart: y-axis 0% to 70%; x-axis workloads: mozart (year, 512B blocks), mozart (year, 4KB blocks), mozart (year, 8KB blocks), mozart (month, 512B blocks), mozart (month, 4KB blocks), mozart (month, 8KB blocks)]
Validation Results
Percentage Average Reduction for Time and Energy
[Chart: y-axis 0% to 5%; x-axis: Time, Energy; series: Transition Estimate, HIT, SAM, WD]
Validation Results
Percentage Reduction of Energy by Block Size
[Chart: y-axis 0% to 70%; x-axis block sizes: 512, 4096, 8192; series: Transition Estimate, Measured (WD)]
SPORe Conclusions
Opportunistic, dynamic, sustainable
Replicates data on the fly (no warm-up period)
Simultaneously reduces
Track distance (up to 80% reduction)
Track seeks (up to 65%)
Latency due to mechanical movement (up to 63%)
Energy due to mechanical movement (up to 61%)
Strong correlation between seek reduction and energy
and latency reduction
Latency and energy results validated by live hardware