slides

Write Barrier Elision for
Concurrent Garbage Collectors
Martin T. Vechev
Cambridge University
David F. Bacon
IBM T.J.Watson Research Center
1
Write Barriers for Concurrent GC
• Mutator and Collector interleaved
• Mutator changes object graph
• Potential race conditions
• Write barriers a form of synchronization
– Between mutators and the collector.
2
Can Synchronization Be Reduced?
?
write barrier
Mutators
write
read
•
read
Minimum information from a mutator to a collector?
–
•
•
Heap
Connectivity
Graph
Concurrent
Collector
without compromising correctness?
Can static analysis help?
Can we discover dynamic opportunities?
3
Outline
• The Synchronization Problem
• Elimination conditions
• Elimination opportunities (limit study)
• Elimination correlation
• Exploiting Order
• Conclusions and Future Work
4
The Synchronization Problem
B
C
A
P1
Time
Collector Working
- Marks B as live
5
The Synchronization Problem
B
B
C
A
P1
P2
C
A
P1
Time
Collector Working Thread Working
- Marks B as live
- Installs P2
6
The Synchronization Problem
B
B
C
A
P1
P2
B
C
A
P1
P2
C
A
Time
Collector Working Thread Working Thread Working
- Marks B as live
- Installs P2
- Deletes P1
7
The Synchronization Problem
B
B
C
A
P1
P2
B
C
A
P1
P2
B
C
A
P2
?
A
Time
Collector Working
- Marks B as live
Thread Working Thread Working Collector Working
- Installs P2
- Deletes P1
- C was not seen
- Reclaims C: live!
8
The Synchronization Problem
A and B contain pointers to Object C
Point of Error
Memory
Location
P1
A
P2
B
0
1
2
3
4
5
6
Time
9
Types of Write Barriers
Steele
Dijkstra
Marks B (Source) for
rescanning
B
B
C
A
P1
Yuasa
Mark C (new target) as Mark C (old target) as live
live
P2
B
C
A
P1
P2
B
C
C
A
P2
A
TIME
Collector Working Thread Working Thread Working Collector Working
10
Costs of Concurrent Write Barriers
• Mutator Overhead
– Direct Run-time Overhead (5-20%)
– I-cache pollution by write barrier code
• Collector Overhead
– Process Write Barrier Information
• Space costs
– Sequential Store Buffer
– Increased Code Size
• Termination Issues
– Yuasa vs. Dijkstra/Steele
11
Contributions
•
Four Elimination Conditions a static analysis can utilize.
– Main idea based on pointer lifetimes.
1. Two Covering conditions: Apply to all barriers
2. Two Allocation conditions: Apply to incremental barriers.
•
Limit study shows potential for barrier elimination
– For Yuasa, on average 83% can be eliminated
– For Dijkstra and Steele, on average 54% can be eliminated
•
Correlation study between barrier elimination and
– Object Size
– Object Lifetime
– Program Locality
12
Single Covering Condition
B
B
C
A
P1
B
P2
C
A
P1
B
C
A
C
A
P1
P1
TIME
Collector Working Thread Working Thread Working Collector Working
Memory
Location
Key Observation:
• P1’s lifetime covers P2’s
• Barriers on P2 are redundant
A
B
P1
P2
Time
1
2
3
13
Single Covering Condition (Incremental)
Memory
Location
A
Memory
Location
Rescan
Roots
L
A
H
B
H
H
B
Time
1
2
3
4
Time
1
2
3
4
14
Single Covering Condition (Snapshot)
Memory
Location
A
Memory
Location
L
A
H
B
H
H
B
Time
1
2
3
4
Time
1
2
3
4
15
Multiple Covering Condition
Memory
Location
V
P1
A
P2
B
P3
C
Time
0
1
2
3
4
5
6
7
Key Observation :
• P1 together with a barrier on P2 form a virtual pointer PV
• Apply Single Covering to eliminate barriers on P3
16
Multiple Covering Condition (Incremental)
Memory
Location
A
Memory
Location
L
A
L
B
H
B
H
C
L
H
C
Time
1
2
3
4
Time
1
2
3
4
17
Multiple Covering Condition (Incremental)
Memory
Location
A
Memory
Location
H
A
L
B
H
B
H
C
H
H
C
Time
1
2
3
4
Time
1
2
3
4
18
Multiple Covering Condition (Snapshot)
Memory
Location
A
Memory
Location
L
A
L
B
H
B
H
C
L
H
C
Time
1
2
3
4
Time
1
2
3
4
19
Multiple Covering Condition (Snapshot)
Memory
Location
A
Memory
Location
H
A
L
B
H
B
H
C
H
H
C
Time
1
2
3
4
Time
1
2
3
4
20
Single Allocation Condition
Memory
Location
A
Key Observation :
H starts Inside ‘New’
pointer N
N
P2
B
1
Time
2
3
• Allocation conditions exploit Initial Root Set Marking
• Main idea: At the time of a barrier, the object is already marked
• Only if allocating non-white (also trade off), for Dijkstra/Steele only
21
Multiple Allocation Condition
Key Observation :
Memory
Location
A
• V = N join L
• Apply SAC between V and H
V
N
L
B
H
C
Time
1
2
3
22
Eliminating Null Pointer Stores
Yuasa(destination, old_pointer)
if ( Collector_Marking
&& old_pointer != NULL)
store (old_pointer);
•
•
•
•
Dijkstra(destination, new_pointer)
if (Collector_Marking
&& new_pointer != NULL)
store (new_pointer);
If old pointer is NULL, eliminate barrier
Yuasa vs Steele/Dijskstra
Yuasa good because of initialization
Null stores are easier to eliminate
– But less payoff
23
Evaluation Methodology
• Limit study based on elimination condition
• Shows potential for elimination
• Traces
– Jikes RVM 2.2.0
– Trace Events : Allocation, Pointer Stores
– Application Objects Only
• Which benchmarks:
–
–
–
–
–
SpecJVM98 (-s100)
Jolden
Ipsixql
Xalan
Deltablue
24
Barrier Elimination Potential – Dijkstra / Steele
SCC
SAC
Null
Remaining
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
G
eo
M
ea
n
ck
ja
trt
m
c
va
ja
db
t
ip h
si
xq
xa l
co la
m n
pr
es
s
al
bh
he
po
d e we
lta r
bl
ue
0%
25
Barrier Elimination Potential - Yuasa
SCC
Null
Remaining
100%
90%
80%
70%
60%
n
ea
M
ja
trt
m
c
va
ja
db
ck
G
eo
co
m
pr
es
s
n
la
xa
ql
six
ip
th
al
bh
he
bl
lta
de
po
we
r
ue
50%
40%
30%
20%
10%
0%
26
Time (MB) vs. Eliminated Yuasa Barriers
• Elimination is Periodic => WB elimination occurs in bursts
JAVAC
14000
Eliminated Barriers
12000
10000
8000
6000
4000
2000
0
1476
3483
5488
7490
9493
11493
13498
15498
Time (MB allocated)
27
Time (MB) vs. Eliminated Yuasa Barriers
• Elimination is Periodic => WB elimination occurs in bursts
500000
Eliminated Barriers
DB
400000
300000
200000
db
100000
0
6086
7268
8447
9755
11048
12371
13669
15030
Time (MB allocated)
28
Object Size (words) vs. Eliminated Yuasa Barriers
• Elimination is on Small objects
1800000
JAVAC
Eliminated Barriers
1500000
1200000
900000
600000
300000
0
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Object Size (words)
29
Object Age vs. Eliminated Yuasa Barriers
• Elimination is mostly on Young objects (Log Y scale)
1000000
JAVAC
Eliminated Barriers
100000
10000
1000
100
10
1
0
10
20
30
40
50
60
70
80
90
Object Age (in MB)
30
Object Age vs. Eliminated Yuasa Barriers
• An Exception
100000
DB
10000
1000
100
10
1
0
2258
6687
8656
8856
9065
9265
9485
9689
9889
31
Further Write Barrier Elimination
• So far assumed collector sees pointers in any order
• Often, Collector order exists
– take advantage of it
• Main idea: writes on the collector wave are safe.
– Cannot apply previous conditions
– Can eliminate more barriers
• Order extends to Lists, Queues, Trees (Needs
connectivity approximation)
32
Single Object Scenarios (Order matters)
Broken Sequence
B
C
P1
B
P2
C
B
P2
C
B
P2
?
P1
Time
Collector Working Thread Working Thread Working Collector Working
-B is partially
scanned (gray)
-Installs P2
- Deletes P1
- Reclaims live
object C incorrectly
33
Single Object Scenarios (Order matters)
Correct Sequence
B
P1
C
B
P1
C
P2
B
C
P2
C
B
P2
Collector Working
Thread Working Thread Working Collector Working
- B is partially
scanned (gray)
- Installs P2
- Deletes P1
- C is NOT collected
34
Conclusions and Future Work
• Static Analysis Elimination Conditions
• An Upper Bound
• Hot Barrier Methods
• Yuasa Barriers seem superior to Dijkstra/Steele
– (study higher elimination vs. reduced floating garbage)
– Yuasa are harder to statically eliminate
• More conditions possible
– Requires formal reasoning
• Other elimination combinations possible:
– Coverage (ownership) and Order can be exploited further.
35