Write Barrier Elision for Concurrent Garbage Collectors Martin T. Vechev Cambridge University David F. Bacon IBM T.J.Watson Research Center 1 Write Barriers for Concurrent GC • Mutator and Collector interleaved • Mutator changes object graph • Potential race conditions • Write barriers a form of synchronization – Between mutators and the collector. 2 Can Synchronization Be Reduced? ? write barrier Mutators write read • read Minimum information from a mutator to a collector? – • • Heap Connectivity Graph Concurrent Collector without compromising correctness? Can static analysis help? Can we discover dynamic opportunities? 3 Outline • The Synchronization Problem • Elimination conditions • Elimination opportunities (limit study) • Elimination correlation • Exploiting Order • Conclusions and Future Work 4 The Synchronization Problem B C A P1 Time Collector Working - Marks B as live 5 The Synchronization Problem B B C A P1 P2 C A P1 Time Collector Working Thread Working - Marks B as live - Installs P2 6 The Synchronization Problem B B C A P1 P2 B C A P1 P2 C A Time Collector Working Thread Working Thread Working - Marks B as live - Installs P2 - Deletes P1 7 The Synchronization Problem B B C A P1 P2 B C A P1 P2 B C A P2 ? A Time Collector Working - Marks B as live Thread Working Thread Working Collector Working - Installs P2 - Deletes P1 - C was not seen - Reclaims C: live! 8 The Synchronization Problem A and B contain pointers to Object C Point of Error Memory Location P1 A P2 B 0 1 2 3 4 5 6 Time 9 Types of Write Barriers Steele Dijkstra Marks B (Source) for rescanning B B C A P1 Yuasa Mark C (new target) as Mark C (old target) as live live P2 B C A P1 P2 B C C A P2 A TIME Collector Working Thread Working Thread Working Collector Working 10 Costs of Concurrent Write Barriers • Mutator Overhead – Direct Run-time Overhead (5-20%) – I-cache pollution by write barrier code • Collector Overhead – Process Write Barrier Information • Space costs – Sequential Store Buffer – Increased Code Size • Termination Issues – Yuasa vs. Dijkstra/Steele 11 Contributions • Four Elimination Conditions a static analysis can utilize. – Main idea based on pointer lifetimes. 1. Two Covering conditions: Apply to all barriers 2. Two Allocation conditions: Apply to incremental barriers. • Limit study shows potential for barrier elimination – For Yuasa, on average 83% can be eliminated – For Dijkstra and Steele, on average 54% can be eliminated • Correlation study between barrier elimination and – Object Size – Object Lifetime – Program Locality 12 Single Covering Condition B B C A P1 B P2 C A P1 B C A C A P1 P1 TIME Collector Working Thread Working Thread Working Collector Working Memory Location Key Observation: • P1’s lifetime covers P2’s • Barriers on P2 are redundant A B P1 P2 Time 1 2 3 13 Single Covering Condition (Incremental) Memory Location A Memory Location Rescan Roots L A H B H H B Time 1 2 3 4 Time 1 2 3 4 14 Single Covering Condition (Snapshot) Memory Location A Memory Location L A H B H H B Time 1 2 3 4 Time 1 2 3 4 15 Multiple Covering Condition Memory Location V P1 A P2 B P3 C Time 0 1 2 3 4 5 6 7 Key Observation : • P1 together with a barrier on P2 form a virtual pointer PV • Apply Single Covering to eliminate barriers on P3 16 Multiple Covering Condition (Incremental) Memory Location A Memory Location L A L B H B H C L H C Time 1 2 3 4 Time 1 2 3 4 17 Multiple Covering Condition (Incremental) Memory Location A Memory Location H A L B H B H C H H C Time 1 2 3 4 Time 1 2 3 4 18 Multiple Covering Condition (Snapshot) Memory Location A Memory Location L A L B H B H C L H C Time 1 2 3 4 Time 1 2 3 4 19 Multiple Covering Condition (Snapshot) Memory Location A Memory Location H A L B H B H C H H C Time 1 2 3 4 Time 1 2 3 4 20 Single Allocation Condition Memory Location A Key Observation : H starts Inside ‘New’ pointer N N P2 B 1 Time 2 3 • Allocation conditions exploit Initial Root Set Marking • Main idea: At the time of a barrier, the object is already marked • Only if allocating non-white (also trade off), for Dijkstra/Steele only 21 Multiple Allocation Condition Key Observation : Memory Location A • V = N join L • Apply SAC between V and H V N L B H C Time 1 2 3 22 Eliminating Null Pointer Stores Yuasa(destination, old_pointer) if ( Collector_Marking && old_pointer != NULL) store (old_pointer); • • • • Dijkstra(destination, new_pointer) if (Collector_Marking && new_pointer != NULL) store (new_pointer); If old pointer is NULL, eliminate barrier Yuasa vs Steele/Dijskstra Yuasa good because of initialization Null stores are easier to eliminate – But less payoff 23 Evaluation Methodology • Limit study based on elimination condition • Shows potential for elimination • Traces – Jikes RVM 2.2.0 – Trace Events : Allocation, Pointer Stores – Application Objects Only • Which benchmarks: – – – – – SpecJVM98 (-s100) Jolden Ipsixql Xalan Deltablue 24 Barrier Elimination Potential – Dijkstra / Steele SCC SAC Null Remaining 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% G eo M ea n ck ja trt m c va ja db t ip h si xq xa l co la m n pr es s al bh he po d e we lta r bl ue 0% 25 Barrier Elimination Potential - Yuasa SCC Null Remaining 100% 90% 80% 70% 60% n ea M ja trt m c va ja db ck G eo co m pr es s n la xa ql six ip th al bh he bl lta de po we r ue 50% 40% 30% 20% 10% 0% 26 Time (MB) vs. Eliminated Yuasa Barriers • Elimination is Periodic => WB elimination occurs in bursts JAVAC 14000 Eliminated Barriers 12000 10000 8000 6000 4000 2000 0 1476 3483 5488 7490 9493 11493 13498 15498 Time (MB allocated) 27 Time (MB) vs. Eliminated Yuasa Barriers • Elimination is Periodic => WB elimination occurs in bursts 500000 Eliminated Barriers DB 400000 300000 200000 db 100000 0 6086 7268 8447 9755 11048 12371 13669 15030 Time (MB allocated) 28 Object Size (words) vs. Eliminated Yuasa Barriers • Elimination is on Small objects 1800000 JAVAC Eliminated Barriers 1500000 1200000 900000 600000 300000 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Object Size (words) 29 Object Age vs. Eliminated Yuasa Barriers • Elimination is mostly on Young objects (Log Y scale) 1000000 JAVAC Eliminated Barriers 100000 10000 1000 100 10 1 0 10 20 30 40 50 60 70 80 90 Object Age (in MB) 30 Object Age vs. Eliminated Yuasa Barriers • An Exception 100000 DB 10000 1000 100 10 1 0 2258 6687 8656 8856 9065 9265 9485 9689 9889 31 Further Write Barrier Elimination • So far assumed collector sees pointers in any order • Often, Collector order exists – take advantage of it • Main idea: writes on the collector wave are safe. – Cannot apply previous conditions – Can eliminate more barriers • Order extends to Lists, Queues, Trees (Needs connectivity approximation) 32 Single Object Scenarios (Order matters) Broken Sequence B C P1 B P2 C B P2 C B P2 ? P1 Time Collector Working Thread Working Thread Working Collector Working -B is partially scanned (gray) -Installs P2 - Deletes P1 - Reclaims live object C incorrectly 33 Single Object Scenarios (Order matters) Correct Sequence B P1 C B P1 C P2 B C P2 C B P2 Collector Working Thread Working Thread Working Collector Working - B is partially scanned (gray) - Installs P2 - Deletes P1 - C is NOT collected 34 Conclusions and Future Work • Static Analysis Elimination Conditions • An Upper Bound • Hot Barrier Methods • Yuasa Barriers seem superior to Dijkstra/Steele – (study higher elimination vs. reduced floating garbage) – Yuasa are harder to statically eliminate • More conditions possible – Requires formal reasoning • Other elimination combinations possible: – Coverage (ownership) and Order can be exploited further. 35