Presentation

Healing Data Races On-The-Fly
Bohuslav Křena, Zdeněk Letko, Tomáš Vojnar
( Brno university of Technology, Czech Republic )
Rachel Tzoref, Shmuel Ur
( IBM, Haifa Research Lab, Israel )
PADTAD, London 07/2007
1
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
2
What is a race?
●
●
●
A data race occurs when two concurrent
threads access a shared variable and when:
–
at least one access is a write and
–
the threads use no explicit mechanism to prevent
the accesses from being simultaneous.
Usually a data race is a serious error caused by
failure to synchronize properly.
Can cause wrong results, deadlocks,
exceptions...
PADTAD, London 07/2007
3
Atomicity races
●
●
Races caused by violation of wrong
assumptions that some blocks of code will be
executed atomically.
Example:
Thread 1
void someMethod(){
long local = shared;
local = update(local);
shared = local;
}
Thread 2
void someMethod(){
long local = shared;
local = update(local);
shared = local;
}
PADTAD, London 07/2007
4
Atomicity races
●
●
Races caused by violation of wrong
assumptions that some blocks of code will be
executed atomically.
Example:
Thread 1
void someMethod(){
shared=update(shared);
}
Thread 2
void someMethod(){
shared=update(shared);
}
PADTAD, London 07/2007
5
Inherent races
●
Races not related to atomicity.
●
Data race if the following holds:
–
Executing any segment of code in each thread
atomically does not determine an order of accesses
to shared variable.
–
The different orders in which the shared variable is
acessed can be classified as “good” and “bad”
according to the expected behaviour of the
program.
PADTAD, London 07/2007
6
Inherent races
●
Example:
Thread 1
void synchronized
someMethod(){
long local = shared;
local = update(local);
shared = local;
}
Thread2
void synchronized
otherMethod(){
shared = null;
}
PADTAD, London 07/2007
7
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
8
Problem detection
●
Eraser algorithm
–
●
Detects so called apparent data races
Principle:
–
For each variable maintains its state and the set of
candidate locks
–
Race is detected whenever:
●
●
the variable is in state Shared and
the set of candidates locks becomes empty
PADTAD, London 07/2007
9
Demonstration of the
detection
static class Flight {
private int soldSeats;
...
Flight(){
soldSeats = 0;
...
}
...
boolean bookTicket(){
soldSeats++;
}
...
}
Thread_main
new Flight();
(state = Exclusive, C(v)={})
Thread 1
synchronized(lock){
bookTicket();}
(state = Shared, C(v)={lock})
Thread 2
bookTicket();
(state = Race, C(v)={})
PADTAD, London 07/2007
Time
10
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
11
Problem localisation
●
●
●
Often hard task even for humans.
Oracle based on looking for pre-specified data
race bug patterns in the code with the aid of
information collected by race detector.
Use formal methods to reduce the number of
false alarms but with reasonable overhead.
PADTAD, London 07/2007
12
Atomicity Violation
Bug Patterns
●
Load-store bug pattern
x++;
●
Test-and-use bug pattern
if (p != null)
p = p.next;
●
Repeated test-and-use pattern
while (p != null)
p = p.next;
PADTAD, London 07/2007
2:
5:
6:
7:
getfield
iconst_1
iadd
putfield
#2
#2
0: aload_0
1: getfield
#2
4: ifnull
18
7: aload_0
8: aload_0
9: getfield
#2
12:getfield
#3
15:putfield
#2
18:...
13
Demonstration of the
localisation
static class Flight {
private int soldSeats;
...
Flight(){
soldSeats = 0;
...
}
...
boolean bookTicket(){
soldSeats++;
}
...
}
getfield
iconst_1
iadd
putfield
PADTAD, London 07/2007
#2
#2
14
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
15
Healing atomicity races
●
Influencing the scheduler
–
Forcing a context switch
Thread.yield();
●
Idea:
–
●
Pros
–
●
To receive full time window from the scheduler.
Safe and legal solution.
Cons
–
Only decrease the probability of race manifestation.
PADTAD, London 07/2007
16
Healing atomicity races
●
Influencing scheduler
–
Temporary changes of the priorities
Thread.setPriority(MAXPRIORITY);
...
Thread.setPriority(originalPriority);
●
Idea:
–
●
Pros
–
●
To receive full time window from the scheduler.
Safe and legal solution.
Cons
–
Only decrease the probability of race manifestation.
–
Strongly JVM and OS dependent.
PADTAD, London 07/2007
17
Healing atomicity races
●
Adding synchronization actions
–
Suitable use of mutexes
healingMutex.lock();
...
healingMutex.unlock();
●
Idea:
–
●
Pros
–
●
To prevent accesses being simultaneous.
Heal the race.
Cons
–
Could introduce deadlock.
–
Exceptions must be covered.
PADTAD, London 07/2007
18
Healing inherent races
●
Distinguish between good and bad orders
●
Enforce good order
–
●
Change the scheduling of the program.
Override bad order
–
Concentrate on multiple write accesses.
–
Doesn't prevent bad order from occurring.
PADTAD, London 07/2007
19
Demonstration of the healing
static class Flight {
private int soldSeats;
...
Flight(){
soldSeats = 0;
...
}
...
boolean bookTicket(){
soldSeats++;
}
...
}
Healing by influencing scheduler:
boolean bookTicket(){
Thread.yield();
soldSeats++;
}
Healing by synchronization:
boolean bookTicket(){
raceLock.lock();
soldSeats++;
raceLock.unlock();
}
PADTAD, London 07/2007
20
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
21
Healing assurance
●
●
Static analysis and/or bounded model checking
–
Reduce false alarms during detection and
localisation.
–
Ensure that a new bug cannot be introduced.
–
Help to choose suitable healing method.
... future work
PADTAD, London 07/2007
22
Agenda
●
Introduction
●
Self-Healing steps:
–
Problem detection
–
Problem localisation
–
Problem healing
–
Healing assurance
●
Preliminary results and experiments
●
Discussion
PADTAD, London 07/2007
23
Preliminary results
●
●
Implemented race detector is able:
–
To detect wrong locking policy using Eraser
algorithm.
–
To detect load-store atomicity bug pattern.
–
To localise the bug and give enough information to
the developer.
–
To heal founded bug by influencing scheduler and
also by introducing additional synchronization.
Architecture available also as an Eclipse plug-in.
PADTAD, London 07/2007
24
Experiments
PADTAD, London 07/2007
25
Thank you
PADTAD, London 07/2007
26
Eraser algorithm
Candidate_locks(x) := Candidate_locks(x) ∩ Locks_held(x);
if Candidate_locks(x) = { }, then issue a warning
Candidate_locks = set of locks used to protect variable
Locks_held = set of locks owned by thread
PADTAD, London 07/2007
27
Architecture overview
Original bytecode
...
load x
...
Instrumented bytecode
...
BeforeVarRead
load x
AfterVarRead
...
PADTAD, London 07/2007
28