Presentation

Amiram Yehudai
Shmuel Tyszberowicz
Dor Nir
Problem introduction.
Regression bug – definition.
Proposed solution.
Experimental results.
Future work.
Given that a bug exists, we want to
locate the place in the source code that
causes it.
[Figure: bug found → locating → bug location]
[Figure: a test case drawn as a sequence of Command steps with a Checkpoint among them]
Command:
Click on the button.
Enter text into an edit box.
Insert a record into the DB.
…
Checkpoint:
Check that the button is enabled.
Check that a certain text is shown in the edit box.
Check the value in a record of a DB.
…
Regression bug
Specifications
1. X
2. Y
3. Z
Version 1
Release
Changes in code
Version 2
Bug… but no regression
Specifications
1. X
2. Y
3. Z
4. A
5. B
Regression bugs occur whenever
software functionality that previously
worked as desired stops working, or
no longer works as planned.
Typically regression bugs occur as an
unintended consequence of program
changes.
What is the cause for the regression bug?
Version 1
Changes in code
Version 2
What is the change that causes the
regression bug?
C – A checkpoint that failed when running a test-case.
V – The last version of the AUT where checkpoint C still passed when running
the test-case.
We want to find in the source code of
the AUT the locations p1 , p2 ... pn that
caused C to fail.
The code psychologist tool
[Figure: tool pipeline. Input: the failed checkpoint and the last passed version. Stages shown: SCT, changes (Change 1, Change 2, Change 3, Change 4, Change 5, …), Change Sound Filter, heuristics; the stages are labeled first phase and second phase. Output: relevant changes (1. Change n1, 2. Change n2, 3. Change n3, …).]
The code psychologist tool
Database of source code.
Very common in software development.
Check-in / check-out operations.
History of versions.
Differences between versions.
Retrieve changes submitted after version V.
The number of retrieved changes can be large.
The code psychologist tool
Retrieving relevant changes.
Soundness – the output of the CSF
must contain the changes that cause
the regression bug.
Tests
Check text in message box
File t.xml was created successfully
“SELECT NAMES from Table1” is not empty
Source code
Windows.cpp
errMessages.cpp
File.cs
IO.cs
C:\code\windows
DB project
Filtering refactoring changes.
Changes in comments.
Using profiler information to filter
irrelevant changes.
Code that was not executed could not
cause the regression.
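The two filters above (skip comment-only changes, skip changes whose code never ran) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the `Change` record, its `comment_only` flag, and the `executed` set of (file, line) pairs taken from a profiler run are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record for one change retrieved from source control."""
    change_id: int
    file: str
    lines: tuple                # line numbers touched by the change
    comment_only: bool = False  # assumed to be computed by an earlier diff step


def filter_changes(changes, executed):
    """Keep changes that touch at least one executed line and are not comment-only.

    `executed` is an assumed set of (file, line) pairs recorded by a profiler
    while the failing test case ran.
    """
    kept = []
    for change in changes:
        if change.comment_only:
            continue  # a change in comments cannot alter behavior
        if any((change.file, line) in executed for line in change.lines):
            kept.append(change)
    return kept


changes = [
    Change(1, "Windows.cpp", (10, 11)),
    Change(2, "IO.cs", (40,)),
    Change(3, "Windows.cpp", (12,), comment_only=True),
]
executed = {("Windows.cpp", 10), ("Windows.cpp", 12)}
kept_ids = [c.change_id for c in filter_changes(changes, executed)]
```

In this toy run only change 1 survives: change 2 touches code that never executed, and change 3 only edits a comment.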
The code psychologist tool
Rank changes.
Not conservative.
Each heuristic has different weight.
$\mathrm{Rank}(p) = \sum_{i=1}^{|H|} w_i \cdot \mathrm{HeuristicRank}_i(p)$
where $w_i$ is the weight of heuristic $i$.
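As a rough sketch of the weighted combination of heuristic ranks, the snippet below scores each change by a weighted sum and sorts the changes by that score. The score table, the two heuristic functions, and the weight values are invented for illustration only.

```python
def combined_rank(change, heuristics, weights):
    """Weighted sum of the individual heuristic scores for one change."""
    return sum(w * h(change) for h, w in zip(heuristics, weights))


# Hypothetical per-heuristic scores for two changes.
scores = {
    "change 1": {"code_lines": 0.2, "check_in": 0.5},
    "change 2": {"code_lines": 0.6, "check_in": 0.4},
}


def code_lines_heuristic(change):
    return scores[change]["code_lines"]


def check_in_heuristic(change):
    return scores[change]["check_in"]


heuristics = [code_lines_heuristic, check_in_heuristic]
weights = [2.0, 1.0]  # assumed weights; each heuristic has its own

# Most suspicious change first.
ranked = sorted(
    scores,
    key=lambda c: combined_rank(c, heuristics, weights),
    reverse=True,
)
```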
Affinity – “Close connection marked by
similarity in nature or character”
Measure affinity between words.
(chair, house) < (chair, table) < (chair, chair)
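$\mathrm{WrdAff}(a,b) = \frac{1}{\mathrm{Distance}(a,b)}$
[Figure: word taxonomy. Object → artifact; artifact → instrumentality → conveyance, transport → vehicle → wheeled vehicle → {automotive, motor; bike, bicycle}; artifact → article → ware → tableware → cutlery, eating utensil → fork.]
$\mathrm{WrdAff}(\text{bike}, \text{fork}) = \frac{1}{10}$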
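For word groups $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_m\}$:
$\mathrm{AsyGrpAff}(A,B) = \frac{1}{n}\sum_{i=1}^{n} \max\{\mathrm{WrdAff}(a_i,b_j) \mid 1 \le j \le m\}$
$\mathrm{GrpAff}(A,B) = \frac{\mathrm{AsyGrpAff}(A,B) + \mathrm{AsyGrpAff}(B,A)}{2}$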
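The word and group affinity measures can be tried out on a small taxonomy. The sketch below is only an illustration: the `PARENT` table is a hand-written toy tree loosely following the one on the slide, the distance is the number of edges on the path through the closest common ancestor, and giving identical words an affinity of 1.0 (where the distance would be 0) is an assumption made here to avoid dividing by zero.

```python
# Hypothetical toy taxonomy, child -> parent.
PARENT = {
    "artifact": "object",
    "instrumentality": "artifact",
    "article": "artifact",
    "conveyance": "instrumentality",
    "vehicle": "conveyance",
    "wheeled vehicle": "vehicle",
    "bike": "wheeled vehicle",
    "ware": "article",
    "tableware": "ware",
    "cutlery": "tableware",
    "fork": "cutlery",
}


def ancestors(word):
    """Path from the word up to the root, starting with the word itself."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def distance(a, b):
    """Number of edges between a and b via their closest common ancestor."""
    steps_from_b = {w: i for i, w in enumerate(ancestors(b))}
    for steps_from_a, w in enumerate(ancestors(a)):
        if w in steps_from_b:
            return steps_from_a + steps_from_b[w]
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def wrd_aff(a, b):
    d = distance(a, b)
    return 1.0 if d == 0 else 1.0 / d  # assumption: identical words score 1.0


def asy_grp_aff(A, B):
    """Average, over words in A, of the best affinity to any word in B."""
    return sum(max(wrd_aff(a, b) for b in B) for a in A) / len(A)


def grp_aff(A, B):
    """Symmetric group affinity: mean of the two asymmetric values."""
    return (asy_grp_aff(A, B) + asy_grp_aff(B, A)) / 2
```

With this toy tree, `distance("bike", "fork")` walks five edges up to `artifact` and five back down, which matches the 1/10 example on the slide.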
The code psychologist tool
Code Lines Affinity.
Check-in comment affinity.
File Affinity.
Function Affinity.
The code psychologist tool
Human factor:
Programmer's history.
Time of change.
Late at night.
Close to release deadline.
Code complexity
Number of Branches.
Concurrency.
C++.
MFC framework.
891 files in 29 folders.
3 million lines of code.
Visual SourceSafe.
3984 check-ins.
Results:
Bug | Code Lines | Check-in | File Affinity | Functions | Simple Average | Weighted Average
1   | 5          | 3        | 9             | 1         | 1              | 1
2   | -          | 1        | 24            | -         | 7              | 3
3   | 5          | 3        | 3             | 1         | 1              | 1
4   | -          | -        | -             | 6         | 6              | 5
5   | 2          | 1        | 4             | 1         | 4              | 1
Results with file grouping:
Bug | Code Lines | Check-in | File Affinity | Functions | Simple Average | Weighted Average
1   | 1          | 3        | 9             | 1         | 1              | 1
2   | 9          | 1        | 24            | -         | 3              | 2
3   | 3          | 3        | 3             | 2         | 1              | 1
4   | -          | -        | -             | 3         | 9              | 4
5   | 1          | 1        | 4             | 1         | 1              | 1
Locating the bug took 20 hours of
strenuous work of two experienced
programmers.
Fixing the bug took less than an
hour.
Heuristic                 | Rank (group by file)
Code Line Affinity        | 7
Check-in comment Affinity | -
File Affinity             | 22
Function Affinity         | 8
Average                   | 4
Implementing the human factor and the
code complexity heuristics.
Learning mechanism – Automatic tuning of
heuristics.
More experiments on “real world”
regression bugs.
The code psychologist tool
Code line affinity:
$\mathrm{Rank}_1(C,P) = \alpha \cdot \mathrm{GrpAff}(W(C), W(P)) + \frac{1}{L}\sum_{l=1}^{L} \mathrm{GrpAff}(W(C), W(P,l))$
W(P, l) = group of words in the source code located l lines from the
change P.
α – coefficient that gives different weight for lines inside the change.
Check-in comment affinity:
$\mathrm{Rank}_2(C,P) = \mathrm{GrpAff}(W(C), W(\mathrm{Checkin}(P)))$
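A minimal sketch of the check-in comment heuristic follows. Two pieces are assumptions made for the example: `words` is one possible way to build the word group W(·) from free text or identifiers, and `exact_match_aff` is a crude stand-in for WrdAff that only recognizes identical words. The checkpoint and check-in texts are invented.

```python
import re


def words(text):
    """Split text or identifiers into lowercase words (a stand-in for W)."""
    result = []
    for token in re.findall(r"[A-Za-z]+", text):
        # Split camelCase and acronyms: "errMessages" -> "err", "messages".
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", token)
        result.extend(part.lower() for part in parts)
    return result


def exact_match_aff(a, b):
    """Stand-in word affinity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def asy_grp_aff(A, B, wrd_aff=exact_match_aff):
    return sum(max(wrd_aff(a, b) for b in B) for a in A) / len(A)


def grp_aff(A, B, wrd_aff=exact_match_aff):
    return (asy_grp_aff(A, B, wrd_aff) + asy_grp_aff(B, A, wrd_aff)) / 2


def rank2(checkpoint_text, checkin_comment):
    """Affinity between the checkpoint words and the check-in comment words."""
    return grp_aff(words(checkpoint_text), words(checkin_comment))


score = rank2("Check text in message box", "Fixed message box text")
```

Three of the five checkpoint words and three of the four comment words find a match, so the score is the mean of 3/5 and 3/4.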
The code psychologist tool
File affinity:
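$\mathrm{MaxAff}(a,B,\mathit{map}) = \max\{\mathrm{WrdAff}(a,b_i) \cdot \mathit{map}[b_i] \mid 1 \le i \le m\}$
$\mathrm{HstAff}(A,B,\mathit{map}) = \frac{\sum_{i=1}^{n} \mathrm{MaxAff}(a_i,B,\mathit{map})}{n \cdot \max\{\mathit{map}[b_j] \mid 1 \le j \le m\}}$
$\mathrm{Rank}_3(C,P) = \mathrm{HstAff}(W(C), W(F), \mathrm{Hstg}(F))$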
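The MaxAff and HstAff definitions can be illustrated with a short sketch. It assumes that Hstg(F) is a histogram mapping each word of file F to its number of occurrences, and it uses an exact-match function as a stand-in for WrdAff; both choices, and the sample words, are assumptions for this example.

```python
from collections import Counter


def exact_match_aff(a, b):
    """Stand-in word affinity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def max_aff(a, B, hist, wrd_aff=exact_match_aff):
    """Best affinity of word a to any word in B, weighted by that word's count."""
    return max(wrd_aff(a, b) * hist[b] for b in B)


def hst_aff(A, B, hist, wrd_aff=exact_match_aff):
    """Average weighted affinity, normalized by the largest count in hist."""
    largest_count = max(hist[b] for b in B)
    total = sum(max_aff(a, B, hist, wrd_aff) for a in A)
    return total / (len(A) * largest_count)


# Hypothetical words found in a source file, and their histogram.
file_words = ["file", "open", "file", "close", "file", "file"]
hist = Counter(file_words)
B = list(hist)                 # distinct words of the file
A = ["save", "file"]           # hypothetical checkpoint words
score = hst_aff(A, B, hist)
```

Dividing by the largest count keeps the result between 0 and 1 as long as the word affinity itself stays in that range.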
The code psychologist tool
Function affinity:
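$\mathrm{Rank}_4(C,P) = \mathrm{FuncAff}(C, \mathrm{func}(P))$
FuncAff(C, f) is defined from $\mathrm{GrpAff}(W(C), W(f))$, $\mathrm{GrpAff}(W(C), \mathrm{Bdy}(f))$, and the average $\frac{1}{k}\sum_{i=1}^{k} \mathrm{FuncAff}(C, \mathrm{FncCall}(f,i))$ over the k function calls in f.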
Check point
Select "clerk 1" from the
clerk tree (clerk number 2).
Go to the next clerk.
The next clerk is "clerk 3"