Omer Tripp - Security analysis with WALA

security analysis with WALA
Omer Tripp
IBM TJ Watson
Workshop on WALA (WoW), 13 June 2015
collaborators
•
Marco Pistoia
•
Patrick Cousot
•
Radhia Cousot
•
Julian Dolby
•
Manu Sridharan
•
Pietro Ferrrara
•
Steve Fink
•
Sal Guarnieri
•
Takaaki Tateishi
•
Omri Weisman
•
Steve Teilhet
•
Ryan Berg
•
Shay Artzi
•
Lotem Guy
•
Roee Hay
motivation
threat evolution
1960
Program installation
only by experts
1970
1980
Program
sharing
1990
Program
downloading
2000
Mobile
code
Web
applications
2010
Mobile
applications
threats: XSS
Attacker’s
evil script
<SCRIPT>...</SCRIPT>
Web Application
Attacker’s evil script
executed using victim’s credentials
Attacker
Victim
threats: SQLi
String query = “SELECT * FROM users WHERE name=‘” + userName +
“’ AND pwd=‘” + pwd + “’”;
SELECT * FROM users WHERE name='jsmith' AND pwd='Demo1234'
Ouch!
SELECT * FROM users WHERE name='foo';drop table custid;--' AND pwd=''
threats: info leakage
threats: JS
1515
%%
of Fortune 500 websites have
exploitable security issues in
JavaScript.
According to an IBM study performed in 2010
DOM-based XSS
document.write(document.URL.substring(
pos,document.URL.length));
Open Redirect
var pos = document.location.href.indexOf("name=");
var val = document.location.href.substring(pos);
document.location.href = "http://" + val;
threats: mobile
* http://blackdiamondsolutions.com/wp-content/uploads/2013/04/state-of-software-security-report-volume5.pdf
owasp top ten
1. Cross-site scripting (XSS)
2. Injection flaws
3. Malicious file executions
4. Insecure direct object reference
5. Cross site request forgery (CSRF)
6. Information leakage and improper error handling
7. Broken authentication and improper session management
8. Unsecure cryptographic storage
9. Unsecure communications
10. Failure to restrict URL accesses
outline of our work
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level progression
2009
TAJ
2010
2011
PISA
2012
F4F
2013
Andromeda
2014
JSA
server-side security verification
TAJ (PLDI’09):
Java taint analysis
sti
Store
statement
Hybridcomputed
SDG based on
li
Load
statement
st1
ski
Sink-dispatch
statement
ci
Call statement
ri
Return statement
si
Other statement
l1
l2
st3
st4
Store-to-load
direct edge
computed using graph c1
reachability over a noheap SDG
c5
Slice in the
no-heap
SDG
r5
r1
l3
l4
l5
sk1
st5
c4
r4
c2
c3
s1
r3
s2
r7
r2
r8
st4
Load-to-store or load-
to-sink summary edge
No-heap SDG
edge
l2
preliminary pointer
analysis
st2
sk2
TAJ: design choices
●
●
●
●
pointer analysis: variant of Andersen’s analysis
custom context-sensitivity policy:
● unlimited-depth object sensitivity for Java collections (up
to recursion)
● one level of call-string context for factory methods
● one level of call-string context for taint APIs
● one-level receiver-object context sensitivity as default
field sensitivity
flow insensitive, though intraprocedurally flow sensitive
thanks to SSA
TAJ: WALA support
●
●
●
●
●
●
●
pointer analysis / context sensitivity
field sensitivity
synthetic IR for taint APIs
SSA representation
IFDS enhancements [contributed to WALA]
ICFGSupergraph [contributed to WALA]
.NET support [contributed to WALA]
IBM AppScan Developer Edition
Andromeda (FASE’13):
demand-driven taint analysis
Andromeda: WALA support
●
●
PA-free CG construction algs for Java (CHA, RTA, …)
efficient infrastructure for fixpoint iteration
IBM AppScan Source Edition
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
framework modeling
F4F (OOPSLA’13):
modeling Java web frameworks
effect of modeling
F4F: WALA support
●
●
●
synthetic IR for call-graph methods
customization of call-graph target resolution
(CustomContextSelector)
various knobs to tweak in pointer analysis
IBM AppScan Source Edition
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
sanitizer verification
PISA (ISSTA’11, TOSEM’13):
encoding into M2L
PISA: sanitizers from the wild
PISA: WALA support
●
●
●
uniform IR: JS, Java, …
pointer analysis: aliasing
SSA
IBM AppScan Source Edition
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
client-side security verification
Actarus (ISSTA’11):
JS taint analysis
NY times
WordPress
Actarus: handling aliases
Actarus: WALA support
●
●
JS IR, including
● property accesses
● arguments array
● lexical scoping
● function pointers
novel CG construction algs for JS [contributed to WALA]
IBM AppScan Source Edition
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
handling dynamic aspects of JS
JSA (ISSTA’14):
HTML/JavaScript, concrete URLs, …
black-box
scanner
issues
DOM
modeling
reduce scope
taint
analysis
find issues
string
analysis
eliminate false positives
JSA (ISSTA’14):
▪specialized string analysis using dynamic pieces of information (e.g., concrete URL)
"https://some-site/release/jsp/sso/login.html?..."
var str = document.URL;
var url_check = str.indexOf('login.html');
if (url_check > -1) {
result = str.substring(0,url_check);
result = result + 'login.jsp' +
str.substring((url_check+search_term.length),
str.length);
document.URL = result;
}
▪part controlled by attacker is unknown, but known prefix modeled precisely
URL as Source
http://www.mysite.com/folder/page?a=1&b=2#anchor
NOT CONTROLLED BY ATTACKER
CONTROLLED BY ATTACKER
JSA: WALA support
●
●
call-graph construction
IFDS / RHS (proved general enough!)
● ability to leverage domain knowledge
● CallNoneToReturnFlow as a means to plug in models
IBM AppScan Standard Edition
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
the high-level pieces
dynamically loaded client-side code
statically loaded client-side code
server-side code
security defenses: sanitizers and validators
libraries and frameworks
nowadays
PL / ML
PL / ML
PL / ML
PL / ML
PL / ML
http://malwaredetection.mybluemix.net/analyzer/
thoughts into the future
•
query-based analyses
•
more WALA support around IFDS / IDE
(already underway!!)
•
integration with machine-learning engines
like WEKA
•
integration with dynamic analyses
thank you!