slides

Sociology of
Programming
Languages
LEO A. MEYEROVICH
@LMEYEROV
BERKELEY // SWIVEL.IO
ARIEL S. RABKIN
@ARITALKING
PRINCETON
1 Sociology of
Programming
Languages ^
ADOPTION
2 MAN AND THE MACHINE
CULT OF PERSONALITY
3 ENIAC
4 2011
5 PL Blind Spot: Social
Design
I can only stand on shoulders of giants
for intrinsic feature design
Performance
parallel algorithms/compilers/
synthesis
Abstraction
concurrency/dynamism (FRP)
6 Zeroing In
“Psychological aspects of programming and in
the computational aspects of psychology”
ICSE, FSE,
MSR, CHASE,
…
CSCW
Computer Supported
Cooperative Work and
Social Computing
7 Sociology of
Languages
Principles
Languages
Standalone topic!
[Implications for Design, Paul Dourish, CHI’06]
8 Why Start with Adoption?
[P. Coburn;
switching costs]
Change Function threshold to adopt:
(perceived adoption need) / (perceived adoption pain) > 1
FP!!!
new language
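As a rough illustration of the Change Function threshold on this slide, the ratio test can be written as a few lines of Python. This is only a sketch: the function names and the 0-10 scores are hypothetical and do not come from the talk or from Coburn.

```python
def change_function(perceived_need: float, perceived_pain: float) -> float:
    """Ratio of perceived adoption need to perceived adoption pain."""
    if perceived_pain <= 0:
        # With no perceived pain, any positive need clears the threshold.
        return float("inf") if perceived_need > 0 else 0.0
    return perceived_need / perceived_pain


def would_adopt(perceived_need: float, perceived_pain: float) -> bool:
    """Adoption is predicted only when the ratio exceeds 1."""
    return change_function(perceived_need, perceived_pain) > 1


# Hypothetical scores on an arbitrary 0-10 scale (assumed for illustration).
high_pain = would_adopt(perceived_need=6, perceived_pain=9)  # e.g. a whole new language
low_pain = would_adopt(perceived_need=6, perceived_pain=2)   # e.g. a small change
```

The same need leads to different outcomes once the pain term changes, so lowering the denominator is one lever a designer controls.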

9 Why Start with Adoption?
“From now on, my goal in life
would be to also drive the
denominator down to zero”
- Erik Meijer
Confession of a Used Language
Salesman
10 Why Start with Adoption?
FP!!!
new language

FP!!
same language

11 SOCIAL THEORIES
[Onward! ‘12]
Adoption Model?
Decision Making?
Acquisition?
ANALYSIS OF
200K PROJECTS
AND 20K DEVS
[PLATEAU ‘12, OOPSLA ‘13]
Challenge Problems: Design &
12 Well-Studied Social Theories of
Adoption
[Onward! ‘12] optimize for
longevity
Technology
Toys
Economics rational
Music
Medicine quantitative Religion
Policy
Linguistics
…
…
“different”, not “better”
13 [Mark, 1998] Ecological Theory
Music is fun with friends!
Can’t listen to all music…
14 [Mark, 1998] Social Network ~ Preferences
Ecological
Theory
Music competes
for social
networks,
not individuals
15 [PLATEAU 2013] 200K
Projects (2000-2010)
16 Popularity Across Niches
[Figure: popularity (%) across 223 project categories. Java panel (0% to 60%), labeled categories: blogging, search. Scheme panel (0% to 4%), labeled category: build tools.]
17 Popularity vs. Niche: Dispersion
[Figure: popularity (log scale, 0.0001 to 1) vs. dispersion across niches (σ / µ, 0 to 5). Labeled languages: Java, C#, PL/SQL, Assembly, Fortran, VBScript, Scheme, Prolog.]
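The dispersion axis (σ / µ) is a coefficient of variation: the standard deviation of a language's share across niches divided by its mean share. A minimal sketch follows, assuming the population standard deviation (the slide does not say which estimator was used) and using made-up shares rather than the study's data.

```python
from statistics import mean, pstdev


def dispersion(shares):
    """Coefficient of variation (sigma / mu) of per-niche shares."""
    mu = mean(shares)
    if mu == 0:
        raise ValueError("dispersion is undefined when the mean share is zero")
    # Assumption: population standard deviation; the source does not specify.
    return pstdev(shares) / mu


# Hypothetical shares of one language in four project categories.
even_shares = [0.10, 0.10, 0.10, 0.10]   # used equally everywhere
spiky_shares = [0.40, 0.00, 0.00, 0.00]  # concentrated in a single niche

even_dispersion = dispersion(even_shares)
spiky_dispersion = dispersion(spiky_shares)
```

Both example languages have the same mean share, yet the second is far more dispersed, which is the distinction the plot draws between general-purpose and niche languages.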
18 Most Used Languages CDF (Ohloh)
[Figure: cumulative popularity (0% to 100%) by language. Languages shown: xml, html, css, c, shell, java, javascript, c++, python, make, php, bat, sql, ruby, c#. Callouts: "DSLs dominate", "winner takes all".]
19 Odds for Unpopular
Languages?
[Figure: proportion of projects for language (log scale, 0.0001% to 100%) vs. language rank (decreasing, 1 to 100). Series: SourceForge, Ohloh.]
BUGGY DATA
Sources only track
certain languages
20 Slash + + Wired
Survey
•  1,600 responses (2 days)
•  International audience
•  83% have at least 1 degree
•  73% are out of school
21 Use of Unpopular Languages
[Figure: proportion of projects for language (log scale, 0.0001% to 100%) vs. language rank (decreasing, 1 to 100). Series: SourceForge, Slashdot Survey, Ohloh.]
Long Tail
Design for niches
and grow
22 How Do People Pick
Languages?
http://bpodgursky.wordpress.com/2013/08/22/updates-to-language-vs-income-breakdown-post/
23 P(L’ | L)
p(popular) 75%
p(prev) 30%
L L’
24 Poll to Dig Deeper
Typically,
what factor most influences
language selection?
25 Polling Perceived Reality
In your last project,
what factor most influenced
language selection?
26 Survey of 1,679 Developers
Extrinsic factors
dominate!
(on last
project)
27 Demographics Matter
Probability of Using a Language on Last Project
28 Surveys of Biased Samples
< 20yr olds: correctness less important
massive open online course survey (MOOC)
Avg. Age: 37 / 30
Degree: 53% / 55%
Employed Dev: 92% / 62%
Female: 3% / 16%
Hobbyists learn quickly
More latent biases?
29 Sample Bias in
Repo
Software
Early Adopters: SourceForge
Late Adopters: GitHub etc.
Timeline: 2000, 2002, 2008, 2010
30 Cross-Validating Adoption of Java
Generics
Define: Class List<T>{…}
Invoke: n = new List<Int>()
Top 20 Projects [Parnin 2011] vs. People (online course)
14% 28% 44% 60%
vague self-reported jargon
How often do you create …: never, sometimes, …
How often do you invoke …: never, sometimes, …
Also: very different values for C++ templates
31 [Rogers 1963, Ryan & Gross 1943] Detailed Model: Diffusion of
Innovation
Adoption: 12 Years
500+ tech adoption studies
later…
32 [Rogers 1963] Diffusion of Innovation: Process
1. Knowledge
2. Persuasion
3. Decision
4. Trial
5. Confirmation
not so bad
talk + read
33 [Kelly 1991, Limanonda 1994 ] Actionable Example: Safe Sex
Process
  knowledge and persuasion
x  decision/trial/confirmation
Catalysts
  rel. advantage and simple
x  observability
x  trialability and compatibility
Sounds
like
static
typing...
34 Two Weekends to Spread Safe
Sex
[Kelly 1991, Limanonda 1994]
1. hang out at gay bars, identify opinion leaders
2. teach to promote, give visible badge
35 [Kelly 1991, Limanonda 1994] Safe Sex Intervention: Success!
[Figure: reported activity (0% to 80%), safe sex vs. unsafe sex, at 3 months and 3 years.]
36 Noteworthy Diffusion of
Innovations
Trialability
Coverity
post-mortem result
Relative advantage
Hadoop, EC2
niche, quant. benefit
Compatibility
jQuery > Scala > …
libraries, E-DSLs, JVMLs
Observability
jsFiddle / JS
shareable URLs
Simplicity
Scheme, Ruby, Scala
pay-as-you-go abstractions
& language-as-a-library
37 38 ? Adoption
Language
39 Goal? Adoption
Language
40 Process? Adoption
Language
41 Fuel Adoption
Language
42 JavaScript Posts over Time on StackOverflow
[Figure: new answers (log scale, 1,000 to 1,000,000) with a CDF axis (log scale, 0.001 to 1), 11/1/08 to 11/1/12.]
43 JavaScript Posts over Time on StackOverflow
[Figure: same axes. Series: cumulative answers, new answers.]
44 JavaScript Posts over Time on StackOverflow
[Figure: CDF axis on a linear scale (0 to 1). Series: cumulative answers, new answers, cumulative distribution.]
45 (Network Effect, Commons, …) Metcalfe’s Law
Developers: 1,000 – 1,000,000
Users: 100 – 1,000,000,000+
Artifacts: CPUs, libraries, program traces, REPL sessions, …
Network’s Value: O(N^2)
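To make the quadratic growth concrete, here is a small sketch of Metcalfe's Law using the developer range quoted on the slide. The function names are hypothetical, and counting undirected pairwise links is one common reading of the law, not necessarily the measure the authors had in mind.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct pairs among n participants: n(n-1)/2, i.e. O(N^2)."""
    return n * (n - 1) // 2


def relative_value(n_small: int, n_large: int) -> float:
    """How much more valuable the larger network is, if value scales as N^2."""
    return (n_large / n_small) ** 2


# Developer counts taken from the slide's range (1,000 to 1,000,000).
links_small = pairwise_links(1_000)
links_large = pairwise_links(1_000_000)
value_ratio = relative_value(1_000, 1_000_000)
```

Under this reading, a 1,000-fold increase in developers gives roughly a million-fold increase in potential connections.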
46 Data 1/2: Guided REPLs/APIs
Useful? Others?
Preprocessing?
install("fit"); import("fit")
a = fitdistr(data, distr="exp")
plot(a); summary(a)
Postprocessing? Others?
47 Data 2/2: Traces for Whitebox
Testing
input
null check
Z3: synthesize input1 for null, input2 for !null
[Sen, Engler, Godefroid..]
Synth can fail..
Augment facts with “TraceDB”
48 Empirical Tools 1/2: Data Rich
Packaging
Executable MWEs + Tests
traces, states, aliases, REPL sessions?
49 Empirical Tools 2/2: Analyzers
•  Survey design for language design &
prospecting
•  Rapid prototyping for social learning
•  Repo mining is being tackled by many people
50 Recap
1.  BIG GAP: “social” language principles &
designs
2.  Adoption: social literature & empirical analysis
Onward! 2012, PLATEAU 2012, OOPSLA 2013
3.  Empirical tools: needs instrument design
research
Surveys (MOOCs!) >> repository mining
4.  Big opportunities for social languages
51