1. Evolution of the mashup ecosystem by copying
Michael Weiss, Solange Sari
Technology Innovation Management (TIM)
www.carleton.ca/tim
weiss@sce.carleton.ca Mashups 2010 1
2. Objective
• Mashups are applications that combine data and
services provided through APIs with user data
• New application development model: opportunistic
programming, uses a bricolage approach
• Creation of mashups supported by an ecosystem of
data providers, mashup platforms, and users
• Research questions
– How do mashup developers select APIs?
– How do mashup developers learn to develop mashups?
3. Relevance
• Users/platforms: can benefit from/offer tools that
better support the way users work
• Directory providers: their role is to facilitate the
selection of APIs and the learning of developers
• Data providers: need to understand which other APIs
their APIs are used together with most (interoperability)
4. Previous work
• Examined structure and growth of mashup
ecosystem using visualization and network analysis
to identify members and their relationships
• Opportunistic programming studies how developers
use online resources in problem solving
• Research on innovation: (re)combination shortens
learning curve, modularity allows mix-and-match
• Models of network growth: preferential attachment
• Copying and duplication mechanisms in describing
the growth of the web and biological networks
5. Hypothesis
• To answer the research questions, we examine to
what degree developers create mashups by copying
other mashups, that is, by copying a mashup "blueprint"
[Figure: Number of copies per mashup. Log-log plot of cumulative probability against number of copies. Labeled blueprints include GoogleMaps, YouTube, Flickr, Amazon/Flickr, GoogleMaps/YouTube, Flickr/GoogleMaps, GoogleMaps/Twitter, and Amazon/GoogleMaps/YouTube; mashups without copies are marked "Not copied".]

ProgrammableWeb, snapshot on 08/16/10:
Mashups                4983   100%
Not copied             1528    31%
Blueprints              341     7%
Copies of blueprints   3114    62%
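The breakdown into not copied mashups, blueprints, and copies can be illustrated with a small sketch. It assumes, as one possible reading of the slide, that two mashups share a blueprint when they use exactly the same set of APIs; the function name `classify_mashups` and the sample data are hypothetical and not taken from the ProgrammableWeb snapshot.

```python
from collections import Counter

def classify_mashups(mashups):
    """Count not-copied mashups, blueprints, and copies of blueprints.

    Assumption: a blueprint is an API combination used by more than one
    mashup; one of those mashups counts as the blueprint, the rest as copies.
    """
    signatures = [frozenset(apis) for apis in mashups]
    uses = Counter(signatures)  # how many mashups use each API combination
    return {
        "mashups": len(signatures),
        "not_copied": sum(1 for n in uses.values() if n == 1),
        "blueprints": sum(1 for n in uses.values() if n > 1),
        "copies": sum(n - 1 for n in uses.values() if n > 1),
    }

# Hypothetical sample, not real directory data.
sample = [
    ["GoogleMaps"],
    ["GoogleMaps", "Flickr"],
    ["GoogleMaps"],
    ["Flickr", "GoogleMaps"],
    ["GoogleMaps"],
    ["Twitter", "YouTube"],
]
summary = classify_mashups(sample)
```

Under this reading the three categories always add up to the total number of mashups, as they do in the table above (1528 + 341 + 3114 = 4983).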
6. Copying model
• Mashup ecosystem as network of mashups and APIs:
a link indicates that a mashup uses an API
• Assumption: mashups all have m APIs
• Initialize network:
– Create m0 ≥ m APIs, one mashup
• Grow network from t=m0 + 1 to t=N:
– Add new API with probability p
– With probability 1-p, choose a mashup as a template
– For each API in template, copy the API with probability α, or
choose a new API at random with probability 1-α
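The growth steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' simulation code: the function name `simulate_copying` is hypothetical, templates and random APIs are drawn uniformly, and accidental duplicate APIs within a mashup are replaced by further random picks so that every mashup keeps m distinct APIs. The values p = 0.15 and α = 0.8 in the usage line are placeholders.

```python
import random

def simulate_copying(n_nodes, m=2, m0=2, p=0.1, alpha=0.8, seed=None):
    """Grow a mashup/API network until it holds n_nodes nodes in total."""
    rng = random.Random(seed)
    apis = list(range(m0))            # initial APIs, numbered 0..m0-1
    mashups = [rng.sample(apis, m)]   # one initial mashup using m random APIs
    while len(apis) + len(mashups) < n_nodes:
        if rng.random() < p:
            apis.append(len(apis))    # add a new, so far unused API
            continue
        template = rng.choice(mashups)  # existing mashup used as template
        chosen = set()
        for api in template:
            # copy this API with probability alpha, else pick any API at random
            chosen.add(api if rng.random() < alpha else rng.choice(apis))
        while len(chosen) < m:
            # assumption: duplicates are replaced by further random picks
            chosen.add(rng.choice(apis))
        mashups.append(sorted(chosen))
    return apis, mashups

# Placeholder parameters, chosen only for illustration.
apis, mashups = simulate_copying(n_nodes=500, p=0.15, alpha=0.8, seed=42)
```

With α = 1 and p = 0 every new mashup is a full copy of the initial one; lowering α mixes in more randomly selected APIs.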
7. Example
• Initial network: APIs 1 and 2, mashup 3
• Thin solid lines indicate random selection
[Figure: initial network with API nodes 1 and 2 and mashup node 3; legend: API, Mashup]
8. Example
• Growth: add a new mashup (4)
• Thick solid lines indicate "copies" relationship
• Thin dashed lines indicate copying
[Figure: network with APIs 1 and 2 and mashups 3 and 4; mashup 4 is labeled "Full copy"; legend: API, Mashup]
9. Example
• Growth: add a new API (5)
[Figure: network with APIs 1, 2 and 5 and mashups 3 and 4 (full copy); legend: API, Mashup]
10. Example
• Growth: add a new mashup (6)
• Thin solid lines indicate random selection
[Figure: network with APIs 1, 2 and 5 and mashups 3, 4 (full copy) and 6 (partial copy); legend: API, Mashup]
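The four example steps can be traced as plain data. The figures do not survive in this transcript, so the details of the partial copy are an assumption: the sketch supposes that mashup 6 copies API 1 from its template and picks API 5 at random, and that the initial mashup 3 uses both initial APIs.

```python
# Node ids follow the slides: APIs 1, 2, 5 and mashups 3, 4, 6.
apis = {1, 2}
mashups = {3: {1, 2}}          # initial network (assumed: mashup 3 uses APIs 1 and 2)

mashups[4] = set(mashups[3])   # step 1: mashup 4 is a full copy of mashup 3
apis.add(5)                    # step 2: new API 5 joins the network
mashups[6] = {1, 5}            # step 3: partial copy (assumed: API 1 copied,
                               # API 5 selected at random)

def degree(api):
    """Number of mashups that use the given API."""
    return sum(api in used for used in mashups.values())

degrees = {api: degree(api) for api in sorted(apis)}
```

Under these assumptions API 1 ends up with three mashups, API 2 with two, and API 5 with one, which shows how copying concentrates use on APIs that are already in blueprints.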
11. Research method
• Calibrate simulation parameters
– N: combined actual number of APIs and mashups
– m = 2: good approximation of average actual APIs / mashup
– p: number of APIs / N (all as of 08/16/10)
• Simulate mashup ecosystem evolution
– Vary α over range 0.0 to 1.0, keep m = 2 fixed
– Run each simulation multiple times and terminate when 95%
confidence interval is sufficient for the optimization
• Determine best fit of simulated distribution of
mashups / API with actual using two fitting methods:
sum of squared error fit, and power law fit
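The stopping rule for repeated runs might look like the sketch below. The slides do not say what makes a confidence interval "sufficient", so the criterion used here is an assumption: stop once the 95% half-width (normal approximation, 1.96 standard errors) falls below a fraction of the mean. The names `run_until_stable`, `rel_halfwidth`, `min_runs`, and `max_runs` are hypothetical, and the noisy stand-in function replaces an actual simulation run.

```python
import random
import statistics

def run_until_stable(run_once, rel_halfwidth=0.05, min_runs=5, max_runs=1000):
    """Repeat run_once() until the 95% CI of the mean is narrow enough."""
    samples = []
    while len(samples) < max_runs:
        samples.append(run_once())
        if len(samples) >= min_runs:
            mean = statistics.fmean(samples)
            # normal approximation: half-width = 1.96 * standard error
            half = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
            if half <= rel_halfwidth * abs(mean):
                break
    return statistics.fmean(samples), len(samples)

# Stand-in for one simulation run returning a single summary statistic.
rng = random.Random(0)
mean, runs = run_until_stable(lambda: rng.gauss(100, 10))
```

In a parameter sweep, a loop over α from 0.0 to 1.0 would call such a function once per α value, with `run_once` wrapping a single simulation at that α.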
12. Actual distribution
• Distribution of mashups / API follows Zipf's law:
plotting frequency of mashups relative to rank results
in a line with slope close to -1 in a log-log plot
[Figure: log-log plot of number of mashups against rank (actual data); labeled APIs: GoogleMaps, Flickr, Twitter, YouTube; fitted slope -0.990]
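The slope of a rank-frequency plot can be estimated with an ordinary least-squares line through the log-log points. The sketch below is one simple way to do this and is not necessarily the fitting procedure used for the slide; the function name `loglog_slope` and the synthetic data are hypothetical.

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank)."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data that follows Zipf's law exactly: count proportional to 1/rank.
ideal = [1000 / rank for rank in range(1, 201)]
slope = loglog_slope(ideal)
```

For data that follows Zipf's law exactly, the slope is -1 up to floating-point error; real rank-frequency data deviates at both ends of the range, so a fitted value such as -0.990 counts as a close match.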
13. Sum of squared error fit
• Underestimates contribution of top-ranked API
• Overestimates the number of APIs used by at least
one mashup by 45% (1020 vs 703)
[Figure, left panel: sum of squared error against copying factor (α). Right panel: log-log plot of number of mashups against rank, actual vs. simulated (sum of squared error), with α = 0.798]
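A sum of squared error fit compares the simulated and actual numbers of mashups per API and keeps the copying factor with the smallest error. The sketch below assumes the comparison is made rank by rank, with the shorter distribution padded with zeros; the slides do not spell this out. The function names and all numbers are hypothetical toy data.

```python
def sum_squared_error(actual, simulated):
    """Sum of squared differences between two rank-ordered distributions."""
    a = sorted(actual, reverse=True)
    s = sorted(simulated, reverse=True)
    size = max(len(a), len(s))
    a += [0] * (size - len(a))   # assumption: missing ranks count as zero
    s += [0] * (size - len(s))
    return sum((x - y) ** 2 for x, y in zip(a, s))

def best_alpha(actual, simulated_by_alpha):
    """Return the copying factor whose simulated distribution fits best."""
    return min(simulated_by_alpha,
               key=lambda alpha: sum_squared_error(actual, simulated_by_alpha[alpha]))

# Toy numbers of mashups per API, not taken from the study.
actual = [50, 20, 10, 5, 1]
candidates = {
    0.2: [30, 25, 15, 10, 6],
    0.8: [48, 22, 9, 5, 2],
    1.0: [86],
}
winner = best_alpha(actual, candidates)
```

Because the errors are squared, the few high-degree APIs dominate the total, so this criterion mainly rewards matching the head of the distribution in absolute terms.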
14. Power law fit
• Slightly overestimates contribution of top API
• Overestimates the number of APIs used by at least
one mashup by 22% (859 vs 703)
[Figure, left panel: power law coefficient error against copying factor (α). Right panel: log-log plot of number of mashups against rank, actual vs. simulated (power law), with α = 0.855]
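The overestimation percentages quoted for the two fits follow from the reported counts of APIs used by at least one mashup (703 actual, 1020 for the sum of squared error fit, 859 for the power law fit). The helper name `overestimate_pct` is hypothetical.

```python
def overestimate_pct(simulated, actual):
    """Relative overestimate of the simulated count, in percent."""
    return 100 * (simulated - actual) / actual

sse_fit = overestimate_pct(1020, 703)        # sum of squared error fit
power_law_fit = overestimate_pct(859, 703)   # power law fit
```

Rounded to whole percentages these give 45% and 22%, matching the figures on the two slides.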
15. Cumulative contribution of APIs
• Sum of squared error fit underestimates number of
APIs that contributed to 50% of API uses
• Power law fit overestimates number of APIs that
contributed to 50% of API uses
[Figure: cumulative contribution against rank (logarithmic rank axis)]
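The number of APIs that account for a given share of API uses can be read off the cumulative rank distribution. A minimal sketch, with a hypothetical function name `apis_for_share` and toy data:

```python
from itertools import accumulate

def apis_for_share(counts, share=0.5):
    """Smallest number of top-ranked APIs covering `share` of all API uses."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    for rank, running in enumerate(accumulate(ranked), start=1):
        if running >= share * total:
            return rank
    return len(ranked)

# Toy numbers of mashups per API, not taken from the study.
uses = [50, 20, 10, 10, 5, 5]
half = apis_for_share(uses)            # top API alone covers 50 of 100 uses
most = apis_for_share(uses, share=0.8)
```

Applying the same function to the actual and simulated distributions gives the comparison in the bullets above: a fit underestimates this number when its simulated uses are more concentrated in the top-ranked APIs than the actual ones.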
16. Discussion
• Both methods obtained their best fit for a high
copying factor: this suggests that most mashups are
created by modifying an existing blueprint
• Power law fit more closely approximates the actual Zipf
distribution; however, sum of squared error fit offers a
better match of actual degrees of APIs in the midrange
17. Insights for stakeholders
• Confirmation of practices directories follow:
– List combinations of APIs into mashups
– Keep track of developers of mashups
– Provide tutorials on mashup development
• Directory providers should make blueprints more
apparent: also list frequency of blueprints
• Users benefit as they can look at blueprints to select
APIs that work well together and as examples
• API providers learn which other APIs are frequently
combined with their API: incentive to interoperate
18. Conclusion
• Results indicate that copying plays a significant role
in the evolution of the mashup ecosystem
• However, we cannot rule out other factors that could
explain how the mashup ecosystem grows
• Copying hypothesis in line with current thinking about
innovation: e.g., W. Brian Arthur's The Nature of Technology
• Other current and future work:
– Extend simulation to include mashups of different sizes
– Test copying hypothesis empirically: we are currently examining
hereditary relationships between mashups
– Examine link between copying and diversity of ecosystem