David Thoumas, CTO of OpenDataSoft, on data API strategy (one rich API vs. multiple endpoints) for publishing data and doing business
Presented at APIdays 2012, the first European event dedicated to the API world
From open data to API-driven business
1. From Open Data to Open APIs
Future plants of sustainable innovation
David Thoumas – CTO
12/4/2012
with OpenDataSoft
think bigdata. start smartdata. scale fastdata
3. OpenDataSoft
the data driven innovation company
Full-featured cloud-based platform for
• operating one's own data-driven innovation hubs
• speeding up go-to-market for new data-centric services
• leveraging both cloud elasticity & big data
Smart metering, M2M, data API … thanks to its integrated Data Management &
Data Publishing platform, OpenDataSoft helps its customers set up and
operate their own open data platform for bringing new services to market,
leveraging cloud computing & big data technologies.
7. Open Data Strategy?
« Open data strategy should be a top priority for any
organization that uses the Web as a channel for
delivering goods and services … Open data APIs are a
lightweight approach to data exchange. Their use is now considered
a best practice for opening data and functionality to developers and
other businesses. »
(Gartner, August 2012)
8. What problem do we solve?
Removing the barriers between Data and Applications.
• For Business users / analysts
• Who know their data
• To think and design their applications
• With a quick go-live
• Without needing to instantiate a complex IT project
Building applications in days, not
months or years
9. How do we solve this problem?
Functionalities
Data procurement
• File extraction
• Pulling from remote sources (Web, DB, …)
• Pushing through APIs
• And sometimes, MDM
Data preparation
• Field mapping
• Simple transformations (geo coords, text, …)
• Complex transformations (geocoding, analytics, …)
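A minimal sketch of what field mapping plus a simple transformation could look like. This is an illustration only, not OpenDataSoft's implementation: the names FIELD_MAP, parse_coords and prepare, and the sample record, are hypothetical.

```python
# Hypothetical field mapping: source column name -> published field name.
FIELD_MAP = {"Nom": "name", "Coordonnees": "coords"}


def parse_coords(text):
    """Turn a 'lat, lon' string into a (lat, lon) tuple of floats."""
    lat, lon = (float(part) for part in text.split(","))
    return (lat, lon)


def prepare(record):
    """Rename mapped fields, drop unmapped ones, then normalize values."""
    out = {FIELD_MAP[k]: v for k, v in record.items() if k in FIELD_MAP}
    out["name"] = out["name"].strip().title()   # simple text transformation
    out["coords"] = parse_coords(out["coords"])  # simple geo transformation
    return out


# Sample input record (invented for the example).
raw = {"Nom": "  gare de lyon ", "Coordonnees": "48.8443, 2.3743", "tmp": "x"}
print(prepare(raw))
```

Complex transformations such as geocoding would typically call an external service and are left out of this sketch.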
Data publishing
• Simple Web UI (navigate the data)
• API Factory (search, faceting, aggregation)
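To make "faceting" and "aggregation" concrete, here is a small in-memory sketch. It assumes nothing about the platform's actual API: RECORDS, facet_counts and aggregate_sum are hypothetical names, and a real deployment would delegate this work to the search engine rather than loop in Python.

```python
from collections import Counter, defaultdict

# Invented sample dataset.
RECORDS = [
    {"city": "Paris", "type": "meter", "kwh": 12.0},
    {"city": "Paris", "type": "sensor", "kwh": 3.0},
    {"city": "Lyon", "type": "meter", "kwh": 8.0},
]


def facet_counts(records, field):
    """Faceting: count how many records carry each value of `field`."""
    return Counter(r[field] for r in records)


def aggregate_sum(records, group_by, value):
    """Aggregation: sum `value` per distinct value of `group_by`."""
    totals = defaultdict(float)
    for r in records:
        totals[r[group_by]] += r[value]
    return dict(totals)


print(facet_counts(RECORDS, "city"))
print(aggregate_sum(RECORDS, "city", "kwh"))
```

Facet counts are what a Web UI would use to build its navigation filters; the aggregation is the kind of result a dashboard or chart would consume.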
10. How do we solve this problem?
Technologies
NoSQL
• Schema less
Search
• Highly scalable
• New flavor of applications
• Mixing
– Traditional search approach
– And Big Data analytics
In the Cloud
… and Python …
because it’s cool!
• Sustained by a brilliant and huge community
• Allows for multi-faceted development patterns
11. How do we solve this problem?
Performance
One platform for use cases at any scale.
Store datasets of tens of millions of records or
even hundreds of millions of records.
With sub-second response times for simple
real-time analytical processing tasks.
12. And in the end?
API factory => Application factory
Smart Metering …
… Mobile applications
…
15. API Factory
Functionalities for rich modern applications
Cross datasets search
Full Text Search
Numerical and range search
Field search
Using standards
Geographical filtering
Analytical processing
Streaming of large result sets
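The following sketch shows how full-text search, range search, geographical filtering and streaming could combine in a single query. Everything here is an assumption made for illustration (the matches and stream_results functions, the bounding-box convention, the sample DATA); it does not describe the actual OpenDataSoft API.

```python
import json


def matches(record, text=None, field_range=None, bbox=None):
    """Return True if the record passes every filter that was supplied."""
    if text is not None:
        # Naive full-text search: substring match over all field values.
        haystack = " ".join(str(v) for v in record.values()).lower()
        if text.lower() not in haystack:
            return False
    if field_range is not None:
        # Range search: (field, low, high), bounds inclusive.
        field, low, high = field_range
        if not (low <= record[field] <= high):
            return False
    if bbox is not None:
        # Geographical filtering: (south, west, north, east) bounding box.
        south, west, north, east = bbox
        if not (south <= record["lat"] <= north
                and west <= record["lon"] <= east):
            return False
    return True


def stream_results(records, **filters):
    """Yield matching records one JSON line at a time, never a full list."""
    for record in records:
        if matches(record, **filters):
            yield json.dumps(record, sort_keys=True)


# Invented sample data.
DATA = [
    {"name": "Meter A", "kwh": 12, "lat": 48.85, "lon": 2.35},
    {"name": "Meter B", "kwh": 40, "lat": 45.76, "lon": 4.84},
    {"name": "Sensor C", "kwh": 5, "lat": 48.86, "lon": 2.29},
]

for line in stream_results(DATA, text="meter",
                           field_range=("kwh", 10, 50),
                           bbox=(48.0, 2.0, 49.0, 3.0)):
    print(line)
```

Because stream_results is a generator, a server could write each line to the response as it is produced, which keeps memory use flat even for very large result sets.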