Application Programming Interfaces
Why?
I want my code to have access to your code or data... from a different
computer!
    we might be using different operating systems!

    different programming languages!

    have different compression capabilities!

    security!

    etc.

At least you don't have to install tons of code or download all of the data.
The Internet Suggests a Solution
HyperText Transfer Protocol: HTTP

    Since the WWW has caught on, HTTP has become a dominant protocol.

    Pretty much all computers support some kind of HTTP client

    Browsers are just fancy HTTP clients

    R can be a client too!

Duncan Temple Lang's RCurl package offers R access to libcurl, a popular HTTP library.
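
As a minimal sketch of R acting as an HTTP client, the snippet below builds a query URL in base R. The host api.example.com and the parameter names source and gram are placeholders invented for illustration, not a real service. The commented-out last line shows where RCurl's getURL would send the request.

```r
# Build a GET URL from a base address and a named list of query parameters.
build_url <- function(base, params) {
  # Percent-encode each value so spaces and punctuation survive transport.
  encoded <- vapply(params,
                    function(v) URLencode(as.character(v), reserved = TRUE),
                    character(1))
  paste0(base, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

# Hypothetical endpoint and parameter names, for illustration only.
url <- build_url("https://api.example.com/counts",
                 list(source = "twitter_sample", gram = "Ron Paul"))
print(url)

# With RCurl installed, the request itself would be a single call:
# response <- RCurl::getURL(url)
```

The response body comes back as a plain character string, so the next question is how to turn that string into R data.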
But what data will we transfer?
HTTP gives us a nearly universal way to pass data between machines. Now we have to decide what format
the messages ought to have.

    Let's choose something lightweight and human readable

        (so no XML :p)

    but it should be something easily serializable, and should have some structure

        JSON is the popular choice
JSON
JSON looks like this:

    {
        "hello"        :   "world",
        "universe"     :   42,
        "pizza"        :   null,
        "cookies"      :   ["chocolate", "molasses", "oatmeal"],
        "eggs"         :   {
                               "over" : "easy"
                           }
    }

JSON has types, can be nested, and has analogues (e.g. 'dicts', 'hashes', or 'maps') in most major programming
languages.

smells like a list in R

The RJSONIO package, also by Duncan Temple Lang, converts R lists to and from their JSON representations.
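
A short sketch of that round trip, assuming the RJSONIO package is installed. It parses the sample document from this slide with fromJSON and serializes the result back with toJSON. The exact R types returned (list versus vector) depend on RJSONIO's simplification settings, so the code uses [[ ]] indexing, which works for both.

```r
library(RJSONIO)

json_text <- '{
  "hello":    "world",
  "universe": 42,
  "pizza":    null,
  "cookies":  ["chocolate", "molasses", "oatmeal"],
  "eggs":     { "over": "easy" }
}'

# JSON object -> named R list; JSON null -> NULL.
parsed <- fromJSON(json_text)

print(parsed$hello)            # the string value stored under "hello"
print(parsed$cookies[[2]])     # second element of the array
print(parsed$eggs[["over"]])   # value from the nested object

# R list -> JSON text again.
round_trip <- toJSON(parsed)
cat(round_trip, "\n")
```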
Numerous Examples
Computational
    geocoding, Google, et al.

    face-recognition, face.com

    prediction, Google

Data
    Federal Register

    Bloomberg

"Data APIs/feeds available as packages in R"
asked on stats.stackexchange.com a couple of months ago. The list of packages included:

quantmod , tseries , fImport , WDI , RGoogleTrends , RGoogleDocs , twitteR , Zillow , RNYTimes ,
UScensus2000 , infochimps , rdatamarket , factualR , RDSTK , RBloomberg , LIM , RTAQ , IBrokers ,
rnpn , RClimate
API example: TopicWatch


TopicWatch is a platform for text analytics and visualization

    currently developing 3 interfaces to the API:

        iPad app

        web app

        R package

We collect streaming data from a variety of sources including Twitter, RSS feeds, government publications,
and others.
API Outline
The API is still under development and is unstable. We're always adding new features and polishing old ones.
Just a few concrete capabilities that are already running:

    time series of n-gram frequencies & counts

        aggregated at several resolutions

    n-grams ranked by frequency

        also aggregated at several resolutions

        can be filtered by sub-grams

    raw documents that contain a gram

    topics that contain a gram

    time series counts of documents that contain co-occurring n-grams

    ranking grams by usage change between any two times
TopicWatchr
The R package is a thin wrapper for the HTTP API. It (unsurprisingly) works
by
   sending a request to a URL

   parsing JSON results

   re-arranging lists into data frames
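
The last of these steps can be sketched in base R. The record layout below (fields time, gram, frequency) is an assumption made for illustration, not the actual TopicWatch response format, and the UTC time zone is likewise assumed. It is also where timestamp parsing would fit.

```r
# Stand-in for the output of a JSON parser: one list per record.
parsed <- list(
  list(time = "2011-11-15 08:00:00", gram = "Mitt Romney", frequency = 0),
  list(time = "2011-11-15 08:30:00", gram = "Mitt Romney", frequency = 0.00148)
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(parsed, function(rec) as.data.frame(rec, stringsAsFactors = FALSE))
df   <- do.call(rbind, rows)

# Parse the timestamp strings into POSIXct so they sort and plot correctly.
df$time <- as.POSIXct(df$time, tz = "UTC")

str(df)
```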

But it has some nice functionality to make working with the API a bit
smoother:
   parses timestamps in data

   paginates large requests automatically

   handles authentication
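
Automatic pagination might look roughly like the sketch below. The offset and limit parameters and the fake_fetch function are hypothetical; the real package may page through results differently. The loop keeps requesting pages until one comes back short or empty.

```r
# Collect all items by calling fetch_page(offset, limit) until the data runs out.
fetch_all <- function(fetch_page, page_size = 100) {
  pages  <- list()
  offset <- 0
  repeat {
    page <- fetch_page(offset = offset, limit = page_size)
    if (length(page) == 0) break               # nothing left
    pages[[length(pages) + 1]] <- page
    if (length(page) < page_size) break        # short page: this was the last one
    offset <- offset + length(page)
  }
  unlist(pages, recursive = FALSE)             # flatten pages into one list
}

# Fake "server" holding 250 items, standing in for an HTTP call.
fake_data  <- as.list(1:250)
fake_fetch <- function(offset, limit) {
  if (offset >= length(fake_data)) return(list())
  fake_data[(offset + 1):min(offset + limit, length(fake_data))]
}

all_items <- fetch_all(fake_fetch, page_size = 100)
print(length(all_items))
```

Passing the fetch function as an argument keeps the paging logic separate from the HTTP details, so it can be exercised without a network connection.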
Example 1: Presidential Candidates
Code to get data:

    library(TopicWatchr)
    set_credentials("PRUG", "12345")

    candidates <- c("Herman Cain", "Mitt Romney", "Rick Perry",
                    "Newt Gingrich", "Ron Paul", "Michelle Bachmann",
                    "Jon Huntsman", "Rick Santorum")

    twitter_counts <- wordCounts("twitter_sample", candidates)
    rss_counts     <- wordCounts("rss-majorpapers", candidates)

The wordCounts function constructs the proper API call, makes the call, and arranges the results into a data
frame. Each data frame looks like this:

'data.frame':   5 obs. of 9 variables:
$ times            : POSIXct, format: "2011-11-15 08:00:00" "2011-11-15 08:30:00" ...
$ Herman Cain      : num 0 0.00148 0 0.00326 0.00274
$ Mitt Romney      : num 0 0.00148 0 0.00326 0.00548
$ Rick Perry       : num 0 0.00148 0 0 0
$ Newt Gingrich    : num 0 0.00148 0 0.00326 0
$ Ron Paul         : num 0 0 0 0 0
$ Michelle Bachmann: num 0 0 0 0 0
$ Jon Huntsman     : num 0 0 0 0 0
$ Rick Santorum    : num 0 0.00148 0 0 0


Then we combine data frames and polish with ggplot2 ...
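
One way the combining step could go, as a sketch rather than the code behind the final plot. It reshapes each wide data frame to long form, tags it with its source, and stacks the two. The toy data frame reuses two columns from the output above; reusing it for the RSS source is a placeholder only, and the helper to_long is a name introduced here.

```r
# Toy stand-in for one wordCounts() result (values taken from the output above).
times <- as.POSIXct("2011-11-15 08:00:00", tz = "UTC") + 1800 * 0:4
twitter_counts <- data.frame(
  times         = times,
  "Herman Cain" = c(0, 0.00148, 0, 0.00326, 0.00274),
  "Mitt Romney" = c(0, 0.00148, 0, 0.00326, 0.00548),
  check.names   = FALSE
)
rss_counts <- twitter_counts  # placeholder only; real RSS counts would differ

# Wide (one column per candidate) -> long (one row per time and candidate).
to_long <- function(df, src) {
  words <- setdiff(names(df), "times")
  data.frame(
    times     = rep(df$times, times = length(words)),
    candidate = rep(words, each = nrow(df)),
    frequency = unlist(df[words], use.names = FALSE),
    source    = src,
    stringsAsFactors = FALSE
  )
}

long <- rbind(to_long(twitter_counts, "twitter"),
              to_long(rss_counts, "rss"))

# Plot one line per candidate, one panel per source, if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = times, y = frequency, colour = candidate)) +
    geom_line() +
    facet_wrap(~ source)
}
```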
Final Result
Example 2: Likely Phrase Generator
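    # Return the second word of a two-word gram, e.g. "follow back" -> "back".
    lastGram <- function(g){
        strsplit(g, " ")[[1]][[2]]
    }

    # The slide leaves `twsrc`, `first`, and the loop bound undefined;
    # the values below are assumptions for illustration.
    twsrc   <- "twitter_sample"
    first   <- "statistics"   # seed word
    n_words <- 15             # number of words to generate

    # Most frequent bigram that starts with the seed word.
    vc <- topGrams(twsrc,
                   filter=first, limit=1,
                   m=1, n=2, prefix=TRUE,
                   resolution="daily")$gram

    phrase <- c()

    # Repeatedly take the second word of the current bigram, append it,
    # and look up the most frequent bigram starting with that word.
    for (i in 1:n_words){
        vc <- lastGram(vc)
        phrase <- c(phrase, vc)
        vc <- topGrams(twsrc, filter=vc, limit=1, m=1, n=2,
                   prefix=TRUE, dev_server=TRUE,
                   resolution="daily")$gram
    }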
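
The same greedy chain can be tried offline. In the sketch below a small lookup table stands in for the topGrams API call; the table entries are made up for illustration, loosely echoing the Twitter phrase shown on the next slide. The names top_bigrams, toy_top_gram, and generate_phrase are introduced here and are not part of TopicWatchr.

```r
# Toy lookup: for each word, the "most frequent" bigram starting with it.
top_bigrams <- c(please = "please follow",
                 follow = "follow back",
                 back   = "back please")

toy_top_gram <- function(word) top_bigrams[[word]]

# Second word of a two-word gram.
last_gram <- function(g) strsplit(g, " ")[[1]][[2]]

# Start from a seed word and greedily follow the top bigram at each step.
generate_phrase <- function(seed, n_words) {
  phrase <- seed
  gram   <- toy_top_gram(seed)
  for (i in seq_len(n_words - 1)) {
    word   <- last_gram(gram)
    phrase <- c(phrase, word)
    gram   <- toy_top_gram(word)
  }
  paste(phrase, collapse = " ")
}

result <- generate_phrase("please", 5)
print(result)  # "please follow back please follow"
```

Because each step always takes the single most likely continuation, the chain falls into a cycle as soon as it revisits a word, which is one plausible reason generated phrases can repeat themselves.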
'Likely' phrases from earlier today:
Twitter: "im going back :) lt3 please follow back :) lt3 please"

Technology RSS feeds: "user interface displays users click scheme federal trade commission ftc antitrust
complaint outside occupy wall street"

same source, seeded with the word "statistics": "statistics showing highlights google apps like behavioral
advertising refers obliquely suggested session sounded viable business edition"

Politics RSS feeds: "washington university battleground poll numbers superfan badge request may become
president obama administration asked whether congress approval"

Major papers RSS feeds: "percent stake throughout california chapter 11 years ago effectively sealed george
w street movement prefers birds early"

Federal Register: "revision incorporates provisions related investigative actions could result based upon fresh
prunes grown ornamentals ca fip"
Feeling Adventurous?
We're looking for beta testers for the R package! In Shackleton's words, what to expect:

...BITTER COLD, LONG MONTHS OF COMPLETE DARKNESS, CONSTANT DANGER, SAFE RETURN DOUBTFUL...

But it can still be fun! You can talk with me about it, or get in touch later at

homer@luckysort.com
That's all!
Thanks for listening. Questions?

APIs Unlock Data Access Across Systems
