Exploring Public APIs with
 MongoDB and Analytica
        Nosh Petigara
     nosh@analytica.com
          @noshp
Today
• MongoDB and public APIs

• What is Analytica?

• Demo
  – Analytica shell (Twitter data)
  – Analytica for Excel (StackOverflow data)
MongoDB and public APIs
• Most APIs talk JSON
   – MongoDB’s native JSON import

• APIs vary wildly (both within a single API and from one
  API to another)
   – MongoDB is schema-free

• Data import is only half the battle
   – MongoDB’s query language and aggregation
     framework
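Because a schema-free collection can hold documents of different shapes, code that reads them usually has to tolerate missing fields. The following is a minimal Python sketch of that idea, assuming the documents arrive as plain dicts. The helper `get_path` and both sample documents are hypothetical and are not part of MongoDB or of any API listed in this talk.

```python
from typing import Any


def get_path(doc: dict, dotted: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'user.followers_count' through nested dicts.

    Returns `default` as soon as a key is missing or a non-dict is reached.
    """
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Two hypothetical API responses with different shapes (illustrative values).
tweet = {"text": "hello", "user": {"followers_count": 42}}
company = {"name": "Example Inc", "funding_rounds": []}

print(get_path(tweet, "user.followers_count"))    # 42
print(get_path(company, "user.followers_count"))  # None
```

The same lookup works on both documents even though only one of them has a `user` sub-document, which is the situation a mixed-source collection tends to produce.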
Some data sets to explore
• Twitter API (JSON)
   – https://dev.twitter.com/

• Crunchbase API (JSON)
   – http://developer.crunchbase.com/

• StackOverflow (JSON and CSV)
   – http://data.stackexchange.com/

• NYTimes (JSON, XML)
   – http://developer.nytimes.com/docs
Importing data sets
• Streaming JSON directly into MongoDB
  – curl https://stream.twitter.com/1/statuses/sample.json -uUSERNAME:PASSWORD | ./mongoimport -d twitter -c tweets
• Importing JSON files
  – ./mongoimport -d mydb -c mycoll file.json
• CSV
  – ./mongoimport -d db -c coll --type csv --headerline myfile.csv
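Many REST endpoints return a single JSON array rather than a stream of one document per line. One way to prepare such a response for import is to rewrite it as newline-delimited JSON first (mongoimport also has a `--jsonArray` option for array input). The sketch below is an illustration only: the function name `to_ndjson` and the sample payload are assumptions, not something shown in the talk.

```python
import json


def to_ndjson(payload: str) -> str:
    """Convert a JSON array (or a single JSON object) to one document per line."""
    data = json.loads(payload)
    # Treat a lone object as a one-element list so both shapes are handled.
    docs = data if isinstance(data, list) else [data]
    return "".join(json.dumps(doc) + "\n" for doc in docs)


# Hypothetical API response body containing two documents of different shapes.
payload = '[{"id": 1, "tags": ["mongodb"]}, {"id": 2}]'
print(to_ndjson(payload), end="")
```

The printed text could then be saved to a file and passed to mongoimport in place of `file.json` in the command above.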
Analytica
• Analytics & reporting platform for MongoDB
  – Natively understands JSON/document hierarchy
  – Tailored for analytics (not querying)
  – Works directly on MongoDB


• Discovery, analysis, visualization cycle

• In private beta [http://analytica.com]
What can you do with Analytica?
• Inspect and extract data

• Augment your data model

• Calculate & aggregate

• Filter and transform data

• Join collections
Demos
• Today
  – Twitter stats [using the Analytica Shell]
  – StackOverflow community analysis [using
    Analytica for Excel]


• Not shown
  – REST API
  – Analytica web (coming soon)
Demo 1: Twitter data
Some other examples


• Tweets vs. retweets vs. replies
  – count(select(twitter.tweets.where(retweet_count <> 0)))

• Follower counts
  – max(twitter.tweets.user.followers_count)

• Popular hashtags
  – set twitter.byhashtag = group(tweets.by(entities.hashtags.text))
    set twitter.byhashtag.tweetcount = count(tweets)
    set twitter.populartags = orderdesc(byhashtag.by(tweetcount))
    get twitter.populartags.text
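For readers without access to Analytica, the three expressions above can be approximated in plain Python over a list of tweet documents. This is a sketch of what the expressions appear to compute, not Analytica's implementation or exact semantics. The field names follow the expressions themselves; the function names and sample tweets are hypothetical.

```python
from collections import Counter

# Hypothetical tweet documents using the field names from the expressions above.
tweets = [
    {"retweet_count": 3, "user": {"followers_count": 120},
     "entities": {"hashtags": [{"text": "mongodb"}, {"text": "nosql"}]}},
    {"retweet_count": 0, "user": {"followers_count": 45},
     "entities": {"hashtags": [{"text": "mongodb"}]}},
    {"retweet_count": 1, "user": {"followers_count": 980},
     "entities": {"hashtags": []}},
]


def retweeted_count(docs):
    """Count tweets whose retweet_count is non-zero."""
    return sum(1 for d in docs if d["retweet_count"] != 0)


def max_followers(docs):
    """Largest followers_count among the tweets' authors."""
    return max(d["user"]["followers_count"] for d in docs)


def popular_hashtags(docs):
    """Hashtag texts ordered by how many times each one appears, descending."""
    counts = Counter(
        tag["text"] for d in docs for tag in d["entities"]["hashtags"]
    )
    return [text for text, _ in counts.most_common()]


print(retweeted_count(tweets))   # 2
print(max_followers(tweets))     # 980
print(popular_hashtags(tweets))  # ['mongodb', 'nosql']
```

On a real collection the same grouping would normally run inside the database rather than in client code; the in-memory version is only meant to make the intent of each expression concrete.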
Demo 2: StackOverflow User Profiles
Next steps
• Private beta
  – http://analytica.com


• Get in touch
  – nosh@analytica.com or info@analytica.com


• @analytica_inc on Twitter
