SlideShare ist ein Scribd-Unternehmen logo
1 von 9
Downloaden Sie, um offline zu lesen
H I T C H H I K ER ’ S GUI D E T O A NO S Q L
D A T A C L O UD – T UT O R I A L H A ND O UT
                                    AWS SETUP
1. What you need :
     a. An AWS account. Signup is at Amazon EC2 page
     b. Create a local directory aws : ~/aws or c:aws.
         It is easier to have a sub directory to keep all aws stuff
     c. (optional) Have downloaded and installed aws tools. I install them in the aws sub
         directory
     d. Setup and store an SSH key. My key is at ~/aws/SSH-Key1.pem
     e. Create a security group (for example Test-1 as shown in Figure 1 – Open port 22,80
         and 8080. Remember this is for experimental purposes and not for production)




                                         FIG UR E 1

2. Redeem AWS credit. Amazon has given us credit voucher for $25. (Thanks to Jeff Barr and
   Jenny Kohr Chynoweth)
       a. Visit the AWS Credit Redemption Page and enter the promotion code above. You
           must be signed in to view this page.
       b. Verify that the credit has been applied to your account by viewing your Account
           Activity page.
       c. These credits will be good for $25 worth of AWS services from June 10, 2010-December
           31, 2010.
3. I have created OSCON-NOSQL AMI that has the stuff needed for this tutorial.
       a. Updates and references at http://doubleclix.wordpress.com/2010/07/13/oscon-
           nosql-training-ami/
       b. Reference Pack
                i. The NOSQL Slide set
               ii. The hand out (This document)
iii. Reference List
                 iv. The AMI
                  v. The NOSQL papers
   4. To launch the AMI
      AWS Management Console-EC2-Launch Instance
      Search for OSCON-NOSQL (Figure 2)




                                              FIG UR E 2

       [Select]

       [Instance Details] (small instance is fine)

       [Advanced Instance Options] (Default is fine)

       [Create Key pair] (Use existing if you already have one or create one and store in the ~/aws
directory

       [Configure Firewall] – choose Test-1 or create 1 with 80,22 and 8080 open

       [Review]-[Launch]-[Close]

        Instance should show up in the management console in a few minutes (Figure 3)




                                              FIG UR E 3

   5. Open up a console and ssh into the instance
      SSH-Key1.pem ubuntu@ec2-184-73-85-67.compute-1.amazonaws.com
      And type-in couple of commands (Figure 4)
FIG UR E 4

Note :

         If it says connection refused, reboot the instance. You can check the startup logs by
typing
      ec2-get-console-output --private-key <private key file
name> --cert <certificate file name> <instancd id>
      Also type-in commands like the following to get a feel for
the instance.
ps –A
vmstat
cat /proc/cpuinfo
uname –a
sudo fdisk –l
sudo lshw
df –h –T


Note : Don’t forget to turn off the instance at the end of the
hands-on session !
HANDS-ON DOCUMENT DATASTORES - MONGODB
Close Encounters of the first kind …

              Start mongo shell : type mongo




      FIG UR E 5

              The CLI is a full Javascript interpreter
              Type any javascript commands
               Function loopTest() {
               For (i=0;i<10;i++) {
               Print(i);
               }
               }
              See how the shell understands editing, multi-line and end of function
              loopTest will show the code
              loopTest() will run the function
              Other commands to try
                   o show dbs
                   o use <dbname>
                   o show collections
                   o db.help()
                   o db.collections.help()
                   o db.stats()
                   o db.version()

Warm up

              create a document & insert it
o person = {“name”,”aname”,”phone”:”555-1212”}
                  o db.persons.insert(person)
                  o db.persons.find()
             <discussion about id>
                  o db.persons.help()
                  o db.persons.count()
             Create 2 more records
             Embedded Nested documents
             address = {“street”:”123,some street”,”City”:”Portland”,”state”:”WA”,
              ”tags”:[“home”,”WA”])
             db.persons.update({“name”:”aname”},address) – will replace
             Need to use update modifier
                  o Db.persons.update({“name”:”aname”},{$set :{address}}) to make add
                      address
             Discussion:
                  o Embedded docs are very versatile
                  o For example the collider could add more data from another source with
                      nested docs and added metadata
                  o As mongo understands json structure, it can index inside embedded docs.
                      For example address.state to update keys in embedded docs
             db.persons.remove({})
             “$unset” to remove a key
             Upsert – interesting
                  o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}}) <- won’t add
                  o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}},{“upsert”:”true”)
                      <-will add
             Add a new field/update schema
                  o db.persons.update({},{“$set”:{“insertDate”:new Date()}})
                  o db.runCommand({getLastError:1})
                  o db.persons.update({},{“$set”:{“insertDate”:new
                      Date()}},{”upsert”:”false”},{“multi”:”true”})
                  o db.runCommand({getLastError:1})


Main Feature – Loc us

             Goal is to develop location based/geospatial capability :
              o       Allow users to 'check in' at places
              o       Based on the location, recommend near-by attractions
              o       Allow users to leave tips about places
              o       Do a few aggregate map-reduce operations like “How many people are
                      checked in at a place?” and “List who is there” …
             But for this demo, we will implement a small slice : Ability to checkin and show
              nearby soccer attractions using a Java program
             A quick DataModel
              Database : locus
              Collections: Places – state, city, stadium, latlong
people – name, when, latlong
   Hands on




                                   FIG UR E 6




           FIG UR E 7



       o use <dbname> (use locus)
       o Show collections
o db.stats()
o db.<collection>.find()
     o db.checkins.find()
     o db.places.find()




                 FIG UR E 8




                 FIG UR E 9
FIG UR E 10

   db.places.find({latlong: {$near : [40,70]})

   Mapreduce
    Mapreduce is best for batch type operations, especially aggregations - while the
    standard queries are well suited for lookup operations/matching operations on
    documents. In our example, we could run mapreduce operation every few hours
    to calculate which locations are 'trending' i.e. have a many check ins in the last
    few hours..

    First we will add one more element place and capture the place where the
    person checksin. Then we could use the following simple map-reduce operation
    which operates on the checkin collection which aggregates counts of all
    historical checkins per location:

    m = function() {
    emit(this.place, 1);}

    r = function(key, values) {
    total = 0;
    for (var i in values) total += values[i];
    return total;
    }

    res = db.checkins.mapReduce(m,r)
    db[res.result].find()

    You could run this (this particular code should run in the javascript shell) and it
    would create a temporary collection with totals for checkins for each location,
    for all historical checkins.

    If you wanted to just run this on a subset of recent data, you could just run the
mapreduce, but filter on checkins over the last 3 hours e.g.
                  res = db.checkins.mapReduce(m,r, {when: {$gt:nowminus3hours}}), where
                  nowminus3hours is a variable containing a datetime corresponding to the
                  current time minus 3 hours

                  This would give you a collection with documents for each location that was
                  checked into for the last 3 hours with the totals number of checkins for each
                  location contained in the document.

                  We could use this to develop a faceted search – popular places - 3 hrs, 1 day, 1
                  week, 1 month and max.

                  (Thanks to Nosh Petigara/10Gen for the scenario and the code)

The wind-up

      *** First terminate the instance ! ***

      Discussions

The Afterhours (NOSQL + Beer)

      Homework – add social graph based inference mechanisms.
      a) MyNearby Friends
         When you check-in show if your friends are here (either add friends to this site or based
         on their facebook page using the OpenGraph API)
      b) We were here
         Show if their friends were here or nearby places the friends visited
      c) Show pictures taken by friends when they were here

Weitere ähnliche Inhalte

Was ist angesagt?

mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecturezefhemel
 
The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212Mahmoud Samir Fayed
 
java experiments and programs
java experiments and programsjava experiments and programs
java experiments and programsKaruppaiyaa123
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsAlexander Rubin
 
Backbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The BrowserBackbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The BrowserHoward Lewis Ship
 
Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17Michele Orselli
 
Writing Clean Code in Swift
Writing Clean Code in SwiftWriting Clean Code in Swift
Writing Clean Code in SwiftDerek Lee Boire
 
Node js mongodriver
Node js mongodriverNode js mongodriver
Node js mongodriverchristkv
 
Cnam azure 2014 mobile services
Cnam azure 2014   mobile servicesCnam azure 2014   mobile services
Cnam azure 2014 mobile servicesAymeric Weinbach
 
The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185Mahmoud Samir Fayed
 
Mozilla とブラウザゲーム
Mozilla とブラウザゲームMozilla とブラウザゲーム
Mozilla とブラウザゲームNoritada Shimizu
 
Using Scala Slick at FortyTwo
Using Scala Slick at FortyTwoUsing Scala Slick at FortyTwo
Using Scala Slick at FortyTwoEishay Smith
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseMike Dirolf
 
Test driven game development silly, stupid or inspired?
Test driven game development   silly, stupid or inspired?Test driven game development   silly, stupid or inspired?
Test driven game development silly, stupid or inspired?Eric Smith
 
Test Driven Cocos2d
Test Driven Cocos2dTest Driven Cocos2d
Test Driven Cocos2dEric Smith
 
RxSwift 시작하기
RxSwift 시작하기RxSwift 시작하기
RxSwift 시작하기Suyeol Jeon
 
Data Types and Processing in ES6
Data Types and Processing in ES6Data Types and Processing in ES6
Data Types and Processing in ES6m0bz
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesDave Stokes
 

Was ist angesagt? (20)

mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecture
 
The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212
 
java experiments and programs
java experiments and programsjava experiments and programs
java experiments and programs
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of Things
 
Backbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The BrowserBackbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The Browser
 
Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17
 
Drupal 8 database api
Drupal 8 database apiDrupal 8 database api
Drupal 8 database api
 
Writing Clean Code in Swift
Writing Clean Code in SwiftWriting Clean Code in Swift
Writing Clean Code in Swift
 
Node js mongodriver
Node js mongodriverNode js mongodriver
Node js mongodriver
 
Pdxpugday2010 pg90
Pdxpugday2010 pg90Pdxpugday2010 pg90
Pdxpugday2010 pg90
 
Cnam azure 2014 mobile services
Cnam azure 2014   mobile servicesCnam azure 2014   mobile services
Cnam azure 2014 mobile services
 
The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185
 
Mozilla とブラウザゲーム
Mozilla とブラウザゲームMozilla とブラウザゲーム
Mozilla とブラウザゲーム
 
Using Scala Slick at FortyTwo
Using Scala Slick at FortyTwoUsing Scala Slick at FortyTwo
Using Scala Slick at FortyTwo
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
Test driven game development silly, stupid or inspired?
Test driven game development   silly, stupid or inspired?Test driven game development   silly, stupid or inspired?
Test driven game development silly, stupid or inspired?
 
Test Driven Cocos2d
Test Driven Cocos2dTest Driven Cocos2d
Test Driven Cocos2d
 
RxSwift 시작하기
RxSwift 시작하기RxSwift 시작하기
RxSwift 시작하기
 
Data Types and Processing in ES6
Data Types and Processing in ES6Data Types and Processing in ES6
Data Types and Processing in ES6
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for Newbies
 

Andere mochten auch

Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)Rouzeh Eghtessadi
 
กราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นกราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นMonsicha Saesiao
 
เทคนิคPhotoshop
เทคนิคPhotoshopเทคนิคPhotoshop
เทคนิคPhotoshopMonsicha Saesiao
 
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from AppceleratorWinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from AppceleratorWinWire Technologies Inc
 
Adviesinzakeadolescentenstrafrecht
AdviesinzakeadolescentenstrafrechtAdviesinzakeadolescentenstrafrecht
Adviesinzakeadolescentenstrafrechtjohanperridon
 
Proposed New US fashion design law 2010
Proposed New US fashion design law 2010Proposed New US fashion design law 2010
Proposed New US fashion design law 2010Darrell Mottley
 
网络团购调查研究报告
网络团购调查研究报告网络团购调查研究报告
网络团购调查研究报告Enlight Chen
 
Erich von daniken recuerdos del futuro
Erich von daniken   recuerdos del futuroErich von daniken   recuerdos del futuro
Erich von daniken recuerdos del futurofrostiss
 
Bollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch purshBollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch purshChrisPerez
 
Rode Duivels - Red Devils Belgium
Rode Duivels - Red Devils BelgiumRode Duivels - Red Devils Belgium
Rode Duivels - Red Devils BelgiumNils Demanet
 
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013WinWire Technologies Inc
 
ABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordreABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordreABC Softwork
 
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin  celebrated in santarcangelo by cassoli alberto 3 dFeast of saint martin  celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 dbarbelkarlsruhe
 
Ace of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program WalkthroughAce of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program WalkthroughAndy Horner
 
CSA 2010: Carrier Considerations
CSA 2010: Carrier ConsiderationsCSA 2010: Carrier Considerations
CSA 2010: Carrier ConsiderationsMulti Service
 

Andere mochten auch (20)

Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
 
กราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นกราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่น
 
เทคนิคPhotoshop
เทคนิคPhotoshopเทคนิคPhotoshop
เทคนิคPhotoshop
 
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from AppceleratorWinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
 
Adviesinzakeadolescentenstrafrecht
AdviesinzakeadolescentenstrafrechtAdviesinzakeadolescentenstrafrecht
Adviesinzakeadolescentenstrafrecht
 
Proposed New US fashion design law 2010
Proposed New US fashion design law 2010Proposed New US fashion design law 2010
Proposed New US fashion design law 2010
 
Brochure invest eng
Brochure invest engBrochure invest eng
Brochure invest eng
 
网络团购调查研究报告
网络团购调查研究报告网络团购调查研究报告
网络团购调查研究报告
 
Erich von daniken recuerdos del futuro
Erich von daniken   recuerdos del futuroErich von daniken   recuerdos del futuro
Erich von daniken recuerdos del futuro
 
Bollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch purshBollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch pursh
 
Rode Duivels - Red Devils Belgium
Rode Duivels - Red Devils BelgiumRode Duivels - Red Devils Belgium
Rode Duivels - Red Devils Belgium
 
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
 
ABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordreABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordre
 
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin  celebrated in santarcangelo by cassoli alberto 3 dFeast of saint martin  celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
 
Android
AndroidAndroid
Android
 
Resumenek
ResumenekResumenek
Resumenek
 
Ace of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program WalkthroughAce of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program Walkthrough
 
나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)
 
Open
OpenOpen
Open
 
CSA 2010: Carrier Considerations
CSA 2010: Carrier ConsiderationsCSA 2010: Carrier Considerations
CSA 2010: Carrier Considerations
 

Ähnlich wie Nosql hands on handout 04

Refactoring to Macros with Clojure
Refactoring to Macros with ClojureRefactoring to Macros with Clojure
Refactoring to Macros with ClojureDmitry Buzdin
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksMongoDB
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on AndroidSven Haiges
 
An intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECSAn intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECSYevgeniy Brikman
 
AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce Amazon Web Services
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourPeter Friese
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation FrameworkCaserta
 
Introduction tomongodb
Introduction tomongodbIntroduction tomongodb
Introduction tomongodbLee Theobald
 
Redis the better NoSQL
Redis the better NoSQLRedis the better NoSQL
Redis the better NoSQLOpenFest team
 
The Art Of Readable Code
The Art Of Readable CodeThe Art Of Readable Code
The Art Of Readable CodeBaidu, Inc.
 
C++totural file
C++totural fileC++totural file
C++totural filehalaisumit
 
Writing Maintainable JavaScript
Writing Maintainable JavaScriptWriting Maintainable JavaScript
Writing Maintainable JavaScriptAndrew Dupont
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot CampTroy Miles
 
The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212Mahmoud Samir Fayed
 
Fullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-endFullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-endEzequiel Maraschio
 

Ähnlich wie Nosql hands on handout 04 (20)

Refactoring to Macros with Clojure
Refactoring to Macros with ClojureRefactoring to Macros with Clojure
Refactoring to Macros with Clojure
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New Tricks
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on Android
 
An intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECSAn intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECS
 
Angular Schematics
Angular SchematicsAngular Schematics
Angular Schematics
 
AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
 
Latinoware
LatinowareLatinoware
Latinoware
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
Introduction tomongodb
Introduction tomongodbIntroduction tomongodb
Introduction tomongodb
 
Play á la Rails
Play á la RailsPlay á la Rails
Play á la Rails
 
Redis the better NoSQL
Redis the better NoSQLRedis the better NoSQL
Redis the better NoSQL
 
C++ tutorial
C++ tutorialC++ tutorial
C++ tutorial
 
The Art Of Readable Code
The Art Of Readable CodeThe Art Of Readable Code
The Art Of Readable Code
 
C++totural file
C++totural fileC++totural file
C++totural file
 
Writing Maintainable JavaScript
Writing Maintainable JavaScriptWriting Maintainable JavaScript
Writing Maintainable JavaScript
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot Camp
 
The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212
 
Fullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-endFullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-end
 

Mehr von Krishna Sankar

Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data ScienceKrishna Sankar
 
An excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphXAn excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphXKrishna Sankar
 
An excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache SparkAn excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache SparkKrishna Sankar
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesKrishna Sankar
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 
Architecture in action 01
Architecture in action 01Architecture in action 01
Architecture in action 01Krishna Sankar
 
Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)Krishna Sankar
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538Krishna Sankar
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsKrishna Sankar
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkKrishna Sankar
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk KnowledgeKrishna Sankar
 
Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsKrishna Sankar
 
Bayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive BayesBayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive BayesKrishna Sankar
 
AWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOpsAWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOpsKrishna Sankar
 
The Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & PythonThe Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & PythonKrishna Sankar
 
Big Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 PragmaticsBig Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 PragmaticsKrishna Sankar
 
Scrum debrief to team
Scrum debrief to team Scrum debrief to team
Scrum debrief to team Krishna Sankar
 
Precision Time Synchronization
Precision Time SynchronizationPrecision Time Synchronization
Precision Time SynchronizationKrishna Sankar
 
The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleKrishna Sankar
 

Mehr von Krishna Sankar (20)

Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
 
An excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphXAn excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphX
 
An excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache SparkAn excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache Spark
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 
Architecture in action 01
Architecture in action 01Architecture in action 01
Architecture in action 01
 
Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk Knowledge
 
Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science Competitions
 
Bayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive BayesBayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive Bayes
 
AWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOpsAWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOps
 
The Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & PythonThe Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & Python
 
Big Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 PragmaticsBig Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 Pragmatics
 
Scrum debrief to team
Scrum debrief to team Scrum debrief to team
Scrum debrief to team
 
The Art of Big Data
The Art of Big DataThe Art of Big Data
The Art of Big Data
 
Precision Time Synchronization
Precision Time SynchronizationPrecision Time Synchronization
Precision Time Synchronization
 
The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to Kaggle
 

Nosql hands on handout 04

  • 1. H I T C H H I K ER ’ S GUI D E T O A NO S Q L D A T A C L O UD – T UT O R I A L H A ND O UT AWS SETUP 1. What you need : a. An AWS account. Signup is at Amazon EC2 page b. Create a local directory aws : ~/aws or c:aws. It is easier to have a sub directory to keep all aws stuff c. (optional) Have downloaded and installed aws tools. I install them in the aws sub directory d. Setup and store an SSH key. My key is at ~/aws/SSH-Key1.pem e. Create a security group (for example Test-1 as shown in Figure 1 – Open port 22,80 and 8080. Remember this is for experimental purposes and not for production) FIG UR E 1 2. Redeem AWS credit. Amazon has given us credit voucher for $25. (Thanks to Jeff Barr and Jenny Kohr Chynoweth) a. Visit the AWS Credit Redemption Page and enter the promotion code above. You must be signed in to view this page. b. Verify that the credit has been applied to your account by viewing your Account Activity page. c. These credits will be good for $25 worth of AWS services from June 10, 2010-December 31, 2010. 3. I have created OSCON-NOSQL AMI that has the stuff needed for this tutorial. a. Updates and references at http://doubleclix.wordpress.com/2010/07/13/oscon- nosql-training-ami/ b. Reference Pack i. The NOSQL Slide set ii. The hand out (This document)
  • 2. iii. Reference List iv. The AMI v. The NOSQL papers 4. To launch the AMI AWS Management Console-EC2-Launch Instance Search for OSCON-NOSQL (Figure 2) FIG UR E 2 [Select] [Instance Details] (small instance is fine) [Advanced Instance Options] (Default is fine) [Create Key pair] (Use existing if you already have one or create one and store in the ~/aws directory [Configure Firewall] – choose Test-1 or create 1 with 80,22 and 8080 open [Review]-[Launch]-[Close] Instance should show up in the management console in a few minutes (Figure 3) FIG UR E 3 5. Open up a console and ssh into the instance SSH-Key1.pem ubuntu@ec2-184-73-85-67.compute-1.amazonaws.com And type-in couple of commands (Figure 4)
  • 3. FIG UR E 4 Note : If it says connection refused, reboot the instance. You can check the startup logs by typing ec2-get-console-output --private-key <private key file name> --cert <certificate file name> <instancd id> Also type-in commands like the following to get a feel for the instance. ps –A vmstat cat /proc/cpuinfo uname –a sudo fdisk –l sudo lshw df –h –T Note : Don’t forget to turn off the instance at the end of the hands-on session !
  • 4. HANDS-ON DOCUMENT DATASTORES - MONGODB Close Encounters of the first kind …  Start mongo shell : type mongo FIG UR E 5  The CLI is a full Javascript interpreter  Type any javascript commands Function loopTest() { For (i=0;i<10;i++) { Print(i); } }  See how the shell understands editing, multi-line and end of function  loopTest will show the code  loopTest() will run the function  Other commands to try o show dbs o use <dbname> o show collections o db.help() o db.collections.help() o db.stats() o db.version() Warm up  create a document & insert it
  • 5. o person = {“name”,”aname”,”phone”:”555-1212”} o db.persons.insert(person) o db.persons.find()  <discussion about id> o db.persons.help() o db.persons.count()  Create 2 more records  Embedded Nested documents  address = {“street”:”123,some street”,”City”:”Portland”,”state”:”WA”, ”tags”:[“home”,”WA”])  db.persons.update({“name”:”aname”},address) – will replace  Need to use update modifier o Db.persons.update({“name”:”aname”},{$set :{address}}) to make add address  Discussion: o Embedded docs are very versatile o For example the collider could add more data from another source with nested docs and added metadata o As mongo understands json structure, it can index inside embedded docs. For example address.state to update keys in embedded docs  db.persons.remove({})  “$unset” to remove a key  Upsert – interesting o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}}) <- won’t add o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}},{“upsert”:”true”) <-will add  Add a new field/update schema o db.persons.update({},{“$set”:{“insertDate”:new Date()}}) o db.runCommand({getLastError:1}) o db.persons.update({},{“$set”:{“insertDate”:new Date()}},{”upsert”:”false”},{“multi”:”true”}) o db.runCommand({getLastError:1}) Main Feature – Loc us  Goal is to develop location based/geospatial capability : o Allow users to 'check in' at places o Based on the location, recommend near-by attractions o Allow users to leave tips about places o Do a few aggregate map-reduce operations like “How many people are checked in at a place?” and “List who is there” …  But for this demo, we will implement a small slice : Ability to checkin and show nearby soccer attractions using a Java program  A quick DataModel Database : locus Collections: Places – state, city, stadium, latlong
  • 6. people – name, when, latlong  Hands on FIG UR E 6 FIG UR E 7 o use <dbname> (use locus) o Show collections
  • 7. o db.stats() o db.<collection>.find() o db.checkins.find() o db.places.find() FIG UR E 8 FIG UR E 9
  • 8. FIG UR E 10  db.places.find({latlong: {$near : [40,70]})  Mapreduce Mapreduce is best for batch type operations, especially aggregations - while the standard queries are well suited for lookup operations/matching operations on documents. In our example, we could run mapreduce operation every few hours to calculate which locations are 'trending' i.e. have a many check ins in the last few hours.. First we will add one more element place and capture the place where the person checksin. Then we could use the following simple map-reduce operation which operates on the checkin collection which aggregates counts of all historical checkins per location: m = function() { emit(this.place, 1);} r = function(key, values) { total = 0; for (var i in values) total += values[i]; return total; } res = db.checkins.mapReduce(m,r) db[res.result].find() You could run this (this particular code should run in the javascript shell) and it would create a temporary collection with totals for checkins for each location, for all historical checkins. If you wanted to just run this on a subset of recent data, you could just run the
  • 9. mapreduce, but filter on checkins over the last 3 hours e.g. res = db.checkins.mapReduce(m,r, {when: {$gt:nowminus3hours}}), where nowminus3hours is a variable containing a datetime corresponding to the current time minus 3 hours This would give you a collection with documents for each location that was checked into for the last 3 hours with the totals number of checkins for each location contained in the document. We could use this to develop a faceted search – popular places - 3 hrs, 1 day, 1 week, 1 month and max. (Thanks to Nosh Petigara/10Gen for the scenario and the code) The wind-up *** First terminate the instance ! *** Discussions The Afterhours (NOSQL + Beer) Homework – add social graph based inference mechanisms. a) MyNearby Friends When you check-in show if your friends are here (either add friends to this site or based on their facebook page using the OpenGraph API) b) We were here Show if their friends were here or nearby places the friends visited c) Show pictures taken by friends when they were here