Implementing and Visualizing Click-Stream Data with MongoDB

                       Jan 22, 2013 - New York MongoDB User Group
                                   Cameron Sim - LearnVest.com




Monday, April 15, 13
Agenda

                       About LearnVest
                       HL Application Architecture
                       Data Capture
                       Event Packaging
                       MongoDB Data Warehousing
                       Loading & Visualization
                       Finishing up
LearnVest Inc.
                                              www.learnvest.com
                                                  Mission Statement
                       Aiming to make financial planning as accessible as having a gym membership


                       Company
                       • Founded in 2008 by Alexa Von Tobel, CEO
                       • 50+ people and growing rapidly
                       • Based in NYC

                       Platforms
                       • Web & iPhone

                       Key Products
                       • Account Aggregation and Management (Bank, Credit, Loan, Investment, Mortgage)
                       • Original and Syndicated Newsletter Content
                       • Financial Planning (tiered product offering)

                       Stack
                       • Operational: Wordpress, Backbone.js, Node.js, Java Spring 3, Redis, Memcached,
                         MongoDB, ActiveMQ, Nginx, MySQL 5.x
                       • Analytics: MongoDB 2.2.0 (3-node replica-set), Java 6, Spring 3, pyMongo,
                         Django 1.4
LearnVest.com
                            Web




LearnVest.com
                            iPhone




High Level Architecture
                       [Architecture diagram]
                       Production: Platform Delivery, Services
                       Analytics: Services, Loaders & Dashboards
                       Connections: HTTPS, pyMongo, MongoDB Java Conn, MongoDB Replication, JDBC
                       Stages: Data Collection, Event Packaging, MongoDB Data Warehousing,
                               Loading & Visualization
Philosophy For Data Collection
       Capture Everything
       • User-Driven events over web and mobile
       • System-level exceptions
       • Everything else
       Temporary Data
       • Be ‘ok’ with approximate data
       • Operational Databases are the system of record
       Aggregate events as they come in
       • Remove the overhead of basic metrics (counts, sums) on core events
       • Group by user unique id and increment counts per event, over time-dimensions
         (day, week-ending, month, year)
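
The "aggregate events as they come in" idea can be sketched in memory before any database is involved. This is a minimal illustration, not the LearnVest implementation: the function names, the bucket label formats and the choice of Sunday as the week-ending day are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def time_buckets(day):
    """Return one bucket label per time dimension for a calendar date."""
    # Assumption: weeks end on Sunday; the slides do not say which day is used.
    week_ending = day + timedelta(days=6 - day.weekday())
    return {
        "day": day.isoformat(),
        "week": week_ending.isoformat(),
        "month": day.strftime("%Y-%m"),
        "year": day.strftime("%Y"),
    }


def record_event(counts, user_id, event_code, day):
    """Increment the running count for this user and event in every dimension."""
    for dimension, bucket in time_buckets(day).items():
        counts[(dimension, bucket, user_id, event_code)] += 1


counts = Counter()
record_event(counts, "u1", "user.login", date(2013, 1, 2))
record_event(counts, "u1", "user.login", date(2013, 1, 3))
record_event(counts, "u2", "user.login", date(2013, 1, 3))
```

Each incoming event touches four counters, so basic metrics (counts per user per day, week, month, year) never need a scan over the raw events.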




Data Capture
                iOS

                - (void) sendAnalyticEventType:(NSString*)eventType
                                        object:(NSString*)object
                                          name:(NSString*)name
                                          page:(NSString*)page
                                        source:(NSString*)source
                {
                    // params is the request body; eventData nests the event-specific fields
                    NSMutableDictionary *params = [NSMutableDictionary dictionary];
                    NSMutableDictionary *eventData = [NSMutableDictionary dictionary];

                    // setObject:forKey: raises on nil, so copy only the values that are set
                    if (eventType != nil) [params setObject:eventType forKey:@"eventType"];
                    if (object != nil) [eventData setObject:object forKey:@"object"];
                    if (name != nil) [eventData setObject:name forKey:@"name"];
                    if (page != nil) [eventData setObject:page forKey:@"page"];
                    if (source != nil) [eventData setObject:source forKey:@"source"];
                    [params setObject:eventData forKey:@"eventData"];

                    [[LVNetworkEngine sharedManager] analytics_send:params];
                }




Data Capture
                WEB (JavaScript)

                function internalTrackPageView() {
                   var cookie = {
                       userContext: jQuery.cookie('UserContextCookie')
                   };

                       var trackEvent = {
                           eventType: "pageView",
                           eventData: {
                              page: window.location.pathname + window.location.search
                           }
                       };

                       // AJAX
                       jQuery.ajax({
                           url: "/api/track",
                           type: "POST",
                           dataType: "json",
                           data: JSON.stringify(trackEvent),
                           // Set Request Headers
                           beforeSend: function (xhr, settings) {
                              xhr.setRequestHeader('Accept', 'application/json');
                              xhr.setRequestHeader('User-Context', cookie.userContext);
                              if(settings.type === 'PUT' || settings.type === 'POST') {
                                  xhr.setRequestHeader('Content-Type', 'application/json');
                              }
                           }
                       });
                }

Bus Event Packaging
        1. Spring 3 RESTful service layer: controller methods define the eventCode via the
           @Tracking annotation

        2. A custom Interceptor class extends HandlerInterceptorAdapter and implements
           postHandle() (for each event) to invoke calls via Spring @Async to an EventPublisher

        3. The EventPublisher publishes to a common event bus queue with multiple subscribers, one
           of which packages the eventPayload Map<String, Object> object and forwards it to the
           Analytics REST Service
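
The fan-out in step 3 can be sketched as a tiny in-process bus. This is an illustration only: the real system publishes to a message queue, and EventBus, analytics_subscriber and audit_subscriber are hypothetical names, with a list standing in for the POST to the analytics service.

```python
class EventBus:
    """Minimal in-process bus: every subscriber receives every published event."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event_code, payload):
        for handler in self._subscribers:
            try:
                # Each subscriber gets its own copy so one cannot mutate another's view.
                handler(event_code, dict(payload))
            except Exception as exc:
                # A failing subscriber must not stop delivery to the others.
                print("subscriber failed for %s: %s" % (event_code, exc))


forwarded = []  # stands in for the POST to the analytics REST service
audit_log = []  # a second, unrelated subscriber


def analytics_subscriber(event_code, payload):
    forwarded.append({"eventCode": event_code, "eventData": payload})


def audit_subscriber(event_code, payload):
    audit_log.append(event_code)


bus = EventBus()
bus.subscribe(analytics_subscriber)
bus.subscribe(audit_subscriber)
bus.publish("user.login", {"eventType": "login"})
```

Because analytics is just one subscriber among several, it can be added or removed without touching the controllers that raise the events.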




Bus Event Packaging
        1) Spring RestController Methods
        Interface
        @RequestMapping(value = "/user/login", method = RequestMethod.POST,
        headers="Accept=application/json")
        public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
        HttpServletRequest request);

        Concrete/Impl Class
        @Override
        @Tracking("user.login")
        public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
        HttpServletRequest request){

                //Implementation

                return event;
        }




Bus Event Packaging
         2) Custom Interceptor class extends HandlerInterceptorAdapter

        protected void handleTracking(String trackingCode, Map<String, Object> modelMap,
        HttpServletRequest request) {


                       Map<String, Object> responseModel = new HashMap<String, Object>();

                       // remove non-serializables & copy over data from modelMap

                       try {
                          this.eventPublisher.publish(trackingCode, responseModel, request);
                       } catch (Exception e) {
                          log.error("Error tracking event '" + trackingCode + "' : "
                                 + ExceptionUtils.getStackTrace(e));
                       }
        }
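
The annotation-plus-interceptor pattern has a close Python analogue in a decorator. The sketch below is an assumption-laden illustration rather than a port of the Spring code: tracking, published and user_login are hypothetical names, and the list stands in for the EventPublisher.

```python
import functools

published = []  # stands in for the EventPublisher


def tracking(event_code):
    """Decorator analogue of the @Tracking annotation combined with postHandle()."""
    def decorate(handler):
        @functools.wraps(handler)
        def wrapper(event):
            response = handler(event)
            try:
                # Copy only plain values, mirroring "remove non-serializables".
                model = {k: v for k, v in response.items()
                         if isinstance(v, (str, int, float, bool, type(None)))}
                published.append((event_code, model))
            except Exception as exc:
                # Tracking failures are logged and never reach the caller.
                print("Error tracking event %r: %s" % (event_code, exc))
            return response
        return wrapper
    return decorate


@tracking("user.login")
def user_login(event):
    return event


result = user_login({"email": "a@example.com", "socket": object()})
```

The handler stays free of tracking code; the event code lives next to the method it describes, as with the annotation on the controller.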




Bus Event Packaging
         3) EventPublisher
        public void publish (String eventCode, Map<String,Object> eventData,
                                                             HttpServletRequest request) {

                  Map<String,Object> payload = new HashMap<String,Object>();
                  String eventId=UUID.randomUUID().toString();
                  Map<String, String> requestMap = HttpRequestUtils.getRequestHeaders(request);

                  //Normalize message: copy each field from its own key
                  payload.put("eventType", eventData.get("eventType"));
                  payload.put("eventData", eventData.get("eventData"));
                  payload.put("version", eventData.get("version"));
                  payload.put("eventId", eventId);
                  payload.put("eventTime", new Date());
                  payload.put("request", requestMap);
                  .
                  .
                  .
                  //Send to the Analytics Service for MongoDB persistence
        }



        public void sendPost(EventPayload payload){
                HttpEntity request = new HttpEntity(payload.getEventPayload(), headers);
                Map m = restTemplate.postForObject(endpoint, request, java.util.Map.class);
        }




Bus Event Packaging
         The Serialized JSON (User Action)
        {
               "eventCode"   :   "user.login",
               "eventType"   :   "login",
               "version"     :   "1.0",
               "eventTime"   :   "1358603157746",
               "eventData"   :   {
                                     "" : "",
                                     "" : "",
                                     "" : ""
                                 },
               "request" : {
                                 "call-source"     : "WEB",
                                 "user-context"    : "00002b4f1150249206ac2b692e48ddb3",
                                 "user-agent"      : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
                                 "cookie"          : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
                                 "content-length"  : "204",
                                 "accept-encoding" : "gzip,deflate,sdch"
                             }
        }




Bus Event Packaging
         The Serialized JSON (Generic Event)
        {
               "eventCode"   :   "generic.ui",
               "eventType"   :   "pageView",
               "version"     :   "1.0",
               "eventTime"   :   "1358603157746",
               "eventData"   :   {
                                     "page"    : "/learnvest/moneycenter/inbox",
                                     "section" : "transactions",
                                     "name"    : "view transactions",
                                     "object"  : "page"
                                 },
               "request" : {
                                 "call-source"     : "WEB",
                                 "user-context"    : "00002b4f1150249206ac2b692e48ddb3",
                                 "user-agent"      : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
                                 "cookie"          : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
                                 "content-length"  : "204",
                                 "accept-encoding" : "gzip,deflate,sdch"
                             }
        }




MongoDB Data Warehousing
        MongoDB Information
        • v2.2.0
        • 3-node replica-set
        • 1 Large (primary), 2x Medium (secondary) AWS Amazon-Linux machines
        • Each with single 500GB EBS volumes mounted to /opt/data
        MongoDB Config File
        dbpath = /opt/data/mongodb/data
        rest = true
        replSet = voyager

        Volumes
        ~1M events daily on web, ~600K on mobile
        2-3 GB per day at start, slowed to ~1GB per day
        Currently at 78GB (collecting since August 2012)

        Future Scaling Strategy
        • Set up a 2nd replica-set
        • Shard replica-sets to n at 60% / 250GB per EBS volume
        • Shard key probably based on sequential mix of email_address & additional string
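
As a rough worked example of the numbers above: starting from 78GB and growing at about 1GB per day, the 250GB per-volume threshold is roughly half a year away. The helper name below is hypothetical, and the calculation assumes linear growth, which the slides do not promise.

```python
import math


def days_until_threshold(current_gb, threshold_gb, growth_gb_per_day):
    """Days of headroom left before a volume reaches the sharding threshold."""
    remaining = threshold_gb - current_gb
    if remaining <= 0:
        return 0
    return math.ceil(remaining / growth_gb_per_day)


# Figures from the slide: 78GB stored, 250GB threshold, ~1GB/day currently.
days_at_current_rate = days_until_threshold(78, 250, 1.0)

# The same headroom at the initial 2-3GB/day rate (midpoint 2.5GB/day).
days_at_initial_rate = days_until_threshold(78, 250, 2.5)
```

At the current rate this gives 172 days; at the earlier 2.5GB per day it would have been 69 days, which is why the slowdown in daily volume matters for the sharding timeline.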

MongoDB Data Warehousing
         Approach

         1. Persist all events, bucketed by source:-
            WEB
            MOBILE

         2. Persist all events, bucketed by source, event code and time:-
            WEB/MOBILE
            user.login
            time (day, week-ending, month, year)

         3. Insert into collection e_web / e_mobile

         4. Upsert into:-
            e_web_user_login_day
            e_web_user_login_week
            e_web_user_login_month
            e_web_user_login_year

         5. Predictable model for scaling and measuring business growth
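
The collection naming in steps 3 and 4 follows a pattern that can be derived from the event source and event code. A minimal sketch, assuming the names are built by lower-casing the source and replacing dots in the event code with underscores; the function names are hypothetical.

```python
TIME_DIMENSIONS = ("day", "week", "month", "year")


def raw_collection(source):
    """Collection that stores every raw event for a source, e.g. e_web."""
    return "e_" + source.lower()


def rollup_collections(source, event_code):
    """One counter collection per time dimension, e.g. e_web_user_login_day."""
    base = "%s_%s" % (raw_collection(source), event_code.replace(".", "_"))
    return [base + "_" + dimension for dimension in TIME_DIMENSIONS]


web_login = rollup_collections("WEB", "user.login")
mobile_ui = rollup_collections("MOBILE", "generic.ui")
```

With this scheme every new event code adds exactly four small collections per source, which keeps growth predictable.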


MongoDB Data Warehousing
         2. Persist all events, bucketed by source, event code and time:-
        //instantiate collections dynamically
        DBCollection collection_day = mongodb.getCollection(eventCode + "_day");
        DBCollection collection_week = mongodb.getCollection(eventCode + "_week");
        DBCollection collection_month = mongodb.getCollection(eventCode + "_month");
        DBCollection collection_year = mongodb.getCollection(eventCode + "_year");

        BasicDBObject newDocument = new BasicDBObject().append("$inc",
                         new BasicDBObject().append("count", 1));

        //update day dimension
        collection_day.update(new BasicDBObject().append("user-context", userContext)
                            .append("eventType", eventType)
                            .append("date", sdf_day.format(d)),newDocument, true, false);

        //update week dimension
        collection_week.update(new BasicDBObject().append("user-context", userContext)
                            .append("eventType", eventType)
                            .append("date", sdf_day.format(w)), newDocument, true, false);

        //update month dimension
        collection_month.update(new BasicDBObject().append("user-context", userContext)
                            .append("eventType", eventType)
                            .append("date", sdf_month.format(d)), newDocument, true, false);

        //update year dimension
        collection_year.update(new BasicDBObject().append("user-context", userContext)
                            .append("eventType", eventType)
                            .append("date", sdf_year.format(d)), newDocument, true, false);
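
What the upsert with $inc does can be emulated on a plain list of dictionaries. This sketch only mimics the semantics for equality queries; upsert_inc is a hypothetical helper, and with a current PyMongo the real call would be along the lines of update_one(query, {"$inc": {"count": 1}}, upsert=True).

```python
def upsert_inc(collection, query, field, amount=1):
    """Emulate update(query, {"$inc": {field: amount}}, upsert=True) on a list of dicts."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            # Match found: increment in place.
            doc[field] = doc.get(field, 0) + amount
            return doc
    # No match: the new document takes the query's equality fields, then $inc is applied.
    doc = dict(query)
    doc[field] = amount
    collection.append(doc)
    return doc


day_collection = []
query = {
    "user-context": "c4ca4238a0b923820dcc509a6f75849b",
    "eventType": "login",
    "date": "01/02",
}

# Three logins on the same day collapse into one counter document.
for _ in range(3):
    upsert_inc(day_collection, query, "count")

# A login on the next day creates a second document.
upsert_inc(day_collection, dict(query, date="01/03"), "count")
```

The first event for a user and bucket creates the document, and every later event only increments it, so no separate "does it exist" read is needed.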


MongoDB Data Warehousing
         Persist all events, bucketed by source, event code and time:-
        > show collections
        e_mobile
        e_web
        e_web_account_addManual_day
        e_web_account_addManual_month
        e_web_account_addManual_week
        e_web_account_addManual_year
        e_web_user_login_day
        e_web_user_login_week
        e_web_user_login_month
        e_web_user_login_year
        e_mobile_generic_ui_day
        e_mobile_generic_ui_month
        e_mobile_generic_ui_week
        e_mobile_generic_ui_year

        > db.e_web_user_login_day.find()
        { "_id" : ObjectId("50e4b9871b36921910222c42"), "count"          : 5, "date" : "01/02",
        "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
        { "_id" : ObjectId("50cd6cfcb9a80a2b4ee21422"), "count"          : 7, "date" : "01/02",
        "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
        { "_id" : ObjectId("50cd6e51b9a80a2b4ee21427"), "count"          : 2, "date" : "01/02",
        "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
        { "_id" : ObjectId("50e4b9871b36921910222c42"), "count"          : 3, "date" : "01/03",
        "user-context" : "50e49a561b36921910222c33" }




MongoDB Data Warehousing
         Persist all events
        > db.e_web.findOne()
        { "_id" : ObjectId("50e4a1ab0364f55ed07c2662"), "created_datetime" :
        ISODate("2013-01-02T21:07:55.656Z"), "created_date" :
        ISODate("2013-01-02T00:00:00.000Z"),"request" : { "content-type" : "application/
        json", "connection" : "keep-alive", "accept-language" : "en-US,en;q=0.8", "host" :
        "localhost:8080", "call-source" : "WEB", "accept" : "*/*", "user-context" :
        "c4ca4238a0b923820dcc509a6f75849b", "origin" : "chrome-extension://
        fdmmgilgnpjigdojojpjoooidkmcomcm", "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac
        OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/
        537.11", "accept-charset" : "ISO-8859-1,utf-8;q=0.7,*;q=0.3", "cookie" : "size=4;
        CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920;
        JSESSIONID=56EB165266A2C4AFF946F139669D746F;
        csrftoken=73bdcdddf151dc56b8020855b2cb10c8", "content-length" : "255", "accept-
        encoding" : "gzip,deflate,sdch" }, "eventType" : "flick", "eventData" : { "object" :
        "button", "name" : "split transaction button", "page" : "#inbox/79876/", "section" :




MongoDB Data Warehousing
         Indexing Strategy

         • Indexes on core collections (e_web and e_mobile) come in under 3GB, so they fit in RAM on
            the 7.5GB Large instance and on the 3.75GB Medium instances

         • Split datetime in two fields and compound index on date with other fields like eventType
            and user unique id (user-context)

         • Heavy insertion rates, much lower read rates, so the fewer indexes the better
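
Splitting the timestamp into two fields can be sketched as follows, using the eventTime value from the sample events. The field names created_datetime and created_date come from the stored documents; the helper name and the UTC midnight truncation are assumptions.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def split_event_time(event_time_ms):
    """Turn epoch milliseconds into a full timestamp and a date-only field (UTC)."""
    seconds, millis = divmod(int(event_time_ms), 1000)
    created_datetime = EPOCH + timedelta(seconds=seconds, milliseconds=millis)
    # The date-only field is the same instant truncated to midnight.
    created_date = created_datetime.replace(hour=0, minute=0, second=0, microsecond=0)
    return {"created_datetime": created_datetime, "created_date": created_date}


fields = split_event_time("1358603157746")

# Compound index keys as (field, direction) pairs, date last.
USER_BY_DAY_INDEX = [("request.user-context", 1), ("created_date", 1)]
```

Because every event on a given day shares the same created_date value, per-day queries become equality matches on the indexed field instead of range scans over full timestamps.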



                         	   	   	   	




MongoDB Data Warehousing
         Indexing Strategy
        > db.e_web.getIndexes()
        [
           {
               "v" : 1,
               "key" : {
                  "request.user-context" : 1,
                  "created_date" : 1
               },
               "ns" : "moneycenter.e_web",
               "name" : "request.user-context_1_created_date_1"
           },
           {
               "v" : 1,
               "key" : {
                  "eventData.name" : 1,
                  "created_date" : 1
               },
               "ns" : "moneycenter.e_web",
               "name" : "eventData.name_1_created_date_1"
           }
        ]


                         	   	   	   	




Loading & Visualization
             Objective
             • Show historic and intraday stats on core use cases (logins, conversions)
             • Show user funnel rates on conversion pages
             • Show general usability - how do users really use the Web and iOS platforms?
             Non-Functionals
             • Intraday doesn’t need to be “real-time”, polling is good enough for now
             • Overnight batch job for historic must scale horizontally
             General Implementation Strategy
             • Do all heavy lifting & object manipulation in the batch service; the UI should just display a graph or table
             • Modularize the service to be able to regenerate any graphs/tables without a full load




Loading & Visualization
             Java Batch Service

             Java Mongo library to query key collections and return user counts and sum of events

             DBCursor webUserLogins = c.find(
                   new BasicDBObject("date", sdf.format(new Date())));

             private HashMap<String, Object> getSumAndCount(DBCursor cursor){
                    HashMap<String, Object> m = new HashMap<String, Object>();

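                       int sum=0;    // total events across all users
                       int count=0;  // number of user documents
                       while(cursor.hasNext()){
                          DBObject obj=cursor.next();
                          count++;
                          sum=sum+((Number)obj.get("count")).intValue();
                       }

                       m.put("sum", sum);
                       m.put("count", count);
                       // format the average as a decimal (sdf is a date formatter) and
                       // avoid dividing by zero when the cursor is empty
                       m.put("average", count == 0 ? "0.00"
                              : String.format("%.2f", (float) sum / count));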

                       return m;
             }




Loading & Visualization
             Java Batch Service

             Use Aggregation Framework where required on core collections (e_web) and external data
             //create aggregation objects
             DBObject project = new BasicDBObject("$project",
                 new BasicDBObject("day_value", fields) );
             DBObject day_value = new BasicDBObject( "day_value", "$day_value");
             DBObject groupFields = new BasicDBObject( "_id", day_value);

             //add the accumulator: "number" counts the documents in each group
             groupFields.put("number", new BasicDBObject( "$sum", 1));

             //create the group
             DBObject group = new BasicDBObject("$group", groupFields);

             //execute
             AggregationOutput output = mycollection.aggregate( project, group );

             for(DBObject obj : output.results()){
             .
             .
             }




Loading & Visualization
             Java Batch Service

             MongoDB Command Line example on aggregation over a time period, e.g. month
             > db.e_web.aggregate(
               [
                  { $match : { created_date : { $gt : ISODate("2012-10-25T00:00:00")}}},
                  { $project : {
                     day_value : {"day" : { $dayOfMonth : "$created_date" },
                                 "month":{ $month : "$created_date" }}
                 }},
                 { $group : {
                     _id : {day_value:"$day_value"} ,
                     number : { $sum : 1 }
                 } },
                 { $sort : { "_id.day_value" : -1 } }
               ]
             )
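
For readers less familiar with the aggregation framework, the same match, project, group and sort steps can be written as plain Python over a list of events. This is only an illustration of what the pipeline computes, not how MongoDB executes it; daily_counts and the sample events are hypothetical, and the result is ordered by month then day for readability.

```python
from collections import Counter
from datetime import datetime


def daily_counts(events, since):
    """Count events per (month, day) after a cutoff, newest bucket first."""
    counts = Counter()
    for event in events:
        created = event["created_date"]
        if created > since:                            # $match
            counts[(created.month, created.day)] += 1  # $project + $group
    return sorted(counts.items(), reverse=True)        # $sort


events = [
    {"created_date": datetime(2012, 10, 24)},
    {"created_date": datetime(2012, 10, 26)},
    {"created_date": datetime(2012, 10, 26)},
    {"created_date": datetime(2012, 11, 1)},
]
result = daily_counts(events, datetime(2012, 10, 25))
```

Like the shell example, the grouping key ignores the year, so a window longer than twelve months would merge the same calendar day from different years.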




Loading & Visualization
             Java Batch Service

             Persisting events into graph and table collections

             >db.homeGraphs.find()

             { "_id" : ObjectId("50f57b5c1d4e714b581674e2"), "accounts_natural" : 54,
             "accounts_total" : 54, "date" : ISODate("2011-02-06T05:00:00Z"), "linked_rate" :
             "12.96", "premium_rate" : "0", "str_date" : "2011,01,06", "upgrade_rate" : "0",
             "users_avg_linked" : "3.43", "users_linked" : 7 }

             { "_id" : ObjectId("50f57b5c1d4e714b581674e3"), "accounts_natural" : 144,
             "accounts_total" : 144, "date" : ISODate("2011-02-07T05:00:00Z"), "linked_rate" :
             "11.11", "premium_rate" : "0", "str_date" : "2011,01,07", "upgrade_rate" : "0",
             "users_avg_linked" : "4", "users_linked" : 16 }

             { "_id" : ObjectId("50f57b5c1d4e714b581674e4"), "accounts_natural" : 119,
             "accounts_total" : 119, "date" : ISODate("2011-02-08T05:00:00Z"), "linked_rate" :




Loading & Visualization
             Django and HighCharts

             Extract data (pyMongo)
             import logging
             import pymongo

             logger = logging.getLogger(__name__)

             def getHomeChart(dt_from, dt_to):
                 """Called by home method to get latest 30 day numbers"""
                 try:
                     conn = pymongo.MongoClient('localhost', 27017)
                     db = conn['lvanalytics']

                     cursor = db.accountmetrics.find(
                         {"date" : {"$gte" : dt_from, "$lte" : dt_to}}).sort("date")
                     return buildMetricsDict(cursor)

                 except Exception:
                     # log with traceback; the caller receives None on failure
                     logger.exception("getHomeChart failed")


             Return the graph object (as a list or a dict of lists) to the view that called the
             method
             # dt_from and dt_to bound the 30-day window and are computed by the view
             pagedata={}
             pagedata['accountsGraph']=mongodb_home.getHomeChart(dt_from, dt_to)

             return render_to_response('home.html',{'pagedata': pagedata},
                 context_instance=RequestContext(request))




Loading & Visualization
             Django and HighCharts

             Populate the series.. (JavaScript with Django templating)
             seriesOptions[0] = {
                id: 'naturalAccounts',
                name: "Natural Accounts",
                data: [
                    {% for a in pagedata.metrics.accounts_natural %}
                       {% if not forloop.first %}, {% endif %}
                            [Date.UTC({{a.0}}),{{a.1}}]
                       {% endfor %}
                ],
                tooltip: {
                   valueDecimals: 2
                }
                };
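
The template passes each point's first element straight into Date.UTC, which counts months from zero. That appears to be why the stored str_date for 6 February 2011 reads "2011,01,06". A minimal sketch of preparing such pairs on the Python side, with hypothetical helper names:

```python
from datetime import datetime


def to_date_utc_args(day):
    """Format a date as the argument list for JavaScript's Date.UTC(year, month, day)."""
    # Date.UTC counts months from 0, so February becomes 01.
    return "%d,%02d,%02d" % (day.year, day.month - 1, day.day)


def build_series(docs, field):
    """Return (date_args, value) pairs in date order, one per document."""
    ordered = sorted(docs, key=lambda doc: doc["date"])
    return [(to_date_utc_args(doc["date"]), doc[field]) for doc in ordered]


docs = [
    {"date": datetime(2011, 2, 7, 5, 0), "accounts_natural": 144},
    {"date": datetime(2011, 2, 6, 5, 0), "accounts_natural": 54},
]
series = build_series(docs, "accounts_natural")
```

The zero padding mirrors the sample documents. Since strict-mode JavaScript rejects number literals with a leading zero, unpadded values would be the safer choice in new code.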




Loading & Visualization
             Django and HighCharts

             And Create the Charts and Tables...




Lessons Learned
                       • Date/time is managed as two fields: Datetime and Date
                       • Aggregating and upserting documents as events are received works for us
                       • Real-time Map-Reduce in pyMongo is too slow; don't do this
                       • Django-nonrel is unstable; use Django and configure MongoDB as a datastore only
                       • Memcached on Django is good enough (at the moment); use django-celery with RabbitMQ to pre-cache all data after data loading
                       • HighCharts is buggy; considering D3 and other libraries
                       • No need to retrieve data directly from MongoDB in Django; perhaps provide all data via a service layer (at the expense of pyMongo's ever-growing feature set)
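                       The first two lessons (separate Date and Datetime fields, and aggregating via upserts as events arrive) can be sketched as a function that builds the filter and update documents for a per-user daily counter. The field names user_id, date, datetime and counts, the collection name in the comment, and the sample event are assumptions made for illustration; the deck does not show the actual schema.

```python
from datetime import datetime


def build_daily_upsert(user_id, event_type, event_time):
    """Return (filter, update) documents for a per-user, per-day event counter."""
    # "date" is the event time truncated to the day; "datetime" keeps full precision
    day = datetime(event_time.year, event_time.month, event_time.day)
    query = {"user_id": user_id, "date": day}
    update = {
        # Dots would be read as nested paths by MongoDB, so flatten the event code
        "$inc": {"counts." + event_type.replace(".", "_"): 1},
        "$set": {"datetime": event_time},
    }
    return query, update


query, update = build_daily_upsert("u-123", "user.login", datetime(2013, 1, 22, 18, 30))
# With PyMongo 3+, the pair could be applied roughly like this:
# db.user_daily_events.update_one(query, update, upsert=True)
```

                       Because the day-level document is created on first sight and incremented afterwards, basic counts are available without a separate aggregation pass.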




Next Steps
                       • A/B testing framework, experiments and variances
                       • Unauthenticated / authenticated user tracking
                       • Provide data asynchronously over a service layer
                       • Segmentation with graphical libraries like D3 and Crossfilter (http://square.github.com/crossfilter/)
                       • Saving query criteria, expanding BI tools for internal users
                       • MongoDB Connector, Hadoop and Hive (maybe Tableau and other tools)
                       • Storm / Kafka for real-time analytics processing
                       • Shard the replica set, looking into Gizzard as the middleware




Thanks & Questions
                       Hrishi Dixit, Chief Technology Officer, hrishi@learnvest.com
                       Kevin Connelly, Director of Engineering, kevin@learnvest.com
                       Will Larche, Lead iOS Developer, will@learnvest.com
                       Jeremy Brennan, Director of UI/UX Technology, jeremy@learnvest.com
                       Cameron Sim, Director of Analytics Tech, cameron@learnvest.com
                       <your name here>, New Awesome Developer, you@learnvest.com (HIRED!)

MongoDB ClickStream and Visualization

  • 1. Implementing and Visualizing Click-Stream Data with MongoDB Jan 22, 2013 - New York MongoDB User Group Cameron Sim - LearnVest.com Monday, April 15, 13
  • 2. Agenda About LearnVest HL Application Architecture Data Capture Event Packaging MongoDB Data Warehousing Loading & Visualization Finishing up Monday, April 15, 13
  • 3. LearnVest Inc. www.learnvest.com Mission Statement Aiming to making Financial Planning as accessible as having a gym membership Company Key Products Founded in 2008 by Alexa Von Tobel, CEO Account Aggregation and Management (Bank, Credit, Loan, Investment, Mortgage) 50+ People and Growing rapidly Based in NYC Original and Syndicated Newsletter Content Platforms Financial Planning Web & iPhone (tiered product offering) Stack Operational Analytics Wordpress, Backbone.js, Node.js MongoDB 2.2.0 (3-node replica-set) Java Spring 3, Redis, Memcached, Java 6, Spring 3 MongoDB, ActiveMQ, Nginx, MySQL 5.x pyMongo Django 1.4 Monday, April 15, 13
  • 4. LearnVest.com Web Monday, April 15, 13
  • 5. LearnVest.com IPhone Monday, April 15, 13
  • 6. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Warehousing MongoDB Visualization Loading & Data Collection JDBC Monday, April 15, 13
  • 7. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Warehousing MongoDB Visualization Loading & Data Collection JDBC Monday, April 15, 13
  • 8. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Warehousing MongoDB Visualization Loading & Data Collection JDBC Monday, April 15, 13
  • 9. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Warehousing MongoDB Visualization Loading & Data Collection JDBC Monday, April 15, 13
  • 10. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Warehousing MongoDB Visualization Loading & Data Collection JDBC Monday, April 15, 13
  • 11. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Loading & VisualizationData Ware Collection MongoDB JDBC Monday, April 15, 13
  • 12. High Level Architecture Production Analytics Platform Delivery Services Services Loaders & Dashboards } } } } HTTPS pyMongo MongoDB Java Conn MongoDB Replication Event Packaging Loading & Visualization MongoDB Collection JDBC Monday, April 15, 13
  • 13. Philosophy For Data Collection Capture Everything • User-Driven events over web and mobile • System-level exceptions • Everything else Temporary Data • Be ‘ok’ with approximate data • Operational Databases are the system of record Aggregate events as they come in • Remove the overhead of basic metrics (counts, sums) on core events • Group by user unique id and increment counts per event, over time-dimensions (day, week-ending, month, year) Monday, April 15, 13
  • 14. Data Capture IOS - (void) sendAnalyticEventType:(NSString*)eventType object:(NSString*)object name:(NSString*)name page:(NSString*)page source:(NSString*)source; { NSMutableDictionary *eventData = [NSMutableDictionary dictionary]; if (eventType!=nil) [params setObject:eventType forKey:@"eventType"]; if (object!=nil) [eventData setObject:object forKey:@"object"]; if (name!=nil) [eventData setObject:name forKey:@"name"]; if (page!=nil) [eventData setObject:page forKey:@"page"]; if (source!=nil) [eventData setObject:source forKey:@"source"]; if (eventData!=nil) [params setObject:eventData forKey:@"eventData"]; [[LVNetworkEngine sharedManager] analytics_send:params]; } Monday, April 15, 13
  • 15. Data Capture WEB (JavaScript) function internalTrackPageView() { var cookie = { userContext: jQuery.cookie('UserContextCookie'), }; var trackEvent = { eventType: "pageView", eventData: { page: window.location.pathname + window.location.search } }; // AJAX jQuery.ajax({ url: "/api/track", type: "POST", dataType: "json", data: JSON.stringify(trackEvent), // Set Request Headers beforeSend: function (xhr, settings) { xhr.setRequestHeader('Accept', 'application/json'); xhr.setRequestHeader('User-Context', cookie.userContext); if(settings.type === 'PUT' || settings.type === 'POST') { xhr.setRequestHeader('Content-Type', 'application/json'); } } }); } Monday, April 15, 13
  • 16. Bus Event Packaging 1. Spring 3 RESTful service layer; controller methods define the eventCode via the @Tracking annotation 2. Custom Interceptor class extends HandlerInterceptorAdapter and implements postHandle() (for each event) to invoke calls via Spring @Async to an EventPublisher 3. EventPublisher publishes to a common event bus queue with multiple subscribers, one of which packages the eventPayload Map<String, Object> object and forwards it to the Analytics REST Service
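The publish-to-many-subscribers step can be sketched in a few lines. This is a synchronous, in-process stand-in in Python for what the slides describe as an asynchronous queue; the class and function names are hypothetical, and the error handling simply mirrors the idea that a tracking failure should never break the request that triggered it.

```python
import logging

log = logging.getLogger("tracking")


class EventBus:
    """Tiny in-process bus: every subscriber sees every published event."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event_code, payload):
        for handler in self._subscribers:
            try:
                handler(event_code, payload)
            except Exception:
                # A failing subscriber must not break the others or the caller.
                log.exception("Error tracking event %r", event_code)


received = []


def analytics_subscriber(event_code, payload):
    # Stand-in for the subscriber that forwards to the analytics service.
    received.append({"eventCode": event_code, "eventData": dict(payload)})


bus = EventBus()
bus.subscribe(analytics_subscriber)
bus.publish("user.login", {"source": "WEB"})
```

Other subscribers can be attached with further `subscribe` calls without touching the controller code that publishes.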
  • 17. Bus Event Packaging 1) Spring RestController Methods
    Interface
      @RequestMapping(value = "/user/login", method = RequestMethod.POST, headers="Accept=application/json")
      public Map<String, Object> userLogin(@RequestBody Map<String, Object> event, HttpServletRequest request);
    Concrete/Impl Class
      @Override
      @Tracking("user.login")
      public Map<String, Object> userLogin(@RequestBody Map<String, Object> event, HttpServletRequest request){
          //Implementation
          return event;
      }
  • 18. Bus Event Packaging 2) Custom Interceptor class extends HandlerInterceptorAdapter
      protected void handleTracking(String trackingCode, Map<String, Object> modelMap, HttpServletRequest request) {
          Map<String, Object> responseModel = new HashMap<String, Object>();
          // remove non-serializables & copy over data from modelMap
          try {
              this.eventPublisher.publish(trackingCode, responseModel, request);
          } catch (Exception e) {
              log.error("Error tracking event '" + trackingCode + "' : " + ExceptionUtils.getStackTrace(e));
          }
      }
  • 19. Bus Event Packaging 3) EventPublisher
      public void publish(String eventCode, Map<String,Object> eventData, HttpServletRequest request) {
          Map<String,Object> payload = new HashMap<String,Object>();
          String eventId = UUID.randomUUID().toString();
          Map<String, String> requestMap = HttpRequestUtils.getRequestHeaders(request);
          //Normalize message (each field is read from its own key)
          payload.put("eventType", eventData.get("eventType"));
          payload.put("eventData", eventData.get("eventData"));
          payload.put("version", eventData.get("version"));
          payload.put("eventId", eventId);
          payload.put("eventTime", new Date());
          payload.put("request", requestMap);
          . . .
          //Send to the Analytics Service for MongoDB persistence
      }
      public void sendPost(EventPayload payload) {
          HttpEntity request = new HttpEntity(payload.getEventPayload(), headers);
          Map m = restTemplate.postForObject(endpoint, request, java.util.Map.class);
      }
  • 20. Bus Event Packaging The Serialized JSON (User Action)
      {
        "eventCode" : "user.login",
        "eventType" : "login",
        "version" : "1.0",
        "eventTime" : "1358603157746",
        "eventData" : { "" : "", "" : "", "" : "" },
        "request" : {
          "call-source" : "WEB",
          "user-context" : "00002b4f1150249206ac2b692e48ddb3",
          "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
          "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
          "content-length" : "204",
          "accept-encoding" : "gzip,deflate,sdch"
        }
      }
  • 21. Bus Event Packaging The Serialized JSON (Generic Event)
      {
        "eventCode" : "generic.ui",
        "eventType" : "pageView",
        "version" : "1.0",
        "eventTime" : "1358603157746",
        "eventData" : {
          "page" : "/learnvest/moneycenter/inbox",
          "section" : "transactions",
          "name" : "view transactions",
          "object" : "page"
        },
        "request" : {
          "call-source" : "WEB",
          "user-context" : "00002b4f1150249206ac2b692e48ddb3",
          "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
          "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
          "content-length" : "204",
          "accept-encoding" : "gzip,deflate,sdch"
        }
      }
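The envelope shared by the serialized examples could be assembled roughly as follows. This is a minimal Python sketch, not the Java publisher itself: `build_payload` is a hypothetical name, the default version of "1.0" and the lower-casing of header names are assumptions, and the event time is written as epoch milliseconds to match the sample values.

```python
import time
import uuid


def build_payload(event_code, event, headers):
    """Wrap a raw event in the envelope shape used by the serialized examples."""
    return {
        "eventCode": event_code,
        "eventType": event.get("eventType"),
        "eventData": event.get("eventData", {}),
        "version": event.get("version", "1.0"),  # assumed default
        "eventId": str(uuid.uuid4()),
        "eventTime": int(time.time() * 1000),  # epoch milliseconds
        # Assumption: header names are lower-cased before storage.
        "request": {name.lower(): value for name, value in headers.items()},
    }


payload = build_payload(
    "generic.ui",
    {"eventType": "pageView", "eventData": {"page": "/learnvest/moneycenter/inbox"}},
    {"Call-Source": "WEB"},
)
```

Keeping the request headers in their own sub-document leaves the top level of every event uniform, whatever the client sent.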
  • 23. MongoDB Data Warehousing MongoDB Information • v2.2.0 • 3-node replica-set • 1 Large (primary), 2x Medium (secondary) AWS Amazon-Linux machines • Each with a single 500GB EBS volume mounted to /opt/data MongoDB Config File dbpath = /opt/data/mongodb/data rest = true replSet = voyager Volumes ~1M events daily on web, ~600K on mobile; 2-3 GB per day at start, slowed to ~1GB per day; currently at 78GB (collecting since August 2012) Future Scaling Strategy • Set up 2nd Replica-Set • Shard replica-sets to n at 60% / 250GB per EBS volume • Shard key probably based on sequential mix of email_address & additional string
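The volume figures above give a rough runway estimate. The arithmetic below is only a back-of-the-envelope sketch in Python: it takes the 250GB sharding threshold, the 78GB current size and the ~1GB/day growth rate from the slide and assumes growth stays constant; the helper name is hypothetical.

```python
import math


def days_until(threshold_gb, current_gb, growth_gb_per_day):
    """Whole days until storage reaches the threshold at a constant growth rate."""
    remaining = threshold_gb - current_gb
    if remaining <= 0:
        return 0
    return math.ceil(remaining / growth_gb_per_day)


print(days_until(250, 78, 1.0))  # 172
```

At the earlier 2-3GB/day rate the same calculation gives roughly two to three months, which shows how sensitive the plan is to event volume.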
  • 24. MongoDB Data Warehousing Approach 1. Persist all events, bucketed by source: WEB, MOBILE 2. Persist all events, bucketed by source, event code and time: WEB/MOBILE, user.login, time (day, week-ending, month, year) 3. Insert into collection e_web / e_mobile 4. Upsert into: e_web_user_login_day, e_web_user_login_week, e_web_user_login_month, e_web_user_login_year 5. Predictable model for scaling and measuring business growth
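The naming scheme in steps 3 and 4 is regular enough to derive from the source and event code. A small Python sketch follows; the helper names are hypothetical, and the rule of replacing dots with underscores is inferred from names such as e_web_user_login_day rather than stated in the slides.

```python
DIMENSIONS = ("day", "week", "month", "year")


def raw_collection(source):
    """Collection holding every raw event for a source, e.g. e_web."""
    return "e_" + source.lower()


def rollup_collections(source, event_code):
    """Per-dimension counter collections for one source and event code."""
    prefix = raw_collection(source) + "_" + event_code.replace(".", "_")
    return [prefix + "_" + dimension for dimension in DIMENSIONS]


print(rollup_collections("WEB", "user.login"))
# ['e_web_user_login_day', 'e_web_user_login_week', 'e_web_user_login_month', 'e_web_user_login_year']
```

Because the names are derived, a new event code needs no schema work: its four counter collections appear on the first upsert.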
  • 25. MongoDB Data Warehousing 2. Persist all events, bucketed by source, event code and time:
      //instantiate collections dynamically
      DBCollection collection_day = mongodb.getCollection(eventCode + "_day");
      DBCollection collection_week = mongodb.getCollection(eventCode + "_week");
      DBCollection collection_month = mongodb.getCollection(eventCode + "_month");
      DBCollection collection_year = mongodb.getCollection(eventCode + "_year");
      //shared update document: increment "count" by 1
      BasicDBObject newDocument = new BasicDBObject().append("$inc", new BasicDBObject().append("count", 1));
      //update day dimension (upsert = true, multi = false)
      collection_day.update(new BasicDBObject().append("user-context", userContext)
              .append("eventType", eventType)
              .append("date", sdf_day.format(d)), newDocument, true, false);
      //update week dimension
      collection_week.update(new BasicDBObject().append("user-context", userContext)
              .append("eventType", eventType)
              .append("date", sdf_day.format(w)), newDocument, true, false);
      //update month dimension
      collection_month.update(new BasicDBObject().append("user-context", userContext)
              .append("eventType", eventType)
              .append("date", sdf_month.format(d)), newDocument, true, false);
      //update year dimension
      collection_year.update(new BasicDBObject().append("user-context", userContext)
              .append("eventType", eventType)
              .append("date", sdf_year.format(d)), newDocument, true, false);
  • 26. MongoDB Data Warehousing Persist all events, bucketed by source, event code and time:
      > show collections
      e_mobile
      e_web
      e_web_account_addManual_day
      e_web_account_addManual_month
      e_web_account_addManual_week
      e_web_account_addManual_year
      e_web_user_login_day
      e_web_user_login_week
      e_web_user_login_month
      e_web_user_login_year
      e_mobile_generic_ui_day
      e_mobile_generic_ui_month
      e_mobile_generic_ui_week
      e_mobile_generic_ui_year
      > db.e_web_user_login_day.find()
      { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 5, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50cd6cfcb9a80a2b4ee21422"), "count" : 7, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50cd6e51b9a80a2b4ee21427"), "count" : 2, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
      { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 3, "date" : "01/03", "user-context" : "50e49a561b36921910222c33" }
  • 27. MongoDB Data Warehousing Persist all events
      > db.e_web.findOne()
      { "_id" : ObjectId("50e4a1ab0364f55ed07c2662"), "created_datetime" : ISODate("2013-01-02T21:07:55.656Z"), "created_date" : ISODate("2013-01-02T00:00:00.000Z"), "request" : { "content-type" : "application/json", "connection" : "keep-alive", "accept-language" : "en-US,en;q=0.8", "host" : "localhost:8080", "call-source" : "WEB", "accept" : "*/*", "user-context" : "c4ca4238a0b923820dcc509a6f75849b", "origin" : "chrome-extension://fdmmgilgnpjigdojojpjoooidkmcomcm", "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11", "accept-charset" : "ISO-8859-1,utf-8;q=0.7,*;q=0.3", "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8", "content-length" : "255", "accept-encoding" : "gzip,deflate,sdch" }, "eventType" : "flick", "eventData" : { "object" : "button", "name" : "split transaction button", "page" : "#inbox/79876/", "section" :
  • 28. MongoDB Data Warehousing Indexing Strategy • Indexes on core collections (e_web and e_mobile) come in under 3GB, so they fit in memory on the 7.5GB Large instance and the 3.75GB Medium instances • Split datetime into two fields and use a compound index on date with other fields like eventType and user unique id (user-context) • Heavy insertion rates, much lower read rates, so the fewer indexes the better
  • 29. MongoDB Data Warehousing Indexing Strategy
      > db.e_web.getIndexes()
      [
        { "v" : 1, "key" : { "request.user-context" : 1, "created_date" : 1 }, "ns" : "moneycenter.e_web", "name" : "request.user-context_1_created_date_1" },
        { "v" : 1, "key" : { "eventData.name" : 1, "created_date" : 1 }, "ns" : "moneycenter.e_web", "name" : "eventData.name_1_created_date_1" }
      ]
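Splitting the timestamp into two fields is what lets a compound index on created_date serve whole-day queries with an equality match. A minimal Python sketch of the split, assuming the date field is simply the timestamp truncated to midnight (which is what the stored sample document suggests); the helper name is hypothetical.

```python
from datetime import datetime


def split_datetime(ts):
    """Return the full timestamp plus the same instant truncated to midnight."""
    return {
        "created_datetime": ts,
        "created_date": ts.replace(hour=0, minute=0, second=0, microsecond=0),
    }


doc = split_datetime(datetime(2013, 1, 2, 21, 7, 55, 656000))
```

Every event from the same day then carries an identical created_date value, so per-day lookups avoid range scans over created_datetime.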
  • 30. Loading & Visualization Objective • Show historic and intraday stats on core use cases (logins, conversions) • Show user funnel rates on conversion pages • Show general usability - how do users really use the Web and iOS platforms? Non-Functionals • Intraday doesn’t need to be “real-time”; polling is good enough for now • Overnight batch job for historic data must scale horizontally General Implementation Strategy • Do all heavy lifting & object manipulation in the service; the UI should just display a graph or table • Modularize the service to be able to regenerate any graph or table without a full load
  • 31. Loading & Visualization Java Batch Service Java Mongo library to query key collections and return user counts and sum of events
      DBCursor webUserLogins = c.find(new BasicDBObject("date", sdf.format(new Date())));

      private HashMap<String, Object> getSumAndCount(DBCursor cursor){
          HashMap<String, Object> m = new HashMap<String, Object>();
          int sum=0;
          int count=0;
          DBObject obj;
          while(cursor.hasNext()){
              obj=cursor.next();
              count++;
              sum=sum+(Integer)obj.get("count");
          }
          m.put("sum", sum);
          m.put("count", count);
          //store the average as a two-decimal string; guard against an empty cursor
          m.put("average", count == 0 ? "0.00" : String.format("%.2f", (float) sum / count));
          return m;
      }
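For a quick check of the numbers this produces, the same sum/count/average logic can be run over plain dictionaries. The Python sketch below is an illustration only; the function name is hypothetical and the three sample counts are borrowed from the e_web_user_login_day listing earlier in the deck.

```python
def sum_and_count(docs):
    """Return the number of documents, the sum of their counts and the average."""
    total = 0
    users = 0
    for doc in docs:
        users += 1
        total += doc["count"]
    # Average as a two-decimal string; an empty input yields "0.00".
    average = "{:.2f}".format(total / users) if users else "0.00"
    return {"sum": total, "count": users, "average": average}


sample = [{"count": 5}, {"count": 7}, {"count": 2}]
print(sum_and_count(sample))
# {'sum': 14, 'count': 3, 'average': '4.67'}
```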
  • 32. Loading & Visualization Java Batch Service Use Aggregation Framework where required on core collections (e_web) and external data
      //create aggregation objects ("fields" is the projected expression, defined elsewhere)
      DBObject project = new BasicDBObject("$project", new BasicDBObject("day_value", fields));
      DBObject day_value = new BasicDBObject("day_value", "$day_value");
      DBObject groupFields = new BasicDBObject("_id", day_value);
      //add the accumulator: count the documents in each group as "number"
      groupFields.put("number", new BasicDBObject("$sum", 1));
      //create the group
      DBObject group = new BasicDBObject("$group", groupFields);
      //execute
      AggregationOutput output = mycollection.aggregate(project, group);
      for(DBObject obj : output.results()){
          . .
      }
  • 33. Loading & Visualization Java Batch Service MongoDB Command Line example on aggregation over a time period, e.g. month
      > db.e_web.aggregate( [
          { $match : { created_date : { $gt : ISODate("2012-10-25T00:00:00") } } },
          { $project : { day_value : { "day" : { $dayOfMonth : "$created_date" }, "month" : { $month : "$created_date" } } } },
          { $group : { _id : { day_value : "$day_value" }, number : { $sum : 1 } } },
          { $sort : { "_id.day_value" : -1 } }
        ] )
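To sanity-check such a pipeline on a handful of documents, the same match, group and count steps can be mirrored in plain Python. This is a sketch under assumptions, not a replacement for the aggregation framework: the function name is hypothetical, and results are ordered by (month, day), newest first. Like the pipeline, it groups on month and day only, so data spanning more than one year would merge buckets.

```python
from collections import Counter
from datetime import datetime


def events_per_day(events, since):
    """Count events per (month, day) for created_date values after `since`."""
    counts = Counter(
        (e["created_date"].month, e["created_date"].day)
        for e in events
        if e["created_date"] > since
    )
    # Sort the (month, day) keys in descending order.
    return sorted(counts.items(), reverse=True)


events = [
    {"created_date": datetime(2012, 10, 1)},
    {"created_date": datetime(2012, 11, 2)},
    {"created_date": datetime(2012, 11, 2)},
    {"created_date": datetime(2012, 11, 3)},
]
result = events_per_day(events, datetime(2012, 10, 25))
```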
  • 34. Loading & Visualization Java Batch Service Persisting events into graph and table collections
      > db.homeGraphs.find()
      { "_id" : ObjectId("50f57b5c1d4e714b581674e2"), "accounts_natural" : 54, "accounts_total" : 54, "date" : ISODate("2011-02-06T05:00:00Z"), "linked_rate" : "12.96", "premium_rate" : "0", "str_date" : "2011,01,06", "upgrade_rate" : "0", "users_avg_linked" : "3.43", "users_linked" : 7 }
      { "_id" : ObjectId("50f57b5c1d4e714b581674e3"), "accounts_natural" : 144, "accounts_total" : 144, "date" : ISODate("2011-02-07T05:00:00Z"), "linked_rate" : "11.11", "premium_rate" : "0", "str_date" : "2011,01,07", "upgrade_rate" : "0", "users_avg_linked" : "4", "users_linked" : 16 }
      { "_id" : ObjectId("50f57b5c1d4e714b581674e4"), "accounts_natural" : 119, "accounts_total" : 119, "date" : ISODate("2011-02-08T05:00:00Z"), "linked_rate" :
  • 35. Loading & Visualization Django and HighCharts Extract data (pyMongo)
      def getHomeChart(dt_from, dt_to):
          """Called by home method to get latest 30 day numbers"""
          try:
              conn = pymongo.Connection('localhost', 27017)
              db = conn['lvanalytics']
              cursor = db.accountmetrics.find(
                  {"date" : {"$gte" : dt_from, "$lte" : dt_to}}).sort("date")
              return buildMetricsDict(cursor)
          except Exception as e:
              logger.error(e)
    Return the graph object (as a list or a dict of lists) to the view that called the method
      pagedata={}
      # pass the same date range that getHomeChart expects
      pagedata['accountsGraph']=mongodb_home.getHomeChart(dt_from, dt_to)
      return render_to_response('home.html',{'pagedata': pagedata}, context_instance=RequestContext(request))
  • 36. Loading & Visualization Django and HighCharts Populate the series (JavaScript with Django templating)
      seriesOptions[0] = {
          id: 'naturalAccounts',
          name: "Natural Accounts",
          data: [
              {% for a in pagedata.metrics.accounts_natural %}
                  {% if not forloop.first %}, {% endif %}
                  [Date.UTC({{a.0}}),{{a.1}}]
              {% endfor %}
          ],
          tooltip: { valueDecimals: 2 }
      };
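JavaScript's Date.UTC takes a zero-based month, which would explain why str_date in the homeGraphs documents reads "2011,01,06" for 6 February 2011. A minimal Python sketch of producing such a string on the loading side, assuming that is the value the template passes as a.0; the helper name is hypothetical.

```python
from datetime import date


def js_utc_args(d):
    """Format a date as 'YYYY,MM,DD' with a zero-based month for Date.UTC."""
    return "{},{:02d},{:02d}".format(d.year, d.month - 1, d.day)


print(js_utc_args(date(2011, 2, 6)))  # 2011,01,06
```

Pre-computing the string during the batch load keeps date arithmetic out of the template, in line with the "UI should just display" strategy.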
  • 37. Loading & Visualization Django and HighCharts And Create the Charts and Tables...
  • 38. Loading & Visualization Django and HighCharts And Create the Charts and Tables...
  • 39. Lessons Learned • Date and time managed as two fields, Datetime and Date • Aggregating and upserting documents as events are received works for us • Real-time Map-Reduce in pyMongo - too slow, don’t do this • Django-nonrel - unstable; use Django and configure MongoDB as a datastore only • Memcached on Django is good enough (at the moment) - use django-celery with RabbitMQ to pre-cache all data after data loading • HighCharts is buggy - considering D3 & other libraries • Don’t need to retrieve data directly from MongoDB to Django; perhaps provide all data via a service layer (at the expense of the ever-growing feature set available in pyMongo)
  • 40. Next Steps • A/B testing framework, experiments and variances • Unauthenticated / Authenticated user tracking • Provide data async over service layer • Segmentation with graphical libraries like D3 & Crossfilter (http://square.github.com/crossfilter/) • Saving Query Criteria, expanding out BI tools for internal users • MongoDB Connector, Hadoop and Hive (maybe Tableau and other tools) • Storm / Kafka for real-time analytics processing • Shard the Replica-Set, looking into Gizzard as the middleware
  • 41. Thanks & Questions • Hrishi Dixit, Chief Technology Officer, hrishi@learnvest.com • Kevin Connelly, Director of Engineering, kevin@learnvest.com • Will Larche, Lead iOS Developer, will@learnvest.com • Jeremy Brennan, Director of UI/UX Technology, jeremy@learnvest.com • Cameron Sim, Director of Analytics Tech, cameron@learnvest.com • <your name here>, New Awesome Developer, you@learnvest.com (HIRED!)