
"Data is dead, long live data" by Dorion Carroll, CIO @Zynga

Zynga makes online social and mobile games. Four years after its founding, Zynga had grown to over $1 billion in revenue and IPO'd in 2011 (with the largest IPO since Google in 2004!).

Dorion is a self-taught engineer who got into product development for large-scale, high-availability, data-driven internet applications, products, and properties for some of the best companies: Oracle, Technorati, and Zynga.

This talk has:
1. Stories of pretending data is equal to information
2. Stories of storing data for data's sake
3. Stories with a happier ending - insights from data analytics 

Published in: Technology

  1. Data is dead, long live Data
     Dorion Carroll, March 2017
  2. A quick exercise
  3. A little data about me
     start,end,company,location,positions
     1986,1987,"Bondoux Investment Management","San Francisco, CA","Office Manager"
     1987,1990,"GT Capital Management","San Francisco, CA",{"Marketing Analyst","Database Administrator"}
     1990,1993,"Oracle","Redwood City, CA",{"Database Administrator","Manager, MIS","Database Architect","Development Manager"}
     1993,1997,"Electronic Arts","Redwood Shores, CA","Manager, Client/Server Application Development"
     1997,2000,"Excite, Inc.","Redwood City, CA",{"Engineering Manager, Ad Sales and Reporting Systems","Director, Engineering, Commerce Division","Senior Director, Engineering, Wireless Group"}
     2000,2000,"VenusSports.com","San Francisco, CA","VP, Engineering"
     2000,2000,"Softbank Venture Capital","Mountain View, CA","Technologist in Residence"
     2000,2001,"Neomeo","San Francisco, CA","VP, Engineering and General Manager"
     2001,2004,"Postini, Inc.","Redwood City, CA","Director, Engineering"
     2004,2009,"Technorati","San Francisco, CA",{"Senior Engineer","Director, Product","VP, Engineering"}
     2009,2016,"Zynga","San Francisco, CA",{"CTO, Mafia Wars","Zynga Fellow","CTO, Shared Technology Group","CTO, Mobile Division","CIO, VP"}
  4. Information
  5. But what does it mean?
     To understand the meaning of data, we must consider data within our context and experience. My career data, by itself, doesn't really mean anything. Adding your context and experience might make it meaningful - let's hope so.
  6. Misuse, abuse and proper use
     Stories from the database
     ● Pretending data is equal to information
     ● Storing data for data's sake
     ● A happier ending - insights from data analytics
  7. Oracle/Sun Microsystems marketing [25 years ago]
     Oracle was, and is, a sales-driven company
     ● Sales people were kings and queens
     ● They could do no wrong (or they got fired)
     ● Marketing wanted to prove their value to the organization
     Oracle for Sun Microsystems RDBMS kernel campaign at 30% off
     Massive increase in end-of-quarter revenue on Sun
     ● But don't believe data without checking correlations
  8. Mafia Wars bots and server utilization [7 years ago]
     Mafia Wars servers were running at capacity (60%-80% CPU)
     ● Pushed a new release and CPU dropped instantly to 30%
     ● Frantically looked in our data systems for any problems
       ○ No change in revenue
       ○ No change in concurrents
       ○ No change in transactions per second
     Over the next 4 ½ hours CPU load returned to 75%-85%
     ● More than 30 percentage points of capacity were serving fewer than 6% of "players" (bots)
  9. Manual testing of mobile games [5 years ago]
     iPhone games made up 90%+ of revenue in the market
     ● Zynga had >100 manual testers
     ● We were "moving too fast" to automate
     But we had so many games we couldn't test Android or tablet. We would have had to hire 300+ more testers.
     ● Supercell took advantage and gained a market we ignored
       ○ "Farmville" on mobile is called Hay Day
  10. BigData and Machine Learning [2 years ago]
      Knowing what you want AND don't want is key.
      ● Players who purchase more often retain better
      ● Optimize for increase in purchase orders
      Careful what you wish for: 5X increase in orders
      ● Optimizing for X but not contemplating the unintended consequences can be dangerous
  11. Marketing resurrection campaign [30 years ago]
      We had saved backups of >250,000 prospect records in 3 different data AND storage formats. I then spent a week resuscitating them to do a marketing mailing.
      ● $1.50 postage outbound and $1.50 postage on return mail.
      ● Less than 0.3% were still valid addresses.
        ○ 0.3% of 250,000 = 750
        ○ Cost of postage: 250,000 × $3.00 = $750,000 (essentially, all the data was garbage)
        ○ Cost per valid address: $1,000.00
        ○ Conversion rate on typical marketing campaigns: < 1%
        ○ Estimated reactivations of prospects: 7-8
        ○ Estimated cost per prospect reactivation (postage costs only): > $100,000
      Remember, these were only prospects. There was no guarantee they were ever former customers. I also didn't include the cost of my time.
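The arithmetic on slide 11 can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, the rates are taken at the upper bounds the slide states ("less than 0.3%", "< 1%"), and, as on the slide, the full $3.00 of postage is charged to every record. Because the conversion rate is an upper bound, the cost per reactivation it yields is a lower bound.

```python
from fractions import Fraction

# Inputs as stated on slide 11; variable names are illustrative.
records = 250_000
postage_per_record = Fraction(3)      # $1.50 outbound + $1.50 return, per record
valid_rate = Fraction(3, 1000)        # "less than 0.3%", taken at its upper bound
conversion_rate = Fraction(1, 100)    # "< 1%", taken at its upper bound

valid_addresses = records * valid_rate                       # 750
total_postage = records * postage_per_record                 # 750,000 dollars
cost_per_valid_address = total_postage / valid_addresses     # 1,000 dollars
expected_reactivations = valid_addresses * conversion_rate   # 7.5
cost_per_reactivation = total_postage / expected_reactivations  # 100,000 dollars

print(f"valid addresses:         {int(valid_addresses):,}")
print(f"total postage:           ${int(total_postage):,}")
print(f"cost per valid address:  ${int(cost_per_valid_address):,}")
print(f"expected reactivations:  {float(expected_reactivations):.1f}")
print(f"cost per reactivation:   ${int(cost_per_reactivation):,}")
```

Exact fractions are used instead of floats so the results match the slide's round numbers without rounding noise.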
  12. Data is expensive
      Technorati [10 years ago]
      ● Saving all blog data history
      ● Almost killed the business ($200,000/quarter in CapEx)
      Zynga games [today]
      ● Saving player data forever in high cost, high availability storage systems with low latency
        ○ 100's of millions of player installs
        ○ 100,000 active players left
        ○ $10's of millions in high performance storage
  13. Postini email spam filtering [15 years ago]
      The secret sauce was calculating the probability that an email was spam AND, at the same time, the probability that it was NOT spam. The trick was then setting a tolerance for spam misses or false positives.
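The trade-off on slide 13 between spam misses and false positives can be illustrated with a toy classifier. This is a sketch under stated assumptions, not Postini's actual method: the function names, the way the two scores are combined, and the sample scores are all hypothetical.

```python
from typing import Iterable, Tuple

Message = Tuple[float, float, bool]  # (p_spam, p_not_spam, is_actually_spam)

def classify(p_spam: float, p_not_spam: float, tolerance: float) -> str:
    """Quarantine only when the spam score's share of the total evidence
    reaches the tolerance. The combination rule is an assumption."""
    total = p_spam + p_not_spam
    if total == 0:
        return "deliver"  # no evidence either way: let the mail through
    return "quarantine" if p_spam / total >= tolerance else "deliver"

def error_counts(messages: Iterable[Message], tolerance: float) -> Tuple[int, int]:
    """Return (spam misses, false positives) for a given tolerance."""
    misses = false_positives = 0
    for p_spam, p_not_spam, is_spam in messages:
        verdict = classify(p_spam, p_not_spam, tolerance)
        if is_spam and verdict == "deliver":
            misses += 1
        elif not is_spam and verdict == "quarantine":
            false_positives += 1
    return misses, false_positives

# Hypothetical, independently computed scores for four messages.
SAMPLE = [
    (0.95, 0.05, True),
    (0.60, 0.20, True),
    (0.50, 0.40, False),
    (0.10, 0.80, False),
]

for tolerance in (0.5, 0.7, 0.9):
    print(tolerance, error_counts(SAMPLE, tolerance))
```

On this sample a low tolerance quarantines a legitimate message, while a high tolerance lets a spam message through, which is the dial the slide describes.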
  14. Mafia Wars (even failure can teach) [7 years ago]
      Mafia Wars had a 3-tiered web application architecture.
      ● When a single memcache node failed it didn't just impact the players in that shard.
      ● If you had friends in the down shard your play might be impacted.
      ● Because it is a social game, 1 in 100 nodes down could impact as much as 60% of players.
      ● Write code defensively to allow MOST players to continue
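A back-of-envelope model shows how one failed node in 100 can reach most players, as slide 14 describes. The sketch below assumes each player and each of their friends sits on a uniformly random, independent shard; the slide gives no friend counts, so the function names and the model itself are illustrative assumptions.

```python
import math

def fraction_impacted(friends: int, shards: int = 100, down: int = 1) -> float:
    """Probability that a player, or at least one of their friends,
    lives on a failed shard (uniform, independent placement assumed)."""
    healthy = (shards - down) / shards
    return 1 - healthy ** (friends + 1)

def friends_needed(target: float, shards: int = 100, down: int = 1) -> int:
    """Smallest friend count at which the impacted fraction reaches target."""
    healthy = (shards - down) / shards
    people = math.ceil(math.log(1 - target) / math.log(healthy))
    return max(people - 1, 0)  # subtract the player themselves

for friends in (0, 10, 50, 100):
    print(friends, round(fraction_impacted(friends), 3))

print("friends needed for 60% impact:", friends_needed(0.60))
```

Under these assumptions a player with no friends is affected 1% of the time, while a population averaging roughly 90 friends each would see about 60% of players touched by a single node failure.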
  15. Viral Growth (beware of viral collapse) [today]
      On the other side of exponential growth in a social network is exponential decay. As key nodes drop out, major portions of the social graph will collapse.
      ● Measure node growth
      ● Measure connection growth
      ● Measure connection intensity (frequency of interaction)
      ● Track trends
      ● Forecast decay
      ● Anticipate collapse
  16. Multi-variant testing: "random" cohorts [today]
      A/B or multi-variant tests need a means of randomizing the test subjects. Typical hashing methods can have problems with unevenly distributed populations.
      ● Player population of 1,000,000
      ● 250,000 per cohort (A,B,C,D)
      ● Relatively even distribution
      ● Payers are < 3% of population
      ● Top spenders (red dots) < 1% of Payers (0.03%)
      ● 7,500 payers per cohort
      ● 75 top spenders per cohort
      Note that quadrant D has no top spenders.
      ● Discard outliers BUT also analyze separately
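The cohort problem on slide 16 can be checked directly: assign players by hash, then count how a small segment such as top spenders falls across cohorts. This is a minimal sketch with hypothetical names and player IDs; the slide does not say which hashing method Zynga used, and the stratified alternative shown here is one common remedy, not one the talk prescribes.

```python
import hashlib
from collections import Counter

COHORTS = "ABCD"

def cohort_for(player_id: int, experiment: str) -> str:
    """Deterministic hash-based assignment, salted by experiment name."""
    digest = hashlib.sha256(f"{experiment}:{player_id}".encode()).digest()
    return COHORTS[int.from_bytes(digest[:8], "big") % len(COHORTS)]

def cohort_counts(player_ids, experiment: str) -> Counter:
    """Count how many of the given players land in each cohort."""
    counts = Counter({c: 0 for c in COHORTS})
    counts.update(cohort_for(pid, experiment) for pid in player_ids)
    return counts

def stratified_assignment(player_ids) -> dict:
    """Round-robin within one segment so every cohort gets an equal share."""
    ordered = sorted(player_ids)
    return {pid: COHORTS[i % len(COHORTS)] for i, pid in enumerate(ordered)}

# Hypothetical IDs standing in for the 300 top spenders (0.03% of 1,000,000).
top_spenders = range(300)

print("hashed:    ", dict(cohort_counts(top_spenders, "price-test")))
print("stratified:", dict(Counter(stratified_assignment(top_spenders).values())))
```

Hashing spreads a million players evenly, but a segment of 300 can still land unevenly by chance; assigning each segment separately guarantees 75 top spenders per cohort, while the outliers can still be analyzed on their own as the slide recommends.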
  17. Remember [the future]
      Insights and meaning come from context and experience
      ● Data is not information
      ● Data is expensive
      ● Context can be wrong
      Challenge yourself to question - always
  18. Thank you
      Dorion Carroll
      linkedin.com/in/dorioncarroll/