Velocity 2016 Speaking Session - Using Machine Learning to Determine Drivers of Bounce and Conversion

Recently, Google partnered with SOASTA to train a machine-learning model on a large sample of real-world performance, conversion, and bounce data. In this talk at Velocity Santa Clara, Pat Meenan of Google and Tammy Everts of SOASTA offer an overview of the resulting model—able to predict the impact of performance work and other site metrics on conversion and bounce rates.

Using machine learning
to determine drivers
of bounce and conversion
Velocity 2016 Santa Clara

Pat Meenan
@patmeenan
Tammy Everts
@tameverts

Get the code
https://github.com/WPO-Foundation/beacon-ml

Random forest
Lots of random decision trees

Vectorizing the data
• Everything needs to be numeric
• Strings converted to several inputs as yes/no
(1/0)
• e.g. Device manufacturer
• “Apple” would be a discrete input
• Watch out for input explosion (UA String)
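
The yes/no conversion described above is commonly called one-hot encoding. A minimal, dependency-free sketch follows; the `one_hot` helper and the sample manufacturer values are illustrative assumptions, not code taken from the beacon-ml repository.

```python
def one_hot(values):
    """Turn a list of strings into 1/0 columns, one column per distinct string."""
    categories = sorted(set(values))  # sorted for a stable column order
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


# Hypothetical device-manufacturer values from three beacons.
columns, encoded = one_hot(["Apple", "Samsung", "Apple"])
print(columns)  # ['Apple', 'Samsung']
print(encoded)  # [[1, 0], [0, 1], [1, 0]]
```

The number of columns equals the number of distinct strings, which is why a high-cardinality field such as the raw UA string leads to the input explosion the slide warns about.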

Balancing the data
• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
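
The balancing step can be sketched as below, using only the standard library. The `balance` helper, the seed, and the 100-session data set are assumptions made for illustration; the talk does not say exactly how the subsampling was implemented.

```python
import random


def balance(rows, labels, seed=0):
    """Subsample the larger class so both classes have equal counts."""
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(labels) if label == 1]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    keep = min(len(positives), len(negatives))
    chosen = rng.sample(positives, keep) + rng.sample(negatives, keep)
    rng.shuffle(chosen)  # avoid all positives coming first
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


# Hypothetical data: 3 conversions in 100 sessions, matching the 3% example.
labels = [1] * 3 + [0] * 97
rows = [[i] for i in range(100)]

# Always guessing "no" is right 97% of the time but finds no conversions.
baseline_accuracy = labels.count(0) / len(labels)

balanced_rows, balanced_labels = balance(rows, labels)
print(baseline_accuracy)                                   # 0.97
print(balanced_labels.count(1), balanced_labels.count(0))  # 3 3
```

Subsampling discards most of the majority class, so it only makes sense when the original data set is large enough that the remaining sample is still representative.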

Validation data
• Train on 80% of the data
• Validate on the remaining 20% to detect overfitting
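
One way to produce the 80/20 split is sketched here with the standard library only. The `train_val_split` name, the seed, and the toy data are hypothetical; scikit-learn users would more likely reach for its own splitting utilities.

```python
import random


def train_val_split(rows, labels, val_fraction=0.2, seed=0):
    """Shuffle the indices, then hold out val_fraction of them for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    val_count = round(len(indices) * val_fraction)
    cut = len(indices) - val_count
    train, val = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in val], [labels[i] for i in val])


# Toy data: ten single-feature rows with alternating labels.
rows = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = train_val_split(rows, labels)
print(len(x_train), len(x_val))  # 8 2
```

A model that scores well on the training rows but poorly on the held-out rows has memorized the training data rather than learned a general pattern.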

Smoothing the data
ML works best on normally distributed data
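from sklearn.preprocessing import StandardScaler
# Standardize each feature to zero mean and unit variance.
# Fit on the training data only, then reuse the same scaling for validation.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)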

Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with
not bouncing
• Remove correlated features from
training
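
A simple way to spot inputs that are too tightly tied to the output is to measure each feature's correlation with the label. The sketch below uses Pearson correlation and a 0.9 cutoff; the cutoff, the helper names, and the sample sessions are all assumptions, since the slides do not state how correlated features were identified.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(var_x * var_y)


def leaky_features(columns, labels, threshold=0.9):
    """Names of features whose |correlation| with the label exceeds threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) > threshold]


# Hypothetical sessions: bounced = 1 means the visitor left after one page.
bounced = [1, 1, 0, 0, 1, 0]
columns = {
    "session_length_pages": [1, 1, 4, 5, 1, 3],  # nearly defines a bounce
    "dom_ready_ms": [2500, 900, 1200, 1000, 2600, 2400],
}
print(leaky_features(columns, bounced))  # ['session_length_pages']
```

Session length is flagged because a bounce is, by definition, a one-page session; leaving it in would let the model restate the definition instead of learning from performance data.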

Training deep learning
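from keras.models import Sequential
# Binary classifier: one output for converted/bounced (1) or not (0).
model = Sequential()
model.add(...)
model.compile(optimizer='adagrad',
              loss='binary_crossentropy',
              metrics=["accuracy"])
model.fit(x_train,
          y_train,
          epochs=EPOCH_COUNT,  # named nb_epoch in Keras 1.x
          batch_size=32,
          validation_data=(x_val, y_val),
          verbose=2,
          shuffle=True)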

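Training random forest
from sklearn.ensemble import RandomForestClassifier
# The slide used max_features='auto'; for classifiers that meant 'sqrt',
# and newer scikit-learn releases no longer accept 'auto'.
clf = RandomForestClassifier(n_estimators=FOREST_SIZE,
                             criterion='gini',
                             max_depth=None,
                             min_samples_split=2,
                             min_samples_leaf=1,
                             min_weight_fraction_leaf=0.0,
                             max_features='sqrt',
                             max_leaf_nodes=None,
                             bootstrap=True,
                             oob_score=False,
                             n_jobs=12,
                             random_state=None,
                             verbose=2,
                             warm_start=False,
                             class_weight=None)
clf.fit(x_train, y_train)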

Feature importances
clf.feature_importances_
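
`clf.feature_importances_` is an array of scores in the same order as the training columns, so it needs to be paired with the column names to be readable. A minimal sketch, with hypothetical names and scores standing in for the real model output:

```python
def rank_features(names, importances):
    """Return (rank, name, importance) tuples, most important first."""
    ordered = sorted(zip(names, importances),
                     key=lambda pair: pair[1], reverse=True)
    return [(rank, name, score)
            for rank, (name, score) in enumerate(ordered, start=1)]


# Hypothetical names and scores; in practice the scores would come from
# clf.feature_importances_, in the same order as the training columns.
names = ["dom_ready", "dns_lookup", "load_time"]
importances = [0.5, 0.1, 0.4]
ranking = rank_features(names, importances)
for rank, name, score in ranking:
    print(rank, name, score)
```

The scores are relative to one another within a single trained forest, so they are best used for ordering features rather than compared across models.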

What’s in our beacon?
• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
• Etc.

Finding 1
Number of scripts was a predictor… but
not in the way we expected

Finding 2
When entire sessions were more
complex, they converted less

Finding 3
Sessions that converted
had 38% fewer images
than sessions that didn’t

Finding 4
DOM ready was the greatest indicator
of bounce rate

Finding 5
Full load time was the second greatest
indicator of bounce rate

Finding 6
Mobile-related measurements weren’t
meaningful predictors of conversions

Finding 7
Some conventional metrics
were (almost) meaningless, too

Feature Importance (out of 93)
DNS lookup 79
Start render 69

1. YMMV
2. Do this with your own data
3. Gather your RUM data
4. Run the machine learning
against it

Editor's Notes

mPulse is built on top of boomerang, a JavaScript library that collects web performance data from a user’s web browser and sends it back to the mPulse servers on a beacon. Put simply, a beacon is an HTTP(S) request that carries a large amount of data, either as HTTP headers or as part of the request’s query string.
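
To make the idea of a beacon concrete, the sketch below unpacks the query string of one such request using Python's standard library. The host and the parameter names in the sample URL are made up for illustration and are not claimed to match real boomerang or mPulse beacons.

```python
from urllib.parse import parse_qs, urlsplit


def parse_beacon(url):
    """Return the beacon's query-string parameters as a flat dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical beacon URL; the host and parameter names are invented.
beacon = parse_beacon(
    "https://beacon.example.com/collect?page=%2Fcart&t_done=1850&ssl=1")
print(beacon)  # {'page': '/cart', 't_done': '1850', 'ssl': '1'}
```

Every value arrives as a string, which is one reason the data has to be vectorized into numeric inputs before training.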
Sessions that converted contained 48% more scripts (including third-party scripts, such as ads, analytics beacons, and social buttons) than sessions that didn’t. Why? One plausible answer is that checkout pages tend to be more heavily scripted than other pages in the conversion funnel. Takeaway: Just because shoppers are converting on pages with lots of scripts doesn’t mean those pages are delivering the best possible user experience. More scripts -- especially third-party scripts, which are hosted externally -- can wreak havoc on page loads. Site owners should be aware of the performance impact of all their scripts.
While the previous finding tells us that more scripts correlate with increased conversions, sessions converted less when more images and other elements made their pages more complex. Why? The culprit might be the cumulative performance impact of all those page elements. The more elements on a page, the greater the page’s weight (total number of kilobytes) and complexity. Takeaway: A typical web page today contains a hundred or so assets hosted on dozens of different servers. Many of these page assets are unoptimized, unmeasured, unmonitored -- and therefore unpredictable. This unpredictability makes page loads volatile. Site owners can tackle this problem by setting performance budgets for their pages and culling unnecessary page elements. They should also audit and monitor all the third-party scripts on their sites.
When we talk about images, we’re referring to every single graphic element on a page -- from favicons to logos to product images. On a retail site, those images can quickly add up. On a typical retail page, images can easily account for up to two-thirds (in other words, hundreds of kilobytes) of a page’s total weight. The result: cumulatively slow page loads throughout a session.
“DOM ready” refers to the amount of time it takes for the page’s HTML to be received and parsed by the browser. Actual page elements, such as images, haven’t appeared yet. (It’s kind of like getting ready to cook. Your cookbook is open, your recipe is in front of you, and your ingredients are on standby.) Our research found that bounced sessions had DOM ready times that were 55% slower than non-bounced sessions. We also found that the bounce rate was higher when the first page in a user session was slow. Takeaway: External blocking scripts (such as third-party ads, analytics, and social widgets) and styles (such as externally hosted CSS and fonts) have the greatest impact on DOM ready times. Site owners should measure the impact that these external elements have on their pages and conduct ongoing monitoring to ensure that scripts and styles are available and fast. Whenever possible, scripts should be served asynchronously (in parallel with the rest of the page) or in a non-blocking fashion.
Bounced sessions had median full page load times that were 53% slower than non-bounced sessions. Within the performance community, there has been a growing tendency to regard load time as a meaningless metric. With such a strong correlation between it and bounce rate, dismissing load time may be premature.
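
A "percent slower" comparison of medians like the one above can be computed as follows. The timing values are invented so that the arithmetic is easy to follow; they are not the study's data, and `percent_slower` is a hypothetical helper.

```python
from statistics import median


def percent_slower(slow_group, fast_group):
    """How much larger the median of slow_group is than fast_group's, in percent."""
    base = median(fast_group)
    return (median(slow_group) - base) * 100 / base


# Hypothetical full-load times in milliseconds.
bounced = [2800, 3060, 3900]
not_bounced = [1500, 2000, 2600]
result = percent_slower(bounced, not_bounced)
print(result)  # 53.0
```

Medians are used rather than means because load-time distributions have long tails, and a handful of very slow sessions would otherwise dominate the comparison.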
Shoppers who used low-bandwidth or mobile connections didn’t convert significantly less than shoppers on faster connections. This is interesting because it confirms that we’ve entered a “mobile everywhere” phase. Takeaway: Internet users don’t behave especially differently depending on what device they’re using. Site owners need to ensure they’re delivering consistent user experiences across device types.
DNS lookup is the step in which the browser resolves the domain of a requested object to an IP address. Think of this as asking the “phone book” of the internet to find someone’s phone number using their first and last name. Start render tells you when content begins to display in the user’s browser. But it’s important to note that start render time doesn’t indicate whether that initial content is useful or important, or simply ads and widgets. This research found that neither of these metrics was a meaningful predictor of conversions. This finding is especially interesting as it pertains to start render time. Up until now, many user experience proponents who participate in the web performance community have placed some value on start render time. This makes sense, because -- on paper, anyway -- start render would seem to reflect the user’s perception of when a page begins to load. But this research suggests that start render isn’t an accurate measure of the user experience -- at least as it pertains to triggering more conversions. Takeaway: There’s an interesting observation to be made here about how performance measurement is driven by what we’re able to measure versus what we need to measure. Performance measurement tools can gather massive amounts of data about a wide swath of metrics, but are all those metrics meaningful? To what extent do we, as people who care about measuring the user experience, let the tail wag the dog?
