Recently, Google partnered with SOASTA to train a machine-learning model on a large sample of real-world performance, conversion, and bounce data. In this talk at Velocity Santa Clara, Pat Meenan of Google and Tammy Everts of SOASTA offer an overview of the resulting model—able to predict the impact of performance work and other site metrics on conversion and bounce rates.
7. Vectorizing the data
• Everything needs to be numeric
• Strings converted to several yes/no (1/0) inputs
• e.g. device manufacturer
• “Apple” would be a discrete input
• Watch out for input explosion (UA String)
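The bullets above describe what is usually called one-hot encoding. Here is a minimal sketch in plain Python, assuming sessions are stored as dictionaries; the function name, field names, and sample values are hypothetical and not taken from the talk.

```python
def one_hot_encode(rows, column):
    """Replace one string column with a 1/0 input per distinct value."""
    # Every distinct string becomes its own input, which is why a
    # high-cardinality field such as a raw UA string "explodes".
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded, categories


# Hypothetical sample sessions, for illustration only.
sessions = [
    {"manufacturer": "Apple", "load_time_ms": 2100},
    {"manufacturer": "Samsung", "load_time_ms": 3400},
    {"manufacturer": "Apple", "load_time_ms": 1800},
]
encoded_sessions, manufacturers = one_hot_encode(sessions, "manufacturer")
```

With two manufacturers this adds only two inputs; applying the same function to a field with thousands of distinct values would add thousands.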
8. Balancing the data
• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
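To illustrate the slide above, this sketch shows why a 3% conversion rate makes "always guess no" look 97% accurate, and one possible way to subsample the majority class down to a 50/50 mix. The data, the `converted` field, and the helper name are assumptions made for the example; the talk does not show its own subsampling code.

```python
import random


def balance_by_subsampling(rows, label_key, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Hypothetical data set: 1,000 sessions with a 3% conversion rate.
sessions = [{"id": i, "converted": 1 if i < 30 else 0} for i in range(1000)]

# A model that always predicts "no conversion" is right for every non-converter.
baseline_accuracy = sum(1 for s in sessions if s["converted"] == 0) / len(sessions)

balanced = balance_by_subsampling(sessions, "converted")
```

On the balanced set, always guessing "no" is right only half the time, so accuracy becomes a more meaningful signal during training.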
10. Smoothing the data
ML works best on normally distributed data
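from sklearn.preprocessing import StandardScaler
# Fit the scaler on the training set only, then reuse it for validation
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)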
11. Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove correlated features from training
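One possible way to act on the last bullet is to measure each input's Pearson correlation with the outcome and drop inputs above a cutoff. This is only a sketch: the 0.9 threshold, the feature names, and the sample values are assumptions, and the talk does not say how the correlated features were actually identified.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def drop_label_correlated(features, labels, threshold=0.9):
    """Keep only features whose |correlation| with the label is below threshold."""
    return {
        name: values
        for name, values in features.items()
        if abs(pearson(values, labels)) < threshold
    }


# Hypothetical data: is_ssl mirrors the conversion label exactly.
converted = [1, 1, 0, 0, 1, 0]
features = {
    "is_ssl": [1, 1, 0, 0, 1, 0],
    "image_count": [12, 30, 25, 8, 19, 22],
}
kept = drop_label_correlated(features, converted)
```

Here `is_ssl` is removed because it tracks the label perfectly, while `image_count` stays in the training set.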
12. Training deep learning
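from keras.models import Sequential

model = Sequential()
model.add(...)
# Binary outcome (converted or not, bounced or not), so use binary cross-entropy
model.compile(optimizer='adagrad',
              loss='binary_crossentropy',
              metrics=["accuracy"])
model.fit(x_train,
          y_train,
          epochs=EPOCH_COUNT,  # this argument was named nb_epoch in Keras 1.x
          batch_size=32,
          validation_data=(x_val, y_val),
          verbose=2,
          shuffle=True)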
mPulse is built on top of the boomerang JavaScript library, which collects web performance data from a user’s web browser and sends it back to the mPulse servers on a beacon. Put simply, a beacon is an HTTP(S) request that carries a large amount of data, either as HTTP headers or as part of the request’s query string.
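As a rough illustration of the query-string form of a beacon, the sketch below packs a few measurements into a URL and then parses them back the way a collecting server might. The endpoint and parameter names are hypothetical; they are not boomerang's or mPulse's actual beacon format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_beacon_url(endpoint, measurements):
    """Append the measurements to the endpoint as a URL-encoded query string."""
    return f"{endpoint}?{urlencode(measurements)}"


# Hypothetical endpoint and field names, for illustration only.
beacon_url = build_beacon_url(
    "https://beacon.example.com/collect",
    {"page": "/checkout", "load_ms": 2100, "dom_ready_ms": 900},
)

# On the receiving side, the query string decodes back into named values.
received = parse_qs(urlsplit(beacon_url).query)
```

Every value arrives as a string, so a collector would still need to convert timings back to numbers before analysis.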
Sessions that converted contained 48% more scripts (including third-party scripts, such as ads, analytics beacons, and social buttons) than sessions that didn’t.
Why? One likely answer is that checkout pages are likely to be more scripted than other pages in the conversion funnel.
Takeaway: Just because shoppers are converting on pages with lots of scripts doesn’t mean those pages are delivering the best possible user experience. More scripts -- especially third-party scripts, which are hosted externally -- can wreak havoc on page loads. Site owners should be aware of the performance impact of all their scripts.
While the previous finding tells us that more scripts correlate with increased conversions, sessions converted less once pages also carried more images and other elements that add complexity.
Why? The culprit might be the cumulative performance impact of all those page elements. The more elements on a page, the greater the page’s weight (total number of kilobytes) and complexity.
Takeaway: A typical web page today contains a hundred or so assets hosted on dozens of different servers. Many of these page assets are unoptimized, unmeasured, unmonitored — and therefore unpredictable. This unpredictability makes page loads volatile. Site owners can tackle this problem by setting performance budgets for their pages and culling unnecessary page elements. They should also audit and monitor all the third-party scripts on their sites.
When we talk about images, we’re referring to every single graphic element on a page -- from favicons to logos to product images. On a retail site, those images can quickly add up. On a typical retail page, images can easily comprise up to two thirds (in other words, hundreds of kilobytes) of a page’s total weight. The result: cumulatively slow page loads throughout a session.
“DOM ready” refers to the amount of time it takes for the page’s HTML to be received and parsed by the browser. Actual page elements, such as images, haven’t appeared yet. (It’s kind of like getting ready to cook. Your cookbook is open, your recipe is in front of you, and your ingredients are on standby.)
Our research found that bounced sessions had DOM ready times that were 55% slower than non-bounced sessions. We also found that the bounce rate was higher when the first page in a user session was slow.
Takeaway: External blocking scripts (such as third-party ads, analytics, and social widgets) and styles (such as externally hosted CSS and fonts) have the greatest impact on DOM ready times. Site owners should measure the impact that these external elements have on their pages and conduct ongoing monitoring to ensure that scripts and styles are available and fast. Whenever possible, scripts should be served asynchronously (in parallel with the rest of the page) or in a non-blocking fashion.
Bounced sessions had median full page load times that were 53% slower than non-bounced sessions.
Within the performance community, there has been a growing tendency to regard load time as a meaningless metric.
With such a strong correlation between it and bounce rate, dismissing load time may be premature.
Shoppers who used low-bandwidth or mobile connections didn’t convert significantly less than shoppers on faster connections. This is interesting because it confirms that we’ve entered a “mobile everywhere” phase.
Takeaway: Internet users don’t behave especially differently depending on what device they’re using. Site owners need to ensure they’re delivering consistent user experiences across device types.
DNS lookup is the step in which the browser resolves the domain name of a requested object to an IP address. Think of it as asking the “phone book” of the internet for someone’s phone number using their first and last name.
Start render tells you when content begins to display in the user’s browser. But it’s important to note that start render time doesn’t indicate whether that initial content is useful or important, or simply ads and widgets.
This research found that neither of these metrics correlated significantly with conversions. This finding is especially interesting as it pertains to start render time. Up until now, many user experience proponents in the web performance community have placed some value on start render time. This makes sense, because -- on paper, anyway -- start render would seem to reflect the user’s perception of when a page begins to load. But this research suggests that start render isn’t an accurate measure of the user experience -- at least as it pertains to triggering more conversions.
Takeaway: There’s an interesting observation to be made here about how performance measurement is driven by what we’re able to measure versus what we need to measure. Performance measurement tools can gather massive amounts of data about a wide swath of metrics, but are all those metrics meaningful? To what extent do we, as people who care about measuring the user experience, let the tail wag the dog?