Back a few years ago when social games were exploding on Facebook the
conventional wisdom was that you wanted to release your minimum viable product as
quickly as you could, and iterate on it in the wild with real data from players. But that
only made sense in a world where if you “wasted” your early traffic on a poor game it
was relatively easy to get more. Mobile is the opposite: traffic is always at a premium
and a strong global launch has become crucial to success. It’s the best chance to get
substantial features from the platforms, to get noticed in the new charts, and
potentially get picked up by recommendation algorithms.
Ideally we’d all be like Blizzard & Supercell and be able to polish and test games
internally for years, but there are all sorts of pressures and costs pushing games out
the door. For companies with <$1B in annual profit it’s important to be realistic about
what, why, and how to test to maximize our chance at success.
We started Kongregate back in 2006 as an open platform for browser games, a little
like YouTube for games: anyone can upload, we then add chat, comments, forums,
achievements, and a lot of other social features that make the site a whole game
itself. In 10 years more than 100,000 games have been put on our site, covering
pretty much every genre imaginable. A fairly wide array of games are popular, from
casual puzzles and launch games to MMOs and collectible card games. Overall our
audience trends male, with heavy overlap with console and Steam.
Four years ago we started publishing third party games on mobile, and have
launched more than 20 games in the last 3 years. Like on Kongregate itself we
publish a fairly broad range of games, from more niche, high monetizing RPGs and
CCGs to single-player games with mostly ad monetization. To give you a feel of the
range here are the games we published in 2015.
And like everything in life they have different strengths and weaknesses. To use them
properly it’s important to understand what those are, so I’m going to spend a bit of
time going through the different types.
One note: I’m not going to be talking much about defect and bug-oriented QA testing,
which is not to say it’s not important. It’s very important, and you should do it, ideally
with dedicated in-house resources supplemented by 3rd party resources. But I only
have an hour so I have to skimp on some topics.
By Team Playtests I mean both the testing you do naturally as you add on features to
the game, and also scheduled team sessions to look at the game more broadly.
Game is available, everyone’s getting paid to do this
You don’t play like a player, you know how things are supposed to work. And you
know games too well – every convention is obvious to you. And since you’re testing to
make sure features are working you play the game in totally unnatural ways: “I’m
jumping right to x,” “I’m pushing all the buttons,” “I’m going to play with an OP account
to breeze through.”
By in-person playtests I mean getting a 3rd party unrelated to game development to
play the game. It could be as informal as handing somebody a device out in the wild,
or could be an organized, in-office thing where you hire people to come in and play
the game.
•  They’re a pain-in-the-ass to arrange whether you’re bringing strangers from
Craigslist into your office or harassing them at a coffee shop. And they take a lot
of skill to run & analyze well: you have to avoid prompting the person, jumping to
solutions, or conflating problems, and you have to hear beyond what they say
•  They are psychologically difficult – exposing your work is hard. It’s pretty
equivalent to the feeling most of us have about getting up to sing in a public
karaoke bar, only without it being appropriate to get a little drunk first. Combined
with being a pain to arrange, they get pushed off indefinitely: “it’s not ready,” “they’ll
only tell us what we already know”.
•  Depending on how you’re recruiting
Companies like Usertesting.com (whom we use) and others
•  Generally a directed task testers are supposed to complete, so less natural
experience than just picking up a game and exploring, higher willingness to
“figure it out”
•  Limited view of body language, narration expresses conscious thoughts, not
unconscious, no chance to follow-up
•  Still limited to first session experience, now with a time limit
•  Still small sample, luck of the draw on testers, some selection bias in terms of
who takes these kinds of gigs
By this I don’t mean in person, but sending out a mobile build or a link to a web
version to a group of people you know and seeing what happens
•  More realistic experience, they’re testing the whole game across as many
sessions as they want to play
•  Depending on who it goes to you can get good qualitative feedback
•  Audience tends to be biased in your support, professional game developers, or
both. A lot of people won’t want to hurt your feelings. You’re unlikely to hear “this
sucks” even if it’s true.
•  Low sample size & unrepresentative/biased audience = mostly garbage metrics
The way to think about this: if we assume that tutorial completion is around 80%
globally, and I get 400 installs representative of the global audience, then 95% of the
time their tutorial completion rate will be between about 76% and 84%. With only 100
installs that range widens to about 72% to 88%.
Now just because you have a smaller sample than this doesn’t mean metrics are
useless. In a normal distribution you are more likely to be closer to the mean than
farther away. My unscientific rule of thumb is that you start getting directionally useful
if not accurate metrics when an event has occurred 75 or more times. So for
directionally useful tutorial results you need about 100 people installing a game, and
for directionally useful buyer conversion rate more like 5,000.
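
The sampling arithmetic above can be sketched in a few lines. This is a minimal illustration using the normal approximation to the binomial, not a substitute for a proper statistics package; the function names are made up for this example, and the 1.5% buyer conversion rate is an assumption implied by the "75 events from about 5,000 installs" rule of thumb rather than a number stated in the talk. Under this approximation, 400 installs at an 80% true rate land between roughly 76% and 84% about 95% of the time, and 100 installs between roughly 72% and 88%.

```python
import math


def proportion_interval(p, n, z=1.96):
    """Approximate 95% range for an observed rate when the true rate is p
    and the sample has n installs (normal approximation to the binomial)."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the observed rate
    return (p - z * se, p + z * se)


def installs_for_events(event_rate, min_events=75):
    """Installs needed before an event with the given rate is expected to
    have occurred min_events times (the rule of thumb from the talk)."""
    return math.ceil(min_events / event_rate)


if __name__ == "__main__":
    # How the range around an 80% tutorial completion rate tightens with volume.
    for n in (100, 400, 1600):
        low, high = proportion_interval(0.80, n)
        print(f"{n:>5} installs: {low:.1%} to {high:.1%}")

    # Installs needed for "directionally useful" numbers.
    print("tutorial completion (80%):", installs_for_events(0.80))
    print("buyer conversion (assumed 1.5%):", installs_for_events(0.015))
```

The width of the range shrinks with the square root of the sample size, which is why quadrupling installs only halves the uncertainty.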
By this I mean inviting a broader group of players to play a game not yet released,
either on web or Android, usually volunteers from a fanbase
•  All the benefits of friends & family tests with larger sample sizes!
•  REALLY engaged audience excited to give feedback, not constrained by
politeness. They’ll tell you your game sucks.
•  Chance to build a community for your game pre-release
On Kongregate.com beta access to games is a benefit of Kong+, our ad-free version,
which players pay $29.99/year for. We also gift it to our volunteer moderators and big
spenders. There are about 30,000 MAU, and the average beta game gets ~3k beta users.
We consistently have 5-10 games in closed betas. We use the program for our publishing
portfolio, but it is also open to other developers.
Metrics are the product of the audience and the underlying game. You can get
average metrics by either putting poorly qualified traffic into an amazing game, or by
putting amazing traffic into a mediocre game. Now this is somewhat obvious, but if
you’ve been working on a game for a year it’s easy to forget about audience, and
think metrics are entirely about the game. It is especially easy to underestimate how
BIG the audience swing can be.
This was the most extreme split we’ve ever seen, with a 9X difference in % buyers,
which is more commonly 3-4x higher in beta than in global release. But even though
it’s inflated it’s still useful: we could tell this was going to be a high ARPU game with
mediocre initial retention but good long-term retention. (Note: web D1 tends to be
much lower than mobile, but they are comparable by D30.)
This is the now classic method for mobile, releasing fully but in just a few countries,
often Canada and/or Australia
The real thing is hard, and a lot of the weaknesses are just aspects of the mobile game
market
•  It’s hard work releasing anything on mobile – builds, screenshots, but particularly
getting games working on such a broad range of devices
•  Long Apple approval times make iteration slow even when you can quickly fix a
problem
•  Traffic doesn’t magically show up in games and buying it is expensive – average
is ~$3 per install in Canada & Australia, a bit less on Android
You can do
Australia & Canada may be good proxies for the US, but that’s likely to be <1/3 of
installs
They used to be closer, but the gap widened after the release of the iPhone 6 when a
lot of high end device users switched back to iOS
Especially for more niche, high LTV genres like CCGs, RPGs & Strategy games
whether it’s CPIs going up or retention & LTV going down after your first “golden
cohort”.
In a small market a few big buyers can blow out a market. And test market spend
tends to be less ROI focused, so you see weird patterns.
Spellstone, a polished collectible card game from Synapse Games, shows some of
the dynamics. You can see that the performance of iOS is generally much stronger
than Android, and that paid generally is quite a bit stronger than organics. But the
really dramatic number is the huge drop in performance on organic traffic coming from
the substantial features that the game got, with around a 70% drop in the ARPU on
both iOS and Android. This is most dramatic with more niche, high LTV genres like
CCGs, RPGs & Strategy games. For a more casual, broad audience game like
AdVenture Capitalist we don’t typically see a big delta.
Your game is driven by outliers and their presence or absence distort almost anything
you look at.
Binary “yes/no” metrics like D1 retention or tutorial completion are more reliable than
averages involving engagement or revenue. And the deeper your game, the less
spending is capped, the more unpredictable those averages get. So as much as
possible look at binary metrics that are proxies for the averages, or answer the
questions. % repeat buyers, for example, rather than ARPPU.
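
To make the contrast concrete, here is a small simulation sketch comparing how much ARPPU and % repeat buyers bounce around across equally sized samples of buyers. Everything in it is an assumption made for illustration: the lognormal spend distribution, its parameters, the 40% repeat rate, and the simplification that spend and repeat purchasing are independent. None of it is data from the talk.

```python
import random
import statistics


def simulate(trials=200, buyers_per_sample=100, seed=7):
    """Draw many small buyer samples and record ARPPU and % repeat buyers
    for each one. Spend is heavy-tailed (lognormal); repeat is yes/no."""
    rng = random.Random(seed)
    arppus, repeat_rates = [], []
    for _ in range(trials):
        # Hypothetical per-buyer spend with a long tail of big spenders.
        spend = [rng.lognormvariate(2.0, 1.5) for _ in range(buyers_per_sample)]
        # Hypothetical 40% chance that a buyer purchases a second time.
        repeats = sum(rng.random() < 0.40 for _ in range(buyers_per_sample))
        arppus.append(statistics.mean(spend))
        repeat_rates.append(repeats / buyers_per_sample)
    return arppus, repeat_rates


def relative_spread(values):
    """Standard deviation as a fraction of the mean (coefficient of variation)."""
    return statistics.pstdev(values) / statistics.mean(values)


if __name__ == "__main__":
    arppus, repeat_rates = simulate()
    print(f"ARPPU relative spread:          {relative_spread(arppus):.2f}")
    print(f"% repeat buyers relative spread: {relative_spread(repeat_rates):.2f}")
```

With these assumed parameters the ARPPU estimate swings noticeably more from sample to sample than the repeat-buyer rate, because a handful of outlier spenders dominate the average.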
Don’t get me wrong, I love A/B tests. They divorce correlation from causation, and
manage the audience mix problems well, too. But don’t expect to be able to run a lot
of A/B tests in test markets unless you’re willing to spend major $$s. The problem is
sample size. Take the numbers I was giving earlier for cohort sizes then double them
for an A/B test. And with an A/B test you need to measure the results fairly precisely;
directional numbers are not good enough, or you can make bad choices based on the
results.
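
For a rough sense of the sample sizes involved, the sketch below uses the standard two-proportion sample size formula (normal approximation, two-sided test). This is a textbook calculation swapped in for illustration, not the method used in the talk, and the 40% versus 44% D1 retention figures are hypothetical.

```python
import math
from statistics import NormalDist


def installs_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate installs needed in EACH arm of an A/B test to detect the
    difference between two rates at the given significance and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_variant) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(
            p_control * (1 - p_control) + p_variant * (1 - p_variant)
        )
    ) ** 2
    return math.ceil(numerator / (p_control - p_variant) ** 2)


if __name__ == "__main__":
    # Hypothetical: can a new tutorial lift D1 retention from 40% to 44%?
    n = installs_per_arm(0.40, 0.44)
    print(f"{n} installs per arm, {2 * n} in total")
    # A smaller expected lift needs far more traffic.
    print(installs_per_arm(0.40, 0.42), "installs per arm for a 2-point lift")
```

Halving the lift you want to detect roughly quadruples the installs required, which is where the cost of A/B testing in test markets comes from.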
More of a strategy than a testing method, but one I think is underused. I’m biased of
course! I have a web portal. But this strategy has worked for a lot of big successes,
from King with Candy Crush to Blizzard recently with Hearthstone.
– Steam, Kongregate, Facebook, Miniclip, Newgrounds, Addicting Games and
hundreds more. But it’s important to find the right audience fit – Facebook is a much
better choice for a very casual game than Steam or Kongregate
Better social feature support (forums, videos, streaming, etc) to build community
Comparable LTVs (at least on Kongregate)
Chrome no longer supports the Unity plug-in, and Firefox will likely kill it by the end of
the year. Flash is still going for now, but will likely be phased out in a few years. But
the webGL export from Unity is improving rapidly, and there are a lot of other good
cross-platform frameworks to work with, such as Haxe.
Over the next 6 months they worked on polishing the UI, adding monetization, all
while releasing it on dozens of additional platforms across the web, and were able to
extend the content and improve the balance while building a bigger and bigger
audience.
Mobile test markets lasted about 2 months, focused on mobile device stability, FTUE
and the new rewarded video integration, a huge addition to monetization
And since each type of test adds something, the ideal plan is to use all of them, in
approximately this order, with company playtests throughout as a given.
How much money and time you have are intimately related: time is generally the
biggest cost because each month of a studio’s burn rate adds up. But there are
situations where that’s not true: in a big company you may have intense pressure to
launch by a certain date but a lavish budget for test market marketing. Or an indie
doing work-for-hire or with a full-time job may have little time pressure but no money
to buy traffic. The indie should focus on in-person play tests and closed betas, while
with enough money the big studio can blast their way through geo-locked test
markets.
A simple puzzle game or endless runner that is easy to pick up and play is going
to get more value out of the experiential testing that helps nail the fun in the core
experience, but isolated, single-session tests are much less useful to multiplayer
games with deep metagames and economies, where long-term metric-based testing
is crucial to getting things right.
Games with lots of graphics, or that are otherwise technically demanding, are going to
need extra time in mobile geo-locked test markets to deal with all the problems that
crop up with low memory and low GPU devices on both Android & iOS
And finally: what are your goals? Do you expect to get significant features from the
platforms? Are you looking for top 10 grossing? Top 1000? The bigger the launch you’re
expecting, the higher the stakes and the more crucial it is to have the game in the
best state possible at launch.
Assuming both time and money are at least somewhat constrained. Say a mid-sized
studio with most of their burn covered by existing game income, but not a big cushion.
This still assumes 6 months from friends/family to global mobile release. If you cut it
much shorter on a multiplayer game you will almost certainly regret it.
Single-player game made by a small cash-strapped team, mobile-specific controls
and ad-based monetization. In this case most of the real game testing and iteration
should be on the back of rigorous, frequent, in-person testing. Then mobile test
market can concentrate on just a few metrics.
During pre-production & production the key question you’re asking is “Is this fun? Are
we on the right track?” The sooner you figure out something isn’t working as you
expected, the easier it is to fix. The more you are departing from convention and
comparables the more you need to validate as you go along.
As you approach release you’re asking “Is This Ready”? It’s a great time to do remote
playtests focused on the first time user experience, and make sure that analytics are
hooked up and firing correctly – that last is not a given, analytics are very easy to
screw up. This is also a great time to send your game to a 3rd party QA service to test
on a broad range of devices if you’re going straight to mobile.
It’s the next Clash of Clans! Or Crossy Road! Everything is broken! Who would even
play this piece of shit? Total failure.
Depending on the person, they may cherry-pick the good, or focus only on the bad.
This is where the rubber really meets the road. To know if the game is working, you
have to know what you’re looking for.
We have a very successful game with 20% D1 retention. We have very successful
games with $0.03 ARPDAUs.
So here are some sample metrics ranges, from low to high, for the genres I’ve been
using for example test plans, then for all genres. This is loosely based on the metrics
we’ve seen from games in our portfolio and more generally on Kongregate.com. As
you can see the “all genres” low/medium/high can be pretty different from the ranges
by genres. Good retention for a multiplayer RPG game is drastically lower than an
idle game. Good monetization for a casual runner would be terrible for that same
RPG. What’s missing from this is expectations around traffic: that casual runner has
probably 10x the potential traffic of the multiplayer RPG.
Here’s a couple of outcome models based on the average metrics for each of these
genres, then broken out by different levels of traffic. You’ll notice these profitability
scenarios aren’t great: games with average metrics need exceptional traffic to
become a sustainable business. In general for a success you need at least one or two
metrics to fall in one of the “high scenarios”.
Set your goals matching realistic expectations of the genre and acceptable (not ideal)
business outcomes.
If you need to hit top 20 grossing to justify huge budget/company expectations, your
goals should be much higher than if you mostly want to learn.
It’s not that you don’t look at other metrics, but you want to set the gates based on the
most important one for that stage.
One of the benefits of breaking your testing into stages is that on mobile it allows you
to use a wider variety of test markets. Canada & Australia are not only expensive
places to test stability, they’re a bad choice because they have a much lower % of the
low-end devices most likely to trigger issues. That’s better tested in emerging
markets. Overall testing in a range of countries from tier 1 to tier 3 will give you a
much more representative view of your global performance than limiting to a few
English-speaking markets. We’ve used more than 20 different countries in the last
year, all shown in this map.
Note that the sample sizes are cumulative. 12k isn’t enough for statistical significance
on buyer %, but 25k is.
This is not optional, but a surprising number of developers don’t do it. Crashes are
annoying to players, affecting both retention and your ratings and reviews in the app
stores. We recommend a 3rd party service like Fabric/Crashlytics or Hockey App,
though there’s quite a bit of info in Google Play console as well.
Stage 2 is where you’re really optimizing your game, and you should look at it in the
same order that players are moving through the game, as that’s the order in which
sample sizes will get large enough, and because problems in one will likely flow into
the next. You start with the progression through the first time user experience,
checking drop-off pre-tutorial, and then at each tutorial step. Then you’ll watch PVE
progression, what’s the progression through missions, what are the win rates. After
players have been in the game a little longer you want to look at PVP participation
and win rates, and then finally at the economy, where you should look at the full sinks
and sources flow, but keep a particularly close eye on resource balances and how
they grow.
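
The first step of that sequence, checking drop-off before and during the tutorial, comes down to a simple funnel calculation. The sketch below assumes you can already count how many players reached each step; the step names and counts are invented for the example.

```python
def funnel_dropoff(step_counts):
    """step_counts: ordered list of (step_name, players_reaching_step),
    starting with installs. Returns one row per step with the share of
    installs that reached it and the drop-off from the previous step."""
    rows = []
    first = step_counts[0][1]
    prev = first
    for name, count in step_counts:
        rows.append({
            "step": name,
            "reached": count,
            "pct_of_installs": count / first if first else 0.0,
            "dropoff_from_prev": 1 - count / prev if prev else 0.0,
        })
        prev = count
    return rows


if __name__ == "__main__":
    # Hypothetical counts for a small test-market cohort.
    cohort = [
        ("install", 1000),
        ("tutorial_start", 900),
        ("tutorial_step_2", 810),
        ("tutorial_complete", 700),
    ]
    for row in funnel_dropoff(cohort):
        print(f"{row['step']:<18} {row['reached']:>5} "
              f"{row['pct_of_installs']:>6.1%} of installs, "
              f"{row['dropoff_from_prev']:>5.1%} lost at this step")
```

Looking at the loss at each step, rather than only the final completion rate, points to the specific step that needs work.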
Retention is the KPI reflection of progression. Without longer term retention, which
reflects commitment and engagement with the game, few people will pay. Conversion
reflects both engagement and balance: do they care enough to buy something, and is
there a good reason to. If retention is good but conversion is bad, then either what’s
being sold is not compelling, the balance is not challenging enough for it to be useful,
or the economy is imbalanced in a way where there’s no reason to purchase,
because you can get it for free. Note that there is some tension between retention
and conversion, though, because tight economies may make players more likely to
churn.
New buyer packs can be great at boosting conversion, while masking underlying
problems. One of the most important stats to look at is repeat conversion – how many
people buy a 2nd time? A 3rd? Repeat conversion shows both how players feel about
their first purchase and whether there is depth of spend. If you have a high
conversion rate but low repeat purchasing, your game will just pop and drop.
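
Repeat conversion is straightforward to compute from a purchase log. The sketch below assumes the log is just a list of player ids, one entry per purchase, and that you know the cohort's install count; both the input shape and the field names in the result are assumptions for this example.

```python
from collections import Counter


def repeat_conversion(purchases, installs):
    """purchases: iterable of player ids, one entry per purchase.
    installs: number of players in the cohort.
    Returns first-purchase conversion plus the share of buyers who
    bought at least a 2nd and at least a 3rd time."""
    per_player = Counter(purchases)
    buyers = len(per_player)
    second = sum(1 for count in per_player.values() if count >= 2)
    third = sum(1 for count in per_player.values() if count >= 3)
    return {
        "conversion": buyers / installs if installs else 0.0,
        "repeat_buyer_pct": second / buyers if buyers else 0.0,
        "third_purchase_pct": third / buyers if buyers else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical log: player "a" bought twice, "b" once, "c" three times.
    log = ["a", "a", "b", "c", "c", "c"]
    print(repeat_conversion(log, installs=100))
```

A new buyer pack can push up `conversion` on its own; it is the 2nd and 3rd purchase shares that show whether there is depth of spend behind it.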
Note that I haven’t mentioned ARPPU. That’s because while it’s an important metric,
it’s not one you can really look at with any reliability: revenue per buyer is heavily
skewed (roughly an exponential distribution) and very erratic in small samples. However,
ARPPUs are capped fairly low unless there are repeat purchases, so looking at repeat
purchase rate, a yes/no metric whose sample values are close to normally distributed,
answers the same question with better stability.
It’s human nature to project causation, so as you make changes to your game you’re
likely to look at daily numbers and think it’s the result of what you’ve done. Resist.
Even with a game doing a reasonably big test market you get tons and tons of
random variation in the daily numbers. These numbers are from Spellstone’s test
markets, which had between 1,000 and 1,500 DAU through most of this but the daily
numbers are all over the place.
We track rolling averages, which helps, but if you want to look at the impact of
changes roll up cohorts from before and after the change to get a statistically
significant sample to look at. And still take that with a grain of salt because of
audience mix and confounding effects from other changes you’re making.
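
Both ideas can be sketched briefly: a rolling average to smooth daily noise, and a two-proportion z-test to compare rolled-up cohorts from before and after a change. The z-test is a standard textbook choice made for this example, not necessarily what we run internally, and the cohort numbers are hypothetical. It says nothing about audience mix or other confounding changes, so the caveat above still applies.

```python
import math
from statistics import NormalDist


def rolling_mean(values, window=7):
    """Trailing rolling average; the first window-1 days are dropped."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def two_proportion_z_test(success_before, n_before, success_after, n_after):
    """Compare a yes/no metric (e.g. D7 retained) between the cohort before
    a change and the cohort after it. Returns (z, two-sided p-value)."""
    p_before = success_before / n_before
    p_after = success_after / n_after
    pooled = (success_before + success_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    z = (p_after - p_before) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical daily D1 retention readings, smoothed over 3 days.
    daily = [0.31, 0.42, 0.35, 0.29, 0.40, 0.38]
    print([round(x, 3) for x in rolling_mean(daily, window=3)])

    # Hypothetical cohorts: 80 of 1,000 retained at D7 before the change,
    # 120 of 1,000 after it.
    z, p = two_proportion_z_test(80, 1000, 120, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```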
An important part of mobile test markets is optimizing the assets you’re using in the
app store – we regularly see substantial gains testing icons, video, screenshots, and
copy, though we’ve never seen a significant difference from game name testing. But
context is important – results are often inconsistent between app stores and geos. For
that reason we don’t start our ASO until we’ve expanded beyond T3 markets.
Google’s tools for this are great, but for iOS you need to use a paid service like Store
Maven.
Test market marketing isn’t just about driving installs, it’s about testing the marketing
itself, optimizing creatives & targeting, generally figuring out: will this work? Will I be
able to drive audience into my game? Can I do it profitably? Test a lot of creatives
across a lot of networks. Keep refreshing. You never know what’s going to work
because again context matters a lot.
There’s nothing worse than having a big feature and then have the servers crumble
under the load. It feels like flushing success down the toilet, and you are. Now “how to
load test a game” is a subject that deserves its own GDC talk and I’m not the person
to give it. But if you have a server based game and any hope of success you need to
do this.
Now I want to go through a case study of what this looks like with a game that
was neither a triumph nor a failure.
Raid Brigade is a party-based Action RPG with base-building elements
and an unusual one-handed control scheme. It’s the first game made by San Diego
studio Ultrabit, made up of mostly Zynga veterans. After about 12 months of
development we released it to mobile test markets last June, skipping our normal PC
stage because of questions on how the controls might work. We were all excited for
the game, but the initial results were way below our goals and expectations: the first
few weeks saw dreadful performance with only 40% of people getting through the
tutorial and D7 retention 75% below the goal numbers.
The good news is that there were lots of obvious things to fix. There were long
loading times for assets being streamed in, essential to keep the initial download size
of a game with 3D art under 100MB. Improving those and the tutorial got tutorial
completion up to 70% and doubled D7 retention, though still well below our goal of
18%.
To help people get into the various branching systems we then switched from a linear
tutorial to one based on a series of quests, which again helped increase D7 retention
significantly, this time up to about 12%. Again it was great to see that much
improvement, now 3x what we started at…but still below goals. And gains in retention
after that became harder to get. Though we kept working on that, along with many
other things.
After 3 months in test markets we had to face the dilemma: the game was meeting
some of our goals (crash rate, tutorial, D1 retention, conversion) but not all of them
(D7, D30 retention, Repeat Buyer %) and the developer’s runway was starting to be a
concern. We could go ahead and launch in October as planned, or keep working on
it, cut into the developer’s runway, and go up against the glut of games coming out in
the holiday season.
When your metrics are good the decision path on when to launch is easy. When you
have plenty of time and money, you can usually keep working on the game, though
even there it’s important to be realistic about whether the game is fixable. If you’re
pretty sure you understand what’s wrong and have a good idea to fix it that’s great,
but there may be diminishing returns or unfixable flaws. When you get to the point
where you either can’t keep working on the game, or it doesn’t seem worthwhile then
the question becomes: do you launch?
At that point it’s time to think of the money you’ve spent on development as a sunk
cost, and think just about the effort needed to launch and support the game after
launch. For a single-player game without servers, this is fairly simple, and in most
cases you should go ahead and launch and see what happens. But with multiplayer
games this becomes more complicated as there are ongoing server costs and the
necessity of releasing additional content to drive revenue and engagement, as well as
critical player mass and opportunity cost issues. Supercell and some other companies
would only support a game if it’s a huge success, but exactly what that level should
be is going to have different answers by studio. But personally I don’t think you should
launch a game unless you can support it for a fairly extended time frame, a year plus
at least. Players are investing in your game, and that should be respected.
In 4 years of publishing we’ve never cancelled a single-player game, though we did
stop one from bothering to do an Android version. But this year alone we canceled
three multiplayer games, one after a Kong+ beta and two during mobile test markets.
Making a game these days can feel like walking into a dark forest. But remember: you
have many tools at your disposal. Be prepared, like the Boy Scouts say, have a plan,
be realistic, and hopefully you can make it through the forest and find the treasure
you were looking for.
Effective Testing of Free-to-Play Games

Weitere ähnliche Inhalte

Was ist angesagt?

F2P Design Crash Course (Casual Connect Kyiv 2013)
F2P Design Crash Course (Casual Connect Kyiv 2013)F2P Design Crash Course (Casual Connect Kyiv 2013)
F2P Design Crash Course (Casual Connect Kyiv 2013)Kongregate
 
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P Games
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P GamesGDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P Games
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P GamesTamara (Tammy) Levy
 
GDC Talk: Lifetime Value: The long tail of Mid-Core games
GDC Talk: Lifetime Value: The long tail of Mid-Core gamesGDC Talk: Lifetime Value: The long tail of Mid-Core games
GDC Talk: Lifetime Value: The long tail of Mid-Core gamesTamara (Tammy) Levy
 
Kongregate Web Games Partnership Opportunities
Kongregate Web Games Partnership OpportunitiesKongregate Web Games Partnership Opportunities
Kongregate Web Games Partnership OpportunitiesDavidKongregate
 
Josh Larson’s Talk at White Nights Prague '18
Josh Larson’s Talk at White Nights Prague '18Josh Larson’s Talk at White Nights Prague '18
Josh Larson’s Talk at White Nights Prague '18Kongregate
 
DavidPChiu Kongregate - Maximizing Player Retention and Monetization in Free-...
DavidPChiu Kongregate - Maximizing Player Retention and Monetization in Free-...DavidPChiu Kongregate - Maximizing Player Retention and Monetization in Free-...
DavidPChiu Kongregate - Maximizing Player Retention and Monetization in Free-...David Piao Chiu
 
Metrics for a Brave New Whirled
Metrics for a Brave New WhirledMetrics for a Brave New Whirled
Metrics for a Brave New Whirledcapncleaver
 
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...David Piao Chiu
 
A mysterious adventure_in_social_games_final
A mysterious adventure_in_social_games_finalA mysterious adventure_in_social_games_final
A mysterious adventure_in_social_games_finalcapncleaver
 
NetEase - Story of a Transformation
NetEase - Story of a TransformationNetEase - Story of a Transformation
NetEase - Story of a TransformationDavid Ting
 
The Keys to Making Successful Free-to-Play Games on Steam - A Design and Prod...
The Keys to Making Successful Free-to-Play Games on Steam - A Design and Prod...The Keys to Making Successful Free-to-Play Games on Steam - A Design and Prod...
The Keys to Making Successful Free-to-Play Games on Steam - A Design and Prod...Adrian Crook and Associates
 
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...
Kongregate - Maximizing Player Retention and Monetization in Free-to-Play Gam...David Piao Chiu
 
Harshit sharma,Business plan
Harshit sharma,Business planHarshit sharma,Business plan
Harshit sharma,Business planHarshit Sharma
 
Game monetization: Overview of monetization methods for free-to-play games
Game monetization: Overview of monetization methods for free-to-play gamesGame monetization: Overview of monetization methods for free-to-play games
Game monetization: Overview of monetization methods for free-to-play gamesAndrew Dotsenko
 

Was ist angesagt? (16)

F2P Design Crash Course (Casual Connect Kyiv 2013)
F2P Design Crash Course (Casual Connect Kyiv 2013)F2P Design Crash Course (Casual Connect Kyiv 2013)
F2P Design Crash Course (Casual Connect Kyiv 2013)
 
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P Games
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P GamesGDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P Games
GDC Talk - Nature vs Nurture: Unpacking Player Spending in F2P Games
 
GDC Talk: Lifetime Value: The long tail of Mid-Core games
GDC Talk: Lifetime Value: The long tail of Mid-Core gamesGDC Talk: Lifetime Value: The long tail of Mid-Core games
GDC Talk: Lifetime Value: The long tail of Mid-Core games
 
Kongregate Web Games Partnership Opportunities
Kongregate Web Games Partnership OpportunitiesKongregate Web Games Partnership Opportunities
Kongregate Web Games Partnership Opportunities
 
Josh Larson’s Talk at White Nights Prague '18
Effective Testing of Free-to-Play Games

  • 1.
  • 2. A few years ago, when social games were exploding on Facebook, the conventional wisdom was that you wanted to release your minimum viable product as quickly as you could and iterate on it in the wild with real data from players. But that only made sense in a world where, if you “wasted” your early traffic on a poor game, it was relatively easy to get more. Mobile is the opposite: traffic is always at a premium and a strong global launch has become crucial to success. It’s the best chance to get substantial features from the platforms, to get noticed in the new charts, and potentially get picked up by recommendation algorithms. Ideally we’d all be like Blizzard & Supercell and be able to polish and test games internally for years, but there are all sorts of pressures and costs pushing games out the door. For companies with <$1B in annual profit it’s important to be realistic about what, why, and how to test to maximize our chance at success.
  • 3. We started Kongregate back in 2006 as an open platform for browser games, a little like YouTube for games: anyone can upload, we then add chat, comments, forums, achievements, and a lot of other social features that make the site a whole game itself. In 10 years more than 100,000 games have been put on our site, covering pretty much every genre imaginable. A fairly wide array of games are popular, from casual puzzles and launch games to MMOs and collectible card games. Overall our audience trends male, with heavy overlap with console and Steam.
  • 4. Four years ago we started publishing third-party games on mobile, and have launched more than 20 games in the last 3 years. As on Kongregate itself, we publish a fairly broad range of games, from more niche, high-monetizing RPGs and CCGs to single-player games with mostly ad monetization. To give you a feel for the range, here are the games we published in 2015.
  • 5. And like everything in life they have different strengths and weaknesses. To use them properly it’s important to understand what those are, so I’m going to spend a bit of time going through the different types. One note: I’m not going to be talking much about defect and bug-oriented QA testing, which is not to say it’s not important. It’s very important, and you should do it, ideally with dedicated in-house resources supplemented by 3rd party resources. But I only have an hour so I have to skimp on some topics.
  • 6. By team playtests I mean both the testing you do naturally as you add features to the game and scheduled team sessions to look at the game more broadly. The game is available and everyone’s getting paid to do this. But you don’t play like a player: you know how things are supposed to work. And you know games too well, so every convention is obvious to you. And since you’re testing to make sure features are working, you play the game in totally unnatural ways: “I’m jumping right to x,” “I’m pushing all the buttons,” “I’m going to play with an OP account to breeze through.”
  • 7. By in-person playtests I mean getting a 3rd party unrelated to game development to play the game. It could be as informal as handing somebody a device out in the wild, or it could be an organized, in-office session where you hire people to come in and play the game. •  They’re a pain in the ass to arrange, whether you’re bringing strangers from Craigslist into your office or harassing them at a coffee shop. And they take a lot of skill to run and analyze well: not prompting the person, not jumping to solutions, not conflating problems, and hearing beyond what they say. •  They are psychologically difficult. Exposing your work is hard; it’s pretty equivalent to the feeling most of us have about getting up to sing in a public karaoke bar, only without it being appropriate to get a little drunk first. Combined with being a pain to arrange, they get pushed off indefinitely: “it’s not ready,” “they’ll only tell us what we already know.” •  Depending on how you’re recruiting
  • 8. Companies like Usertesting.com (which we use) and others •  Generally there is a directed task testers are supposed to complete, so it is a less natural experience than just picking up a game and exploring, with a higher willingness to “figure it out” •  Limited view of body language; narration expresses conscious thoughts, not unconscious ones; no chance to follow up •  Still limited to the first-session experience, now with a time limit •  Still a small sample, luck of the draw on testers, and some selection bias in terms of who takes these kinds of gigs
  • 9. By this I don’t mean in person, but sending out a mobile build or a link to a web version to a group of people you know and seeing what happens •  More realistic experience: they’re testing the whole game across as many sessions as they want to play •  Depending on who it goes to, you can get good qualitative feedback •  Audience tends to be biased in your support, professional game developers, or both. A lot of people won’t want to hurt your feelings. You’re unlikely to hear “this sucks” even if it’s true. •  Low sample size & unrepresentative/biased audience = mostly garbage metrics
  • 10.
  • 11. The way to think about this: if we assume that tutorial completion is around 80% globally and I get 400 installs representative of the global audience, then 95% of the time their tutorial completion rate will be between 72% and 88%.
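As a rough check on how wide that band is, the sketch below computes a normal-approximation interval for an observed rate. It assumes each install is an independent draw with a fixed completion probability (a simple binomial model, which is an assumption made here for illustration and not something stated in the talk); `rate_interval` is a hypothetical helper name. Under that model the band for 400 installs comes out narrower than the 72% to 88% quoted above, which suggests the quoted range is conservative.

```python
import math
from statistics import NormalDist


def rate_interval(rate, installs, confidence=0.95):
    """Normal-approximation interval for a rate observed in one cohort.

    Assumes installs are independent and share the same underlying rate.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    stderr = math.sqrt(rate * (1 - rate) / installs)
    return rate - z * stderr, rate + z * stderr


# 80% tutorial completion measured on 400 representative installs.
low, high = rate_interval(0.80, 400)
print(f"{low:.1%} to {high:.1%}")
```

Real cohorts are rarely perfectly representative, so a wider working range than the formula gives is a reasonable habit.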
  • 12. Now just because you have a smaller sample than this doesn’t mean metrics are useless. In a normal distribution you are more likely to be closer to the mean than farther away. My unscientific rule of thumb is that you start getting directionally useful, if not accurate, metrics when an event has occurred 75 or more times. So for directionally useful tutorial results you need about 100 people installing a game, and for a directionally useful buyer conversion rate more like 5,000.
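The arithmetic behind that rule of thumb can be sketched as below. The 75-event threshold and the roughly 80% tutorial completion come from the talk; the 1.5% buyer conversion is an assumption chosen here because it reproduces the "more like 5,000" figure, and `installs_needed` is a hypothetical helper name.

```python
import math
from fractions import Fraction

MIN_EVENTS = 75  # rule-of-thumb threshold from the talk


def installs_needed(event_rate, min_events=MIN_EVENTS):
    """Installs needed before an event is expected to occur min_events times."""
    # Fraction(str(...)) keeps rates like 0.015 exact, so the ceiling
    # is not pushed up by float rounding.
    rate = Fraction(str(event_rate))
    return math.ceil(min_events / rate)


tutorial_installs = installs_needed(0.80)   # tutorial completion, ~80%
buyer_installs = installs_needed(0.015)     # buyer conversion, assumed ~1.5%
print(tutorial_installs, buyer_installs)
```

The rarer the event, the more installs it takes to reach the same event count, which is why monetization metrics stabilize so much later than tutorial metrics.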
  • 13. By this I mean inviting a broader group of players to play a game not yet released, either on web or Android, usually volunteers from a fanbase •  All the benefits of friends & family tests with larger sample sizes! •  REALLY engaged audience excited to give feedback, not constrained by politeness. They’ll tell you your game sucks. •  Chance to build a community for your game pre-release
  • 14. On Kongregate.com beta access to games is a benefit of Kong+, our ad-free subscription that players pay $29.99/year for. We also gift it to our volunteer moderators and big spenders. There are about 30,000 MAU, and the average beta game gets ~3k beta users. We consistently have 5-10 games in closed beta; we use the program for our publishing portfolio, but it is also open to other developers.
  • 15. Metrics are the product of the audience and the underlying game. You can get average metrics either by putting poorly qualified traffic into an amazing game or by putting amazing traffic into a mediocre game. Now this is somewhat obvious, but if you’ve been working on a game for a year it’s easy to forget about audience and think metrics are entirely about the game. It is especially easy to underestimate how BIG the audience swing can be. This was the most extreme split we’ve ever seen, with a 9x difference in % buyers; more commonly % buyers is 3-4x higher in beta than in global release. But even though it’s inflated it’s still useful: we could tell this was going to be a high-ARPU game with mediocre initial retention but good long-term retention. (Note: web D1 tends to be much lower than mobile, but they are comparable by D30.)
  • 16. This is the now-classic method for mobile: releasing fully but in just a few countries, often Canada and/or Australia. The real thing is hard, and a lot of the weaknesses are just aspects of the mobile game market •  It’s hard work releasing anything on mobile: builds, screenshots, but particularly getting games working on such a broad range of devices •  Long Apple approval times make iteration slow even when you can quickly fix a problem •  Traffic doesn’t magically show up in games and buying it is expensive: the average is ~$3 per install in Canada & Australia, a bit less on Android
  • 18. Australia & Canada may be good proxies for the US, but that’s likely to be <1/3 of installs. They used to be closer, but the gap widened after the release of the iPhone 6, when a lot of high-end device users switched back to iOS. Especially for more niche, high-LTV genres like CCGs, RPGs & Strategy games, whether it’s CPIs going up or retention & LTV going down after your first “golden cohort”. In a small market a few big buyers can blow out a market. And test market spend tends to be less ROI-focused, so you see weird patterns.
  • 19. Spellstone, a polished collectible card game from Synapse Games, shows some of the dynamics. You can see that the performance of iOS is generally much stronger than Android, and that paid is generally quite a bit stronger than organic. But the really dramatic number is the huge drop in performance on organic traffic coming from the substantial features that the game got, with around a 70% drop in ARPU on both iOS and Android. This is most dramatic with more niche, high-LTV genres like CCGs, RPGs & Strategy games. For a more casual, broad-audience game like AdVenture Capitalist we don’t typically see a big delta.
  • 20. Your game is driven by outliers, and their presence or absence distorts almost anything you look at. Binary “yes/no” metrics like D1 retention or tutorial completion are more reliable than averages involving engagement or revenue. And the deeper your game and the less spending is capped, the more unpredictable those averages get. So as much as possible look at binary metrics that are proxies for the averages, or that answer the same questions: % repeat buyers, for example, rather than ARPPU.
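To make the outlier point concrete, here is a small sketch with an entirely hypothetical cohort of 100 buyers, one of whom is a very large spender. It compares how much ARPPU and % repeat buyers move when that single player is dropped; the helper names and all of the numbers are illustrative assumptions, not data from the talk.

```python
def arppu(spend_per_buyer):
    """Average revenue per paying user."""
    return sum(spend_per_buyer) / len(spend_per_buyer)


def repeat_buyer_share(purchases_per_buyer):
    """Share of buyers who purchased at least twice (a binary metric)."""
    repeat = sum(1 for n in purchases_per_buyer if n >= 2)
    return repeat / len(purchases_per_buyer)


# Hypothetical buyers as (purchase_count, total_spend_usd) pairs.
cohort = [(1, 5.0)] * 60 + [(3, 20.0)] * 39 + [(40, 2000.0)]
without_whale = cohort[:-1]  # same cohort minus the one big spender

for label, buyers in (("with whale", cohort), ("without whale", without_whale)):
    counts = [n for n, _ in buyers]
    spend = [s for _, s in buyers]
    print(label, round(arppu(spend), 2), round(repeat_buyer_share(counts), 3))
```

In this made-up cohort one player nearly triples ARPPU, while the repeat-buyer share shifts by less than a percentage point.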
  • 21. Don’t get me wrong, I love A/B tests. They divorce correlation from causation, and manage the audience mix problems well, too. But don’t expect to be able to run a lot of A/B tests in test markets unless you’re willing to spend major $$s. The problem is sample size. Take the numbers I was giving earlier for cohort sizes, then double them for an A/B test. And with an A/B test you need to measure the results fairly precisely; directional numbers are not good enough, or you can make bad choices based on the results.
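One way to see why precise A/B results get expensive is the textbook sample-size approximation for comparing two proportions, sketched below. This is a standard statistical formula, not a method described in the talk; the 1.5% and 2.0% conversion rates, the 5% significance level and the 80% power are all assumptions chosen for illustration, and the ~$3 cost per install is the Canada/Australia figure mentioned earlier.

```python
import math
from statistics import NormalDist


def installs_per_arm(base_rate, variant_rate, alpha=0.05, power=0.80):
    """Approximate installs per arm to tell two rates apart (two-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + variant_rate * (1 - variant_rate)
    effect = base_rate - variant_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


CPI_USD = 3.0  # rough cost per install, assumed from the test-market figure

# Hypothetical test: does a change lift buyer conversion from 1.5% to 2.0%?
per_arm = installs_per_arm(0.015, 0.020)
total_cost = 2 * per_arm * CPI_USD
print(per_arm, total_cost)
```

Smaller expected lifts push the required sample up quickly, because the effect size is squared in the denominator.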
  • 22. More of a strategy than a testing method, but one I think is underused. I’m biased of course! I have a web portal. But this strategy has worked for a lot of big successes, from King with Candy Crush to Blizzard recently with Hearthstone. Steam, Kongregate, Facebook, Miniclip, Newgrounds, Addicting Games and hundreds more. But it’s important to find the right audience fit: Facebook is a much better choice for a very casual game than Steam or Kongregate. Better social feature support (forums, videos, streaming, etc.) to build community. Comparable LTVs (at least on Kongregate). Chrome no longer supports the Unity plug-in, and Firefox will likely kill it by the end of the year. Flash is still going for now, but will likely be phased out in a few years. But the WebGL export from Unity is improving rapidly, and there are a lot of other good cross-platform frameworks to work with, such as Haxe.
  • 23. Over the next 6 months they worked on polishing the UI, adding monetization, all while releasing it on dozens of additional platforms across the web, and were able to extend the content and improve the balance while building a bigger and bigger audience.
  • 24. Mobile test markets lasted about 2 months and focused on mobile device stability, the FTUE, and the new rewarded video integration, a huge addition to monetization.
  • 25.
  • 26.
  • 27.
  • 28. And since each type of test adds something, the ideal plan is to use all of them, in approximately this order, with company playtests throughout as a given.
  • 29. How much money and time you have are intimately related: time is generally the biggest cost because each month of a studio’s burn rate adds up. But there are situations where that’s not true: in a big company you may have intense pressure to launch by a certain date but a lavish budget for test market marketing. Or an indie doing work-for-hire or with a full-time job may have little time pressure but no money to buy traffic. The indie should focus on in-person playtests and closed betas, while with enough money the big studio can blast its way through geo-locked test markets. Simple puzzle games or endless runners that are easy to pick up and play are going to get more value out of the experiential testing that helps them nail the fun in the core experience, but isolated, single-session tests are much less useful to multiplayer games with deep metagames and economies, where long-term metric-based testing is crucial to getting things right. Games with lots of graphics, or that are otherwise technically demanding, are going to need extra time in mobile geo-locked test markets to deal with all the problems that crop up with low-memory and low-GPU devices on both Android & iOS. And finally: what are your goals? Do you expect to get significant features from the platforms? Are you looking for top 10 grossing? Top 1000? The bigger the launch you’re expecting, the higher the stakes and the more crucial it is to have the game in the best state possible at launch.
  • 30. Assuming both time and money are at least somewhat constrained. Say a mid-sized studio with most of their burn covered by existing game income, but not a big cushion. This still assumes 6 months from friends/family to global mobile release. If you cut it much shorter on a multiplayer game you will almost certainly regret it.
  • 31. Single-player game made by a small cash-strapped team, mobile-specific controls and ad-based monetization. In this case most of the real game testing and iteration should be on the back of rigorous, frequent, in-person testing. Then mobile test market can concentrate on just a few metrics.
  • 32. During pre-production & production the key question you’re asking is “Is this fun? Are we on the right track?” The sooner you figure out something isn’t working as you expected, the easier it is to fix. The more you are departing from convention and comparables the more you need to validate as you go along.
  • 33.
  • 34. As you approach release you’re asking “Is this ready?” It’s a great time to do remote playtests focused on the first-time user experience, and to make sure that analytics are hooked up and firing correctly. That last is not a given: analytics are very easy to screw up. This is also a great time to send your game to a 3rd party QA service to test on a broad range of devices if you’re going straight to mobile.
  • 35. It’s the next Clash of Clans! Or Crossy Road! Everything is broken! Who would even play this piece of shit? Total failure. Depending on the person, they may cherry-pick the good, or focus only on the bad.
  • 36. This is where the rubber really meets the road. To know if the game is working, you have to know what you’re looking for.
  • 37. We have a very successful game with 20% D1 retention. We have very successful games with $0.03 ARPDAUs.
  • 38. So here are some sample metrics ranges, from low to high, for the genres I’ve been using for example test plans, then for all genres. This is loosely based on the metrics we’ve seen from games in our portfolio and more generally on Kongregate.com. As you can see, the “all genres” low/medium/high can be pretty different from the ranges by genre. Good retention for a multiplayer RPG is drastically lower than for an idle game. Good monetization for a casual runner would be terrible for that same RPG. What’s missing from this is expectations around traffic: that casual runner has probably 10x the potential traffic of the multiplayer RPG.
  • 39. Here are a couple of outcome models based on the average metrics for each of these genres, then broken out by different levels of traffic. You’ll notice these profitability scenarios aren’t great: games with average metrics need exceptional traffic to become a sustainable business. In general, for a success you need at least one or two metrics to fall in one of the “high” scenarios. Set your goals to match realistic expectations of the genre and acceptable (not ideal) business outcomes. If you need to hit top 20 grossing to justify a huge budget or company expectations, your goals should be much higher than if you mostly want to learn.
  • 40. It’s not that you don’t look at other metrics, but you want to set the gates based on the most important one for that stage.
  • 41. One of the benefits of breaking your testing into stages is that on mobile it allows you to use a wider variety of test markets. Canada & Australia are not only expensive places to test stability, they’re a bad choice because they have a much lower % of the low-end devices most likely to trigger issues. That’s better tested in emerging markets. Overall testing in a range of countries from tier 1 to tier 3 will give you a much more representative view of your global performance than limiting to a few English-speaking markets. We’ve used more than 20 different countries in the last year, all shown in this map.
  • 42. Note that the sample sizes are cumulative. 12k isn’t enough for statistical significance on buyer %, but 25k is.
  • 43.
  • 44. Tracking crashes is not optional, but a surprising number of developers don’t do it. Crashes are annoying to players, affecting both retention and your ratings and reviews in the app stores. We recommend a 3rd party service like Fabric/Crashlytics or HockeyApp, though there’s quite a bit of info in the Google Play console as well.
  • 45. Stage 2 is where you’re really optimizing your game, and you should look at it in the same order that players move through the game, both because that’s the order in which sample sizes will get large enough and because problems in one area will likely flow into the next. You start with progression through the first-time user experience, checking drop-off pre-tutorial and then at each tutorial step. Then you watch PVE progression: how far are players getting through missions, and what are the win rates? After players have been in the game a little longer you want to look at PVP participation and win rates, and then finally at the economy, where you should look at the full flow of sinks and sources but keep a particularly close eye on resource balances and how they grow.
  • 46. Retention is the KPI reflection of progression. Without longer-term retention, which reflects commitment and engagement with the game, few people will pay. Conversion reflects both engagement and balance: do they care enough to buy something, and is there a good reason to? If retention is good but conversion is bad, then either what’s being sold is not compelling, the balance is not challenging enough for it to be useful, or the economy is imbalanced in a way where there’s no reason to purchase because you can get it for free. Note that there is some tension between retention and conversion, though, because tight economies may make players more likely to churn. New buyer packs can be great at boosting conversion while masking underlying problems. One of the most important stats to look at is repeat conversion: how many people buy a 2nd time? A 3rd? Repeat conversion shows both how players feel about their first purchase and whether there is depth of spend. If you have a high conversion rate but low repeat purchasing, your game will just pop and drop. Note that I haven’t mentioned ARPPU. That’s because while it’s an important metric, it’s not one you can look at with any reliability in a test market: revenue per payer is heavily skewed, roughly an exponential distribution, and very erratic in small samples. However, ARPPU is capped fairly low unless there are repeat purchases, so looking at the repeat purchase rate, a simple proportion that behaves much closer to a normal distribution, answers the same question with better stability.
  • 47. It’s human nature to project causation, so as you make changes to your game you’re likely to look at daily numbers and think any movement is the result of what you’ve done. Resist. Even with a reasonably big test market you get tons and tons of random variation in the daily numbers. These numbers are from Spellstone’s test markets, which had between 1,000 and 1,500 DAU through most of this period, and the daily numbers are still all over the place. We track rolling averages, which helps, but if you want to look at the impact of a change, roll up cohorts from before and after the change to get a statistically significant sample. And still take the result with a grain of salt because of audience mix and confounding effects from other changes you’re making.
  • 48. An important part of mobile test markets is optimizing the assets you’re using in the app store – we regularly see substantial gains testing icons, video, screenshots, and copy, though we’ve never seen a significant difference from game name testing. But context is important – results are often inconsistent between app stores and geos. For that reason we don’t start our ASO until we’ve expanded beyond T3 markets. Google’s tools for this are great, but for iOS you need to use a paid service like Store Maven.
  • 49. Test market marketing isn’t just about driving installs, it’s about testing the marketing itself, optimizing creatives & targeting, generally figuring out: will this work? Will I be able to drive audience into my game? Can I do it profitably? Test a lot of creatives across a lot of networks. Keep refreshing. You never know what’s going to work because again context matters a lot.
  • 50. There’s nothing worse than getting a big feature and then having the servers crumble under the load. It feels like flushing success down the toilet, and you are. Now “how to load test a game” is a subject that deserves its own GDC talk and I’m not the person to give it. But if you have a server-based game and any hope of success you need to do this.
  • 51. Now I want to go through a case study of what this looks like with a game that was neither a triumph nor a failure. Raid Brigade is a party-based action RPG with base-building elements and an unusual one-handed control scheme. It’s the first game made by San Diego studio Ultrabit, which is made up mostly of Zynga veterans. After about 12 months of development we released it to mobile test markets last June, skipping our normal PC stage because of questions about how the controls might work. We were all excited for the game, but the initial results were way below our goals and expectations: the first few weeks saw dreadful performance, with only 40% of people getting through the tutorial and D7 retention 75% below the goal numbers.
  • 52. The good news is that there were lots of obvious things to fix. There were long loading times for assets being streamed in, and streaming was essential to keep the initial download size of a game with 3D art under 100MB. Improving those load times and the tutorial got tutorial completion up to 70% and doubled D7 retention, though that was still well below our goal of 18%.
  • 53. To help people get into the various branching systems we then switched from a linear tutorial to one based on a series of quests, which again helped increase D7 retention significantly, this time up to about 12%. Again it was great to see that much improvement, now 3x what we started at…but still below goals. Gains in retention after that became harder to get, though we kept working on it along with many other things. After 3 months in test markets we had to face the dilemma: the game was meeting some of our goals (crash rate, tutorial, D1 retention, conversion) but not all of them (D7 and D30 retention, repeat buyer %), and the developer’s runway was starting to be a concern. We could go ahead and launch in October as planned, or keep working on it, cut into the developer’s runway, and go up against the glut of games coming out in the holiday season.
  • 54. When your metrics are good the decision path on when to launch is easy. When you have plenty of time and money, you can usually keep working on the game, though even there it’s important to be realistic about whether the game is fixable. If you’re pretty sure you understand what’s wrong and have a good idea of how to fix it, that’s great, but there may be diminishing returns or unfixable flaws. When you get to the point where you either can’t keep working on the game or it doesn’t seem worthwhile, the question becomes: do you launch? At that point it’s time to think of the money you’ve spent on development as a sunk cost, and think just about the effort needed to launch and support the game after launch. For a single-player game without servers this is fairly simple, and in most cases you should go ahead and launch and see what happens. But with multiplayer games it becomes more complicated, as there are ongoing server costs and the necessity of releasing additional content to drive revenue and engagement, as well as critical player mass and opportunity cost issues. Supercell and some other companies will only support a game if it’s a huge success, but exactly what that level should be is going to differ by studio. Personally I don’t think you should launch a game unless you can support it for a fairly extended time frame, a year plus at least. Players are investing in your game, and that should be respected. In 4 years of publishing we’ve never canceled a single-player game, though we did stop one from bothering to do an Android version. But this year alone we canceled three multiplayer games, one after a Kong+ beta and two during mobile test markets.
  • 55. Making a game these days can feel like walking into a dark forest. But remember: you have many tools at your disposal. Be prepared, as the Boy Scouts say, have a plan, be realistic, and hopefully you can make it through the forest and find the treasure you were looking for.
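
The sample-size point from slide 42 and the before/after cohort comparison from slide 47 can be sketched in a few lines of Python. This is only an illustration, not the tooling used for the games in the talk: the function names, the 2% buyer rate, and the cohort counts are all made-up assumptions, and the normal-approximation interval and pooled two-proportion z-test are just one common way to check whether a difference in a rate is larger than random noise.

```python
import math


def proportion_margin(rate, sample_size, z=1.96):
    """Half-width of a normal-approximation confidence interval for a rate.

    z=1.96 corresponds to a roughly 95% interval.
    """
    return z * math.sqrt(rate * (1.0 - rate) / sample_size)


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test.

    Returns (z, two_sided_p_value) for cohort B's rate versus cohort A's.
    """
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical buyer rate of 2%: the uncertainty shrinks as installs grow.
    for installs in (12_000, 25_000):
        margin = proportion_margin(0.02, installs)
        print(f"{installs} installs: buyer % = 2.00% +/- {margin * 100:.2f}%")

    # Hypothetical D7 retention cohorts rolled up before and after a change.
    z, p = two_proportion_z(560, 7_000, 700, 7_000)
    print(f"rolled-up cohorts: z = {z:.2f}, p = {p:.5f}")

    # The same rates on a single small day's installs are not distinguishable.
    z, p = two_proportion_z(8, 100, 10, 100)
    print(f"single small day: z = {z:.2f}, p = {p:.2f}")
```

Even when the rolled-up comparison comes out significant, the caveat from slide 47 still applies: audience mix and other simultaneous changes can produce the same shift.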