Datacratic is the leader in real-time machine learning and decisioning and the creator of the RTBkit open-source project. Mark Weiss, head of client solutions at Datacratic, shares some of the challenges companies and developers face today as they move into real-time bidding (RTB). In this presentation he takes a developer deep dive into design and implementation choices, technologies and plugins, and presents some real-world RTB customer use cases. You will also learn how you can join the RTBkit community and get support for your upcoming RTBkit initiatives.
2. Overview
● The Project
● RTB Competitive
Landscape
● The Problems With RTB
○ System
○ Selection
○ Value
● How RTBkit Addresses the
Problems with RTB
● Demo
4. A Little History
● Created by machine-learning and digital
marketing company Datacratic
● Code base evolved from running RTB in
production from 2011-2013
● Open sourced in Feb. 2013, with ongoing
support from Datacratic
● Apache-style governance started Jan. 2014
5. Participation and Governance
● Apache-style governance
○ BDNFL - Benevolent
Dictator Not for Life
○ Councillors
○ Committers
● Outside contributions welcome
● Github pull request workflow --
committers review and merge
● Contributor guidelines
● Users can become
Contributors
● Contributors can become
Committers -- currently two
outside Committers
6. Support, Community, Adoption
● Free support from the
community and Datacratic
● Community support from 100s
of users in 25+ countries
● Datacratic provides
engineering support for
development, code review,
governance and evolution
● Participation and contributions
from Rubicon Project
● 230 active developers
● 35 committers, 11 outside of
Datacratic
● 10 installations in prod: N.
America, Germany, France,
Russia, Argentina, China
7. Development Support
● Getting Started Guide
● Working test system:
○ mock Exchange configurable to
run any bid requests
○ mock Ad Server
○ fixed-price Bidding Agent
● Example Code
● Documentation
● Packaging script and weekly tagged
packages for download
● Ubuntu AMI (ami-31acd858)
● Google Group support
● Pull request review and support
8. User Profiles - Reason for Adopting
Data from ongoing survey, 50 responses
9. User Profiles - Expected Spend
Data from ongoing survey, 50 responses
10. User Profiles - Type of Inventory
Data from ongoing survey, 50 responses
11. User Profiles - Geographic Targets
Data from ongoing survey, 50 responses
13. The Problems With RTB
SYSTEM, SELECTION, VALUE
General/Technical (provided by RTBkit) to Specific/Business (customized by user)
14. Solves the RTB System Problem
SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION
VALUE
General/Technical (provided by RTBkit) to Specific/Business (customized by user)
15. Addresses the RTB Selection Problem
SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION: Show user an ad? What ad?
VALUE
General/Technical (provided by RTBkit) to Specific/Business (customized by user)
16. Addresses the RTB Value Problem
SYSTEM: Scale, Speed, Distribution, Reliability
SELECTION: Show user an ad? What ad?
VALUE: What is it worth? What should I pay?
General/Technical (provided by RTBkit) to Specific/Business (customized by user)
17. RTB Competitive Landscape
Exchange / DSP UI (degree of difficulty: Low)
● Pros: Easy to get started
● Cons: Manual, hard to scale; lack of control over bidding strategy and data
Intermediate Hosted Bidding (degree of difficulty: Medium)
● Pros: More control over bidding strategy and use of data; don't have to do Ops
● Cons: Strategy and use of data mediated by vendor and product features
Roll Your Own Bidder (degree of difficulty: Hardest)
● Pros: Full control of all aspects of the system
● Cons: Solely responsible for everything
RTBkit (degree of difficulty: Hard)
● Pros: Benefit from core problems being solved; benefit from community; flexible customization
● Cons: Full control (optionally) but requires digging in; responsible for ops
19. Architectural Overview
RTBkit Core
● Router
● Banker
● Post Auction Service
● Service Monitor
● Agent Configuration Service
Plugins
● Exchange Connectors
● AdServer Connector
● Bidding Agent
● Augmenter
● Logger
20. Bidder Core Responsibilities
● Core working bidder system
● High-performance real-time components
● Multiple data center support
● Reliable global banker updated once per
second with guarantees against overspend
● Strongly typed currency support
● Guaranteed response time to exchanges
● Automatic load shedding
● Flexible high-performance filtering of bid
requests
● High-performance parsing, routing, filtering,
logging and monitoring
22. Router Responsibilities
● Gets bid requests from Exchange
Connector
● Uses Filters to filter eligible campaigns
● Passes bid requests through Augmenter
● Passes bid requests to Bidding Agents to
generate bid responses
● Communicates with Banker to guarantee no
overspend
● Guarantees timely response
● Only runs if system components are
available
23. Router Components
[Diagram: RTBkit Router. Components shown: Exchanges, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]
24. Router Data Flows
● Controls the amount of
data flowing through
● Dynamically directs the
Exchange Connector to
shed load to guarantee
timely response
26. RTBkit Data Flows
● Five asynchronous data flows pass through the Router:
○ Bid request
processing
○ Banking updates
○ Event Matching
○ Notifying Bidding
Agents of Events
○ Filtering and
Bidding Agent
configuration
27. Ad Server and Conversion Integration
● Standard HTTP JSON connector for
receiving Wins, Clicks and Conversions
● Event matching of Wins to bid response
● Event matching of Clicks and Conversions
to Wins
● Logging of all campaign events and
matched campaign events
28. Post Auction Service
● Clearinghouse for matching all bids to Wins,
Clicks and Conversions
● Router sends Bid Request messages
● Ad Server Connector sends Wins, Clicks
and Conversions
● Matched Clicks and Conversions similarly
generate Matched messages
● 15-minute window or bid is Inferred Loss
● Match events sent to Logger and to Bidding
Agents
● Shadow account spend bookkeeping
● Current bottleneck, can process events in
the hundreds / sec
● Recently improved: sharded hash tables,
one thread per core
● More improvements on the near roadmap
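The matching window described above can be pictured with a small sketch. It assumes bids are keyed by an auction ID and timestamps are plain seconds; the class and method names (`BidMatcher`, `onBid`, `onWin`, `expire`) are hypothetical and are not the Post Auction Service API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: pending bids wait for a win; bids that see no
// win inside the window are reported as inferred losses.
class BidMatcher {
public:
    explicit BidMatcher(int64_t windowSeconds = 15 * 60)
        : window_(windowSeconds) {
        assert(windowSeconds > 0);
    }

    // Remember a submitted bid and the time (in seconds) it was made.
    void onBid(const std::string& auctionId, int64_t now) {
        pending_[auctionId] = now;
    }

    // A win matches only if its bid is still pending.
    bool onWin(const std::string& auctionId) {
        return pending_.erase(auctionId) > 0;
    }

    // Remove and return every bid that has waited the full window.
    std::vector<std::string> expire(int64_t now) {
        std::vector<std::string> inferredLosses;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (now - it->second >= window_) {
                inferredLosses.push_back(it->first);
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        return inferredLosses;
    }

private:
    int64_t window_;
    std::unordered_map<std::string, int64_t> pending_;
};
```

A production matcher would likely keep bids ordered by expiry rather than scanning the whole table; the linear scan here only keeps the sketch short.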
[Diagram: Post Auction Service. Inputs: Bids from the Router, Events from the Ad Server Connector. Outputs: Matched Events to the Bidding Agents and to the Logger; Wins and Inferred Losses to the Shadow Account]
29. Banker Responsibilities
● Single source of truth for budget available
for each Campaign and for each Account
● Authorizes spending of Campaign budget
by Bidding Agents for a Campaign
● Enforces that each Budget has one Account
owner
● Caps per-Campaign and per-Account
spending
● Guarantees won't overspend if wins are
cheaper than bids
● Insulates banker state from "shadow
account" bookkeeping in Router and Post
Auction Service
30. Banker Design
● Totals always go up -- you can always
reason about the relative timing of entries
● Double-entry bookkeeping
● Multiple increasing Currency Pools
● Atomic, idempotent persistence
● Designed for high-latency, low-bandwidth
unreliable connections
● Updates global state once per second
31. Banker Account Types
Budget Account
● All budget for Account tree set in Master
Banker Budget Account
● Cannot bid from this account
● Cannot track spend directly
● Can transfer budget into child Spend
Accounts
● Can have child Spend Accounts
● Only exists in the Master Banker
Spend Account
● Must have a parent Budget Account
● Can bid from this account
● Can track spend directly
● Cannot have children
● Can be shadowed into a separate process
[Diagram: a Budget Account in the Master Banker with two child Spend Accounts]
32. Account Hierarchies and Spend Tracking
Banker - Account Hierarchies
● Spend accumulates from Children to Parent
Budget Account in Master Banker
● Temporary bookkeeping for bids happens in
shadow accounts in separate processes
● Shadows sync once per second
○ Router shadow - tracks budget
committed to pending bids
○ PAS shadow - tracks budget to debit on
Wins and to credit on Losses
● Natural partitioning will allow for sharding
Spend Concepts
● Budget: amount allowed to spend
● Spent: amount actually spent
● inFlight: amount in live bids
● Allocated: amount allocated to sub-accounts
● Adjustments: sum of adjustments
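As a rough illustration of how the spend concepts above interact, the sketch below tracks one shadow spend account in micro-dollars. The struct and its methods are hypothetical, and `balance` is a simplification standing in for whatever budget has been transferred into the account; only the pool names (`commitmentsMade`, `commitmentsRetired`, `spent`) come from the deck.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical shadow spend account; all amounts are micro-dollars.
struct ShadowAccount {
    int64_t balance;                 // budget moved into this account (simplified)
    int64_t commitmentsMade = 0;     // total ever committed to bids
    int64_t commitmentsRetired = 0;  // total ever released by wins or losses
    int64_t spent = 0;               // total actually paid

    explicit ShadowAccount(int64_t b) : balance(b) {}

    // Amount tied up in live bids.
    int64_t inFlight() const { return commitmentsMade - commitmentsRetired; }
    int64_t available() const { return balance - spent - inFlight(); }

    // Router side: commit budget before a bid goes out.
    bool authorizeBid(int64_t bidPrice) {
        assert(bidPrice >= 0);
        if (bidPrice > available()) return false;
        commitmentsMade += bidPrice;
        return true;
    }

    // Post-auction side: a win retires the commitment and records the cost.
    void onWin(int64_t bidPrice, int64_t winPrice) {
        assert(winPrice <= bidPrice);
        commitmentsRetired += bidPrice;
        spent += winPrice;
    }

    // A loss retires the commitment without spending anything.
    void onLoss(int64_t bidPrice) { commitmentsRetired += bidPrice; }
};
```

Because a bid is only authorized when its full price fits in `available()`, and a win never costs more than the bid, the account cannot overspend in this model.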
[Diagram: the Master Banker holds Budget account A with Spend accounts A:B and A:C; the Router Slave Banker and the PAS Slave Banker each hold Shadow Spend accounts A:B and A:C]
33. Banker Currency Pools
● Currency Pools store entries as 64-bit
integers
● Multiple Currency Pools per Account
● Each Account hierarchy mutated by a single
process
● Strongly typed Currency, won't allow cross-
currency conversions
● Automatic scaling conversions (e.g. CPM to
micro-dollars)
● Debit and Credit Pools
● Credit operations -> increase to a credit
Currency Pool in a hierarchy
● Debit operations -> increase to a debit
Currency Pool in a hierarchy
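To make the strong typing and scaling ideas above concrete, here is a minimal sketch. It assumes amounts are stored as 64-bit counts of millionths of a currency unit; `CurrencyCode`, `Amount` and `fromCpm` are hypothetical names, not RTBkit types.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

enum class CurrencyCode { USD, EUR };

// Hypothetical typed amount: a currency tag plus a 64-bit integer value.
struct Amount {
    CurrencyCode currency;
    int64_t micros;  // millionths of one currency unit

    // Adding amounts in different currencies is rejected, not converted.
    Amount& operator+=(const Amount& other) {
        if (other.currency != currency)
            throw std::invalid_argument("cross-currency arithmetic is not allowed");
        micros += other.micros;
        return *this;
    }
};

// Scale a CPM price (cost per thousand impressions, given in micro-units)
// down to the price of one impression. Integer division truncates.
inline Amount fromCpm(CurrencyCode currency, int64_t cpmMicros) {
    assert(cpmMicros >= 0);
    return {currency, cpmMicros / 1000};
}
```

For example, a CPM of 2.50 USD is 2,500,000 micro-dollars per thousand impressions, so one impression costs 2,500 micro-dollars.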
Account Currency Pools
Credit | Debit
budgetIncreases | budgetDecreases
allocatedIn | allocatedOut
recycledIn | recycledOut
commitmentsRetired | commitmentsMade
adjustmentsIn | adjustmentsOut
(none) | spent
34. Banker Account Tree Currency Pools
Name | Formula | Description
tree.budget | budgetIncreases - budgetDecreases | Tree max spendable amount
tree.inFlight | sum(commitmentsMade) - sum(commitmentsRetired) | Total outstanding bids
tree.spent | sum(spent) | Total spent
tree.adjustments | sum(adjustmentsIn) - sum(adjustmentsOut) | Total adjustments to spent
tree.effectiveBudget | sum(budgetIncreases) - sum(budgetDecreases) + sum(recycledIn) - sum(recycledOut) + sum(allocatedIn) - sum(allocatedOut) | Max spendable amount according to current internal state
tree.adjustedSpent | tree.spent - tree.adjustments | Total spent after adjustments
tree.available | tree.effectiveBudget - tree.adjustedSpent - tree.inFlight | Tree remaining spendable amount
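The formulas above translate directly into code. The sketch below applies them to a single account's pools; for a tree, each pool would first be summed across the accounts in the tree. The `Pools` struct and the free functions are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for one account's currency pools (micro-dollars).
struct Pools {
    int64_t budgetIncreases = 0, budgetDecreases = 0;
    int64_t allocatedIn = 0, allocatedOut = 0;
    int64_t recycledIn = 0, recycledOut = 0;
    int64_t commitmentsMade = 0, commitmentsRetired = 0;
    int64_t adjustmentsIn = 0, adjustmentsOut = 0;
    int64_t spent = 0;
};

inline int64_t inFlight(const Pools& p) {
    assert(p.commitmentsMade >= p.commitmentsRetired);
    return p.commitmentsMade - p.commitmentsRetired;
}

inline int64_t adjustments(const Pools& p) {
    return p.adjustmentsIn - p.adjustmentsOut;
}

inline int64_t effectiveBudget(const Pools& p) {
    return p.budgetIncreases - p.budgetDecreases
         + p.recycledIn - p.recycledOut
         + p.allocatedIn - p.allocatedOut;
}

inline int64_t adjustedSpent(const Pools& p) {
    return p.spent - adjustments(p);
}

// Remaining spendable amount: budget minus adjusted spend minus live bids.
inline int64_t available(const Pools& p) {
    return effectiveBudget(p) - adjustedSpent(p) - inFlight(p);
}
```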
35. Banker Parent-Child Currency Operations
Name | Action | Description
child.setBalance (lower) | increase child.recycledOut and parent.recycledIn | Set child account balance lower
child.setBalance (higher) | increase child.recycledIn and parent.recycledOut | Set child account balance higher
child.recuperateTo | increase child.recycledOut and parent.recycledIn until child.balance == 0 |
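A small sketch can show how these parent-child operations move budget while every pool only ever increases, as the Banker Design slide states. It assumes a simplified balance (credits minus debits over four pools); the `Account` struct and both functions are hypothetical, not the RTBkit implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified account: only the pools this sketch needs.
struct Account {
    int64_t budgetIncreases = 0;
    int64_t recycledIn = 0;
    int64_t recycledOut = 0;
    int64_t spent = 0;

    // Credits minus debits over the pools modelled here.
    int64_t balance() const {
        return budgetIncreases + recycledIn - recycledOut - spent;
    }
};

// Move budget between parent and child until child.balance() == target.
// No pool is ever decremented; transfers are recorded on both sides.
inline void setBalance(Account& parent, Account& child, int64_t target) {
    assert(target >= 0);
    const int64_t current = child.balance();
    if (target < current) {
        const int64_t delta = current - target;
        child.recycledOut += delta;
        parent.recycledIn += delta;
    } else {
        const int64_t delta = target - current;
        assert(delta <= parent.balance());
        child.recycledIn += delta;
        parent.recycledOut += delta;
    }
}

// Return all of the child's remaining balance to the parent.
inline void recuperateTo(Account& parent, Account& child) {
    setBalance(parent, child, 0);
}
```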
37. Banker APIs
● REST API suitable for human readers and
outside tool integration
● Also used by Router, Post Auction Service
and Bidding Agents
● API presents a simple wrapper over the
Account Type, Account Hierarchy and
Currency Pool concepts
○ All Accounts in tree or subtree
○ Accounts in (sub)tree by name
○ Shadow Accounts in (sub)tree by
name
○ Account children by name of parent
○ Account balance of (sub)tree by name
○ Account budget of tree by name
38. Banker Persistence
● Banker state stored in Redis
● Banker dumps its state each second
● Read-Modify-Write so only delta transmitted
● Can detect out of date or corrupt data
○ If a value goes down
○ If sum(credit) - sum(debit) !=
available
● On a banker crash and restart, it reads and
reconciles state from shadow accounts and
persistent store
● Maximum of one second of data lost
● If routers/post auction loops (with shadow
accounts) stay up, no data lost
{
  "md": {"objectType": "Account", "version": 1},
  "type": account type ("budget" or "spend"),
  "budgetIncreases": amount (in USD/1M),
  "budgetDecreases": amount (in USD/1M),
  "spent": amount (in USD/1M),
  "recycledIn": amount (in USD/1M),
  "recycledOut": amount (in USD/1M),
  "allocatedIn": amount (in USD/1M),
  "allocatedOut": amount (in USD/1M),
  "commitmentsMade": amount (in USD/1M),
  "commitmentsRetired": amount (in USD/1M),
  "adjustmentsIn": amount (in USD/1M),
  "adjustmentsOut": amount (in USD/1M),
  "lineItems": additional keyed amounts,
  "adjustmentLineItems": additional keyed amounts
}
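The "value goes down" check mentioned on this slide relies on every pool being monotonically increasing. A minimal sketch of that check follows, assuming pools are held as a name-to-amount map; `Pools` and `wentBackwards` are hypothetical names, and the second check on the slide (credits minus debits against the available amount) is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical representation: pool name -> amount in micro-dollars.
using Pools = std::map<std::string, int64_t>;

// Returns true if `incoming` cannot be a later version of `stored`:
// a pool is missing or its total has decreased.
inline bool wentBackwards(const Pools& stored, const Pools& incoming) {
    for (const auto& entry : stored) {
        assert(entry.second >= 0);
        const auto it = incoming.find(entry.first);
        if (it == incoming.end() || it->second < entry.second) {
            return true;
        }
    }
    return false;
}
```

A writer could run this check before each read-modify-write and refuse to persist a state that fails it.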
39. Logger
● Logging occurs in a separate process that
each component uses
● Automatically handles compression and log
rotation
● Pub/sub model using the RTBkit service
discovery mechanism (Zookeeper)
● Supports targeting multiple outputs (file
system, S3) and routing messages to one or
more outputs
● Supports combining multiple messages
● Supports callbacks
● Can be extended as needed
40. Monitoring and Operations Tools
● Extensive code instrumentation that
logs to Carbon
● Lock-free, high-performance Carbon
logging library, with tunable sampling
rate, one-second granularity and
various useful functions
○ labelled occurrence
○ counters
○ levels (min, max, mean)
○ values (min, max, mean)
● Can use library to add any custom
metrics you desire
● Operational dashboard
● All standard and custom metrics
charted in Graphite
● Launcher and real-time tmux shell
44. Filter Design and Features
● Router passes bid requests through
Filtering pipeline
● Bid requests must pass all filters to reach Agents
● Thread safe
● Useful primitives driven by configuration
and available as building blocks for custom
filters
● Predefined Agent and Creative filters
● Designed to guarantee performance first,
be flexible and powerful second
● Regex support
○ Example: Location filter supports
regexes at Agent and Creative level to
support dynamic filter by geo
45. Generic Filter Primitives
● Building blocks for included Predefined
Filters and for user Custom Filters
● Encapsulate generic comparison logic
● IncludeExcludeFilter
○ True If any included And none
excluded
● ListFilter
○ True If any match in List
● RegexFilter
○ True If any match regex
● IntervalFilter
○ True If any within interval
● DomainFilter
○ True If bid.domain in
DomainList
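The include/exclude rule above ("true if any included and none excluded") might be sketched as follows. This is not the RTBkit `IncludeExcludeFilter` class; the struct name is hypothetical, and treating an empty include list as "match anything" is an assumption, chosen because the sample agent configuration in this deck uses empty include lists.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical include/exclude primitive over values of type T.
template <typename T>
struct IncludeExclude {
    std::vector<T> include;
    std::vector<T> exclude;

    // True if no value is excluded and at least one value is included.
    // Assumption: an empty include list accepts any non-excluded value.
    bool matches(const std::vector<T>& values) const {
        for (const T& v : values) {
            if (std::find(exclude.begin(), exclude.end(), v) != exclude.end())
                return false;
        }
        if (include.empty()) return true;
        for (const T& v : values) {
            if (std::find(include.begin(), include.end(), v) != include.end())
                return true;
        }
        return false;
    }
};
```

The linear `std::find` keeps the sketch readable; it says nothing about how the real primitives achieve their performance.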
46. Filter Levels
Filter Levels
● Agent Filters
○ Control whether Agent bids on bid
request
● Creative Filters
○ Control whether Creative eligible to be
the one returned in bid response
Agent Filters: Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition
Creative Filters: Format, Location, Exchange, Language
47. Filter Types
● Static Predefined Filters
○ Creative filters match bid and Agent
creative sizes
○ Config filters match bid request
attributes to filter attributes
● Static Segment Filters
○ Filter based on attributes set by
Exchange Conn. bid request parse
● Static Custom Filters
○ Creative or config filters
○ Simple wrapper class API
● Dynamic Predefined Filters
○ Based on system state
○ notEnoughTime, tooManyInFlight
● Augmenter Filters
○ Custom logic and data
[Diagram: Bid Request flow through the filter types. Labels shown: Static (Predefined, Segment, Custom), Dynamic (Predefined), Augmenter]
48. Filter Priorities and Performance
● Prioritized execution order optimized for
performance, not business logic
● Selective, inexpensive filters run earlier
● Expensive filters run later (or not at all!)
● Only fast filters run on the Exchange
Connector thread, which must guarantee a
response within response SLA time
● Static filters build bitfield lookup table from
configs, batch process filters per bid
request in 64-bit blocks
● Filter matching tests for a match and retrieves
eligible creatives in one pass
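The bitfield idea above can be illustrated with a toy version limited to 64 agent configs, so that one machine word covers all of them. `ConfigMask` and `HourTable` are hypothetical names for this sketch and are much simpler than RTBkit's own `ConfigSet`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical bitmask: bit i is set when agent config i is still eligible.
struct ConfigMask {
    uint64_t bits = 0;

    void set(unsigned index) {
        assert(index < 64);
        bits |= (uint64_t{1} << index);
    }
    bool test(unsigned index) const {
        assert(index < 64);
        return ((bits >> index) & uint64_t{1}) != 0;
    }
};

// Lookup table built once from configs: for each hour of the week,
// which configs accept bid requests in that hour.
struct HourTable {
    std::array<ConfigMask, 24 * 7> byHour{};

    void allow(unsigned configIndex, unsigned hourOfWeek) {
        byHour.at(hourOfWeek).set(configIndex);
    }

    // One AND filters up to 64 configs for a bid request at once.
    ConfigMask narrow(ConfigMask eligible, unsigned hourOfWeek) const {
        eligible.bits &= byHour.at(hourOfWeek).bits;
        return eligible;
    }
};
```

Supporting more than 64 configs would mean an array of such words processed block by block, which is how the "64-bit blocks" bullet reads.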
[Diagram: Bid Requests passing through the Creative filters (Format, Location, Exchange, Language) and the Agent filters (Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition)]
49. Custom Filter Development
● (Creative)IterativeFilter<MyFilter>
○ Simple wrapper interface
○ Set priority, return bool per request
○ Less scale, no batch processing
● (Creative)FilterBaseT<MyFilter>
○ ConfigSet of filter configs
○ CreativeMatrix maps each creative to
its filters
○ FilterState stores state of processing
filters for current bid request
○ Filter batch process by intersecting
ConfigSet and CreativeMatrix
○ Filter code uses ConfigSet bit
operator-style interface and also
sometimes raw bit operators
struct HourOfWeekFilter : public FilterBaseT<HourOfWeekFilter> {
    // Start with an empty ConfigSet for every hour of the week.
    HourOfWeekFilter() { data.fill(ConfigSet()); }

    static constexpr const char* name = "HourOfWeek";
    unsigned priority() const { return Priority::HourOfWeek; }

    // Record, for each hour the agent config enables, whether the config
    // at configIndex is present (value == true) or removed.
    void setConfig(unsigned configIndex, const AgentConfig& config, bool value) {
        const auto& bitmap = config.hourOfWeekFilter.hourBitmap;
        for (size_t i = 0; i < bitmap.size(); ++i) {
            if (!bitmap[i]) continue;
            data[i].set(configIndex, value);
        }
    }

    // Keep only the configs that accept the request's hour of the week.
    void filter(FilterState& state) const {
        state.narrowConfigs(data[state.request.timestamp.hourOfWeek()]);
    }

private:
    // One ConfigSet per hour of the week.
    std::array<ConfigSet, 24 * 7> data;
};
50. Augmenters: Your Logic and Data
● Bid Requests pass through Augmenter after
Filtering, before Bidding
● Allows for custom filtering based on
combinations of bid request fields, your
data and business logic you code
● Filter based on user agent, device, geo,
user data, etc.
51. Augmenter Implementation
● Provides thread pool of background threads
to run augmenter calls
● Enforces 5ms timeout on router thread
● Sync and async versions. Use async with
callback for calls to outside DBs.
● RTBkit ships with Redis Augmenter. Other
stores such as Aerospike are in the wild.
● Separate config for each Bidding Agent
● Augmenter data is arbitrary JSON
● Can subscribe to other RTBkit data streams
to write data
○ e.g. - frequency cap Augmenter
subscribes to PAS MATCHEDWINs
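The frequency cap example above could look roughly like this in miniature. It assumes wins are counted per user in memory and that the augmenter returns a list of tags; the class and method names are hypothetical, the `too-many` tag is borrowed from the sample agent configuration later in this deck, and the daily reset is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical frequency cap augmenter: counts matched wins per user
// and tags bid requests for users who have reached the cap.
class FrequencyCap {
public:
    explicit FrequencyCap(int maxPerDay) : maxPerDay_(maxPerDay) {
        assert(maxPerDay > 0);
    }

    // Called once for each matched win event for this user.
    void onMatchedWin(const std::string& userId) { ++winsToday_[userId]; }

    // Tags to attach to a bid request for this user.
    std::vector<std::string> augment(const std::string& userId) const {
        const auto it = winsToday_.find(userId);
        const int wins = (it == winsToday_.end()) ? 0 : it->second;
        if (wins >= maxPerDay_) return {"too-many"};
        return {};
    }

private:
    int maxPerDay_;
    std::unordered_map<std::string, int> winsToday_;
};
```

An agent whose augmentation filter excludes `too-many` would then stop receiving bid requests for capped users.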
[Diagram: Augmenter data flow. Labels shown: Router, Bid Request, Augmenter (Thread Pool, Augmenter Impl.), Fast DB (TM), Callback, Augmented Bid Request, Post Auction Service, Data Sink]
53. Bidding Agent Configuration
● Bidding Agents configure the Core
● Agents register Agent Config with the Agent
Configuration Service
● Router, PAS and Augmenter periodically
pull updated Agent configs from the ACS
● Router registers
○ creatives per campaign
○ dynamic filters
○ augmenters and augmenter filters
● Router passes ad markup from config to
Exchange Connector for bid response
● Router forwards bid requests passing
filtering to eligible Agents
● PAS forwards Matched Events to Agents
● Augmenter adds augmented fields to Bid
Request based on Agent Configuration
54. Bidding Agent Configuration (cont'd)
● account -- which Accounts in an Account
tree the Agent bids for
○ Implement different bidding strategies
within an account or "account group"
by mapping Agents to named
accounts in an account tree
● maxInFlight -- outstanding bids
● bidProbability per Agent can be used for
pacing and bidding strategy
● creatives, languageFilter and
segmentFilter are supported Static Filters
● augmentations here configures an
Augmenter filter
● providerConfig -- ad markup, not shown
{
  "account": ["parent", "child"],
  "bidProbability": 0.1,
  "creatives": [{
    "id": 1, "width": 300, "height": 250,
    "providerConfig": {"supplySourceX": {
      "markup": "markup goes here",
      "attributes": ["alcohol"]}}
  }, ...],
  "languageFilter": {"include": ["en"], "exclude": []},
  "segmentFilter": {
    "sample1": {"include": [], "exclude": ["bad"]},
    "colors": {"include": ["blue", "red"]}},
  "augmentations": {
    "freq-rec": {
      "required": true,
      "config": {"maxPerDay": 10},
      "filter": {"include": [], "exclude": ["too-many"]}
    }
  },
  "maxInFlight": 10
}
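One way to picture `bidProbability` is as a random sampling gate on bid requests, so that 0.1 lets roughly one request in ten through. The sketch below assumes exactly that reading; `BidSampler` is a hypothetical name, and where RTBkit actually applies the probability is not shown in this deck.

```cpp
#include <cassert>
#include <random>

// Hypothetical sampling gate: passes each bid request with probability p.
class BidSampler {
public:
    BidSampler(double probability, unsigned seed)
        : rng_(seed), dist_(probability) {
        assert(probability >= 0.0 && probability <= 1.0);
    }

    // True when this bid request should be considered for bidding.
    bool shouldBid() { return dist_(rng_); }

private:
    std::mt19937 rng_;
    std::bernoulli_distribution dist_;
};
```

Lowering the probability slows spend without touching the bid price, which is why the slide pairs it with pacing.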
55. Custom Bidding Agent Implementation
● C++ or JavaScript
● Programmatic configuration
● Custom bidding logic based on
○ bid attributes
○ a custom Win Cost Model to adjust for
desired margin and data costs
● Currency helpers support bidding at
different price granularities
● Pacing support
○ Custom pacing logic
○ Guaranteed communication between
Bidding Agent and Banker
● Bid callback called on every bid request
● Router sends back bid status messages
● Post Auction Service sends back event
status messages
[Diagram: Custom Bidding Agent (Configuration, Bidding Logic, Win Cost Model, Pacing, Currency Helpers) and its message flows:
1. Bid Request, from the Router
2. Bid Response, to the Router
3. onWin, onLoss, onNoBudget, onTooLate, onDroppedBid, onInvalidBid, from the Router
4. onImpression, onClick, onVisit, from the Post Auction Service]
56. Augmenters: Your Logic and Data
● Allows you to augment the bid request,
adding any fields you want, based on
combinations of bid request fields, your
data and business logic you code
● Supports custom logic per agent
● Augmented fields are then available to the
Bidding Agents
● So, you can influence your bidding logic by
adding to the bid request
57. Future Directions
Near Term
● Scalability Improvements
○ PAS
○ Number of Agents
● Improved Packaging
● Decoupled Bidding Agent API
This Year
● Performance benchmarking tools
● Protocol versioning of messages
● Open plugin platform supporting
third-party marketplace
58. RTBkit Resources
Links
● http://github.com/rtbkit
● http://rtbkit.org
● https://groups.google.com/a/rtbkit.org/forum/#!forum/discuss
Developers Getting Started Guide
● https://github.com/rtbkit/rtbkit/wiki/Getting-Started
59. About Us
Mark Weiss
● Head of Customer Solutions at Datacratic
@marksweiss
Datacratic
● Machine-learning software provider
● Platform supports real-time decisioning
● Current products target digital marketing
○ Hosted RTB Optimization
○ Self-Serve and DMP Lookalike
Modeling
www.datacratic.com
@datacratic
We're hiring! http://datacratic.com/site/careers