1 © Jive confidential
TASMO
MATERIALIZED VIEWS
OF EVENT STREAMS
USING HBASE
Tasmo
Materialized Views of Event Streams using HBase
Presenters:
Pete Matern
Jonathan Colt
What’s the problem
• Joining to death at read time
• Operationally constrained by a single point of failure (a single DB instance)
• Can only scale up, not out
• Read load far exceeds write load
• Must read every field of an object whenever any field changes, to support indexing
• Must read every field of an object to update just one
What we needed
• Joins performed at write time (materialized views)
• Horizontally scalable
• No single point of failure
• Incremental updates
• Notification of changes
• Idempotency
• Tolerance of duplicate and out-of-order input
• Front-end developers work against their object model rather than HBase-specific constructs
What we built: Tasmo
A stateless, highly available (HA) service which:
• Maintains materialized views of data
• Consumes our model (declaration of input and output types)
• Notifies consumers when views change
• Replaces all our relational db usage
How we consume and render our model
• Every reader of our model defines views for Tasmo to maintain
• Views contain joined/filtered data specific to point of use
• Readers of these views render output or further process the data
[Diagram. Labels: events; Tasmo; HBase; Readers; view definition; views; read / write; read views]
How we declare our input and output (Model)
Event Declarations
Type: Content
● Subject: String
● Body: String
● Container: Reference
● Author: Reference
Type: User
● Username: String
● First Name: String
● Last Name: String
● Creation Date: Long
View Declaration
Type: Content
● Subject
● Container (Type: Folder)
○ Name
○ ModDate
● Author (Type: User)
○ Username
○ CreationDate
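The declarations above can be sketched in code. This is a minimal illustration only, not Tasmo's actual API: the dictionary layout, the function names (validate, apply_event, materialize), the Folder event type, and the sample data are all assumptions, and a plain dict stands in for HBase. The sketch builds the view on demand for brevity; Tasmo would perform the same join when events are written.

```python
# Hypothetical event declarations mirroring the slide. The Folder type is an
# assumption: the slide only shows it inside the view declaration.
EVENT_TYPES = {
    "Content": {"subject", "body", "container", "author"},
    "User": {"username", "first_name", "last_name", "creation_date"},
    "Folder": {"name", "mod_date"},
}

# Hypothetical view declaration: root type, fields to select, references to follow.
CONTENT_VIEW = {
    "type": "Content",
    "select": ["subject"],
    "join": {
        "container": {"type": "Folder", "select": ["name", "mod_date"], "join": {}},
        "author": {"type": "User", "select": ["username", "creation_date"], "join": {}},
    },
}

store = {}  # in-memory stand-in for the model state kept in HBase


def validate(type_name, fields):
    """Reject fields that the declared event type does not know about."""
    unknown = set(fields) - EVENT_TYPES[type_name]
    if unknown:
        raise ValueError(f"{type_name} has no fields {sorted(unknown)}")


def apply_event(type_name, object_id, fields):
    """Merge the fields carried by one event into the stored object."""
    validate(type_name, fields)
    store.setdefault((type_name, object_id), {}).update(fields)


def materialize(view, object_id):
    """Build one view instance by following references from the root object."""
    obj = store.get((view["type"], object_id))
    if obj is None:
        return None
    result = {name: obj.get(name) for name in view["select"]}
    for ref_field, sub_view in view["join"].items():
        result[ref_field] = materialize(sub_view, obj.get(ref_field))
    return result


apply_event("User", "u1", {"username": "jdoe", "creation_date": 1})
apply_event("Folder", "f1", {"name": "Docs", "mod_date": 2})
apply_event("Content", "c1", {"subject": "Tasmo", "body": "When can we try it?",
                              "container": "f1", "author": "u1"})
view_instance = materialize(CONTENT_VIEW, "c1")
```

The view instance contains only the selected fields (the body is left out), with the referenced Folder and User joined in under the reference field names.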
Event > Model > View > Web Page
[Diagram. Labels: Comment Event (body = “When can we try it?”); Model (Container, Content, Author, Comment); Tasmo; HBase; View]
Web Page backed by View Instance
How we notify consumers
• Consumers register for notifications on a type of view
• Applying an event to the model in Tasmo yields the set of affected view instances
• We push the modified view instances to registered consumers
[Diagram. Labels: events; Tasmo; notify; consumers: Search, Binary storage, Activity Analysis]
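The register-and-push flow described above could look roughly like the following. This is a sketch under assumptions: the ViewNotifier class, its method names, and the view type names are hypothetical and do not come from Tasmo.

```python
from collections import defaultdict


class ViewNotifier:
    """Hypothetical registry mapping a view type to its interested consumers."""

    def __init__(self):
        self._consumers = defaultdict(list)

    def register(self, view_type, callback):
        """A consumer registers for notifications on one type of view."""
        self._consumers[view_type].append(callback)

    def publish(self, affected):
        """Push each affected (view_type, view_id, instance) to its consumers."""
        for view_type, view_id, instance in affected:
            for callback in self._consumers.get(view_type, ()):
                callback(view_id, instance)


notifier = ViewNotifier()
indexed = []  # stands in for a search indexer's work queue
notifier.register("ContentView", lambda view_id, instance: indexed.append((view_id, instance)))

# Applying one event produced two affected view instances; only one view type
# has a registered consumer, so only that instance is pushed.
notifier.publish([
    ("ContentView", "c1", {"subject": "Tasmo"}),
    ("UserView", "u1", {"username": "jdoe"}),
])
```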
How we maintain search indices
• Define views of data which correspond to the index schemas
• Indexing engine registers for notifications of these view types
• Tasmo fires notifications for affected view instances per event
• Indexing engine reads the modified views, which represent complete and up-to-date documents for indexing
[Diagram. Labels: events; Tasmo; HBase; Search; notify; read views; index; read / write]
How it works at 10,000 feet
Tasmo consumes events, consults the configuration describing joins and selects, and applies all relevant changes in each event to update the views.
[Diagram. Labels: events; Tasmo; Write (Values, Existence, Relationships); Traverse (Relationships); Join / Select (Views); writes; scans; updates / removes; concurrency; consistency; retry (multiversion concurrency)]
Taking over time
• Every event gets a Snowflake id, which makes events unique and time-orderable
• Event time is based on when the system receives an event
• Event time is used as the HBase cell timestamp, so logically stale writes are effectively no-ops
• Event time has room to disambiguate add vs. remove:
o Snowflake ids are even numbers
o The Snowflake id is used directly for adds
o The Snowflake id minus 1 is used for removes
o For a given event, adds trump removes
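The timestamp scheme above can be illustrated with a small simulation. This is a sketch, not Tasmo or HBase code: EvenIdGenerator is a hypothetical stand-in for a Snowflake-style generator, and VersionedCell only mimics the "highest timestamp wins" behaviour that the slide relies on.

```python
import itertools


class EvenIdGenerator:
    """Hypothetical stand-in for a Snowflake-style generator: even, increasing ids."""

    def __init__(self):
        self._counter = itertools.count(start=2, step=2)

    def next_id(self):
        return next(self._counter)


class VersionedCell:
    """Mimics one cell in which the write with the highest timestamp wins."""

    def __init__(self):
        self.timestamp = -1
        self.present = False

    def add(self, event_id):
        # Adds use the (even) event id directly as the cell timestamp.
        self._apply(event_id, True)

    def remove(self, event_id):
        # Removes use event id - 1, so an add from the same event always wins.
        self._apply(event_id - 1, False)

    def _apply(self, timestamp, present):
        # A logically stale write (older timestamp) changes nothing.
        if timestamp > self.timestamp:
            self.timestamp, self.present = timestamp, present


ids = EvenIdGenerator()
first, second, third = ids.next_id(), ids.next_id(), ids.next_id()

cell = VersionedCell()
cell.add(second)      # newer event arrives first
cell.remove(first)    # older remove arrives late and is ignored
out_of_order_result = cell.present

same_event = VersionedCell()
same_event.remove(second)
same_event.add(second)  # the add from the same event trumps the remove
same_event_result = same_event.present

cell.remove(third)    # a genuinely newer remove does take effect
later_remove_result = cell.present
```

Because the add and the remove derived from one event never share a timestamp, the outcome is the same whichever of the two is applied first.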
Concurrency Issues
• Problem: As different events add/remove relationships in parallel, we can fail to add/remove elements of views.
• Solution: Per-relationship high-water marks are maintained in an HBase table. We test the per-relationship times we saw during a path traversal against the high-water mark. If we detect that we are stale, we retry the operation.
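The retry loop described above might be sketched as follows. Everything here is an assumption made for illustration: the function name, the shape of the traversal result, the relationship key, and the choice to check for staleness before writing are not taken from Tasmo, and a plain dict stands in for the HBase high-water-mark table. The sketch also ignores the race between the check and the write that a real implementation would have to handle.

```python
def update_view_with_retry(traverse, write_view, high_water_marks, max_attempts=5):
    """Retry a traversal until every relationship time it saw is current.

    traverse() returns (seen, view): 'seen' maps each relationship to the
    timestamp observed during the path traversal, 'view' is the resulting data.
    """
    for attempt in range(1, max_attempts + 1):
        seen, view = traverse()
        stale = [rel for rel, ts in seen.items()
                 if high_water_marks.get(rel, ts) > ts]
        if not stale:
            write_view(view)
            return attempt
    raise RuntimeError(f"still stale after {max_attempts} attempts")


# Hypothetical scenario: a concurrent event already advanced the mark to 10,
# so the first traversal (which saw time 8) is stale and must be retried.
high_water_marks = {"content.author": 10}
snapshots = iter([
    ({"content.author": 8}, {"author": "old"}),
    ({"content.author": 10}, {"author": "new"}),
])
written = []
attempts = update_view_with_retry(lambda: next(snapshots), written.append, high_water_marks)
```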
Why HBase?
• Timestamp control
• Row level atomicity of changes
• Performance and proven scalability
Roadmap
• Production later this year. Currently heavily used by developers at Jive.
• Looking at what work could be moved into coprocessors.
• Considering double writes into two HBase clusters for higher availability if MTTR is too high in our environment.
Questions and Answers
Open source
https://github.com/jivesoftware/tasmo
Please Help!
jonathan.colt@jivesoftware.com
pete@jivesoftware.com
Open Sourcing Tasmo
