Pinot is a real-time distributed OLAP datastore used at LinkedIn to deliver scalable, low-latency real-time analytics. It can ingest data from offline data sources (such as Apache Hadoop and flat files) as well as online sources (such as Apache Kafka). Pinot is designed to scale horizontally.
2. Me
➢ @ananthdurai
➢ Senior Software Engineer
@ Slack
➢ Passionate about all things related
to ethical data management
3. Public launch: 2014
1000+ employees across 7 countries worldwide
HQ in San Francisco
$841M in capital raised
Key investors include Softbank, Accel,
a16z, Social Capital, Index, Thrive, GV,
Kleiner Perkins, GGV, Horizons, Spark,
IVP and DST.
Diverse set of industries
including software/technology, retail, media,
telecom and professional services.
About Slack
6. I have well-defined dimensions & metrics,
and I want real-time aggregation and
querying.
Type 1 (The Analytical pattern)
(e.g.)
❖ performance analytics
❖ experimentation
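The analytical pattern amounts to grouping events by a fixed set of dimensions and rolling up a metric. A minimal sketch in plain Python, assuming hypothetical event fields (`country`, `platform`, `latency_ms`) that do not come from the talk; it illustrates the query shape only, not how Pinot executes it:

```python
from collections import defaultdict

# Hypothetical events: two dimensions (country, platform) and one metric (latency_ms).
EVENTS = [
    {"country": "US", "platform": "ios", "latency_ms": 120},
    {"country": "US", "platform": "ios", "latency_ms": 80},
    {"country": "US", "platform": "web", "latency_ms": 200},
    {"country": "IN", "platform": "web", "latency_ms": 300},
]

def aggregate(events, dimensions, metric):
    """Group events by the given dimensions; return count and sum of the metric per group."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for event in events:
        key = tuple(event[d] for d in dimensions)
        totals[key]["count"] += 1
        totals[key]["sum"] += event[metric]
    return dict(totals)

result = aggregate(EVENTS, ["country", "platform"], "latency_ms")
```

Because the dimensions and metrics are known up front, a datastore can pre-organize data around them, which is what makes this pattern a good fit for real-time aggregation.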
7. I want to slice & dice metrics in real-time.
If something goes wrong, I would like to
see the detail events for the given
dimensions.
Type 2 (The needle in the haystack pattern)
(e.g.)
❖ MySQL slow queries
❖ Audit logs
❖ Security logs
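The needle-in-the-haystack pattern has two steps: a summary over a dimension to spot the anomaly, then a drill-down to the raw events behind it. A minimal sketch, assuming a hypothetical slow-query event shape (`host`, `table`, `duration_ms`, `query`) invented for illustration:

```python
# Hypothetical slow-query events; field names are illustrative only.
SLOW_QUERIES = [
    {"host": "db-1", "table": "users", "duration_ms": 5200, "query": "SELECT * FROM users"},
    {"host": "db-1", "table": "orders", "duration_ms": 900, "query": "SELECT * FROM orders"},
    {"host": "db-2", "table": "users", "duration_ms": 7100, "query": "SELECT * FROM users WHERE id = 1"},
]

def count_by(events, dimension):
    """Step 1: slice the events by one dimension and count them."""
    counts = {}
    for event in events:
        counts[event[dimension]] = counts.get(event[dimension], 0) + 1
    return counts

def drill_down(events, **filters):
    """Step 2: return the detail events matching every given dimension value."""
    return [e for e in events if all(e.get(k) == v for k, v in filters.items())]

summary = count_by(SLOW_QUERIES, "host")
details = drill_down(SLOW_QUERIES, host="db-1", table="users")
```

The difference from the analytical pattern is that the raw events must stay queryable, not just the rolled-up metrics.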
8. I want to dump all the events and explore
further to understand the pattern.
Type 3 (The exploration pattern)
(e.g.)
❖ Log inspection
❖ Exploratory analysis
12. Did I break anything while
launching an experiment?
Is there any performance
degradation with my experiments?
Use case #2: The experimentation framework
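The performance question above can be framed as comparing one metric across experiment variants. A minimal sketch, assuming hypothetical latency samples tagged `control` or `treatment`; the variant names and numbers are made up, and a real framework would also test for statistical significance:

```python
from statistics import mean

# Hypothetical (variant, latency_ms) samples; values are illustrative only.
SAMPLES = [
    ("control", 100), ("control", 110), ("control", 90),
    ("treatment", 120), ("treatment", 130), ("treatment", 110),
]

def mean_by_variant(samples):
    """Group metric values by experiment variant and average each group."""
    grouped = {}
    for variant, value in samples:
        grouped.setdefault(variant, []).append(value)
    return {variant: mean(values) for variant, values in grouped.items()}

def relative_change(control, treatment):
    """Fractional change of the treatment mean relative to the control mean."""
    return (treatment - control) / control

means = mean_by_variant(SAMPLES)
change = relative_change(means["control"], means["treatment"])
```

A positive `change` on a latency metric would suggest the experiment made things slower and is worth a closer look.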
13. Pinot Next
Log search support
Add support for free text search on the
dimensions.
https://github.com/apache/incubator-pinot/issues/2798
Integration with visualization tools
Ease integration with operational
intelligence and business intelligence
solutions such as Grafana and Superset.
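For the log search item, free text search on a dimension is commonly built on an inverted index from tokens to row ids. The sketch below shows that general idea only; it is an assumption for illustration, not Pinot's design or the approach discussed in the linked issue, and the `message` field and sample rows are hypothetical:

```python
from collections import defaultdict

# Hypothetical log rows with one free-text dimension.
ROWS = [
    {"message": "Connection timeout to db"},
    {"message": "User login failed"},
    {"message": "db connection restored"},
]

def build_index(rows, dimension):
    """Map each lowercase token in the dimension to the set of row ids containing it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for token in row[dimension].lower().split():
            index[token].add(row_id)
    return index

def search(index, query):
    """Return ids of rows that contain every token of the query (AND semantics)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return matches

index = build_index(ROWS, "message")
hits = search(index, "db connection")
```

Tokenizing on whitespace keeps the sketch short; a production text index would also handle punctuation, stemming, and ranking.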