From Content Storage to Scaling Smart Data
1. Lily
Smart data, at scale, made easy
from content storage to scaling smart data
IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org
Monday 6 June 2011
3. the pain
[Figure: curves labelled "data" and "moore"; the gap between them is labelled "need for distributed processing"]
4. the pain
» growth of data sets
» smart businesses need to apply analytics to activities
» doing business online means real-time
» talent shortage
5. LILY
The Real-time Platform built for the Age of Data.
We manage, track and measure your data and users,
and do the mat(c)hmaking in-between:
» provide you with business intelligence and analytics
» harvest user profiles and learn their interests
» dynamically engage your users using quality recommendations
6. where would you use lily?
» large collections of data
  » content repositories
  » library catalogs
  » (media) asset management
  » product catalogs
  » ‘live’ archives
» large groups of users
  » e-commerce / retail
  » news / media
» ... if you want to use big data, but you need easy.
7. [Figure: diagram annotated "this is where the magic happens"]
9. beyond content management: data + analytics
[Figure labels: audience data, product / service, personalised recommendations, call to action, revenue]
10. LILY 2.0: smart data
[Figure: "data processing" producing "SMARTER DATA". Labels: relations, recommendations, semantic augmentation; analytics (usage metrics); domain knowledge (patterns, rules, keywords, lists, ...)]
11. roadmap
» now: highly-scalable data repository: store, index and search
» next: with real-time usage stats gathering and analytics
» later: and built-in context- and user-sensitive recommendations
» built on top of HBase (modelled on Google's BigTable) and Solr
» the same robust technology in use at Facebook, Twitter, StumbleUpon, Yahoo!
» scales widely over distributed (cloud) infrastructure
12. Lily Repository Model
16. HBase indexing & RowLog Library
» building and querying indexes, GAE-style
content table:
rowkey  col   col
A       val3  foo6
B       val2  foo7
order index table:
rowkey
val2-B
val3-A
» need for sync/async operations
  » updating of secondary indexes (e.g. link tables)
  » feeding of Indexer (= indexes Lily-content into Solr)
» not: transactions
» need for distribution and durability
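The index-table idea on this slide can be illustrated with a minimal sketch. It is not Lily's or HBase's API: the `OrderedIndex` class, its methods and the in-memory sorted list are all hypothetical stand-ins for an HBase table whose row keys are kept in sorted order. Each index row key is the indexed value joined to the content row key, as in `val2-B` and `val3-A`, so a lookup becomes a prefix scan.

```python
import bisect


class OrderedIndex:
    """Toy secondary index (hypothetical, for illustration only).

    Row keys have the form '<value>-<content rowkey>' and are kept sorted,
    mimicking the sorted row keys of an HBase table.
    """

    def __init__(self):
        self._keys = []

    def add(self, value, rowkey):
        # Insert while keeping the key list sorted.
        bisect.insort(self._keys, f"{value}-{rowkey}")

    def keys(self):
        return list(self._keys)

    def lookup(self, value):
        """Return content row keys whose indexed column equals `value`."""
        prefix = f"{value}-"
        start = bisect.bisect_left(self._keys, prefix)
        hits = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so the prefix range has ended
            hits.append(key[len(prefix):])
        return hits


# The content table from the slide; the column names are assumptions.
content = {
    "A": {"col1": "val3", "col2": "foo6"},
    "B": {"col1": "val2", "col2": "foo7"},
}

index = OrderedIndex()
for rowkey, cols in content.items():
    index.add(cols["col1"], rowkey)

print(index.keys())          # index rows, ordered by value
print(index.lookup("val2"))  # content rows with col1 == "val2"
```

The sketch assumes indexed values contain no `-` separator; a real index would need an unambiguous key encoding.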
17. The Lily Indexer
» indexing of multiple versions of a record
» incremental index updating
» denormalization
» batch index building
» blob content extraction
» sharding towards multiple SOLR instances
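Sharding towards multiple Solr instances means each record has to be routed to one instance in a repeatable way. The slide does not say how Lily selects a shard, so the sketch below assumes a simple hash-modulo scheme; `pick_shard` and the shard URLs are hypothetical names used only for illustration.

```python
import hashlib


def pick_shard(record_id: str, shards: list[str]) -> str:
    """Map a record ID to one shard with a stable hash (assumed scheme)."""
    digest = hashlib.md5(record_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; it does not change across runs.
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Placeholder shard addresses, not real endpoints.
shards = ["http://solr1.example/solr", "http://solr2.example/solr"]

for record_id in ["record-1", "record-2", "record-3"]:
    print(record_id, "->", pick_shard(record_id, shards))
```

A stable hash keeps every update of a record on the same shard, which incremental index updating relies on. Changing the number of shards remaps most records under this scheme.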
18. status june 2011
» Lily 1.0.1 released - developing since Q4/09
» some customers - DIY retail / media / news
» e-commerce platform project
» Lily as the data (integration) tier
» first contrib: FrogPond (annotated Java <> Lily mapper)
https://bitbucket.org/calmera/frogpond
19. Next up: usage stats
» sits in CRUD-path
» tracks user operations against records
» from both perspectives: record and user
» arbitrary K/V properties: time, location, ...
» automatically builds user profiles (as records)
» tied to record operations
» indexed access
» time dimension: trending
[Figure: "interactions" between "record" and "user", with labels "indexes", "recommendations" and "time"]
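A minimal sketch of the usage-stats idea: every operation is logged once and can be read from both the record's and the user's perspective, with arbitrary key/value properties attached. The `UsageTracker` class, its method names and the shape of the profile are assumptions made for illustration; this is not Lily's API.

```python
import time
from collections import Counter, defaultdict


class UsageTracker:
    """Toy tracker for user operations against records (hypothetical)."""

    def __init__(self):
        self.events = []
        self.by_record = defaultdict(list)  # record perspective
        self.by_user = defaultdict(list)    # user perspective

    def track(self, user_id, record_id, op, **props):
        # Arbitrary K/V properties; 'time' defaults to the current time.
        props.setdefault("time", time.time())
        event = {"user": user_id, "record": record_id, "op": op, **props}
        self.events.append(event)
        self.by_record[record_id].append(event)
        self.by_user[user_id].append(event)
        return event

    def profile(self, user_id):
        """Summarise a user's activity; a stand-in for a profile record."""
        events = self.by_user.get(user_id, [])
        return {
            "user": user_id,
            "ops": dict(Counter(e["op"] for e in events)),
            "records": sorted({e["record"] for e in events}),
        }

    def trending(self, since):
        """Records ranked by number of operations at or after `since`."""
        counts = Counter(e["record"] for e in self.events if e["time"] >= since)
        return counts.most_common()


tracker = UsageTracker()
tracker.track("u1", "r1", "read", time=100, location="Gent")
tracker.track("u2", "r1", "update", time=200)
tracker.track("u1", "r2", "read", time=300)

print(tracker.profile("u1"))
print(tracker.trending(since=0))
```

Keeping both views at write time is what makes indexed access cheap in either direction; the time property is what allows a trending query.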
20. from usage stats to recommendations ‘light’
» grouping of users based on
  » shared properties
  » shared record access
» grouping of records based on
  » shared properties
  » shared user operations
[Figure: these groupings form "connections" between "record" and "user", which lead to "recommendations"]
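Grouping by shared record access can be sketched as a simple co-access count: records touched by users who overlap with the current user are scored by how often they co-occur. The `recommend` function and the sample data are hypothetical, and this counting scheme is one assumed way to read "recommendations light", not the method the slides specify.

```python
from collections import Counter, defaultdict


def recommend(accesses, user, limit=3):
    """Suggest records for `user` from (user_id, record_id) access pairs."""
    records_by_user = defaultdict(set)
    users_by_record = defaultdict(set)
    for u, r in accesses:
        records_by_user[u].add(r)
        users_by_record[r].add(u)

    seen = records_by_user.get(user, set())
    scores = Counter()
    for record in seen:
        for other in users_by_record[record]:
            if other == user:
                continue
            # Each shared record adds one vote to the other user's unseen records.
            for candidate in records_by_user[other] - seen:
                scores[candidate] += 1
    return [record for record, _ in scores.most_common(limit)]


accesses = [
    ("ann", "r1"), ("ann", "r2"),
    ("bob", "r1"), ("bob", "r3"),
    ("cat", "r2"), ("cat", "r3"), ("cat", "r4"),
]

print(recommend(accesses, "ann"))  # r3 scores 2 (via bob and cat), r4 scores 1
```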
21. full-on recommendations
» look at real-time-capable Mahout algorithms
» pre-index or -calculate as much as possible
» save as secondary indexes
» present recommendations as part of record API
» allow user to contribute ‘domain knowledge’ to
record processing pipeline
» pattern detection, keywords, ontologies, ...
26. Thank you !
for your attention
for your questions
» stevenn@outerthought.org
» @stevenn