Tiny Google Projects
Ostap Andrusiv
Presentation about 3 Google projects.
Technology
Tiny Google Projects
1.
:)
2.
3.
tiny
:projects
4.
5.
6.
7.
Tesseract OCR 1985
2006 HP Google
8.
Tesseract OCR 2006
2011 TIFF *
9.
Tesseract OCR 2009
2010 Text layout
10.
Tesseract OCR 2007
2011 6 33
11.
Tesseract OCR
Arabic, English, Bulgarian, Catalan, Czech, Chinese (Simplified and Traditional), Danish (standard and Fraktur script), German, Greek, Finnish, French, Hebrew, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak (standard and Fraktur script), Slovenian, Spanish, Serbian, Swedish, Tagalog, Thai, Turkish, Ukrainian and Vietnamese
12.
Tesseract OCR Officially supported:
Probably runs on:
13.
Image processing
14.
15.
16.
17.
Google Refine
18.
Runs on:
19.
Runs in:
20.
Major features:
• Import from anywhere
• Faceting
• Clustering
• Split / create custom columns
• GREL transformations
• Export/etc
21.
22.
google protocol buffers
message Person {
  required int32 id = 1;
  required string name = 2;
  optional string email = 3;
}
Person person;
person.set_id(123);
person.set_name("Bob");
person.set_email("bob@example.com");
fstream out("person.pb", ios::out ...
person.SerializeToOstream(&out);
out.close();
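To make the size of the serialized Person concrete, here is a minimal hand-rolled sketch of the documented protocol buffers wire encoding (varint keys, length-delimited strings). It is not the code that protoc generates, and the helper names (append_varint, append_key, encode_person) are hypothetical, introduced only for this illustration.

```cpp
#include <cstdint>
#include <string>

// Append an unsigned integer as a base-128 varint:
// 7 payload bits per byte, least significant group first,
// high bit set on every byte except the last.
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// A field key is (field_number << 3) | wire_type, itself stored as a varint.
void append_key(std::string& out, std::uint32_t field_number, std::uint32_t wire_type) {
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | wire_type);
}

// int32 fields use wire type 0 (varint); negative values are sign-extended to 64 bits.
void append_int32_field(std::string& out, std::uint32_t field_number, std::int32_t value) {
    append_key(out, field_number, 0);
    append_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(value)));
}

// string fields use wire type 2 (length-delimited): key, byte length, raw bytes.
void append_string_field(std::string& out, std::uint32_t field_number, const std::string& value) {
    append_key(out, field_number, 2);
    append_varint(out, value.size());
    out += value;
}

// Encodes the Person message from the slide: id = 1, name = 2, email = 3.
std::string encode_person(std::int32_t id, const std::string& name, const std::string& email) {
    std::string out;
    append_int32_field(out, 1, id);
    append_string_field(out, 2, name);
    append_string_field(out, 3, email);
    return out;
}
```

Under these assumptions, encode_person(123, "Bob", "bob@example.com") yields 24 bytes: 2 for the id, 5 for the name, and 17 for the email.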
23.
512 bytes / tweet
340,000,000 tweets / day (2012)
7,253,333,333 bytes / hour
2,014,814 bytes / second
1.921 Mbytes / second
15.371 Mbits / second
8 Tbytes / day (2011)
Google: ~ 377M searches/day
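The arithmetic on this slide can be checked with a short sketch. It assumes 1 Mbyte = 1024 × 1024 bytes, which is what reproduces the slide's figures; the StreamRate struct and the tweet_stream_rate function are hypothetical names used only here.

```cpp
#include <cstdint>

// Throughput figures derived from a per-message size and a daily message count.
struct StreamRate {
    std::uint64_t bytes_per_day;
    std::uint64_t bytes_per_hour;    // truncated to whole bytes
    std::uint64_t bytes_per_second;  // truncated to whole bytes
    double mbytes_per_second;        // assumes 1 Mbyte = 1024 * 1024 bytes
    double mbits_per_second;         // 8 bits per byte
};

StreamRate tweet_stream_rate(std::uint64_t bytes_per_tweet, std::uint64_t tweets_per_day) {
    StreamRate r{};
    r.bytes_per_day = bytes_per_tweet * tweets_per_day;
    r.bytes_per_hour = r.bytes_per_day / 24;
    r.bytes_per_second = r.bytes_per_hour / 3600;
    r.mbytes_per_second =
        static_cast<double>(r.bytes_per_day) / 86400.0 / (1024.0 * 1024.0);
    r.mbits_per_second = r.mbytes_per_second * 8.0;
    return r;
}
```

Calling tweet_stream_rate(512, 340000000) gives 7,253,333,333 bytes per hour and 2,014,814 bytes per second, or a little over 1.92 Mbytes (about 15.37 Mbits) per second.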
24.
+ =
25.
+ =
26.
+ =
27.
>
+ =
28.
>
+ =
29.
>
+ = ? MapReduce
30.
31.
snappy http://code.google.com/p/snappy/
32.
snappy
• Fast
• Stable
• Robust
• Free and BSD
33.
[Chart: Size (less is better). Compression ratio (%) for lzjb, lzo, fastlz, lzf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy and quicklz]
34.
[Chart: Data types. Compression ratio of snappy vs. zlib for plain text, html and jpeg]
35.
Size: from 20% to 100% bigger :(
...not for Amazon Glacier
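As a rough illustration of how such a "bigger" figure follows from compression ratios, the sketch below converts two ratios into a relative size difference. It assumes the plain-text ratios quoted in the editor's notes (about 1.5 to 1.7 for snappy, about 2.7 for zlib); the percent_bigger function is a hypothetical helper, not part of snappy.

```cpp
// How much bigger (in percent) compressor A's output is than compressor B's
// for the same input, given their compression ratios
// (ratio = original size / compressed size).
double percent_bigger(double ratio_a, double ratio_b) {
    // Compressed size is proportional to 1 / ratio, so
    // size_a / size_b = ratio_b / ratio_a.
    return (ratio_b / ratio_a - 1.0) * 100.0;
}
```

With these assumed ratios, percent_bigger(1.5, 2.7) is about 80 and percent_bigger(1.7, 2.7) is about 59, so snappy's plain-text output would be roughly 59% to 80% larger than zlib's, inside the range the slide quotes.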
36.
[Chart: Speed (more is better). Compression (MB/s) for lzjb, lzo, fastlz, lzf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy and quicklz]
37.
[Chart: Speed (more is better). Decompression (MB/s) for lzjb, lzo, fastlz, lzf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy and quicklz]
38.
On 1 core of 64-bit Core i7 processor:
• Compression: 250MB/s
• Decompression: 500MB/s
:P
39.
Portable, but...
40.
Portable, but primarily optimized for 64-bit x86-compatible processors
41.
Used: BigTable, MapReduce, Google RPC, Hadoop
42.
Bindings:
43.
@TarasRoshko
HTTP headers here: http://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
44.
QA?
Ostap Andrusiv, Software Engineer, Eleks software, @p1f
Editor's Notes
http://www.cloudera.com/blog/2011/09/snappy-and-hadoop/
In-memory test (compression and decompression) with ENWIK8 using 1 core of Intel Xeon X5355 @ 2.66GHz (64-bit compilation under gcc 4.1.1 (Linux) -O3 -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math --param inline-unit-growth=999 -DNDEBUG)
Compression ratio by data type:
             snappy     zlib
plain text   1.5-1.7    2.7
html         2-4        3-7
jpeg         1          1
http://aws.amazon.com/glacier/
http://pastebin.com/SFaNzRuf
http://encode.ru/threads/1255-Google-released-Snappy-compression-decompression-library