Big Data World 2013 - How LinkedIn leveraged its data to become the world's largest professional network

Vitaly Gordon

How LinkedIn leveraged its data to become the world's
largest professional network

About me
©2013 LinkedIn Corporation. All Rights Reserved. 2
Vitaly Gordon

Agenda
1 What is Big Data?
2 Big Data Applications
3 LinkedIn’s Big Data Solutions
4 Finding Experts
5 Big Data Recipe
6 Summary

Data sets that are too large and complex
to manipulate or interrogate with standard
methods or tools.
Oxford Dictionary

Big Data Growth
Chart: Storage Growth vs. Data Growth (logarithmic axis, 1E+00 to 1E+09)

increase in sales

of watched content

40M users in 18 months

Big Data is more about Business
than Data

LinkedIn Revenue
Quarterly Revenue ($ millions)
Chart legend: Hiring Solutions, Marketing Solutions, Premium Subscriptions

Year   Q1    Q2    Q3    Q4
2009   23    28    30    39
2010   45    55    62    82
2011   94    121   139   168
2012   188   228   252   304
2013   325

Premium Subscriptions

Marketing Solutions

Talent Solutions

Connecting Talent With Opportunity

Jobs You May Be Interested In (JYMBII) – Case Study
Software Engineer at
Data Scientist at
Product Manager at

Jobs You May Be Interested In – Case Study
Design

JYMBII – Building The Product
Design, Algorithms, Framework

Design

Design

Design
1,000X more users

Start simple
Grow with success

Algorithms

Algorithms

Algorithms
50% better results

Technology
Some people, when confronted with a big
data problem, think, I'll use Hadoop.
Now they have a big data problem and a
big Hadoop cluster.
Dmitry Ryaboy, Twitter Engineering Manager

Technology

Technology Advancement

Technology Advancement
50X faster
Kafka

Finding Data Experts
Increase in demand for big data experts
X

Finding Data Experts
Are new analytics experts
33

Finding Data Experts
Be challenged at LinkedIn
We're looking for superb analytical minds
of all levels to expand our small team that
will build some of the most innovative
products at LinkedIn.
No specific technical skills are required
(we'll help you learn SQL, Python, and R).
You should be extremely intelligent, have a
quantitative background, and be able to
learn quickly and work independently.
This is the perfect job for someone who's
really smart, driven, and extremely skilled
at creatively solving problems. You'll learn
statistics, data mining, programming, and
product design, but you've gotta start with
what we can't teach—intellectual
sharpness and creativity.

LinkedIn Experts

LinkedIn Experts

Don't wait for a big data expert to
knock on your door - create your own

Big Data Recipe

Big Data Recipe
INGREDIENTS
1. Important business metric
2. Correlating factors
3. Causing factors
4. Product to affect the behavior
METHOD OF PREPARATION
1. Build a simple prototype
2. Measure the effect
3. Improve logic and scale
4. Measure the effect
5. Improve logic and scale
6. Measure the effect
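
The "measure the effect" steps above can be sketched in code. This is only a minimal illustration, not LinkedIn's method: it assumes the business metric is recorded for a control group (no product) and for each product version, and it reports the relative lift of each version over the control. The function names and all numbers are hypothetical.

```python
from statistics import mean


def relative_lift(control, treatment):
    """Relative change of the treatment mean over the control mean."""
    base = mean(control)
    if base == 0:
        raise ValueError("control mean is zero; lift is undefined")
    return (mean(treatment) - base) / base


def measure_versions(versions, control):
    """Measure each successive product version against the same control group."""
    return [(name, relative_lift(control, metric)) for name, metric in versions]


# Hypothetical metric: job applications per 100 members, sampled over four days.
control = [10, 12, 11, 9]       # members who did not see the product
prototype = [12, 13, 12, 11]    # step 1: simple prototype
improved = [15, 16, 14, 15]     # step 3: improved logic and scale

results = measure_versions(
    [("prototype", prototype), ("improved", improved)], control
)
for name, lift in results:
    print(f"{name}: {lift:+.1%} vs. control")
```

Keeping the same control group for every iteration makes the successive measurements comparable, which is the point of repeating "measure the effect" after each improvement.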

Thank you

Editor's Notes

LinkedIn: I am currently a data scientist at LinkedIn, one of the world's most advanced big data companies. LivePerson: I previously worked at LivePerson, where I was the first person hired to build their big data solution, so I have experienced both the very beginning of big data solutions and the cutting edge. I will share with you the lessons I've learned while working on big data at both ends of the spectrum. I also have a business degree from the Israel Institute of Technology and a computer science degree from Ben-Gurion University.
This is what I am going to talk about. I chose these subjects because they answer the most burning questions I had, both when I was starting out with big data and when I was perfecting my craft.
The term "Big Data" is used in many ways, so before I start talking about big data, I want to explain what it is.
Yes, there is an entry in the Oxford English Dictionary for Big Data
The key word here is "standard". Before Big Data, standard methods and tools were enough to process the data we had; now they are not. So what happened?
Data created opportunities, which in turn created demand for even more data and the amount of data in the world grew larger and larger
So what are those big data opportunities I've mentioned? The best way to see is through examples.
Amazon, the ecommerce giant, analyzes data about its shoppers. It analyzes what products they are looking at, what products they are searching for and, most importantly, what products they are buying. This analysis enables them to produce a product I am sure you have all seen ...
Here we can see that if I look at the book "Big Data Analytics", Amazon provides me with recommendations for similar books. -- Show increase in sales -- So why did it increase sales so much? The logic here is simple: the more products customers see, the higher the chance they will buy something. Amazon wants to show us as many products as it can in order to get us to buy something.
My second example is Netflix. Netflix is an American company that started as a DVD rental service and quickly became a streaming platform for movies and TV shows. It has about 30 million subscribers. At the end of each movie, Netflix asks the viewer to rate the movie they just watched. Netflix has billions of movie ratings from millions of users, and it uses this data to create the following product.
Using our rating history, Netflix calculates a unique "taste" for every one of its subscribers and uses this taste to recommend movies to them. This product is so important to Netflix that in 2006 it offered a prize of one million dollars to whoever could improve its algorithm by more than 10%. -- Show statistics -- So why is this recommendation engine so important? The more users find movies they like on Netflix, the longer they will keep their subscription, earning Netflix money.
My third example is a small Israeli startup. Waze is a GPS mobile app that tracks where people are and at what speed they are travelling.
Waze uses this data to compute traffic maps that show which streets have traffic jams, and routes you according to this data, providing much better route suggestions than apps that don't use traffic information. After gaining more than 50 million users for its app, Waze was acquired by Google for about 1.1 billion dollars. Side note: I understand there will be a talk later today by a Korean company that does something very similar.
The above examples, and many more, lead me to the first lesson I've learned about big data
These are great examples. But to dive even deeper into big data applications, let's look at the company I currently work for, LinkedIn. Since we said that Big Data is more about business than data, let me first show you what LinkedIn's business is.
LinkedIn is the largest professional social network in the world. It has more than 225M members. Our largest markets today are North America and Europe, but Asia is growing very well too, with several countries having more than a million members on LinkedIn.
Not only does LinkedIn have a lot of members, it also makes significant revenue. Across its 3 business lines, LinkedIn made almost a billion dollars last year and about 325 million in the first quarter of 2013.
These 3 product lines are Premium Subscriptions, Marketing Solutions and Talent Solutions. Let's dive more deeply into each one of them to understand them better.
The premium subscriptions business is for LinkedIn users who want extra features on LinkedIn. Those features might include better analytics about who viewed their profile and the ability to contact anyone on LinkedIn through InMails, LinkedIn's personal messaging system. This product really separates LinkedIn from other social networks, in that some of the users of the network pay extra to use it.
Marketing solutions is more similar to what you can find on other social networks. We offer companies the ability to market their products to our members. Since LinkedIn is a professional network, with most members having a job or even a lucrative one, the target population is very appealing to marketers.
Our third product line, and the largest in terms of revenue, is talent solutions. Here companies like Sony, Walmart and L'Oréal pay for their recruiters to have additional functionality for their recruiting needs. This is almost like another product inside LinkedIn for our recruiter members. This product line brings in about 57% of LinkedIn's revenue.
LinkedIn's number 1 mission is connecting talent with opportunity: both helping companies find new talent and helping our 225+ million members find new opportunities when they need them. One of the first big data applications at LinkedIn was to help members find a new job, and I will now dive deep into how it was done.
JYMBII (Jobs You May Be Interested In) is a big data product that matches members with job postings on LinkedIn. For example, here is me, and some of the jobs companies posted on LinkedIn. For every job, we compute a score for how good a fit the job is for the member. Here you can see that I am a good match for a data scientist position at Facebook, and not such a good match for a product manager position at Yahoo.
After creating scores for all the jobs in our database, we create a small widget on our homepage where every member can see their top matching jobs.
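To make the "score every job, then show the top matches" step concrete, here is a minimal sketch. The job titles beyond the two mentioned in the talk, the score values and the function name top_jobs are all made up for illustration; this is not LinkedIn's actual code.

```python
import heapq

def top_jobs(scores, n=3):
    """Return the n highest-scoring (job, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])

# Hypothetical scores for one member; higher means a better fit.
scores = {
    "Data Scientist at Facebook": 0.92,
    "Product Manager at Yahoo": 0.31,
    "Data Engineer at ExampleCo": 0.78,
    "Sales Lead at ExampleCo": 0.12,
}

for job, score in top_jobs(scores, n=3):
    print(f"{score:.2f}  {job}")
```

Using heapq.nlargest avoids sorting the whole job list when only a handful of results are shown in the widget.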
I will walk you through the 3 pillars of every big data product – Design, Algorithms and Infrastructure/Framework.
Let's start with design. In a consumer-oriented company design is very important, because this is how users interact with your product. Also, in many cases, design is the hardest thing for a single small team to change because so many teams are involved. In most companies the big data team is separate from the team that works on the main product, so those of you who have already started implementing big data solutions probably know how difficult it is to run tests on the main product. Do anything you can to bypass other teams in your organization when testing your big data solutions. When LinkedIn's Data Science team decided to build JYMBII, they wanted a very simple way to test whether their product worked without making too many changes to the main site. This is how they did it: they started with email. Here you can see how the actual email looks today, where I got some recommendations for jobs I might be interested in. They chose email because it lets you test your product on a small subset of users, without affecting everyone who comes to your website and without making any changes to the main website.
After the initial emails showed great success and that people were actually interested in the recommendations, our team built this very small widget that shows the top jobs you might be interested in. Again, it was done with minimal integration with the main website, by having this widget replace one of the ads we had on the site for a certain percentage of the users.
After the great success of the widget, jobs now have their own section on the LinkedIn website, where users can search for jobs and more. Having the jobs section resulted in 1000 times more users looking at LinkedIn jobs than before. Remember, JYMBII did not start with its own section of the site, but grew to have one.
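Showing the widget to "a certain percentage of the users" can be done in a few lines. The sketch below assumes one common approach, deterministic hash bucketing; the talk does not say how LinkedIn actually did it, and the experiment name, the member ids and the function in_test_group are hypothetical.

```python
import hashlib

def in_test_group(member_id, experiment, percent):
    """Deterministically place a member in the first `percent` of 100 buckets."""
    key = f"{experiment}:{member_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return bucket < percent

# Hypothetical experiment: show the jobs widget instead of an ad to 5% of members.
shown = [m for m in range(10_000) if in_test_group(m, "jobs-widget", 5)]
print(f"{len(shown) / 10_000:.1%} of members see the widget")
```

Because the bucket depends only on the experiment name and the member id, the same member sees the same variant on every visit, and raising the percentage later only adds members to the test group.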
My main message about how to design data products is to start simple and grow with success.
Let's now talk about algorithms, or how LinkedIn matches members with job postings. The first iteration of the algorithm was very simple. We look at the member's profile, we look at the job posting and we do keyword matching, very similar to how recruiters screen candidate resumes for a potential match. In this example we can see that my profile is a pretty decent match for this job opportunity. There is no need for a natural language processing expert or someone with a doctorate in computer science to implement this algorithm. It is pretty simple and worked pretty well for our first prototype.
When the first prototype of the email succeeded, the team moved on to improve the algorithm a bit further, adding features like education and experience, which are also very important for determining a candidate's fit for a position. These changes improved the recommendations even further, resulting in more people engaging with jobs on the LinkedIn website.
Finally, now that we have our jobs page on the website, where users can search for jobs, save jobs and apply for jobs, we can use all of these signals to recommend jobs similar to the ones users found themselves. All of these improvements resulted in 50% more accurate job recommendations for our members.
The message for algorithms is the same as it is for design: don't try to implement something very difficult before you know your customers even want it. Start simple and grow with success.
Here is a quote from a Twitter engineering manager that I like very much. What it says is that most of the time, Hadoop doesn't solve a big data problem; it actually brings a set of new problems to deal with, even before we know that what we are trying to build is worth building.
The first JYMBII prototype was developed using very simple technology: Oracle, with some Perl scripts and some shell scripts in between. The process involved someone copying files manually from one computer to another, running some scripts on that computer and then copying back the results. The process was so inefficient that it took 6 weeks to run. But 6 weeks is better than never.
After the success of the initial product, LinkedIn decided to make some infrastructure investment, buying a parallel database from companies like Greenplum and Aster Data. This sped up the process so that it ran in a single week instead of 6.
Eventually LinkedIn not only moved to Hadoop but also built its own infrastructure, with projects like Kafka, Voldemort and Zoie. You can find more information about them on the LinkedIn open source page. Now we are generating new recommendations every day, which is roughly 40 times more often than every 6 weeks. You have probably figured out the second lesson by now ...
One of the most important questions, and one that kept me busy for a long time as well, is where you find big data experts. Before I give you the answer, I would like to show you 2 graphs.
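Keyword matching of this kind fits in a few lines. The following is a minimal sketch, not the original prototype: the stop-word list, the sample texts and the function names keywords and match_score are assumptions made for illustration, and the score is simply the fraction of the posting's keywords that also appear in the profile.

```python
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "the", "to", "with"}

def keywords(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}

def match_score(profile, posting):
    """Fraction of the posting's keywords that also appear in the profile."""
    wanted = keywords(posting)
    if not wanted:
        return 0.0
    return len(wanted & keywords(profile)) / len(wanted)

# Made-up profile and posting text.
profile = "Data scientist with experience in Hadoop, Python and machine learning"
posting = "Looking for a data scientist skilled in Python and Hadoop"

print(round(match_score(profile, posting), 2))  # 4 of 6 posting keywords match -> 0.67
```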
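One simple way to fold extra features such as education and experience into a single score is a weighted average. This is only a sketch under assumptions: each signal is taken to be already scaled to the range 0..1, and the weights and the function name fit_score are hypothetical, not values LinkedIn used.

```python
def fit_score(keyword_overlap, education_match, experience_match,
              weights=(0.6, 0.2, 0.2)):
    """Weighted average of three signals, each expected in the range 0..1."""
    signals = (keyword_overlap, education_match, experience_match)
    return sum(w * s for w, s in zip(weights, signals)) / sum(weights)

# Hypothetical member: decent keyword overlap, matching degree, too little experience.
print(round(fit_score(0.5, 1.0, 0.0), 2))
```

In practice the weights would be tuned by measuring which version of the score gets more members to engage with the recommended jobs.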
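As an illustration of using behavioral signals, the sketch below recommends jobs that other members acted on together with the jobs a member already found. It assumes a simple co-occurrence count, which is one of many possible techniques and not necessarily what LinkedIn runs; the activity log, job ids and function names are invented.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_similar_jobs(sessions):
    """Count how often two jobs were viewed, saved or applied to by the same member."""
    co_counts = defaultdict(Counter)
    for jobs in sessions.values():
        for a, b in permutations(set(jobs), 2):
            co_counts[a][b] += 1
    return co_counts

def recommend(co_counts, seen, n=2):
    """Suggest the n jobs that most often co-occur with jobs the member already found."""
    totals = Counter()
    for job in seen:
        totals.update(co_counts.get(job, {}))
    for job in seen:
        totals.pop(job, None)  # never recommend a job the member already found
    return [job for job, _ in totals.most_common(n)]

# Hypothetical activity log: member -> jobs they viewed, saved or applied to.
sessions = {
    "m1": ["data-scientist", "ml-engineer", "data-analyst"],
    "m2": ["data-scientist", "ml-engineer"],
    "m3": ["product-manager", "data-analyst"],
}
co_counts = build_similar_jobs(sessions)
print(recommend(co_counts, seen=["data-scientist"]))  # ['ml-engineer', 'data-analyst']
```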
Here you can see that in the beginning of 2011 the demand for big data experts was 30 times higher than the year before. Now it is even higher. Everyone is looking for big data experts.
Here is a graph from LinkedIn's own analytics team. Here you can see that 33% of the people who started a job as data scientists or analysts are new to this kind of job. You can probably see where I am going with this. Most people who work in big data are new to big data. LinkedIn realized this quickly, and here is the proof ...
Here is an actual LinkedIn job posting from 2008, when LinkedIn had just started with big data. The key message is this ... No specific technical skills are required. Here is an example of how LinkedIn has applied this strategy with 2 of my colleagues.
Joseph Adler came to LinkedIn from Netflix, where he did Operations Engineering. Now he is one of our top experts on big data and has even written a very successful book about it.
Jason is a new data scientist at LinkedIn. Prior to that he was a radar signal processing expert. He is still just at the beginning of his career at LinkedIn, but so far he is doing very well and educating himself quickly.
My third lesson is a bit hard to chew, but if you follow my previous 2, it becomes easier. Look for big data experts everywhere and at all times, but don't let it stop you from starting your projects.
So how do you start a big data project? I would like to show you a very simple recipe you could follow
As always, in order to make it clearer, I will use an example to guide us through the recipe. People You May Know is a LinkedIn Big Data product that traverses your profile and the entire LinkedIn graph to suggest people you should connect with. Let's see how we can use our recipe to create big data applications such as People You May Know.
Important business metric: how often members visit the website. Correlating factors: how many new items they have on their news feed. But that is not the root cause; something else is affecting it. Causing factors: how many connections they have. Product: recommend new connections to users, that is, People You May Know. Beware of the second-system effect: how many of you have been involved with projects where the first prototype was pretty successful and the second one was much bigger and failed?
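The first ingredients of the recipe can be explored with very little code. The sketch below computes a Pearson correlation between the business metric and two candidate factors. All numbers are made up for illustration, and the variable and function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up numbers for six members, for illustration only.
visits_per_month = [2, 4, 5, 9, 12, 15]         # important business metric
feed_items = [3, 8, 9, 20, 26, 30]              # candidate correlating factor
connections = [10, 40, 55, 120, 200, 260]       # candidate causing factor

print(f"visits vs feed items:  {pearson(visits_per_month, feed_items):.2f}")
print(f"visits vs connections: {pearson(visits_per_month, connections):.2f}")
```

A high correlation only nominates a factor. Telling a causing factor from a merely correlating one still takes the "build a simple prototype, measure the effect" steps of the recipe.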
