WHITE PAPER
BIG DATA AND THE NEEDS OF
THE PHARMA INDUSTRY
JULY 2013
TABLE OF CONTENTS
Introduction
Big Data into Little Data
Big Data as the Next Wave of an Historical Trend
    1960’s Big Data – Patents
    1970’s Big Data – Chemistry
    1980’s Big Data – Sequences
    1990’s Big Data – Arrays
    2000’s Big Data – Next Generation Sequencing
Big Data in the Present
    New tools and techniques for old problems
    New problems; new opportunities
Big Data in the Future
    Data about Big Data
    Big Data engineers
Conclusion
INTRODUCTION
Gartner, Inc. neatly defines Big Data as “high-volume, high-velocity and high-variety information
assets that demand cost-effective, innovative forms of information processing for enhanced insight and
decision making.” Thomson Reuters Life Sciences sees Big Data as both a problem and an opportunity.
In a recent survey, Thomson Reuters asked a group of IT leaders in Pharma how they view Big Data.
Unequivocally, 100 percent responded that Big Data is an opportunity. This isn’t that surprising given
that looking for empirical facts in large bodies of evidence has always been a driver of innovation. The
Pharma industry generally works by identifying some correlation, e.g. between disease and protein,
and then working out how to exploit it. The science explaining why that correlation exists tends to follow,
despite recent trends towards science-led discovery. The ability to apply this thinking to even larger
and richer sets of data is clearly an opportunity for Pharma to find new correlations and develop new
drugs.
When asked about where they saw Big Data opportunities, our respondents overwhelmingly
highlighted two areas of focus: early-stage discovery (41.2 percent) and understanding the market
(26.5 percent). Figure 1 illustrates this, while also showing the emerging trend of personalized (or
precision) medicine. Drug discovery has always been a data-driven activity and it makes sense to
extend that to the new volumes, varieties and velocities of data coming out of the labs and out of public
initiatives. Market understanding is new. It reflects both the change in the Pharma market dynamics,
caused by the greater influence on prescribing behavior by payers, and the promise of rich patient-level
data from electronic health records. Understanding the patient (personalized medicine), scoring 14.7
percent in our survey, is also a significant focus in Pharma right now, so access to this kind of data,
and the ability to digest it, represents a real win for our respondents in providing value to their businesses.
Figure 1: THE BIGGEST OPPORTUNITIES FOR BIG DATA
Source: Thomson Reuters Big Data Survey
BIG DATA INTO LITTLE DATA
The problem: humans can’t work with Big Data directly. To realize the value of all this data, we need
to reduce it to human proportions. When you examine what our customers are really doing with Big
Data, what you see is the application of tools and techniques to “shrink” the data. In the Life Sciences
business of Thomson Reuters, we call this “making Big Data look like Little Data.” Little Data is the
data we are equipped to handle. It comprises reliable, evidence-backed facts that scientists can use in
models, visualizations and analyses. It is actionable data that can help Pharma companies with their
core business of developing new and better drugs.
In drug discovery, this usually takes the form of designing in-silico experiments. This requires building
up data sets from disparate sources, requiring cross-functional teams to work together on data that
scores highly in the Variety vector of Big Data. Making this data act like Little Data is a challenge in
data harmonization. You need common ontologies and ways to present data to scientists with very
different skills and backgrounds, so it is delivered in a way they understand.
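
As a rough illustration of what harmonization against a common ontology can look like, the sketch below maps source-specific labels onto one shared term and sets aside anything it cannot map. The synonym table, source names and values are invented for this example and are not drawn from any Thomson Reuters product or data set.

```python
# Hypothetical synonym table: maps source-specific labels to one shared term.
# A real ontology would be curated and far larger; this is only a sketch.
SYNONYMS = {
    "tnf": "TNF",
    "tnf-alpha": "TNF",
    "tumor necrosis factor": "TNF",
    "il6": "IL6",
    "interleukin-6": "IL6",
}


def harmonize(records):
    """Group (source, label, value) records under a shared ontology term.

    Returns a dict of term -> list of (source, value), plus the list of
    records whose label is not in the synonym table.
    """
    merged = {}
    unmapped = []
    for source, label, value in records:
        term = SYNONYMS.get(label.strip().lower())
        if term is None:
            # Keep unmapped records visible so a curator can review them.
            unmapped.append((source, label, value))
        else:
            merged.setdefault(term, []).append((source, value))
    return merged, unmapped


# Illustrative records from three made-up sources.
records = [
    ("assay_db", "TNF-alpha", 0.82),
    ("literature", "Tumor Necrosis Factor", 0.75),
    ("expression", "IL6", 1.40),
    ("expression", "XYZ1", 0.10),
]
merged, unmapped = harmonize(records)
print(merged)
print(unmapped)
```

Keeping the unmapped records rather than dropping them reflects the point that harmonization is partly an editorial task: labels that fail to map are candidates for review, not noise to discard.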
In patient understanding, the challenge is to get the data in one place and to filter the outputs so
that the interesting correlations stand out from the “noise” of correlations that are either obvious or
spurious, a situation analogous to signal amplification in the audio industry.
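
One simple way to picture this filtering step is sketched below: compute pairwise correlations, drop pairs already known to be related (the obvious ones), and drop weak or thinly supported ones. The column names, thresholds and the list of known pairs are assumptions made for illustration. A low correlation or small sample is only a crude proxy for "spurious"; separating genuine from spurious correlations in real patient data needs proper statistical testing and domain review.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (sx * sy)


def interesting_pairs(columns, known, min_abs_r=0.8, min_n=5):
    """Return (a, b, r) for strong correlations not already in `known`.

    `known` holds alphabetically ordered name pairs treated as obvious.
    `min_abs_r` and `min_n` are illustrative thresholds, not recommendations.
    """
    names = sorted(columns)
    results = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if (a, b) in known:
                continue  # obvious: already known to be related
            if len(columns[a]) < min_n:
                continue  # too few observations to trust
            r = pearson(columns[a], columns[b])
            if abs(r) >= min_abs_r:
                results.append((a, b, round(r, 3)))
    return results


# Made-up patient-level columns for illustration only.
columns = {
    "age": [30, 40, 50, 60, 70],
    "dose": [1, 2, 3, 4, 5],
    "response": [2.1, 3.9, 6.2, 8.0, 9.9],
    "visits": [3, 1, 4, 1, 5],
}
known = {("age", "dose")}  # assumed to be an already-understood relationship
result = interesting_pairs(columns, known)
print(result)
```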
BIG DATA AS THE NEXT WAVE OF AN
HISTORICAL TREND
Challenging volumes of data are not a new phenomenon in Pharma. Rather, Big Data is an evolution in
the application of data in drug R&D. It creates new challenges and provides new tools, but is not the
complete revolution that some commentators claim it to be. Over many years, information companies
like Thomson Reuters have supported customers in their efforts to divide whatever is the current “data
elephant” into manageable chunks.
1960’S BIG DATA – PATENTS
An early Big Data challenge was the boom in patenting. In the first half of the twentieth century, a
Pharma scientist could keep up with all the patents in his or her field by reading the patent applications
personally. Figure 2 shows that by 1960, he or she would have had to read over 1,000 patents a year to keep up,
and would have had to be able to read at least English, German, French and Japanese. Abstracting and
indexing services such as Derwent Publications emerged to address this challenge. Derwent’s curation
team read, classified and abstracted all the patents coming out of the leading patent offices around
the world (as they continue to do today). The editorially enhanced information was published in
weekly bulletins separated by area of interest so scientists could easily get to those of relevance.
As can be seen from the graph, this trend has continued more or less unchecked ever since. Nowadays
nobody would even consider trying to keep up with patent information in any other way than setting up
tailored alerting services on patent databases.
Figure 2: THE RISE IN GLOBAL PHARMA PATENTING
[Chart: patent families per year, 1960 to 2010; vertical axis from 0 to 140,000]
Source: Thomson Reuters Derwent World Patents Index
1970’S BIG DATA – CHEMISTRY
The 1970’s saw the emergence of computer databases, in particular databases
that could store, search and display chemical structures. These enabled Pharma companies to start
to build the internal registry databases now counted as key assets and prime candidates for novel Big
Data experiments. The emergence of online transactional services, the “Hosts,” like Dialog, Questel
Orbit (now called Orbit) and STN, enabled paper-based indexing services like ISI’s Science Citation
