Tagging at The New York Times
(1851-Today)
Jennifer Parrucci
Senior Taxonomist
The New York Times
The Taxonomy Team
Kristi Reilly, Taxonomy Manager
Kristi has been working with Times Tags for 15 years. She started at the Times as an Information Architect
and became fascinated with how a 150-year-old vocabulary has evolved
and helped connect readers with the stories that matter most to them. Prior to NYT, she worked for
consulting companies owned by Ogilvy and Deloitte.
Jennifer Parrucci, Senior Taxonomist
Jennifer has been working with Times Tags for 12 years. She began her career at The Times as an indexer
for The New York Times Index where she worked writing abstracts and assigning metadata. Prior to her
work at The Times she received her M.L.I.S. from Pratt Institute, where she later earned her certification in
archives. Jennifer has always been passionate about books, organizing content, surfacing historical
treasures and making sure that everyone can find what they need.
Dan McComas, Taxonomist
Dan has enjoyed a 20-plus-year career at the NYT in wide-ranging roles across the organization. Before
joining the Times Tags team in 2015, Dan was a copy editor in the Times Index department and brings
extensive indexing experience to his current role.
Rigorous quality checks originate from the Index,
where the indexer's role was treated as an apprenticeship.
The New York Times Index
“The Paper of Record”
What are Times Tags?
● A controlled vocabulary
○ Named entities (over a million)
■ People, Places, Organizations, Titles
○ Subjects (4,500)
■ Semantic Relationships: Broad, Narrow and Related Terms + Scope Notes
■ Includes News Events
● Assigned to all published assets
○ Articles, videos, slide shows, interactives, podcast episode pages
● Rule-based Software
○ Entity extraction: normalization, disambiguation
○ Categorization: frequency, proximity, placement
Normalization
Qaddafi, Muammar el-
Rule-based Entity Extraction
Muammar El-Gadhafi
Muammar el-Gaddafi
Muammar al-Gaddafi
Muammar el-Gadhafi
Muammar Gadhafi
Muammar Qaddafi
Muammar al-Gadhafi
Muammar Al-Gadhafi
Moammar El-Qaddafi
Muammar El-Gaddafi
Muammar el-Qaddafi
Moammar Al-Gaddafi
Muammar El-Qaddafi
Moammar el-Gaddafi
Moammar Gaddafi
Muammar Al-Gaddafi
Moammar el-Qaddafi
Moammar Al-Gadhafi
Moammar El-Gaddafi
Moammar Al-Qaddafi
Muammar al-Qaddafi
Moammar El-Gadhafi
Moammar al-Qaddafi
Moammar Qaddafi
Muammar Al-Qaddafi
Muammar Gaddafi
Moammar el-Gadhafi
Moammar Gadhafi
Moammar al-Gaddafi
Moammar al-Gadhafi
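The variant list above can be read as a lookup from each spelling to the single tag "Qaddafi, Muammar el-". What follows is only a minimal sketch of that idea, not the software the Times uses: the names `VARIANT_TO_TAG` and `normalize` are hypothetical, and the observation that the 30 spellings are every combination of two first names, five particles and three surnames is drawn from the list itself.

```python
from itertools import product
from typing import Optional

# Canonical tag shown on the slide.
CANONICAL = "Qaddafi, Muammar el-"

# Building blocks observed in the 30 variants listed on the slide.
FIRST_NAMES = ("Muammar", "Moammar")
PARTICLES = ("", "el-", "al-", "El-", "Al-")
SURNAMES = ("Gadhafi", "Gaddafi", "Qaddafi")

# Hypothetical lookup table: every variant spelling maps to the one tag.
VARIANT_TO_TAG = {
    f"{first} {particle}{surname}": CANONICAL
    for first, particle, surname in product(FIRST_NAMES, PARTICLES, SURNAMES)
}


def normalize(name: str) -> Optional[str]:
    """Return the canonical tag for a known variant, or None if unknown."""
    collapsed = " ".join(name.split())  # tolerate stray whitespace
    return VARIANT_TO_TAG.get(collapsed)


print(normalize("Moammar al-Gadhafi"))
```

A plain dictionary keeps the mapping auditable, which suits a vocabulary that taxonomists maintain by hand; a production system would presumably store variants per entity rather than generate them.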
Disambiguation
Michael Jackson
{(OR,"album","albums","Apollo Theater","Billie Jean","Dan Reed","Debbie Rowe","Jackson 5","Jackson
Five","Janet Jackson","King of Pop","La Toya","LaToya","moonwalk","Neverland","pop star","pop
superstar","Rock and Roll Hall of Fame","Safechuck","This Is It","Wade Robson")}
{(AND,"Bush","Homeland Security",(OR,"deputy director","director"))}
New York Giants v. San Francisco Giants
Rule-based Entity Extraction
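The two Michael Jackson rules above are nested Boolean expressions over context terms. As a rough sketch of how such a rule might be evaluated, the code below represents a rule as a nested tuple and checks it against article text. The evaluator, the case-insensitive substring matching, and the descriptive labels are all assumptions for illustration; the first rule is abbreviated to a subset of the terms on the slide.

```python
from typing import List, Tuple, Union

Rule = Union[str, Tuple]  # a term, or (OPERATOR, operand, operand, ...)


def matches(rule: Rule, text: str) -> bool:
    """Evaluate a nested AND/OR rule against text (case-insensitive substrings)."""
    if isinstance(rule, str):
        return rule.lower() in text.lower()
    op, *operands = rule
    results = (matches(operand, text) for operand in operands)
    if op == "OR":
        return any(results)
    if op == "AND":
        return all(results)
    raise ValueError(f"unsupported operator: {op}")


# Subset of the first rule's terms from the slide.
POP_STAR_RULE = ("OR", "album", "albums", "Billie Jean", "Jackson 5",
                 "King of Pop", "moonwalk", "Neverland")
# Second rule from the slide, unchanged.
OFFICIAL_RULE = ("AND", "Bush", "Homeland Security",
                 ("OR", "deputy director", "director"))

# Labels are illustrative, not actual Times Tags.
CANDIDATES = [
    ("Michael Jackson (pop star)", POP_STAR_RULE),
    ("Michael Jackson (Homeland Security official)", OFFICIAL_RULE),
]


def disambiguate(text: str) -> List[str]:
    """Return the labels of every candidate whose rule matches the text."""
    return [label for label, rule in CANDIDATES if matches(rule, text)]


print(disambiguate("Michael Jackson performed the moonwalk during Billie Jean."))
```

Substring matching is the simplest choice and will over-match (for example, "director" also matches "directors"); a real rule engine would likely match on tokens.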
(SENT,
(OR,"air traffic
control","airline","airbus","aircraft","aircrafts","airline","airlines","airliner","airplane","airplanes","airship","airships",
"boeing","dirigible","dirigibles","flight","helicopter","helicopters","hindenburg","jet","jetliner","landing
gear","mu-2b","plane","planes","pilot","zeppelin","zeppelins"),
(OR,"blast","blast of laser light","blew up","body parts","bodies","bomb","bomb threat","broke apart","broke into
pieces","careered","collided","collision","crash","crashed","crashing","dead","debris","disaster","disasters",
"destruction","destroy@","distress call","distress calls","engine failure","fell","flash-blindness","hit","injured","killed","laser
pointing","laser strike","laser strikes","lost control","lost radar","mangled","mechanical failure","out of
control","plummeted","plummeting","point lasers at","pointed a laser at","pointed lasers at","pointing lasers
at","safety record","sank","skidded","slammed into","smashed","struck","sunk","survivor","survivors","tipped
over","vanish","vanishing","vanished","veered","victim","victims","went down","wreckage","wreck"))
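The rule above pairs an aircraft term with an incident term. The sketch below assumes that `SENT` means both groups must match within the same sentence, and that the trailing `@` in `"destroy@"` marks a wildcard suffix; neither is confirmed by the slide. The term lists are short subsets of the originals, and the sentence splitter is deliberately naive.

```python
import re
from typing import Iterable

# Short subsets of the two OR groups on the slide.
AIRCRAFT_TERMS = ("aircraft", "airplane", "helicopter", "jet", "plane", "pilot")
INCIDENT_TERMS = ("collided", "crash", "crashed", "destroy@", "wreckage")


def term_pattern(term: str) -> "re.Pattern[str]":
    """Whole-word pattern; a trailing '@' is treated as a wildcard suffix (assumption)."""
    if term.endswith("@"):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def any_term(terms: Iterable[str], sentence: str) -> bool:
    """True if at least one term occurs in the sentence."""
    return any(term_pattern(term).search(sentence) for term in terms)


def sent_rule_matches(text: str) -> bool:
    """True if some single sentence contains both an aircraft and an incident term."""
    sentences = re.split(r"(?<=[.!?])\s+", text)  # naive sentence splitter
    return any(
        any_term(AIRCRAFT_TERMS, s) and any_term(INCIDENT_TERMS, s)
        for s in sentences
    )


print(sent_rule_matches("The plane crashed shortly after takeoff."))
print(sent_rule_matches("The pilot retired. Markets crashed on Monday."))
```

Requiring co-occurrence in one sentence is what keeps an article that mentions a pilot in one place and a market crash in another from being categorized as an aviation accident.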
Rule-based Categorization
Byline: Manohla Dargis → Movies
Kicker: Hungry City → Restaurants
Layout_desk: Obits → Deaths (Obituaries)
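These three mappings suggest that some tags come from an asset's metadata fields rather than its text. A minimal sketch of that lookup follows; the lowercase field names, the dictionary layout, and the function `suggest_from_metadata` are hypothetical, and only the three rules shown on the slide are included.

```python
from typing import Dict, List

# (field, value) -> suggested subject, taken from the three rules on the slide.
METADATA_RULES = {
    ("byline", "Manohla Dargis"): "Movies",
    ("kicker", "Hungry City"): "Restaurants",
    ("layout_desk", "Obits"): "Deaths (Obituaries)",
}


def suggest_from_metadata(article: Dict[str, str]) -> List[str]:
    """Return the subjects suggested by an article's metadata fields."""
    suggestions = []
    for (field, value), subject in METADATA_RULES.items():
        if article.get(field) == value:
            suggestions.append(subject)
    return suggestions


print(suggest_from_metadata({"byline": "Manohla Dargis", "kicker": "Critic's Pick"}))
```

In keeping with the "software suggests, humans verify" workflow, the function only proposes subjects; it does not assign them.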
What’s special about them?
● "Aboutness"/Semantic Meaning
○ Tag only the focus
○ Strict guidelines
● Quality control measures ensure accuracy
○ Software suggests > Newsroom selects > Taxonomists check
○ Daily report summarizes
○ 7 days a week, 365 days a year
Software Suggests → Humans Verify
Tagging Guides
Times Tags and CMS Scoop (Oak & Classic UI), Blackbeard
Harvest Terms
Daily Report
What do we do with all these tags?
Collections/Topics
https://www.nytimes.com/news-event/ferguson-michael-brown
Collections/Topics
news_desk:"Climate" OR subject:("Acid Rain" "Air Pollution" "Algae" "Alternative and
Renewable Energy" "Animal Migration" "Biodiversity" "Biofuels" "Birdwatching" "Carbon
Capture and Sequestration" "Carbon Dioxide" "Coal" "Coast Erosion" "Compost"
"Conservation of Resources" "Coral" "Drilling and Boring" "Eco-Tourism" "Electric and
Hybrid Vehicles" "Electric Light and Power" "Endangered and Extinct Species" "Energy
and Power" "Energy Efficiency" "Environment" "Federal Lands" "Fish and Other Marine
Life" "Fuel Efficiency" "Fuel Emissions (Transportation)" "Geothermal Power" "Global
Warming" "Green New Deal" "Greenhouse Gas Emissions" "Greenhouse Effect" "Hazardous
and Toxic Substances" "Hydroelectric Power" "Hurricanes and Tropical Storms"
"Keystone Pipeline" "Land Use Policies" "Leadership in Energy and Environmental
Design (LEED)" "Light-Emitting Diodes" "Local Food" "Nuclear Energy" "Oil (Petroleum)
and Gasoline" "Plastic Bags" "Pipelines" "Recycling of Waste Materials" "Reefs"
"Solar Energy" "Sustainable Living" "Tidal and Wave Power" "United Nations Framework
Convention on Climate Change" "Water Pollution" "Wetlands" "Wilderness Areas"
"Wildfires" "Wildlife Sanctuaries and Nature Reserves" "Wind Power") OR
subject.contains:("Hurricane") OR organizations.contains:("Energy Department"
"Environmental Protection Agency" "Koch Industries") OR persons:("Perry, Rick"
"Pruitt, Scott" "Wheeler, Andrew R")
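The query above defines the Climate collection as a union of clauses over tag fields. The sketch below checks whether an article's tags satisfy such a query. It assumes that a plain field (such as `subject:`) means an exact match and that `.contains` means a substring match; the query engine's real semantics are not shown on the slide. The term lists are abbreviated, and the sample article tag is made up for illustration.

```python
from typing import Dict, List, Tuple

Clause = Tuple[str, str, List[str]]  # (field, match mode, terms)

# Abbreviated form of the slide's query; clauses are OR-ed together.
CLIMATE_QUERY: List[Clause] = [
    ("news_desk", "exact", ["Climate"]),
    ("subject", "exact", ["Global Warming", "Greenhouse Gas Emissions", "Wind Power"]),
    ("subject", "contains", ["Hurricane"]),
    ("organizations", "contains", ["Energy Department", "Environmental Protection Agency"]),
    ("persons", "exact", ["Perry, Rick", "Pruitt, Scott", "Wheeler, Andrew R"]),
]


def clause_matches(values: List[str], mode: str, terms: List[str]) -> bool:
    """Exact mode compares whole tags; contains mode looks for substrings."""
    if mode == "exact":
        return any(value in terms for value in values)
    return any(term in value for value in values for term in terms)


def in_collection(article: Dict[str, List[str]], query: List[Clause]) -> bool:
    """True if any clause of the query matches the article's tags."""
    return any(
        clause_matches(article.get(field, []), mode, terms)
        for field, mode, terms in query
    )


# Made-up sample article; the subject string is illustrative only.
sample = {"news_desk": ["Metro"], "subject": ["Hurricane Sandy (2012)"]}
print(in_collection(sample, CLIMATE_QUERY))
```

Because the collection is defined by a query over controlled tags rather than a hand-curated list, newly published assets join it as soon as they are tagged.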
Your Feed
News Services/Syndicate
TimesMachine
Archive Discovery
Advertising
● Contextual Targeting
● Brand Security
Search
● Site Search Relevancy: 25% improvement in user click-through after "boosting"
on Times Tags
● Collection/Topics pages powered by search are promoted to the first result in
site search
● Bruce Lambert Hot Dogs
TAFI (Twitter and Facebook Interface)
Slackbots/Newsroom Alerts
Newsroom Desk Dashboards
Package Mapper
What’s Next?
● Keep tagging
● Keep training
● Keep evangelizing
● Develop better tools for surfacing and browsing the taxonomy
○ As we get more and more product requests around the taxonomy, we want to make it easier for
stakeholders to see what our taxonomy offers
● New classification software
○ Currently evaluating a replacement for our existing software
Email: jennifer.parrucci@nytimes.com
Questions?
