Orion: An Integrated Multimedia Content
Moderation System for Web Services
Yusuke Fujisaka
Akihabara Lab., CyberAgent, Inc.
fujisaka_yusuke@cyberagent.co.jp
Our business
Media Internet AD
Game Startup
Our media services
AbemaTV (AbemaTV, Inc.)
● Free-to-view internet TV with TV commercials
● 30M+ downloads
Ameba
● “Ameblo”: Japan’s largest blog service
● 20,000+ official blogs
Tapple (MatchingAgent, Inc.)
● Japan’s largest dating app
● 3.5M+ users, 100M+ matches
AWA (AWA, Inc.)
● Music subscription service
● 16M+ downloads, 45M+ songs
Agenda
1. Motivation
2. System overview
3. Orion’s effect
4. Conclusion
Motivation
● Social Networking Services (SNS) rely on User Generated Content (UGC)
● Some UGC are viewed as spam
● Platforms need to eliminate spam from SNS
Spam characteristics
● Only a small fraction of content and users are involved with spam
[Figure: spam posts shown as a small share of all posts; labeled ratios: ~1/1000 and ~1/200]
Spam characteristics
● Types of spam include:
○ Adult content
○ Grotesque content
○ Duplicate posts generated by bots
○ Abusive posts
○ Criminal posts
○ etc.
● Spam affects users not only psychologically, but also physically
● Spam may reduce users' trust in the SNS
● Spam trends change over time
Filtering vs. Operator
Case 1: Deploy filter systems to moderate UGC
Pros:
● Cost efficient
● Can handle huge amounts of data
Cons:
● Models must be updated to follow spam trends
● False positives and false negatives occur
○ Spam UGC remains on the service (false negative)
○ Obviously safe UGC is mistakenly deleted (false positive)
○ → Service satisfaction may decrease
Filtering vs. Operator
Case 2: Human operators moderate spam
Pros:
● Humans naturally keep up with trends
○ Operators judge UGC from the same viewpoint as users
● Fewer incorrect labels
○ Provided operators can moderate content effectively
Cons:
● Not cost efficient
● Limited by available staff
Filtering with Operator
● We need to handle a large amount of data cost-efficiently while avoiding incorrect labelling
● Process in two steps
○ Step 1: Deploy automatic filters to extract content that includes suspicious words or behavior
○ Step 2: Have operators manually review the extracted content to identify actual spam and remove it
[Figure: two-step funnel. Step 1 separates suspicious content from safe data (not caught by the filters); Step 2 identifies the spam within the suspicious content.]
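The two steps above can be sketched as a small routing function. This is only an illustration under stated assumptions: the `Post` type, the `SUSPICIOUS_WORDS` list, and the `operator_says_spam` callback are hypothetical names, not part of Orion. What it aims to show is that operators only ever see the posts that Step 1 flags.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical word list; real filters would be defined per service.
SUSPICIOUS_WORDS = {"free money", "click here"}


@dataclass
class Post:
    post_id: int
    text: str


def is_suspicious(post: Post) -> bool:
    """Step 1: cheap automatic filter that flags posts for review."""
    lowered = post.text.lower()
    return any(word in lowered for word in SUSPICIOUS_WORDS)


def moderate(
    posts: Iterable[Post],
    operator_says_spam: Callable[[Post], bool],
) -> Dict[str, List[int]]:
    """Route posts through the filter, then ask an operator only about flagged ones."""
    result: Dict[str, List[int]] = {"safe": [], "cleared": [], "spam": []}
    for post in posts:
        if not is_suspicious(post):
            result["safe"].append(post.post_id)     # never reaches an operator
        elif operator_says_spam(post):
            result["spam"].append(post.post_id)     # Step 2: confirmed, to be removed
        else:
            result["cleared"].append(post.post_id)  # flagged, but judged harmless
    return result
```

The "cleared" bucket is where the operator corrects the filter's false positives, which is the main reason for keeping a human in the second step.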
System overview
● Orion: integrated content moderation system
○ Combination of “automatic filtering” and “manual moderation”
[Figure: architecture diagram divided into automatic modules and manual modules. Labeled components: Service, Service log, Streaming, Filter, Metadata DB, Content DB, Queue, Retrieval Engine, Moderation API, Admin API, Web Server, Operator, Feedback.]
Streaming module
● Collects user posts from services
● Filters suspicious content as defined by each service
○ 300+ filters mark content for moderation
○ Maximum coverage and low latency are required
○ Determines whether an operator check is required
[Figure: streaming pipeline. Labeled stages: Gather UGCs from service, Filtering / moderation mark, User level check, Correction check, Save to DB. Filter types: Word filter, Repeat post filter, ML-based filter, Image filter.]
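As a rough illustration of how several filters might mark one post, here is a minimal sketch of a filter chain with a word filter and a repeat-post filter, two of the filter types named on the slide. The function names, the `(user_id, text)` signature, and the reason strings are all assumptions made for this example; the slides do not describe how Orion's filters are implemented.

```python
from collections import Counter
from typing import Callable, List, Optional

# A filter takes (user_id, text) and returns a reason string, or None if it does not fire.
Filter = Callable[[str, str], Optional[str]]


def make_word_filter(banned: List[str]) -> Filter:
    """Build a filter that fires when the text contains any banned word."""
    def word_filter(user_id: str, text: str) -> Optional[str]:
        lowered = text.lower()
        for word in banned:
            if word in lowered:
                return f"word:{word}"
        return None
    return word_filter


def make_repeat_filter(max_repeats: int) -> Filter:
    """Build a filter that fires once a user posts the same text too many times."""
    seen: Counter = Counter()  # counts identical (user, text) pairs seen so far

    def repeat_filter(user_id: str, text: str) -> Optional[str]:
        seen[(user_id, text)] += 1
        if seen[(user_id, text)] > max_repeats:
            return "repeat"
        return None
    return repeat_filter


def run_filters(filters: List[Filter], user_id: str, text: str) -> List[str]:
    """Return the reasons from every filter that fired; an empty list means no mark."""
    reasons = []
    for f in filters:
        reason = f(user_id, text)
        if reason is not None:
            reasons.append(reason)
    return reasons
```

Running every filter rather than stopping at the first hit keeps all reasons available to the operator, at the cost of some extra work per post.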
User level
● Posts from “well-behaved” users are assumed not to need content checking
● What is a “well-behaved” user?
○ One who posts frequently without posting spam
● User level
○ Posts from “problem users” must be checked regardless of the filter result
○ Posts from “safe users” need not be checked as often
[Figure: chart of user levels (Problem user, General user, New user, Safe user) plotted by total post # and deleted post #]
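The user-level idea can be sketched as a classification over the two quantities on the chart, total posts and deleted posts. Every threshold below is a hypothetical placeholder, and the rule that safe users need two filter hits is one assumed way to express "checked less often"; the slides give no concrete values or rules.

```python
from enum import Enum


class UserLevel(Enum):
    NEW = "new"
    GENERAL = "general"
    SAFE = "safe"
    PROBLEM = "problem"


# All thresholds are hypothetical placeholders, not values from the slides.
MIN_POSTS_FOR_HISTORY = 20   # below this, there is too little history to judge
SAFE_MIN_POSTS = 200         # "posts frequently"
PROBLEM_DELETE_RATIO = 0.10  # share of deleted posts that marks a problem user
SAFE_MIN_FILTER_HITS = 2     # safe users are checked only on stronger signals


def classify_user(total_posts: int, deleted_posts: int) -> UserLevel:
    """Assign a level from a user's posting history."""
    if total_posts < MIN_POSTS_FOR_HISTORY:
        return UserLevel.NEW
    if deleted_posts / total_posts >= PROBLEM_DELETE_RATIO:
        return UserLevel.PROBLEM
    if total_posts >= SAFE_MIN_POSTS and deleted_posts == 0:
        return UserLevel.SAFE
    return UserLevel.GENERAL


def needs_operator_check(level: UserLevel, filter_hits: int) -> bool:
    """Decide whether a post goes to an operator, given how many filters fired."""
    if level is UserLevel.PROBLEM:
        return True  # checked regardless of filtering
    if level is UserLevel.SAFE:
        return filter_hits >= SAFE_MIN_FILTER_HITS
    return filter_hits >= 1
```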
Moderation service
● Operators moderate content in a window dedicated to each service
● Dummy posts are mixed in to check moderation quality
Analyze / Reporting
● We collect information from a variety of sources
○ Spam category, service, operator...
○ Unique IDs sent from each service are used to identify the information
● Reporting helps ensure moderation quality
○ If an operator fails to identify dummy spam data, the miss is shown in the report
○ Reports are displayed on a Tableau server
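The dummy-spam quality check can be illustrated with a small aggregation: for each operator, compute the share of planted spam posts they actually marked as spam. The `Judgement` record and its field names are assumptions made for this sketch, not Orion's actual schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Judgement(NamedTuple):
    operator_id: str
    is_dummy_spam: bool  # True if the post was planted as known spam
    marked_spam: bool    # the operator's decision


def dummy_detection_rate(judgements: Iterable[Judgement]) -> Dict[str, float]:
    """Per operator, the fraction of planted spam posts they marked as spam."""
    seen: Dict[str, int] = defaultdict(int)
    caught: Dict[str, int] = defaultdict(int)
    for j in judgements:
        if not j.is_dummy_spam:
            continue  # only planted posts have a known correct answer
        seen[j.operator_id] += 1
        if j.marked_spam:
            caught[j.operator_id] += 1
    return {op: caught[op] / seen[op] for op in seen}
```

An operator with a low rate would be the kind of case the report is meant to surface.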
Effect > Spam removal efficiency
● 35+ services in use
● Orion filters and moderates millions of pages of content
[Figure: chart over time with series All post, Suspicious post, and Deleted post; annotations: "New service", "User level applies", "New service"]
Effect > Spam removal efficiency
● Ratio comparing 2014-2015 vs. 2017-2018

(%)        Check/All             Delete/Check          Delete/All
           Min    Max    Ave     Min    Max    Ave     Min    Max    Ave
'14-'15    1.17   26.44  7.62    0.10   2.86   0.43    0.004  0.756  0.034
'17-'18    3.09   6.32   4.66    1.51   3.64   2.17    0.063  0.165  0.101
Change                   0.61x                 5.04x                 2.97x
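The "Change" factors in the table appear to be the 2017-2018 average divided by the 2014-2015 average. That reading is an inference from the numbers, and it can be checked with a few lines of arithmetic on the rounded averages:

```python
# Average percentages copied from the table: (2014-2015, 2017-2018).
averages = {
    "Check/All": (7.62, 4.66),
    "Delete/Check": (0.43, 2.17),
    "Delete/All": (0.034, 0.101),
}

# Change factor = later average / earlier average.
changes = {name: after / before for name, (before, after) in averages.items()}

for name, factor in changes.items():
    print(f"{name}: {factor:.2f}x")
```

Recomputing from the rounded averages gives about 0.61x, 5.05x, and 2.97x. The small gap from the slide's 5.04x is presumably because the slide used unrounded data. Read this way, operators check a smaller share of posts (0.61x) while a roughly five times larger share of checked posts is deleted.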
Effect > Moderation effect
● Orion has been effective since deployment
○ Criminal activity on our company's services has greatly declined
○ No criminal cases were observed in late 2017
[Figure: number of criminal cases (vertical axis) over time (horizontal axis), with the point at which Orion became operational marked]
Conclusion
● Content moderation should rely solely on neither automatic classification nor manual moderation
● We introduced Orion, which integrates automatic filtering and manual moderation
○ UGC is screened by various filters, and suspicious UGC is sent for manual moderation
○ Operators are monitored to ensure high moderation quality
● After Orion was deployed, the amount of UGC requiring manual moderation decreased, and the number of criminal posts declined sharply
Thank you.
