Harvesting crowdsourcing biodiversity data from Facebook groups
Jason Guan-Shuo Mai (1), Cheng-Hsin Hsu (1), Dong-Po Deng (2), De-En Lin (3), Hsu-Hong Lin (3), Kwang-Tsao Shao (1)
1 Taiwan Biodiversity Information Facility (TaiBIF), Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
2 Institute of Information Science, Academia Sinica, Taipei, Taiwan
3 Taiwan Endemic Species Research Institute, Council of Agriculture, Nantou, Taiwan

The emergence of Web 2.0 enables people to contribute their biodiversity observations on the Web. Such crowdsourced biodiversity data are becoming increasingly
valuable to scientific studies because they can cover broader spatial and temporal scales. However, data provided as plain text hinder retrieval
and analysis. In this study, we propose a framework that automatically structures the loosely formatted text, so that volunteers can keep providing data in their own
familiar ways while interested citizens, biodiversity researchers and managers benefit from the semantically structured information. We take two Facebook
biodiversity interest groups, Reptile-Road-Mortality and Enjoy-Moths, as examples.
0. Crowdsourcing: participants voluntarily provide unstructured data in Facebook interest groups (Reptile-Road-Mortality, Enjoy-Moths).
1. Crawling data from Facebook via its API.
2. Using natural language processing techniques, with the Taiwan Geographic Name and Taiwan Catalogue of Life (TaiCOL) databases as knowledge bases, to extract species vernacular names and place names from a thread.
3. Introducing the content management system Drupal for easier data management (including error correction) and display.
4. Publishing linked open data via a D2R server for open access and usage.
5. Developing browser plug-ins to give users digested feedback of the structured data.
6. Improving source data quality without changing users' own familiar ways.

[Figure: What a typical discussion thread looks like. A thread consists of a post message, a post picture, and a series of comment messages.]

Our algorithm picks the most related species name appearing in a thread based on social networking characteristics.

[Figure: Semantic annotation tool. It disambiguates toponymic homonyms. One click on a message recognizes species vernacular names and related information.]

[Figure: Main database.]

Our dataset is linked to other datasets on the linked open data cloud, such as DBpedia, GeoNames and LODE (Linked Open Data of Ecology), so it can benefit from the large amount of meta-information they provide.

[Figure: Algorithms used to recognize abbreviations of vernacular names and place names. Flowchart, shown for the example name 細紋南蛇.]
For each vernacular name in TaiCOL do:
  - Does the full name (細紋南蛇) occur in the message? Yes: full-matched name.
  - If not, does Prefix3 (細紋南) occur in the message? If not, does Prefix2 (細紋) occur in the message?
  - When a prefix occurs in the message, does Postfix2 (南蛇) or Postfix1 (蛇) occur in the thread? Yes: matched abbreviation.
  - Otherwise: the name doesn't exist in the message.
  - Calculate the confidence score of this name.
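The abbreviation flowchart can be illustrated with a small Python sketch. This is only one possible reading of the poster, not the authors' implementation: the function name `match_vernacular_name`, the rule that any prefix found in the message may be combined with any postfix found in the thread, and the return labels are all assumptions made for illustration. The confidence score is left out because the poster does not say how it is calculated.

```python
from typing import Optional

FULL = "full-matched name"
ABBREVIATION = "matched abbreviation"


def match_vernacular_name(name: str, message: str, thread: str) -> Optional[str]:
    """Classify how a vernacular name appears in a message.

    `message` is the text being annotated; `thread` is the text of the
    whole discussion thread (post plus comments). Returns None when the
    name is judged not to exist in the message.
    """
    # Case 1: the complete name occurs in the message.
    if name in message:
        return FULL

    # Case 2 (assumed pairing): a leading fragment of the name occurs in
    # the message, and a trailing fragment occurs somewhere in the thread.
    # Fragment lengths 3/2 and 2/1 follow the poster's Prefix3/Prefix2 and
    # Postfix2/Postfix1 labels.
    prefixes = [name[:k] for k in (3, 2) if len(name) > k]
    postfixes = [name[-k:] for k in (2, 1) if len(name) > k]

    prefix_in_message = any(p in message for p in prefixes)
    postfix_in_thread = any(s in thread for s in postfixes)

    if prefix_in_message and postfix_in_thread:
        return ABBREVIATION
    return None


if __name__ == "__main__":
    post = "路上發現細紋"
    whole_thread = post + "\n看起來是南蛇"
    print(match_vernacular_name("細紋南蛇", post, whole_thread))
```

In a full pipeline this check would run once per vernacular name in the knowledge base, and the names that match would then be ranked by whatever confidence score the framework defines.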

2012 Biodiversity Asia Poster
