The Linked Data for Information Extraction challenge aims at extracting structured data from web pages. It is based on a subset of the Web Data Commons Microformats dataset.
For the challenge, the original annotated pages are provided, along with the triples extracted from them. Based on this training data, participants have to design an information extraction system that extracts the same kind of information from other web pages. This year's challenge focuses on hCard data, i.e., information about persons. One use case for such a system is the assembly of a large database of person data.
The systems are evaluated on a test set of web pages from which all original annotations have been removed. The participants have to extract triples from those pages and submit their resulting triple files. The submitted files are evaluated against the gold standard of the original triples, and the solutions are ranked by F-measure.
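To make the ranking criterion concrete, the following is a minimal sketch of how an F-measure over triples could be computed. It is not the official evaluation script: it assumes that triples are compared by exact match, that precision and recall are weighted equally (the balanced F1 variant), and that each triple is represented as a (subject, predicate, object) tuple of strings. The function name `f_measure` and the example triples are hypothetical.

```python
def f_measure(extracted, gold):
    """Balanced F-measure (F1) of extracted triples against a gold standard.

    Assumption: a triple counts as correct only if it matches a gold
    triple exactly; the challenge's actual matching rules may differ.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)

    # Precision: share of submitted triples that are in the gold standard.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard triples that were submitted.
    recall = true_positives / len(gold) if gold else 0.0

    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data, for illustration only.
gold = {
    ("_:p1", "vcard:fn", "Jane Doe"),
    ("_:p1", "vcard:org", "Example Corp"),
    ("_:p1", "vcard:tel", "+1-555-0100"),
    ("_:p1", "vcard:email", "jane@example.org"),
}
extracted = {
    ("_:p1", "vcard:fn", "Jane Doe"),
    ("_:p1", "vcard:org", "Example Corp"),
    ("_:p1", "vcard:title", "Engineer"),  # not in the gold standard
}

score = f_measure(extracted, gold)
print(round(score, 4))
```

In this example two of the three submitted triples are correct (precision 2/3) and two of the four gold triples are found (recall 1/2), so the score is 4/7. A system that submits many speculative triples raises recall at the cost of precision, which is why the combined measure is used for ranking.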