"Methodology for Assessment of Linked Data Quality: A Framework" at Workshop on Linked Data Quality
Paper: https://dl.dropboxusercontent.com/u/2265375/LDQ/ldq2014_submission_3.pdf
6. Phase I: Requirement Analysis
Step I: Use Case Analysis
- Description that best illustrates the intended usage of the dataset(s)
Two types of users
➢Consumers
➢Potential consumers
7. Phase II: Quality Assessment
Step II: Identification of quality issues
➢Based on the use case
➢Checklist-based approach
➢Each checklist item is answered Yes (1) or No (0)
➢List of quality dimensions
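A minimal sketch of how the checklist step could be recorded, assuming a "yes" means the quality issue matters for the use case. The dimension names and the helper names `score_checklist` and `selected_dimensions` are illustrative assumptions, not taken from the paper.

```python
def score_checklist(answers):
    """Map each yes/no checklist answer to 1 or 0."""
    return {dimension: 1 if answer else 0 for dimension, answer in answers.items()}


def selected_dimensions(scores):
    """Return the dimensions that scored 1, in checklist order."""
    return [dimension for dimension, score in scores.items() if score == 1]


# Hypothetical answers for one use case (dimension names are examples only).
answers = {
    "completeness": True,
    "timeliness": False,
    "interlinking": True,
}

scores = score_checklist(answers)
dimensions = selected_dimensions(scores)
print(scores)
print(dimensions)
```

The resulting list of dimensions is what the later assessment steps would work from.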
10. Data Quality Score
➢Ratio
○ DQscore = 1 - (V / T)
■ V - total no. of instances that violate a DQ rule
■ T - total no. of relevant instances
■ computed for each property
○ DQweightedscore = (DQscore * wi) / W
■ wi - weight of the property
■ W - sum of all weighting factors of the properties
■ used for the quality over all properties
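The two ratios can be sketched in a few lines. This is an illustration under stated assumptions: W is read as the plain sum of the property weights, and the overall figure is obtained by summing the weighted scores of all properties (the slide gives only the per-property term). The property names, counts, and weights are made up.

```python
from dataclasses import dataclass


@dataclass
class PropertyCheck:
    name: str
    violations: int  # V: instances that violate a DQ rule
    relevant: int    # T: relevant instances
    weight: float    # wi: weight of the property


def dq_score(check):
    """DQscore = 1 - (V / T) for a single property."""
    if check.relevant <= 0:
        raise ValueError("T must be positive")
    return 1 - check.violations / check.relevant


def weighted_score(check, total_weight):
    """DQweightedscore = (DQscore * wi) / W for a single property."""
    return dq_score(check) * check.weight / total_weight


def overall_score(checks):
    """Assumption: overall quality is the sum of the weighted scores."""
    total_weight = sum(c.weight for c in checks)  # W, assumed to be the sum of wi
    return sum(weighted_score(c, total_weight) for c in checks)


# Hypothetical properties and counts, for illustration only.
checks = [
    PropertyCheck("label", violations=10, relevant=100, weight=2.0),
    PropertyCheck("birthDate", violations=50, relevant=200, weight=1.0),
]

for c in checks:
    print(c.name, round(dq_score(c), 3))
print("overall", round(overall_score(checks), 3))
```

With these numbers the per-property scores are 0.9 and 0.75, and the weighted sum is 0.85, so the more heavily weighted property pulls the overall figure toward its own score.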
11. Phase III: Quality Improvement
Step V: Root Cause Analysis
➢Analyze cause of each quality issue
➢Helps user interpret the results
➢Detect whether the problem occurs in the original dataset
➢In case the original dataset is unavailable, analyze the available dataset to determine the cause
13. Conclusion and Future Work
➢Assessment methodology - 3 phases, 6 steps
➢Focus on use case
➢Improvement phase
Future Work
➢Application to an actual use case
➢Build a tool