Competitive advantage from Data Mining: some lessons learnt ...
1. Competitive advantage from Data Mining: some lessons learnt in the Information Systems field. Mykola Pechenizkiy, Seppo Puuronen, Department of Computer Science, University of Jyväskylä, Finland. Alexey Tsymbal, Department of Computer Science, Trinity College Dublin, Ireland. PMKD’05, Copenhagen, Denmark, August 22-26, 2005
3. What is Data Mining. Data mining, or knowledge discovery, is the process of finding previously unknown and potentially interesting patterns and relations in large databases (Fayyad, KDD’96). Data mining is the emerging science and industry of applying modern statistical and computational technologies to the problem of finding useful patterns hidden within large databases (John 1997). Intersection of many fields: statistics, AI, machine learning, databases, neural networks, pattern recognition, econometrics, etc.
17. Knowledge discovery as a process. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R., Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, 1997.
27. The ISs-based paradigm for DM. Ives B., Hamilton S., Davis G. (1980). “A Framework for Research in Computer-based MIS”, Management Science, 26 (9), 910-934. “Information systems are powerful instruments for organizational problem solving through formal information processing”, Lyytinen, K., 1987, “Different perspectives on ISs: problems and solutions”, ACM Computing Surveys, 19 (1), 5-46.
28. DM Artifact Development. Adapted from: Nunamaker, W., Chen, M., and Purdin, T. 1990-91, Systems development in information systems research, Journal of Management Information Systems, 7 (3), 89-106. A multimethodological approach to the construction of an artefact for DM. Figure labels: DM Artifact Development; Experimentation; Theory Building; Observation
30. The Action Research and Design Science Approach to Artifact Creation. Figure labels: Design Knowledge; Awareness of business problem; Action planning; Action taking; Conclusion; Business Knowledge; Artifact Development; Artifact Evaluation; Contextual Knowledge
31. DM Artifact Use: Success Model 1 of 3. Adapted from D&M IS Success Models. Figure labels: System Quality; Information Quality; Use; User Satisfaction; Individual Impact; Organizational Impact; Service Quality
In the ACM classification system for the computing field, DM is a subject of database applications (H.2.8), database management (H.2), and the information systems field (H.).
An SPSS whitepaper [4] states that “Unless there’s a method, there’s madness”. It is accepted that no one should expect useful results to appear just by pushing a button. CRISP-DM, an industry standard for DM projects, is a good initiative and a starting point towards the development of a DM meta-artifact (a methodology for producing DM artifacts). However, in our opinion it is just one guideline, and a very general one, which every DM developer follows with or without success.
In fact, the study of development and use processes was recognized as important in the IS field many years ago, and it has therefore been incorporated into different IS frameworks.
Nevertheless, the DM community has so far devoted too little research to the study of a DM system as an artifact intended to support certain DM tasks in a certain context (Figure 1). In the IS discipline two research paradigms – the behavioral-science paradigm and the design-science paradigm – have
The first efforts in that direction are those presented in the DM Review magazine [9, 21], referred to below. We believe that such efforts should be encouraged in DM research and followed by research-based reports.
Lin, in Wu et al. [43], notes that in fact no major impact of DM on the business world has been reported. However, even the reporting of existing success stories is important. Giraud-Carrier [18] reported 136 success stories of DM, covering 9 business areas and referring to 30 DM tools or DM vendors. Unfortunately, no deep analysis was provided that would summarize or uncover the main success factors, so this research should be continued.
In order to distinguish between the knowledge extracted from data and the higher-level knowledge (from the KDS perspective) required for managing the selection, combination and application of techniques, we will refer to the latter as meta-knowledge.