SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Association Analysis
Association Analysis-Definition Association Analysis is the task of uncovering relationships among data. Association rules: It  is a model that identifies how the data items are associated with each other. Ex:        It is used in retail sales to identify that are frequently purchased together.
What is a rule?  ,[object Object],If (condition) then (result)  Example: IF a customer purchases coke, then the customer also purchases orange juice  The first part is the rule body and the second part is the rule head
Strength of a rule  How certain is the rule?  Confidence measures the certainty of a rule  It is the percentage of transactions containing all items stated in the condition that also contain the items in result  Confidence (A ,B) = P(B | A)  Example: The rule "If Coke then Oranje Juice" has a confidence of 100%
Strength of a rule  How often is the rule occurred?  Support measures the usefulness of a rule  It is the percentage of transactions that contains all items in the rule  Support (A , B) = P(A ,B)  Example: For the rule If Coke then Oranj juice  In all 5 transactions, 2 contains both coke and OJ  The support of the rule is 40% 
Association Rule Mining Two-step process  Find all frequent k-item sets, k=1, 2, 3, …  All items in a rule is referred as an itemset Rules that contains k item forms a k-itemset The occurrence frequency of an k-itemset is the number of transactions that contain all k items in the itemset An itemset satisfies a minimum support (or minimum occurrence frequency) is called a frequent itemset
Association Rule Mining 2.Generate strong association rules from the frequent k-itemsets Rules satisfy both a minimum support threshold and a minimum confidence threshold are called strong rules
Apriori Algorithm: Find all frequent k-item sets Apriori principle: If an itemset is frequent, then all of its subsets must also be frequent
Illustrating Apriori Principle
Apriori Algorithm Method:  Let k=1 Generate frequent itemsets of length 1 Repeat until no new frequent itemsets are identified Generate length (k+1) candidate itemsets from length k frequent itemsets
Contd… Prune candidate itemsets containing subsets of length k that are infrequent  Count the support of each candidate by scanning the DB Eliminate candidates that are infrequent, leaving only those that are frequent
Generate strong association rules from the frequent k-itemsets For each frequent k-itemset, generate all non-empty subsets  Fore every nonempty subset, generate the rule and the associated confidence  Output the rule if the minimum confidence threshold is satisfied
Multilevel association rules Difficult to find strong associations at very low or primitive levels of data    Few people may buy "IBM desktop computer" and "Sony b/w printer" together  Many people may purchase "computer" and "printer" together
Concept hierarchy defines a sequence of mappings from a set of low level concepts to higher level EX:                                IBM                                           Microsoft                                           Hp                                              ………                                          computer                                      software                                       printer                                    accessory 
Steps to be followed Top-down, progressive deepening approach  First mine high-level frequent items  Then mine their lower level frequent items and so on  At each level, Apriori algorithm is used  Use uniform minimum support for all levels, or  Use reduced minimum support at lower levels
Sequential Association Rule  Concerns sequences of events  New homeowners purchase shower curtains before purchasing furniture  When a customer goes into a bank branch and ask for an account reconciliation, there is a good chance that he or she will close all his or her accounts
Sequential Association Rule  Transaction must have two additional features:  a time stamp or sequencing information to determine when transactions occurred relative to each other  identifying information, such as account number or id number
Some important parameters  Duration  duration may be the entire available sequence in the database, or a user selected subsequence, such as year 1999  Event folding window  a set of events occurring within a specified period of time, such as within the same day, can be viewed as occurring together.
Some important parameters  Interval  between events in the discovered pattern  0 interval means to find strictly consecutive sequences  min_int <= interval <= max_int means to find patterns that are separated by at least min_int at most max_int interval = c, to find patterns carrying an exact interval
Some Practical Issues  Time window of transactions  Level of aggregation  Level of support and confidence
Time window of transactions  Select a time window for the transaction covers at least 2 product cycles  e.g. customer purchases a product with a frequency of six month or less, select a 12-month window of customer transaction data  For frequently purchased products, a short time window is sufficient  For low frequency items, a longer time window is necessary.
Level of aggregation  If product codes in the data are too specific (such as based on product details such as size and flavour), few associations will be discovered  Group products into categories according to the product hierarchy or create new level manually
Level of support and confidence  Start with a high support and gradually reduce it  Set confidence to around 50% to reduce the number of permutation
Conclusion Association analysis rules such as multidimensional and sequential association rules are studied. Apriori algorithm is described in detail Various practical issues in association rules are analyzed.
Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net

Weitere ähnliche Inhalte

Ähnlich wie Association Analysis-Uncover Relationships Between Data

big data seminar.pptx
big data seminar.pptxbig data seminar.pptx
big data seminar.pptxAmenahAbbood
 
Top Down Approach to find Maximal Frequent Item Sets using Subset Creation
Top Down Approach to find Maximal Frequent Item Sets using Subset CreationTop Down Approach to find Maximal Frequent Item Sets using Subset Creation
Top Down Approach to find Maximal Frequent Item Sets using Subset Creationcscpconf
 
Software requirementspecification
Software requirementspecificationSoftware requirementspecification
Software requirementspecificationoshin-japanese
 
20IT501_DWDM_PPT_Unit_III.ppt
20IT501_DWDM_PPT_Unit_III.ppt20IT501_DWDM_PPT_Unit_III.ppt
20IT501_DWDM_PPT_Unit_III.pptPalaniKumarR2
 
viva_dd.pptx
viva_dd.pptxviva_dd.pptx
viva_dd.pptxdivlee1
 
20IT501_DWDM_U3.ppt
20IT501_DWDM_U3.ppt20IT501_DWDM_U3.ppt
20IT501_DWDM_U3.pptSamPrem3
 
Businesses involved in mergers and acquisitions must exercise due di.docx
Businesses involved in mergers and acquisitions must exercise due di.docxBusinesses involved in mergers and acquisitions must exercise due di.docx
Businesses involved in mergers and acquisitions must exercise due di.docxdewhirstichabod
 
Association Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataAssociation Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataIRJET Journal
 
A wrapper for QuantLib and reference data
A wrapper for QuantLib and reference dataA wrapper for QuantLib and reference data
A wrapper for QuantLib and reference dataJun Hong
 
Profitable Itemset Mining using Weights
Profitable Itemset Mining using WeightsProfitable Itemset Mining using Weights
Profitable Itemset Mining using WeightsIRJET Journal
 
Customer Decision Support System
Customer Decision Support SystemCustomer Decision Support System
Customer Decision Support SystemIRJET Journal
 
Refining The System Definition
Refining The System DefinitionRefining The System Definition
Refining The System DefinitionSandeep Ganji
 
Monitoring Distributed Systems
Monitoring Distributed SystemsMonitoring Distributed Systems
Monitoring Distributed SystemsAleksandr Tavgen
 
Predicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithmsPredicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithmsArmando Vieira
 
 risk-based approach of managing information systems is a holistic.docx
 risk-based approach of managing information systems is a holistic.docx risk-based approach of managing information systems is a holistic.docx
 risk-based approach of managing information systems is a holistic.docxodiliagilby
 
Lecture7 use case modeling
Lecture7 use case modelingLecture7 use case modeling
Lecture7 use case modelingShahid Riaz
 
Introduction To Multilevel Association Rule And Its Methods
Introduction To Multilevel Association Rule And Its MethodsIntroduction To Multilevel Association Rule And Its Methods
Introduction To Multilevel Association Rule And Its MethodsIJSRD
 
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONijwscjournal
 

Ähnlich wie Association Analysis-Uncover Relationships Between Data (20)

big data seminar.pptx
big data seminar.pptxbig data seminar.pptx
big data seminar.pptx
 
Top Down Approach to find Maximal Frequent Item Sets using Subset Creation
Top Down Approach to find Maximal Frequent Item Sets using Subset CreationTop Down Approach to find Maximal Frequent Item Sets using Subset Creation
Top Down Approach to find Maximal Frequent Item Sets using Subset Creation
 
Software requirementspecification
Software requirementspecificationSoftware requirementspecification
Software requirementspecification
 
20IT501_DWDM_PPT_Unit_III.ppt
20IT501_DWDM_PPT_Unit_III.ppt20IT501_DWDM_PPT_Unit_III.ppt
20IT501_DWDM_PPT_Unit_III.ppt
 
viva_dd.pptx
viva_dd.pptxviva_dd.pptx
viva_dd.pptx
 
20IT501_DWDM_U3.ppt
20IT501_DWDM_U3.ppt20IT501_DWDM_U3.ppt
20IT501_DWDM_U3.ppt
 
Businesses involved in mergers and acquisitions must exercise due di.docx
Businesses involved in mergers and acquisitions must exercise due di.docxBusinesses involved in mergers and acquisitions must exercise due di.docx
Businesses involved in mergers and acquisitions must exercise due di.docx
 
Association Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataAssociation Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big Data
 
A wrapper for QuantLib and reference data
A wrapper for QuantLib and reference dataA wrapper for QuantLib and reference data
A wrapper for QuantLib and reference data
 
Profitable Itemset Mining using Weights
Profitable Itemset Mining using WeightsProfitable Itemset Mining using Weights
Profitable Itemset Mining using Weights
 
Customer Decision Support System
Customer Decision Support SystemCustomer Decision Support System
Customer Decision Support System
 
Refining The System Definition
Refining The System DefinitionRefining The System Definition
Refining The System Definition
 
Dma unit 2
Dma unit  2Dma unit  2
Dma unit 2
 
Monitoring Distributed Systems
Monitoring Distributed SystemsMonitoring Distributed Systems
Monitoring Distributed Systems
 
Predicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithmsPredicting online user behaviour using deep learning algorithms
Predicting online user behaviour using deep learning algorithms
 
 risk-based approach of managing information systems is a holistic.docx
 risk-based approach of managing information systems is a holistic.docx risk-based approach of managing information systems is a holistic.docx
 risk-based approach of managing information systems is a holistic.docx
 
Lecture7 use case modeling
Lecture7 use case modelingLecture7 use case modeling
Lecture7 use case modeling
 
Introduction To Multilevel Association Rule And Its Methods
Introduction To Multilevel Association Rule And Its MethodsIntroduction To Multilevel Association Rule And Its Methods
Introduction To Multilevel Association Rule And Its Methods
 
PrésentationKnime-Final
PrésentationKnime-FinalPrésentationKnime-Final
PrésentationKnime-Final
 
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATIONADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
ADAPTIVE MODEL FOR WEB SERVICE RECOMMENDATION
 

Mehr von Datamining Tools

Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDatamining Tools
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysisDatamining Tools
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataDatamining Tools
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDatamining Tools
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDatamining Tools
 
Data Mining: Data warehouse and olap technology
Data Mining: Data warehouse and olap technologyData Mining: Data warehouse and olap technology
Data Mining: Data warehouse and olap technologyDatamining Tools
 
Data MIning: Data processing
Data MIning: Data processingData MIning: Data processing
Data MIning: Data processingDatamining Tools
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysisDatamining Tools
 
Data mining: Classification and Prediction
Data mining: Classification and PredictionData mining: Classification and Prediction
Data mining: Classification and PredictionDatamining Tools
 
Data Mining: Data mining classification and analysis
Data Mining: Data mining classification and analysisData Mining: Data mining classification and analysis
Data Mining: Data mining classification and analysisDatamining Tools
 
Data Mining: Data mining and key definitions
Data Mining: Data mining and key definitionsData Mining: Data mining and key definitions
Data Mining: Data mining and key definitionsDatamining Tools
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationDatamining Tools
 
Data Mining: Applying data mining
Data Mining: Applying data miningData Mining: Applying data mining
Data Mining: Applying data miningDatamining Tools
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDatamining Tools
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceDatamining Tools
 

Mehr von Datamining Tools (20)

Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysis
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Data Mining: Data warehouse and olap technology
Data Mining: Data warehouse and olap technologyData Mining: Data warehouse and olap technology
Data Mining: Data warehouse and olap technology
 
Data MIning: Data processing
Data MIning: Data processingData MIning: Data processing
Data MIning: Data processing
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysis
 
Data mining: Classification and Prediction
Data mining: Classification and PredictionData mining: Classification and Prediction
Data mining: Classification and Prediction
 
Data Mining: Data mining classification and analysis
Data Mining: Data mining classification and analysisData Mining: Data mining classification and analysis
Data Mining: Data mining classification and analysis
 
Data Mining: Data mining and key definitions
Data Mining: Data mining and key definitionsData Mining: Data mining and key definitions
Data Mining: Data mining and key definitions
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalization
 
Data Mining: Applying data mining
Data Mining: Applying data miningData Mining: Applying data mining
Data Mining: Applying data mining
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
AI: Planning and AI
AI: Planning and AIAI: Planning and AI
AI: Planning and AI
 
AI: Logic in AI 2
AI: Logic in AI 2AI: Logic in AI 2
AI: Logic in AI 2
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
 
AI: Learning in AI 2
AI: Learning in AI  2AI: Learning in AI  2
AI: Learning in AI 2
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
 

Kürzlich hochgeladen

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 

Kürzlich hochgeladen (20)

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 

Association Analysis-Uncover Relationships Between Data

  • 2. Association Analysis-Definition Association Analysis is the task of uncovering relationships among data. Association rules: It is a model that identifies how the data items are associated with each other. Ex: It is used in retail sales to identify that are frequently purchased together.
  • 3.
  • 4. Strength of a rule How certain is the rule? Confidence measures the certainty of a rule It is the percentage of transactions containing all items stated in the condition that also contain the items in result Confidence (A ,B) = P(B | A) Example: The rule "If Coke then Oranje Juice" has a confidence of 100%
  • 5. Strength of a rule How often is the rule occurred? Support measures the usefulness of a rule It is the percentage of transactions that contains all items in the rule Support (A , B) = P(A ,B) Example: For the rule If Coke then Oranj juice In all 5 transactions, 2 contains both coke and OJ The support of the rule is 40% 
  • 6. Association Rule Mining Two-step process Find all frequent k-item sets, k=1, 2, 3, … All items in a rule is referred as an itemset Rules that contains k item forms a k-itemset The occurrence frequency of an k-itemset is the number of transactions that contain all k items in the itemset An itemset satisfies a minimum support (or minimum occurrence frequency) is called a frequent itemset
  • 7. Association Rule Mining 2.Generate strong association rules from the frequent k-itemsets Rules satisfy both a minimum support threshold and a minimum confidence threshold are called strong rules
  • 8. Apriori Algorithm: Find all frequent k-item sets Apriori principle: If an itemset is frequent, then all of its subsets must also be frequent
  • 10. Apriori Algorithm Method: Let k=1 Generate frequent itemsets of length 1 Repeat until no new frequent itemsets are identified Generate length (k+1) candidate itemsets from length k frequent itemsets
  • 11. Contd… Prune candidate itemsets containing subsets of length k that are infrequent Count the support of each candidate by scanning the DB Eliminate candidates that are infrequent, leaving only those that are frequent
  • 12. Generate strong association rules from the frequent k-itemsets For each frequent k-itemset, generate all non-empty subsets Fore every nonempty subset, generate the rule and the associated confidence Output the rule if the minimum confidence threshold is satisfied
  • 13. Multilevel association rules Difficult to find strong associations at very low or primitive levels of data   Few people may buy "IBM desktop computer" and "Sony b/w printer" together Many people may purchase "computer" and "printer" together
  • 14. Concept hierarchy defines a sequence of mappings from a set of low level concepts to higher level EX: IBM  Microsoft  Hp ……… computer  software  printer  accessory 
  • 15. Steps to be followed Top-down, progressive deepening approach First mine high-level frequent items Then mine their lower level frequent items and so on At each level, Apriori algorithm is used Use uniform minimum support for all levels, or Use reduced minimum support at lower levels
  • 16. Sequential Association Rule  Concerns sequences of events New homeowners purchase shower curtains before purchasing furniture When a customer goes into a bank branch and ask for an account reconciliation, there is a good chance that he or she will close all his or her accounts
  • 17. Sequential Association Rule  Transaction must have two additional features: a time stamp or sequencing information to determine when transactions occurred relative to each other identifying information, such as account number or id number
  • 18. Some important parameters Duration duration may be the entire available sequence in the database, or a user selected subsequence, such as year 1999 Event folding window a set of events occurring within a specified period of time, such as within the same day, can be viewed as occurring together.
  • 19. Some important parameters Interval between events in the discovered pattern 0 interval means to find strictly consecutive sequences min_int <= interval <= max_int means to find patterns that are separated by at least min_int at most max_int interval = c, to find patterns carrying an exact interval
  • 20. Some Practical Issues  Time window of transactions Level of aggregation Level of support and confidence
  • 21. Time window of transactions Select a time window for the transaction covers at least 2 product cycles e.g. customer purchases a product with a frequency of six month or less, select a 12-month window of customer transaction data For frequently purchased products, a short time window is sufficient For low frequency items, a longer time window is necessary.
  • 22. Level of aggregation If product codes in the data are too specific (such as based on product details such as size and flavour), few associations will be discovered Group products into categories according to the product hierarchy or create new level manually
  • 23. Level of support and confidence Start with a high support and gradually reduce it Set confidence to around 50% to reduce the number of permutation
  • 24. Conclusion Association analysis rules such as multidimensional and sequential association rules are studied. Apriori algorithm is described in detail Various practical issues in association rules are analyzed.
  • 25. Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net