SlideShare ist ein Scribd-Unternehmen logo
1 von 20
Topic Models Based Personalized Spam Filter Sudarsun. S Director – R & D, Checktronix India Pvt Ltd, Chennai Venkatesh Prabhu. G Research Associate, Checktronix India Pvt Ltd, Chennai Valarmathi B Professor, SKP Engineering College, Thiruvannamalai
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Content Based Filtering ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Content-Based Filtering -- Methods ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],LSA Model, In Brief
PLSA Model ,[object Object],[object Object],[object Object],[object Object],[object Object],Where ,[object Object]
N–Gram Approach ,[object Object],[object Object],[object Object],[object Object]
Overall System Architecture Training Mails   Preprocessor LSA  Model PLSA  Model N-Gram Other  Classifiers Combiner Final Result Test Mail … .
Preprocessing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Principle Component Analysis - PCA ,[object Object],[object Object],[object Object],[object Object]
LSA Classification Score Input Mails LSA Model PCA BPN Token List Vector 1xR R: Rank MxR M: Vocab Size R: Rank Vector 1xR’ RxR’ R: InVar Size R’: OutVar Size
PLSA Classification Score Input Mails PLSA Model PCA BPN Token List Vector 1xZ Z: Aspects MxZ M: Vocab Size R: Aspects Count Vector 1xZ’ ZxZ’ Z: InVar Size Z’: OutVar Size
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],(P)LSA Classification
N-Gram method ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
An Example N-Gram Tree T5 T1 T2 T3 T4 3 rd   2 nd   N1 2 nd   1 s t   N2 N3 N 4
Combiner ,[object Object],[object Object],[object Object],[object Object]
Conclusion ,[object Object],[object Object],[object Object],[object Object],[object Object]
References [1]I. Androutsopoulos, J. Koutsias, K. V. Chandrinos, G. Paliouras, and C. D. Spyropoulos. “An Evaluation of Naïve Bayesian Anti-Spam Filtering”, Proc. of the workshop on Machine Learning in the New Information Age, 2000. [2]W. Cohen, “Learning rules that classify e-mail”, AAAI Spring Symposium on Machine Learning in Information Access, 1996. [3] W. Daelemans, J. Zavrel, K. van der Sloot, and A. van den Bosch, “TiMBL: Tilburg Memory-Based Learner - version 4.0 Reference Guide”, 2001. [4] H. Drucker, D. Wu, and V. N. Vapnik., “Support Vector Machines for Spam Categorization”, IEEE Trans. on Neural networks, 1999. [5] D. Mertz, “Spam Filtering Techniques. Six approaches to eliminating unwanted e-mail.”, Gnosis Software Inc., September, 2002. Ciencias Físicas, Universidad de Valencia, 1992. [6] M. Vinther, “Junk Detection using neural networks”, MeeSoft Technical Report, June 2002. Available: http://logicnet.dk/reports/JunkDetection/JunkDetection.htm. [7] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. “Indexing By Latent Semantic Analysis”,  Journal of the American Society For Information  Science , 41, 391-407. (1990) [8] Sudarsun Santhiappan, Venkatesh Prabhu Gopalan, and Sathish Kumar Veeraswamy,”Role of Weighting on TDM in Improvising Performance of LSA on Text Data”,  Proceedings of   IEEE INDICON  2006 . [9] Thomas Hofmann, “Probabilistic Latent Semantic Indexing,”  Proc. 22 Int’l SIGIR Conf. on Research and Development in Information Retrieval, 1999 [10]Sudarsun Santhiappan, Dalou Kalaivendhan and Venkateswarlu Malapatti  .” Unsupervised Contextual Keyword Relevance Learning and Measurement using PLSA”,  Proceedings of IEEE INDICON 2006.  [11]Landauer, T. K., Foltz, P. W., & Laham, D. “Introduction to Latent Semantic Analysis”, DiscourseProcesses, 25, 259-284. (1998). [12]G. Furnas, S. Deerwester, S. Dumais, T. Landauer, R. Harshman, L. Streeter and K. Lochbaum, "Information retrieval using a singular value decomposition model of latent semantic structure,"  in The 11th International Conference on Research and Development in Information Retrieval, Grenoble, France: ACM Press , pp. 465--480. (1988) [13] Damashek, M. Gauging , “Similarity via N-Grams:  Language-Independant Sorting, Categorization and Retrieval of  Text”.  Science ,  267 . 843-848. [14]  Sholomo Hershkop, Salvatore J.Stolfo , “Combining Email models for False Positive Reduction”,  KDD’05, August 2005.
Any Queries…. ? You can post your queries to  [email_address]

Weitere ähnliche Inhalte

Was ist angesagt?

Domain-Specific Term Extraction for Concept Identification in Ontology Constr...
Domain-Specific Term Extraction for Concept Identification in Ontology Constr...Domain-Specific Term Extraction for Concept Identification in Ontology Constr...
Domain-Specific Term Extraction for Concept Identification in Ontology Constr...Innovation Quotient Pvt Ltd
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information RetrievalBhaskar Mitra
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalBhaskar Mitra
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information RetrievalBhaskar Mitra
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Bhaskar Mitra
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptbutest
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressBhaskar Mitra
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchBhaskar Mitra
 
taghelper-final.doc
taghelper-final.doctaghelper-final.doc
taghelper-final.docbutest
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Dev Sahu
 
Information Retrieval using Semantic Similarity
Information Retrieval using Semantic SimilarityInformation Retrieval using Semantic Similarity
Information Retrieval using Semantic SimilaritySaswat Padhi
 
Text Mining for Lexicography
Text Mining for LexicographyText Mining for Lexicography
Text Mining for LexicographyLeiden University
 
Presentation on Text Classification
Presentation on Text ClassificationPresentation on Text Classification
Presentation on Text ClassificationSai Srinivas Kotni
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241Urjit Patel
 
Improved text clustering with
Improved text clustering withImproved text clustering with
Improved text clustering withIJDKP
 

Was ist angesagt? (20)

Domain-Specific Term Extraction for Concept Identification in Ontology Constr...
Domain-Specific Term Extraction for Concept Identification in Ontology Constr...Domain-Specific Term Extraction for Concept Identification in Ontology Constr...
Domain-Specific Term Extraction for Concept Identification in Ontology Constr...
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)
 
The Duet model
The Duet modelThe Duet model
The Duet model
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.ppt
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progress
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
 
taghelper-final.doc
taghelper-final.doctaghelper-final.doc
taghelper-final.doc
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 
Information Retrieval using Semantic Similarity
Information Retrieval using Semantic SimilarityInformation Retrieval using Semantic Similarity
Information Retrieval using Semantic Similarity
 
Text Mining for Lexicography
Text Mining for LexicographyText Mining for Lexicography
Text Mining for Lexicography
 
Presentation on Text Classification
Presentation on Text ClassificationPresentation on Text Classification
Presentation on Text Classification
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241NLP_Project_Paper_up276_vec241
NLP_Project_Paper_up276_vec241
 
Improved text clustering with
Improved text clustering withImproved text clustering with
Improved text clustering with
 

Ähnlich wie Topic Models Based Personalized Spam Filter

A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationNinad Samel
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...ijtsrd
 
The Geometry of Learning
The Geometry of LearningThe Geometry of Learning
The Geometry of Learningfridolin.wild
 
Lexicon base approch
Lexicon base approchLexicon base approch
Lexicon base approchanil maurya
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Rich Heimann
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
PhD Consortium ADBIS presetation.
PhD Consortium ADBIS presetation.PhD Consortium ADBIS presetation.
PhD Consortium ADBIS presetation.Giuseppe Ricci
 
G04124041046
G04124041046G04124041046
G04124041046IOSR-JEN
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyIJTET Journal
 
776_AlluruMPranav_CEE
776_AlluruMPranav_CEE776_AlluruMPranav_CEE
776_AlluruMPranav_CEEPranav A
 
LSTM Based Sentiment Analysis
LSTM Based Sentiment AnalysisLSTM Based Sentiment Analysis
LSTM Based Sentiment Analysisijtsrd
 
797_NaveenKKapoor_CEE
797_NaveenKKapoor_CEE797_NaveenKKapoor_CEE
797_NaveenKKapoor_CEENaveen Kapoor
 
Learning group dssm - 20170605
Learning group   dssm - 20170605Learning group   dssm - 20170605
Learning group dssm - 20170605Shuai Zhang
 
A Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense DisambiguationA Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense DisambiguationElena-Oana Tabaranu
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGIJwest
 

Ähnlich wie Topic Models Based Personalized Spam Filter (20)

A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorization
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...
 
The Geometry of Learning
The Geometry of LearningThe Geometry of Learning
The Geometry of Learning
 
Lexicon base approch
Lexicon base approchLexicon base approch
Lexicon base approch
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
PhD Consortium ADBIS presetation.
PhD Consortium ADBIS presetation.PhD Consortium ADBIS presetation.
PhD Consortium ADBIS presetation.
 
G04124041046
G04124041046G04124041046
G04124041046
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A Survey
 
776_AlluruMPranav_CEE
776_AlluruMPranav_CEE776_AlluruMPranav_CEE
776_AlluruMPranav_CEE
 
LSTM Based Sentiment Analysis
LSTM Based Sentiment AnalysisLSTM Based Sentiment Analysis
LSTM Based Sentiment Analysis
 
797_NaveenKKapoor_CEE
797_NaveenKKapoor_CEE797_NaveenKKapoor_CEE
797_NaveenKKapoor_CEE
 
E1062530
E1062530E1062530
E1062530
 
Miraj Vashi_CPEE
Miraj Vashi_CPEEMiraj Vashi_CPEE
Miraj Vashi_CPEE
 
Learning group dssm - 20170605
Learning group   dssm - 20170605Learning group   dssm - 20170605
Learning group dssm - 20170605
 
HiteshAgarwal_CPEE
HiteshAgarwal_CPEEHiteshAgarwal_CPEE
HiteshAgarwal_CPEE
 
A Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense DisambiguationA Survey on Unsupervised Graph-based Word Sense Disambiguation
A Survey on Unsupervised Graph-based Word Sense Disambiguation
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 

Mehr von Sudarsun Santhiappan

Mehr von Sudarsun Santhiappan (13)

Challenges in Large Scale Machine Learning
Challenges in Large Scale  Machine LearningChallenges in Large Scale  Machine Learning
Challenges in Large Scale Machine Learning
 
Software Patterns
Software PatternsSoftware Patterns
Software Patterns
 
Search Engine Demystified
Search Engine DemystifiedSearch Engine Demystified
Search Engine Demystified
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
 
Essentials for a Budding IT professional
Essentials for a Budding IT professionalEssentials for a Budding IT professional
Essentials for a Budding IT professional
 
What it takes to be the Best IT Trainer
What it takes to be the Best IT TrainerWhat it takes to be the Best IT Trainer
What it takes to be the Best IT Trainer
 
Using Behavioral Patterns In Treating Autistic
Using Behavioral Patterns In Treating AutisticUsing Behavioral Patterns In Treating Autistic
Using Behavioral Patterns In Treating Autistic
 
Latent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalLatent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information Retrieval
 
Audio And Video Over Internet
Audio And Video Over InternetAudio And Video Over Internet
Audio And Video Over Internet
 
Practical Network Security
Practical Network SecurityPractical Network Security
Practical Network Security
 
How To Do A Project
How To Do A ProjectHow To Do A Project
How To Do A Project
 
Ontology
OntologyOntology
Ontology
 
Object Oriented Design
Object Oriented DesignObject Oriented Design
Object Oriented Design
 

Kürzlich hochgeladen

FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607dollysharma2066
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis UsageNeil Kimberley
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03DallasHaselhorst
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyotictsugar
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy Verified Accounts
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...lizamodels9
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...lizamodels9
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCRashishs7044
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfrichard876048
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesKeppelCorporation
 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607dollysharma2066
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailAriel592675
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...ssuserf63bd7
 
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadIslamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadAyesha Khan
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCRashishs7044
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfRbc Rbcua
 

Kürzlich hochgeladen (20)

FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607FULL ENJOY Call girls in Paharganj Delhi | 8377087607
FULL ENJOY Call girls in Paharganj Delhi | 8377087607
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage
 
Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03Cybersecurity Awareness Training Presentation v2024.03
Cybersecurity Awareness Training Presentation v2024.03
 
Investment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy CheruiyotInvestment in The Coconut Industry by Nancy Cheruiyot
Investment in The Coconut Industry by Nancy Cheruiyot
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
Buy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail AccountsBuy gmail accounts.pdf Buy Old Gmail Accounts
Buy gmail accounts.pdf Buy Old Gmail Accounts
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
 
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
 
8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR8447779800, Low rate Call girls in Saket Delhi NCR
8447779800, Low rate Call girls in Saket Delhi NCR
 
Innovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdfInnovation Conference 5th March 2024.pdf
Innovation Conference 5th March 2024.pdf
 
Annual General Meeting Presentation Slides
Annual General Meeting Presentation SlidesAnnual General Meeting Presentation Slides
Annual General Meeting Presentation Slides
 
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
(Best) ENJOY Call Girls in Faridabad Ex | 8377087607
 
Case study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detailCase study on tata clothing brand zudio in detail
Case study on tata clothing brand zudio in detail
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...International Business Environments and Operations 16th Global Edition test b...
International Business Environments and Operations 16th Global Edition test b...
 
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in IslamabadIslamabad Escorts | Call 03070433345 | Escort Service in Islamabad
Islamabad Escorts | Call 03070433345 | Escort Service in Islamabad
 
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR8447779800, Low rate Call girls in Tughlakabad Delhi NCR
8447779800, Low rate Call girls in Tughlakabad Delhi NCR
 
Japan IT Week 2024 Brochure by 47Billion (English)
Japan IT Week 2024 Brochure by 47Billion (English)Japan IT Week 2024 Brochure by 47Billion (English)
Japan IT Week 2024 Brochure by 47Billion (English)
 
APRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdfAPRIL2024_UKRAINE_xml_0000000000000 .pdf
APRIL2024_UKRAINE_xml_0000000000000 .pdf
 

Topic Models Based Personalized Spam Filter

  • 1. Topic Models Based Personalized Spam Filter Sudarsun. S Director – R & D, Checktronix India Pvt Ltd, Chennai Venkatesh Prabhu. G Research Associate, Checktronix India Pvt Ltd, Chennai Valarmathi B Professor, SKP Engineering College, Thiruvannamalai
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9. Overall System Architecture Training Mails Preprocessor LSA Model PLSA Model N-Gram Other Classifiers Combiner Final Result Test Mail … .
  • 10.
  • 11.
  • 12. LSA Classification Score Input Mails LSA Model PCA BPN Token List Vector 1xR R: Rank MxR M: Vocab Size R: Rank Vector 1xR’ RxR’ R: InVar Size R’: OutVar Size
  • 13. PLSA Classification Score Input Mails PLSA Model PCA BPN Token List Vector 1xZ Z: Aspects MxZ M: Vocab Size R: Aspects Count Vector 1xZ’ ZxZ’ Z: InVar Size Z’: OutVar Size
  • 14.
  • 15.
  • 16. An Example N-Gram Tree T5 T1 T2 T3 T4 3 rd 2 nd N1 2 nd 1 s t N2 N3 N 4
  • 17.
  • 18.
  • 19. References [1]I. Androutsopoulos, J. Koutsias, K. V. Chandrinos, G. Paliouras, and C. D. Spyropoulos. “An Evaluation of Naïve Bayesian Anti-Spam Filtering”, Proc. of the workshop on Machine Learning in the New Information Age, 2000. [2]W. Cohen, “Learning rules that classify e-mail”, AAAI Spring Symposium on Machine Learning in Information Access, 1996. [3] W. Daelemans, J. Zavrel, K. van der Sloot, and A. van den Bosch, “TiMBL: Tilburg Memory-Based Learner - version 4.0 Reference Guide”, 2001. [4] H. Drucker, D. Wu, and V. N. Vapnik., “Support Vector Machines for Spam Categorization”, IEEE Trans. on Neural networks, 1999. [5] D. Mertz, “Spam Filtering Techniques. Six approaches to eliminating unwanted e-mail.”, Gnosis Software Inc., September, 2002. Ciencias Físicas, Universidad de Valencia, 1992. [6] M. Vinther, “Junk Detection using neural networks”, MeeSoft Technical Report, June 2002. Available: http://logicnet.dk/reports/JunkDetection/JunkDetection.htm. [7] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. “Indexing By Latent Semantic Analysis”, Journal of the American Society For Information Science , 41, 391-407. (1990) [8] Sudarsun Santhiappan, Venkatesh Prabhu Gopalan, and Sathish Kumar Veeraswamy,”Role of Weighting on TDM in Improvising Performance of LSA on Text Data”, Proceedings of IEEE INDICON 2006 . [9] Thomas Hofmann, “Probabilistic Latent Semantic Indexing,” Proc. 22 Int’l SIGIR Conf. on Research and Development in Information Retrieval, 1999 [10]Sudarsun Santhiappan, Dalou Kalaivendhan and Venkateswarlu Malapatti .” Unsupervised Contextual Keyword Relevance Learning and Measurement using PLSA”, Proceedings of IEEE INDICON 2006. [11]Landauer, T. K., Foltz, P. W., & Laham, D. “Introduction to Latent Semantic Analysis”, DiscourseProcesses, 25, 259-284. (1998). [12]G. Furnas, S. Deerwester, S. Dumais, T. Landauer, R. Harshman, L. Streeter and K. Lochbaum, "Information retrieval using a singular value decomposition model of latent semantic structure," in The 11th International Conference on Research and Development in Information Retrieval, Grenoble, France: ACM Press , pp. 465--480. (1988) [13] Damashek, M. Gauging , “Similarity via N-Grams: Language-Independant Sorting, Categorization and Retrieval of Text”. Science , 267 . 843-848. [14] Sholomo Hershkop, Salvatore J.Stolfo , “Combining Email models for False Positive Reduction”, KDD’05, August 2005.
  • 20. Any Queries…. ? You can post your queries to [email_address]