PresentationMinchenNathan

Nathan Minchen

Overview
• Email data from the textbook website
• 4601 emails (1813 spam)
• 58 variables per email
– 57 predictors, e.g. number of consecutive capital letters, number of times a particular word appears
– 1 class label (spam / not spam)
• Randomly split into training (3000) and testing (1601) sets
Data Source: http://statweb.stanford.edu/~tibs/ElemStatLearn/. Creators: Mark Hopkins, Erik Reeber, George Forman, Jaap Suermondt, Hewlett-Packard Labs, 1501 Page Mill Rd., Palo Alto, CA 94304
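
The random 3000/1601 split described above could be reproduced along the following lines. This is only a minimal sketch: the helper name `split_indices` and the seed are assumptions for illustration, and the slides do not say which tool or seed was actually used.

```python
import numpy as np

N_EMAILS, N_TRAIN = 4601, 3000  # counts stated on the slide


def split_indices(n_total=N_EMAILS, n_train=N_TRAIN, seed=0):
    """Return disjoint (train_idx, test_idx) arrays that together cover range(n_total).

    The seed is an assumption, included only so the split is reproducible.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)  # random ordering of all row indices
    return perm[:n_train], perm[n_train:]


train_idx, test_idx = split_indices()
print(len(train_idx), len(test_idx))  # 3000 1601
```

Drawing a single permutation and cutting it once guarantees that no email lands in both sets.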

Methods
• Linear and nonlinear methods
• Variable transformation and standardization
• Feature space modifications:
– No basis expansion
– Raw polynomials (degree 3)
– Orthogonal polynomials (degree 3)
– Natural splines (df = 4, chosen by cross-validation using LR)
• Misclassification is highly undesirable
• Definition of a “maybe spam” class
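
To make two of the ideas above concrete, here is a hedged sketch of (a) raw versus orthogonal degree-3 polynomial expansions of a single predictor and (b) one possible way to define a "maybe spam" class from a predicted spam probability. The function names, the QR-based construction, and the 0.3 / 0.7 cut-offs are all assumptions for illustration; the slides do not state how the author built the bases or where the "maybe spam" boundaries were placed.

```python
import numpy as np


def raw_poly(x, degree=3):
    """Raw polynomial basis: columns x, x^2, ..., x^degree (no intercept)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x ** d for d in range(1, degree + 1)])


def orthogonal_poly(x, degree=3):
    """Orthonormal polynomial basis spanning the same space as raw_poly.

    Built by QR-decomposing the centered matrix [1, x, x^2, ..., x^degree]
    and dropping the constant column. Needs more than `degree` distinct x values.
    """
    x = np.asarray(x, dtype=float)
    vander = np.vander(x - x.mean(), degree + 1, increasing=True)
    q, _ = np.linalg.qr(vander)
    return q[:, 1:]


def three_way_label(p_spam, lower=0.3, upper=0.7):
    """Label by predicted spam probability; the thresholds are hypothetical."""
    p = np.asarray(p_spam, dtype=float)
    return np.where(p >= upper, "spam",
                    np.where(p <= lower, "not spam", "maybe spam"))


x = np.linspace(0.0, 1.0, 20)
basis = orthogonal_poly(x)
print(np.allclose(basis.T @ basis, np.eye(3)))  # True: columns are orthonormal
print(three_way_label([0.05, 0.5, 0.95]))
```

The orthogonal columns carry the same information as the raw powers but are uncorrelated with each other, which tends to keep a fit numerically stable. The middle band in `three_way_label` is one way to avoid committing to a hard spam/not-spam call when the model is unsure.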

1. Nathan Minchen Final Project Stat 588 Prof. Yang
