Clustering Models to Assist in Student Outreach

•Als PPTX, PDF herunterladen•

0 gefällt mir•216 views

Presentation delivered at Texas Association of Institutional Research on the applications of unsupervised clustering models to drive student outreach, as well as a general overview of common algorithms.

Daten & Analysen

1. Why clustering?
2. Example 1, hierarchical clustering
3. Questions
4. Overview of common algorithms and strengths/limitations
5. Example 2, K-means clustering
6. Questions
2

4
Reverse looking
What are characteristics of
students who used/did not use
a service? Did/did not persist?
Who is the prototypical
student in each academic
program?
Forward looking
What characteristics comprise
our prospective student
personas?
How do risk factors overlap in
some groups of students?

The challenge
Contrary to national trends, Dallas College has experienced larger
than typical drops in female student re-enrollment patterns. Our
goal was to stem decreases for the Fall 2021 semester through a
text campaign.
The question
What messaging should be used to resonate with these students?
6

45k students
Diverse features for the model
• Demographics: race/ethnicity, gender, age
• Financial: income, employment intensity
• Household: household size, dependent count
• Academics: last term of enrollment, credits, GPA
• Special Pop membership
7

8
Step 1
Starts with a case as leaf node
Each case goes into that leaf
node or breaks into a new
node
Result is a Cluster Features
tree
Step 2
Leaf nodes are combined
through agglomeration
Result is several “best”
clusters with a silhouette score

10
African Am.
3 dependents
Part-time job
Hispanic
4 dependents
Part-time job
Hispanic
2 dependents
Full-time job
White
1 dependent
Full-time job

BIRCH and Two-step (SPSS proprietary; Ex 1)
K-means, K-modes, K-medoids (Ex 2)
Hierarchical (Agglomerative and Divisive)
12

15
Clusters based on
distance cutoff line
Distance cutoff line

Algorithm Variable Types
BIRCH / 2-Step Any, multiple at the same time
Hierarchical Any, but stick to one kind at a time and use
well-paired distance measure
K-means Continuous
K-modes Categorical
K-medoids Any, multiple at the same time with the
right distance measure (Gower)
16

17
Algorithm +/-
BIRCH / 2-Step Flexible variable types, fast compute, large
data; there is an element of a black box
Hierarchical Flexible modeling, highly explainable;
limited to small data
K-Means, K-Modes Easy, fast, large data; sensitive to K, curse
of dimensionality, clusters same sized
K-medoids Like K-Means but more flexible variable
types and more costly tuning & compute

• Population: 24 years and older adults enrolled for Fall 2021
• Features : Academics, Demographic, Financial , Household and Veteran status.
• Mix of discrete and continuous variables with majority of them being categorical data.
19
Below Federal poverty
Level
Below
Median
Lower half Above
Median
Upper half Above
Median
Median household income in Texas
Poverty Flag INCOME BIN

20
V1 V2 V3 V4 V5
0 1 1 0 1
V1 V2 V3 V4 V5
1 0 0 0 1
1+1+1+1 = 4
Dissimilarity measure
0: Mismatch 1: Match
• Randomly select the K initial centers
• Repeat
1. Assign the samples to nearest center
2. Update means/modes based on newly formed cluster
3. Calculate the cost (SSE/Sum of dissimilarity)
• Stop when cluster centers converges
Using frequency-based
method to calculate mode
instead of the mean of the
sample
Minimize the cost function
Sample 1
Sample 2

22
Missing values for categorical variables
such as employment status
Missing values for
continuous variables
such as income
Possible solutions
• Revisit our data warehouse to obtain as much
information as we can to fill in the missing values
• Imputation with K-Nearest neighbors with
Hamming distance for categorical data
• Imputation with K-Nearest neighbors with
Euclidean distance for continuous data

Jeremy Anderson, Associate Vice Chancellor of Strategic
Analytics, jeremy.anderson@dcccd.edu
Dillon Lu, Data Analyst, DLu@dcccd.edu

Weitere ähnliche Inhalte

Ähnlich wie Clustering Models to Assist in Student Outreach

Proposal defense apr19

rgeurtz

Using Data to Mobilize Commuities and Change Lives

RaisingTheBar2015

CB FORUM FINAL

Dan Lundquist

Margaret Patton, Dissertation Defense, Dr. William Allan Kritsonis, PhD Disse...

William Kritsonis

Dr. Margaret Curette Patton, PhD Dissertation, Dr. William Allan Kritsonis, D...

William Kritsonis

Transitioning to Common Core: What it means, What to look for

Curriculum Associates

Local Critical Issue Version 2

Tim Hasse

Union Col lecture

Dan Lundquist

Rabbani - Education

Copenhagen_Consensus

Getting to the Root Causes of Disproportionate Representation in Special Educ...

SPPTAP

Connecticut Core 2014

EdAdvance

2.why the common_core_presentation_with_facilitators_notes_update_072213

WRHSlibrary

New Canaan BOE

EdAdvance

SACAC Session E.12 Sealing the Deal

Raise.me

Localcriticalissueversion2 091109081645 Phpapp01

Tim Hasse

Chautauqua County School Board Dinner

John Sipple

Funding Dries Up For Non Profit And Educational Institutions Serving Black Co...

Larry Cochran, MBA

Vladimir shares his experience in tackling the unique challenges that he has faced as a credit risk modeler in а Bulgarian bank. Credit risk models predict whether a borrower will pay their loan back, and people like Vladimir that develop them in effect decide if you will be approved for a credit card or a mortgage loan. Building such models in Bulgaria presents some very interesting challenges related to data quality, variable interaction and model form. You won't find some of the discussed problems and solutions in any popular textbook, so don't miss out this great opportunity to update your expertise!

Credit risk predictive analytics

Data Science Society

Missouri ACT Identified Keys to Enrollment Success

StephaneGeyer

Top 11 Metrics Every Financial Aid Director Should Be Measuring

CampusLogic

Ähnlich wie Clustering Models to Assist in Student Outreach (20)

Proposal defense apr19

Using Data to Mobilize Commuities and Change Lives

CB FORUM FINAL

Margaret Patton, Dissertation Defense, Dr. William Allan Kritsonis, PhD Disse...

Dr. Margaret Curette Patton, PhD Dissertation, Dr. William Allan Kritsonis, D...

Transitioning to Common Core: What it means, What to look for

Local Critical Issue Version 2

Union Col lecture

Rabbani - Education

Getting to the Root Causes of Disproportionate Representation in Special Educ...

Connecticut Core 2014

2.why the common_core_presentation_with_facilitators_notes_update_072213

New Canaan BOE

SACAC Session E.12 Sealing the Deal

Localcriticalissueversion2 091109081645 Phpapp01

Chautauqua County School Board Dinner

Funding Dries Up For Non Profit And Educational Institutions Serving Black Co...

Credit risk predictive analytics

Missouri ACT Identified Keys to Enrollment Success

Top 11 Metrics Every Financial Aid Director Should Be Measuring

Mehr von Jeremy Anderson

Defining Constituents, Data Vizzes and Telling a Data Story

Jeremy Anderson

How I Lead and Manage

Jeremy Anderson

Creating a Print-on-Demand Initiative for Open Educational Resources

Jeremy Anderson

How any institution can get started on learning analytics

Jeremy Anderson

Creating Wraparound Supports for Students through Internal Partnerships

Jeremy Anderson

Four LMS Tools to Change Your Life

Jeremy Anderson

Addressing the Adjunct Underclass: Fit and Employment Outcomes in Part-Time F...

Jeremy Anderson

Plan an OER Stragegic Plan: Change Models and Processes

Jeremy Anderson

Sustaining Accessible OER at Scale

Jeremy Anderson

Hybrid Course Design - Will It Blend?!

Jeremy Anderson

The Triple A (AAA) of OER: Accessibility, Availability, and Affordability

Jeremy Anderson

Case Study: Increasing Access through OER Adoption

Jeremy Anderson

What data can deliver: A new way of operating

Jeremy Anderson

2018 Horizon Report Webinar: Adaptive Learning and OER at Scale

Jeremy Anderson

The Nitty Gritty of OER Adoption

Jeremy Anderson

The Path to Creating an Integrated Online Contingent Faculty Competency System

Jeremy Anderson

Strategies to Scale Adaptive Course Design

Jeremy Anderson

OER in the Time of Adaptive

Jeremy Anderson

Implementing Adaptive, Data-Driven Course Design to Improve Student Learning

Jeremy Anderson

Lessons from Adopting an Adaptive Learning Platform

Jeremy Anderson

Mehr von Jeremy Anderson (20)

Defining Constituents, Data Vizzes and Telling a Data Story

How I Lead and Manage

Creating a Print-on-Demand Initiative for Open Educational Resources

How any institution can get started on learning analytics

Creating Wraparound Supports for Students through Internal Partnerships

Four LMS Tools to Change Your Life

Addressing the Adjunct Underclass: Fit and Employment Outcomes in Part-Time F...

Plan an OER Stragegic Plan: Change Models and Processes

Sustaining Accessible OER at Scale

Hybrid Course Design - Will It Blend?!

The Triple A (AAA) of OER: Accessibility, Availability, and Affordability

Case Study: Increasing Access through OER Adoption

What data can deliver: A new way of operating

2018 Horizon Report Webinar: Adaptive Learning and OER at Scale

The Nitty Gritty of OER Adoption

The Path to Creating an Integrated Online Contingent Faculty Competency System

Strategies to Scale Adaptive Course Design

OER in the Time of Adaptive

Implementing Adaptive, Data-Driven Course Design to Improve Student Learning

Lessons from Adopting an Adaptive Learning Platform

Kürzlich hochgeladen

How I opened a fake bank account and didn't go to prison

Payment Village

Exploratory Data Analysis - Dilip S.pptx

DilipVasan

Unlock the power of data analytics with our comprehensive slide deck from the Advanced Digital Strategy MGMT X 466.05 course at UCLAx. This presentation reviews the fundamental concepts and practical applications of data analytics in business and marketing. Key Topics Covered: Overview & Concepts: Learn how data analytics uses statistics, predictive modeling, and machine learning to enhance business performance. Types of Data: Understand the differences between structured and unstructured data, and how to leverage quantitative and qualitative data. Key Techniques: Explore descriptive, diagnostic, predictive, and prescriptive analytics to transform raw data into actionable insights. Common Tools: Get acquainted with popular tools like Google Analytics, Google Looker, Adobe Analytics, and HubSpot for effective data tracking and analysis. Data Analysis Process: Follow a step-by-step guide to collecting, cleaning, modeling, and interpreting data to drive informed decision-making. Optimizing Campaigns: Learn how to use A/B testing and past campaign performance data to enhance future marketing efforts. Defining Audiences: Discover how to segment target audiences using demographic data, purchase histories, and online behaviors for more precise marketing strategies. Advanced Methods: Dive into advanced data analysis techniques like cohort analysis, cluster analysis, sentiment analysis, and regression analysis. Customer Journey Analytics: Visualize the customer journey and identify key engagement moments to optimize the customer experience. Data Visualization & Storytelling: Master the art of communicating data insights effectively through visualizations and contextual storytelling.

Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...

Valters Lauzums

WSU毕业证原版定制【微信：176555708】【西悉尼大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版西悉尼大学毕业证成绩单如何办理

pyhepag

Slip-and-fall Injuries: Top Workers' Comp Claims

Bisnar Chase Personal Injury Attorneys

Pre-ProductionImproveddsfjgndflghtgg.pptx

Stephen266013

MQU毕业证原版定制【微信：176555708】【麦考瑞大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版麦考瑞大学毕业证成绩单如何办理

cyebo

The Significance of Transliteration Enhancing

mohamed Elzalabany

Atlantic Grupa Case Study (Mintec Data AI)

Jon Hansen

社内勉強会資料　Mamba - A new era or ephemeral

NABLAS株式会社

Fuzzy Sets decision making under information of uncertainty

RafigAliyev2

Monash毕业证原版定制【微信：176555708】【莫纳什大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理

pyhepag

Artificial_General_Intelligence__storm_gen_article.pdf

scitechtalktv

UCI毕业证原版定制【微信：176555708】【加利福尼亚大学尔湾分校毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理

pyhepag

This presentation explores the potential of machine learning in predicting the severity of road accidents. We will delve into the data analysis process, the chosen machine learning algorithms, and the evaluation of our model's performance. This project aims to contribute to improved emergency response times and accident prevention strategies. visit for more: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/

Machine Learning for Accident Severity Prediction

Boston Institute of Analytics

NCL毕业证原版定制【微信：176555708】【纽卡斯尔大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版纽卡斯尔大学毕业证成绩单如何办理

cyebo

Supply chain analytics to combat the effects of Ukraine-Russia-conflict

Jack Cole

Saudi Arabia [ Abortion pills) Jeddah/riaydh/dammam/+966572737505☎️] cytotec tablets uses abortion pills 💊💊 How effective is the abortion pill? 💊💊 +966572737505) "Abortion pills in Jeddah" how to get cytotec tablets in Riyadh " Abortion pills in dammam*💊💊 The abortion pill is very effective. If you’re taking mifepristone and misoprostol, it depends on how far along the pregnancy is, and how many doses of medicine you take:💊💊 +966572737505) how to buy cytotec pills At 8 weeks pregnant or less, it works about 94-98% of the time. +966572737505[ 💊💊💊 At 8-9 weeks pregnant, it works about 94-96% of the time. +966572737505) At 9-10 weeks pregnant, it works about 91-93% of the time. +966572737505)💊💊 If you take an extra dose of misoprostol, it works about 99% of the time. At 10-11 weeks pregnant, it works about 87% of the time. +966572737505) If you take an extra dose of misoprostol, it works about 98% of the time. In general, taking both mifepristone and+966572737505 misoprostol works a bit better than taking misoprostol only. +966572737505 Taking misoprostol alone works to end the+966572737505 pregnancy about 85-95% of the time — depending on how far along the+966572737505 pregnancy is and how you take the medicine. +966572737505 The abortion pill usually works, but if it doesn’t, you can take more medicine or have an in-clinic abortion. +966572737505 When can I take the abortion pill?+966572737505 In general, you can have a medication abortion up to 77 days (11 weeks)+966572737505 after the first day of your last period. If it’s been 78 days or more since the first day of your last+966572737505 period, you can have an in-clinic abortion to end your pregnancy.+966572737505 Why do people choose the abortion pill? Which kind of abortion you choose all depends on your personal+966572737505 preference and situation. With+966572737505 medication+966572737505 abortion, some people like that you don’t need to have a procedure in a doctor’s office. You can have your medication abortion on your own+966572737505 schedule, at home or in another comfortable place that you choose.+966572737505 You get to decide who you want to be with during your abortion, or you can go it alone. Because+966572737505 medication abortion is similar to a miscarriage, many people feel like it’s more “natural” and less invasive. And some+966572737505 people may not have an in-clinic abortion provider close by, so abortion pills are more available to+966572737505 them. +966572737505 Your doctor, nurse, or health center staff can help you decide which kind of abortion is best for you. +966572737505 More questions from patients: Saudi Arabia+966572737505 CYTOTEC Misoprostol Tablets. Misoprostol is a medication that can prevent stomach ulcers if you also take NSAID medications. It reduces the amount of acid in your stomach, which protects your stomach lining. The brand name of this medication is Cytotec®.+966573737505) Unwanted Kit is a combination of two medicines

Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec

Abortion pills in Riyadh +966572737505 get cytotec

Easy and simple project file on mp online

balibahu1313

Adelaide毕业证原版定制【微信：176555708】【阿德莱德大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

一比一原版阿德莱德大学毕业证成绩单如何办理

pyhepag

Kürzlich hochgeladen (20)

How I opened a fake bank account and didn't go to prison

Exploratory Data Analysis - Dilip S.pptx

Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...

一比一原版西悉尼大学毕业证成绩单如何办理

Slip-and-fall Injuries: Top Workers' Comp Claims

Pre-ProductionImproveddsfjgndflghtgg.pptx

一比一原版麦考瑞大学毕业证成绩单如何办理

The Significance of Transliteration Enhancing

Atlantic Grupa Case Study (Mintec Data AI)

社内勉強会資料　Mamba - A new era or ephemeral

Fuzzy Sets decision making under information of uncertainty

一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理

Artificial_General_Intelligence__storm_gen_article.pdf

一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理

Machine Learning for Accident Severity Prediction

一比一原版纽卡斯尔大学毕业证成绩单如何办理

Supply chain analytics to combat the effects of Ukraine-Russia-conflict

Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec

Easy and simple project file on mp online

一比一原版阿德莱德大学毕业证成绩单如何办理

Clustering Models to Assist in Student Outreach

1. February 8, 2022

2. 1. Why clustering? 2. Example 1, hierarchical clustering 3. Questions 4. Overview of common algorithms and strengths/limitations 5. Example 2, K-means clustering 6. Questions 2

3. 3

4. 4 Reverse looking What are characteristics of students who used/did not use a service? Did/did not persist? Who is the prototypical student in each academic program? Forward looking What characteristics comprise our prospective student personas? How do risk factors overlap in some groups of students?

5. 5

6. The challenge Contrary to national trends, Dallas College has experienced larger than typical drops in female student re-enrollment patterns. Our goal was to stem decreases for the Fall 2021 semester through a text campaign. The question What messaging should be used to resonate with these students? 6

7. 45k students Diverse features for the model • Demographics: race/ethnicity, gender, age • Financial: income, employment intensity • Household: household size, dependent count • Academics: last term of enrollment, credits, GPA • Special Pop membership 7

8. 8 Step 1 Starts with a case as leaf node Each case goes into that leaf node or breaks into a new node Result is a Cluster Features tree Step 2 Leaf nodes are combined through agglomeration Result is several “best” clusters with a silhouette score

9. 9

10. 10 African Am. 3 dependents Part-time job Hispanic 4 dependents Part-time job Hispanic 2 dependents Full-time job White 1 dependent Full-time job

11. 11

12. BIRCH and Two-step (SPSS proprietary; Ex 1) K-means, K-modes, K-medoids (Ex 2) Hierarchical (Agglomerative and Divisive) 12

13. 13

14. 14

15. 15 Clusters based on distance cutoff line Distance cutoff line

16. Algorithm Variable Types BIRCH / 2-Step Any, multiple at the same time Hierarchical Any, but stick to one kind at a time and use well-paired distance measure K-means Continuous K-modes Categorical K-medoids Any, multiple at the same time with the right distance measure (Gower) 16

17. 17 Algorithm +/- BIRCH / 2-Step Flexible variable types, fast compute, large data; there is an element of a black box Hierarchical Flexible modeling, highly explainable; limited to small data K-Means, K-Modes Easy, fast, large data; sensitive to K, curse of dimensionality, clusters same sized K-medoids Like K-Means but more flexible variable types and more costly tuning & compute

18. 18

19. • Population: 24 years and older adults enrolled for Fall 2021 • Features : Academics, Demographic, Financial , Household and Veteran status. • Mix of discrete and continuous variables with majority of them being categorical data. 19 Below Federal poverty Level Below Median Lower half Above Median Upper half Above Median Median household income in Texas Poverty Flag INCOME BIN

20. 20 V1 V2 V3 V4 V5 0 1 1 0 1 V1 V2 V3 V4 V5 1 0 0 0 1 1+1+1+1 = 4 Dissimilarity measure 0: Mismatch 1: Match • Randomly select the K initial centers • Repeat 1. Assign the samples to nearest center 2. Update means/modes based on newly formed cluster 3. Calculate the cost (SSE/Sum of dissimilarity) • Stop when cluster centers converges Using frequency-based method to calculate mode instead of the mean of the sample Minimize the cost function Sample 1 Sample 2

21. 21 Student Personas

22. 22 Missing values for categorical variables such as employment status Missing values for continuous variables such as income Possible solutions • Revisit our data warehouse to obtain as much information as we can to fill in the missing values • Imputation with K-Nearest neighbors with Hamming distance for categorical data • Imputation with K-Nearest neighbors with Euclidean distance for continuous data

23.

24.

25. Jeremy Anderson, Associate Vice Chancellor of Strategic Analytics, jeremy.anderson@dcccd.edu Dillon Lu, Data Analyst, DLu@dcccd.edu

Hinweis der Redaktion

REFER to recent Dallas Morning News Article and its Framing Give brief overview – focused on 2 components – Data and Demographics by Core Populations AND Interventions to help support all students.
You have a population in n dimensions of space Your model breaks that into subpopulations that are self-similar and distinct from other subpopulations There are many algorithms that attack this basic task in different ways
The idea with all of these kinds of research question is that you would take a very large population and break it down into smaller groups. Then, you customize the messaging to fit the cluster personas. Used a lot in the marketing world to create customer personas, but has applications to any kind of “product” like academic programs and student services.
The case count was fairly high. 45k+ female students had enrolled in prior three terms, had not attained a credential or transferred, and were not enrolled for the Fall. Some variables like income and household size are highly missing if we rely just on FAFSA because only about half of our students complete it. Instead, we supplement by using a homegrown Student Information Profile that goes out 1x a year and asks those and other questions. Part of data preparation is merging the two data streams, which also is a challenge because they are binned as categorical in the SIP but are continuous in FAFSA. The other variables are highly available and mostly clean, so very little cleanup was necessary. Other than that, we do some regular feature engineering around membership in special populations like athletes, international students, foster students, veterans, etc. That's code that we've written and reuse across projects to save some time. After the data prep, I stepped back and saw that the features for consideration were all mixed. Some were binary and others categorical or continuous. For that reason, and for the size of the data, I chose the two-step clustering algorithm in SPSS because it works well under both conditions and because, as a social scientist by training, I am most familiar with SPSS as an analysis tool. With the data ready, I then played around with the features to maximize the evaluation score I was seeing. Loaded them all in at once, then looked at the cluster overlap charts which is a feature in SPSS. Took out ones that didn’t have strong separation and was left with a collection of features that would be most useful/impactful for final modeling.
The tool I chose to use was SPSS because that's my bread and butter as a social scientist by trade. It has a couple of built-in options for clustering. In step 1, the similarity score, based on the distance between the case and the node values, is the determining factor. If the distance is above the algorithm's threshold and the node would spread to wide to incorporate it, then the data point goes into a newly created node. In step 2, the nodes from the tree start to get clumped together, also based on their distances. Lots of different cluster counts are tried and evaluated automatically with one of the available clustering criteria that you can choose from in the model settings in SPSS Ultimately, you get a set of the best clusters for the data. The model picks the number of clusters automatically through the Cluster Tree creation and the evaluation of the agglomerations.
I got my silhouette measure that gave me a sense of how well the points in each cluster were self-similar and how much each cluster was different from the others. Score was ranging from 0.6 to 0.7, which falls in the good range Was able to click into the scoring for details. That’s the clusters matrix that shows the clusters as columns and the features used to build the clusters as rows Can click on one of the cells to see that feature’s overall distribution and what’s present in the cluster in the Cluster Distribution visual Can click multiple clusters to see how they compare in a visual way in the Cluster Comparison
Hispanic, Black, or White, with 3-4 dependents, employed part-time Hispanic, 3 dependents, employed full-time Black or White, 1-2 dependents, employed full-time Overriding message was POC with more dependents, especially if employed less than full-time. The hypothesis there was that we should focus messaging on emergency aid Whereas full-time workers with fewer dependents might benefit from an awareness campaign about our childcare assistance. Really, though, we needed to test these assumptions. Used these clusters to then create a call list of ~350 students with the aim of talking to 50 (10-15 per cluster). The phone interviews were short, about 10 minutes, and guided by a set of template questions that our Student Success Research team and our student success staff worked on together. Results from the qualitative discussions left us with a collection of themes for each cluster and some prospective talking points to use in the text campaign. Ended up with 5600 students responding to the text campaign and re-enrolling, equating to an 18% response
Before we take a quick tour of some other clustering algorithms and their strengths and weaknesses, let's pause for one or two comments or questions about our first example. We will also have some time at the end for other questions.
Today, we're looking at three very common categories of clustering algorithms, though there are others that are applicable to other types of challenges. Already covered two-step, which is similar to BIRCH, in example 1. The point there is that it blends the approach of the other types of algorithms listed to get the best of both worlds. For comparison, then, we’ll look at these other two common approaches in the K-algorithms and hierarchical algorithm.
The premise of all three of these is you pick a number of K (clusters). Rather than pick at random, two good ways to narrow in on the best K for the data are: elbow method – for this it would be somewhere between 6 and 8 that you might want to try Silhouette Coefficient The cluster picking methods and the algorithms in general are available in Python's Scikit-learn package and they're in R, but it'll take a few packages
So, how does it work, generally? K-means places points as the center of the clusters, whereas K-medoids chooses the most representative case. K-medoids is less sensitive to outliers, like using the median rather than the mean, so it may be preferable to K-means depending on your data K-mode, meanwhile, looks for the number of similarities between data points when considering the different features, but it still collapses all of this into a pre-determined number of clusters that you set.
Match up pairs of cases based on distance between them Then match up pairs of pairs, also based on distance Pick what distance you’re okay with and use that to determine the clusters. That's where it's different from Two-Step and BIRCH which will do that automatically. Divisive flips this on its head and starts with all of the variables in one Available in Python scikit-learn and R
Divide our student body into groups so we can look at each specific group more closely and identify what they need and what can we do to help them reach their educational goals.
K-mode clustering Determine initial number of clusters K Minimize cost function while maintain relatively small value for K. Purpose of study is that we want to divide our student body into large groups so we can study each group more effectively therefore to have a better understanding of our student body as a whole. We prefer generalization rather than looking at specific cases. Choose 4 in this case and move on to the next page
This information can help us to develop strategic plans that focus on a particular group of student to help them to achieve their goals and be successful here at our institution.
Many students with missing income end up in the second income bin due to nature of our dataset design Revisit our data and eliminate as many missing value as possible to obtain a more complete data should help greatly to have our students more evenly distributed in INCOME_BIN variable

Clustering Models to Assist in Student Outreach

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Ähnlich wie Clustering Models to Assist in Student Outreach

Ähnlich wie Clustering Models to Assist in Student Outreach (20)

Mehr von Jeremy Anderson

Mehr von Jeremy Anderson (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Clustering Models to Assist in Student Outreach

Hinweis der Redaktion