Classifier Evaluation and Selection. Review and Overview of Methods


Things to consider
- Interpretation vs. prediction
- Model parsimony vs. model error
- Type of prediction task:
  - Decisions: interested only in the resulting classification
  - Rankings: interested in ranking individuals by their true likelihood of an outcome
  - Estimates: interested in accurately predicting probabilities or a continuous outcome

Model Fit Statistics Summary

Prediction Type | Model Fit Statistics
Decisions | Accuracy/Misclassification; Profit/Loss; KS Statistic
Rankings | ROC Index (concordance statistic); Gini Coefficient
Estimates | Average Squared Error; SBC/Likelihood; MAPE; R²

Confusion Matrix
Metrics from the confusion matrix:
1. Accuracy: proportion of total predictions that were correct
2. Precision / Positive Predictive Value: proportion of predicted positives that were actually positive
3. Negative Predictive Value: proportion of predicted negatives that were actually negative
4. Sensitivity / Recall: proportion of actual positive cases that are correctly identified
5. Specificity: proportion of actual negative cases that are correctly identified
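The five metrics above can be sketched in a few lines of Python. This is a minimal illustration, not SAS EM output: the function names and the ten example labels are made up, and labels are assumed to be coded 1 (positive) and 0 (negative).

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for labels coded 1/0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def confusion_metrics(actual, predicted):
    """Return the five confusion-matrix metrics as a dictionary."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)

    def ratio(num, den):
        # Avoid dividing by zero when a row or column of the matrix is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),      # positive predictive value
        "npv": ratio(tn, tn + fn),            # negative predictive value
        "sensitivity": ratio(tp, tp + fn),    # recall
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical example: 4 actual positives, 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = confusion_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```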

Kolmogorov-Smirnov (KS) Statistic
[Figure: cumulative % of negative (NEG) and positive (POS) observations plotted against the predicted probability from the model. In the example, 80% of negative observations have a predicted probability < 48%, while 25% of positive observations have a predicted probability < 48%.]

The KS statistic is the maximum distance between the cumulative NEG % and cumulative POS % curves.
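As a rough sketch of that definition, the KS statistic can be computed by scanning candidate cutoffs and comparing the cumulative share of negatives and positives at or below each one. The function name and the ten scored observations are hypothetical, and this brute-force loop is meant for reading, not for large data.

```python
def ks_statistic(actual, scores):
    """Largest gap between the cumulative score distributions of the two classes.

    Returns (gap, cutoff) where cutoff is the score at which the gap is largest.
    """
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    best_gap, best_cut = 0.0, None
    for cut in sorted(set(scores)):
        # Share of each class with a predicted probability at or below the cutoff.
        cum_pos = sum(1 for s in pos if s <= cut) / len(pos)
        cum_neg = sum(1 for s in neg if s <= cut) / len(neg)
        gap = abs(cum_neg - cum_pos)
        if gap > best_gap:
            best_gap, best_cut = gap, cut
    return best_gap, best_cut


# Hypothetical scored data: 1 = positive outcome, 0 = negative outcome.
actual = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.70, 0.80, 0.90, 0.95]
ks, cutoff = ks_statistic(actual, scores)
print(f"KS = {ks:.2f} at predicted probability {cutoff}")
```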

ROC Charts
- Each point on the ROC curve corresponds to a fraction of cases, ordered by decreasing predicted value. The (x, y) coordinates assume we predict that fraction of cases to be positive.
- For example, a point might represent the 40% of cases with the highest predicted probabilities.
- If 70% of the actual positive outcome cases are captured in that fraction, the True Positive Rate is 0.7.
- If ~10% of the actual negative outcome cases are captured in that fraction, the False Positive Rate is 0.1.

Gini Coefficient
Gini = 2 × (shaded area between the ROC curve and the diagonal) = 2 × (AUC − 0.5)
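To make the ROC construction and the Gini relation concrete, here is a small sketch that walks down the cases in order of decreasing score, records one (FPR, TPR) point per case, and integrates the curve with the trapezoid rule. The function names and data are illustrative, and the code assumes no tied scores.

```python
def roc_points(actual, scores):
    """One (FPR, TPR) point for each 'top k cases predicted positive', k = 0..n."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if actual[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scored data: 1 = positive outcome, 0 = negative outcome.
actual = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.70, 0.80, 0.90, 0.95]

points = roc_points(actual, scores)
auc = auc_trapezoid(points)
gini = 2 * (auc - 0.5)
print(f"AUC = {auc:.2f}, Gini = {gini:.2f}")
```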

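ROC Charts for Decision Trees
Example with leaf probabilities $p = 3/4$ and $p = 1/3$. Each range of the cutoff gives one point on the ROC chart:
- $p_{\text{cutoff}} = 1$: TPR = 0, FPR = 0
- $1/3 < p_{\text{cutoff}} < 3/4$: TPR = 0.6, FPR = 0.2
- $p_{\text{cutoff}} < 1/3$: TPR = 1, FPR = 1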

ROC Charts for Decision Trees
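Because a tree assigns one probability per leaf, its ROC chart has only as many interior points as there are distinct leaf probabilities. The sketch below reproduces the three points from the example. The leaf counts (3 positives and 1 negative, 2 positives and 4 negatives) are an assumption chosen so that the leaf probabilities are 3/4 and 1/3 and the middle point is TPR = 0.6, FPR = 0.2; the original counts are not given.

```python
from fractions import Fraction

# Hypothetical leaf counts consistent with the example's probabilities and rates.
leaves = [
    {"positives": 3, "negatives": 1},  # leaf probability 3/4
    {"positives": 2, "negatives": 4},  # leaf probability 1/3
]


def tree_roc_point(leaves, cutoff):
    """(FPR, TPR) when every leaf with probability >= cutoff is predicted positive."""
    total_pos = sum(leaf["positives"] for leaf in leaves)
    total_neg = sum(leaf["negatives"] for leaf in leaves)
    tp = fp = 0
    for leaf in leaves:
        p = Fraction(leaf["positives"], leaf["positives"] + leaf["negatives"])
        if p >= cutoff:
            tp += leaf["positives"]
            fp += leaf["negatives"]
    return Fraction(fp, total_neg), Fraction(tp, total_pos)


for cutoff in (Fraction(1), Fraction(1, 2), Fraction(1, 4)):
    fpr, tpr = tree_roc_point(leaves, cutoff)
    print(f"cutoff = {cutoff}: FPR = {float(fpr)}, TPR = {float(tpr)}")
```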

Response/Gain Charts
[Figure: cumulative % responders plotted against percentile of modeled values.]
- Of the top 18% of observations by predicted probability, 90% are responders (positive outcomes).
- The overall population response rate is ~27%.

Lift Chart
While it's great to know what percent of responders you should get using the top p% of observations scored by the model, it's even better to know how this compares to random selection.

Lift = (% Responders from Model) / (% Responders from Random Selection)

Cumulative Lift
[Figure: cumulative lift plotted against depth (percentile of modeled values). At a depth of ~20%, the lift is almost 3.5.]

If we target the top 20% of customers as scored by our model, we'll get about 3.5 times as many responders as we would if we randomly targeted customers.
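A minimal sketch of the cumulative lift calculation: take the response rate among the top-scored fraction of observations and divide it by the overall response rate, which is what random selection would yield on average. The function name, the `depth` parameter, and the data are illustrative.

```python
def cumulative_lift(actual, scores, depth):
    """Response rate in the top `depth` fraction divided by the overall rate."""
    n_top = max(1, int(round(depth * len(scores))))
    # Rank observations from highest to lowest predicted probability.
    ranked = sorted(zip(scores, actual), reverse=True)
    response_top = sum(a for _, a in ranked[:n_top]) / n_top
    response_all = sum(actual) / len(actual)
    return response_top / response_all


# Hypothetical data: 3 responders among 10 customers.
actual = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]

lift = cumulative_lift(actual, scores, depth=0.2)
print(f"Lift at 20% depth: {lift:.2f}")
```

Applied to the values read off the charts above, a 90% response rate in the top group against a ~27% overall rate gives a lift of roughly 0.90 / 0.27 ≈ 3.3, broadly in line with the "almost 3.5" annotation given that both numbers are read from a plot.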

Average Squared Error (ASE)

$$\mathrm{ASE} = \frac{1}{nL} \sum_{i=1}^{n} \sum_{j=1}^{L} \left( y_{ij} - \hat{y}_{ij} \right)^2$$

- For class targets, let $L$ be the number of levels in the target.
- This objective function sets $y_{ij} = 1$ if observation $i$ takes level $j$ of the target and 0 otherwise.
- Computes the sum of squared errors using the predicted probabilities $\hat{y}_{ij}$.

Example:

Name | P(red) | P(blue) | P(none) | Actual
JimBob | 0.3 | 0.4 | 0.3 | BLUE
BillyBob | 0.1 | 0.5 | 0.4 | NONE

$$\mathrm{ASE} = \frac{(0-0.3)^2 + (1-0.4)^2 + (0-0.3)^2 + (0-0.1)^2 + (0-0.5)^2 + (1-0.4)^2}{2 \cdot 3}$$
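The JimBob/BillyBob example can be checked with a short sketch. The function and variable names are illustrative; the probabilities and actual levels are the ones in the example.

```python
def average_squared_error(probabilities, actual_levels, levels):
    """Mean of (indicator - predicted probability)^2 over all observations and levels."""
    n = len(probabilities)
    n_levels = len(levels)
    total = 0.0
    for probs, actual in zip(probabilities, actual_levels):
        for level in levels:
            y = 1.0 if actual == level else 0.0  # indicator y_ij
            total += (y - probs[level]) ** 2
    return total / (n * n_levels)


levels = ["red", "blue", "none"]
probabilities = [
    {"red": 0.3, "blue": 0.4, "none": 0.3},  # JimBob
    {"red": 0.1, "blue": 0.5, "none": 0.4},  # BillyBob
]
actual_levels = ["blue", "none"]

ase = average_squared_error(probabilities, actual_levels, levels)
print(f"ASE = {ase:.4f}")
```

The six squared terms sum to 1.16, so dividing by 2 × 3 gives an ASE of about 0.193.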

Decisions: Accounting for Profit/Loss (or other external evaluation metrics)

Decisions in SAS EM
- Enter information about profit/loss into the decisions on a dataset panel.
- Enterprise Miner calculates the most profitable or least costly decision for each observation.
- Click Build when first opening the prompt, then open the Decisions tab.

Decisions in SAS EM
- Decision and cost matrices do not affect:
  - Estimating parameters in the Regression node
  - Learning weights in the Neural Network node
  - Growing decision trees
  - Fit statistics
  - Residuals, error functions, misclassification rate
- Decision and cost matrices do affect:
  - Choice of models in the Regression node
  - Pruning trees in the Decision Tree node
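The idea of choosing the most profitable decision for each observation can be sketched outside of SAS EM as an expected-profit calculation: weight each cell of a profit matrix by the posterior probability of the corresponding outcome and pick the decision with the largest total. The profit values, decision names, and function name below are hypothetical and this is not Enterprise Miner's own code.

```python
# Hypothetical profit matrix: profit[actual outcome][decision taken].
profit = {
    "event":    {"target": 10.0, "ignore": 0.0},
    "nonevent": {"target": -1.0, "ignore": 0.0},
}


def best_decision(p_event, profit):
    """Return (decision, expected profit) maximizing expected profit for one observation."""
    posteriors = {"event": p_event, "nonevent": 1.0 - p_event}
    decisions = list(profit["event"].keys())
    expected = {
        d: sum(posteriors[outcome] * profit[outcome][d] for outcome in posteriors)
        for d in decisions
    }
    choice = max(expected, key=lambda d: expected[d])
    return choice, expected[choice]


for p in (0.05, 0.20, 0.60):
    decision, value = best_decision(p, profit)
    print(f"P(event) = {p:.2f}: decide '{decision}', expected profit {value:.2f}")
```

With these made-up profits, targeting pays off once the posterior probability exceeds 1/11 (about 0.09), which illustrates why a profit matrix can imply a cutoff far from 0.5.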

Undersampling/Oversampling and Prior Probabilities: can be accounted for automatically in SAS EM

Undersampling and Prior Probabilities
- Say you have a rare event as the target (<10% of the data), for example:
  - Fraud
  - Catastrophic failure
  - A single-day change of 10% or more in the value of a stock market index
- You may have trouble modelling, because a model can be highly accurate simply by classifying everything as a nonevent!
- Potential solution: create a biased sample.
  - Under-represent the common events in the training data.
  - Keep all rare events and only an equal number of common events.

Undersampling and Prior Probabilities
- Models provide posterior probabilities for events.
- The accuracy of the posterior probabilities relies on a representative sample.
- If we bias our sample, we must adjust the posterior probabilities to account for this.

Undersampling and Prior Probabilities
- Let $l_1, l_2, \dots, l_L$ be the levels of the target variable.
- Let $i = 1, 2, \dots, n$ index the observations in the data.
- Let $\mathrm{OldPost}(i, l)$ be the posterior probability from the model on the oversampled data.
- Let $\mathrm{OldPrior}(l)$ be the proportion of the target level in the oversampled data.
- Let $\mathrm{Prior}(l)$ be the correct proportion of the target level in the true population.

$$\mathrm{NewPost}(i, l) = \frac{\mathrm{Prior}(l)\,\dfrac{\mathrm{OldPost}(i, l)}{\mathrm{OldPrior}(l)}}{\sum_{j=1}^{L} \mathrm{Prior}(l_j)\,\dfrac{\mathrm{OldPost}(i, l_j)}{\mathrm{OldPrior}(l_j)}}$$
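The adjustment formula translates directly into code. In this sketch the function name is illustrative, and the priors (a balanced 50/50 training sample versus a true event rate of 5%) are assumed values for demonstration only.

```python
def adjust_posteriors(old_post, old_prior, prior):
    """Rescale posteriors from a biased sample to the true population priors.

    Each argument maps target level -> probability.
    """
    weighted = {
        level: prior[level] * old_post[level] / old_prior[level]
        for level in old_post
    }
    total = sum(weighted.values())  # denominator: sum over all target levels
    return {level: w / total for level, w in weighted.items()}


# Assumed priors: balanced biased sample, 5% event rate in the population.
old_prior = {"event": 0.5, "nonevent": 0.5}
prior = {"event": 0.05, "nonevent": 0.95}

old_post = {"event": 0.9, "nonevent": 0.1}
new_post = adjust_posteriors(old_post, old_prior, prior)
print(f"Adjusted P(event) = {new_post['event']:.4f}")
```

Here a posterior of 0.9 from the balanced sample drops to 0.09 / 0.28, roughly 0.32, once the true 5% event rate is taken into account.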

Entering Priors into SAS EM
- Priors are also adjusted in the decisions on a dataset panel.
- Click Build when first opening the prompt, then click the Priors tab.

Undersampling and Prior Probabilities
- In SAS EM, accounting for priors has no effect on:
  - Estimating parameters in logistic regression
  - Learning weights in a neural network
  - Fit statistics like misclassification rate and average squared error
  - Growing decision trees
- Priors do affect:
  - Pruning decision trees
- Net effects:
  - Increasing a prior probability increases the posterior probability.
  - Decreasing a prior decreases the posterior probability.
  - Changing a prior will have a more noticeable effect if the original posterior is near 0.5 than if it is near 0 or 1.

Oversampling
- Instead of undersampling the common events, we can replicate the rare events in our data.
- We have to be careful to do this after the training/validation split so that we don't have the same observation in both the training and validation sets.
- OR, use a hybrid technique like SMOTE (Chawla, 2002) that creates new data points similar to the rare events (not exact replicates) and also undersamples the common events.
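The ordering point (split first, replicate afterwards) can be illustrated with a small standard-library sketch. The function name, the 70/30 split, and the toy rows are assumptions; this shows plain replication only, not SMOTE.

```python
import random


def split_then_oversample(rows, is_rare, train_fraction=0.7, seed=0):
    """Split into training/validation first, then replicate rare rows in training only."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    train, valid = shuffled[:n_train], shuffled[n_train:]

    rare = [r for r in train if is_rare(r)]
    common = [r for r in train if not is_rare(r)]
    # Draw rare rows with replacement until the two classes are the same size.
    n_extra = max(0, len(common) - len(rare)) if rare else 0
    extra = [rng.choice(rare) for _ in range(n_extra)]
    return train + extra, valid


# Hypothetical data: (id, target) rows with a 10% rare event.
rows = [(i, 1 if i < 10 else 0) for i in range(100)]
train, valid = split_then_oversample(rows, is_rare=lambda r: r[1] == 1)

train_ids = {r[0] for r in train}
valid_ids = {r[0] for r in valid}
print("Observations shared by training and validation:", len(train_ids & valid_ids))
```

Because replication happens only inside the training partition, no replicated observation can leak into the validation set.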

Using the Model Comparison Node in SAS EM

Cutoff Node
- The Cutoff node is used to specify a cutoff probability other than 0.5 when you have decision factors.
- Currently, the Model Comparison node does not use the cutoff probability from the Cutoff node.
- Most of the assessment statistics are not affected anyway, aside from the misclassification rate.

Self Study: Using Enterprise Miner to Determine a Custom Probability Cutoff Profit/Loss or other Decisions

Average Profit on Pred_Yes
- EM can use a decision matrix to compute the average profit per observation.
- This calculation assumes that you have some level of profit/loss for every person in the data and want to average over every person in the data.
- What if you only stand to profit/lose from those observations which you predict positive? That is, nothing ventured, nothing gained (or lost).
- Then you'd want to take the profit from the model and average it only over those who were predicted positive.
- EM cannot use a decision matrix to compute an average profit per positive prediction.
- But we can do it quite easily with the program editor and dataset explorer!

Open Results from Cutoff Node

Open Model Diagnostics Table

Save Model Diagnostics Table


Open Program Editor

Write Program to Calculate Avg. Profit

Run Program

Check Log

Open Explorer

Navigate to Dataset and Open

Sort by Average Profit: find the largest value for the validation data
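The SAS program used in these steps is not reproduced in the transcript, so the following is only a Python sketch of the same idea: for each candidate cutoff, average the profit over the observations predicted positive, then pick the cutoff with the largest average on validation data. The profit of 10 per true positive, the loss of 5 per false positive, the function name, and the data are all assumptions.

```python
def average_profit_on_predicted_yes(actual, scores, cutoff, profit_tp, profit_fp):
    """Average profit over observations predicted positive at the given cutoff."""
    flagged = [a for a, s in zip(actual, scores) if s >= cutoff]
    if not flagged:
        return None  # nothing ventured, nothing gained (or lost)
    total = sum(profit_tp if a == 1 else profit_fp for a in flagged)
    return total / len(flagged)


# Hypothetical validation data: 1 = positive outcome, 0 = negative outcome.
actual = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.70, 0.80, 0.90, 0.95]

cutoffs = [0.2, 0.3, 0.4, 0.5, 0.6]
results = {
    c: average_profit_on_predicted_yes(actual, scores, c, profit_tp=10.0, profit_fp=-5.0)
    for c in cutoffs
}
best_cutoff = max(results, key=lambda c: results[c])
for c in cutoffs:
    print(f"cutoff {c:.1f}: average profit per predicted positive = {results[c]:.2f}")
print("Best cutoff:", best_cutoff)
```

Note that averaging only over predicted positives tends to favor stricter cutoffs, since it rewards precision rather than total profit; whether that is the right objective depends on the business setting.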