JUDGE, JURY AND CLASSIFIER

JUDGE, JURY AND CLASSIFIER: An Introduction to Trees
15.071x The Analytics Edge

The American Legal System
- The legal system of the United States operates at the state level and at the federal level
- Federal courts hear cases beyond the scope of state law
- Federal courts are divided into:
  - District Courts: make the initial decision
  - Circuit Courts: hear appeals from the district courts
  - Supreme Court: the highest level, makes the final decision
15.071x Judge, Jury and Classifier: An Introduction to Trees

The Supreme Court of the United States
- Consists of nine judges ("justices"), appointed by the President
  - Justices are distinguished judges, professors of law, and state and federal attorneys
- The Supreme Court of the United States (SCOTUS) decides the most difficult and controversial cases
  - These often involve interpretation of the Constitution
  - They carry significant social, political and economic consequences

Notable SCOTUS Decisions
- Wickard v. Filburn (1942): Congress allowed to intervene in industrial/economic activity
- Roe v. Wade (1973): legalized abortion
- Bush v. Gore (2000): decided the outcome of the presidential election
- National Federation of Independent Business v. Sebelius (2012): upheld the requirement of the Patient Protection and Affordable Care Act ("ObamaCare") that individuals must buy health insurance

Predicting Supreme Court Cases
- Legal academics and political scientists regularly make predictions of SCOTUS decisions from detailed studies of cases and individual justices
- In 2002, Andrew Martin, a professor of political science at Washington University in St. Louis, decided instead to predict decisions using a statistical model built from data
- Together with his colleagues, he decided to test this model against a panel of experts

Predicting Supreme Court Cases
- Martin used a method called Classification and Regression Trees (CART)
- Why not logistic regression?
  - Logistic regression models are generally not interpretable
  - Model coefficients indicate the importance and relative effect of variables, but do not give a simple explanation of how a decision is made

Data
- Cases from 1994 through 2001
- In this period, the same nine justices presided over SCOTUS
  - Breyer, Ginsburg, Kennedy, O'Connor, Rehnquist (Chief Justice), Scalia, Souter, Stevens, Thomas
  - A rare data set: the longest period of time with the same set of justices in over 180 years
- We will focus on predicting Justice Stevens' decisions
  - Started out moderate, but became more liberal
  - Self-proclaimed conservative

Variables
- Dependent variable: did Justice Stevens vote to reverse the lower court decision?
  - 1 = reverse, 0 = affirm
- Independent variables: properties of the case
  - Circuit court of origin (1st to 11th, DC, FED)
  - Issue area of the case (e.g., civil rights, federal taxation)
  - Type of petitioner, type of respondent (e.g., US, an employer)
  - Ideological direction of the lower court decision (conservative or liberal)
  - Whether the petitioner argued that a law/practice was unconstitutional

Logistic Regression for Justice Stevens
- Some significant variables and their coefficients:
  - Case is from the 2nd circuit court: +1.66
  - Case is from the 4th circuit court: +2.82
  - Lower court decision is liberal: -1.22
- This is complicated
  - Difficult to understand which factors are more important
  - Difficult to quickly evaluate what the prediction is for a new case

Classification and Regression Trees
- Build a tree by splitting on variables
- To predict the outcome for an observation, follow the splits and, at the end, predict the most frequent outcome
- Does not assume a linear model
- Interpretable

Splits in CART
[Figure: scatter plot of Independent Variable Y (axis from 13 to 25) against Independent Variable X (axis from 25 to 115). Split 1, Split 2 and Split 3 divide the plane into regions labeled "Predict Red" or "Predict Gray".]

Final Tree
[Figure: the same scatter plot of Independent Variable Y against Independent Variable X with Split 1, Split 2 and Split 3, shown next to the corresponding tree.]
X < 60?
  Yes: Red
  No: Y < 20?
    Yes: X < 85?
      Yes: Red
      No: Gray
    No: Gray
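
A tree like this is just a set of nested rules. The sketch below assumes the flattened slide reads as: X < 60 gives Red; otherwise Y < 20 leads to a further split on X < 85 (Red if yes, Gray if no), and Y >= 20 gives Gray. The function name `predict_color` is hypothetical, and Python stands in for the course's R.

```python
def predict_color(x, y):
    """Follow the three splits and return the prediction of the leaf reached."""
    if x < 60:        # Split 1
        return "Red"
    if y < 20:        # Split 2
        if x < 85:    # Split 3
            return "Red"
        return "Gray"
    return "Gray"


# One illustrative point per leaf (made-up coordinates, not data from the slide).
for point in [(30, 15), (70, 15), (100, 15), (70, 22)]:
    print(point, predict_color(*point))
```

Each path from the root to a leaf is one readable rule, which is the sense in which the slides call CART interpretable.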

When Does CART Stop Splitting?
- There are different ways to control how many splits are generated
  - One way is by setting a lower bound for the number of points in each subset
- In R, a parameter that controls this is minbucket
  - The smaller it is, the more splits will be generated
  - If it is too small, overfitting will occur
  - If it is too large, the model will be too simple and accuracy will be poor
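
To make the effect of a minimum bucket size concrete, here is a deliberately simplified sketch: a greedy splitter on a single variable that uses Gini impurity and refuses any split leaving fewer than `minbucket` points on either side. It is not R's implementation; `count_leaves`, the toy data and the tolerance are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def count_leaves(points, minbucket):
    """Greedily split (x, label) pairs on x and return the number of leaves.

    A split is considered only if both sides keep at least `minbucket` points.
    """
    points = sorted(points)
    labels = [label for _, label in points]
    best = None  # (weighted impurity after the split, split position)
    for i in range(minbucket, len(points) - minbucket + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot separate observations with equal x
        left, right = labels[:i], labels[i:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, i)
    # Stop when no split is allowed or the best split does not reduce impurity.
    if best is None or best[0] >= gini(labels) - 1e-12:
        return 1
    i = best[1]
    return count_leaves(points[:i], minbucket) + count_leaves(points[i:], minbucket)


# Toy data: x = 1..12 with alternating runs of Red and Gray labels.
data = list(zip(range(1, 13), "RRRGGGRRRGGG"))
for minbucket in (1, 4, 7):
    print("minbucket =", minbucket, "-> leaves:", count_leaves(data, minbucket))
```

On this toy data the number of leaves shrinks as `minbucket` grows, and once `minbucket` exceeds half the data no split is possible at all.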

Predictions from CART
- In each subset, we have a bucket of observations, which may contain both outcomes (i.e., affirm and reverse)
- Compute the percentage of data in a subset of each type
  - Example: 10 affirm, 2 reverse → 10/(10+2) ≈ 0.83
- Just like in logistic regression, we can threshold to obtain a prediction
  - A threshold of 0.5 corresponds to picking the most frequent outcome
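
The arithmetic for a single bucket can be sketched as follows. The function name `leaf_prediction` is hypothetical; it reports the fraction of "affirm" observations, as in the example with 10 affirm and 2 reverse (10/12, roughly 0.83), and then applies a threshold.

```python
def leaf_prediction(n_affirm, n_reverse, threshold=0.5):
    """Return the fraction of 'affirm' cases in a bucket and the thresholded label."""
    p_affirm = n_affirm / (n_affirm + n_reverse)
    label = "affirm" if p_affirm >= threshold else "reverse"
    return p_affirm, label


p, label = leaf_prediction(10, 2)
print(round(p, 2), label)              # threshold 0.5 picks the majority outcome
print(leaf_prediction(10, 2, 0.9)[1])  # a stricter threshold can flip the label
```

Raising the threshold above the bucket's proportion changes the predicted label even though the bucket itself is unchanged.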

ROC curve for CART
- Vary the threshold to obtain an ROC curve
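
One way to picture "varying the threshold" is to sweep it over the distinct leaf proportions and record the false and true positive rates at each step. The sketch below uses made-up scores and labels (1 = positive class); `roc_points` is a hypothetical helper, not a library function.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start above every score (nothing predicted positive), then lower the bar.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical tree with three leaves, so only three distinct scores occur.
scores = [0.9, 0.9, 0.6, 0.6, 0.2, 0.2]
labels = [1, 1, 1, 0, 0, 1]
for fpr, tpr in roc_points(scores, labels):
    print(fpr, tpr)
```

Because every observation in a leaf shares the same score, a tree's ROC curve has at most one point per leaf, plus the (0, 0) corner.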

Random Forests
- Designed to improve the prediction accuracy of CART
- Works by building a large number of CART trees
  - Makes the model less interpretable
- To make a prediction for a new observation, each tree votes on the outcome, and we pick the outcome that receives the majority of the votes
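
The voting step can be sketched in a few lines, assuming each tree has already produced its own prediction. The name `forest_predict` and the example votes are hypothetical.

```python
from collections import Counter


def forest_predict(tree_votes):
    """Return the outcome predicted by the largest number of trees."""
    return Counter(tree_votes).most_common(1)[0][0]


votes = ["reverse", "affirm", "reverse", "reverse", "affirm"]
print(forest_predict(votes))
```

With two outcomes, using an odd number of trees avoids ties; in this sketch a tie would be resolved in favor of whichever outcome was seen first.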

Building Many Trees
- Each tree can split on only a random subset of the variables
- Each tree is built from a "bagged"/"bootstrapped" sample of the data
  - Select observations randomly with replacement
  - Example original data: 1 2 3 4 5
  - New data:
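
Both sources of randomness can be sketched with the standard library. The helper names and the variable names in the list are hypothetical, and the seed is fixed only so that the sketch is repeatable.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) observations at random, with replacement."""
    return [rng.choice(data) for _ in data]


def variable_subset(variables, k, rng):
    """Pick k distinct variables that a tree is allowed to split on."""
    return rng.sample(variables, k)


rng = random.Random(0)  # fixed seed: an assumption for repeatability
original = [1, 2, 3, 4, 5]
print(bootstrap_sample(original, rng))
print(variable_subset(["Circuit", "Issue", "Petitioner", "Respondent"], 2, rng))
```

Because sampling is with replacement, a bootstrapped sample has the same size as the original data but may repeat some observations and leave others out.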

Random Forest Parameters
- Minimum number of observations in a subset
  - In R, this is controlled by the nodesize parameter
  - A smaller nodesize may take longer in R
- Number of trees
  - In R, this is the ntree parameter
  - Should not be too small, because the bagging procedure may miss observations
  - More trees take longer to build

Parameter Selection
- In CART, the value of minbucket can affect the model's out-of-sample accuracy
- How should we set this parameter?
- We could select the value that gives the best testing set accuracy
  - This is not right!

K-fold Cross-Validation
- Given the training set, split it into k pieces (here k = 5)
- Use k-1 folds to estimate a model, and test the model on the remaining fold (the "validation set") for each candidate parameter value
- Repeat for each of the k folds
[Figure: the whole training set divided into Fold 1 to Fold 5; for example, Fold 4 is predicted from Folds 1, 2, 3 and 5.]
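
The splitting step can be sketched as index bookkeeping. This is a minimal illustration, not the routine used in the course: `k_fold_indices` is a hypothetical helper that deals observation indices into k folds in round-robin order, where real implementations usually shuffle first.

```python
def k_fold_indices(n, k):
    """Yield (training indices, validation indices) for each of the k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]  # round-robin assignment
    for held_out in range(k):
        validation = folds[held_out]
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, validation


for training, validation in k_fold_indices(10, 5):
    print("train on", training, "validate on", validation)
```

Every observation lands in exactly one validation fold, so each one is predicted once by a model that never saw it.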

Output of k-fold Cross-Validation
[Figure: accuracy (axis from 0 to 0.8) plotted against parameter value (1 to 20), with one curve for each of Fold 1 to Fold 5 and one curve for their average.]
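
Reading such a plot amounts to averaging the per-fold accuracies for each candidate value and keeping the value with the best average. The accuracies below are invented for illustration and `best_parameter` is a hypothetical helper.

```python
def best_parameter(fold_accuracy):
    """Return the parameter value with the highest average accuracy across folds."""
    averages = {value: sum(acc) / len(acc) for value, acc in fold_accuracy.items()}
    return max(averages, key=averages.get), averages


# Hypothetical accuracies on 5 validation folds for two candidate values.
fold_accuracy = {
    5: [0.6, 0.7, 0.5, 0.6, 0.6],
    10: [0.7, 0.7, 0.6, 0.7, 0.8],
}
best, averages = best_parameter(fold_accuracy)
print(best, averages)
```

The testing set is never consulted in this choice, which is why the selected value can still be evaluated honestly on it afterwards.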

Cross-Validation in R
- Before, we limited our tree using minbucket
- When we use cross-validation in R, we'll use a parameter called cp instead
  - cp stands for complexity parameter
- Like adjusted R² and AIC
  - Measures the trade-off between model complexity and accuracy on the training set
  - A smaller cp leads to a bigger tree (which might overfit)

Martin's Model
- Used 628 previous SCOTUS cases between 1994 and 2001
- Made predictions for the 68 cases that would be decided in the October 2002 term, before the term started
- Two-stage approach based on CART:
  - First stage: one tree to predict a unanimous liberal decision, another tree to predict a unanimous conservative decision
    - If the predictions conflict or both predict no, move to the next stage
  - Second stage: predict the decision of each individual justice, and use the majority decision as the prediction
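
The control flow of the two stages can be sketched as below. This only illustrates the logic described on the slide; the function name, argument names and labels are hypothetical, and the inputs stand in for the outputs of the actual trees.

```python
from collections import Counter


def predict_case(unanimous_liberal, unanimous_conservative, justice_votes):
    """Combine the two first-stage trees with the per-justice predictions.

    unanimous_liberal / unanimous_conservative: booleans from the two
    first-stage trees. justice_votes: one predicted direction per justice.
    """
    # Stage 1: accept a unanimous prediction only if exactly one tree says yes.
    if unanimous_liberal and not unanimous_conservative:
        return "liberal"
    if unanimous_conservative and not unanimous_liberal:
        return "conservative"
    # Stage 2: conflicting or double-no predictions fall back to a majority vote.
    return Counter(justice_votes).most_common(1)[0][0]


votes = ["liberal"] * 4 + ["conservative"] * 5
print(predict_case(True, False, votes))   # settled in stage 1
print(predict_case(False, False, votes))  # settled by the nine predicted votes
```

With nine justices and two possible directions, the second-stage vote cannot tie.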

Tree for Justice O'Connor
Is the lower court decision liberal?
  Yes: Reverse
  No: Is the case from the 2nd, 3rd, DC or Federal Circuit Court?
    Yes: Affirm
    No: Is the Respondent the US?
      Yes: Reverse
      No: Is the primary issue civil rights, First Amendment, econ. activity or federalism?
        Yes: Reverse
        No: Affirm

Tree for Justice Souter
Is Justice Ginsburg's predicted decision liberal?
  Yes (make a liberal decision): Is the lower court decision liberal?
    Yes: Affirm
    No: Reverse
  No (make a conservative decision): Is the lower court decision liberal?
    Yes: Reverse
    No: Affirm

The Experts
- Martin and his colleagues recruited 83 legal experts
  - 71 academics and 12 attorneys
  - 38 previously clerked for a Supreme Court justice, 33 were chaired professors and 5 were current or former law school deans
- Experts were only asked to predict within their area of expertise; more than one expert was assigned to each case
- Allowed to consider any source of information, but not allowed to communicate with each other regarding predictions

The Results
- For the 68 cases in October 2002:
- Overall case predictions:
  - Model accuracy: 75%
  - Experts' accuracy: 59%
- Individual justice predictions:
  - Model accuracy: 67%
  - Experts' accuracy: 68%

The Analytics Edge
- Predicting Supreme Court decisions is very valuable to firms, politicians and non-governmental organizations
- A model that predicts these decisions is both more accurate and faster than experts
  - A CART model based on very high-level details of a case beats experts who can process much more detailed and complex information