Categorical Data Analysis

Jeremy Freese
Department of Sociology
University of Wisconsin-Madison

This syllabus may be subject to additional, presumably minor, revision before or after the course begins. The latest rendition will always be available at the webpage listed below.

Instructor: Jeremy Freese, University of Wisconsin-Madison, jfreese@ssc.wisc.edu
Teaching Assistant: Jason Beckfield, Indiana University, jbeckfie@indiana.edu
Course webpage: http://www.ssc.wisc.edu/~jfreese/cda.htm

This workshop introduces students to current methods for analyzing categorical data, with its principal focus being regression models for categorical outcomes. We will consider models for binary, ordinal, and nominal outcomes, as well as useful and related models for censored and count outcomes. We will discuss the appropriate specification of models, their estimation with statistical software, and the proper and practical interpretation of their results. Computing in the course will primarily use Stata. The course assumes a good working knowledge of the linear regression model for continuous variables, as well as an elementary knowledge of matrix algebra.

Books

This book will serve as our primary reading for much of the course:

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage.

This book would also be really helpful if it were available, but unfortunately it won't be until shortly after our course ends [1]:

Long, J. Scott and Jeremy Freese. 2001. Regression Models for Categorical Outcomes Using Stata. College Station, TX: Stata Press.

[1] Even so, you should still buy it and cite it in everything you ever write.

The following books are also referenced on the syllabus:

Agresti, Alan. 1990. Categorical Data Analysis. New York: John Wiley.
Amemiya, Takeshi. 1985. Advanced Econometrics. Cambridge, MA: Harvard University Press.
Fienberg, Stephen E. 1980. The Analysis of Cross-Classified Data (2nd ed.). Cambridge, MA: MIT Press.
Cameron, A. Colin and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge: Cambridge University Press.
Christensen, Ronald. 1997. Log-linear Models and Logistic Regression (2nd ed.). New York: Springer.
Greene, William H. 2000. Econometric Analysis (4th ed.). New York: Prentice Hall.
Hosmer, David W. and Stanley Lemeshow. 2000. Applied Logistic Regression (2nd ed.). New York: Wiley.
King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Cambridge: Cambridge University Press.
Powers, Daniel A. and Yu Xie. 2000. Statistical Methods for Categorical Data Analysis. San Diego: Academic Press.

While all of these books have virtues, none are in any way required. Of them, I would recommend buying the Powers and Xie book if one wanted a book that integrated models for grouped data with the regression models from the first part of the course, and I would recommend buying the Agresti book first if one wanted a book specifically for its treatment of contingency table data. The Cameron and Trivedi book is peerless if one is going to do a lot of serious work with count models. The King book occupies a place of obvious importance in the political methodology movement within political science.

The syllabus also makes reference to a few of the famed "little green books" published by Sage. The full references to these are included in the reading list.

Readings and Schedule

The pace of a course like this tends to depend on sufficiently unpredictable factors, including student participation and reactions, that providing a precise daily schedule seems an exercise in pedagogical delusion.
What follows is a listing of the topics that we might cover, in the order that we will cover them. The reading list is intended less as assigned reading than as an effort to provide both reading for the course and a bibliography of next sources should one want to pursue any of these models in detail. As the course proceeds, I will provide more information about which readings would be the most instructive to do before class meetings.

1. Salutation; overview; Stata preliminaries

McCloskey, Deirdre N. and Stephen T. Ziliak. 1996. "The Standard Error of Regressions." Journal of Economic Literature 34:97-114.

2. Review of the linear regression model and categorical independent variables in the linear regression model

Long, Chapters 1-2
Powers and Xie, Chapter 2

3. Maximum likelihood estimation

Long, Chapter 2.6
Powers and Xie, Appendix B
Eliason, Scott R. 1993. Maximum Likelihood Estimation: Logic and Practice. Newbury Park, CA: Sage.
King, Chapter 4

4. Regression models for censored and truncated data

Long, Chapter 7
King, Chapter 9.1-9.3
Greene, Chapter 20.1-20.3

Application:
Krasno, Jonathan, Donald Green, and Jonathan Cowden. 1994. "The Dynamics of Campaign Fundraising in House Elections." Journal of Politics 56:459-474.

On the Heckman model for sample selection bias:
Heckman, James J. 1979. "Sample Selection Bias as a Specification Error." Econometrica 47:153-161.

5. Binary models: specification and estimation

Long, Chapter 3.1-3.6
Powers and Xie, Chapter 3
King, Chapter 5.1-5.3
Christensen, Chapter 4
Fienberg, Chapter 6
Aldrich, John and Forrest Nelson. 1984. Linear Probability, Logit, and Probit Models. Newbury Park, CA: Sage.
Jaccard, James. Interaction Effects in Logistic Regression. Newbury Park, CA: Sage.

On measuring the magnitude of categorical covariates:
Kaufman, Robert. 1996. "Comparing Effects in Dichotomous Logistic Regression: A Variety of Standardized Coefficients." Social Science Quarterly 77:90-109.

On models for rare events:
King, Gary and Langche Zeng. 2001. "Logistic Regression in Rare Events Data." Political Analysis 9.

Interpretation of results:
Long, Chapter 3.7-3.9
Hosmer and Lemeshow, Chapter 5

Skewed logit model:
Nagler, Jonathan. 1994. "Scobit: An Alternative Estimator to Logit and Probit." American Journal of Political Science 38:230-255.

Heteroskedastic probit model:
See discussion in Greene

Applications:
Brooks, Clem and Jeff Manza. 1997. "Social Cleavages and Political Alignments: U.S. Presidential Elections, 1960 to 1992." American Sociological Review 62:937-946.

Bartels, Larry. 2000. "Partisanship and Voting Behavior, 1952-1996." American Journal of Political Science 44:35-50.
Rosenstone, Stephen and John Hansen. 2001. "Solving the Puzzle of Participation in Electoral Politics." Pp. 69-82 in Richard Niemi and Herbert Weisberg (eds.), Controversies in Voting Behavior. Washington, DC: C.Q. Press.

6. Hypothesis testing and measuring goodness of fit

Long, Chapter 4
Cameron and Trivedi, Chapter 5

7. Models for ordered outcomes: specification and estimation

Long, Chapter 5.1-5.3
Powers and Xie, Chapter 6
King, Chapter 5.4

Interpretation, parallel regression assumption, generalized model:
Long, Chapter 5.4-5.7

Applications:
Huckfeldt, Robert. 2001. "The Social Communication of Political Expertise." American Journal of Political Science 45:425-438.
Greeley, Andrew M. and Michael Hout. 1999. "Americans' Increasing Belief in Life after Death: Religious Competition and Acculturation." American Sociological Review 64:813-835. (But if you read this, you should also check out the debate between Stolzenberg and Greeley/Hout in ASR 66(1):146-158.)

Stereotype ordinal regression model:
Anderson, J. A. 1984. "Regression and Ordered Categorical Variables (with discussion)." Journal of the Royal Statistical Society, Series B 46:1-30.
Lunt, Mark. 2001. "Stereotype Ordinal Regression." Stata Technical Bulletin 61:12-18.

8. Models for nominal outcomes: specification and estimation

Long, Chapter 6.1-6.5

Powers and Xie, Chapter 7
Hosmer and Lemeshow, Chapter 8.1
Alvarez, R. Michael and Jonathan Nagler. 1998. "When Politics and Models Collide: Estimating Models of Multiparty Elections." American Journal of Political Science 42:55-96.
Gould, William. 2000. "Interpreting Logistic Regression in All Its Forms." Stata Technical Bulletin Reprints 9:257-270.

On the nested logit model:
Amemiya, Chapter 9.3.5 (pp. 300-306)
See also discussion in Greene

Interpretation:
Long, Chapter 6.6-6.10
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. "Making the Most Out of Statistical Analyses: Improving Interpretation and Presentation." American Journal of Political Science 44:341-355.

Applications:
Hao, Lingxin and Mary C. Brinton. 1997. "Productive Activities and Support Systems of Single Mothers." American Journal of Sociology 102:1305-1344.
Brooks, Clem. 2000. "Civil Rights Liberalism and the Suppression of a Republican Political Realignment in the United States, 1972 to 1996." American Sociological Review 65:483-505.

On testing the assumption of the independence of irrelevant alternatives:
Hausman, J. A. and D. McFadden. 1984. "Specification Tests for the Multinomial Logit Model." Econometrica 52:1219-1240.
Small, K. A. and C. Hsiao. 1985. "Multinomial Logit Specification Tests." International Economic Review 26:619-627.
Zhang, Junsen and Saul D. Hoffman. 1993. "Discrete-Choice Logit Models." Sociological Methods and Research 22:193-213.

9. Poisson and negative binomial regression models for count outcomes

Long, Chapter 8
Cameron and Trivedi, Chapters 1-3
King, Chapter 5.5-5.10
King, Gary. 1988. "Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model." American Journal of Political Science 32:838-863.

Applications:
Kernell, Samuel and Michael McDonald. 1999. "Congress and America's Political Development: The Transformation of the Post Office from Patronage to Service." American Journal of Political Science 43:792-811.
Lewis, David and James Michael Strine. 1996. "What Time Is It? The Use of Power in Four Different Types of Presidential Time." Journal of Politics 58:682-706.
Sampson, Robert J. and John H. Laub. 1996. "Socioeconomic Achievement in the Life Course of Disadvantaged Men: Military Service as a Turning Point, Circa 1940-1965." American Sociological Review 61:347-367.

Some further details on count models:
Cameron and Trivedi, Chapters 4 and 12 (the entire book is tremendous, incidentally)

10. Event-history analysis

The point of this is not to teach you how to do event history analysis, as that would certainly require more time than we can give and really an entire course. What I hope to do is to give an orientation to what an event history or survival analysis problem looks like, when and why you need special models for this kind of data, and how the approach is connected to the Poisson models that we just covered.

Powers and Xie, Chapter 5
Carroll, Glenn R. 1983. "Dynamic Analysis of Discrete Dependent Variables: A Didactic Essay." Quality and Quantity 17:425-460.

A good overall treatment of these models can be found in:
Hosmer, David W. and Stanley Lemeshow. 1999. Applied Survival Analysis: Regression Modeling of Time to Event Data. New York: Wiley.

Applications:
Hannan, Michael T. and Glenn R. Carroll. 1981. "Dynamics of Formal Political Structure: An Event-History Analysis." American Sociological Review 46:19-35.
Warwick, Paul and Stephen T. Easton. 1992. "The Cabinet Stability Controversy: New Perspectives on a Classic Problem." American Journal of Political Science 36:122-146.

11. Contingency table analysis

Note: We are going to spend less time on this than planned in the last rendition of the course, but I couldn't see any reason not to include the whole reading list from last time as at least a reference for any students who become more interested in the topic.

Introduction and the two-way table:
Powers and Xie, Chapter 4.1-4.4.3
Fienberg, Chapter 2
Agresti, Chapter 2
Knoke, David and Peter J. Burke. 1980. Log-Linear Models. Newbury Park, CA: Sage.

Multiway tables:
Powers and Xie, Chapter 4.6
Agresti, Chapter 5
Fienberg, Chapter 3

Model comparison:
Fienberg, Chapter 4
Agresti, Chapter 7

The Bayes Information Criterion (BIC) statistic:
Raftery, Adrian. 1986. "Choosing Models for Cross-Classifications." American Sociological Review 51:145-146.
Raftery, Adrian E. 1995. "Bayesian Model Selection in Social Research." Sociological Methodology 25:111-163.

Weakliem, David. 1999. "A Critique of the Bayesian Information Criterion for Model Selection." Sociological Methods and Research 27:359-397.
Raftery, Adrian. 1999. "Bayes Factors and BIC: Comment on 'A Critique of the Bayesian Information Criterion for Model Selection.'" Sociological Methods and Research 27:411-427.

Models for ordered categories (uniform association, row effects, column effects):
Powers and Xie, Chapter 4.5
Christensen, Chapter 7
Agresti, Chapter 8
Green, J. A. 1988. "Loglinear Analysis of Cross-Classified Ordinal Data: Application in Developmental Research." Child Development 59:1-25.

Square tables:
Powers and Xie, Chapter 4.4.5 and 4.4.6
Agresti, Chapter 10.1-10.5 (Compare to Agresti 11.1-11.2)
Hout, Michael. 1983. Mobility Tables. Newbury Park, CA: Sage.
Sobel, Michael, Michael Hout, and Otis Dudley Duncan. 1985. "Exchange, Structure, and Symmetry in Occupational Mobility." American Journal of Sociology 91:359-372.
Sobel, Michael E. 1988. "Some Models for the Multiway Contingency Table with One-to-One Correspondence among Categories." Sociological Methodology 18:165-191.

12. Propensity-score matching models (for categorical independent variables)

Rosenbaum, P. and D. Rubin. 1984. "Reducing Bias in Observational Studies using Subclassification on the Propensity Score." Journal of the American Statistical Association 79:516-524.
Smith, Herbert L. 1997. "Matching with Multiple Controls to Estimate Treatment Effects in Observational Studies." Sociological Methodology 27:325-353.

Dehejia, Rajeev H. and Sadek Wahba. 1998. "Propensity Score Matching Methods for Non-Experimental Causal Studies." Working Paper 6829, National Bureau of Economic Research.
Imbens, Guido W. 1999. "The Role of the Propensity Score in Estimating Dose-Response Functions." Technical Working Paper 237, National Bureau of Economic Research.