Cluster Analysis. (see also: Segmentation)
- Adela Gibson
- 5 years ago
Transcription
1 Cluster Analysis (see also: Segmentation)
2 Cluster Analysis
- Unsupervised: no target variable for training
- Partition the data into groups (clusters) so that:
  - Observations within a cluster are similar in some sense
  - Observations in different clusters are different in some sense
- There is no one correct answer, though there are good and bad clusters
- No method works best all the time
That's not very specific
3 (Some) Applications of Clustering
- Customer segmentation: groups of customers with similar shopping or buying patterns
- Dimension reduction:
  - cluster variables together
  - cluster individuals together and use the cluster variable as a proxy for demographic or behavioral variables
- Image segmentation
- Gather stores with similar characteristics for sales forecasting
- Find related topics in text data
- Find communities in social networks
4 Methodology
- Hard vs. Fuzzy Clustering
  - Hard: objects can belong to only one cluster
    - k-means (PROC FASTCLUS)
    - DBSCAN
    - Hierarchical (PROC CLUSTER)
  - Fuzzy: objects can belong to more than one cluster (usually with some probability)
    - Gaussian Mixture Models
5 Methodology
- Hierarchical vs. Flat
  - Hierarchical: clusters form a tree so you can visually see which clusters are most similar to each other.
6 Methodology
- Hierarchical vs. Flat
  - Hierarchical: clusters form a tree so you can visually see which clusters are most similar to each other.
    - Agglomerative: points start out as individual clusters, and they are combined until everything is in one cluster.
    - Divisive: all points start in the same cluster, and at each step a cluster is divided into two clusters.
  - Flat: clusters are created according to some other process, usually by iteratively updating cluster assignments
7 Hierarchical Clustering (Agglomerative) Some Data A B C I H J D G E F
8 Hierarchical Clustering (Agglomerative) First Step
9 Hierarchical Clustering (Agglomerative) Second Step
10 Hierarchical Clustering (Agglomerative) Third Step
11 Hierarchical Clustering (Agglomerative) Fourth Step
12 Hierarchical Clustering (Agglomerative) Fifth Step
13 Hierarchical Clustering (Agglomerative) Sixth Step
14 Hierarchical Clustering (Agglomerative) Seventh Step We might have known that we only wanted 3 clusters, in which case we'd stop once we had 3.
15 Hierarchical Clustering (Agglomerative) Eighth Step
16 Hierarchical Clustering (Agglomerative) Final Step
17 Hierarchical Clustering Levels of the Dendrogram
18 Resulting Dendrogram A B C D E F G H I J
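The merge steps walked through on the slides above can be sketched in a few lines. This is only an illustrative sketch in plain Python (standard library only), not SAS PROC CLUSTER: the function names `single_linkage` and `agglomerate` and the five sample points are assumptions made up for the example, and single linkage is chosen just to have a concrete distance between clusters.

```python
import math

def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, n_clusters=1, linkage=single_linkage):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Returns the final clusters and the merge history (the information a
    dendrogram is drawn from).
    """
    clusters = [[p] for p in points]      # every point starts as its own cluster
    history = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)           # merging decisions are final
    return clusters, history

# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, history = agglomerate(points, n_clusters=3)  # stop once we have 3
full_tree, full_history = agglomerate(points)           # merge down to 1 cluster
```

Stopping at `n_clusters=3` mirrors the remark on the seventh-step slide; running to a single cluster records every merge, one per level of the dendrogram.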
19 Linkages Which clusters/points are closest to each other? How do I measure the distance between a point/cluster and a cluster?
20 Linkages Single Linkage: Distance between the closest points in the clusters. (Minimum Spanning Tree)
21 Linkages Complete Linkage: Distance between the farthest points in the clusters.
22 Linkages Centroid Linkage: Distance between the centroids (means) of each cluster.
23 Linkages Average Linkage: Average distance between all points in the clusters.
24 Linkages
Ward's Method: Increase in SSE (variance) when clusters are combined.
- Default in SAS PROC CLUSTER
- Shown to be mathematically similar to centroid linkage
[Figure labels: data points in cluster i: x_1, x_2, ..., x_{N_i}; centroid for cluster i: c_i]
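The five linkages can be written as small functions that each take two clusters (lists of points) and return a number. This is a sketch in plain Python rather than SAS; the function names and the two toy clusters `a` and `b` are assumptions for illustration only.

```python
import math

def centroid(cluster):
    """Mean of the points in a cluster, coordinate by coordinate."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))

def sse(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(cluster)
    return sum(math.dist(p, c) ** 2 for p in cluster)

def single(a, b):
    return min(math.dist(p, q) for p in a for q in b)   # closest pair

def complete(a, b):
    return max(math.dist(p, q) for p in a for q in b)   # farthest pair

def average(a, b):
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def centroid_link(a, b):
    return math.dist(centroid(a), centroid(b))

def ward(a, b):
    return sse(a + b) - sse(a) - sse(b)                 # increase in SSE on merging

# Hypothetical toy clusters.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(4.0, 0.0), (4.0, 2.0)]
for name, link in [("single", single), ("complete", complete),
                   ("average", average), ("centroid", centroid_link), ("ward", ward)]:
    print(f"{name:9s} {link(a, b):.3f}")
```

One way to see why Ward's method is described as similar to centroid linkage: the increase in SSE equals the squared centroid distance scaled by n_a * n_b / (n_a + n_b), so for these toy clusters both routes give 16.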
25 Hierarchical Clustering Summary
- Disadvantages
  - Lacks a global objective function: only makes decisions based on local criteria.
  - Merging decisions are final. Once a point is assigned to a cluster, it stays there.
  - Computationally intensive, large storage requirements, not good for large datasets
  - Poor performance on noisy or high-dimensional data like text.
- Advantages
  - Lacks a global objective function: no complicated algorithm or problem with local minima
  - Creates a hierarchy that can help choose the number of clusters and examine how those clusters relate to each other.
  - Can be used in conjunction with other faster methods
26 k-Means Clustering (PROC FASTCLUS in SAS)
- The most popular clustering algorithm
- Tries to minimize the sum of squared distances from each point to its cluster centroid. (Global objective function)
[Figure: data points in Cluster 1 (C_1) with centroid c_1 and data points in Cluster 2 (C_2) with centroid c_2]
27 k-Means Algorithm
1. Start with k seed points
   - Randomly initialized (most software)
   - Determined methodically (SAS PROC FASTCLUS)
2. Assign each data point to the closest seed point.
3. The seed point then represents a cluster of data
4. Reset seed points to be the centroids of the cluster
5. Repeat steps 2-4, updating the cluster centroids until they do not change.
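The loop described on this slide can be sketched as follows. This is a minimal plain-Python illustration, not the PROC FASTCLUS implementation: the function name `kmeans`, its `init` and `seed` parameters, and the eight sample points are assumptions for the example. Seeds are either passed in explicitly or drawn at random from the data.

```python
import math
import random

def centroid(cluster):
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))

def kmeans(points, k, init=None, seed=0, max_iter=100):
    """Return (centroids, clusters, sse) after iterating to a fixed point."""
    # Start with k seed points: given explicitly, or sampled at random.
    centroids = list(init) if init is not None else random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        # Assign each data point to the closest seed point.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Reset seed points to the cluster centroids (keep the old seed if empty).
        updated = [centroid(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:          # centroids did not change: stop
            break
        centroids = updated
    # Objective function: sum of squared distances to the assigned centroid.
    sse = sum(math.dist(p, centroids[i]) ** 2
              for i, c in enumerate(clusters) for p in c)
    return centroids, clusters, sse

# Hypothetical data: two square blobs. Both seeds start inside the first blob.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
centroids, clusters, sse = kmeans(points, k=2, init=[(1.0, 1.0), (2.0, 2.0)])
print(centroids, sse)
```

With these seeds the centroids settle on the two blob centers after a few passes. Other starting seeds can leave the loop stuck in a poorer split, which is the dependence on initialization that the summary slide lists as a disadvantage.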
28 k-Means Interactive Demo (You may have to add the site to your exceptions list on the Java Control Panel to view.)
29 Choice of Distance Metric
- Most distances like Euclidean, Manhattan, or Max will provide similar answers.
- Use cosine distance (really 1 - cos, since cosine measures similarity) for text data. This is called spherical k-means.
- Using Mahalanobis distance is essentially the Expectation-Maximization (EM) method for Gaussian Mixtures.
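For reference, the first four distances named on this slide can be written out directly. This is a small plain-Python sketch; the function names are assumptions, "Max" is taken to mean the Chebyshev (largest coordinate difference) distance, and the Mahalanobis case is left out because it needs a covariance estimate.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    """The 'Max' distance: largest difference along any single coordinate."""
    return max(abs(a - b) for a, b in zip(p, q))

def cosine_distance(p, q):
    """1 - cos(angle between p and q); both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1 - dot / (norm_p * norm_q)

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), chebyshev((0, 0), (3, 4)))
# Same direction, different length: far apart in Euclidean terms,
# but (up to rounding) zero cosine distance.
print(euclidean((1, 2), (2, 4)), cosine_distance((1, 2), (2, 4)))
```

The last line hints at why cosine distance suits text: two documents with the same word proportions but different lengths point in the same direction.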
30 Determining Number of Clusters (SSE)
- Try the algorithm with k = 1, 2, 3, ...
- Examine the objective function values
- Look for a place where the marginal benefit to the objective function of adding a cluster becomes small
k=1: objective function (SSE) is 902
31 Determining Number of Clusters (SSE)
k=2: objective function (SSE) is 213
32 Determining Number of Clusters (SSE)
k=3: objective function (SSE) is 193
33 Determining Number of Clusters (SSE)
[Plot: objective function (SSE) against k = 1, 2, 3, 4; the elbow is at k=2]
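Using the three SSE values reported on the slides (902, 213, 193), the "marginal benefit" reasoning can be made concrete. The sketch below is plain Python; the function names are assumptions, and picking the k with the largest second difference is just one simple heuristic for locating an elbow, not a rule stated in the deck.

```python
def marginal_gains(sse_by_k):
    """Reduction in SSE obtained by going from the previous k to each k."""
    ks = sorted(sse_by_k)
    return {k: sse_by_k[prev] - sse_by_k[k] for prev, k in zip(ks, ks[1:])}

def elbow(sse_by_k):
    """k whose gain is large while the gain of the next k is small.

    Scores each interior k by (gain of reaching k) - (gain of going past k).
    """
    ks = sorted(sse_by_k)
    scores = {
        k: (sse_by_k[before] - sse_by_k[k]) - (sse_by_k[k] - sse_by_k[after])
        for before, k, after in zip(ks, ks[1:], ks[2:])
    }
    return max(scores, key=scores.get)

sse_by_k = {1: 902, 2: 213, 3: 193}   # values from the slides
print(marginal_gains(sse_by_k))        # {2: 689, 3: 20}
print(elbow(sse_by_k))                 # 2
```

Going from one to two clusters removes 689 units of SSE, while a third cluster removes only 20, which matches the elbow at k=2 shown on the plot.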
34 k-Means Summary
- Disadvantages
  - Dependent on initialization (initial seeds)
  - Can be sensitive to outliers
    - If this is a problem, consider k-medoids (each cluster center is a medoid, an actual data point, rather than the mean)
  - Have to input the number of clusters
  - Difficulty detecting non-spheroidal (non-globular) clusters
- Advantages
  - Modest time/storage requirements.
  - It has been shown that you can terminate the method after a small number of iterations with good results.
  - Good for a wide variety of data types
35 Cluster Validation
How do I know that my clusters are actually clusters?
- Lots of techniques/metrics have been proposed
  - Measure separation between clusters
  - Measure cohesion within clusters
- All have merit, but most are difficult to interpret in the context of statistical significance.
36 Cluster Validation
- To establish statistical significance:
  - Show that you can't do just as well with randomized data (i.e. assume the null hypothesis of no clusters)
  - Simulate ~1000 random data sets, choosing from the distributions or ranges of your variables. Cluster them with the same number of clusters. Record the SSE (k-means objective function) or validity metric of choice. Use this to show that your actual SSE is far better than you could expect to achieve if no clusters exist.
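The simulation described above can be sketched as below. This is an illustrative plain-Python version under several assumptions: null data sets are drawn uniformly within each variable's observed range, k-means is a compact implementation with a few random restarts, and the names `kmeans_sse` and `null_sse_distribution` and the two-blob sample data are made up for the example.

```python
import math
import random

def kmeans_sse(points, k, rng, restarts=5, max_iter=50):
    """Lowest k-means SSE found over a few random restarts."""
    dims = range(len(points[0]))
    best = math.inf
    for _ in range(restarts):
        centroids = rng.sample(points, k)
        for _ in range(max_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
                clusters[nearest].append(p)
            updated = [
                tuple(sum(p[d] for p in c) / len(c) for d in dims) if c else centroids[i]
                for i, c in enumerate(clusters)
            ]
            if updated == centroids:
                break
            centroids = updated
        sse = sum(math.dist(p, centroids[i]) ** 2
                  for i, c in enumerate(clusters) for p in c)
        best = min(best, sse)
    return best

def null_sse_distribution(points, k, n_sims=1000, seed=0):
    """SSE values from clustering uniform random data with the same ranges."""
    rng = random.Random(seed)
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    sims = []
    for _ in range(n_sims):
        fake = [tuple(rng.uniform(lo[d], hi[d]) for d in dims) for _ in points]
        sims.append(kmeans_sse(fake, k, rng))
    return sims

# Hypothetical data: two tight Gaussian blobs of 10 points each.
data_rng = random.Random(1)
data = ([(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(10)]
        + [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(10)])

actual = kmeans_sse(data, 2, random.Random(0))
null = null_sse_distribution(data, 2)
# Share of no-cluster simulations that did at least as well as the real data.
p_value = sum(s <= actual for s in null) / len(null)
print(actual, min(null), p_value)
```

A small `p_value` says that random data with the same ranges rarely clusters as tightly as the real data. The same scaffold works with any validity metric in place of SSE.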
37 Profiling Clusters
Now that we have clusters, how do we describe them?
- Use basic descriptives and hypothesis tests to show differences between clusters
- Use a decision tree to predict cluster
- SAS EM has a segment profiler node
38 Other types of Clustering (self-study)
- DBSCAN: density-based algorithm designed to find dense areas of points. Capable of identifying noise points which do not belong to any cluster.
- Graph/Network Clustering: spectral clustering and modularity maximization. Covered in Social Network Analysis in Spring.
40 Some Explanation of SAS's Clustering Output (SELF-STUDY) Because it's not exceedingly easy to figure out online!
41 Cubic Clustering Criterion (CCC)
- Only available in SAS (to my knowledge)
- CCC > 2 means that clustering is good
- 0 < CCC < 2 means clustering requires examination
- If slightly negative, risk of outliers is low
- If below about -30, risk of outliers is high
- Should not be used with single or complete linkage, but with centroid linkage or Ward's method.
- Each cluster must have >10 observations.
Source: Tufféry, Stéphane. Data Mining and Statistics for Decision Making. Wiley 2011
42 Determining Number of Clusters with the Cubic Clustering Criterion (CCC)
- A partition into k clusters is good when we see a dip in CCC for k-1 clusters and a peak for k clusters.
- After k clusters, the CCC should show either a gradual decrease or a gradual rise (the latter happens when more isolated groups or points are present).
Source: Tufféry, Stéphane. Data Mining and Statistics for Decision Making. Wiley 2011
43 Determining Number of Clusters with the Cubic Clustering Criterion (CCC) Image Source: Tufféry, Stéphane. Data Mining and Statistics for Decision Making. Wiley 2011
44 Determining Number of Clusters with the Cubic Clustering Criterion (CCC) WARNING: Do not expect the CCC to be common knowledge outside of the SAS domain.
45 Overall R-Squared and Pseudo-F
These statistics draw connections between a final clustering and ANOVA.
- Total Sum of Squares (SST)
- Between Group Sum of Squares (SSB)
- Within Group Sum of Squares (SSW)
  - This is the k-means objective previously referred to as SSE.
- Minimizing SSW => Maximizing SSB
- SST = SSB + SSW
- Overall R^2 = SSB/SST
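The sums of squares on this slide can be computed directly from a set of clusters. The sketch below is plain Python with made-up function names and toy data. The pseudo-F line uses the usual ANOVA-style ratio, (SSB / (k - 1)) / (SSW / (n - k)) for n observations in k clusters; that form is an assumption here, since the slide's own formula did not survive transcription.

```python
import math

def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def anova_stats(clusters):
    """SST, SSB, SSW, overall R^2 and pseudo-F for a list of clusters."""
    all_points = [p for c in clusters for p in c]
    n, k = len(all_points), len(clusters)
    grand = mean(all_points)
    sst = sum(sq_dist(p, grand) for p in all_points)
    ssw = sum(sq_dist(p, mean(c)) for c in clusters for p in c)    # k-means SSE
    ssb = sum(len(c) * sq_dist(mean(c), grand) for c in clusters)
    r2 = ssb / sst
    pseudo_f = (ssb / (k - 1)) / (ssw / (n - k))                   # assumed form
    return {"SST": sst, "SSB": ssb, "SSW": ssw, "R2": r2, "pseudo_F": pseudo_f}

# Hypothetical clustering: two square blobs of four points each.
clusters = [
    [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)],
    [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)],
]
stats = anova_stats(clusters)
print(stats)
```

For this toy clustering SST = 200, SSB = 196 and SSW = 4, so SST = SSB + SSW holds and the overall R^2 is 0.98.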
46 Example: PenDigit Data
- Goal: Automatic recognition of handwritten digits
- Digit database of 250 samples from 44 writers
- Subjects wrote digits in random order inside boxes of 500 by 500 tablet pixel resolution
- Spatial resampling to obtain a constant number of regularly spaced points on the trajectory
  - (x1, x2) give the first point coordinate
  - (x3, x4) give the second point coordinate
  - etc.
47 Example: PenDigit Data
proc fastclus data=datasets.pendigittest maxclusters=10 out=clus;
  var x1--x16;
run;
48 Example: PenDigit Data The first step to creating your own hierarchical dendrogram.
49 Example: PenDigit Data
proc glm data=clus;
  class cluster;
  model x1 = cluster;
run;
quit;
50 Example: PenDigit Data
51 Example: PenDigit Data
52 Example: PenDigit Data Essentially using the centroids as predictions and then computing R-squared.
More informationDU PhD in Home Science
DU PhD in Home Science Topic:- DU_J18_PHD_HS 1) Electronic journal usually have the following features: i. HTML/ PDF formats ii. Part of bibliographic databases iii. Can be accessed by payment only iv.
More informationLearning and Visualizing Political Issues from Voting Records Erik Goldman, Evan Cox, Mikhail Kerzhner. Abstract
Learning and Visualizing Political Issues from Voting Records Erik Goldman, Evan Cox, Mikhail Kerzhner Abstract For our project, we analyze data from US Congress voting records, a dataset that consists
More informationAcculturation Strategies : The Case of the Muslim Minority in the United States
Acculturation Strategies : The Case of the Muslim Minority in the United States Ziad Swaidan, Jackson State University Kimball P. Marshall, Jackson State University J. R. Smith, Jackson State University
More informationIntroduction to Path Analysis: Multivariate Regression
Introduction to Path Analysis: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #7 March 9, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate
More informationOutline. From Pixels to Semantics Research on automatic indexing and retrieval of large collections of images. Research: Main Areas
From Pixels to Semantics Research on automatic indexing and retrieval of large collections of images James Z. Wang PNC Technologies Career Development Professorship School of Information Sciences and Technology
More informationMaternity support policies: a cluster analysis of 22 European Union countries
Maternity support policies: a cluster analysis of 22 European Union countries Martina Pezer Institute of Public Finance, Smičiklasova 21, Zagreb, Croatia martina.pezer@ijf.hr Abstract: Maternity support
More informationRECOMMENDED CITATION: Pew Research Center, May, 2017, Partisan Identification Is Sticky, but About 10% Switched Parties Over the Past Year
NUMBERS, FACTS AND TRENDS SHAPING THE WORLD FOR RELEASE MAY 17, 2017 FOR MEDIA OR OTHER INQUIRIES: Carroll Doherty, Director of Political Research Jocelyn Kiley, Associate Director, Research Bridget Johnson,
More informationProgressives in Alberta
Progressives in Alberta Public opinion on policy, political leaders, and the province s political identity Conducted for Progress Alberta Report prepared by David Coletto, PhD Methodology This study was
More informationA COMPARISON OF ARIZONA TO NATIONS OF COMPARABLE SIZE
A COMPARISON OF ARIZONA TO NATIONS OF COMPARABLE SIZE A Report from the Office of the University Economist July 2009 Dennis Hoffman, Ph.D. Professor of Economics, University Economist, and Director, L.
More informationThe Direct Democracy Deficit in Two-tier Voting
The Direct Democracy Deficit in Two-tier Voting Preliminary Notes Please do not circulate Nicola Maaser and Stefan Napel, March 2011 Abstract A large population of citizens have single-peaked preferences
More informationSupreme Court of Florida
Supreme Court of Florida No. AOSC18-58 IN RE: JUROR SELECTION PLAN: MIAMI-DADE COUNTY ADMINISTRATIVE ORDER Section 40.225, Florida Statutes, provides for the selection of jurors to serve within the county
More informationLiving in the Shadows or Government Dependents: Immigrants and Welfare in the United States
Living in the Shadows or Government Dependents: Immigrants and Welfare in the United States Charles Weber Harvard University May 2015 Abstract Are immigrants in the United States more likely to be enrolled
More informationSupporting Information for Do Perceptions of Ballot Secrecy Influence Turnout? Results from a Field Experiment
Supporting Information for Do Perceptions of Ballot Secrecy Influence Turnout? Results from a Field Experiment Alan S. Gerber Yale University Professor Department of Political Science Institution for Social
More informationWisconsin Economic Scorecard
RESEARCH PAPER> May 2012 Wisconsin Economic Scorecard Analysis: Determinants of Individual Opinion about the State Economy Joseph Cera Researcher Survey Center Manager The Wisconsin Economic Scorecard
More informationAnalyzing Racial Disparities in Traffic Stops Statistics from the Texas Department of Public Safety
Analyzing Racial Disparities in Traffic Stops Statistics from the Texas Department of Public Safety Frank R. Baumgartner, Leah Christiani, and Kevin Roach 1 University of North Carolina at Chapel Hill
More informationData Assimilation in Geosciences
Data Assimilation in Geosciences Alberto Carrassi The Nordic Centre of Excellence for ensemble-based data assimilation Laurent Bertino (Lead), Alberto Carrassi (Co-Lead), Colin Grudzien (PD), Patrick Raanes
More informationMining Expert Comments on the Application of ILO Conventions on Freedom of Association and Collective Bargaining
Mining Expert Comments on the Application of ILO Conventions on Freedom of Association and Collective Bargaining G. Ritschard (U. Geneva), D.A. Zighed (U. Lyon 2), L. Baccaro (IILS & MIT), I. Georgiu (IILS
More informationinformation it takes to make tampering with an election computationally hard.
Chapter 1 Introduction 1.1 Motivation This dissertation focuses on voting as a means of preference aggregation. Specifically, empirically testing various properties of voting rules and theoretically analyzing
More informationShould the Democrats move to the left on economic policy?
Should the Democrats move to the left on economic policy? Andrew Gelman Cexun Jeffrey Cai November 9, 2007 Abstract Could John Kerry have gained votes in the recent Presidential election by more clearly
More information