Semi-supervised graph labelling reveals increasing partisanship in the United States Congress

Glonek et al. RESEARCH

Max Glonek 1,2*, Jonathan Tuke 1,2, Lewis Mitchell 1,2,3 and Nigel Bean 1,2

arXiv: v1 [cs.SI] 2 Apr 2019

* Correspondence: max.glonek@adelaide.edu.au
1 School of Mathematical Sciences, University of Adelaide, Adelaide, SA, 5005, Australia
Full list of author information is available at the end of the article

Abstract

Graph labelling is a key activity of network science, with broad practical applications, and close relations to other network science tasks, such as community detection and clustering. While a large body of work exists on both unsupervised and supervised labelling algorithms, the class of random walk-based supervised algorithms requires further exploration, particularly given their relevance to social and political networks. This work refines and expands upon a new semi-supervised graph labelling method, the GLaSS method, that exactly calculates absorption probabilities for random walks on connected graphs. The method models graphs exactly as discrete-time Markov chains, treating labelled nodes as absorbing states. The method is applied to roll call voting data for 42 meetings of the United States House of Representatives and Senate, from 1935 to 2019. Analysis of the 84 resultant political networks demonstrates strong and consistent performance of GLaSS when estimating labels for unlabelled nodes in graphs, and reveals a significant trend of increasing partisanship within the United States Congress.

Keywords: community detection; graph labelling; random walk; Markov chain; political networks

Introduction

Graph labelling is concerned with the problem of estimating the labels of one or more nodes within a graph, where an association between the graph's structure and the distribution of labels is assumed to exist. Many graph labelling algorithms exist, both supervised [2, 8, 17] and unsupervised [11, 21].
In both approaches, a graph comprises unlabelled and labelled nodes, and the algorithms seek to estimate the labels of the unlabelled nodes. While a diverse range of graph labelling methods exist [4], this work focusses on the class of dynamical and statistical inference methods that use random walks. One prominent application of network science is in the analysis of political networks [18, 19], including the labelling of nodes in political voting networks. Previous works have examined methods to locate individual politicians within a multidimensional political spectrum [13, 14], the detection of voting blocs or communities within a political voting network [20], and an analysis of partisanship trends reflected in voting networks from the United States Congress [1, 12]. This work presents an analysis of a large collection of United States Congressional roll call voting networks, using a semi-supervised graph labelling method to determine the party affiliation of individuals. Changes in partisanship over time are also examined, with results in accordance with previous studies [1, 12].

Related Work

Random Walk-Based Graph Labelling Methods

In unsupervised algorithms, the graph is organised into clusters, without consideration of the labelled nodes. Once clustered, labels for unlabelled nodes in the graph can be estimated based on the clusters to which labelled nodes belong. However, cases may arise where an identified cluster contains no labelled nodes, or where a cluster contains multiple nodes with different labels, creating uncertainty as to how labels should be estimated for nodes in such clusters.

The Walktrap algorithm is one commonly used random walk-based unsupervised graph labelling method [11]. Walktrap searches for densely connected subgraphs by simulating short random walks on a graph, reasoning that short walks are more likely to remain in the same cluster than to leave it. Walktrap quantifies the similarity between nodes using a distance metric, then recursively merges identified clusters based on short random walks, providing a hard classification for each node. Because Walktrap does not use information about labelled nodes, there is no generally accepted method for estimating the labels for unlabelled nodes based on the clusters it identifies.

Unlike unsupervised algorithms, supervised algorithms utilise the information contained in labelled nodes when estimating the labels of unlabelled nodes. A common approach is to treat labelled nodes as absorbing states and unlabelled nodes as transient states in a discrete-time Markov chain (DTMC), and estimate the absorption probabilities or expected times to absorption for all transient states in the chain. Labels for each unlabelled state can then be estimated using the approximate probabilities or times. However, while existing supervised and semi-supervised methods use both labelled nodes and the graph's structure to estimate labels, they only approximate absorption probabilities and times, rather than calculating them exactly.
The Rendezvous algorithm [2] labels nodes in a semi-supervised setting by constructing a simplified rendezvous graph, where edges are drawn from an unlabelled node to only its M nearest neighbours. M is chosen to be as small as possible while ensuring that each unlabelled node in the rendezvous graph is connected to at least one labelled node. Once the rendezvous graph has been constructed, edge weights are calculated using a Euclidean distance metric, and absorption probabilities are calculated using the eigenvalues and eigenvectors of the rendezvous graph's transition matrix. Absorption probabilities for nodes in the rendezvous graph are then used to estimate the label of nodes in the full graph.

Another semi-supervised graph labelling method seeks to label nodes in a binary setting according to expected time to absorption, rather than absorption probability [8]. The Censored Time method simulates step-limited random walks over a graph, recording the number of steps taken for all walks that are absorbed before being terminated by the step limit. The censored times to absorption for absorbed walks are used to approximate the conditional expected time to absorption in each labelled node in the graph. A hard binary classification is used to estimate labels according to the lowest censored conditional time to absorption.
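To make the idea of censored times to absorption concrete, the following Python sketch simulates step-limited random walks on a small weighted graph. It follows only the description above and is not the implementation from [8]; the function names, the dictionary-based graph format, the toy graph, and the default parameters are all assumptions made for illustration.

```python
import random


def censored_absorption_times(neighbours, absorbing, start,
                              n_walks=1000, max_steps=10, seed=0):
    """Estimate mean steps to absorption in each absorbing node.

    neighbours maps node -> {neighbour: edge weight}. Walks that are not
    absorbed within max_steps are discarded (censored).
    """
    rng = random.Random(seed)
    times = {a: [] for a in absorbing}
    for _ in range(n_walks):
        node = start
        for step in range(1, max_steps + 1):
            nbrs, weights = zip(*neighbours[node].items())
            # Move to a neighbour with probability proportional to edge weight.
            node = rng.choices(nbrs, weights=weights)[0]
            if node in absorbing:
                times[node].append(step)
                break
    # Average only over walks that were absorbed before the step limit.
    return {a: (sum(ts) / len(ts) if ts else None) for a, ts in times.items()}


def nearest_label(times):
    """Hard classification: the absorbing node with the lowest censored time."""
    observed = {a: t for a, t in times.items() if t is not None}
    return min(observed, key=observed.get)


# Hypothetical toy graph: "pos" and "neg" are labelled, "x" and "y" are not.
graph = {"x": {"pos": 3, "y": 1}, "y": {"x": 1, "neg": 1}}
times = censored_absorption_times(graph, {"pos", "neg"}, start="x")
label_for_x = nearest_label(times)
```

Because the estimates come from simulation, they vary with the seed, the number of walks, and the step limit, which is the approximation that exact methods avoid.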

Political Science and Networks

Analysis of United States Congressional voting data is a popular activity within the field of political science and political networks, in part because large amounts of voting data are freely available [10]. Various attempts have been made to analyse voting trends within Congressional voting data, including modelling Congresses as political networks, where nodes represent individual politicians, and edges capture some relationship between them.

DW-NOMINATE [14], together with its predecessors D-NOMINATE [13] and W-NOMINATE, represents one of the most detailed attempts to study voting behaviour and trends in the United States Congress. A multidimensional scaling method, DW-NOMINATE models individual politicians as points embedded in multidimensional space. Each point, representing the politician's true political alignment, can be estimated by analysing historical voting records. Individuals with similar ideologies (as reflected by their voting records) are spatially close to one another, while individuals with differing ideologies are distant. Amongst many other applications, DW-NOMINATE is notable for its use in analysing changes in partisanship over time [12].

More recent work also discusses changes in partisanship in roll call voting networks over time [1]. The work examines pairs of nodes within roll call voting networks, modelling the probability distributions for cooperation between politicians (edges between nodes) from the same party and from opposing parties. A significant long-term trend of increasing partisanship and decreasing inter-party cooperation is identified: the probability of edges between nodes from the same party increases, while the probability of edges between nodes from opposing parties decreases. The work also makes reference to the continued, though diminishing, presence in Congress of super-cooperators (members who cooperate across party lines).
Separate work examining United States Congressional voting data uses modularity to measure political polarisation [20]. This work detects voting blocs or communities within Congressional roll call voting networks without making assumptions that rely on the two-party system, revealing more (and more varied) groups than simply Democrats and Republicans. The composition and behaviour of blocs are observed to vary significantly over time, as are the strengths of connections between blocs. The work not only reveals increases in partisanship over time, but also points to a possible underestimation of partisanship by other methods in Congresses with weaker party structure.

Contributions

This work expands upon a new semi-supervised graph labelling method, the Graph Labelling Semi-Supervised (GLaSS) method, using random walks to absorption [6]. The method models a graph as a DTMC, where transient states correspond to unlabelled nodes, and absorbing states correspond to labelled nodes. The transition matrix $P$ for the DTMC is formed from the graph's weighted adjacency matrix by normalising the weighted out-degree of each node in the network. From careful construction of $P$, the probability of absorption in each absorbing state can be calculated exactly, and these probabilities can then be used to estimate the label for every node corresponding to a transient state in the DTMC.

By calculating exact absorption probabilities and expected times to absorption, the GLaSS method provides better label estimates than contemporary supervised methods, which rely on approximations of these quantities [6]. By utilising the information contained in labelled nodes in the graph, GLaSS also provides a clear method for estimating the label of unlabelled nodes using quantities that are meaningful and interpretable, unlike unsupervised random walk methods.

This work also contributes to existing work on political networks, through the analysis of a large collection of US Congressional roll call voting networks. In particular, this work contains the first analysis of roll call voting networks using a random walk-based graph labelling method, while also identifying notable trends within the House of Representatives and the Senate. The GLaSS method is able to detect and confirm rising partisanship within the House of Representatives and the Senate [1], while also identifying possible historical periods of reduced partisanship.

This work formally introduces the GLaSS method, describes, in detail, the data to be analysed, presents a full description of all analyses performed, and discusses the results of this analysis and possible areas of further work.

Method

Consider an undirected graph $G = (V, E)$ comprising $n$ nodes, $V = \{v_1, \ldots, v_n\}$, connected by a set of positive real-weighted edges $E$. Define the weighted adjacency matrix $A = [a_{i,j}]$, where $a_{i,j} = a_{j,i}$ records the weight of the edge connecting $v_i$ and $v_j$, and $a_{i,j} = 0$ if no edge connects $v_i$ and $v_j$. Suppose the first $u$ nodes in $G$ are unlabelled, and the remaining $l$ nodes in $G$ are labelled, where $n = u + l$, and construct the sets $U = \{1, \ldots, u\}$ and $L = \{u + 1, \ldots, n\}$ to index the unlabelled and labelled nodes of $G$, respectively. Arrange $A$ as

$$A = \begin{bmatrix} A_{U,U} & A_{U,L} \\ A_{L,U} & A_{L,L} \end{bmatrix},$$

where $A_{J,K}$ describes the weighted edges connecting nodes indexed by $J$ to nodes indexed by $K$.
Consider a random walk on $G$, described by a discrete-time Markov chain (DTMC) where all unlabelled nodes map to transient states and all labelled nodes map to absorbing states. Let $X_t$ denote the state of the chain at time $t$. Calculate the transition probabilities for the DTMC using the adjacency matrix $A$, where

$$p_{i,j} = P(X_{t+1} = j \mid X_t = i) = \frac{a_{i,j}}{\sum_{k=1}^{n} a_{i,k}} \tag{1}$$

is the probability that the DTMC is in state $j$ at the next time step, given that the DTMC is currently in state $i$. Equation (1) governs the transient states; each absorbing state returns to itself with probability 1, which gives the identity block below. Construct the transition matrix

$$P = [p_{i,j}] = \begin{bmatrix} P_{U,U} & P_{U,L} \\ P_{L,U} & P_{L,L} \end{bmatrix} = \begin{bmatrix} R & S \\ 0 & I_l \end{bmatrix}. \tag{2}$$

The $u \times u$ matrix $R$ governs transitions between transient states, the $u \times l$ matrix $S$ governs transitions from transient states to absorbing states, $0$ is an $l \times u$ zero matrix, and $I_l$ is the $l \times l$ identity matrix.
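As a rough illustration of equations (1) and (2), the following Python sketch builds $P$ from a small weighted adjacency matrix and reads off the blocks $R$ and $S$. The function name `transition_matrix`, the toy matrix, and the use of exact `Fraction` arithmetic are assumptions made for this example; they are not taken from the GLaSS implementation.

```python
from fractions import Fraction


def transition_matrix(A, u):
    """Build P = [[R, S], [0, I_l]] from a weighted adjacency matrix A.

    Assumes the first u nodes are unlabelled (transient) and the rest are
    labelled (absorbing), matching the ordering used in the text.
    """
    n = len(A)
    P = []
    for i in range(n):
        if i < u:
            # Transient state: normalise row i by its weighted degree, as in (1).
            weights = [Fraction(a) for a in A[i]]
            degree = sum(weights)
            if degree == 0:
                raise ValueError(f"unlabelled node {i} has no edges")
            P.append([w / degree for w in weights])
        else:
            # Absorbing state: the walk stays put, giving the [0, I_l] block.
            P.append([Fraction(int(i == j)) for j in range(n)])
    return P


# Hypothetical toy graph: nodes 0 and 1 unlabelled, nodes 2 and 3 labelled.
A = [
    [0, 2, 3, 1],
    [2, 0, 1, 5],
    [3, 1, 0, 4],
    [1, 5, 4, 0],
]
P = transition_matrix(A, u=2)
R = [row[:2] for row in P[:2]]  # transient -> transient
S = [row[2:] for row in P[:2]]  # transient -> absorbing
```

For this toy graph the first row of $P$ is (0, 1/3, 1/2, 1/6), and every row sums to 1.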

DTMC Absorption Probabilities

Let $h_{i,j}$ be the probability that the DTMC is eventually absorbed in state $j$, given that the chain starts in state $i$. Define the matrix of absorption probabilities $H = [h_{i,j}]$. $H$ is restricted to have $u$ rows and $l$ columns, corresponding to the $u$ transient states and $l$ absorbing states of the DTMC, respectively. Then $H$ can be formally calculated as

$$H = (I_u - R)^{-1} S \tag{3}$$

where $I_u$ is the $u \times u$ identity matrix, and $R$ and $S$ are as above [7] (see endnote a).

Semi-Supervised Graph Labelling

Given a graph $G$ and the matrix of absorption probabilities $H$, let the random variable $Y_i$ be the label of an unlabelled node $v_i$, and let $x_j$ be the label of a labelled node $v_j$. The distribution over $Y_i$ can be directly derived from $H$, for all $i \in U$, as follows:

$$P(Y_i = k) = \sum_{j=u+1}^{n} h_{i,j} \, \mathbb{1}(x_j = k) \tag{4}$$

where $\mathbb{1}$ is the indicator function, taking value 1 if its argument is true, and 0 otherwise.

DTMC Expected Times to Absorption

Let $t_i$ be the expected number of time steps before the DTMC is absorbed in any absorbing state, given that the chain starts in state $i$. Define the vector of expected times to absorption $t = (t_1, \ldots, t_u)^T$, where the $u$ elements of $t$ correspond to the $u$ transient states of the DTMC. Then $t$ can be calculated as

$$t = (I_u - R)^{-1} c \tag{5}$$

where $c$ is a column vector of length $u$ whose entries are all 1, and $I_u$ and $R$ are as above [7].

The Graph Labelling Semi-Supervised (GLaSS) Method

Consider a graph $G$, with $u$ unlabelled nodes and $l$ labelled nodes, and suppose that all labelled nodes have one of two labels: either $K_1$ or $K_2$. From the weighted adjacency matrix $A$, construct the transition matrix $P$, as in (1). Using $P$, calculate the vector of expected times to absorption $t$, as in (5). The expected times to absorption may, optionally, be used as a filtering criterion; nodes with a large expected time to absorption, relative to the distribution of $t_i$ over all nodes in the graph, may be excluded from further analysis.
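Equations (3), (4), and (5) can be sketched in a few lines of Python. The version below solves the linear systems $(I_u - R)X = S$ and $(I_u - R)t = c$ directly rather than forming the inverse, and uses exact rational arithmetic so that no rounding is involved. The helper names (`solve`, `absorption`, `label_probabilities`) and the toy values of $R$ and $S$ are assumptions made for illustration, not part of the published method.

```python
from fractions import Fraction


def solve(M, B):
    """Solve M X = B exactly by Gauss-Jordan elimination (M must be nonsingular)."""
    n = len(M)
    aug = [[Fraction(x) for x in M[i]] + [Fraction(x) for x in B[i]]
           for i in range(n)]
    for col in range(n):
        # Swap in a row with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def absorption(R, S):
    """Return (H, t): absorption probabilities (3) and expected times (5)."""
    u = len(R)
    I_minus_R = [[int(i == j) - R[i][j] for j in range(u)] for i in range(u)]
    H = solve(I_minus_R, S)
    t = [row[0] for row in solve(I_minus_R, [[1]] * u)]
    return H, t


def label_probabilities(H, labels, k):
    """Equation (4): sum h_{i,j} over labelled nodes j whose label is k."""
    return [sum(h for h, x in zip(row, labels) if x == k) for row in H]


# Hypothetical toy chain with two transient and two absorbing states.
R = [[Fraction(0), Fraction(1, 3)], [Fraction(1, 4), Fraction(0)]]
S = [[Fraction(1, 2), Fraction(1, 6)], [Fraction(1, 8), Fraction(5, 8)]]
H, t = absorption(R, S)
p_k1 = label_probabilities(H, ["K1", "K2"], "K1")
```

Each row of `H` sums to 1, since every walk from a transient state is eventually absorbed. For graphs with thousands of nodes, a floating-point linear solver would be the more practical choice than exact fractions.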
Once nodes have been optionally filtered using $t$, calculate the matrix of absorption probabilities $H$, by (3), and calculate $P(Y_i = K_1)$ and $P(Y_i = K_2)$ for all $i \in U$, as in (4). Because $P(Y_i = K_1) + P(Y_i = K_2) = 1$, only one probability is required to classify the unlabelled nodes. For the purposes of this analysis, nodes are classified in the following way: Suppose that $m$ of the $u$ unlabelled nodes in $G$ have a true label $K_2$, and that the remaining

$(u - m)$ unlabelled nodes have a true label $K_1$. That is, the ratio of nodes with label $K_1$ to nodes with label $K_2$ is known, but which nodes should bear those labels is not. Sorting the probabilities $P(Y_i = K_1)$ from smallest to largest, the $m$th order statistic (the $m$th smallest probability) is chosen as a threshold $\alpha$, and a binary classifier is implemented. If $P(Y_i = K_1) > \alpha$, estimate the label for node $v_i$ as $K_1$; otherwise, if $P(Y_i = K_1) \le \alpha$, estimate the label for node $v_i$ as $K_2$. Thus, $\alpha$ is chosen to assign a label of $K_1$ to the $(u - m)$ nodes deemed most likely to have that label, and a label of $K_2$ to the remaining $m$ unlabelled nodes in $G$. Using this method, it is possible to estimate the label for every unlabelled node in $G$.

This method forms a modification and extension to the GLaSS method [6], a graph labelling method in a semi-supervised setting. Hereafter, we refer to this modification as the GLaSS method.

Data

Validating the GLaSS method requires graphs with a clear community structure and known labels for all nodes. To emulate a graph with few known labels, only a small subset of all known labels will be used by GLaSS, with the remaining labels withheld to emulate unlabelled nodes in the graph. All labels estimated by GLaSS can then be compared to actual, withheld labels, to assess performance.

United States roll call voting data are chosen to validate the GLaSS method. In the United States House of Representatives (the House) and the Senate, parliamentary procedure occasionally gives rise to roll call votes. In a roll call vote, the vote of every member of the House or the Senate is recorded, making it possible to see which members voted the same way. Roll call voting data for the House and the Senate can be modelled as an undirected graph, where each node represents a member of Congress, and a positive integer-weighted edge records the number of times respective members voted the same way.
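The graph construction described above can be sketched as follows. This is a minimal illustration, assuming each member's votes are stored as an equal-length list with `None` marking a roll call in which the member cast no usable vote; the function name `agreement_matrix`, the data layout, and the member names are hypothetical and do not reflect the Voteview file format.

```python
from itertools import combinations


def agreement_matrix(votes):
    """Return (members, A), where A[i][j] counts roll calls on which
    members i and j both voted and voted the same way."""
    members = sorted(votes)
    index = {m: i for i, m in enumerate(members)}
    n = len(members)
    A = [[0] * n for _ in range(n)]
    for a, b in combinations(members, 2):
        agree = sum(
            1 for x, y in zip(votes[a], votes[b])
            if x is not None and x == y
        )
        # The graph is undirected, so the matrix is symmetric.
        A[index[a]][index[b]] = agree
        A[index[b]][index[a]] = agree
    return members, A


# Hypothetical voting records over four roll calls.
votes = {
    "alice": ["yea", "nay", "yea", None],
    "bob": ["yea", "nay", "nay", "yea"],
    "carol": ["nay", "yea", "yea", "yea"],
}
members, A = agreement_matrix(votes)
```

Here `alice` and `bob` agree on two roll calls, while `carol` agrees with each of them once, so the resulting weighted adjacency matrix is `[[0, 2, 1], [2, 0, 1], [1, 1, 0]]`.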
Roll call voting data for the House and the Senate are modelled as separate graphs. The results of roll call votes in the House and the Senate for 42 separate Congresses, between 1935 and 2019 (see endnote b), have been collected for analysis, and modelled as 84 separate undirected graphs. The data have been made available on Voteview [10] and Figshare for analysis (see endnote c). For simplicity, in each Congress, the following rules are applied:

1 Only "yea" and "nay" votes are considered.
2 Only members whose party affiliation is Democrat or Republican are considered.
3 In cases where a member's party affiliation changes during a meeting of Congress, their party affiliation at the time they were elected is used.
4 In rare cases, a member of Congress does not sit for the entire meeting of Congress, and their seat is taken by a new member. In these cases, the voting records of both members are retained (see endnote d).
5 In both the House and the Senate, votes where the Democrat and Republican leader cast the same vote (either "yea" or "nay") are not considered, as they provide no information about partisanship.
6 In both the House and the Senate, multiple members may serve (non-concurrently) as party leader. In these cases, the vote cast by the party leader at the time the vote was held is considered when implementing rule 5.

7 In the House, multiple members may serve (non-concurrently) as Speaker of the House. In cases where only one Speaker is active for the meeting of Congress, votes cast by the Speaker are not considered (see endnote e). In cases where multiple Speakers are active, votes cast by the first Speaker (chronologically) are not considered, but votes cast by subsequent Speakers are considered (see endnote f). This rule does not apply to the President Pro Tempore of the Senate.

Because the party affiliation of each member of Congress is known, all nodes in each graph are labelled. For these analyses, only the labels of nodes corresponding to the Democrat and Republican Leaders are retained; all other nodes in each graph are treated as unlabelled. All graphs are either fully connected or nearly fully connected, and detailed summaries of each graph of the House and the Senate are contained in Tables 1 and 2, respectively.

Table 1 Years covered, total number of members (nodes), number of Democrats (excluding leaders), number of Republicans (excluding leaders), number of Democrat leaders, number of Republican leaders, and number of roll call votes for each House. Houses where the number of Democrats is shown in bold had a Democrat majority, and Houses where the number of Republicans is shown in bold had a Republican majority.
Columns: Congress; Years; Members (Total, Dem, Rep); Leaders (Dem, Rep); Votes

Table 2 Years covered, total number of members (nodes), number of Democrats (excluding leaders), number of Republicans (excluding leaders), number of Democrat leaders, number of Republican leaders, and number of roll call votes for each Senate. Senates where the number of Democrats is shown in bold had a Democrat majority, and Senates where the number of Republicans is shown in bold had a Republican majority.
Columns: Congress; Years; Members (Total, Dem, Rep); Leaders (Dem, Rep); Votes
* Democrats formed a working majority in coalition with two Independent Senators.

Results

Each House and Senate is modelled as a graph, and each graph is analysed using the GLaSS method, as described above. Expected time to absorption is calculated for each unlabelled node in each graph. Based on the distribution of $t$, for each graph, no filtering is required, and labels are estimated for all unlabelled nodes in all graphs.

In graphs containing only two labelled nodes (one Democrat leader, one Republican leader), each labelled node forms an absorbing state, and the probability of being absorbed in the Democrat state of the corresponding DTMC is considered. In graphs containing more than two labelled nodes (multiple Democrat leaders or multiple Republican leaders), labelled nodes for each party are taken to form an

absorbing class, and the probability of being absorbed in the Democrat class of the corresponding DTMC is considered.

For illustrative purposes, full graphs and histograms of absorption probabilities for the 90th and 110th Senates are provided in Figure 1. Histograms for all Houses and all Senates show separation between Democrat and Republican members, though some overlap between clusters does exist for some Congresses.

Figure 1 Clockwise from top right: histogram of absorption probabilities for the 110th Senate; graph of the 110th Senate; graph of the 90th Senate; histogram of absorption probabilities for the 90th Senate. In the histograms, red bars represent Republican members and blue bars represent Democrat members. In the graphs, red nodes represent Republican members and blue nodes represent Democrat members. Graphs are visualised using the Fruchterman-Reingold force-directed layout algorithm. The 90th Senate was the most difficult to correctly label using GLaSS (F1 = ). This difficulty is reflected in the overlap of red and blue bars in the histogram (top left), and the interspersal of red and blue nodes in the graph (bottom left). The isolated red node on the far right of the graph corresponds to the isolated red bar on the far right of the histogram. The 110th Senate was perfectly labelled by GLaSS (F1 score = 1), as reflected by the clear separation of red and blue bars in the histogram (top right). Also of note is the clearer separation of red and blue nodes within the graph of the 110th Senate (bottom right), suggesting a more obvious community structure. Histogram axes: probability of absorption in Democrat cluster; frequency.

Using the binary classifier in GLaSS, a threshold $\alpha$ is chosen for each House and each Senate. If $P(Y_i = \text{Democrat}) > \alpha$, then member $i$ is labelled a Democrat; otherwise, member $i$ is labelled a Republican.
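The threshold rule just described can be sketched in Python as follows. This is a minimal illustration of the order-statistic classifier from the Method section, assuming the number $m$ of unlabelled Republicans is known; the function name `classify`, its return format, and the example probabilities are hypothetical.

```python
def classify(prob_democrat, m):
    """Label the m lowest-probability members Republican, the rest Democrat.

    prob_democrat: P(Y_i = Democrat) for each unlabelled member.
    m: known number of unlabelled members whose true label is Republican.
    Returns (alpha, labels), where alpha is the m-th smallest probability.
    """
    if not 0 <= m <= len(prob_democrat):
        raise ValueError("m must lie between 0 and the number of members")
    if m == 0:
        # No Republicans to assign, so no threshold is needed.
        return None, ["Democrat"] * len(prob_democrat)
    alpha = sorted(prob_democrat)[m - 1]  # m-th order statistic
    labels = ["Democrat" if p > alpha else "Republican" for p in prob_democrat]
    return alpha, labels


# Hypothetical absorption probabilities for five unlabelled members.
alpha, labels = classify([0.9, 0.2, 0.6, 0.4, 0.8], m=2)
```

In this example the threshold is 0.4, so the members with probabilities 0.2 and 0.4 are labelled Republican. If several members share the threshold probability exactly, all of them fall on the Republican side, so more than $m$ members may receive that label; how ties should be broken is not specified in the text.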
Estimated labels are compared to the true party affiliation for all unlabelled nodes in each graph. A confusion matrix is constructed for each graph, and used to calculate an F1 score, to measure the

performance of GLaSS. An F1 score of 1 implies that GLaSS is able to correctly label all unlabelled members in the corresponding House or Senate. Plots of F1 score over time are given for the House and the Senate in Figures 2 and 3, respectively. F1 scores calculated for the 84 graphs range from a minimum of (achieved by the 90th Senate) to a maximum of 1 (achieved by 8 Houses and 9 Senates).

Figure 2 Top: F1 Score for the House from (74th House) to (115th House). Bottom: Difference between the smallest standardised $P(Y_i = \text{Democrat})$ among all true Democrats and the largest standardised $P(Y_i = \text{Democrat})$ among all true Republicans for the House from (74th House) to (115th House). While there is some variability over time, F1 scores are relatively high for all Houses (minimum F1 score = ), indicating very strong performance of the GLaSS method in labelling members of the House as Democrat or Republican. In particular, every House from the 108th ( ) onwards has an F1 score of 1, implying that the GLaSS method was able to perfectly identify the party affiliation of every member in those Houses. The plot of standardised differences shows the magnitude of overlap (values below the horizontal line at 0) or separation (values above the horizontal line at 0) between Democrats and Republicans, according to absorption probabilities calculated by the GLaSS method. F1 score appears to decrease with increasing magnitude of overlap, and the plot also shows that the two parties have grown increasingly far apart since they first separated entirely in the 108th House. Axes: F1 score against year (top); standardised difference against year (bottom).

To better understand the behaviour of graphs with an F1 score of 1, absorption probabilities are standardised using the population mean and pooled variance.
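The two evaluation quantities used in this section, the F1 score and the standardised difference described in the caption of Figure 2, can be sketched as below. Several details are assumptions rather than statements from the text: Democrat is treated as the positive class for F1, and "pooled variance" is read as the usual two-group pooled sample variance. Because the same mean is subtracted from both probabilities, it cancels in the difference, so only the pooled standard deviation matters. Function names and example values are hypothetical.

```python
from math import sqrt
from statistics import variance


def f1_score(true_labels, predicted_labels, positive="Democrat"):
    """F1 = 2TP / (2TP + FP + FN), computed from the confusion-matrix counts."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def standardised_gap(dem_probs, rep_probs):
    """Smallest Democrat probability minus largest Republican probability,
    in units of the pooled standard deviation (assumed interpretation).
    Each group needs at least two members for a sample variance."""
    n_d, n_r = len(dem_probs), len(rep_probs)
    pooled_var = (
        (n_d - 1) * variance(dem_probs) + (n_r - 1) * variance(rep_probs)
    ) / (n_d + n_r - 2)
    return (min(dem_probs) - max(rep_probs)) / sqrt(pooled_var)


# Hypothetical example: one Democrat and one Republican are mislabelled.
truth = ["Democrat", "Democrat", "Democrat", "Republican", "Republican"]
guess = ["Democrat", "Democrat", "Republican", "Republican", "Democrat"]
score = f1_score(truth, guess)

# Hypothetical absorption probabilities with the parties fully separated.
gap = standardised_gap([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])
```

A positive gap indicates complete separation of the two parties, while a negative gap indicates overlap between them.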
The difference between the lowest standardised $P(Y_i = \text{Democrat})$ among all true Democrats and the highest standardised $P(Y_i = \text{Democrat})$ among all true Republicans is calculated and plotted against time (see Figures 2 and 3). In conjunction with plots of F1 score over time, these figures illustrate where overlap exists between Democrats and Republicans, and by how much Democrats and Republicans are separated when they do not overlap.

Figures 2 and 3 show a notable drop in F1 score through the 1960s and 1970s, corresponding to a period where the GLaSS method is less able to determine the party affiliation of members of the House and the Senate. The causes of this drop are unclear, but it may reflect a decrease in partisanship during this period. The figures also show that, over the last 10 to 15 years, partisanship in both the House and the Senate has increased significantly. During this period, the F1 scores for both

the House and the Senate are constant at 1, and plots of standardised difference over time show increasing separation between Democrats and Republicans, as measured by the GLaSS method.

Figure 3 Top: F1 Score for the Senate from (74th Senate) to (115th Senate). Bottom: Difference between the smallest standardised $P(Y_i = \text{Democrat})$ among all true Democrats and the largest standardised $P(Y_i = \text{Democrat})$ among all true Republicans for the Senate from (74th Senate) to (115th Senate). F1 scores show some variability over time, but are again relatively high for all Senates (minimum F1 score = ), indicating that the GLaSS method also performs very strongly in labelling members of the Senate as Democrat or Republican. Every Senate since the 110th ( ) has an F1 score of 1, implying complete separation of the parties and perfect performance by GLaSS for those Senates. The Senate also experienced complete separation of Democrats and Republicans in , , and (98th, 104th, and 105th Senates, respectively). The plot of standardised differences illustrates the magnitude of overlap or separation between Democrats and Republicans in the Senate, as measured by the GLaSS method. Some negative association between F1 score and magnitude of overlap is apparent, while it is also clear that the parties are now more separated in the Senate than at any time since . Axes: F1 score against year (top); standardised difference against year (bottom).

The Effects of Party Affiliation and Control on Partisanship

To better understand the factors that may influence partisanship within the House and the Senate, as measured by F1 scores calculated for the GLaSS method, two regression models are fitted. The first model examines the effect of three factors on F1 score: which party holds a majority in the House, which party holds a majority in the Senate, and which party holds the Presidency.
Each factor comprises two levels (in each instance, Democrats or Republicans are in majority in the House, Democrats or Republicans are in majority in the Senate, and the sitting President is a Democrat or Republican), and all two-way and three-way interaction terms are considered. Thus, the full model is

$$F \sim H + S + P + (H \times S) + (H \times P) + (S \times P) + (H \times S \times P), \tag{6}$$

where $F$ is F1 score, $H$ is the party in majority in the House, $S$ is the party in majority in the Senate, $P$ is the party affiliation of the sitting President, and $\times$

denotes an interaction between factors. The model is fitted for F1 scores from the House and the Senate separately, but no significant predictors are identified in either case.

In the second model, the number of factors is reduced. For the House, two binary factors are considered: whether the party that holds a majority in the Senate is the same as the party that holds a majority in the House, and whether the party that holds the Presidency is the same as the party that holds a majority in the House. An equivalent model is specified for the Senate, and two-way interactions are considered in both cases. For the House, the full model is

$$F \sim S + P + (S \times P), \tag{7}$$

where $S$ and $P$ denote whether the party in control of the Senate and the Presidency, respectively, is the same as the party in control of the House. For the Senate, the full model is

$$F \sim H + P + (H \times P), \tag{8}$$

where $H$ and $P$ denote whether the party in control of the House and the Presidency, respectively, is the same as the party in control of the Senate. In (7) and (8), $F$ denotes F1 score for the House and Senate, respectively, and $\times$ is as previously defined. Again, no significant predictors of F1 are identified for the House or the Senate. Thus, while partisanship has clearly varied over time in both the House and the Senate, it appears that this variation is not explained by which party controls each branch of the US Government, or the interplay between controlling parties.

Discussion

Graph labelling is a fundamental task within network science, with diverse applications. This work builds upon a previously introduced [6] semi-supervised graph labelling method using random walks to absorption, the GLaSS method, and uses it to analyse a collection of undirected political networks from the United States House of Representatives and Senate. In these networks, the GLaSS method is used to estimate the party affiliation of members of the House and the Senate based on roll call voting data.
The GLaSS method shows universally strong performance in analysing these networks, returning F1 scores in excess of 0.85 for all 84 graphs, even where graphs display significant overlap between the two communities. In 20% of cases (17 of the 84 networks analysed), GLaSS returns an F1 score of 1, indicating perfect labelling of all members in the House or the Senate based on random walks to absorption in these roll call voting networks.

Previous work evaluating GLaSS showed that it outperformed other supervised and unsupervised random walk-based graph labelling methods in analysing a small series of undirected political networks [6]. Results in this work provide more evidence that continued investigation and evaluation of the GLaSS method is warranted. Future work will examine the performance of the GLaSS method for graphs of varying size, connectedness, and density, and with different numbers of known labels. Extending the GLaSS method to label graphs with more than two clusters, and graphs with fewer labelled nodes than clusters, is of particular interest.

This work also provides reinforcing empirical evidence of increasing partisanship in United States politics [1, 12], while also representing the first such analysis to use a random walk-based graph labelling method. Analysis of roll call voting data in both the House and the Senate using GLaSS shows that the Democrat and Republican parties have undergone a recent rapid separation. The party affiliation of members can now be entirely predicted by their voting trends, where previously some uncertainty existed. This raises important questions about the causes of the recent and increasing polarisation in both the House and the Senate, as well as what factors historically decreased political partisanship in Congresses with lower F1 scores as calculated by GLaSS. Regression modelling indicates that variations in partisanship cannot be explained by which parties control the House, the Senate, and the Presidency.

In an applied setting, future work will also use GLaSS to further explore social, political, and other networks. Online and social-media networks are of particular interest, with a growing body of work examining the structure, dynamics, and polarisation of online social networks [3, 5, 15, 16]. Future applied work with GLaSS will examine these characteristics for new and existing graphs. The roll call voting data presented here have a clear longitudinal structure; the construction of a metagraph from graphs of individual Houses or Senates and extension of the GLaSS method to analyse metagraphs is also an area for future work.

Endnotes

a In practice, calculating the matrix inverse $(I_u - R)^{-1}$ may be computationally impractical when $R$ is large, but a variety of techniques exist to find $H$ approximately.
b Each meeting of Congress begins on January 3 and runs for a period of two years.
c Primary data is available from Voteview < with supporting datasets made available on Figshare < articles/ans CN2018 Special Issue/ >.
d Consequently, while the House has 435 seats and the Senate has 100 seats, graphs may contain more than 435 and 100 nodes, respectively.
e Conventionally, the Speaker of the House participates in very few votes.
f Prior to serving as Speaker, such members are generally party leaders, and at least active voting members of Congress.

Acknowledgements
MG acknowledges the support received through the provision of scholarships by the University of Adelaide and Data to Decisions CRC. All authors thank Data to Decisions CRC and the ARC Centre of Excellence for Mathematical and Statistical Frontiers for their financial support.

Authors' contributions
All authors participated in the conception and design of this work. MG prepared and analysed the data and drafted the manuscript. All authors critically reviewed the manuscript and approved its final form. All authors read and approved the final manuscript.

Availability of data and materials
The datasets generated and analysed during the current study are available on Voteview [10] and Figshare (see endnote c).

Competing interests
The authors declare that they have no competing interests.

Author details
1 School of Mathematical Sciences, University of Adelaide, Adelaide, SA, 5005, Australia. 2 ARC Centre of Excellence for Mathematical and Statistical Frontiers. 3 Stream Lead: Beat the News, Data to Decisions CRC.

References
1. Andris C, Lee D, Hamilton MJ, Martino M, Gunning CE, Selden JA (2015) The rise of partisanship and super-cooperators in the US House of Representatives. PLoS ONE 10(4):e
2. Azran A (2007) The rendezvous algorithm: Multiclass semi-supervised learning with Markov random walks. In: Proceedings of the 24th International Conference on Machine Learning (ICML), p
3. Fish B, Huang Y, Reyzin L (2016) Recovering social networks by observing votes. In: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, p
4. Fortunato S (2010) Community detection in graphs. Physics Reports 486(3-5):
5. Garimella K, Weber I (2017) A long-term analysis of polarization on Twitter. arXiv preprint, arXiv:
6. Glonek M, Tuke J, Mitchell L, Bean N (2018) GLaSS: Semi-supervised graph labelling with Markov random walks to absorption. In: Proceedings of the 7th International Conference on Complex Networks and Their Applications, p
7. Grinstead CM, Snell JL (2012) Introduction to Probability. American Mathematical Soc.
8. Hassan A, Radev D (2010) Identifying text polarity using random walks. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), p
9. Csardi G, Nepusz T (2006) The igraph software package for complex network research. InterJournal Complex Systems. URL: Accessed 28 August
10. Lewis JB, Poole K, Rosenthal H, Boche A, Rudkin A, Sonnet L (2019) Voteview: congressional roll-call votes database. URL: Accessed 13 February
11. Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International Symposium on Computer and Information Sciences, p
12. Poole KT, Rosenthal H (1984) The polarization of American politics. Journal of Politics 46(4):
13. Poole KT, Rosenthal H (1985) A spatial model for legislative roll call analysis. American Journal of Political Science 29(2):
14. Poole KT, Rosenthal H (2001) D-NOMINATE after 10 years: A comparative update to Congress: A political-economic history of roll-call voting. Legislative Studies Quarterly 26(1):
15. Rizoiu MA, Graham T, Zhang R, Zhang Y, Ackland R, Xie L (2018) #debatenight: The role and influence of socialbots on Twitter during the 1st US presidential debate. arXiv preprint, arXiv:
16. Shai S, Stanley N, Granell C, Taylor D, Mucha PJ (2017) Case studies in network community detection. arXiv preprint, arXiv:
17. Talukdar PP, Reisinger J, Paşca M, Ravichandran D, Bhagat R, Pereira F (2008) Weakly-supervised acquisition of labelled class instances using graph random walks. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, p
18. Victor JN, Montgomery AH, Lubell M (eds) (2017) The Oxford Handbook of Political Networks. Oxford University Press.
19. Ward MD, Stovel K, Sacks A (2011) Network analysis and political science. Annual Review of Political Science 14:
20. Waugh AS, Pei L, Fowler JH, Mucha PJ, Porter MA (2009) Party polarization in Congress: A network science approach. arXiv preprint, arXiv:
21. Zhou H, Lipowsky R (2004) Network Brownian motion: A new method to measure vertex-vertex proximity and to identify communities and subcommunities. In: International Conference on Computational Science (ICCS), p

Do Individual Heterogeneity and Spatial Correlation Matter?

Do Individual Heterogeneity and Spatial Correlation Matter? Do Individual Heterogeneity and Spatial Correlation Matter? An Innovative Approach to the Characterisation of the European Political Space. Giovanna Iannantuoni, Elena Manzoni and Francesca Rossi EXTENDED

More information

Hyo-Shin Kwon & Yi-Yi Chen

Hyo-Shin Kwon & Yi-Yi Chen Hyo-Shin Kwon & Yi-Yi Chen Wasserman and Fraust (1994) Two important features of affiliation networks The focus on subsets (a subset of actors and of events) the duality of the relationship between actors

More information

Learning and Visualizing Political Issues from Voting Records Erik Goldman, Evan Cox, Mikhail Kerzhner. Abstract

Learning and Visualizing Political Issues from Voting Records Erik Goldman, Evan Cox, Mikhail Kerzhner. Abstract Learning and Visualizing Political Issues from Voting Records Erik Goldman, Evan Cox, Mikhail Kerzhner Abstract For our project, we analyze data from US Congress voting records, a dataset that consists

More information

Congressional Gridlock: The Effects of the Master Lever

Congressional Gridlock: The Effects of the Master Lever Congressional Gridlock: The Effects of the Master Lever Olga Gorelkina Max Planck Institute, Bonn Ioanna Grypari Max Planck Institute, Bonn Preliminary & Incomplete February 11, 2015 Abstract This paper

More information

Analyzing and Representing Two-Mode Network Data Week 8: Reading Notes

Analyzing and Representing Two-Mode Network Data Week 8: Reading Notes Analyzing and Representing Two-Mode Network Data Week 8: Reading Notes Wasserman and Faust Chapter 8: Affiliations and Overlapping Subgroups Affiliation Network (Hypernetwork/Membership Network): Two mode

More information

Using Poole s Optimal Classification in R

Using Poole s Optimal Classification in R Using Poole s Optimal Classification in R January 22, 2018 1 Introduction This package estimates Poole s Optimal Classification scores from roll call votes supplied though a rollcall object from package

More information

Approval Voting Theory with Multiple Levels of Approval

Approval Voting Theory with Multiple Levels of Approval Claremont Colleges Scholarship @ Claremont HMC Senior Theses HMC Student Scholarship 2012 Approval Voting Theory with Multiple Levels of Approval Craig Burkhart Harvey Mudd College Recommended Citation

More information

Subreddit Recommendations within Reddit Communities

Subreddit Recommendations within Reddit Communities Subreddit Recommendations within Reddit Communities Vishnu Sundaresan, Irving Hsu, Daryl Chang Stanford University, Department of Computer Science ABSTRACT: We describe the creation of a recommendation

More information

Dimension Reduction. Why and How

Dimension Reduction. Why and How Dimension Reduction Why and How The Curse of Dimensionality As the dimensionality (i.e. number of variables) of a space grows, data points become so spread out that the ideas of distance and density become

More information

Vote Compass Methodology

Vote Compass Methodology Vote Compass Methodology 1 Introduction Vote Compass is a civic engagement application developed by the team of social and data scientists from Vox Pop Labs. Its objective is to promote electoral literacy

More information

Should the Democrats move to the left on economic policy?

Should the Democrats move to the left on economic policy? Should the Democrats move to the left on economic policy? Andrew Gelman Cexun Jeffrey Cai November 9, 2007 Abstract Could John Kerry have gained votes in the recent Presidential election by more clearly

More information

Using Poole s Optimal Classification in R

Using Poole s Optimal Classification in R Using Poole s Optimal Classification in R August 15, 2007 1 Introduction This package estimates Poole s Optimal Classification scores from roll call votes supplied though a rollcall object from package

More information

Network Indicators: a new generation of measures? Exploratory review and illustration based on ESS data

Network Indicators: a new generation of measures? Exploratory review and illustration based on ESS data Network Indicators: a new generation of measures? Exploratory review and illustration based on ESS data Elsa Fontainha 1, Edviges Coelho 2 1 ISEG Technical University of Lisbon, e-mail: elmano@iseg.utl.pt

More information

Do two parties represent the US? Clustering analysis of US public ideology survey

Do two parties represent the US? Clustering analysis of US public ideology survey Do two parties represent the US? Clustering analysis of US public ideology survey Louisa Lee 1 and Siyu Zhang 2, 3 Advised by: Vicky Chuqiao Yang 1 1 Department of Engineering Sciences and Applied Mathematics,

More information

Instructors: Tengyu Ma and Chris Re

Instructors: Tengyu Ma and Chris Re Instructors: Tengyu Ma and Chris Re cs229.stanford.edu Ø Probability (CS109 or STAT 116) Ø distribution, random variable, expectation, conditional probability, variance, density Ø Linear algebra (Math

More information

DATA ANALYSIS USING SETUPS AND SPSS: AMERICAN VOTING BEHAVIOR IN PRESIDENTIAL ELECTIONS

DATA ANALYSIS USING SETUPS AND SPSS: AMERICAN VOTING BEHAVIOR IN PRESIDENTIAL ELECTIONS Poli 300 Handout B N. R. Miller DATA ANALYSIS USING SETUPS AND SPSS: AMERICAN VOTING BEHAVIOR IN IDENTIAL ELECTIONS 1972-2004 The original SETUPS: AMERICAN VOTING BEHAVIOR IN IDENTIAL ELECTIONS 1972-1992

More information

The League of Women Voters of Pennsylvania et al v. The Commonwealth of Pennsylvania et al. Nolan McCarty

The League of Women Voters of Pennsylvania et al v. The Commonwealth of Pennsylvania et al. Nolan McCarty The League of Women Voters of Pennsylvania et al v. The Commonwealth of Pennsylvania et al. I. Introduction Nolan McCarty Susan Dod Brown Professor of Politics and Public Affairs Chair, Department of Politics

More information

UC-BERKELEY. Center on Institutions and Governance Working Paper No. 22. Interval Properties of Ideal Point Estimators

UC-BERKELEY. Center on Institutions and Governance Working Paper No. 22. Interval Properties of Ideal Point Estimators UC-BERKELEY Center on Institutions and Governance Working Paper No. 22 Interval Properties of Ideal Point Estimators Royce Carroll and Keith T. Poole Institute of Governmental Studies University of California,

More information

national congresses and show the results from a number of alternate model specifications for

national congresses and show the results from a number of alternate model specifications for Appendix In this Appendix, we explain how we processed and analyzed the speeches at parties national congresses and show the results from a number of alternate model specifications for the analysis presented

More information

Supplementary Materials for Strategic Abstention in Proportional Representation Systems (Evidence from Multiple Countries)

Supplementary Materials for Strategic Abstention in Proportional Representation Systems (Evidence from Multiple Countries) Supplementary Materials for Strategic Abstention in Proportional Representation Systems (Evidence from Multiple Countries) Guillem Riambau July 15, 2018 1 1 Construction of variables and descriptive statistics.

More information

Approaches to Analysing Politics Variables & graphs

Approaches to Analysing Politics Variables & graphs Approaches to Analysing Politics Variables & Johan A. Elkink School of Politics & International Relations University College Dublin 6 8 March 2017 1 2 3 Outline 1 2 3 A variable is an attribute that has

More information

In Elections, Irrelevant Alternatives Provide Relevant Data

In Elections, Irrelevant Alternatives Provide Relevant Data 1 In Elections, Irrelevant Alternatives Provide Relevant Data Richard B. Darlington Cornell University Abstract The electoral criterion of independence of irrelevant alternatives (IIA) states that a voting

More information

SHOULD THE DEMOCRATS MOVE TO THE LEFT ON ECONOMIC POLICY? By Andrew Gelman and Cexun Jeffrey Cai Columbia University

SHOULD THE DEMOCRATS MOVE TO THE LEFT ON ECONOMIC POLICY? By Andrew Gelman and Cexun Jeffrey Cai Columbia University Submitted to the Annals of Applied Statistics SHOULD THE DEMOCRATS MOVE TO THE LEFT ON ECONOMIC POLICY? By Andrew Gelman and Cexun Jeffrey Cai Columbia University Could John Kerry have gained votes in

More information

Sleepwalking towards Johannesburg? Local measures of ethnic segregation between London s secondary schools, /9.

Sleepwalking towards Johannesburg? Local measures of ethnic segregation between London s secondary schools, /9. Sleepwalking towards Johannesburg? Local measures of ethnic segregation between London s secondary schools, 2003 2008/9. Richard Harris A Headline Headteacher expresses alarm over racial segregation in

More information

Cluster Analysis. (see also: Segmentation)

Cluster Analysis. (see also: Segmentation) Cluster Analysis (see also: Segmentation) Cluster Analysis Ø Unsupervised: no target variable for training Ø Partition the data into groups (clusters) so that: Ø Observations within a cluster are similar

More information

Congruence in Political Parties

Congruence in Political Parties Descriptive Representation of Women and Ideological Congruence in Political Parties Georgia Kernell Northwestern University gkernell@northwestern.edu June 15, 2011 Abstract This paper examines the relationship

More information

Lab 3: Logistic regression models

Lab 3: Logistic regression models Lab 3: Logistic regression models In this lab, we will apply logistic regression models to United States (US) presidential election data sets. The main purpose is to predict the outcomes of presidential

More information

Predicting Information Diffusion Initiated from Multiple Sources in Online Social Networks

Predicting Information Diffusion Initiated from Multiple Sources in Online Social Networks Predicting Information Diffusion Initiated from Multiple Sources in Online Social Networks Chuan Peng School of Computer science, Wuhan University Email: chuan.peng@asu.edu Kuai Xu, Feng Wang, Haiyan Wang

More information

EXTENDING THE SPHERE OF REPRESENTATION:

EXTENDING THE SPHERE OF REPRESENTATION: EXTENDING THE SPHERE OF REPRESENTATION: THE IMPACT OF FAIR REPRESENTATION VOTING ON THE IDEOLOGICAL SPECTRUM OF CONGRESS November 2013 Extend the sphere, and you take in a greater variety of parties and

More information

AMERICAN JOURNAL OF UNDERGRADUATE RESEARCH VOL. 3 NO. 4 (2005)

AMERICAN JOURNAL OF UNDERGRADUATE RESEARCH VOL. 3 NO. 4 (2005) , Partisanship and the Post Bounce: A MemoryBased Model of Post Presidential Candidate Evaluations Part II Empirical Results Justin Grimmer Department of Mathematics and Computer Science Wabash College

More information

Using Poole s Optimal Classification in R

Using Poole s Optimal Classification in R Using Poole s Optimal Classification in R September 23, 2010 1 Introduction This package estimates Poole s Optimal Classification scores from roll call votes supplied though a rollcall object from package

More information

The Australian Society for Operations Research

The Australian Society for Operations Research The Australian Society for Operations Research www.asor.org.au ASOR Bulletin Volume 34, Issue, (06) Pages -4 A minimum spanning tree with node index Elias Munapo School of Economics and Decision Sciences,

More information

Wasserman & Faust, chapter 5

Wasserman & Faust, chapter 5 Wasserman & Faust, chapter 5 Centrality and Prestige - Primary goal is identification of the most important actors in a social network. - Prestigious actors are those with large indegrees, or choices received.

More information

Identifying Factors in Congressional Bill Success

Identifying Factors in Congressional Bill Success Identifying Factors in Congressional Bill Success CS224w Final Report Travis Gingerich, Montana Scher, Neeral Dodhia Introduction During an era of government where Congress has been criticized repeatedly

More information

Preliminary Effects of Oversampling on the National Crime Victimization Survey

Preliminary Effects of Oversampling on the National Crime Victimization Survey Preliminary Effects of Oversampling on the National Crime Victimization Survey Katrina Washington, Barbara Blass and Karen King U.S. Census Bureau, Washington D.C. 20233 Note: This report is released to

More information

Intersections of political and economic relations: a network study

Intersections of political and economic relations: a network study Procedia Computer Science Volume 66, 2015, Pages 239 246 YSC 2015. 4th International Young Scientists Conference on Computational Science Intersections of political and economic relations: a network study

More information

Appendices for Elections and the Regression-Discontinuity Design: Lessons from Close U.S. House Races,

Appendices for Elections and the Regression-Discontinuity Design: Lessons from Close U.S. House Races, Appendices for Elections and the Regression-Discontinuity Design: Lessons from Close U.S. House Races, 1942 2008 Devin M. Caughey Jasjeet S. Sekhon 7/20/2011 (10:34) Ph.D. candidate, Travers Department

More information

arxiv: v1 [cs.si] 29 Oct 2018

arxiv: v1 [cs.si] 29 Oct 2018 Analyzing Dynamic Ideological Communities in Congressional Voting Networks Carlos H. G. Ferreira, Breno de Sousa Matos, and Jussara M. Almeira Departament of Computer Science, Universidade Federal de Minas

More information

Statistics, Politics, and Policy

Statistics, Politics, and Policy Statistics, Politics, and Policy Volume 1, Issue 1 2010 Article 3 A Snapshot of the 2008 Election Andrew Gelman, Columbia University Daniel Lee, Columbia University Yair Ghitza, Columbia University Recommended

More information

The new Voteview.com: preserving and continuing. observers of Congress

The new Voteview.com: preserving and continuing. observers of Congress The new Voteview.com: preserving and continuing Keith Poole s infrastructure for scholars, students and observers of Congress Adam Boche Jeffrey B. Lewis Aaron Rudkin Luke Sonnet March 8, 2018 This project

More information

Computational Social Choice: Spring 2017

Computational Social Choice: Spring 2017 Computational Social Choice: Spring 2017 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam Ulle Endriss 1 Plan for Today So far we saw three voting rules: plurality, plurality

More information

Supporting Information for Signaling and Counter-Signaling in the Judicial Hierarchy: An Empirical Analysis of En Banc Review

Supporting Information for Signaling and Counter-Signaling in the Judicial Hierarchy: An Empirical Analysis of En Banc Review Supporting Information for Signaling and Counter-Signaling in the Judicial Hierarchy: An Empirical Analysis of En Banc Review In this appendix, we: explain our case selection procedures; Deborah Beim Alexander

More information

Agreement Beyond Polarization: Spectral Network Analysis of Congressional Roll Call Votes 1

Agreement Beyond Polarization: Spectral Network Analysis of Congressional Roll Call Votes 1 Agreement Beyond Polarization: Spectral Network Analysis of Congressional Roll Call Votes 1 Matthew C. Harding MIT and Harvard University 2 September, 2006 1 Thanks to Jerry Hausman, Iain Johnstone, Gary

More information

Types of Networks. Directed and non-directed relations; relational and affiliational networks; multiple networks; nested networks.

Types of Networks. Directed and non-directed relations; relational and affiliational networks; multiple networks; nested networks. POL 279 Political Networks: Methods and Applications Course Website: http://psfaculty.ucdavis.edu/zmaoz/courses.html/pol279-09.htm Winter 2012 Zeev Maoz zmaoz@ucdavis.edu Wednesday 3:00-6:00 Office Hours:

More information

Measuring Bias and Uncertainty in Ideal Point Estimates via the Parametric Bootstrap

Measuring Bias and Uncertainty in Ideal Point Estimates via the Parametric Bootstrap Political Analysis (2004) 12:105 127 DOI: 10.1093/pan/mph015 Measuring Bias and Uncertainty in Ideal Point Estimates via the Parametric Bootstrap Jeffrey B. Lewis Department of Political Science, University

More information

Agreement Scores, Ideal Points, and Legislative Polarization

Agreement Scores, Ideal Points, and Legislative Polarization Agreement Scores, Ideal Points, and Legislative Polarization Betsy Sinclair University of Chicago Seth Masket University of Denver Jennifer Victor University of Pittsburgh Gregory Koger University of Miami

More information

Estimating the Margin of Victory for Instant-Runoff Voting

Estimating the Margin of Victory for Instant-Runoff Voting Estimating the Margin of Victory for Instant-Runoff Voting David Cary Abstract A general definition is proposed for the margin of victory of an election contest. That definition is applied to Instant Runoff

More information

Partition Decomposition for Roll Call Data

Partition Decomposition for Roll Call Data Partition Decomposition for Roll Call Data G. Leibon 1,2, S. Pauls 2, D. N. Rockmore 2,3,4, and R. Savell 5 Abstract In this paper we bring to bear some new tools from statistical learning on the analysis

More information

Chapter. Estimating the Value of a Parameter Using Confidence Intervals Pearson Prentice Hall. All rights reserved

Chapter. Estimating the Value of a Parameter Using Confidence Intervals Pearson Prentice Hall. All rights reserved Chapter 9 Estimating the Value of a Parameter Using Confidence Intervals 2010 Pearson Prentice Hall. All rights reserved Section 9.1 The Logic in Constructing Confidence Intervals for a Population Mean

More information

Elite Polarization and Mass Political Engagement: Information, Alienation, and Mobilization

Elite Polarization and Mass Political Engagement: Information, Alienation, and Mobilization JOURNAL OF INTERNATIONAL AND AREA STUDIES Volume 20, Number 1, 2013, pp.89-109 89 Elite Polarization and Mass Political Engagement: Information, Alienation, and Mobilization Jae Mook Lee Using the cumulative

More information

Powersharing, Protection, and Peace. Scott Gates, Benjamin A. T. Graham, Yonatan Lupu Håvard Strand, Kaare W. Strøm. September 17, 2015

Powersharing, Protection, and Peace. Scott Gates, Benjamin A. T. Graham, Yonatan Lupu Håvard Strand, Kaare W. Strøm. September 17, 2015 Powersharing, Protection, and Peace Scott Gates, Benjamin A. T. Graham, Yonatan Lupu Håvard Strand, Kaare W. Strøm September 17, 2015 Corresponding Author: Yonatan Lupu, Department of Political Science,

More information

FOURIER ANALYSIS OF THE NUMBER OF PUBLIC LAWS David L. Farnsworth, Eisenhower College Michael G. Stratton, GTE Sylvania

FOURIER ANALYSIS OF THE NUMBER OF PUBLIC LAWS David L. Farnsworth, Eisenhower College Michael G. Stratton, GTE Sylvania FOURIER ANALYSIS OF THE NUMBER OF PUBLIC LAWS 1789-1976 David L. Farnsworth, Eisenhower College Michael G. Stratton, GTE Sylvania 1. Introduction. In an earlier study (reference hereafter referred to as

More information

Parties, Candidates, Issues: electoral competition revisited

Parties, Candidates, Issues: electoral competition revisited Parties, Candidates, Issues: electoral competition revisited Introduction The partisan competition is part of the operation of political parties, ranging from ideology to issues of public policy choices.

More information

Analysis of Categorical Data from the California Department of Corrections

Analysis of Categorical Data from the California Department of Corrections Lab 5 Analysis of Categorical Data from the California Department of Corrections About the Data The dataset you ll examine is from a study by the California Department of Corrections (CDC) on the effectiveness

More information

Appendix: Supplementary Tables for Legislating Stock Prices

Appendix: Supplementary Tables for Legislating Stock Prices Appendix: Supplementary Tables for Legislating Stock Prices In this Appendix we describe in more detail the method and data cut-offs we use to: i.) classify bills into industries (as in Cohen and Malloy

More information

Recommendations For Reddit Users Avideh Taalimanesh and Mohammad Aleagha Stanford University, December 2012

Recommendations For Reddit Users Avideh Taalimanesh and Mohammad Aleagha Stanford University, December 2012 Recommendations For Reddit Users Avideh Taalimanesh and Mohammad Aleagha Stanford University, December 2012 Abstract In this paper we attempt to develop an algorithm to generate a set of post recommendations

More information

Role of Political Identity in Friendship Networks

Role of Political Identity in Friendship Networks Role of Political Identity in Friendship Networks Surya Gundavarapu, Matthew A. Lanham Purdue University, Department of Management, 403 W. State Street, West Lafayette, IN 47907 sgundava@purdue.edu; lanhamm@purdue.edu

More information

Blockmodels/Positional Analysis Implementation and Application. By Yulia Tyshchuk Tracey Dilacsio

Blockmodels/Positional Analysis Implementation and Application. By Yulia Tyshchuk Tracey Dilacsio Blockmodels/Positional Analysis Implementation and Application By Yulia Tyshchuk Tracey Dilacsio Articles O Wasserman and Faust Chapter 12 O O Bearman, Peter S. and Kevin D. Everett (1993). The Structure

More information

Constraint satisfaction problems. Lirong Xia

Constraint satisfaction problems. Lirong Xia Constraint satisfaction problems Lirong Xia Spring, 2017 Project 1 Ø You can use Windows Ø Read the instruction carefully, make sure you understand the goal search for YOUR CODE HERE Ø Ask and answer questions

More information

The Effect of Ballot Order: Evidence from the Spanish Senate

The Effect of Ballot Order: Evidence from the Spanish Senate The Effect of Ballot Order: Evidence from the Spanish Senate Manuel Bagues Berta Esteve-Volart November 20, 2011 PRELIMINARY AND INCOMPLETE Abstract This paper analyzes the relevance of ballot order in

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Linearly Separable Data SVM: Simple Linear Separator hyperplane Which Simple Linear Separator? Classifier Margin Objective #1: Maximize Margin MARGIN MARGIN How s this look? MARGIN

More information

What is The Probability Your Vote will Make a Difference?

What is The Probability Your Vote will Make a Difference? Berkeley Law From the SelectedWorks of Aaron Edlin 2009 What is The Probability Your Vote will Make a Difference? Andrew Gelman, Columbia University Nate Silver Aaron S. Edlin, University of California,

More information

Essential Questions Content Skills Assessments Standards/PIs. Identify prime and composite numbers, GCF, and prime factorization.

Essential Questions Content Skills Assessments Standards/PIs. Identify prime and composite numbers, GCF, and prime factorization. Map: MVMS Math 7 Type: Consensus Grade Level: 7 School Year: 2007-2008 Author: Paula Barnes District/Building: Minisink Valley CSD/Middle School Created: 10/19/2007 Last Updated: 11/06/2007 How does the

More information

Introduction to Path Analysis: Multivariate Regression

Introduction to Path Analysis: Multivariate Regression Introduction to Path Analysis: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #7 March 9, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate

More information

A comparative analysis of subreddit recommenders for Reddit

A comparative analysis of subreddit recommenders for Reddit A comparative analysis of subreddit recommenders for Reddit Jay Baxter Massachusetts Institute of Technology jbaxter@mit.edu Abstract Reddit has become a very popular social news website, but even though

More information

Using W-NOMINATE in R

Using W-NOMINATE in R Using W-NOMINATE in R James Lo January 30, 2018 1 Introduction This package estimates Poole and Rosenthal W-NOMINATE scores from roll call votes supplied though a rollcall object from package pscl. 1 The

More information

Gab: The Alt-Right Social Media Platform

Gab: The Alt-Right Social Media Platform Gab: The Alt-Right Social Media Platform Yuchen Zhou 1, Mark Dredze 1[0000 0002 0422 2474], David A. Broniatowski 2, William D. Adler 3 1 Center for Language and Speech Processing Johns Hopkins University,

More information

Simulating Electoral College Results using Ranked Choice Voting if a Strong Third Party Candidate were in the Election Race

Simulating Electoral College Results using Ranked Choice Voting if a Strong Third Party Candidate were in the Election Race Simulating Electoral College Results using Ranked Choice Voting if a Strong Third Party Candidate were in the Election Race Michele L. Joyner and Nicholas J. Joyner Department of Mathematics & Statistics

More information

Can Ideal Point Estimates be Used as Explanatory Variables?

Can Ideal Point Estimates be Used as Explanatory Variables? Can Ideal Point Estimates be Used as Explanatory Variables? Andrew D. Martin Washington University admartin@wustl.edu Kevin M. Quinn Harvard University kevin quinn@harvard.edu October 8, 2005 1 Introduction

More information

Analyzing the Legislative Productivity of Congress During the Obama Administration

Analyzing the Legislative Productivity of Congress During the Obama Administration Western Michigan University ScholarWorks at WMU Honors Theses Lee Honors College 12-5-2017 Analyzing the Legislative Productivity of Congress During the Obama Administration Zachary Hunkins Western Michigan

More information

The Integer Arithmetic of Legislative Dynamics

The Integer Arithmetic of Legislative Dynamics The Integer Arithmetic of Legislative Dynamics Kenneth Benoit Trinity College Dublin Michael Laver New York University July 8, 2005 Abstract Every legislature may be defined by a finite integer partition

More information

No Adults Allowed! Unsupervised Learning Applied to Gerrymandered School Districts

No Adults Allowed! Unsupervised Learning Applied to Gerrymandered School Districts No Adults Allowed! Unsupervised Learning Applied to Gerrymandered School Districts Divya Siddarth, Amber Thomas 1. INTRODUCTION With more than 80% of public school students attending the school assigned

More information

Matthew A. Cole and Eric Neumayer. The pitfalls of convergence analysis : is the income gap really widening?

Matthew A. Cole and Eric Neumayer. The pitfalls of convergence analysis : is the income gap really widening? LSE Research Online Article (refereed) Matthew A. Cole and Eric Neumayer The pitfalls of convergence analysis : is the income gap really widening? Originally published in Applied economics letters, 10

More information

DU PhD in Home Science

DU PhD in Home Science DU PhD in Home Science Topic:- DU_J18_PHD_HS 1) Electronic journal usually have the following features: i. HTML/ PDF formats ii. Part of bibliographic databases iii. Can be accessed by payment only iv.

More information

Designing Weighted Voting Games to Proportionality

Designing Weighted Voting Games to Proportionality Designing Weighted Voting Games to Proportionality In the analysis of weighted voting a scheme may be constructed which apportions at least one vote, per-representative units. The numbers of weighted votes

More information
