From Brexit to Trump: Social Media's Role in Democracy

COVER FEATURE: OUTLOOK

Wendy Hall, Ramine Tinati, and Will Jennings, University of Southampton

Computer, published by the IEEE Computer Society. 0018-9162/18/$33.00 © 2018 IEEE

The ability to share, access, and connect facts and opinions among like-minded (and not so like-minded) citizens has encouraged wholesale political adoption of platforms like Twitter and Facebook. Yet our ability to understand the impact that social networks have had on the democratic process is currently very limited. The authors analyze the role social media played in the outcome of the 2016 US presidential election and the Brexit referendum.

Building on the framework of several existing studies, we explore the role of social media during two high-profile political events: the Brexit referendum in the UK, in which British citizens voted on whether to remain a part of the European Union, and the 2016 US presidential election, in which Donald Trump and Hillary Clinton stood as candidates for the Republican and Democratic parties, respectively. We consider the format, structure, content, and sentiment of social media, and how they compare with what is reported in mainstream public opinion polling records and predictions. Our analysis reveals how the interpretation of a political event is determined by the analytical methods used, which highlights the need for standards that support a common analytical approach in this area of research.

In the past decade, social media platforms such as Twitter and Facebook have become critical tools enabling social and political communication on a global scale among broad networks of people. Information on a wide variety of topics is exchanged at unprecedented volumes on these platforms, giving users an open and domain-agnostic communication venue, along with the benefits and challenges that go with it.

For individuals, these platforms can be used to advance a variety of agendas with only a minimal set of features, such as following, liking, and sharing. For politicians and political parties, social media is used extensively to campaign on referendums, engage in debates, and provide information on national elections. In political science, social media analysis is now key to understanding the nature of political engagement during campaigns. [1] A number of research endeavors have helped improve our understanding of political polarization, social media bot detection, [2] and promoted campaign design (for example, http://truthy.indiana.edu).

Social media's ever-expanding role in political discourse makes it essential that we improve our understanding of its impact on the complex mechanisms associated with political success, [3] including forecasting election results vis-à-vis candidates' and their supporters' ideologies. [4] As political activity, including campaigning and news coverage, has increasingly moved to online media platforms, researchers are working to discern how well Twitter and Facebook users reflect the general voting public, both in terms of demographics and political affiliation. [5, 6] One of the key reasons for President Barack Obama's win in the 2012 election was the impact of the 15 social media sites used in his reelection campaign. [2]

With increasing use of social media to engage with society and potential voters, there is also growing interest in how traditional political polling and news reporting are affected by these activities. In recent elections, traditional polling methods did not accurately predict the outcome of voting. In contrast, social media activity seemed to better capture the overall sentiment and potential election outcomes. Despite these recent trends, however, it is important to recognize that only a very small percentage of the population participates in social media.

We hope that through analysis of social media data we can better understand citizens' engagement, opinions, and political preferences, [7] gain valuable insight into ongoing discussions and into how information spreads or is contained, and discern the overall strategy that underlies political events and campaigns. Modeling likes; statistically analyzing the number of retweets; weighting a mixture of topics; [7] mapping network structures; and tracking the types of content shared, the sentiment of users, and the potential to predict a given event are examples of methodologies that help shed light on the participation of the electorate, the popularity of candidates and their platforms, and the effectiveness of a variety of online campaigns. [8]

We analyzed the role of social media during the Brexit referendum and the 2016 US presidential election. Our goal was to understand how social media was used during the political campaigns leading up to these events, and to develop a critical Web science lens through which to examine traditional methods of reporting and predicting political outcomes, given that the outcomes seemed to conflict with the predictions of pollsters, political analysts, and statisticians. We report the socio-technical analysis of these events that we undertook to improve our understanding of consistency and variability in the role of social media in democratic elections in different parts of the world. We also sought to gain insight into temporal dynamics, network structure changes, and campaign polarization, and to identify which topics were the most prominent and which sentiments were related to these topics.

RELATED WORK

Twitter's popularity has earned it an unprecedented amount of daily traffic.
Candidates, elected officials, and government agencies now use social media platforms like Twitter as their most common mode of mass communication. Thus, a large body of academic research has looked at social media's role in modern political campaigns and activities. [9] Studies have shown how information diffusion during political campaigns and debates can play an important role in reaching a large network of individuals. [10] Thus, those who become part of a given network will primarily receive content that is shared by those with the same views and network connections. The dynamics of the network structure also yield insights about the content being shared. [11]

Social media networks have also been analyzed with respect to their temporality, which helps demonstrate how political campaigns lead to the formation of a community structure based on political affiliation, as well as how groups migrate and the kinds of patterns and structures that emerge. [12] Studies have also demonstrated how the sentiment of a conversation is strongly connected to real-world events affecting a political campaign. [13] Topic sentiment has also been explored with regard to the network's temporal components, showing that the network's sentiment can shift over time and that these shifts are highly related to offline activities and events (for example, a public debate airing on national television).

Closely related to this is the prediction of election results using social media, including predicting success using the structural properties of a network, as well as the type of content produced and the network's overall sentiment and polarization. [14] However, predictive modeling is limited in its ability to predict human behavior. Typically, the modeling relies on data generated by social media platforms and is therefore affected by the biased, non-representative makeup of social media users.

DATA GATHERING METHODOLOGY

Two Twitter datasets were collected during these two major political events. We used Twitter's standard Search API via the Southampton Web Observatory platform, [15] with a 2-minute polling rate. The standard Search API provides a sample of Twitter posts containing specific keywords, which we predefined based on an initial analysis of the trending hashtag for the given topic (for example, #Brexit, #USElection16). Although the API only provides a sample of the Twitter feed for a specific query, the conjecture that we tested in these experiments was that the data collected provides a representative sample of conversations on Twitter. Similarly, although Twitter is not a representative reflection of the different demographics in society, our conjecture in this study was that it offers a view of those who actively participate on the platform.

TABLE 1. Twitter datasets used in political outcome analysis.

Dataset          Brexit (23 June 2016)          2016 US presidential election (8 November 2016)
Query keywords   Brexit                         USElection16
Collection date  1 May 2016 to 31 July 2016     1 October 2016 to 31 December 2016
Tweets           4,64,217                       985,28
Retweets         2,197,3                        659,96
@ Mentions       437,36                         74,484
Hashtags         149,758                        43,865

FIGURE 1. A depiction of the data pipeline and analysis. Stages shown: dataset; slice data into timeframes; LDA topic model; sentiment analysis; LDA topics over time (full data); data analysis; data visualization. LDA: Latent Dirichlet Allocation.
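Descriptive statistics of the kind listed in Table 1 can be computed directly from tweet text. The following is a minimal sketch, not the authors' pipeline: the sample tweets are invented, a retweet is assumed to be marked by the conventional "RT @" prefix, and mentions and hashtags are counted per occurrence (the source does not say whether Table 1 counts occurrences or tweets).

```python
import re

# Hypothetical sample of tweet texts; the real datasets are not reproduced here.
SAMPLE_TWEETS = [
    "RT @alice: Polls open today #Brexit",
    "Thinking about the vote #Brexit #EUref",
    "@bob have you seen this? #Brexit",
    "RT @carol: @dave results are coming in",
]

MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def describe(tweets):
    """Return simple counts comparable in spirit to the rows of Table 1."""
    stats = {"tweets": 0, "retweets": 0, "mentions": 0, "hashtags": 0}
    for text in tweets:
        stats["tweets"] += 1
        # Assumption: a retweet is identified by the "RT @" prefix.
        if text.startswith("RT @"):
            stats["retweets"] += 1
        # Assumption: every occurrence counts, not just one per tweet.
        stats["mentions"] += len(MENTION_RE.findall(text))
        stats["hashtags"] += len(HASHTAG_RE.findall(text))
    return stats


if __name__ == "__main__":
    print(describe(SAMPLE_TWEETS))
```

Counting per occurrence rather than per tweet is a design choice; switching to per-tweet counts only requires replacing `len(...findall(text))` with a boolean test.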
Table 1 details the respective datasets, including the keywords used to query the Twitter Search API, along with a set of general descriptive statistics about the datasets. For this analysis, we limited the posts within each dataset to a period running from the month before to the month after the vote. That is, as the Brexit vote took place in June 2016, our dataset included posts from the beginning of May to the end of July. As shown in Table 1, the datasets share similarities with respect to the distribution and diffusion of the data.

ANALYTICAL METHODS

We used existing analytical methods to examine social media use during two major political events. Conversation sentiment analysis can yield useful results for determining the overall feeling about how a political campaign is performing. Complementing this, topic modeling can help elucidate which topics are most highly discussed. We draw upon an extension of the topic modeling technique that exploits the data's temporal aspects, giving us the ability to model topics over time. Thus, we have a means to show the shift in topic popularity over a given time period.

The analysis conducted in this paper considered the temporal aspects of sentiment analysis and topic modeling in order to determine the changes in social media engagement with a political event as it evolved over time. This provides an overview of how, if at all, the general attitude and mood in social media shifted over time, and how this potentially relates to the overall outcome of the political event. Figure 1 shows the data pipeline used, and how sentiment and topic modeling were used to generate snapshots of how social media usage changed over time. Due to the binary nature of the political events under investigation, we used a number of predefined keywords to classify the data into different categories based on either the political party (for example, Republican vs. Democrat) or the outcome of the vote (that is, VOTEYES vs. VOTENO).

ANALYZING SOCIAL MEDIA IN POLITICAL EVENTS

To understand the role of social media during political events, we applied a collection of analytical methods to the Twitter datasets described in Table 1 to help describe the structure and content of the network as well as the interactions between humans.
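Before turning to the results, the keyword-based classification step described under Analytical Methods can be illustrated with a short sketch. The keyword sets below are hypothetical placeholders (the source does not list the predefined keywords), and the four-way labeling (one side, the other side, mixed, none) is an assumption modeled on the classes that appear in the topic tables.

```python
import re

# Hypothetical keyword sets; the paper does not publish its actual lists.
KEYWORDS = {
    "Leave": {"leave", "voteleave"},
    "Remain": {"remain", "strongerin"},
}


def classify(text, keywords=KEYWORDS):
    """Label a tweet by which side's keywords it contains.

    Returns one class name, "mixed" if keywords from more than one
    class occur, or "none" if no keyword occurs.
    """
    # Lowercase and tokenize; "#VoteLeave" becomes the token "voteleave".
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    matched = [label for label, words in keywords.items() if tokens & words]
    if len(matched) > 1:
        return "mixed"
    if matched:
        return matched[0]
    return "none"


if __name__ == "__main__":
    for tweet in ["Time to #VoteLeave", "Remain or leave?", "Brexit news"]:
        print(tweet, "->", classify(tweet))
```

Matching on whole tokens rather than substrings avoids labeling a word such as "remainder" as Remain, at the cost of missing compound hashtags that are not in the keyword set.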

Our analysis focused on the three main areas in question: the temporal evolution of the network, its structure, and the topics and sentiment of content within it.

An overview of the events

Brexit, the UK's referendum on EU membership, was Europe's defining political event of 2016, and it was arguably the most significant political event in the UK in half a century. The official (and unofficial) campaigns, politicians, and citizens engaged heavily in discussions on social media platforms to communicate Brexit-related information and arguments, gain attention, and influence voters. Journalists and commentators used these platforms to report on the events of the campaign. Although traditional polling methods were used to estimate voter preferences in the lead-up to the vote itself (and the final polls generally predicted a Remain win), social media also served an important function for many media outlets as a source of information on how the campaign was unfolding.

In contrast to the Brexit campaign, which was a one-off event, the 2016 US presidential election was part of a long-established political cycle. For several US presidential campaigns, social media has been a prominent venue for discussion and a focus of engagement for candidates and news media. In the lead-up to the election, the discourse was dominated by the selected candidates of the Republican and Democratic parties, Donald Trump and Hillary Clinton, respectively. Due to the popularity of social media platforms such as Twitter, and their use in political reporting, social media garnered significant engagement during debates that were broadcast on national television, as well as during other events throughout campaigning on both sides.
Social media was also mentioned during discussions about polling (namely in national news broadcasts and online news reporting), with the reported sentiment of social media conversations based on black-box methods.

Temporal dynamics

The distribution of interactions during a political campaign provides an overview of the level of engagement, both from the official accounts (of campaigns, parties, and candidates) and from interactions with the average user. For the EU Twitter data on the Brexit referendum, interactions (tweets, retweets, and mentions) were evenly distributed across the timeframe of the pre- and post-campaign period, with a spike in activity surrounding the day of the vote (23 June). The distribution of published content encompassed a much wider window of discussion compared with many of the other political campaigns that have been studied. As shown in Figure 2a (and in more detail in Figures 2b and 2c), although there was a lot of discussion prior to the Brexit vote, our analysis showed a sustained level of interaction after the vote as well, no doubt stemming from the political debate about the implications for the UK that continued after the outcome of the referendum.

FIGURE 2. (a) Brexit tweets over time. (b) Brexit Leave tweets (daily). (c) Brexit Remain tweets (daily). Series in each panel: total tweets, retweets, and mentions.

In comparison, the temporal profile of the US presidential election shown in Figure 3a reveals increasing activity on social media prior to the election, and an increase (greater than 2 percent) on Election Day. Figures 3b and 3c illustrate this in more detail, showing the spike in activity before Election Day. However, unlike the Brexit referendum, activity on social media after the presidential election decreased at a much faster rate. This contrast in how quickly activity declined can be attributed to the more ephemeral nature of a regular election compared with a major constitutional referendum. It could also be due to the migration of social media discussion and activity to another hashtag or online community and space.

FIGURE 3. (a) 2016 US presidential election Twitter time series (daily), (b) Democrat tweets, and (c) Republican tweets. Series in each panel: total tweets, retweets, and mentions.

In both the UK and US datasets, more than 9 percent of the increase in content on the day of the vote was re-shared content. In both cases, the week surrounding the election contained more than 7 percent of the total retweets, and featured the longest retweet chains (that is, a single tweet shared repeatedly within a given time window, without modification of the tweet content). As such, the ratio between tweets and retweets suggests that as Election Day approaches, publication of new content within the social media network decreases and the recirculation of information becomes increasingly common. This is an important research point for understanding the timeline and strategy by which offline campaigns affect the nature of online campaigns.

Network structure

The network structure, which refers to the clusters of activity and the overall connectivity between these clusters, is a measure of the interactions between actors within the network and the diversity of communities engaging.
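One simple way to make "clusters" and "connectivity" concrete is to look at connected components of a retweet graph. The sketch below is an illustration only: it assumes a graph in which nodes are users and an undirected edge links a retweeter to the original author, the edge list is invented, and connected components are a coarser notion than the community structure a full network analysis would examine.

```python
from collections import defaultdict, deque

# Hypothetical (retweeter, original_author) pairs.
SAMPLE_EDGES = [("u1", "u2"), ("u3", "u2"), ("u4", "u1"), ("u5", "u6")]


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    graph = defaultdict(set)
    for retweeter, author in edges:
        graph[retweeter].add(author)
        graph[author].add(retweeter)

    seen = set()
    components = []
    for start in list(graph):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = set()
        while queue:
            node = queue.popleft()
            members.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(members)
    return sorted(components, key=len, reverse=True)


if __name__ == "__main__":
    for component in connected_components(SAMPLE_EDGES):
        print(len(component), sorted(component))
```

A single dominant component with a few small stragglers would correspond to a network in which most participants are at least indirectly connected through retweets.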
Figure 4 illustrates the structure of the retweet network, where nodes represent users and edges represent the retweeting of content between users. In the context of political events, the structure of the network offers a way to evaluate the discussions between different political parties or candidates and their supporters, the structure of the subcommunities within a network, and the diffusion of information.

The overall network structure is similar to that found in existing studies of political engagement on social networking platforms. There is a non-uniform distribution of tweets; that is, there were several highly active contributors, and many users who only published a few tweets. There were also large numbers of retweets within the network (which contributed to the highly active contributors' tweet counts), and these represented interactions between different, strongly connected hubs. For a social media user, this means that the average voice in the network can be lost, as the algorithms favor the popular (defined as having high friend/follower and post counts) and push their posts to the top of the stack. There are also many long retweet chains repeating the same information, which is closely related to the diversity of topics available. This means that political discussion can become dominated by a small number of individuals or organizations.

Structurally, social media activity for the 2016 US presidential election was similar to the network for the Brexit vote. We again observed an uneven distribution of content, with much of it produced by several key actors in the network. As a consequence of the small number of highly active actors, there were several (more than in the Brexit dataset) strongly connected clusters of actors who produced more than 6 percent of the network content (excluding retweets).
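The concentration of content among a few highly active actors can be quantified as the share of original tweets produced by the k most active accounts. This is a minimal sketch with invented author names and counts; the choice of k and the exclusion of retweets before counting are assumptions the reader would need to set for a real dataset.

```python
from collections import Counter

# Hypothetical author of each original (non-retweet) tweet.
SAMPLE_AUTHORS = ["news_outlet"] * 6 + ["party_account"] * 2 + ["user_a", "user_b"]


def top_share(authors, k):
    """Fraction of tweets written by the k most active authors."""
    counts = Counter(authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


if __name__ == "__main__":
    for k in (1, 2):
        print(f"top {k}: {top_share(SAMPLE_AUTHORS, k):.0%}")
```

A value close to 1 for small k indicates that a handful of accounts dominate what is published, which is the pattern the text describes.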
Manual inspection of these highly active participants revealed that they represent members of two opposing political parties, national news and media sources, or high-profile individuals who had strong political affiliations.

FIGURE 4. (a) 2016 US election retweet Twitter network, illustrating one giant component. (b) Brexit retweet Twitter network, illustrating one large component and several smaller disconnected communities.

Exploring this in more depth, Figures 5a and 5b show the distribution of users by the number of tweets they published, categorized by their political view (namely, Democrat or Republican, Remain or Leave). In the US presidential election, among users who published more than 5 tweets, some published content associated with the Republican category, but none were associated with the Democrat category. In terms of visible content within the network, this had the potential to bias the network structure, as the tweets (and users) that produced a lot of content could attract visibility on the Twitter public timeline and overshadow the content from less active users. In contrast, in the Brexit dataset, Remain and Leave users were balanced (when examined by number of published tweets), with similar numbers of users across the different bins (for example, <2 to 1,+).

FIGURE 5. (a) Distribution of published tweets for the US presidential election: Democrat vs. Republican. (b) Distribution of published tweets for the Brexit referendum: Remain vs. Leave. Horizontal axis: number of published tweets (binned); vertical axis: number of users (%).

The analysis reveals that although there are many strong communities within the network, the low network modularity suggests that these communities were also interacting with each other. Putting this into context, although there were strongly connected networks that represented different political parties or candidates, there was also discussion (such as tweeting and retweeting) between the different political views (or political candidates). Examining the actors who were responsible for decreasing the modularity (namely, those who interacted with several strongly connected clusters), we found that they both produced and shared content at an equal rate. In terms of their impact, these actors had the ability to introduce content into different communities, which could otherwise have been disconnected or unaware of what other discussions were occurring. Alternatively, however, these could have been bots or actors responsible for aggregating content or sharing misleading information.

TABLE 2. 2016 US presidential election topic keywords.

Class        Period             Keywords
Democrat     October            clinton, hillary, votes, presidential, history
Democrat     October/November   clinton, hillary, fbi, emails, director
Democrat     November           clinton, hillary, elections, early, delete
Democrat     November/December  clinton, hillary, presidential, candidate, popular
Democrat     December           clinton, republicans, loves, game, elections
Republican   October            trump, donald, elections, voting, presidency
Republican   October/November   trump, donald, vote, president, popular
Republican   November           trump, donald, elections, presidential, win
Republican   November/December  trump, donald, sexual, great, face
Republican   December           trump, donald, election, world, times
Mixed        October            trump, clinton, basically, crushed, factcheckers
Mixed        October/November   trump, clinton, foundations, vastly, donald
Mixed        November           clinton, trump, hillary, donald, votes
Mixed        November/December  clinton, trump, hillary, donald, poll
Mixed        December           clinton, trump, donald, hillary, russia
No keywords  October            elections, vote, electoral, fact, russia
No keywords  October/November   elections, news, read, russian, results
No keywords  November           elections, voting, bad, chief, america
No keywords  November/December  elections, russian, voted, message, man
No keywords  December           elections, time, people, polls, para

Polarization

Although polarization can be considered an outcome of the network structure and the content within the network, rather than a definitive metric, several studies have shown it to be a particularly useful metric in the context of political campaigns. [17] A common approach to calculating the polarization within a network is to examine the network structure and then label actors within the network according to their political affiliation. This can then be used as a measure of the balance between a given set of labels (or classes).

In the context of the Brexit campaign, using the simple method of binning data into Leave and Remain camps (using hashtags/keywords as the bins), we observe a clear bias toward the Leave campaign. More than 65 percent of the tweets contained the string "Leave".
Over time, the proportion of tweets containing "Leave" increased (that is, around the time of voting, there was a shift in the number of tweets containing "Leave"). If this had been known during the campaign, it might have suggested that the Leave campaign was succeeding in getting its message across to voters.

Applying the same binning technique to label content associated with the two leading US presidential candidates and their respective parties, the network showed a clear bias toward the Republican candidate, Donald Trump. This is consistent with Trump's wider domination of news media during the primaries and the campaign. In the dataset collected, more than 8 percent of the tweets contained a reference to either Trump or the Republican Party. We also found that closer to Election Day, the balance between the candidates shifted. Before the week of Election Day, 65 percent of published tweets made reference to the Republican Party or Donald Trump; closer to Election Day, this shifted dramatically, to more than 85 percent of content produced or shared.

Without inspecting the content, language, and intended purpose of the tweets, it is not possible to know whether these tweets were in favor of or against the given political party or candidate. In Trump's case, it is plausible that there was a substantial share of negative coverage of his campaign, for example, given the release of the infamous Access Hollywood video on 7 October; and similarly for Clinton on 28 October, when the FBI announced they had reopened the case regarding her use of a private email server when she was Secretary of State. Nonetheless, these measures provided a rough overview of the general balance of discussions occurring, independent of whether they were positive or negative in reference to the person being discussed.
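The binning measure described in this section amounts to computing, for a given time window, the share of tweets that contain any keyword from a bin. The sketch below illustrates this with invented tweets and hypothetical keywords and window boundaries; as the text cautions, a keyword match says nothing about whether a tweet is favorable or unfavorable.

```python
from datetime import date

# Hypothetical (date, text) pairs; not drawn from the datasets in Table 1.
SAMPLE = [
    (date(2016, 10, 20), "Clinton rally draws crowds"),
    (date(2016, 10, 25), "Trump speaks in Ohio"),
    (date(2016, 10, 28), "FBI statement on emails"),
    (date(2016, 10, 30), "New poll: trump narrows gap"),
    (date(2016, 11, 3), "Trump and Clinton in final push"),
    (date(2016, 11, 6), "Early voting numbers released"),
    (date(2016, 11, 7), "Republican turnout expected to be high"),
    (date(2016, 11, 8), "Election Day: Trump casts his vote"),
]

# Hypothetical keyword bin for one side.
KEYWORDS = ("trump", "republican")


def mention_share(tweets, keywords, start, end):
    """Share of tweets dated within [start, end] containing any keyword."""
    in_window = [text for day, text in tweets if start <= day <= end]
    if not in_window:
        return 0.0
    # Case-insensitive substring match, mirroring simple string binning.
    hits = sum(
        1 for text in in_window if any(k in text.lower() for k in keywords)
    )
    return hits / len(in_window)


if __name__ == "__main__":
    before = mention_share(SAMPLE, KEYWORDS, date(2016, 10, 1), date(2016, 11, 1))
    week = mention_share(SAMPLE, KEYWORDS, date(2016, 11, 2), date(2016, 11, 8))
    print(f"before election week: {before:.0%}, election week: {week:.0%}")
```

Comparing the share across two windows, as in the usage lines, is one way to express the kind of shift the text reports between the period before election week and the days closest to the vote.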
Topics and sentiment

To understand the context of the network, topic modeling offers a way to learn common topics of discussion within large corpora of text, without needing to bootstrap the model with an initial list of keywords or topics. Although this requires significant tuning, it has been shown to perform particularly well on Twitter and similar social networks. Perhaps more interesting for political studies is the temporal diversity of topics, which provides a measure of the number of topics discussed over time.

Network sentiment is another useful metric when used appropriately. Sentiment can be measured across all content within a dataset (for example, by applying sentiment analysis to all messages at once) or after documents have been grouped based on the topic modeling results. Another approach applicable to political events is to measure sentiment within the categories established during the polarization process, which allows finer levels of sentiment to be detected at the category level.

Table 2 provides an overview of the topic keywords identified during the three months included in the 2016 US presidential election dataset. Topics were generated for tweets containing the keywords for the Republican Party, for the Democratic Party, for both parties, and for neither. The topic keywords identified reflect controversial topics that were discussed in mainstream media (such as Hillary Clinton's email scandal and the Russian hacking scandal).

TABLE 3. Brexit referendum topic keywords.
Class | Topic1 | Topic2 | Topic3 | Topic4 | Topic5
Leave | leave brexit vote britain european | leave brexit campaign poll vote | leave vote brexit britain campaign | leave brexit vote campaign decision | leave brexit vote campaign british
Remain | remain brexit vote voting camp | remain brexit campaign vote businesses | remain brexit campaign referendum vote | remain brexit poll campaign ahead | remain brexit vote team campaign
mixed | leave remain poll ahead puts | leave remain brexit vote voters | leave remain brexit campaign polls | leave remain european vote brexit | remain leave brexit debate london
none | brexit vote people britain immigration | brexit trump trade donald les | brexit cameron david britain vote | brexit vote economy bank british | brexit del los por para

Table 3 shows the topic keywords for the UK's referendum on EU membership. The topic diversity (which can be measured over time) shows that the topics initially in the Leave class were quite diverse and tended to be negative in sentiment. However, closer to the day of the vote, the topics identified became less diverse (that is, discussions increasingly focused on a similar topic), and the sentiment of social media activity became less negative. In contrast, the topics for the Remain tweets continued to be diverse and their sentiment did not change (remaining negative).
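One way to make "topic diversity over time" and class-level sentiment concrete is sketched below in Python. This is an illustration under stated assumptions, not the method used in the article: the article does not give a formula for diversity, so the sketch reports both the number of distinct topics and the Shannon entropy of the topic distribution. The topic IDs and sentiment scores are assumed to come from whatever topic model and sentiment scorer were applied upstream, and the names `topic_diversity` and `summarize` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def topic_diversity(topic_ids):
    """Shannon entropy (bits) of a list of topic IDs.

    Returns 0.0 when every item has the same topic; higher values
    mean discussion is spread over more topics.
    """
    counts = Counter(topic_ids)
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def summarize(records):
    """Summarize topics and sentiment per (period, class).

    records: iterable of (period, cls, topic_id, sentiment) tuples, where
    sentiment is a numeric score (assumed negative = negative sentiment).
    Returns {(period, cls): (distinct_topics, entropy_bits, mean_sentiment)}.
    """
    grouped = defaultdict(list)
    for period, cls, topic_id, sentiment in records:
        grouped[(period, cls)].append((topic_id, sentiment))
    summary = {}
    for key, rows in grouped.items():
        topics = [topic for topic, _ in rows]
        mean_sentiment = sum(score for _, score in rows) / len(rows)
        summary[key] = (len(set(topics)), topic_diversity(topics), mean_sentiment)
    return summary
```

Under these assumptions, a class whose discussion converges on one theme shows a falling distinct-topic count and entropy from one period to the next, while its mean sentiment can be tracked alongside.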
The Brexit pattern is consistent with the prevailing criticism that the Remain campaign lacked a common positive message, whereas the Leave campaign effectively focused on the message of "taking back control" in the closing weeks leading up to the vote. Similarly, we found that in the 2016 US presidential election, although the sentiment of social media discussion about the Republican Party was initially negative, as Election Day approached the diversity of topics decreased (that is, people on social media were increasingly talking about the same thing in relation to the party), and the sentiment of the topics became less negative. This is consistent with the Republicans' late gains in popularity toward the end of the campaign, which were reflected in national opinion polls for the generic congressional election. In contrast, the sentiment surrounding Democrats remained steady (and negative), and the number of topics being discussed did not decrease.

DISCUSSION

In our analysis and experiments, we demonstrated a range of techniques that can be applied to determine how social media is used during a political campaign or event and the nature of the conversations that arise. By combining a range of methods, it is possible to develop a high-level overview of the structure of interactions and to drill down into the content of the network to understand what is discussed and shared, as well as the general sentiment of a given community.

In interpreting these results, we must consider that discussions on social media represent only a small portion of the overall discussion in a political campaign and, moreover, play only a small part in the overall political ecosystem. However, as recent results have shown, traditional polling and political forecasting do not appear to be correctly predicting voting outcomes, whereas analysis of social media platforms is increasingly showing their impact on the outcome of the vote.
For both the Brexit vote and the US presidential election, the expected outcome was completely overturned when the actual results were announced. However, according to our analysis, conversations on social media were strongly indicative of the actual results. Is this just coincidence? Or is the strength of negative sentiment in social media conversations a better indicator of the general population's mood than we would rationally expect? Much more experimental analysis is needed to even begin to answer this question.

One of the core limitations of many social media research papers is data access and data quality, rather than the application of methods. For example, Twitter provides only samples of the full discussion (just 1 percent of the full firehose), and the method for generating this sample is not disclosed. As a result, it is not possible to determine whether the sample is representative of the full stream of content. Another limitation is the method of data collection, which was based on keyword or query search. This is problematic because it assumes that all content related to a specific topic will use at least one of the given keywords.

The use of social media in political campaigning is still very much in its infancy, and the analysis of the resulting datasets is even more so. There are no common methodologies, standards, or benchmarks against which to run experiments. In addition, as researchers, we are all using different datasets for our experiments. The source of the datasets may be the same social network (such as Twitter), but the sample obtained will be different, as will the design of the experiments and the algorithms used to do the analysis.

We cannot claim any general results from the work described in this paper. We are simply reporting on the analysis we did of the particular datasets we obtained from Twitter pertaining to the UK's Brexit referendum and the 2016 US presidential election. Our analysis considered temporal dynamics, network structure, polarization, and topics and sentiment broadly across the datasets we collected. In future work, we plan to delve more deeply into the data to explore, for example, the impact of extreme negative sentiment, such as the impact the release of the Access Hollywood video had on the Trump campaign, or the impact the FBI's announcement of its investigation into Clinton's email had on her campaign. We are also applying the algorithms developed for the analysis described in this paper to other campaigns.

As this paper has made clear, there are many different ways to approach the analysis of social media with regard to its role in democracy and political campaigning. Yet research groups are working in isolation, applying different algorithms and methodologies to different datasets that have been collected in undocumented ways. To establish a body of work in this area, we need to be able to repeat and reproduce experiments and reuse the same datasets to run other experiments. This highlights the need for data repositories that enable sharing for research purposes. The Web Observatory is an example of such a repository to support Web Science research, but this need extends to many other scientific disciplines.18 The datasets we collected for our analysis can be found on the Southampton Web Observatory.19,20

We have one final word of caution about the dangers of doing this type of work. The sociologist Manuel Castells is famously quoted as saying, "Power does not reside in institutions, not even the state or large corporations. It is located in the networks that structure society." So, whoever controls the networks has the power, and the sort of tools researchers and commercial ventures are developing to analyze social media could be used in the future to determine the outcome of democratic elections. This means the campaign with the best algorithms wins. We have already seen the beginnings of this behavior in recent elections. Society might well need to quickly determine new ethical boundaries around the use of social media data analysis during election campaigns, or AI could determine who our next leaders will be.

REFERENCES

1. J.E. Carlisle and R.C. Patton, "Is Social Media Changing How We Understand Political Engagement? An Analysis of Facebook and the 2008 Presidential Election," Political Research Quarterly, vol. 66, no. 4, 2013, pp. 883–895.
2. E. Ferrara et al., "The Rise of Social Bots," Comm. ACM, vol. 59, no. 7, 2016, pp. 96–104.
3. X. Yang et al., "Social Politics: Agenda Setting and Political Communication on Social Media," Int'l Conf. Social Informatics, 2016, pp. 330–344.
4. R. Bond and S. Messing, "Quantifying Social Media's Political Space: Estimating Ideology from Publicly Revealed Preferences on Facebook," Am. Political Science Review, vol. 109, no. 1, 2015, pp. 62–78.
5. J. Mellon and C. Prosser, "Twitter and Facebook Are Not Representative of the General Population: Political Attitudes and Demographics of British Social Media Users," Research & Politics, vol. 4, no. 3, 2017; doi: 10.1177/2053168017720008.
6. S. Flaxman, S. Goel, and J.M. Rao, "Filter Bubbles, Echo Chambers, and Online News Consumption," Public Opinion Quarterly, vol. 80, no. S1, 2016, pp. 298–320.
7. Y. Wang et al., "Catching Fire via 'Likes': Inferring Topic Preferences of Trump Followers on Twitter," Proc. 2016 Int'l AAAI Conf. Web and Social Media (ICWSM 16), pp. 719–722.
8. P.T. Metaxas and E. Mustafaraj, "Social Media and the Elections," Science, vol. 338, no. 6106, 2012, pp. 472–473.
9. A. Jungherr, "Twitter in Politics: A Comprehensive Literature Review," SSRN, 27 Feb. 2014; https://dx.doi.org/10.2139/ssrn.2402443.
10. Y. Mejova, P. Srinivasan, and B. Boynton, "GOP Primary Season on Twitter: Popular Political Sentiment in Social Media," Proc. 6th ACM Int'l Conf. Web Search Data Mining (ICWSDM 13), 2013.
11. P. Debjyoti et al., "Compass: Spatio Temporal Sentiment Analysis of US Election What Twitter Says!," Proc. 23rd ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining (KDD 17), 2017, pp. 1585–1594; doi.org/10.1145/3097983.3098053.
12. S.A. Myers et al., "Information Network or Social Network?: The Structure of the Twitter Follow Graph," Proc. 23rd ACM Int'l Conf. World Wide Web (WWW 14), 2014, pp. 493–498.
13. C.S. Park, "Does Twitter Motivate Involvement in Politics? Tweeting, Opinion Leadership, and Political Engagement," Computers in Human Behavior, vol. 29, no. 4, 2013, pp. 1641–1648.
14. A. Tumasjan et al., "Predicting Elections with Twitter: What 140 Characters Reveal About Political Sentiment," Proc. Int'l Conf. Weblogs Social Media (ICWSM 10), 2010, pp. 178–185.
15. R. Tinati et al., "A Streaming Real-Time Web Observatory Architecture for Monitoring the Health of Social Machines," Proc. 24th ACM Int'l Conf. World Wide Web (WWW 15), 2015, pp. 1149–1154.
16. A.O. Larsson and H. Moe, "Studying Political Microblogging: Twitter Users in the 2010 Swedish Election Campaign," New Media & Society, vol. 14, no. 5, 2012, pp. 729–747.
17. M. Conover et al., "Political Polarization on Twitter," Proc. Int'l Conf. Weblogs Social Media (ICWSM 11), 2011, pp. 89–96.
18. R. Tinati et al., "Building a Real-Time Web Observatory," IEEE Internet Computing, vol. 19, no. 6, 2015, pp. 36–45.
19. "2016 UK BREXIT Tweet Dataset," Web Observatory; webobservatory.soton.ac.uk/datasets/f8nc2axztdel3cspw.
20. "2016 US Presidential Elections Dataset," Web Observatory; webobservatory.soton.ac.uk/datasets/hnbddvjesspnjm4y5.

ABOUT THE AUTHORS

WENDY HALL, DBE, FRS, FREng, is Regius Professor of Computer Science at the University of Southampton and the executive director of the Web Science Institute. With Sir Tim Berners-Lee and Sir Nigel Shadbolt, she co-founded the Web Science Research Initiative in 2006 and is the managing director of the Web Science Trust. She became a Dame Commander of the British Empire in the 2009 UK New Year's Honours list and is a Fellow of the Royal Society. She has previously been President of the ACM, Senior Vice President of the Royal Academy of Engineering, and a member of the UK Prime Minister's Council for Science and Technology. She is also co-chair of the UK government's AI Review. Contact her at wh@ecs.soton.ac.uk.

RAMINE TINATI is a senior data scientist on the Microsoft Global Black Belt Team in Singapore. While performing the work reported in this article, he was a researcher at the University of Southampton. Tinati's research interests include real-time big data stream processing and querying. He was a New Frontiers Fellow and Senior Research Fellow in the Web and Internet Science group at the University of Southampton, where he worked on the EPSRC-funded project SOCIAM, which involves developing methods and analytics to understand the development and connectivity of the Web. Tinati received a PhD and MSc in Web Science from the University of Southampton. Contact him at r.tinati@soton.ac.uk.

WILL JENNINGS is professor of political science and public policy at the University of Southampton. His research explores questions relating to public policy and political behavior, specifically in relation to agenda setting, public opinion, elections, democratic innovations, political geography, policy disasters, and anti-politics. Jennings was a member of the independent inquiry instigated by the British Polling Council and Market Research Society to investigate the performance of the pre-election polls at the 2015 general election. He received a PhD in politics & international relations from the University of Oxford. Contact him at W.J.Jennings@soton.ac.uk.