Computational challenges in analyzing and moderating online social discussions


Aristides Gionis
Department of Computer Science, Aalto University
Machine learning coffee seminar, Oct 23, 2017

social media
people use social media to
- consume content: news about friends, politics, favorite artists
- generate content: share experiences, interesting articles
- interact with others: comment, rate, and discuss
hundreds of millions of active users share information, express opinions, comment, interact, discuss, and get a personalized news feed
62% of adults in the US get their news from social media (source: Pew Research Center)

social media: good and bad sides
advantages: no information barriers, citizen journalism, social connectivity, democratization, ...
disadvantages: harassment, fake news, echo chambers, polarization, ...

polarization
political or social polarization: the act of separating or making people separate into two groups with completely opposite opinions
related term, controversy: public discussion and argument about something that many people strongly disagree about
(definitions: Oxford English Dictionary)

polarization in US politics
[figure: polarization in US politics, 1994 vs. 2014; source: Pew Research Center]

the polarization cycle
- user choices
- algorithmic personalization
related to the filter bubble and the echo chamber

research questions
- can we identify polarized discussions in social media?
- has polarization increased over time?
- how does collective attention impact polarization?
- can we design algorithms to help reduce polarization?
- can we design algorithms to moderate online discussions?

research question: identify and quantify polarization
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "Quantifying controversy in social media", ACM WSDM 2016

focus on twitter
- microblogging platform launched in 2006
- 300 million active users
- users post short messages (tweets)

tweet retweets replies connections

how can we identify polarization?
ideas
- content: do opposing sides say different things?
- sentiment: do polarized topics exhibit a wider range of emotions?
- interactions: do people interact more with their own side?

method template
- build an interaction graph; try several types: retweets, replies, connections
- is the interaction graph polarized?
- output: polarization score
[figure: two example graphs, non-polarized vs. polarized (two sides well separated)]
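The first step of the template, building an interaction graph, can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the input format (pairs of retweeter and original author), the function name `build_retweet_graph`, and the `min_count` threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_retweet_graph(retweets, min_count=1):
    """Build an undirected interaction graph from (retweeter, author) pairs.

    An edge is kept only if the two users retweeted each other at least
    `min_count` times in total (hypothetical threshold, not from the slides).
    """
    counts = Counter()
    for retweeter, author in retweets:
        if retweeter != author:  # ignore self-retweets
            counts[frozenset((retweeter, author))] += 1

    adj = defaultdict(set)
    for pair, n in counts.items():
        if n >= min_count:
            u, v = tuple(pair)
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


# Made-up records: ann and bob retweet each other, ann retweets cat once.
retweets = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("dan", "dan")]
graph = build_retweet_graph(retweets, min_count=2)
```

The same function would apply to reply or follow records, since only the pairs change.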

pipeline
questions
- what type of interaction graph should we use?
- how to find two sides in the graph?
- how to measure the separation between two sides?
- do we identify polarized discussions?
stages: topic -> graph building -> graph partitioning -> controversy / polarization measure -> evaluation
- graph building: retweets, replies, connections
- graph partitioning: any state-of-the-art algorithm
- controversy measure: random-walk, edge betweenness, embedding-based

random-walk controversy score (RWC)
- assume the graph is partitioned into two sides, A and B
- consider a random walk that started at a random node and finished at a hub in side $Y \in \{A, B\}$
- $P_{XY}$ is the probability that the random walk started in side $X \in \{A, B\}$:
  $P_{XY} = \Pr(\text{r.w. started in } X \mid \text{r.w. finished in } Y)$
- random-walk controversy score: $\mathrm{RWC} = P_{AA} P_{BB} - P_{AB} P_{BA}$
- does not depend on cluster sizes and relative in-degrees
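A small sketch may help make the RWC definition concrete. It is written under several assumptions that are not stated on the slide: hubs are the highest-degree nodes of each side, a walk stops at the first hub it reaches, the starting side is picked with probability 1/2 and the starting node uniformly within that side, and every node has at least one neighbour. Absorption probabilities are obtained by plain fixed-point iteration, which is only suitable for toy graphs.

```python
def rwc(adj, side_a, side_b, hubs_per_side=1, iters=1000):
    """Random-walk controversy score for an undirected graph (toy version)."""

    def hubs(side):
        # Assumption: hubs are the top-degree nodes, ties broken by node id.
        ranked = sorted(side, key=lambda u: (-len(adj[u]), u))
        return set(ranked[:hubs_per_side])

    hubs_a, hubs_b = hubs(side_a), hubs(side_b)
    absorbing = hubs_a | hubs_b

    # h[u] = probability that a walk from u first hits a hub of side A.
    h = {u: 0.5 for u in adj}
    h.update({u: 1.0 for u in hubs_a})
    h.update({u: 0.0 for u in hubs_b})
    for _ in range(iters):
        h = {
            u: h[u] if u in absorbing
            else sum(h[v] for v in adj[u]) / len(adj[u])
            for u in adj
        }

    # Pr(finish in Y | start in X), averaging over start nodes of side X.
    end_a_from_a = sum(h[u] for u in side_a) / len(side_a)
    end_a_from_b = sum(h[u] for u in side_b) / len(side_b)
    end_b_from_a = 1.0 - end_a_from_a
    end_b_from_b = 1.0 - end_a_from_b

    # Bayes' rule with equal prior on the starting side gives P_XY.
    p_aa = end_a_from_a / (end_a_from_a + end_a_from_b)
    p_ba = end_a_from_b / (end_a_from_a + end_a_from_b)
    p_bb = end_b_from_b / (end_b_from_a + end_b_from_b)
    p_ab = end_b_from_a / (end_b_from_a + end_b_from_b)
    return p_aa * p_bb - p_ab * p_ba


def clique(nodes):
    return {u: {v for v in nodes if v != u} for u in nodes}


# Two 4-cliques joined by a single edge: the two sides are well separated.
separated = clique(range(4))
separated.update(clique(range(4, 8)))
separated[3].add(4)
separated[4].add(3)
rwc_separated = rwc(separated, set(range(4)), set(range(4, 8)))  # close to 1

# One 6-clique split arbitrarily in two: no real sides.
mixed = clique(range(6))
rwc_mixed = rwc(mixed, {0, 1, 2}, {3, 4, 5})  # close to 1/3
```

On such a tiny graph the arbitrary split does not score exactly zero, because the hubs themselves count as starting nodes; the point of the example is only that the well-separated graph scores much higher.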

evaluation
annotate polarized and non-polarized topics
- polarized: indian beef ban, nemtsov protests, netanyahu US congress speech, baltimore riots, ukraine
- non-polarized: germanwings plane crash, sxsw, mother's day, jurassic world movie, national kissing day
evaluate different settings on ground truth

best performing setting
- graph building: retweet graph
- controversy measure: RWC
- other good settings: edge betweenness score, sentiment variance

example of results (using retweet graph)
[figure: retweet interaction graphs for four topics]
- polarized topics, high RWC: nemtsov protests, indian beef ban
- non-polarized topics, low RWC: germanwings plane crash, sxsw conference

example of results
[figure: interaction graphs for nemtsov protests, retweets and replies]

research questions
- does polarization increase over time?
- does polarization increase with spikes of activity?
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "The effect of collective attention on controversial debates on social media", ACM Web Science 2017

polarization over time
data
- 1% sample of all tweets
- September 2011 to September 2016
method, for a given topic (e.g., obamacare)
- build the retweet graph for each day
- measure the RWC score
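The per-day method above amounts to bucketing retweets by day and scoring each day's graph. The sketch below shows only that bookkeeping. The record format `(day, retweeter, author)` and the helper names are assumptions, and `edge_count` is a deliberately trivial stand-in for the RWC score so that the example stays short.

```python
from collections import defaultdict


def daily_scores(retweets, score):
    """Return {day: score(graph_of_that_day)} for (day, retweeter, author) records."""
    by_day = defaultdict(lambda: defaultdict(set))
    for day, retweeter, author in retweets:
        if retweeter != author:  # ignore self-retweets
            by_day[day][retweeter].add(author)
            by_day[day][author].add(retweeter)
    return {
        day: score({u: set(nbrs) for u, nbrs in adj.items()})
        for day, adj in sorted(by_day.items())
    }


def edge_count(adj):
    """Stand-in score (NOT RWC): number of undirected edges."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Made-up retweet log for one topic.
log = [
    ("2016-09-01", "a", "b"),
    ("2016-09-01", "b", "c"),
    ("2016-09-02", "a", "c"),
]
series = daily_scores(log, edge_count)
```

Any function that maps an adjacency dictionary to a number can be passed as `score`, including one that first partitions the graph and then computes a controversy measure.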

RWC over time
[figure: RWC score over time, September 2011 to September 2016]

activity spikes at major events

RWC vs. activity volume
[figure: RWC vs. volume; axis labels: higher controversy, higher volume]

other measures vs. activity volume
clustering coefficient, core density, core-periphery edges, bi-directional links, content distribution, etc.
findings
- polarization increases with volume
- most retweeting activity occurs within a side
- retweet network becomes more hierarchical
- more discussion on the reply network
- content becomes more similar between the two sides

research questions
- design algorithms to help reduce polarization
- design algorithms to moderate online discussions
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "Reducing controversy by connecting opposing views", ACM WSDM 2017
K. Garimella, A. Gionis, N. Parotsidis, N. Tatti, "Balancing information exposure in social networks", NIPS 2017

reducing polarization
how can we bridge the divide?
- assuming the polarization score is measured by RWC, we want to reduce RWC
- problem: add k edges that maximally reduce RWC

reducing polarization: greedy algorithm
- find the single best edge to reduce RWC
- repeat k times
inefficient
- computing RWC requires $O(\mathrm{MMULT}(n))$
- faster in practice with iterative computation
- still, greedy requires $O(n^2 \, k \, \mathrm{MMULT}(n))$
improvements
- consider adding edges only between hubs
- incremental RWC computation using the Sherman-Morrison formula
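The greedy loop on this slide can be sketched independently of how the score is computed. In the example below the score is a pluggable function; to keep it short, the demo uses a simple stand-in (`homophily`, the average share of same-side neighbours) instead of RWC. The function names, the toy graph, and the stand-in score are all illustrative assumptions, and none of the efficiency improvements listed above are implemented.

```python
def add_edge(adj, u, v):
    """Return a copy of the graph with the undirected edge (u, v) added."""
    new = {n: set(nbrs) for n, nbrs in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


def greedy_add_edges(adj, candidates, score, k):
    """Pick up to k candidate edges, each time the one that lowers `score` most."""
    current, remaining, chosen = adj, list(candidates), []
    for _ in range(k):
        if not remaining:
            break
        best = min(remaining, key=lambda e: score(add_edge(current, *e)))
        chosen.append(best)
        remaining.remove(best)
        current = add_edge(current, *best)
    return chosen, current


# Toy instance: two triangles, one per side, with no edges between them.
side = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}


def homophily(adj):
    """Stand-in score (NOT RWC): mean fraction of same-side neighbours."""
    shares = [sum(side[v] == side[u] for v in adj[u]) / len(adj[u]) for u in adj]
    return sum(shares) / len(shares)


cross_pairs = [(u, v) for u in (0, 1, 2) for v in (3, 4, 5)]
chosen, bridged = greedy_add_edges(triangles, cross_pairs, homophily, k=2)
```

Each round evaluates the score once per remaining candidate, which is where the cost on the slide comes from: with all node pairs as candidates there are on the order of $n^2$ evaluations per round and k rounds in total.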

reducing polarization
what does it mean to add k edges?
- answer: recommendations
- but many recommendations are unlikely to materialize: no point recommending D. Trump to retweet H. Clinton
- incorporate the probability of accepting a recommendation
- compute user polarity, and the acceptance probability as a function of user polarity
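One way to read the last two points is to rank each candidate recommendation by its expected effect: the acceptance probability times the reduction in the score if it is accepted. The sketch below is purely illustrative. The slides do not give the acceptance function, so the linear form used here is a hypothetical placeholder, polarity is assumed to lie in [-1, 1], and the score reductions in the demo are made-up numbers.

```python
def acceptance_probability(polarity_u, polarity_v):
    """HYPOTHETICAL acceptance model: falls linearly with polarity distance.

    Polarities are assumed to lie in [-1, 1], so the result lies in [0, 1].
    """
    return 1.0 - abs(polarity_u - polarity_v) / 2.0


def rank_recommendations(candidates):
    """Sort (label, polarity_u, polarity_v, score_drop) by expected score drop."""
    scored = [
        (acceptance_probability(pu, pv) * drop, label)
        for label, pu, pv, drop in candidates
    ]
    return sorted(scored, reverse=True)


# Made-up candidates: connecting two extremes would help most if accepted,
# but it is very unlikely to be accepted.
candidates = [
    ("extreme-to-extreme", -0.99, 0.95, 0.10),
    ("extreme-to-moderate", -0.99, 0.15, 0.04),
]
ranking = rank_recommendations(candidates)
```

Under this placeholder model the recommendation toward the moderate account ranks first, even though its effect, if accepted, is smaller.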

reducing polarization : real example polarity=-.99 polarity=.95

reducing polarization : real example polarity=-.99 polarity=.15

reducing polarization : results

balancing information exposure
the standard viral-marketing setting [Kempe et al. 2003]
- a social network
- a model of information propagation, e.g., the independent-cascade model
- an action (e.g., a meme) propagates in the network
the influence-maximization problem: find k seed nodes to maximize spread
the standard solution
- spread is non-decreasing and submodular
- greedy gives a $(1 - 1/e)$ approximation
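The standard solution can be sketched as a greedy loop around a Monte Carlo estimate of spread under the independent-cascade model. This is a minimal illustration with assumed names and parameters: a single activation probability `p` for every edge, `runs` simulations per estimate, and a fixed random seed. Practical implementations use much faster estimators.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate each neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, p, runs, rng):
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Greedily add the node with the largest estimated spread, k times."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        best = max(
            sorted(nodes - set(chosen)),
            key=lambda u: expected_spread(graph, chosen + [u], p, runs, rng),
        )
        chosen.append(best)
    return chosen


# Directed toy graph; with p=1.0 the cascade is deterministic (plain reachability).
graph = {0: [1, 2, 3], 4: [5]}
seeds = greedy_seeds(graph, k=2, p=1.0, runs=1)
```

With `p=1.0` the demo is deterministic: node 0 reaches four nodes and is picked first, and node 4 then adds the two nodes that are still uncovered.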

balancing information exposure: proposed setting
- a social network and two campaigns
- seed nodes $I_1$ and $I_2$ for the two campaigns
- a model of information propagation
the problem of balancing information exposure: find additional seeds $S_1$ and $S_2$, with $|S_1| + |S_2| \le k$, so as to
- minimize the number of users who see only one campaign, or equivalently
- maximize the number of users who see both or none
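The objective in the maximization form can be illustrated with a small function that counts users exposed to both campaigns or to neither. To keep the example deterministic it assumes that every edge always transmits, so exposure is plain reachability from the seeds; the setting above uses a probabilistic propagation model instead. The function names and the toy graph are illustrative only.

```python
from collections import deque


def reach(graph, seeds):
    """Nodes reached from `seeds`, assuming every edge transmits (simplification)."""
    seen, queue = set(seeds), deque(seeds)
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def balanced_users(graph, nodes, seeds_1, seeds_2):
    """Number of users who see both campaigns or neither of them."""
    r1, r2 = reach(graph, seeds_1), reach(graph, seeds_2)
    return sum(1 for u in nodes if (u in r1) == (u in r2))


graph = {0: [1, 2], 3: [4]}
nodes = range(6)
i1, i2 = {0}, {3}  # initial seeds of the two campaigns

before = balanced_users(graph, nodes, i1, i2)
# Add one extra seed per campaign (k = 2): S1 = {3}, S2 = {0}.
after = balanced_users(graph, nodes, i1 | {3}, i2 | {0})
```

In the toy graph only the isolated user is balanced at first; after each campaign is also seeded at the other's starting node, every user sees both campaigns or neither.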

balancing information exposure: our results
- the optimization problem is NP-hard
- the objective function is non-monotone and non-submodular
- different models of how the two campaigns propagate
- approximation guarantee $\frac{1}{2}(1 - 1/e)$ for the maximization version

balancing information exposure : example

discussion, limitations, future work
- models use mostly network structure: language-independent, but incorporating language can help
- simple models
  - two-sided controversies
  - external influence is ignored
  - random walk and independent cascade too simple
- evaluation is challenging, done on few topics
- go beyond twitter

references
K. Garimella, A. Gionis, N. Parotsidis, N. Tatti, "Balancing information exposure in social networks", NIPS 2017
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "The effect of collective attention on controversial debates on social media", ACM Web Science 2017
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "Reducing controversy by connecting opposing views", ACM WSDM 2017
K. Garimella, G. De Francisci Morales, A. Gionis, M. Mathioudakis, "Quantifying controversy in social media", ACM WSDM 2016

thank you
Q & A