What's in a name? The Interplay between Titles, Content & Communities in Social Media


What's in a name? The Interplay between Titles, Content & Communities in Social Media
Himabindu Lakkaraju, Julian McAuley, Jure Leskovec
Stanford University

Motivation Content, Content Everywhere!! How to get your content noticed amidst such information overload?

An Example: Understanding a submission and its popularity
Title: I'm not sure I quite understand this piece
Popularity: score 62, 24 comments
Submitted 2 years ago to pics by xxx
Annotated elements: content, title, community, time, user, popularity
Is content the only factor determining popularity?

An Example
Score  Title                                        Community  Submitted      Comments
62     I'm not sure I quite understand this piece   pics       2 years ago    24
20     How wars are won                             WTF        18 months ago  1
774    Murica!                                      funny      1 year ago     59
10     Bring it on England, Bring it on!!           pics       10 months ago  4
226    I believe this is quite relevant currently   funny      7 months ago   15
794    God bless whoever makes these                funny      1 month ago    34
...
Content is not the only factor!!
Given a piece of content, can we maximize the probability of its success?

Motivation: Factors influencing popularity
Community or forum
Time of posting
Title of submission
Popularity of user
Previous submissions of same content
+ Content, and their confounding interplay!
How do we tease apart these effects?

Teasing apart..
How do we tease apart the effects of various factors?
A dataset which accommodates:
Resubmissions of the same content
Submissions across multiple communities
Communities with varying characteristics
Submissions by multiple users

Teasing apart.. Reddit to the rescue!

Teasing apart.. Our Dataset
A novel dataset of 132K reddit submissions
Every piece of content (image) submitted multiple times
16.7K original submissions
Average of 7 resubmissions per image
Data available at http://snap.stanford.edu/data

Our Goal
To study the effect of the interplay between content, title and communities on a submission's popularity
To understand how much of a submission's popularity is due to its inherent quality, community choice, time of posting, and characteristics of the submission title

Our Approach
Model the popularity of a submission as a combination of various factors
Evaluate the goodness of the model by predicting popularity
How do we quantify popularity? Reddit score = # of upvotes - # of downvotes
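The popularity measure on this slide is simple enough to write down directly. The snippet below is a minimal sketch of it; the `Submission` record and its field names are hypothetical and are not taken from the released dataset.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical record layout, for illustration only.
    title: str
    upvotes: int
    downvotes: int


def reddit_score(submission: Submission) -> int:
    """Popularity as defined on the slide: upvotes minus downvotes."""
    return submission.upvotes - submission.downvotes


# Made-up vote counts chosen for illustration.
example = Submission(title="hypothetical title", upvotes=80, downvotes=18)
print(reddit_score(example))  # 62
```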

Our Contributions
Popularity = Community Model + Language Model
Community model: choice of community + time of submission + previous submissions of same content
Language model: linguistic features of submission title + language of community
And a novel dataset which allows the study of various factors

Related Work
Predicting the success of social media content: content-based approaches [Bandari et al.] [Tsagkias et al.] [Yano et al.]
Understanding the relationship between language and social engagement: analysis of lexical features [Danescu-Niculescu-Mizil et al.] [Hong et al.] [Petrovic et al.] [Suh et al.]
Our work focuses on the interplay between content, lexical features, communities and the resulting composite effect on popularity

Insights Understanding community activity Popularity varies with time of the day

Insights Understanding community activity Content is less popular with each resubmission

Insights Understanding community activity Resubmissions are forgiven given enough time

Insights Understanding inter-community effects
[Figure: matrix of inter-community effects; both axes list the communities gifs, funny, WTF, space, pics, reddit.com, aww]
Don't resubmit to same community (diagonal)
Don't resubmit highly visible content (rows)

Our Approach: Community Model
Input: inherent popularity, resubmission decay, forgetfulness, inter-community effects
Output: popularity
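The slide names the ingredients of the community model but not its functional form. The sketch below is one hypothetical way to combine them, written only to make the ingredients concrete: the exponential form, the function name and every parameter value are assumptions, not the authors' model.

```python
import math


def predicted_popularity(inherent, community, previous,
                         same_penalty=1.0, cross_penalty=0.3,
                         forget_days=180.0):
    """Toy community model (illustrative assumption, not the authors' formula).

    inherent  : inherent popularity of the content
    community : community the content is being submitted to
    previous  : list of (community, days_ago) for earlier submissions
    """
    total_penalty = 0.0
    for prev_community, days_ago in previous:
        # Inter-community effect: resubmitting to the same community
        # is penalized more than resubmitting elsewhere.
        weight = same_penalty if prev_community == community else cross_penalty
        # Forgetfulness: the penalty fades as the earlier submission ages.
        total_penalty += weight * math.exp(-days_ago / forget_days)
    # Resubmission decay: each remembered submission shrinks the prediction.
    return inherent * math.exp(-total_penalty)


first = predicted_popularity(100.0, "pics", [])
same = predicted_popularity(100.0, "pics", [("pics", 10)])
other = predicted_popularity(100.0, "pics", [("funny", 10)])
print(first, same, other)
```

With these made-up parameters, a first submission keeps its full inherent popularity, a quick resubmission to the same community is penalized most, and an old resubmission is almost forgiven.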

Our Approach: Language Model
Language of a community: targeting the title to a community
Content specificity: title reflecting content
Title originality: novelty of the title
Sentiment polarity, POS tags, # of words in title
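As an illustration of the "title originality" feature, one simple way to measure novelty is the word overlap between a new title and the titles previously used for the same content. The sketch below uses Jaccard similarity for this; that choice, the whitespace tokenizer and the function names are assumptions made for illustration, since the slide does not specify the measure.

```python
def tokens(title):
    """Lowercased set of whitespace-separated words (a deliberately crude tokenizer)."""
    return set(title.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the word sets of two titles."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def title_originality(title, previous_titles):
    """1.0 for an entirely new title, 0.0 for an exact reuse of an earlier one."""
    if not previous_titles:
        return 1.0
    return 1.0 - max(jaccard(title, prev) for prev in previous_titles)


earlier = ["How wars are won", "Murica!"]
print(title_originality("How wars are really won", earlier))
```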

Insights Understanding language characteristics Titles should balance novelty and familiarity

Insights Understanding language characteristics Resubmissions benefit from novel titles

Insights Understanding language characteristics Various communities prefer different POS

Quantitative Evaluation: Predicting reddit score
Evaluating predictive power on a held-out test set of 25% of the data
Coefficient of determination, R^2 statistic (a value of 1.0 indicates perfect fit)

Model                  R^2
Community Model        0.528
Language-only Model    0.081
Community + Language   0.618
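For readers unfamiliar with the metric, the sketch below shows how a 25% hold-out and the R^2 statistic can be computed. The random shuffle and the function names are assumptions; the slides do not say how the split was drawn.

```python
import random


def split_holdout(items, test_fraction=0.25, seed=0):
    """Shuffle and split into (train, test); a random split is an assumption here."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


train, test = split_holdout(range(100))
print(len(train), len(test))              # 75 25
print(r_squared([1, 2, 3], [1, 2, 3]))    # 1.0
```

A model that always predicts the mean of the test scores gets R^2 = 0, so the table's values can be read as the fraction of variance in score explained beyond that baseline.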

Qualitative Evaluation
Legend: Top 10% (++), Top 25% (+), Bottom 25% (-), Bottom 10% (--)

In Situ Evaluation: Real-time action on Reddit!
A sample of 85 images from our dataset
Assigned a good and a bad title to each image
Total score of all good-title submissions is 3 times higher than that of the bad-title submissions
2 of our good submissions hit the Reddit front page
3 more were featured on the front pages of communities

Conclusion
Popularity is affected by the interplay of various content, language and community-specific aspects
We propose models which disentangle these effects
Modeling these effects helps us understand what fraction of popularity can be attributed to each of these factors

Thank you!!

References
R. Bandari, S. Asur and B. Huberman. The pulse of news in social media: Forecasting popularity. In ICWSM 2012.
M. Tsagkias, W. Weerkamp and M. de Rijke. Predicting the volume of comments on online news stories. In CIKM 2009.
T. Yano and N. Smith. What's worthy of comment? Content and comment volume in political blogs. In ICWSM 2010.
C. Danescu-Niculescu-Mizil, M. Gamon and S. Dumais. Mark my words! Linguistic style accommodation in social media. In WWW 2011.
L. Hong, O. Dan and B. Davison. Predicting popular messages in Twitter. In WWW 2011.
S. Petrovic, M. Osborne and V. Lavrenko. RT to win! Predicting message propagation in Twitter. In ICWSM 2011.
B. Suh, L. Hong, P. Pirolli and E. Chi. Want to be retweeted? Large scale analytics on factors impacting retweet in Twitter network. In SocialCom 2010.
D. Blei and J. McAuliffe. Supervised topic models. In NIPS 2007.