Cross Social Media Recommendation

Cross Social Media Recommendation @ICWSM16
Xiaozhong Liu, Indiana University Bloomington
Tian Xia, Renmin University
Yingying Yu, Dalian Maritime University
Chun Guo, Indiana University Bloomington
Yizhou Sun, Northeastern University

Overture: If social media can mirror this world, what might the world look like?

As a result, if we want to model world-level knowledge, research findings based on a single social media system can be BIASED, and the social networks or knowledge networks generated from a single system (that is, a single community) CANNOT fully represent people from all over the world.

Language bubbles, culture bubbles, and network bubbles. Goal: give everyone the SAME information access.

Enable global information recommendation across different social media - Pseudo Global Social Media Network (PGSMN)

Message (content): "I don't think I like #iphone6 nothing new"
Hashtag (vector): #icwsm
"I like ipad (but I don't have iphone)"
User (LM): kobe LeBron
"I'm working on social media data, but I don't know ICWSM (yet)"
1. No physical links between Twitter and Weibo
2. Users cannot register for both (policy + language barrier)

Message (content): "I don't think I like #iphone6 nothing new" / "I like ipad (but I don't have iphone)"
English text -> Explicit Semantic Analysis (ESA) -> Wikipedia <- Chinese text
iphone -> Apple Inc. -> ipad (random walk via PGSMN)

Wait! Twitter and Weibo text can be very short and noisy.
Example: "obamma" -> Barack Obama (Wikipedia redirect page)
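The mapping from noisy text to Wikipedia concepts can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three toy articles, the `REDIRECTS` table, the function name `esa_vector`, and the fixed weight of 1.0 for a redirect hit are all assumptions made for the example. Real ESA builds its TF-IDF index from a full Wikipedia dump.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data standing in for Wikipedia articles and redirect pages.
ARTICLES = {
    "Apple Inc.": "apple iphone ipad mac company",
    "Barack Obama": "obama president united states",
    "Basketball": "basketball nba player team",
}
REDIRECTS = {"obamma": "Barack Obama"}  # misspelling -> article title (assumed)


def esa_vector(text, articles=ARTICLES, redirects=REDIRECTS):
    """Map text to a concept vector: article title -> accumulated weight."""
    n_articles = len(articles)
    # Document frequency of each term across the toy articles.
    doc_freq = Counter(t for body in articles.values() for t in set(body.split()))
    vector = defaultdict(float)
    for token in text.lower().split():
        if token in redirects:
            # A redirect page resolves a noisy token straight to a concept.
            vector[redirects[token]] += 1.0  # weight 1.0 is an arbitrary choice
            continue
        for title, body in articles.items():
            tf = body.split().count(token)
            if tf:
                vector[title] += tf * math.log(n_articles / doc_freq[token])
    return dict(vector)
```

In this toy setup, "iphone" and "ipad" both land on the concept "Apple Inc.", which is the kind of shared anchor the slide relies on to connect messages written in different languages.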

Explicit Semantic Path Mining (ESPM)
Project text (which can be short and noisy) onto a number of ranked Wikipedia category paths.
Text -> ESA vector -> Link Analysis (Wiki category tree) -> Wiki category paths (ESPM results)

Original text: Iraq's most influential Shiite cleric suggested on Friday that Prime Minister Nouri al-Maliki needed to recast his approach or step down, adding his powerful voice to growing criticism of the Shiite-dominated government's leader...
Rank | Top 5 ESA results | Top 5 ESPM results
1 | 22 January 2007 Baghdad bombings | Society->Religion and society->Religion and politics
2 | Clericalism | Politics->Politics by issue->Petroleum politics
3 | Sadrist Movement | Politics->Political philosophy->Political philosophy by politician
4 | Sipah-e-Muhammad Pakistan | People->Aspects of individual lives->Deaths by person
5 | 23 November 2006 Sadr City bombings | Politics->Politics by country->Politics of Iraq/Iraqi nationalism

Original text (SportsCenter @SportsCenter, Aug 17): Greg Oden, the No. 1 overall pick in 2007 NBA draft, has signed a one-year, $1.2 million deal to play in China.
Rank | Top 5 ESA results | Top 5 ESPM results
1 | Greg Oden | Sports->Sports trophies and awards->Basketball trophies and awards->Gatorade National Basketball Player of the Year
2 | Henry Oden | Sports->Works about sports->Sports video games->NBC Sports video games
3 | Oden | Sports->Sports business->Sports management companies->Maple Leaf Sports & Entertainment
4 | William B. Oden | Sports->Sports-related lists->Top sports lists->National Basketball Association statistical leaders
5 | Lon Oden | Sports->Sports controversies->Basketball controversies->National Basketball Association controversies

Input can be a #hashtag (associated text): #ipadair2
1. Technology->Mobile operating systems->iOS (Apple) (P=0.45)
2. Technology->Personal computers->Tablet computers (P=0.55)

Explicit Semantic Path Mining (ESPM)
Wikipedia Category Tree (871,978 nodes + 1,229,833 edges)
Input: text
Output: Wikipedia category path
26 general categories, e.g., Culture, Education, Environment, Politics, and Science
Example category: American military personnel killed in the War of 1812
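To make the input/output contract concrete, here is a minimal sketch of turning scored categories into ranked category paths. The slides do not specify the link analysis, so this example substitutes a plain score sum along each root-to-category path. The `PARENTS` map, the scores, and both function names are hypothetical, and the toy data is assumed to be acyclic.

```python
from collections import defaultdict

# Hypothetical toy slice of a category tree: child -> list of parents.
PARENTS = {
    "Petroleum politics": ["Politics by issue"],
    "Politics by issue": ["Politics"],
    "Politics of Iraq": ["Politics by country"],
    "Politics by country": ["Politics"],
    "Religion and politics": ["Religion and society"],
    "Religion and society": ["Society"],
}


def paths_to_root(category, parents):
    """Return every path from a top-level category down to `category`."""
    ups = parents.get(category, [])
    if not ups:
        return [[category]]  # no parent: this is a general (root) category
    return [path + [category] for up in ups for path in paths_to_root(up, parents)]


def rank_category_paths(category_scores, parents, top_k=5):
    """Add each category's score to the paths leading to it, then rank paths."""
    totals = defaultdict(float)
    for category, score in category_scores.items():
        for path in paths_to_root(category, parents):
            totals["->".join(path)] += score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


ranked = rank_category_paths(
    {"Petroleum politics": 0.6, "Politics of Iraq": 0.3, "Religion and politics": 0.1},
    PARENTS,
)
```

The category scores would come from an upstream step such as the ESA vector; here they are made-up numbers chosen only to show the output format.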


Random walk between different category paths (on the tree)
Diagram labels: Hashtag, ESPM, ESPM
Decay function for the random walk: higher-level categories (expert contributed) can be more important
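One way to read the decay idea is that agreement near the top of the category tree should count more than agreement deeper down. The sketch below assumes a geometric decay per level; the slides do not give the actual function, and `path_affinity` and the `decay` parameter are names invented for this illustration.

```python
def path_affinity(path_a, path_b, decay=0.5):
    """Score two category paths by their shared prefix.

    Level 0 (the general category) contributes 1.0, level 1 contributes
    `decay`, level 2 contributes `decay ** 2`, and so on, so matches
    higher in the tree weigh more.
    """
    levels_a = path_a.split("->")
    levels_b = path_b.split("->")
    score = 0.0
    for depth, (a, b) in enumerate(zip(levels_a, levels_b)):
        if a != b:
            break  # paths diverge here; deeper levels cannot match
        score += decay ** depth
    return score
```

Under this assumption, two paths that share only the root "Politics" score 1.0, while two identical three-level paths score 1.75.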

Random walk between Twitter and Weibo (case: recommend Twitter hashtags to Weibo users)
Path: Weibo User -> Weibo Hashtag -> Wiki page -> Wiki Category <- Wiki page <- Twitter hashtag
Example: 东 (mid-east) -> Wiki: OPEC -> Category:Petroleum politics <- Wiki: Oil Crisis <- #iraqwar
Learning to Rank
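The path on this slide can be illustrated with a small random walk with restart over an undirected graph. This is a sketch under stated assumptions: the node naming scheme, the `restart` value, the fixed number of steps, and both function names are hypothetical, and the learning-to-rank stage mentioned on the slide is not implemented here.

```python
from collections import defaultdict

# Toy graph following the slide's example path, plus one unrelated component.
EDGES = [
    ("weibo_user:u1", "weibo_tag:mid-east"),
    ("weibo_tag:mid-east", "wiki:OPEC"),
    ("wiki:OPEC", "category:Petroleum politics"),
    ("wiki:Oil Crisis", "category:Petroleum politics"),
    ("twitter_tag:#iraqwar", "wiki:Oil Crisis"),
    ("twitter_tag:#nba", "wiki:Greg Oden"),  # not reachable from the user
]


def random_walk_with_restart(edges, start, restart=0.15, steps=50):
    """Approximate visit probabilities; `start` must appear in `edges`."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    prob = {node: 0.0 for node in neighbors}
    prob[start] = 1.0
    for _ in range(steps):
        nxt = {node: 0.0 for node in neighbors}
        for node, mass in prob.items():
            # Spread the non-restarting mass evenly over the neighbors.
            share = (1.0 - restart) * mass / len(neighbors[node])
            for other in neighbors[node]:
                nxt[other] += share
        nxt[start] += restart  # the restarting mass jumps back to the user
        prob = nxt
    return prob


def recommend_twitter_tags(edges, user, top_k=5):
    """Rank reachable Twitter hashtags by their visit probability."""
    prob = random_walk_with_restart(edges, user)
    tags = [n for n in prob if n.startswith("twitter_tag:") and prob[n] > 0]
    return sorted(tags, key=lambda n: -prob[n])[:top_k]
```

In the toy graph only #iraqwar is connected to the Weibo user through a shared Wikipedia category, so it is the only hashtag returned.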

Experiment
Raw data: 2012/09/17-2012/09/23 (1 week of data)
Weibo: 3,296,945 messages
Twitter: 20,128,826 messages
Wikipedia: March 2014 dump
Testing set: 459 hashtags shared by Twitter and Weibo users; 401 Weibo users use those hashtags
Those hashtags are removed from Weibo for evaluation (20,248 user-hashtag pairs for evaluation)
Left: Weibo. Right: Twitter.

Experiment - baseline
Google Machine Translation: Weibo user -> text -> translate to English -> BM25 -> Twitter hashtag index -> top ranked Twitter hashtags
Language can be noisy: spoken language challenges machine learning algorithms
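The retrieval half of this baseline can be sketched as a small Okapi BM25 ranker over a hashtag index. The translation step is not implemented: the query is assumed to be already translated into English. The function name, the whitespace tokenizer, the toy index, and the `k1` and `b` values are assumptions for illustration, not settings reported in the slides.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Rank hashtags (hashtag -> associated text) against a query with BM25.

    `docs` must be non-empty and contain at least one token overall.
    """
    tokenized = {tag: text.lower().split() for tag, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    scores = {}
    for tag, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in set(query.lower().split()):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[tag] = score
    # Best score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))


# Hypothetical hashtag index: each hashtag is represented by text that uses it.
INDEX = {
    "#iphone6": "new iphone6 launch apple phone",
    "#iraqwar": "iraq war oil politics",
    "#nba": "basketball draft nba",
}
top_tags = bm25_rank("i like the new apple phone", INDEX)
```

Because BM25 matches on surface terms, a noisy or badly translated query can miss relevant hashtags entirely, which is the weakness the slide points to.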

Result: MRR (chart; y-axis from 0 to 0.5)

Result: P@20 (chart; y-axis from 0 to 0.09)

Result: NDCG@30 (chart; y-axis from 0 to 0.08)
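For reference, the three reported metrics can be computed as in the sketch below, assuming binary relevance (a recommended hashtag either was or was not used by the user). The function names are illustrative, and the example data is made up rather than taken from the experiment.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked, relevant) pairs, one per user."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k recommendations that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking over DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal = sum(
        1.0 / math.log2(position + 1)
        for position in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0
```

With P@20 and NDCG@30, `k` would be 20 and 30 respectively, and each metric would be averaged over all test users.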

Categories: The Voice (TV series)

1. Categories: Basketball; National Basketball Association; Chinese Basketball Association
2. Category:Professional sports leagues

Conclusion
Cross-social media information recommendation is a novel but important task
Facebook <-> Twitter (a user has accounts on both)
Twitter <-> Weibo (a user cannot register for both)
Wikipedia can be a bridge
Random walk via Pseudo Global Social Media Network (PGSMN)

Future
Community-based Information Adoption/Diffusion Comparison of Different Microblogging Sites (ACM Hypertext 2016)

Data and system will be available. Thank you. liu237@indiana.edu