Events and Memes in Media-rich Social Information Networks. Lexing Xie, Computer Science, Australian National University. EBMIP Workshop, Oct 2013
Internet Memes: quotes, tags (e.g. #occupy), links (e.g. http://y2u.be/_oblgsz8ssm), pictures and video
Questions we ask
Network [ICWSM 2012]: What does the macro structure of the event web look like?
User [COSN 2013]: Will fine-grained social profiles and interactions help predict user interests and preferences?
Content [ACM MM 2011, TMM 2013]: Can we predict video remix trends on YouTube?
A few continuing inquiries: analyzing media-rich microblogs in China; inferring hidden networks from data (Wed PM oral)
TOPIC 1: Network. A MAP OF THE EVENT WEB. Who is discussing news events in the blogosphere? Kim, Xie, Christen [ICWSM 2012]
What Drives Linking and Quoting? A significant amount of online content reflects news.
Real-world Event in Social Media. [Figure: users U1 to U5 and events E1 to E4, connected through documents from mainstream news, social media, and weblogs. Legend: document, user, event, hyperlink.]
Construct an Event Registry from Wikipedia. [Figure labels: category, reference documents, event, episode.]
Analyzing the Jan-2011 Spinn3r Collection. [Figure: #documents (millions) and #links for the subsets all, EN, linked, user, event; values shown: 380+, 59.9, 4.1, 2.2, 2.2. Panels: within-category cascade; largest cascade.]
Statistics about Users and Posts. [Figure: # users and # posts in the weakly connected component (WCC), strongly connected component (SCC), and reciprocal core (RC).] Users: WCC 87.69%, SCC 4.26%, RC 1.02%. Posts: WCC 99.61%, SCC 56.89%, RC 42.98%.
Event Media as Skewed Bowtie. ~285K news, blog and social media users authored ~2M documents on events from Jan-Feb 2011. SCC: 4.3%; IN: 55%; OUT: 3.7%; Tendrils: 24.2%; Tube: 0.53%; Disc: 12.3%. By contrast, in the web graph SCC, IN, and OUT are each about 20-25% of all sites [Broder et al. 1999]. 1% of the users, consisting of the reciprocal core of the SCC, authored 43% of all documents.
Event Media as Skewed Bowtie (continued). ~285K news, blog and social media users authored ~2M documents on the events in Jan-Feb 2011. SCC: 4.3%; IN: 55%; OUT: 3.7%; Tendrils: 24.2%; Tube: 0.53%; Disc: 12.3%. [Figure: proportions of users by media type.] Moreover, 1% of the users, consisting of the reciprocal core of the SCC, authored 43% of all documents.
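To make the bowtie terminology concrete, here is a minimal sketch of how a directed graph can be split into SCC, IN, and OUT. The toy graph, the function names, and the lumping of tendrils, tubes, and disconnected nodes into one OTHER bucket are assumptions for illustration; this is not the pipeline used in the talk.

```python
from collections import deque

def reachable(graph, sources):
    """Breadth-first search: every node reachable from any node in sources."""
    seen = set(sources)
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def reverse(graph):
    """Return the graph with every edge direction flipped."""
    rev = {v: set() for v in graph}
    for v, nbrs in graph.items():
        for w in nbrs:
            rev.setdefault(w, set()).add(v)
    return rev

def bowtie(graph):
    """Split nodes into the largest SCC, IN, OUT, and everything else."""
    nodes = set(graph) | {w for nbrs in graph.values() for w in nbrs}
    rev = reverse(graph)
    scc, assigned = set(), set()
    for v in sorted(nodes):
        if v in assigned:
            continue
        # The SCC of v is what v can reach AND what can reach v.
        comp = reachable(graph, [v]) & reachable(rev, [v])
        assigned |= comp
        if len(comp) > len(scc):
            scc = comp
    out_part = reachable(graph, scc) - scc   # reachable from the core
    in_part = reachable(rev, scc) - scc      # can reach the core
    other = nodes - scc - in_part - out_part # tendrils, tubes, disconnected
    return {"SCC": scc, "IN": in_part, "OUT": out_part, "OTHER": other}

# Hypothetical toy graph: a -> b -> c -> a is the core.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "o"], "i": ["a", "t"], "d": []}
parts = bowtie(graph)
```

The quadratic SCC search keeps the sketch short; a graph with hundreds of thousands of users would call for Tarjan's or Kosaraju's linear-time algorithm instead.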
TOPIC 2: User. PREDICTING PREFERENCES SOCIALLY. You are what you like? Sedhain, Sanner, Xie et al. [ACM COSN 2013]
The Social Recommendation Problem. ANU LinkR App: recommends 3 daily links on Facebook. Non-friend recommendation (link context only); friend recommendation (friend message + link context); rating + optional link feedback. [Tran, Noel, Sanner, Christen, Xie, WWW 2012]
Social Recommendation: Problem Setting. Will user U like or dislike a given URL?
Social Recommendation: Common Methods. Matrix Factorization (MF); Social Similarity.
Matrix Factorization Methods
PMF [Salakhutdinov, Mnih, NIPS 2008]: min over U, V (objective shown on the slide)
MatchBox [Stern, Herbrich, Graepel, WWW 2009]: min over U, V (objective shown on the slide)
Social Matchbox [Noel, Sanner, Tran, Christen, Xie, WWW 2012]: social regularizer; social spectral regularizer
Don't predict S_{x,z}; use it to vary the regularization strength!
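The objective formulas on this slide did not survive as text. For orientation only, the PMF objective as it is commonly written is the regularized squared error below. The symbols (R for observed ratings, I for the indicator of observed entries, lambda for the regularization weights) are standard notation assumed here, not recovered from the slide.

```latex
\min_{U,V} \; \frac{1}{2} \sum_{i} \sum_{j} I_{ij} \left( R_{ij} - U_i^{\top} V_j \right)^2
  + \frac{\lambda_U}{2} \sum_{i} \lVert U_i \rVert^2
  + \frac{\lambda_V}{2} \sum_{j} \lVert V_j \rVert^2
```

The slide's remark about S_{x,z} indicates that the social variants add regularization terms whose strength depends on a social similarity between users x and z; their exact form is not reproduced here.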
How to define social similarity? In reality, candidate signals include: liked U's videos, #msgs sent. [Figure: signals feeding into social similarity.]
Social Affinity. Interactions: {link, post, photo, video} × {like, tag, comment} × {incoming, outgoing}. Activities: Groups 3,469; Pages 10,771; Favorites 4,284.
Social Affinity Features for Recommendation. Will user u like link i? [Figure labels: interactions and activities, social affinity group, like/dislike.]
Social Affinity Features. Example affinity groups (Pages): {u2, u7, u9}, {u2, u5, u11, ...}. Social Affinity Filtering (SAF): train and test a classifier (Naïve Bayes, Logistic Regression, or SVM) on these features.
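A minimal sketch of the idea, under stated assumptions: each affinity group contributes one binary feature that is 1 when any member of the group liked the candidate link, and a Bernoulli Naïve Bayes classifier is trained on those features. The feature definition, the toy users and links, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math

def saf_features(item, groups, likes):
    """One binary feature per affinity group: 1 if any member liked the item."""
    return [int(any(item in likes.get(member, set()) for member in members))
            for members in groups]

def train_bernoulli_nb(X, y):
    """Fit Bernoulli Naive Bayes with add-one (Laplace) smoothing."""
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        # Smoothed estimate of P(feature_k = 1 | label)
        probs = [(sum(r[k] for r in rows) + 1) / (len(rows) + 2)
                 for k in range(n_features)]
        model[label] = (prior, probs)
    return model

def predict(model, x):
    """Return the label (0 = dislike, 1 = like) with the highest log-posterior."""
    scores = {}
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for xk, p in zip(x, probs):
            score += math.log(p if xk else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical data: two affinity groups of the target user, and friends' likes.
groups = [{"u2", "u7", "u9"}, {"u2", "u5", "u11"}]
likes = {"u2": {"link_a"}, "u7": {"link_a", "link_b"}, "u9": {"link_f"},
         "u5": {"link_c"}, "u11": {"link_c", "link_d"}}

# The target user liked link_a and link_b, and disliked link_c and link_d.
train_items = ["link_a", "link_b", "link_c", "link_d"]
train_labels = [1, 1, 0, 0]
X = [saf_features(item, groups, likes) for item in train_items]
model = train_bernoulli_nb(X, train_labels)
prediction = predict(model, saf_features("link_f", groups, likes))
```

In this toy data the first group agrees with the target user's tastes and the second does not, so a new link liked only by a member of the first group is predicted as a like.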
Data Description. LinkR: Link Recommender App; 119 users and 37,872 friends.
SAF Accuracy. [Figure: accuracy of baselines vs. Social Affinity Filtering.]
Not all social interactions are equal. Use conditional entropy as an informativeness measure (formula shown on the slide).
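The slide's formula is not preserved, so the sketch below uses the textbook conditional entropy H(Y | X), estimated from paired samples, where X is a binary affinity feature and Y is the like/dislike label. Which exact variant the talk used is unknown; the function name and toy data are assumptions.

```python
import math
from collections import Counter

def conditional_entropy(xs, ys):
    """H(Y | X) in bits, estimated from paired samples (xs[i], ys[i])."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    marginal_x = Counter(xs)
    h = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n                   # joint probability P(x, y)
        p_y_given_x = count / marginal_x[x]  # conditional probability P(y | x)
        h -= p_xy * math.log2(p_y_given_x)
    return h

# A feature that fully determines the label leaves no uncertainty.
informative = conditional_entropy([1, 1, 0, 0], [1, 1, 0, 0])
# A feature independent of a balanced label leaves one full bit.
uninformative = conditional_entropy([1, 1, 0, 0], [1, 0, 1, 0])
```

Lower values mean the feature tells us more about the label, so ranking features by ascending conditional entropy puts the most informative ones first.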
Are large groups more informative than small groups? Large groups tend not to be predictive. The most predictive groups were small in size.
Are all favourites equally informative? The majority are less informative, with a few very informative outliers.
Most and Median Informative Favourites. Median favourites were generic; the most informative were specialized.
The power of page likes
Michal Kosinski, David Stillwell, and Thore Graepel, Private traits and attributes are predictable from digital records of human behavior, PNAS 2013: page likes can be used to predict gender, relationship status, religion, etc.
Yongzheng Zhang and Marco Pennacchiotti, Predicting purchase behaviors from social media, WWW '13: page likes can be used to predict purchase behavior on eBay.
If you are building a social recommender: ask users to link their Facebook profile; ask for page likes permission (ONLY); use SAF to build a scalable state-of-the-art recommender system.
TOPIC 3: Content. TRACKING REMIX ON YOUTUBE. What is retweeting for video? [ACM MM 2011, TMM 2013]
Remixing on YouTube. [Figure: Iran topic; Video A and Video B on YouTube video pages; meme shot examples.]
Visual memes. Meme := a cultural unit (an idea, value, or pattern of behavior) that is passed from one person to another by social means. Visual meme := a frequently re-posted visual unit: an image or short video segment. Example: upload: 2009-06-21; author: shobeir1976; title: Ey Shahid (O Martyr); 51 other videos, 2009-06 ~ 08.
How prevalent are visual memes? [Figure: probability that a video contains a meme.] More than 50% of videos contain memes, and ~70% of authors participate in producing and disseminating memes. Video popularity (view count) can be inversely correlated with being a meme video!
Meme graphs: video graph and author graph. Example: video J8wjwLcrJAA, upload date 2009-06-28, author Shapulak.
Diffusion influence of authors
Can we predict virality? On YouTube, early view counts correlate well with ultimate view counts [Szabo and Huberman, CACM 2010]. What should be the dimensions of meme virality? Volume: how many people remix this visual meme? Longevity: when is the last remix (among those observed)?
Predicting importance with meme features. Content volume up to 24 hours. Author graph centrality features: degree centrality, betweenness centrality, closeness centrality; author influence index. The mean and standard deviation of the above features over all authors in the first 24 hours.
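As an illustration of how such features could be assembled, the sketch below computes degree and closeness centrality on a small undirected author graph and then aggregates them into mean and standard deviation. The toy graph, the undirected treatment, and the use of the population standard deviation are assumptions; betweenness centrality and the author influence index are left out to keep the example short.

```python
from collections import deque
from statistics import mean, pstdev

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """(reachable nodes - 1) / sum of shortest-path distances; 0 if isolated."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives unweighted shortest paths
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical author graph: author "a" shares memes with three others.
graph = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}

deg = degree_centrality(graph)
clo = closeness_centrality(graph)

# Aggregate per-author centralities into fixed-length features for one meme.
features = {
    "degree_mean": mean(list(deg.values())),
    "degree_std": pstdev(list(deg.values())),
    "closeness_mean": mean(list(clo.values())),
    "closeness_std": pstdev(list(clo.values())),
}
```

Aggregating over authors gives every meme a feature vector of the same length regardless of how many authors posted it in the first 24 hours.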
Predicting Meme Popularities. [Figure: timeline in t (days); features x are known from day 0 to day 1 and the model log(y) = f(x) predicts the later outcome.] Volume: 0.966~1.035x; lifespan: 0.794~1.259x. [Xie, Natsev, et al., submitted]
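A minimal sketch of predicting in log space, assuming a single feature and ordinary least squares; the form of f on the slide is not specified, so the linear model and all names here are illustrative. The helper at the end converts an error in log10 units into a multiplicative range. Notably, 10^±0.015 and 10^±0.1 reproduce the slide's 0.966~1.035x and 0.794~1.259x, but reading the slide's ranges this way is an assumption.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of log10(y) = a + b * x for a single feature x."""
    logs = [math.log10(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    covariance = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    variance = sum((x - mean_x) ** 2 for x in xs)
    b = covariance / variance
    a = mean_log - b * mean_x
    return a, b

def predict(a, b, x):
    """Map the log-space prediction back to the original scale."""
    return 10 ** (a + b * x)

def error_factor_range(log10_error):
    """Multiplicative range implied by an error of +/- log10_error."""
    return 10 ** -log10_error, 10 ** log10_error

# Hypothetical training data: the outcome grows tenfold per unit of x.
a, b = fit_log_linear([1, 2, 3, 4], [10, 100, 1000, 10000])
low, high = error_factor_range(0.1)
```

Fitting in log space makes errors multiplicative, which suits quantities such as volume and lifespan that span several orders of magnitude.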
Keywords vs popularity. [Table columns: frequent words; VOLUME (high corr, low corr); LIFESPAN (high corr, low corr).]
Predicting popularity on YouTube. Early view counts correlate well with ultimate view counts. [Szabo and Huberman, CACM 2010] [Pinto et al., WSDM 2013] What about sudden changes?
Summary
Network [ICWSM 2012]: Events can be traced across the social web. Its macro structure is a skewed bowtie.
User [COSN 2013]: A fine-grained social profile is a powerful predictor of user interest. Just ask for page likes!
Content [ACM MM 2011, TMM 2013]: Image analysis helps track popular content and influential users on YouTube. Meme popularity can be predicted from network features.
Also ask me about: multimedia-hard problems; cultural differences in microblog communications; building a multimedia knowledge graph (Wed oral session).
Thank You! Get in touch: lexing.xie@anu.edu.au, http://cecs.anu.edu.au/~xlx/, http://acml2013.conference.nicta.com.au/