
MULTIMEDIA RETRIEVAL
Semester 1, 2022
Social Media
 Background
 Content


 Text + Multimedia
 Users
 Profile
School of Computer Science

Explosive Worldwide Growth of Social Media
 Network: Facebook, WeChat, TikTok
 Blogging/Microblogging: Twitter, LiveJournal, Blogger
 Sharing: Flickr, YouTube, Pinterest, Instagram
 News: Digg, Slashdot
 Bookmarking: Delicious
 Knowledge: Wikipedia
 Shopping: Groupon
 Location: Foursquare
 Games: Zynga, Playfish
Courtesy of , “Social Networking,” CCF 2011.
Social Media as Sensor
 A sensor network consists of spatially distributed autonomous sensors to monitor physical or environmental conditions, such as temperature, sound, light, pressure, etc. and to cooperatively pass their data through the network to a main location.
Social media as a sensor: autonomous, intelligent, active, mobile, maintenance-free, multimodal, social, sentiment-aware.

Rich applications of social media
 Topic evolution and tracking [Alsumait, ICDM’08; Ramage, WSM’10]
Source: stories.facebook.com
Rich applications of social media
 Tracking political activity and campaigns [Jin, MM’10]
#OccupyWallStreet: origin and spread visualized http://www.bbc.co.uk/news/world-us-canada-19794259

Extensive Research Interests on Social Media
 Computational social science [Lazer, Science’09]
 … leverage the capacity to collect and analyze data with an unprecedented breadth and depth and scale…
 Twitter as medium and message [Savage, Comm of ACM’11]
 Twitter data may help answer sociological questions that are otherwise hard to approach…
 IEEE Trans. on Big Data, IEEE Trans. on Computational Social Systems, Special issues in IEEE Intelligent Systems, IEEE Trans on Multimedia, TOMCCAP…
 Special sessions/workshops in international conferences
 Social Media Workshop at ACM Multimedia and IEEE ICME
 Web Search and Data Mining (WSDM)
 AAAI Conference on Weblogs and Social Media (ICWSM)
 WWW/SIGKDD/SIGIR/ICDM/CIKM
Controversial Language Usage with COVID-19
 COVID-19 is the official name from the WHO
 However, terms such as Chinese Virus and Wuhan Virus were also used for many reasons.
 Discrimination, hate speech
 Developing racism, dividing societies
H. Lyu, L. Chen, Y. Wang, and J. Luo. Sense and Sensibility: Characterizing Social Media Users Regarding the Use of Controversial Terms for COVID-19. IEEE Trans. on Big Data, 2021.

Controversial Language Usage with COVID-19
 Question
 Who is using these controversial terms?
 Prevent the exponential growth of such a mindset
 Solution
 A social media and big data study
 17 million tweets on Twitter
 Demographic analysis
 User-level features, political status, geo-locations
Controversial Language Usage with COVID-19
 Approach
Data Collection and Pre-processing
 Controversial keywords
 “chinese virus” and “#ChineseVirus”
 Non-controversial keywords
 “corona”, “covid-19”, “covid19”, “coronavirus”, “#Corona”, “#Covid_19” and “#coronavirus”
 4-day period: March 23-26, 2020
 1,125,285 tweets for Controversial Dataset (CD)
 16,320,176 tweets for Non-controversial Dataset (ND)
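The keyword-based split into CD and ND can be sketched as follows. This is a minimal illustration, assuming case-insensitive substring matching and that a tweet containing a controversial keyword goes to CD even if it also contains a neutral one; neither rule is stated on the slide. The function name `assign_dataset` is hypothetical and the neutral list is only a subset of the keywords above.

```python
# Keyword lists taken from the slide, lowercased for case-insensitive matching.
CONTROVERSIAL = ["chinese virus", "#chinesevirus"]
NON_CONTROVERSIAL = ["corona", "covid19", "coronavirus"]  # subset, for illustration


def assign_dataset(text):
    """Return 'CD', 'ND', or None for a tweet text (assumed matching rule)."""
    lowered = text.lower()
    # Assumption: controversial keywords take priority over neutral ones.
    if any(keyword in lowered for keyword in CONTROVERSIAL):
        return "CD"
    if any(keyword in lowered for keyword in NON_CONTROVERSIAL):
        return "ND"
    return None


tweets = [
    "The #ChineseVirus is spreading",
    "Stay safe from coronavirus",
    "Nice day at the beach",
]
labels = [assign_dataset(t) for t in tweets]
```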

Controversial Language Usage with COVID-19
 Approach
Data Collection and Pre-processing
 Baseline Dataset
 7 user-level features were either collected or computed
 followers_count, friends_count, statuses_count, favorites_count, listed_count, account_length, verified status
 1,125,176 tweets in CD with 593,233 distinct users
 1,599,013 tweets in ND with 490,168 distinct users
Controversial Language Usage with COVID-19
 Approach
Data Collection and Pre-processing
 Demographic Datasets
 Face++ API to obtain inferred age and gender information by analyzing users’ profile images

Controversial Language Usage with COVID-19
 Approach
Data Collection and Pre-processing
 Geo-location Datasets
 Only a very limited number of tweets contain self-reported locations (1.2% of crawled data).
 Use the user profile location as the source of geo-location, which has a substantially higher percentage of entries in the crawled datasets (16.2% of crawled data).
 Aggregate Datasets
 Datasets with both demographic and geo-location features were created.
 These datasets contain all the attributes analyzed in the study, at the cost of a relatively small size: 5,772 for CD and 12,403 for ND.

Controversial Language Usage with COVID-19
 Proportion of Political Following Status
 Most users do not follow any of these people.
 The second biggest group of users in both CD and ND corresponds to users who only follow .

Controversial Language Usage with COVID-19
 Geo-location
Controversial Language Usage with COVID-19
 What to do?
 To predict Twitter users who employ controversial terms in the discussion of COVID-19.
 Investigated six classification models
 Logistic Regression (Logit)
 Random Forest (RF)
 Support Vector Machine (SVM)
 Stochastic Gradient Descent Classification (SGD)
 Multi-layer Perceptron (MLP)
 XGBoost (XGB)
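To make the prediction task concrete, here is a minimal pure-Python sketch of the first model in the list, logistic regression, trained with stochastic gradient descent. The feature rows and labels are made up for illustration, and the learning rate and epoch count are arbitrary assumptions; the study's actual implementation and hyperparameters are not given on the slide.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logit_sgd(rows, labels, lr=0.5, epochs=2000, seed=0):
    """Fit weights and bias by per-sample gradient steps on the log-loss."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            logit = bias + sum(w * x for w, x in zip(weights, rows[i]))
            error = sigmoid(logit) - labels[i]  # d(log-loss)/d(logit)
            weights = [w - lr * error * x for w, x in zip(weights, rows[i])]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, row):
    logit = bias + sum(w * x for w, x in zip(weights, row))
    return 1 if sigmoid(logit) >= 0.5 else 0


# Toy rows: [verified (0/1), followers_count rescaled to 0..1]; values are invented.
X = [[0, 0.2], [0, 0.3], [0, 0.1], [0, 0.4], [1, 0.9], [1, 0.8], [0, 0.9], [1, 0.7]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = user employs controversial terms (toy labels)

weights, bias = train_logit_sgd(X, y)
predictions = [predict(weights, bias, row) for row in X]
```

The other five models would be swapped in behind the same fit/predict interface when comparing classification performance.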

Controversial Language Usage with COVID-19
 Classification performance
Controversial Language Usage with COVID-19
 Feature Association

Controversial Language Usage with COVID-19
 Insights
 Young people tend to use non-controversial terms.
 Male users are more likely than female users to use controversial terms.
 Non-verified users are more likely to use controversial terms.
 More users in the CD group follow Donald Trump than in the ND group.
 Users living in rural or suburban areas are more likely to use controversial terms.
Tracking shifts in Mental Health Signals in COVID-19
 Motivation
 Widespread problems of psychological distress have been observed in many countries following the outbreak of COVID-19, including Australia.
 What is lacking from current scholarship is a national-scale assessment that tracks the shifts in mental health during the pandemic timeline and across geographic contexts.
 Analysing geotagged tweets in Australia
 Employing machine learning and spatial mapping techniques to classify, measure and map changes in the Australian public’s mental health signals, and track their change across the different phases of the pandemic in eight Australian capital cities.
S. Wang, et al. The times, they are a-changin’: tracking shifts in mental health signals from early phase to later phase of the COVID-19 pandemic in Australia. BMJ Global Health, 2022;7:e007081.

Tracking shifts in Mental Health Signals in COVID-19
Temporal change of the public’s (A) sentiment score and (B) emotion by type over four phases.
Tracking shifts in Mental Health Signals in COVID-19
Keywords potentially related to optimistic and pessimistic mental health signals over four phases.

Tracking shifts in Mental Health Signals in COVID-19
Kernel density estimates of optimistic and pessimistic mental health signals in eight capital cities.
Tracking shifts in Mental Health Signals in COVID-19
Change of emotion by type over four phases in eight capital cities

Tracking shifts in Mental Health Signals in COVID-19
 Data Collection
 244,406 geotweets were retrieved from a total of 860+ million tweets in Australia using AFT-API (Academic Full Track – API)
 Search terms: pandemic, epidemic, virus, covid*, coronavirus, corona, and vaccin*
 Time: 1/1/2020 – 31/05/2021
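The wildcard search terms above can be mirrored in a local filter. This is a minimal sketch, assuming that a trailing `*` means "any word continuation" and that matching is case-insensitive on whole words; how the API itself interprets wildcards is not stated on the slide. The names `term_to_regex` and `matches` are hypothetical.

```python
import re

# Search terms as listed on the slide.
SEARCH_TERMS = ["pandemic", "epidemic", "virus", "covid*", "coronavirus", "corona", "vaccin*"]


def term_to_regex(term):
    """Translate one search term into a regex fragment (assumed wildcard rule)."""
    if term.endswith("*"):
        return re.escape(term[:-1]) + r"\w*"  # prefix plus any word characters
    return re.escape(term)


PATTERN = re.compile(
    r"\b(?:" + "|".join(term_to_regex(t) for t in SEARCH_TERMS) + r")\b",
    re.IGNORECASE,
)


def matches(tweet):
    """True if the tweet contains at least one search term as a whole word."""
    return PATTERN.search(tweet) is not None


hits = [matches(t) for t in [
    "Got my vaccination today",
    "COVID19 cases rising in Melbourne",
    "Lovely weather in Perth",
]]
```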
Tracking shifts in Mental Health Signals in COVID-19
 Sentiment Analysis
 Valence Aware Dictionary and sEntiment Reasoner (VADER) model
 Combines lexical features with five simple heuristics, and takes a human-centric approach that pairs qualitative analysis with empirical validation by human raters
 Returns four scores: positive, negative, neutral, and compound
 https://github.com/cjhutto/vaderSentiment
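How a compound score can arise from a lexicon is sketched below. This is a simplified stand-in, not the vaderSentiment library: it uses a tiny hypothetical lexicon, omits all five heuristics, and keeps only the summing of word valences followed by the normalisation x / sqrt(x² + α) with α = 15. The ±0.05 cut-offs for positive and negative are the commonly used thresholds and are treated here as an assumption.

```python
import math

# Hypothetical toy lexicon (word -> valence); the real lexicon is far larger.
TOY_LEXICON = {
    "good": 1.9, "great": 3.1, "hope": 1.9,
    "bad": -2.5, "fear": -2.2, "sad": -2.1,
}


def compound_score(text, alpha=15.0):
    """Sum word valences and squash the total into the range (-1, 1)."""
    total = sum(
        TOY_LEXICON.get(token.strip(".,!?").lower(), 0.0)
        for token in text.split()
    )
    return total / math.sqrt(total * total + alpha)


def label(compound, threshold=0.05):
    """Map a compound score to a coarse sentiment class (assumed thresholds)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


examples = ["great vaccine news", "so much fear", "the virus"]
labels = [label(compound_score(t)) for t in examples]
```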

Tracking shifts in Mental Health Signals in COVID-19
 Emotion Analysis
 National Research Council Canada Emotion Lexicon (NRCLex)
 Mental health signals
 Pessimistic: fear, sadness, anger, disgust
 Optimistic: joy, anticipation, trust, surprise
 https://pypi.org/project/NRCLex/
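The grouping of emotions into the two mental health signals can be sketched as below. This is a minimal illustration with a hypothetical four-word emotion lexicon standing in for the NRC lexicon; it does not call the NRCLex package, and the function name `mental_health_signal` is an assumption.

```python
from collections import Counter

# Emotion groups as defined on the slide.
PESSIMISTIC = {"fear", "sadness", "anger", "disgust"}
OPTIMISTIC = {"joy", "anticipation", "trust", "surprise"}

# Hypothetical toy lexicon (word -> emotions); invented for illustration only.
TOY_EMOTION_LEXICON = {
    "lockdown": ["fear", "sadness"],
    "vaccine": ["trust", "anticipation"],
    "death": ["fear", "sadness"],
    "reopen": ["joy", "anticipation"],
}


def mental_health_signal(text):
    """Count emotion hits per word and aggregate them into the two signals."""
    counts = Counter()
    for token in text.lower().split():
        for emotion in TOY_EMOTION_LEXICON.get(token.strip(".,!?#"), []):
            counts[emotion] += 1
    return {
        "optimistic": sum(counts[e] for e in OPTIMISTIC),
        "pessimistic": sum(counts[e] for e in PESSIMISTIC),
        "emotions": dict(counts),
    }


signal = mental_health_signal("Vaccine rollout lets shops reopen!")
```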
What is social multimedia?
 A group of internet-based applications that build on the ideological and technological foundations of Web 2.0 and that allow the creation and exchange of user-generated content (text, image, video, location, time, link)
 Social Multimedia = Multimedia Generated for Social Interactions
. Kaplan, (2010). “Users of the world, unite! The challenges and opportunities of social media,” Business Horizons, 53(1): 59-68. [Courtesy of ]

Why multimedia (image/video)?
 Pictures and videos are more distinctive than plain texts, and can be comprehended more quickly
 Photography is the only “language” understood in all parts of the world
 “A picture is worth a thousand words”
 Chinese proverb: “Seeing is believing; one picture is worth a thousand words”
Social multimedia on the Web
 Explosion of image/video data
 digital photos, personal photo/video collections, geospatial imagery, broadcast news/sports videos, Wikipedia, social media, etc.
 Data mining can unlock the wealth of information in social multimedia
 Understanding of users can benefit social multimedia recommendation (context-aware, community-aware, personalized)
50 million photos uploaded per month (2012)
1 billion visits per month, 4 billion hrs watched per month, 72 hrs of video uploaded per min
Pinterest: 48 million users (Feb. 2013)

Social multimedia on the Web
 To predict other human geography metrics, such as GDP, wealth by region, unemployment rate, stock market sentiment
 To monitor health/disease, environment, ecology, social unrests, …
The Role of “Multimedia” for Social Computing
Leverage social context for better understanding multimedia data
Leverage social multimedia data for better understanding users

Mining Personal Patterns from Selfie Behaviors
Sentiment Analysis in Social Media
 Sentiment is arguably the most important signal from social media
 User connections
 User preference
 Most existing methods are based on textual information only
 Comments, reviews, textual tweets, and status updates
 Easy? Only a limited number of words per tweet
 Difficult? Lots of noise and little information
 Questions
 Do users express themselves only using text?
 Can the emerging multimedia content provide additional useful signals?

Textual Sentiment Analysis
 Dictionary-based approach
 Lexicon contains a large number of words with sentiment labels
 Emoticons, widely used in online social networks
 Simple and effective in most cases, but fails to capture rich contextual information
 Semantic analysis
 Using NLP-related techniques to build more robust features
 Difficult to develop a method that works for all languages
 Use sentiment140 for textual sentiment analysis
 Using emoticons as auxiliary information
 Open API available
Source: http://www.sentiment140.com
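Emoticons can act as noisy sentiment labels for otherwise unlabelled tweets. The sketch below is a minimal illustration of that idea, assuming plain substring matching and a short hypothetical emoticon list; it is not the sentiment140 API and does not reproduce its emoticon set.

```python
# Hypothetical, deliberately short emoticon lists.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(")


def weak_label(tweet):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    has_positive = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_negative = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_positive == has_negative:  # neither present, or both present
        return None
    return "positive" if has_positive else "negative"


weak_labels = [weak_label(t) for t in [
    "Finally home :)",
    "Missed the bus :(",
    "Mixed feelings :) :(",
    "No emoticon here",
]]
```

Tweets labelled this way can then train a text classifier, with the emoticons themselves removed from the training text so the model does not simply memorise them.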
Image Tweets
 Image tweets: tweets that contain images
 Different users may prefer different types of tweets
Observation: Users who prefer image tweets tend to have more positive tweets

Deep Learning for Image Sentiment Analysis
Weakly labelled Images
Users who post many image tweets are more likely to express positive sentiment.
Convolutional Neural Network for Image Sentiment Analysis
• Domain-transfer learning
• Boosted learning using noisy labels
Q. You, et al., Visual Sentiment Analysis by Attending on Local Image Regions, AAAI, 2017.
Q. You, et al., Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks, AAAI, 2015.

Deep Learning for Image Sentiment Analysis
 Half a million weakly labeled Flickr images from Visual Sentiment Ontology
 Two GTX Titan GPUs and 32 GB RAM
 Statistics of the data set
 Performance of CNN and PCNN on the testing images
Challenges for understanding users
 Heterogeneous data forms
 Profile: gender, residence, interest, age…
 Content: text, image, video, link, trajectory…
 Context: identity, location, time, status, emotion, weather, comments, views, friends, groups…
 Behaviors: gesture, commenting, sharing, distributing, …
 Missing attributes
 Large-scale social graph construction
 Real-time sharing/propagation

Heterogeneous data types in social multimedia
Understanding Users
 Part 1: Understand user profile
 Community discovery from heterogeneous data [Zhuang, MM’11]
 Multi-social-graph construction [Yao, WWW’13]
 Part 2: Understand user context
 Mobile visual localization [Liu, MM’12]
 Mobile recommendation [Zhuang, Ubicomp’11]
 Part 3: Understand user interaction
 O-search [Zhang, MM’11]
 SocialTransfer: cross-domain social media recommendation [Roy, MM’12]
 Interactive multimodal mobile visual search [Wang, MM’11]
 Browse-to-Search [Zhang, MMSP’11/2; Lu, MM’12]

Part 1: Understand user profile
 Social graph from multi-dimensional (m-d) relations
 Explore m-d user relationship
 mutual comments
 co-locations
 similar/related photos
 common interest groups
 Build a unified social graph
 Infer hidden relationship/interest as a new context

People may not have an “explicit” relationship, but they may…
• comment on each other's posts
• travel to the same place
• take or share pictures of the same landmark
• join the same interest group
• be in each other's contact list
This work tries to:
• mine “implicit” social relationships
• explore heterogeneous data and user behaviors
 Modeling continuous social relationships: learning to model (discriminative & model-free)
 Exploration of heterogeneous data (image | tag | location | friend list | interest group | commenting): multiple kernel learning
 Combination of multimodal features: kernel alignment for integration

 Assumptions
 #1: Social strength modeling = prediction of pairwise relationship
 Linear regression model over a pairwise feature mapping of two users
 #2: Prediction complies with existing (explicit) relationships
 Learning-based formulation: loss (prediction error) + regularization (model complexity), with an explicit solution
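One standard formulation consistent with the two assumptions is ridge regression over pairwise features. The sketch below is an assumption, not the slide's own notation: \phi(u_i, u_j) is the pairwise feature mapping, y_{ij} the explicit relationship label, \mathcal{E} the set of labelled user pairs, \lambda the regularization weight, and \Phi the matrix whose rows are the vectors \phi(u_i, u_j)^{\top}.

```latex
% Assumed notation; a ridge-regression instance of "loss + regularization".
f(u_i, u_j) = \mathbf{w}^{\top} \phi(u_i, u_j)

\min_{\mathbf{w}} \;
  \underbrace{\sum_{(i,j) \in \mathcal{E}}
      \bigl( y_{ij} - \mathbf{w}^{\top} \phi(u_i, u_j) \bigr)^{2}}_{\text{loss (prediction error)}}
  \; + \;
  \underbrace{\lambda \, \lVert \mathbf{w} \rVert_2^{2}}_{\text{regularization (model complexity)}}

\mathbf{w}^{*} = \bigl( \Phi^{\top} \Phi + \lambda I \bigr)^{-1} \Phi^{\top} \mathbf{y}
```

Setting the gradient of the objective to zero gives the closed-form solution in the last line, which is why a squared loss with an L2 penalty is a convenient choice here.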
Heterogeneous user data
Multimodal graph representation
Kernelized multi-graph over user space
Multi-kernel alignment (𝜽∗) to weight kernels
Learning to model relation: loss + regularization
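The kernel-weighting step in this pipeline can be illustrated with kernel alignment, the normalised Frobenius inner product between two kernel matrices. The sketch below is a simplification under stated assumptions: each kernel's weight is set proportional to its alignment with a target matrix built from explicit relationships, whereas the slide's 𝜽∗ presumably comes from an optimisation that is not shown. All matrices and function names are invented for illustration.

```python
import math


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(k1, k2):
    """Kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )


def alignment_weights(kernels, target):
    """Assumed heuristic: weight each kernel by its non-negative alignment."""
    scores = [max(alignment(k, target), 0.0) for k in kernels]
    total = sum(scores)
    if total == 0.0:  # fall back to uniform fusion
        return [1.0 / len(kernels)] * len(kernels)
    return [s / total for s in scores]


def combine(kernels, weights):
    """Weighted sum of kernel matrices over the same set of users."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Toy data for three users: users 0 and 1 have an explicit relationship.
target = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
k_comment = [[1, 0.9, 0.1], [0.9, 1, 0.0], [0.1, 0.0, 1]]
k_location = [[1, 0.2, 0.8], [0.2, 1, 0.5], [0.8, 0.5, 1]]

theta = alignment_weights([k_comment, k_location], target)
fused = combine([k_comment, k_location], theta)
```

In this toy case the mutual-comment kernel agrees more closely with the explicit relationships, so it receives the larger weight in the fused kernel.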

Experiment
 Flickr as the testbed (training : testing = 4 : 1)
 Settings
 Single kernel vs. uniform combination
 Multiple kernel learning
 Evaluations
 Friend/group recommendation
 Gender/location prediction
Application: social relation/attribute modeling
App #1: interest group recommendation
App #2: social attributes prediction
• Friend recommendation (70%)
• Gender estimation (72% prec.)
• Geo-location estimation (43% prec.)
[Figure: Precision@1 by social signal: co-locations, similar photos, mutual comment, uniform fusion, multi kernel learning]

Part 2: Understand user context

Robust and accurate mobile visual localization [Liu, MM’12; Xu, ICIMCS’12]
 Comprehensive and accurate sense of geo-context via mobile devices
 real location of user and scene (error < 15 m)
 view direction (error < 9°)
 distance of the target (error < 21 m), ...
 Merits on both scalability and accuracy
User Location: (37.80, -112.41)
View Direction: 39.87°
Distance to Scene: 44.05 m

Mobile localization: GPS & wireless signal
GPS sensor [Schroth, IEEE SPM’11]
• Track recorded in San Francisco, using iPhone 4 in dense areas
• Error range: 50~100 m
Angle of arrival (AOA) [Wang, ICC’08]
• Triangulation relationship
• Error range: 100~200 m
Enhanced Observed Time Difference (E-OTD) [Wang, ICC’08]
• Estimation of time difference of arrival of a signal
• Error range: 50~200 m

Mobile localization: vision-based approach
 IM2GPS [Hays, CVPR’08]
 Find nearest neighbors to estimate location distribution
 400 randomly selected queries from 6.5M Flickr photos
 Locate 25% of images within 750 km
 Far from accurate: global feature is not distinctive
 Local feature for landmark identification [Chen, CVPR’11; Park, MM’11; Hao, CVPR’12]
 SIFT vocabulary search, GPS filtering, 3D visual phrase
 Search similar images: accuracy 80% and recall 75% in SF (1.06M SV)
 View direction error: 11 degrees in NYC
 Only return nearest images, failure rate ~50% [Yu, MM’11]
 Matching repeated patterns [Schindler, CVPR’08]
 2D-to-3D pattern matching to geo-located planar façade models
 Localization error: ~6 m on 9 façades from 7 buildings
 Difficult to scale up (relies on an existing 3D façade database)

Mobile visual localization
 State-of-the-art
 GPS sensor: 50~100 m (dense areas)
 Wireless signal: 50~200 m (dense areas)
 Vision-based: incomplete result, not scalable
 This work aims at
 Comprehensive location parameters (dense areas): location of user & scene | view direction | distance
 Tradeoff between scalability and accuracy (e.g., location error < 15 m, direction error < 15°)
1. location? 2. direction? 3. distance?
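The three location parameters (user location, view direction, distance) together determine where the scene is. The sketch below shows that relationship under stated assumptions: the view direction is taken to be a compass bearing measured clockwise from north, the Earth is treated as a sphere of radius 6,371 km, and a flat-earth approximation is used, which is adequate at distances of tens of metres. The function names are hypothetical; the input values are the example shown on the slide.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def scene_location(lat, lon, bearing_deg, distance_m):
    """Offset a position by a short distance along a compass bearing."""
    bearing = math.radians(bearing_deg)
    dlat = distance_m * math.cos(bearing) / EARTH_RADIUS_M
    dlon = distance_m * math.sin(bearing) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(dlat), lon + math.degrees(dlon)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, usable as a localization error metric."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Example values from the slide: user location, view direction, distance to scene.
user_lat, user_lon = 37.80, -112.41
scene_lat, scene_lon = scene_location(user_lat, user_lon, 39.87, 44.05)
round_trip_m = haversine_m(user_lat, user_lon, scene_lat, scene_lon)
```

The same `haversine_m` function could compare an estimated position with ground truth when checking a target such as a location error below 15 m.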