Discovering the city by mining diverse and multimodal data streams
Journal
2014 ACM Multimedia Conference
Pages
201-204
ISBN
9781450330633
Date Issued
2014
Author(s)
Kuo Y.-H.
Chen Y.-Y.
Chen B.-C.
Lee W.-Y.
Wu C.-C.
Lin C.-H.
Hou Y.-L.
Cheng W.-F.
Tsai Y.-C.
Hung C.-Y.
Hsieh L.-C.
Abstract
This work attempts to tackle the IBM grand challenge of seeing the daily life of New York City (NYC) from various perspectives by exploring rich and diverse social media content. Most existing works address this problem by relying on a single media source and cover only limited aspects of life. Because different social media are usually chosen for specific purposes, mining and integrating multiple social media is essential to understanding a city comprehensively. In this work, we first discover the similar and unique characteristics (e.g., attractions, topics) across social media in terms of visual and semantic perception. For example, Instagram users share more food and travel photos, while Twitter users talk more about sports and news. Based on these characteristics, we analyze a broad spectrum of life aspects in NYC (trends, events, food, clothing, and transportation) by mining a large amount of diverse and freely available media (e.g., 1.6M Instagram photos, 5.3M Twitter posts). Because transportation logs are rarely available in social media, NYC Open Data (e.g., 6.5B subway station transactions) is leveraged to visualize temporal traffic patterns. Furthermore, the experiments demonstrate that our approaches can effectively give an overview of urban life with considerable technical improvements, e.g., a 16% relative gain in food recognition accuracy from a hierarchical cross-media learning strategy, and a tenfold reduction in the feature dimensions for sentiment analysis without sacrificing precision.
Subjects
Cross-media mining; Multiple media sources; Visualization
Other Subjects
Data streams; Flow visualization; Open Data; Semantics; Sentiment analysis; Subway stations; Cross-media; Feature dimensions; Learning strategy; Multimodal data streams; Recognition accuracy; Social media minings; Technical improvement; Traffic pattern; Social networking (online)
Type
conference paper