電機資訊學院: 電信工程學研究所指導教授: 陳銘憲帥宏翰Shuai, Hong-HanHong-HanShuai2017-03-062018-07-052017-03-062018-07-052015http://ntur.lib.ntu.edu.tw//handle/246246/276032隨著行動裝置與Web 2.0蓬勃發展與遍佈全球,社群網路如今變得十分熱門。像是Facebook擁有12.6億使用者、Twitter擁有5.55億使用者,社群網路已轉變成一個全世界規模之巨型資料庫。另一方面,隨著社群網路之崛起,社群網路相關研究也隨之大量浮現、日益重要。是故,在本篇博士論文中,我們研究社群網路上三個重要且根本之問題。而本篇論文於社群網路分析上面臨了三大挑戰。第一,不同於交易紀錄,社群網路上點靠著許多邊連結成不同結構,這結構使得許多問題變成NP-Hard(如本論文中第三個問題即為NP-Hard)。第二,我們同時研究社群網路上不同特性,這些特性之間會產生交互作用,必須要仔細處理。第三,社群網路計算量往往遠大於交易紀錄,因此需要仔細設計演算法與資料結構。 在本論文中,我們首先探討如何對多個社群網路做聯合採樣,並提出一個稱作品質保證多網路聯合採樣演算法,品質保證多網路聯合採樣能夠有效採樣多網路並提供採樣網路與真實網路差異之統計保證。然而,由於採樣社群網路資料變得越來越困難、現有真實資料節點數大多是百萬等級、且基於統計之圖產生器只被設計保有圖之統計性質(像是原圖之度分布、圖半徑與聚合係數)而缺乏保存頻繁標型特徵,因此我們設計一個頻繁標行保存圖產生器同時保存頻繁標型、度分布與聚合係數,且能快速產生十億規模等級之社群網路。除此之外我們也探討社交活動意願最佳化之問題,我們設計一套隨機演算法能夠快速且有效解決此問題。演算法透過理論分析能夠最佳化計算資源且找到保證近似比之解。在臉書上實作成果顯示,所設計之演算法在品質與時間上顯著贏過使用者所選取結果。By the popularization of the mobile phones and the developing of the Web 2.0, the online social network websites become very popular nowadays. For example, the numbers of active users on Facebook and Twitter are 1.26 billion and 550 million, respectively. On the other hand, with the emergence of Online Social Networks (OSNs), they have motivated a great deal of research on social network analysis. Therefore, in this dissertation, we study different important and fundamental problems in social networks. The challenges faced for social network analysis are threefold. First, unlike dealing with transactions, in social networks, nodes are connected with edges and form a graph, which complicates the analysis. Therefore, many problems related to graph are NP-Hard problems (as the third problem studied in this dissertation is). Second, our study considers the graph properties in social networks, in which we need to carefully address the interplay between social network properties. Third, the computation needed is much greater than the transaction case, which demands carefully designed data structures and algorithms. In this study, we first explore the joint sampling of multiple OSNs and propose an approach called Quality-guaranteed Multi-network Sampler (QMSampler) that can crawl and jointly sample multiple OSNs. QMSampler provides a statistical guarantee on the difference between the crawled real dataset and the ground truth (the perfect integration of all OSNs). Afterward, since nowadays most available real datasets only support millions of nodes and current popular statistical graph generators are properly designed to preserve only the statistical metrics, such as the degree distribution, diameter, and clustering coefficient of the original social graphs without considering the importance of frequent graph patterns, we make the first attempt to design a Pattern Preserving Graph Generation (PPGG) algorithm to generate a graph including all frequent patterns and three most popular statistical parameters: degree distribution, clustering coefficient, and average vertex degree. In addition, given the available social network datasets, we also explore the group formations with willingness optimization for social group activity. We design a new randomized algorithm to effectively and efficiently solve the problem. Given the available computational budgets, the proposed algorithm is able to optimally allocate the resources and find a solution with an approximation ratio. We implement the proposed algorithm in Facebook, and the user study demonstrates that social groups obtained by the proposed algorithm significantly outperform the solutions manually configured by users.1952306 bytesapplication/pdf論文公開時間: 2015/7/24論文使用權限: 同意有償授權(權利金給回饋學校)演算法設計與分析詢問處理社群網路近似演算法資?產生Algorithm Design and AnalysisQuery ProcessingSocial NetworksApproximation AlgorithmData Generation社群網路取樣、產生與應用Sampling, Generation, and Application of Social Networksthesishttp://ntur.lib.ntu.edu.tw/bitstream/246246/276032/1/ntu-104-D99942020-1.pdf