許永真 (Hsu, Yung-Jen)
National Taiwan University, Graduate Institute of Computer Science and Information Engineering (臺灣大學:資訊工程學研究所)
連家峻 (Lian, Chia-Chun)
2010-05-18
2018-07-05
2008
U0001-2608200803463300
http://ntur.lib.ntu.edu.tw//handle/246246/183599

Recognizing chatting activities in social settings is an important building block for constructing a human social network, and among the various types of social interaction, chatting with others is an especially clear indicator. The main challenge of chatting activity recognition in public places is that multiple people are involved in multiple activities at once: several conversations may take place concurrently, which seriously confuses the recognition of multiple chatting activities.

To model the conversational dynamics of co-existing chatting behaviors, I propose using Factorial Conditional Random Fields (FCRFs) to capture the co-temporal relationships among multi-activity states while reducing model complexity. In addition, to avoid the inefficient Loopy Belief Propagation (LBP) algorithm, I propose using the Iterative Classification Algorithm (ICA) as the inference method for FCRFs.

I designed several experiments to compare the FCRF model with other dynamic probabilistic models, including Parallel Conditional Random Fields (PCRFs) and several models based on Hidden Markov Models (HMMs), in learning and decoding on auditory data.

When multiple concurrent chatting activities are taken into account, the experimental results show that the FCRF model outperforms the PCRF model and the other HMM-like models. The results also show that the FCRF model with ICA inference not only improves recognition accuracy but also takes significantly less time for learning and decoding than the same model with LBP inference.

Acknowledgments
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.1.1 Human Social Network Building
1.1.2 Automatic Conversation Detection
1.1.3 Common Conversational Style
1.2 Challenges
1.2.1 Multi-Tasking Problems of Chatting Activity Recognition
1.2.2 Insufficiency of Dynamic Bayesian Networks
1.2.3 Inefficiency of Probabilistic Inference Method
1.3 Problem Definition
1.3.1 Assumption
1.3.2 Input
1.3.3 Output
1.4 Proposed Solution
1.5 Thesis Organization
Chapter 2 Background
2.1 Related Work
2.1.1 Activity Recognition Using Ubiquitous Sensors
2.1.2 Activity Recognition Using Wearable Digital Sensors
2.1.3 Conversation Detection Using Bottom-Up Approach
2.1.4 Conversation Modeling Using Top-Down Approach
2.1.5 Multiple Sequences Labeling
2.2 Related Technology
2.2.1 Dynamic Probabilistic Models
2.2.2 Probabilistic Inference Algorithms
Chapter 3 Model Structures
3.1 Notation Definition
3.2 Parallel Conditional Random Fields
3.2.1 Model Design
3.2.2 Learning Process
3.2.3 Inference Process
3.2.4 Decoding Process
3.3 Factorial Conditional Random Fields
3.3.1 Model Design
3.3.2 Learning Process
3.3.3 Inference Process
3.3.4 Decoding Process
Chapter 4 Auditory Feature Extraction
4.1 Volume and Mutual Information
4.2 Human Voice Detection
4.3 Summary of Auditory Feature Values
Chapter 5 Experimental Design and Result
5.1 Experimental Design
5.1.1 Scenario and Data Collection
5.1.2 Audio Recorder
5.1.3 Annotation
5.1.4 Model Training
5.1.5 Performance Evaluation
5.2 Experimental Results
5.2.1 Meeting Activity
5.2.2 Public Occasion
Chapter 6 Conclusions and Future Work
6.1 Summary of Contributions
6.2 Future Work
Bibliography

application/pdf
12398222 bytes
en-US
Chatting Activity Recognition; Dynamic Probabilistic Models; Factorial Conditional Random Fields; Loopy Belief Propagation Algorithm; Iterative Classification Algorithm
使用階乘式條件隨機場與反覆分類法進行交談行為辨識
Chatting Activity Recognition Using Factorial Conditional Random Fields with Iterative Classification
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/183599/1/ntu-97-R95922047-1.pdf