https://scholars.lib.ntu.edu.tw/handle/123456789/62585
DC Field | Value | Language |
---|---|---|
dc.contributor | 黃漢邦 | en |
dc.contributor | 臺灣大學:機械工程學研究所 | zh_TW |
dc.contributor.author | 簡宏景 | zh |
dc.contributor.author | Jian, Hung-Jing | en |
dc.creator | 簡宏景 | zh |
dc.creator | Jian, Hung-Jing | en |
dc.date | 2007 | en |
dc.date.accessioned | 2007-11-28T07:59:19Z | - |
dc.date.accessioned | 2018-06-28T17:08:00Z | - |
dc.date.available | 2007-11-28T07:59:19Z | - |
dc.date.available | 2018-06-28T17:08:00Z | - |
dc.date.issued | 2007 | - |
dc.identifier.uri | http://ntur.lib.ntu.edu.tw//handle/246246/61350 | - |
dc.description.abstract | 本論文主要目的為研究多人人臉追蹤方法以及人形機器人與使用者之間的共同注意力發展。我們發展出一套新型的多人人臉追蹤的方法(Modified Multi-CAMSHIFT, MMC)來實現多物件追蹤,其利用結合色彩跟形狀兩種主要資訊,可以更有效的來找出並追蹤影像中所有人臉的位置。色彩資訊是利用我們所發展的Modified Multi-CAMSHIFT理論計算而得;形狀資訊是使用Scharr kernel mask求得。再分別計算出兩者的色彩和方向分佈直方圖,代入特徵選擇機制(Adaptive Feature Selection)裡面做最佳化追蹤判斷。為了分辨出人臉區域跟非人臉區域,我們加入雙眼快速取出機制(Eyes-pair Fast Extracting)。我們提出的多人人臉追蹤的方法,都是在適應性多重解析度(Adaptive Multi-Resolution)下進行運算,可以減少影像處理運算量。實驗結果顯示,加入上述種種機制,我們提出的多人人臉追蹤方法(Modified Multi-CAMSHIFT )是一個效果很好的追蹤方法。 找出人臉後,再進一步來判斷出每個人臉的方向,研究其與機器人之間的互動情形,亦即共同注意力(Joint Attention)。我們使用靜態及動態兩種資訊,來判斷人臉的方向。動態資訊是利用光流(Optical Flow)來觀察計算當使用者的注視方向從看機器人轉移到看另一個目標物時的運動資訊。而靜態資訊為當人臉注視某目標物時,所計算出的人臉邊界影像資訊。靜態和動態資訊有互補的特性,前者雖然演算法很複雜但是可以給予精確的注視方向。另一方面,後者提供粗略資訊但是可以很容易來理解注視方向上的轉移和馬達跟隨著使用者視線轉移輸出之間的關係。學習模式是利用支撐向量機(SVM),從觀察使用者的視線移動獲得的靜態和動態兩種資訊,使得機器人能夠有效地獲得共同注意力能力和與人自然的互動。動態資訊搭配靜態資訊可以加速共同注意力的獲悉而提升整體的性能。 我們將上述的方法以及理論,成功的實現多人人臉追蹤與共同注意力發展。 | zh_TW |
dc.description.abstract | This thesis aims to develop a system for multiple-object tracking and for joint attention between people and a robot. We propose a new method, the Modified Multi-CAMSHIFT (MMC), based on the characteristics of the color and shape probability distributions, to solve the tracking problem for multiple objects. The color cue is computed by MMC, which improves on the CAMSHIFT algorithm, and the shape cue is computed with the Scharr kernel mask. We then compute a color histogram and an orientation histogram, respectively, and feed them into an Adaptive Feature Selection mechanism for optimal tracking. To distinguish face regions from non-face regions, we include an Eyes-pair Fast Extracting mechanism. The proposed MMC runs within an adaptive multi-resolution (AMR) framework to reduce computation. The experimental results show that, with all of the mechanisms mentioned above, the proposed MMC is an effective tracking method. After locating the human faces, we determine the direction of each face and study the resulting human-robot interaction, known as joint attention. We establish joint attention with a human by utilizing both static and dynamic information. As the static information, we extract the edge image of the human face while he/she is gazing at the object. As the dynamic information, the robot uses the optical flow detected while observing a human shifting his/her gaze from the robot to another object. The static and dynamic information have complementary characteristics: the static information gives the exact direction of gaze, even though it is difficult to interpret, whereas the dynamic information provides only a rough direction but makes the relationship between the gaze shift and the motor output that follows it easy to understand. We use a Support Vector Machine (SVM) as the learning model. Utilizing both static and dynamic information acquired by observing a human's gaze shift enables the robot to acquire the joint attention ability efficiently and to interact naturally with the human. The dynamic information accelerates the learning of joint attention, while the static information improves the task performance. The experimental results show that the proposed Modified Multi-CAMSHIFT was successfully applied to multi-face tracking and to the development of joint attention. | en |
dc.description.tableofcontents | 摘 要 I Abstract III Contents V List of Tables VIII List of Figures IX Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Related Works 2 1.2.1 Object Tracking 3 1.2.2 Human-Robot Interaction (HRI) 4 1.3 Objectives and Contributions 6 1.4 Thesis Organization 9 Chapter 2 Background Knowledge 10 2.1 Color Space Used for Skin Modeling 10 2.2 The CAMSHIFT Algorithm 12 2.2.1 Introduction to the CAMSHIFT Algorithm 12 2.2.2 Mass Center Calculation 13 2.2.3 Probability Distribution 14 2.3 Joint Attention 15 2.4 Edge Detector 18 2.5 Optical Flow 21 2.5.1 Optical Flow Computation 22 2.6 Support Vector Machine (SVM) 24 2.6.1 Structural Risk Minimization 24 2.6.2 Introduction to SVMs 25 Chapter 3 Faces Tracking for Multiple People 30 3.1 Skin Color Model 30 3.1.1 Skin Color Probability Modeling 30 3.1.2 Adaptive Skin Color Probability Model Update 33 3.2 Modified CAMSHIFT 34 3.2.1 Interested probability Enhancement 36 3.2.2 Initial Block Searching in Small Resolution 38 3.2.3 Search Window of CAMSHIFT 39 3.2.4 Center Tendency 40 3.3 Multi-CAMSHIFT Algorithm 41 3.3.1 Sort Indexes of MCAMSHIFT 44 3.4 Adaptive Multi-Resolution (AMR) 45 3.5 Modified Multi-CAMSHIFT Algorithm 50 3.5.1 Adaptive Feature Extraction 54 3.5.2 Eyes-pair Fast Extracting 59 Chapter 4 Development of Joint Attention 62 4.1 Model of Joint Attention 62 4.2 Face Image Orientation Detector 64 4.2.1 Edge Detector for Static Information 65 4.2.2 Optical Flow for Dynamic Information 67 4.3 Learning Module with SVM 69 4.4 Joint Attention with Modify Multi-CAMSHIFT 70 4.4.1 PID Control Theorem in Pan-Tilt System 73 Chapter 5 Experiment Results 74 5.1 System Overview 74 5.2 Performance of Modified MCAMSHIFT Tracking 76 5.3 Performance of Joint Attention with Modified Multi-CAMSHIFT Tracking Experiments 82 5.3.1 Static Orientation Detector Experiments 83 5.3.2 Dynamic Orientation Detector Experiments 84 5.3.3 Tracking Object of Joint Attention with PTU System Experiments 86 5.3.4 Graphical user interfaces of System 88 Chapter 6 Conclusions 91 6.1 Conclusions 91 6.2 Future Works 92 References 94 | en |
dc.language | en-US | en |
dc.language.iso | en_US | - |
dc.subject | 人臉 | en |
dc.subject | 追蹤 | en |
dc.subject | 連續適應性中心移動演算法 | en |
dc.subject | 邊緣偵測 | en |
dc.subject | 光流 | en |
dc.subject | 共同注意力 | en |
dc.subject | 支持向量機 | en |
dc.subject | Face | en |
dc.subject | Tracking | en |
dc.subject | CAMSHIFT | en |
dc.subject | Edge Detection | en |
dc.subject | Optical Flow | en |
dc.subject | Joint Attention | en |
dc.subject | SVM | en |
dc.title | 新型多人人臉追蹤方法之共同注意力發展 | zh |
dc.title | Development of the Joint Attention with a New Face Tracking Method for Multiple People | en |
dc.type | thesis | en |
item.languageiso639-1 | en_US | - |
item.fulltext | no fulltext | - |
item.grantfulltext | none | - |
item.openairetype | thesis | - |
item.openairecristype | http://purl.org/coar/resource_type/c_46ec | - |
item.cerifentitytype | Publications | - |
Appears in Collections: | Department of Mechanical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless otherwise indicated.
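
The abstract above describes a tracker that combines a color cue (a CAMSHIFT-style hue back-projection) with a shape cue (a Scharr gradient-orientation histogram). The sketch below is a minimal, hypothetical illustration of just those two cues, assuming Python and OpenCV: it uses OpenCV's stock `cv2.CamShift` rather than the thesis's Modified Multi-CAMSHIFT, the function names, bin counts, and thresholds are arbitrary choices, and the Adaptive Feature Selection, Eyes-pair Fast Extracting, and adaptive multi-resolution steps are not reproduced here.

```python
import cv2
import numpy as np

def make_hue_histogram(frame_bgr, face_box):
    """Color cue: a normalized hue histogram of one face region."""
    x, y, w, h = face_box
    hsv = cv2.cvtColor(frame_bgr[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
    # Ignore very dark / unsaturated pixels, which carry little skin-color information.
    mask = cv2.inRange(hsv, (0, 30, 30), (180, 255, 255))
    hist = cv2.calcHist([hsv], [0], mask, [16], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist

def orientation_histogram(gray, box, bins=18):
    """Shape cue: a gradient-orientation histogram built from Scharr derivatives."""
    x, y, w, h = box
    roi = gray[y:y+h, x:x+w].astype(np.float32)
    gx = cv2.Scharr(roi, cv2.CV_32F, 1, 0)
    gy = cv2.Scharr(roi, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                    # orientation in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

def track_faces(frames, initial_boxes):
    """Track each initial face box with stock CamShift on a hue back-projection."""
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    windows = list(initial_boxes)
    hists = None
    for frame in frames:
        if hists is None:
            # Build one color model per face from the first frame.
            hists = [make_hue_histogram(frame, b) for b in windows]
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for i, hist in enumerate(hists):
            backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
            _rot_rect, windows[i] = cv2.CamShift(backproj, windows[i], term)
            # The shape cue would be compared against the color cue by the
            # feature-selection step described in the abstract.
            _shape_cue = orientation_histogram(gray, windows[i])
        yield list(windows)
```

In use, `track_faces(video_frames, [(x, y, w, h), ...])` yields the updated face windows per frame; the thesis's method would additionally switch between the color and orientation cues and verify each window with the eyes-pair check.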
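For the joint-attention part, the abstract pairs a static cue (an edge image of the face while it gazes at an object) with a dynamic cue (optical flow while the gaze shifts away from the robot) and learns the mapping to a gaze direction with an SVM. The following sketch shows one plausible way to wire those pieces together; it is not the thesis's implementation. Canny edges, Farneback optical flow, and scikit-learn's `SVC` are stand-ins chosen for brevity, and the feature sizes, labels, and SVM parameters are assumptions.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def static_feature(face_gray, size=(32, 32)):
    """Static cue: a downsampled edge image of the face while it gazes at the target."""
    edges = cv2.Canny(face_gray, 50, 150)
    return cv2.resize(edges, size).flatten() / 255.0

def dynamic_feature(prev_gray, next_gray, size=(16, 16)):
    """Dynamic cue: dense optical flow observed during the gaze shift."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    fx = cv2.resize(flow[..., 0], size).flatten()
    fy = cv2.resize(flow[..., 1], size).flatten()
    return np.concatenate([fx, fy])

def train_gaze_classifier(static_feats, dynamic_feats, directions):
    """Learn the mapping from (static, dynamic) cues to a discrete gaze direction."""
    X = np.hstack([np.vstack(static_feats), np.vstack(dynamic_feats)])
    y = np.asarray(directions)          # e.g. 0 = left object, 1 = right object, ...
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    return clf.fit(X, y)

def predict_gaze(clf, static_feat, dynamic_feat):
    x = np.concatenate([static_feat, dynamic_feat]).reshape(1, -1)
    return int(clf.predict(x)[0])
```

The predicted direction label would then drive the pan-tilt unit so the robot's camera turns toward the object the person is looking at, which is the joint-attention behavior the abstract and the table of contents (Chapter 4) describe.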