林智仁Lin, Chih-Jen臺灣大學:資訊工程學研究所王湘叡Wang, Hsiang-JuiHsiang-JuiWang2010-05-182018-07-052010-05-182018-07-052008U0001-0307200818555700http://ntur.lib.ntu.edu.tw//handle/246246/183620近年來,很多領域興起將序列的資料標上標籤。條件隨機場則是一種常用來解此類問題的方法,但其封閉形式的海森矩陣並不易導出。這困難致使一些使用二次微分資訊的最佳化方法不適用,如牛頓法。自動微分則是一種技巧,可以用來計算一個函數的導數值而無梯度函數。並且,藉由自動微分來計算海森矩陣與向量之乘積只需梯度函數而無需海森矩陣。本篇論文先說明自動微分的背景知識。然後結合截斷牛頓法及自動微分,並用之於解決條件隨機場。In recent years, labeling sequential data arises in many fields. Conditional random fields are a popular model for solving this type of problems. Its Hessian matrix in a closed form is not easy to derive. This difficulty causes that optimization methods using second-order information like the Hessian-vector products may not be suitable. Automatic differentiation is a technique to evaluate derivatives of a function without its gradient function. Moreover, computing Hessian-vector products by automatic differentiation only requires the gradient function but not the Hessian matrix. This thesis first gives a study on the background knowledge of automatic differentiation. Then it merges truncated Newton methods with automatic differentiation for solving conditional random fields.口試委員會審定書 . . . . . . . . . . . . . . . . . . . . . . . . . . . . i要 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiBSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . viIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiHAPTER I. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1 II. Automatic differentiation . . . . . . . . . . . . . . . . . . . 4 2.1 The Fundamental of the Automatic Differentiation . . . . . 6 2.2 Implementation of Differentiation . . . . . . . . . . . . 9 2.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 Evaluating Hessian-vector Products . . . . . . . . . . . . 11 III. Implementation of TRON with Automatic Differentiation . . . . . 15 3.1 Implementation Details . . . . . . . . . . . . . . . . . . 18 3.2 Real-World Data Sets . . . . . . . . . . . . . . . . . . . 20 3.2.1 a9a . . . . . . . . . . . . . . . . . . . . . . . 20 3.2.2 RCV1-V2 . . . . . . . . . . . . . . . . . . . . . 20 3.2.3 news20 . . . . . . . . . . . . . . . . . . . . . . 21 3.2.4 real-sim . . . . . . . . . . . . . . . . . . . . . 21 3.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . 21 IV. Conditional Random Fields . . . . . . . . . . . . . . . . . . . . 23 4.1 Named-Entity Recognition . . . . . . . . . . . . . . . . . 23 4.2 Hidden Markov Models . . . . . . . . . . . . . . . . . . . 25 4.3 Conditional Random Fields . . . . . . . . . . . . . . . . 29 4.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . 31 4.4.1 Data Set . . . . . . . . . . . . . . . . . . . . . 31 4.4.2 Features . . . . . . . . . . . . . . . . . . . . . 32 4.4.3 Evaluation . . . . . . . . . . . . . . . . . . . . 32 4.4.4 Experiment Settings . . . . . . . . . . . . . . . 33 4.4.5 Results . . . . . . . . . . . . . . . . . . . . . 34 V. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 38IBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40application/pdf967065 bytesapplication/pdfen-US自動微分共軛梯度法截斷牛頓法最大熵值法條件隨機場automatic differentiationconjugate gradient methodstruncated New- ton methodsmaximum entropyconditional random fields應用自動微分及截斷牛頓法於條件隨機場Applying Automatic Differentiation and Truncated Newton Methods to Conditional Random Fieldsthesishttp://ntur.lib.ntu.edu.tw/bitstream/246246/183620/1/ntu-97-R95922073-1.pdf