The Hierarchical Map Forming Model (階層的地圖形成控制算法)

Contributors: 劉長遠; National Taiwan University, Graduate Institute of Computer Science and Information Engineering (臺灣大學：資訊工程學研究所)
Author: Luis Eduardo Rodriguez Soto (陸羿)
Record dates: 2007-11-26; 2018-07-05
Year: 2006
URI: http://ntur.lib.ntu.edu.tw//handle/246246/53752

Abstract

In this thesis we propose a motor control model inspired by organizational principles of the cerebral cortex. Specifically, the model is based on cortical maps and on the functional hierarchy in the sensory and motor areas of the brain. We introduce observed properties of area F5 in the macaque monkey brain, an area that combines sensory and motor information and produces actions without high-level processing of that information. The properties observed there can be summarized as modularity and hierarchical processing, and they form the basis of the proposed model.

We combine well-known computational tools into a biologically inspired model for action learning and motor control. Self-Organizing Maps (SOMs) have proven useful in modeling cortical topological maps. A hierarchical SOM provides a natural way to extract hierarchical information from the environment, which we propose may in turn be used to select actions hierarchically. We use a neighborhood update version of the Q-learning algorithm, so that the final model maps a continuous input space to a continuous action space in a hierarchical, topology-preserving manner. The model is called the Hierarchical Map Forming (HMF) model because it forms maps hierarchically in both the input and output spaces.

Table of Contents

1 Neurophysiology of Visuo-Motor Area F5
  1.1 Introduction
  1.2 Neurophysiology of the Mirror Neuron System
    1.2.1 Visuomotor Neurons
    1.2.2 Canonical Neurons
    1.2.3 Mirror Neurons
    1.2.4 On Action Understanding
  1.3 In Review
  1.4 Brain Models
    1.4.1 The FARS model
    1.4.2 The MOSAIC model
2 Theoretical Background
  2.1 Self-Organization and the Self-Organizing Map
    2.1.1 Motivation Behind Kohonen’s Self-Organizing Map
    2.1.2 Conditions for Self-Organization
    2.1.3 The Basic SOM
    2.1.4 Self-Organizing Map and Brain Models
    2.1.5 Self-Organizing Trees
    2.1.6 Case Analysis 1
    2.1.7 Case Analysis 2
  2.2 Reinforcement Learning
    2.2.1 The Reinforcement Learning Problem
    2.2.2 Policy
    2.2.3 Value
    2.2.4 Temporal Difference Learning
    2.2.5 TD(0) Method
    2.2.6 Q-Learning
    2.2.7 Applications of SOM to Reinforcement Learning Problems
3 The Hierarchical Map Forming Model
  3.1 Combining TS-SOM and Q-learning
    3.1.1 Neighborhood Q-learning
  3.2 Training Methodologies
    3.2.1 Sequential Training
    3.2.2 Concurrent Training
  3.3 Software Implementation
    3.3.1 M-language version
    3.3.2 Simulink version
    3.3.3 HMF Toolbox
  3.4 2D Trajectory Mapping Simulation
  3.5 3D Trajectory Mapping Simulation
    3.5.1 The 3D arm mapping problem
4 Conclusions
  4.1 Future Work
5 Bibliography

File size: 598110 bytes
Format: application/pdf
Language: en-US
Keywords: Hierarchical Control; Neural Networks; Reinforcement Learning
Title: The Hierarchical Map Forming Model
Type: thesis
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/53752/1/ntu-95-R93922144-1.pdf
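
The abstract mentions a neighborhood update version of Q-learning but does not state the update rule. Purely as an illustration, the sketch below shows one common way such an update can be written. Everything in it is an assumption rather than the thesis's method: the state and action maps are 1-D, the neighborhood is Gaussian, and the names `neighborhood`, `neighborhood_q_update`, `alpha`, `gamma`, and `sigma` are hypothetical. The thesis's own implementations are listed as M-language and Simulink versions, so Python is used here only for readability.

```python
import math


def neighborhood(center, index, sigma):
    """Gaussian weight of the map unit at `index` relative to the winning
    unit at `center` on a 1-D map (assumed neighborhood function)."""
    return math.exp(-((index - center) ** 2) / (2.0 * sigma ** 2))


def neighborhood_q_update(q, s, a, reward, s_next,
                          alpha=0.5, gamma=0.9, sigma=1.0):
    """Apply one Q-learning update to the winning (state unit, action unit)
    pair and, with Gaussian-decayed strength, to neighboring units.

    q       -- list of rows, q[state_unit][action_unit]
    s, a    -- indices of the winning state and action units
    s_next  -- index of the winning state unit after the action
    """
    # Standard Q-learning target, computed once before any entry changes.
    target = reward + gamma * max(q[s_next])
    for i in range(len(q)):
        h_state = neighborhood(s, i, sigma)
        for j in range(len(q[i])):
            h_action = neighborhood(a, j, sigma)
            # Units close to the winners move further toward the target.
            q[i][j] += alpha * h_state * h_action * (target - q[i][j])
    return q


# Hypothetical example: 4 state units, 3 action units, all values zero.
q = [[0.0] * 3 for _ in range(4)]
neighborhood_q_update(q, s=1, a=2, reward=1.0, s_next=2)
```

With `sigma` close to zero the update reduces to ordinary tabular Q-learning on the winning pair alone. Larger values spread each reward over nearby units, which is one way a table indexed by map units could stay topology preserving, as the abstract describes.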