Brain-inspired Neocortical Computing Model and System Design for Intelligent Visual Recognition Applications
Date Issued
2011
Date
2011
Author(s)
Lee, Yu-Ju
Abstract
As technology continues to evolve, computers gain more and more computing capacity, which has driven the emergence of many intelligent applications such as smile shutters, automatic surveillance systems, smart cars and smart homes. These smart machines can sense their surroundings as humans do and provide safety, convenience and efficiency. On the other hand, since we are in an era in which radio-equipped computers dominate, the amount of multimedia data is growing extremely fast. YouTube reported in 2010 that more than 35 hours of video were being uploaded to the video-sharing site every minute. At this rate, we need to handle over one zettabyte of information annually. Therefore, to support various intelligent applications and manage this huge amount of data, we need an efficient and scalable hardware platform that provides the required computation capability. The ultimate goal is to approach human-like intelligence. For building an intelligent machine, mimicking the structures and functions of the visual cortex has always been a major approach to implementing a human-like intelligent visual system. In this thesis, we started by exploring the brain's computing style and architecture, and then designed a brain-like computing system for visual recognition that scales easily with the amount of available resources for future intelligent applications. The system design flow consists of Neocortical Computing (NC) model design, Neocortical Computing System design, and Neocortical Memory Map Generation (NMG). The NC model provides the functionality required by intelligent applications. The NC system is an efficient and scalable hardware platform optimized for the NC model. NMG provides programmability for the NC system by transforming the NC model into specific memory content that the platform can interpret. The main system design strategy in this thesis is to provide scalability and efficiency like those of the human brain.
First, we analyze current NC models and find that they suffer from massive feature matching and thus lack scalability as the network and the number of features to match grow larger. To solve this problem, inspired by the object selectivity of cortical columns in the inferior temporal (IT) cortex and by locality-sensitive hashing techniques, we proposed Feature-Selective Hashing (FSH) to index object instances efficiently. In our experiments, FSH reduces memory matching time by up to 90% with an accuracy drop of less than 1%, and it also provides computational scalability as the number of learned object instances increases. This shows that the proposed NC model using FSH is both efficient and scalable enough for our target system requirements.
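The abstract does not give the details of FSH, so the following is only a minimal sketch of the generic locality-sensitive hashing idea it cites as inspiration, using random-hyperplane sign bits. The names `make_hyperplanes`, `hash_key` and `HashIndex`, the bit count, and the toy feature vectors are all hypothetical and are not taken from the thesis. The point it illustrates is that a query is compared only against the instances in its own bucket rather than against every learned instance.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_key(vector, hyperplanes):
    """One sign bit per hyperplane; similar directions tend to share a key."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0.0 else 0
        for plane in hyperplanes
    )


class HashIndex:
    """Toy hash index (an assumption for illustration, not the thesis's FSH)."""

    def __init__(self, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)

    def add(self, label, vector):
        self.buckets[hash_key(vector, self.hyperplanes)].append((label, vector))

    def candidates(self, query):
        # Only the bucket that the query hashes to is inspected.
        return self.buckets.get(hash_key(query, self.hyperplanes), [])

    def nearest(self, query):
        """Return the label of the closest candidate, or None if the bucket is empty."""
        best_label, best_dist = None, float("inf")
        for label, vector in self.candidates(query):
            dist = sum((a - b) ** 2 for a, b in zip(vector, query))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


index = HashIndex(num_bits=8, dim=3)
index.add("cup", [1.0, 0.0, 0.5])
index.add("car", [-1.0, 2.0, -0.5])
print(index.nearest([1.0, 0.0, 0.5]))
```

In a sketch like this, matching cost depends on bucket size rather than on the total number of stored instances, which is the general scalability argument behind hashing-based indexing. A query that lands in an empty bucket returns `None`, which is one way such a scheme can trade a small amount of accuracy for speed.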
Second, for the NC system design, we analyze the computation of the NC model and identify its main problem: massive and irregular data access, which results in power inefficiency, redundant external bandwidth usage, slow response and a lack of scalability. In current computing systems, this problem makes the NC system memory-bound. To address this issue, inspired by the information forwarding scheme of neurons, we proposed a Push-based Dataflow (Push-DF) structure that uses push-based processing to reduce external memory access and forward sparse data efficiently. In our experiments, Push-DF in a many-core architecture achieves lower latency, power consumption and external bandwidth than RISC and GPU platforms. Push-based processing greatly reduces external memory access, so our NC system can break the bottleneck of traditional memory-bound systems. This feature provides the communication scalability of our NC system, which meets the design goal of a scalable brain-mimicking hardware platform.
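To make the contrast between push-based and conventional (pull-based) processing concrete, here is a small software sketch under stated assumptions. It is not the Push-DF hardware structure, whose details the abstract does not describe. The functions `invert`, `pull_update` and `push_update`, the toy network, and the access counters are all hypothetical. In the pull version every target reads all of its inputs, while in the push version only active sources forward their values.

```python
def invert(outgoing):
    """Turn a source -> [(target, weight)] map into target -> [(source, weight)]."""
    incoming = {}
    for src, edges in outgoing.items():
        for target, weight in edges:
            incoming.setdefault(target, []).append((src, weight))
    return incoming


def pull_update(incoming, activity):
    """Each target reads every input, whether or not the source is active."""
    reads = 0
    out = {}
    for target, sources in incoming.items():
        total = 0.0
        for src, weight in sources:
            total += weight * activity.get(src, 0.0)
            reads += 1
        out[target] = total
    return out, reads


def push_update(outgoing, activity, targets):
    """Only sources with nonzero activity forward data to their targets."""
    sends = 0
    out = {t: 0.0 for t in targets}
    for src, value in activity.items():
        if value == 0.0:
            continue  # silent sources generate no traffic
        for target, weight in outgoing.get(src, []):
            out[target] += weight * value
            sends += 1
    return out, sends


# Toy network: three sources, two targets, only source "a" is active.
outgoing = {"a": [("x", 0.5), ("y", 1.0)], "b": [("x", 2.0)], "c": [("y", 1.5)]}
activity = {"a": 1.0, "b": 0.0, "c": 0.0}
incoming = invert(outgoing)

pull_out, pull_reads = pull_update(incoming, activity)
push_out, push_sends = push_update(outgoing, activity, incoming.keys())
print(pull_out == push_out, pull_reads, push_sends)
```

Both versions compute the same target values, but in this toy case the pull version performs four reads while the push version performs two sends. The gap widens as activity becomes sparser, which is the intuition behind forwarding sparse data by pushing.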
Finally, we used the proposed Push-DF structure to design the NC system and implemented a 36-core NCSoC in TSMC 65 nm technology. To ease the heavy inter-core traffic of the Push-DF structure, we adopt a Kautz NoC in the NCSoC, which features a logarithmic diameter, easy routing and a regular network, and provides structural scalability. We adopt a bio-inspired network mapping to fold the NC model into the Kautz NoC. Under this mapping scheme, the NC model goes through the proposed NMG, which is responsible for task partitioning, network parsing, optimization and memory format transformation, and is converted to a memory map. Our final implementation of the NCSoC takes 0.033 seconds to recognize a 128×128 image (with a recognition network of 65k neural units and 19M synapses) at 225 MHz with 208 mW of power consumption, and it achieves 30.2 Gops and 1.8 Tbps. Compared with an Intel i7-860 2.93 GHz CPU, the NCSoC achieves a 44.1x speedup, 644x normalized performance in area efficiency and 15,516x in power efficiency. Compared with other state-of-the-art many-core processors, the NCSoC achieves at least 6.87x higher Gops/W and thus has better power efficiency. In conclusion, the NCSoC supports the NC model for various intelligent recognition tasks, and provides better performance, efficiency and scalability than current computing platforms. As a result, it has the potential to support various intelligent applications and manage the huge amount of multimedia data of future applications.
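The Kautz topology properties mentioned above (regular degree, small diameter, easy routing) can be checked with a short sketch. The abstract does not state which Kautz graph parameters the NCSoC uses, so the choice below of degree 3 with 3-symbol labels is an assumption, picked only because it yields 36 nodes, the same as the core count. The functions `kautz_graph`, `diameter` and `route` are hypothetical names, and the shift-based routing shown is the textbook scheme for Kautz graphs, not necessarily the router in the chip.

```python
from collections import deque
from itertools import product


def kautz_graph(degree, length):
    """Nodes are `length`-symbol strings over degree+1 symbols with no equal neighbors."""
    symbols = range(degree + 1)
    nodes = [s for s in product(symbols, repeat=length)
             if all(a != b for a, b in zip(s, s[1:]))]
    # Edge rule: drop the first symbol, append any symbol different from the last.
    edges = {s: [s[1:] + (x,) for x in symbols if x != s[-1]] for s in nodes}
    return nodes, edges


def diameter(nodes, edges):
    """Longest shortest-path length, found by breadth-first search from every node."""
    worst = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in edges[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        assert len(dist) == len(nodes)  # every node is reachable
        worst = max(worst, max(dist.values()))
    return worst


def route(src, dst):
    """Shift routing: reuse the longest suffix of src that is a prefix of dst."""
    n = len(src)
    overlap = next(k for k in range(n, -1, -1) if src[n - k:] == dst[:k])
    path, cur = [src], src
    for symbol in dst[overlap:]:
        cur = cur[1:] + (symbol,)
        path.append(cur)
    return path


nodes, edges = kautz_graph(degree=3, length=3)
print(len(nodes), diameter(nodes, edges))
print(route((0, 1, 2), (3, 0, 1)))
```

With these assumed parameters the graph has 36 nodes, each with 3 outgoing links, and a diameter of 3 hops. The route is derived from the source and destination labels alone, with no routing table, which is one common reading of "easy-to-route".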
Subjects
object recognition
neocortex
hashing
dataflow
system-on-chip
Type
thesis
File(s)
Name
ntu-100-R98943023-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):2622fe7fbd9a12239e3c762538dff6d1
