https://scholars.lib.ntu.edu.tw/handle/123456789/498657
Title: | Named entity recognition from spoken documents using global evidences and external knowledge sources with applications on Mandarin Chinese | Authors: | Pan, Y.-C.; Liu, Y.-Y.; LIN-SHAN LEE |
Issue Date: | 2005 | Journal Volume: | 2005 | Start page/Pages: | 390-395 | Source: | Proceedings of ASRU 2005: 2005 IEEE Automatic Speech Recognition and Understanding Workshop | Conference: | ASRU 2005: 2005 IEEE Automatic Speech Recognition and Understanding Workshop | Abstract: | In this paper, we propose two efficient approaches for named entity recognition (NER) from spoken documents. The first approach uses a very efficient data structure, the PAT tree, to extract global evidence from the whole spoken document, to be used together with the well-known local (internal and external) evidence used by conventional approaches. The basic idea is that a named entity (NE) may not be easily recognized in certain contexts, but may become much easier to recognize when its repeated occurrences across the different sentences of the same spoken document are considered jointly. This approach is equally useful for NER from text and from spoken documents. The second approach tries to recover named entities (NEs) that are out-of-vocabulary (OOV) words and thus cannot appear in the transcriptions. The basic idea is to use reliable and important words in the transcription to construct queries that retrieve relevant text documents from external knowledge sources (such as the Internet). Matching the NEs obtained from these retrieved text documents against selected sections of the phone lattice of the spoken document can recover some NEs that are OOV words. The experiments were performed on Mandarin Chinese by incorporating these two approaches into a conventional hybrid statistical/rule-based NER system for Chinese. Very significant performance improvements were obtained. © 2005 IEEE. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/498657 https://www.scopus.com/inward/record.uri?eid=2-s2.0-33846261202&doi=10.1109%2fASRU.2005.1566535&partnerID=40&md5=cb9cda2d3448cb7e0f78aa48b9022724 |
DOI: | 10.1109/ASRU.2005.1566535 | SDG/Keyword: | Data structures; Information retrieval; Information retrieval systems; Statistical methods; Text processing; Trees (mathematics); External knowledge sources; Named Entity recognition (NER); Speech recognition |
Appears in Collections: | Department of Electrical Engineering |