https://scholars.lib.ntu.edu.tw/handle/123456789/498657
Title: | Named entity recognition from spoken documents using global evidences and external knowledge sources with applications on Mandarin Chinese | Authors: | Pant, Y.-C.; Liu, Y.-Y.; LIN-SHAN LEE |
Publication date: | 2005 | Volume: | 2005 | Pages (start-end): | 390-395 | Source publication: | Proceedings of ASRU 2005: 2005 IEEE Automatic Speech Recognition and Understanding Workshop | Conference paper: | ASRU 2005: 2005 IEEE Automatic Speech Recognition and Understanding Workshop | Abstract: | In this paper, we propose two efficient approaches for Named Entity recognition (NER) from spoken documents. The first approach used a very efficient data structure, the PAT trees, to extract global evidences from the whole spoken documents, to be used with the well-known local (internal and external) evidences popularly used by conventional approaches. The basic idea is that a Named Entity (NE) may not be easily recognized in certain contexts, but may become much more easily recognized when its repeated occurrences in all the different sentences in the same spoken document are considered jointly. This approach is equally useful for NER from text and spoken documents. The second approach is to try to recover some Named Entities (NEs) which are out-of-vocabulary (OOV) words and thus can't be obtained in the transcriptions. The basic idea is to use reliable and important words in the transcription to construct queries to retrieve relevant text documents from external knowledge sources (such as Internet). Matching the NEs obtained from these retrieved relevant text documents with some selected sections of the phone lattice of the spoken document can recover some NEs which are OOV words. The experiments were performed on Mandarin Chinese by incorporating these two approaches to a conventional hybrid statistic/rule-based NER system for Chinese language. Very significant performance improvements were obtained. © 2005 IEEE. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/498657 https://www.scopus.com/inward/record.uri?eid=2-s2.0-33846261202&doi=10.1109%2fASRU.2005.1566535&partnerID=40&md5=cb9cda2d3448cb7e0f78aa48b9022724 |
DOI: | 10.1109/ASRU.2005.1566535 | SDG/Keywords: | Data structures; Information retrieval; Information retrieval systems; Statistical methods; Text processing; Trees (mathematics); External knowledge sources; Named Entity recognition (NER); Speech recognition |
Appears in: | Department of Electrical Engineering |
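
The first approach in the abstract treats repeated occurrences of a candidate string across all sentences of one document as global evidence. A minimal sketch of that counting idea follows. It assumes a sorted suffix list as a simplified stand-in for the PAT tree named in the abstract, and it is not the authors' implementation. The function names `build_suffix_index` and `count_occurrences`, the example sentences, and the candidate strings are all hypothetical.

```python
from bisect import bisect_left


def build_suffix_index(sentences):
    """Collect every suffix of every sentence and sort them.

    Assumption: this naive sorted list only imitates the prefix-lookup
    behaviour of a PAT tree; it is far less space-efficient.
    """
    suffixes = []
    for sentence in sentences:
        for start in range(len(sentence)):
            suffixes.append(sentence[start:])
    suffixes.sort()
    return suffixes


def count_occurrences(suffixes, candidate):
    """Count how often `candidate` occurs anywhere in the indexed document.

    Each occurrence corresponds to exactly one suffix that starts with
    `candidate`, and those suffixes are contiguous in sorted order.
    """
    if not candidate:
        return 0
    position = bisect_left(suffixes, candidate)
    count = 0
    while (position + count < len(suffixes)
           and suffixes[position + count].startswith(candidate)):
        count += 1
    return count


# Hypothetical three-sentence "document" (character-level Mandarin text).
sentences = ["台灣大學位於台北", "他昨天抵達台北", "台北今天下雨"]
index = build_suffix_index(sentences)

# Document-wide counts for a few hypothetical named-entity candidates.
candidates = ["台北", "台灣大學", "高雄"]
global_counts = {c: count_occurrences(index, c) for c in candidates}
print(global_counts)
```

A candidate that recurs in several sentences gets a higher document-wide count than any single sentence could supply. How such counts would be combined with local evidence in an NER system is not specified in this record, so the sketch stops at counting.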