https://scholars.lib.ntu.edu.tw/handle/123456789/488423
Title: | Cross Document Event Clustering Using Knowledge Mining from Co-reference Chains | Authors: | Kuo, June-Jei; Chen, Hsin-Hsi |
Keywords: | Co-reference chains; Controlled vocabulary; Event clustering; Multi-document summarization | Publication date: | 2005 | Pages (start-end): | 121-134 | Source publication: | Information Retrieval Technology, Second Asia Information Retrieval Symposium, AIRS 2005, Jeju Island, Korea, October 13-15, 2005, Proceedings | Abstract: | Unifying terminology usage, which captures more term semantics, is useful for event clustering. This paper proposes a metric, normalized chain edit distance, to incrementally mine a controlled vocabulary from cross-document co-reference chains. The controlled vocabulary is employed to unify terms across different co-reference chains. A novel threshold model that incorporates both a time decay function and a spanning window uses the controlled vocabulary for event clustering on streaming news. Given correct co-reference chains, the proposed system achieves a 15.97% performance improvement over the baseline system and a 5.93% improvement over the system without the controlled vocabulary. Furthermore, a Chinese co-reference resolution system with a chain filtering mechanism is used to test the robustness of the proposed event clustering system. The clustering system using noisy co-reference chains still achieves a 10.55% performance improvement over the baseline system. These results show that the approach is promising. © 2006 Elsevier Ltd. All rights reserved. |
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-33750473709&doi=10.1016%2fj.ipm.2006.07.016&partnerID=40&md5=31fd5607063d7e18f0f3fc5258e31b20 | DOI: | 10.1007/11562382_10 | SDG/Keywords: | Information theory; Knowledge acquisition; Robustness (control systems); Semantics; Thesauri; Co-reference chains; Event clustering; Multi-document summarization; Data mining |
Appears in Collections: | Department of Computer Science and Information Engineering |