Semantics vs. Positions: Utilizing Balanced Proximity in Language Model Smoothing for Information Retrieval
Journal
6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference
Pages
507-515
Date Issued
2013
Abstract
Work on information retrieval has shown that language model smoothing leads to more accurate estimation of document models and is hence crucial for good retrieval performance. Several smoothing methods have been proposed in the literature, using either semantic or positional information. In this paper, we propose a unified proximity-based framework that smooths language models by leveraging semantic and positional information simultaneously. The key idea is to project terms to positions where they do not originally occur (i.e., have zero count), which amounts to a word-count propagation process. We achieve this projection through two proximity-based density functions capturing semantic association and positional adjacency. We balance the effects of semantic and positional smoothing, and score a document based on the smoothed language model. Experiments on four standard TREC test collections show that our smoothing model is effective for information retrieval and generally outperforms the state of the art. © IJCNLP 2013. All rights reserved.
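The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's actual model: the Gaussian positional kernel, the toy semantic-similarity dictionary, and the interpolation weight `lam` are all illustrative assumptions standing in for the two proximity-based density functions and their balance.

```python
import math
from collections import defaultdict

def balanced_smoothed_lm(doc, sem_sim, lam=0.5, sigma=2.0):
    """Toy balanced-proximity smoothing (illustrative, not the paper's model).

    doc     : list of terms in document order
    sem_sim : dict mapping a term to {related_term: similarity} (assumed given)
    lam     : hypothetical balance between positional and semantic propagation
    sigma   : bandwidth of the assumed Gaussian positional kernel
    """
    n = len(doc)
    counts = defaultdict(float)
    for i in range(n):                # target position receiving mass
        for j, wj in enumerate(doc):  # source occurrence propagating mass
            # Positional propagation: wj's count reaches position i,
            # discounted by a Gaussian kernel on the distance |i - j|.
            counts[wj] += lam * math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
            # Semantic propagation: terms associated with wj also receive
            # mass, so a term can get a nonzero count where it never occurs.
            for w, s in sem_sim.get(wj, {}).items():
                counts[w] += (1 - lam) * s / n
    # Normalize the propagated counts into a smoothed language model.
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```

With `balanced_smoothed_lm(["cat", "sat", "mat"], {"cat": {"feline": 0.8}})`, the unseen term "feline" receives probability mass from its semantic association with "cat", while observed terms keep most of the mass through the positional kernel.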
Other Subjects
Computational linguistics; Semantics; Accurate estimation; Document modeling; Language model; Positional information; Propagation process; Retrieval performance; Semantic associations; Semantic information; Smoothing methods; Zero count; Information retrieval
Type
conference paper
