Practical Counterfactual Policy Learning for Top-K Recommendations
Journal
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISBN
9781450393850
Date Issued
2022-08-14
Author(s)
Abstract
For building recommender systems, a critical task is to learn a policy from collected feedback (e.g., ratings, clicks) that decides which items to recommend to users. However, it has been shown that selection bias in the collected feedback leads to biased learning and thus a sub-optimal policy. To address this issue, counterfactual learning has received much attention, and existing approaches can be categorized as either value learning or policy learning. This work studies policy learning approaches for top-K recommendation over a large item space and points out several difficulties related to importance-weight explosion, observation insufficiency, and training efficiency. A practical policy learning framework is then proposed to overcome these difficulties. Our experiments confirm the effectiveness and efficiency of the proposed framework.
Subjects
counterfactual learning | policy learning | recommender systems | selection bias
Type
conference paper
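The importance-weight explosion mentioned in the abstract arises when the target policy's propensity for an action is far larger than the logging policy's, making inverse-propensity-scored (IPS) estimates high-variance. A minimal sketch of one standard remedy, weight clipping, is shown below; the function name, arguments, and clip value are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def clipped_ips_objective(rewards, target_probs, logging_probs, clip=10.0):
    """Clipped IPS estimate of a target policy's value.

    rewards       : observed feedback for each logged (user, item) pair
    target_probs  : probability the target policy takes the logged action
    logging_probs : probability the logging policy took that action
    clip          : cap on importance weights (mitigates weight explosion;
                    the value 10.0 is an illustrative choice)
    """
    # Importance weights; clipping trades a small bias for lower variance.
    weights = np.minimum(target_probs / logging_probs, clip)
    return float(np.mean(weights * rewards))

# Example: the third weight (0.9 / 0.01 = 90) is clipped to 10.
value = clipped_ips_objective(
    rewards=np.array([1.0, 0.0, 1.0]),
    target_probs=np.array([0.5, 0.2, 0.9]),
    logging_probs=np.array([0.25, 0.2, 0.01]),
)
```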