Linear Upper Confidence Bound Algorithm for Contextual Bandit Problem with Piled Rewards.

Huang, Kuan-Hao;Lin, Hsuan-Tien

標題:	Linear Upper Confidence Bound Algorithm for Contextual Bandit Problem with Piled Rewards.
作者:	Huang, Kuan-Hao HSUAN-TIEN LIN
關鍵字:	Bandit; Piled rewards; Upper confidence bound
公開日期:	2016
起(迄)頁:	143-155
來源出版物:	Advances in Knowledge Discovery and Data Mining - 20th Pacific-Asia Conference, PAKDD 2016, Auckland, New Zealand, April 19-22, 2016, Proceedings, Part II
摘要:	We study the contextual bandit problem with linear payoff function. In the traditional contextual bandit problem, the algorithm iteratively chooses an action based on the observed context, and immediately receives a reward for the chosen action. Motivated by a practical need in many applications, we study the design of algorithms under the piled-reward setting, where the rewards are received as a pile instead of immediately. We present how the Linear Upper Confidence Bound (Lin- UCB) algorithm for the traditional problem can be naïvely applied under the piled-reward setting, and prove its regret bound. Then, we extend LinUCB to a novel algorithm, called Linear Upper Confidence Bound with Pseudo Reward (LinUCBPR), which digests the observed contexts to choose actionsmore strategically before the piled rewards are received.We prove that LinUCBPR can match LinUCB in the regret bound under the piled-reward setting. Experiments on the artificial and real-world datasets demonstrate the strong performance of LinUCBPR in practice. © Springer International Publishing Switzerland 2016.
URI:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-84987968686&doi=10.1007%2f978-3-319-31750-2_12&partnerID=40&md5=501943f3a03c11b306f2153c1a896c65
DOI:	10.1007/978-3-319-31750-2_12
SDG/關鍵字:	Iterative methods; Piles; Probability; Bandit; Contextual bandits; Design of algorithms; Novel algorithm; Payoff function; Piled rewards; Real-world datasets; Upper confidence bound; Data mining
顯示於：	資訊工程學系

顯示文件完整紀錄

SCOPUS^TM
Citations

checked on 2023/12/27

Page view(s)

checked on 2024/4/27

Google Scholar^TM

檢查

Altmetric

TAIR相關文章

SCOPUSTM Citations

Page view(s)

Google ScholarTM

Altmetric

Altmetric

SCOPUS^TM
Citations

Google Scholar^TM