Joongkyu Lee
Publications & Preprints
Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
We study online preference-based reinforcement learning (PbRL) with the goal of improving sample efficiency. While a growing body of …
Joongkyu Lee, Seouh-won Yi, Min-hwan Oh
True Impact of Cascade Length in Contextual Cascading Bandits
We revisit the contextual cascading bandit, where a learning agent recommends an ordered list (cascade) of items and …
Hyunjun Choi, Joongkyu Lee, Min-hwan Oh
Combinatorial Reinforcement Learning with Preference Feedback
In this paper, we consider combinatorial reinforcement learning with preference feedback, where a learning agent sequentially offers an …
Joongkyu Lee, Min-hwan Oh
Improved Online Confidence Bounds for Multinomial Logistic Bandits
In this paper, we propose an improved online confidence bound for multinomial logistic (MNL) models and apply this result to MNL …
Joongkyu Lee, Min-hwan Oh
Nearly Minimax Optimal Regret for Multinomial Logistic Bandit (Top 0.2%, 32/15671)
In this paper, we study the contextual multinomial logit (MNL) bandit problem in which a learning agent sequentially selects an …
Joongkyu Lee, Min-hwan Oh
Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
We study reinforcement learning with multinomial logistic (MNL) function approximation where the underlying transition probability …
Wooseong Cho, Taehyun Hwang, Joongkyu Lee, Min-hwan Oh
Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
In this paper, we first challenge the common premise that linear MDPs always induce performance guarantees independent of the state …
Joongkyu Lee, Min-hwan Oh
Learning Uncertainty-Aware Temporally-Extended Actions
In reinforcement learning, temporal abstraction in the action space, exemplified by action repetition, is a technique to facilitate …
Joongkyu Lee, Seung Joon Park, Yunhao Tang, Min-hwan Oh