Publications & Preprints

Joongkyu Lee, Min-hwan Oh (2026). Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent. arXiv 2026.

Deokgyu Yoon, Hyungkyu Kang, Joongkyu Lee, Byeongchan Kim, Gyungin Shin, Sungrae Park, Min-hwan Oh (2026). Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards. arXiv 2026.

Heesang Ann, Joongkyu Lee, Min-hwan Oh (2026). Block-Sphere Vector Quantization. arXiv 2026.

Joongkyu Lee, Min-hwan Oh (2026). Optimal Design for Multinomial Logit Model with Applications to Best Assortment Identification. ICML 2026.

Hyunjun Choi, Joongkyu Lee, Min-hwan Oh (2025). True Impact of Cascade Length in Contextual Cascading Bandits. NeurIPS 2025.

Joongkyu Lee, Seouh-won Yi, Min-hwan Oh (2025). Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options. NeurIPS 2025.

Joongkyu Lee, Min-hwan Oh (2025). Improved Online Confidence Bounds for Multinomial Logistic Bandits. ICML 2025.

Joongkyu Lee, Min-hwan Oh (2025). Combinatorial Reinforcement Learning with Preference Feedback. ICML 2025.

Wooseong Cho, Taehyun Hwang, Joongkyu Lee, Min-hwan Oh (2024). Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation. NeurIPS 2024.

Joongkyu Lee, Min-hwan Oh (2024). Nearly Minimax Optimal Regret for Multinomial Logistic Bandit. NeurIPS 2024 (Top 0.2%, 32/15671).

Joongkyu Lee, Min-hwan Oh (2024). Demystifying Linear MDPs and Novel Dynamics Aggregation Framework. ICLR 2024.

Joongkyu Lee, Seung Joon Park, Yunhao Tang, Min-hwan Oh (2024). Learning Uncertainty-Aware Temporally-Extended Actions. AAAI 2024.