Joongkyu Lee
Publications & Preprints
Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
We study nonstationary generalized linear bandits (GLBs), where the expected reward is modeled through a nonlinear link function with …
Joongkyu Lee, Min-hwan Oh
Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards
Reinforcement learning with verifiable rewards (RLVR) plays a pivotal role in improving the reasoning ability of large language models. …
Deokgyu Yoon, Hyungkyu Kang, Joongkyu Lee, Byeongchan Kim, Gyungin Shin, Sungrae Park, Min-hwan Oh
Block-Sphere Vector Quantization
Vector quantization is a fundamental primitive for scalable machine learning systems, enabling memory-efficient storage, fast …
Heesang Ann, Joongkyu Lee, Min-hwan Oh