The study showed that models trained with INTUITOR achieved up to a 76% accuracy gain on code reasoning benchmarks and a 65% gain on code generation tasks.
Founders
Xuandong Zhao, Zhewei Kang, Sergey Levine, Dawn Song, Aosong Feng
Company Description
INTUITOR is a system developed by researchers at UC Berkeley and Yale that enables large language models (LLMs) to improve their reasoning without external rewards, using the model's own internal confidence scores as the training signal. This approach, called Reinforcement Learning from Internal Feedback (RLIF), allows models to generalize more effectively across tasks.
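As a rough illustration of how an internal confidence score might stand in for an external reward, the sketch below scores each candidate answer by how far its per-token probability distributions are from uniform, then normalizes the scores within a group of candidates. The scoring rule (mean KL divergence from a uniform distribution), the group normalization, and the names `confidence_score` and `normalized_rewards` are assumptions made for illustration; the description above does not specify them.

```python
import math


def confidence_score(token_distributions):
    """Mean KL(U || p) over the tokens of one answer (assumed scoring rule).

    Each entry is a next-token probability distribution with strictly
    positive probabilities. A uniform distribution scores 0; more peaked
    (more confident) distributions score higher.
    """
    total = 0.0
    for probs in token_distributions:
        vocab = len(probs)
        uniform = 1.0 / vocab
        # KL(U || p) = sum_j U_j * log(U_j / p_j)
        total += sum(uniform * math.log(uniform / p) for p in probs)
    return total / len(token_distributions)


def normalized_rewards(scores, eps=1e-8):
    """Center and scale scores within a group of candidate answers."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(variance)
    return [(s - mean) / (std + eps) for s in scores]


# Two hypothetical candidate answers over a toy 4-token vocabulary.
confident_answer = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
uncertain_answer = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

scores = [confidence_score(confident_answer), confidence_score(uncertain_answer)]
rewards = normalized_rewards(scores)  # the confident answer gets the higher reward
```

In a reinforcement learning loop, rewards like these would take the place of a score from an external verifier or reward model, so no labeled answers are needed.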