CODEBLOCK: Learning to Supervise Code at the Right Granularity
CODEBLOCK introduces a granular approach to supervising code LLMs by selecting high-value tokens rather than applying uniform loss.
Standard supervised fine-tuning (SFT) applies a uniform cross-entropy loss to every token, which can disrupt code syntax and semantics. CODEBLOCK instead identifies and supervises only the most informative tokens, improving model performance on code generation tasks while maintaining structural integrity.
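To make the contrast concrete, the sketch below compares a uniform per-token cross-entropy with a loss averaged over a selected subset of tokens. It is a minimal illustration, not CODEBLOCK's implementation: the summary above does not describe how tokens are chosen, so the boolean `mask` is a hand-written placeholder, and the function names `token_nll`, `uniform_loss`, and `selective_loss` are hypothetical.

```python
import math


def token_nll(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def uniform_loss(all_logits, targets):
    """Standard SFT objective: mean cross-entropy over every token."""
    losses = [token_nll(l, t) for l, t in zip(all_logits, targets)]
    return sum(losses) / len(losses)


def selective_loss(all_logits, targets, mask):
    """Mean cross-entropy over only the tokens where `mask` is True."""
    losses = [
        token_nll(l, t)
        for l, t, keep in zip(all_logits, targets, mask)
        if keep
    ]
    if not losses:  # nothing selected: contribute no loss
        return 0.0
    return sum(losses) / len(losses)


# Toy sequence of three tokens over a vocabulary of size three.
all_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
targets = [0, 2, 1]

# ASSUMPTION: a hand-picked mask standing in for whatever selection
# rule the method actually uses; only the second token is supervised.
mask = [False, True, False]

loss_all = uniform_loss(all_logits, targets)
loss_selected = selective_loss(all_logits, targets, mask)
```

With every mask entry set to True, `selective_loss` reduces to `uniform_loss`, so the uniform objective is the special case in which all tokens are treated as equally informative.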