lesswrong.com · Apr 24, 2026 05:26 PM UTC

An Empirical Study of Methods for SFTing Opaque Reasoning Models — LessWrong

Summary

We open-source our code here.

Introduction

Current reasoning models produce chains of thought that are largely human-readable, which makes supervi…

AFBytes is a read-only aggregator. Use the original source for full context and complete reporting.