An Empirical Study of Methods for SFTing Opaque Reasoning Models — LessWrong

Summary

We open-source our code here.

Introduction

Current reasoning models produce chains of thought that are largely human-readable, which makes supervi…
