Interactionless Inverse Reinforcement Learning: A Data-Centric Framework for Durable Alignment
arXiv:2602.14844v1
Announce Type: new
Abstract: AI alignment is growing in importance, yet current approaches suffer from a critical structural flaw that entangles the safety objectives...