The Double Life of Code World Models: Provably Unmasking Malicious Behavior Through Execution Traces
arXiv:2512.13821v2 Announce Type: replace
Abstract: Large language models (LLMs) increasingly generate code with minimal human oversight, raising critical concerns about backdoor injection and malicious behavior....