RFEval: Benchmarking Reasoning Faithfulness under Counterfactual Reasoning Intervention in Large Reasoning Models
arXiv:2602.17053v2 Announce Type: replace
Abstract: Large Reasoning Models (LRMs) exhibit strong performance, yet often produce rationales that sound plausible but fail to reflect their true decision process, undermining reliability and trust. We introduce a formal framework f...
🔗 Read more: https://arxiv.org/abs/2602.17053
#News #Engineering #Software #Policy #AI #Academic