RM-R1: Reward Modeling as Reasoning
arXiv:2505.02387v4 Announce Type: replace Abstract: Reward modeling is essential for aligning large language models with human preferences through reinforcement learning. To provide accurate reward signals,...
arXiv:2505.05283v3 Announce Type: replace Abstract: Code large language models (CodeLLMs) and agents are increasingly being integrated into complex software engineering tasks spanning the entire Software...
arXiv:2505.09764v3 Announce Type: replace Abstract: All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging...
arXiv:2505.11165v2 Announce Type: replace Abstract: Event cameras deliver visual data with high temporal resolution, low latency, and minimal redundancy, yet their asynchronous, sparse sequential nature...
arXiv:2505.12189v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal logical validity. This can lead to wrong...
arXiv:2505.13531v2 Announce Type: replace Abstract: Assessing Large Language Models' (LLMs) underlying value differences enables comprehensive comparison of their misalignment, cultural adaptability, and biases. Nevertheless, current value...
arXiv:2505.13782v3 Announce Type: replace Abstract: The paper presents a novel sample-based algorithm, called C*, for real-time coverage path planning (CPP) of unknown environments. C* is...
arXiv:2505.18663v4 Announce Type: replace Abstract: Diffusion Transformers (DiTs) have emerged as the state-of-the-art architecture for video generation, yet their computational and memory demands hinder practical...
arXiv:2505.19297v2 Announce Type: replace Abstract: Pre-training equips text-to-image (T2I) models with broad world knowledge, but this alone is often insufficient to achieve high aesthetic quality...
arXiv:2505.19916v2 Announce Type: replace Abstract: Modern systems exhibit unprecedented complexity due to their increased scale, interconnectedness, and the heterogeneity of their digital and physical components....
arXiv:2505.21099v2 Announce Type: replace Abstract: Deep learning based Image Super-Resolution (ISR) relies on large training datasets to optimize model generalization; this requires substantial computational and...
arXiv:2505.23819v5 Announce Type: replace Abstract: Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and...
arXiv:2506.03362v2 Announce Type: replace Abstract: Humans subconsciously choose robust ways of selecting and using tools, for example, choosing a ladle over a flat spatula to...
arXiv:2506.06727v4 Announce Type: replace Abstract: Large Multimodal Models have achieved remarkable progress in integrating vision and language, enabling strong performance across perception, reasoning, and domain-specific...
arXiv:2506.07658v3 Announce Type: replace Abstract: Accurate domain-specific benchmarking of LLMs is essential, specifically in domains with direct implications for humans, such as law, healthcare, and...
arXiv:2506.08706v3 Announce Type: replace Abstract: Systems built on the Robot Operating System (ROS) are increasingly easy to assemble, yet hard to govern and reliably coordinate....
arXiv:2506.13082v4 Announce Type: replace Abstract: Moral competence is the ability to act in accordance with moral principles. As large language models (LLMs) are increasingly deployed...
arXiv:2506.15735v2 Announce Type: replace Abstract: Identifying inputs that trigger specific behaviours or latent features in language models could have a wide range of safety use...
arXiv:2506.15751v2 Announce Type: replace Abstract: As large language models (LLMs) are deployed in safety-critical settings, it is essential to ensure that their responses comply with...
arXiv:2506.23138v2 Announce Type: replace Abstract: The notable gap between user-provided and model-preferred prompts poses a significant challenge for generating high-quality images with text-to-image models, compelling...
arXiv:2507.01654v2 Announce Type: replace Abstract: Vision Transformers naturally accommodate sparsity, yet standard tokenization methods confine features to discrete patch grids. This constraint prevents models from...
arXiv:2507.03197v3 Announce Type: replace Abstract: CD8+ "killer" T cells and CD4+ "helper" T cells play a central role in the adaptive immune system by recognizing...
arXiv:2507.06265v2 Announce Type: replace Abstract: Understanding how different AI models encode the same high-level concepts, such as objects or attributes, remains challenging because each model...
arXiv:2507.06543v2 Announce Type: replace Abstract: Deriving compact and temporally aware visual representations from dynamic scenes is essential for successful execution of sequential scene understanding tasks...