RM-R1: Reward Modeling as Reasoning
arXiv:2505.02387v4 Announce Type: replace Abstract: Reward modeling is essential for aligning large language models with human preferences through reinforcement learning. To provide accurate reward signals,...
arXiv:2505.05283v3 Announce Type: replace Abstract: Code large language models (CodeLLMs) and agents are increasingly being integrated into complex software engineering tasks spanning the entire Software...
arXiv:2505.09764v3 Announce Type: replace Abstract: All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging...
arXiv:2505.11165v2 Announce Type: replace Abstract: Event cameras deliver visual data with high temporal resolution, low latency, and minimal redundancy, yet their asynchronous, sparse sequential nature...
arXiv:2505.12189v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal logical validity. This can lead to wrong...
arXiv:2505.13531v2 Announce Type: replace Abstract: Assessing Large Language Models' (LLMs) underlying value differences enables comprehensive comparison of their misalignment, cultural adaptability, and biases. Nevertheless, current value...
arXiv:2505.13782v3 Announce Type: replace Abstract: The paper presents a novel sample-based algorithm, called C*, for real-time coverage path planning (CPP) of unknown environments. C* is...
arXiv:2505.18663v4 Announce Type: replace Abstract: Diffusion Transformers (DiTs) have emerged as the state-of-the-art architecture for video generation, yet their computational and memory demands hinder practical...
arXiv:2505.19297v2 Announce Type: replace Abstract: Pre-training equips text-to-image (T2I) models with broad world knowledge, but this alone is often insufficient to achieve high aesthetic quality...
arXiv:2505.19916v2 Announce Type: replace Abstract: Modern systems exhibit unprecedented complexity due to their increased scale, interconnectedness, and the heterogeneity of their digital and physical components....
arXiv:2505.21099v2 Announce Type: replace Abstract: Deep learning based Image Super-Resolution (ISR) relies on large training datasets to optimize model generalization; this requires substantial computational and...
arXiv:2505.23819v5 Announce Type: replace Abstract: Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and...
arXiv:2506.03362v2 Announce Type: replace Abstract: Humans subconsciously choose robust ways of selecting and using tools, for example, choosing a ladle over a flat spatula to...
arXiv:2506.06727v4 Announce Type: replace Abstract: Large Multimodal Models have achieved remarkable progress in integrating vision and language, enabling strong performance across perception, reasoning, and domain-specific...
arXiv:2506.07658v3 Announce Type: replace Abstract: Accurate domain-specific benchmarking of LLMs is essential, specifically in domains with direct implications for humans, such as law, healthcare, and...
arXiv:2506.08706v3 Announce Type: replace Abstract: Systems built on the Robot Operating System (ROS) are increasingly easy to assemble, yet hard to govern and reliably coordinate....
arXiv:2506.13082v4 Announce Type: replace Abstract: Moral competence is the ability to act in accordance with moral principles. As large language models (LLMs) are increasingly deployed...
arXiv:2506.15735v2 Announce Type: replace Abstract: Identifying inputs that trigger specific behaviours or latent features in language models could have a wide range of safety use...
arXiv:2506.15751v2 Announce Type: replace Abstract: As large language models (LLMs) are deployed in safety-critical settings, it is essential to ensure that their responses comply with...
arXiv:2506.23138v2 Announce Type: replace Abstract: The notable gap between user-provided and model-preferred prompts poses a significant challenge for generating high-quality images with text-to-image models, compelling...
arXiv:2507.01654v2 Announce Type: replace Abstract: Vision Transformers naturally accommodate sparsity, yet standard tokenization methods confine features to discrete patch grids. This constraint prevents models from...
arXiv:2507.03197v3 Announce Type: replace Abstract: CD8+ "killer" T cells and CD4+ "helper" T cells play a central role in the adaptive immune system by recognizing...
arXiv:2507.06265v2 Announce Type: replace Abstract: Understanding how different AI models encode the same high-level concepts, such as objects or attributes, remains challenging because each model...
arXiv:2507.06543v2 Announce Type: replace Abstract: Deriving compact and temporally aware visual representations from dynamic scenes is essential for successful execution of sequential scene understanding tasks...