CUEBES

ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning

arXiv:2602.02192v3 Announce Type: replace Abstract: Reinforcement learning (RL) is a critical stage in post-training large language models (LLMs), involving repeated interaction between rollout generation, reward...

Policy Artificial Intelligence

arXiv CS Feb 11

Advancing General-Purpose Reasoning Models with Modular Gradient Surgery

arXiv:2602.02301v2 Announce Type: replace Abstract: Reinforcement learning (RL) has played a central role in recent advances in large reasoning models (LRMs), yielding strong gains in...

Psychology Software

arXiv CS Feb 11

Spark: Modular Spiking Neural Networks

arXiv:2602.02306v2 Announce Type: replace Abstract: Nowadays, neural networks act as a synonym for artificial intelligence. Present neural network models, although remarkably powerful, are inefficient both...

Hardware Artificial Intelligence

arXiv CS Feb 11

A Large-Scale Dataset for Molecular Structure-Language Description via a Rule-Regularized Method

arXiv:2602.02320v2 Announce Type: replace Abstract: Molecular function is largely determined by structure. Accurately aligning molecular structure with natural language is therefore essential for enabling large...

Chemistry Software

arXiv CS Feb 11

Building a Correct-by-Design Lakehouse. Data Contracts, Versioning, and Transactional Pipelines for Humans and Agents

arXiv:2602.02335v2 Announce Type: replace Abstract: Lakehouses are the default cloud platform for analytics and AI, but they become unsafe when untrusted actors concurrently operate on...

Software Engineering

arXiv CS Feb 11

Modelling Socio-Psychological Drivers of Land Management Intensity

arXiv:2602.02347v2 Announce Type: replace Abstract: Land management intensity shapes ecosystem service provision, socio-ecological resilience and is central to sustainable transformation. Yet most land use models...

Environment Psychology

arXiv CS Feb 11

Unified Personalized Reward Model for Vision Generation

arXiv:2602.02380v2 Announce Type: replace Abstract: Recent advancements in multimodal reward models (RMs) have significantly propelled the development of visual generation. Existing frameworks typically adopt Bradley-Terry-style...

Psychology Chemistry

arXiv CS Feb 11

Reward-free Alignment for Conflicting Objectives

arXiv:2602.02495v2 Announce Type: replace Abstract: Direct alignment methods are increasingly used to align large language models (LLMs) with human preferences. However, many real-world alignment problems...

Software World News

arXiv CS Feb 11

Toward Ultra-Long-Horizon Sequential Model Editing

arXiv:2602.02543v2 Announce Type: replace Abstract: Model editing has emerged as a practical approach for mitigating factual errors and outdated knowledge in large language models (LLMs)....

Software Artificial Intelligence

arXiv CS Feb 11

RAP: KV-Cache Compression via RoPE-Aligned Pruning

arXiv:2602.02599v3 Announce Type: replace Abstract: Long-context inference in large language models is increasingly bottlenecked by the memory and compute cost of the KV-Cache. Low-rank factorization...

Software Policy

arXiv CS Feb 11

AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process

arXiv:2602.02676v2 Announce Type: replace Abstract: Adaptive multimodal reasoning has emerged as a promising frontier in Vision-Language Models (VLMs), aiming to dynamically modulate between tool-augmented visual...

Psychology World News

arXiv CS Feb 11

Variational Sparse Paired Autoencoders (vsPAIR) for Inverse Problems and Uncertainty Quantification

arXiv:2602.02948v2 Announce Type: replace Abstract: Inverse problems are fundamental to many scientific and engineering disciplines; they arise when one seeks to reconstruct hidden, underlying quantities...

Software Engineering

arXiv CS Feb 11

FlashSinkhorn: IO-Aware Entropic Optimal Transport

arXiv:2602.03067v2 Announce Type: replace Abstract: Entropic optimal transport (EOT) via Sinkhorn iterations is widely used in modern machine learning, yet GPU solvers remain inefficient at...

Software Energy

arXiv CS Feb 11

From Scalar Rewards to Potential Trends: Shaping Potential Landscapes for Model-Based Reinforcement Learning

arXiv:2602.03201v2 Announce Type: replace Abstract: Model-based reinforcement learning (MBRL) achieves high sample efficiency by simulating future trajectories with learned dynamics and reward models. However, its...

Environment Policy

arXiv CS Feb 11

ConsisDrive: Identity-Preserving Driving World Models for Video Generation by Instance Mask

arXiv:2602.03213v3 Announce Type: replace Abstract: Autonomous driving relies on robust models trained on large-scale, high-quality multi-view driving videos. Although world models provide a cost-effective solution...

Robotics Software

arXiv CS Feb 11

RAWDet-7: A Multi-Scenario Benchmark for Object Detection and Description on Quantized RAW Images

arXiv:2602.03760v2 Announce Type: replace Abstract: Most vision models are trained on RGB images processed through ISP pipelines optimized for human perception, which can discard sensor-level...

Environment Software

arXiv CS Feb 11

Entropy-Aware Structural Alignment for Zero-Shot Handwritten Chinese Character Recognition

arXiv:2602.03913v2 Announce Type: replace Abstract: Zero-shot Handwritten Chinese Character Recognition (HCCR) aims to recognize unseen characters by leveraging radical-based semantic compositions. However, existing approaches often...

Software Engineering

arXiv CS Feb 11

ProAgentBench: Evaluating LLM Agents for Proactive Assistance with Real-World Data

arXiv:2602.04482v2 Announce Type: replace Abstract: Proactive agents that anticipate user intentions without explicit prompts represent a significant evolution in human-AI interaction, promising to reduce cognitive...

Biology Artificial Intelligence

arXiv CS Feb 11

VK-LSVD: A Large-Scale Industrial Dataset for Short-Video Recommendation

arXiv:2602.04567v2 Announce Type: replace Abstract: Short-video recommendation presents unique challenges, such as modeling rapid user interest shifts from implicit feedback, but progress is constrained by...

Policy Software

arXiv CS Feb 11

Learning Where It Matters: Geometric Anchoring for Robust Preference Alignment

arXiv:2602.04909v2 Announce Type: replace Abstract: Direct Preference Optimization (DPO) and related methods align large language models from pairwise preferences by regularizing updates against a fixed...

Policy Software

arXiv CS Feb 11

Privileged Information Distillation for Language Models

arXiv:2602.04942v2 Announce Type: replace Abstract: Training-time privileged information (PI) can enable language models to succeed on tasks they would otherwise fail, making it a powerful...

Psychology Policy

arXiv CS Feb 11

Traceable Cross-Source RAG for Chinese Tibetan Medicine Question Answering

arXiv:2602.05195v2 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) promises grounded question answering, yet domain settings with multiple heterogeneous knowledge bases (KBs) remain challenging. In Chinese...

Embedded Systems Medicine & Health

arXiv CS Feb 11

LMMRec: LLM-driven Motivation-aware Multimodal Recommendation

arXiv:2602.05474v2 Announce Type: replace Abstract: Motivation-based recommendation systems uncover user behavior drivers. Motivation modeling, crucial for decision-making and content preference, explains recommendation generation. Existing methods...

Psychology Software

arXiv CS Feb 11

Taylor-Accelerated Neural Network Interpolation Operators on Irregular Grids with Higher Order Approximation

arXiv:2602.05589v2 Announce Type: replace Abstract: In this paper, a new class of Taylor-accelerated neural network interpolation operators is introduced on quasi-uniform irregular grids. These operators...

Neuroscience Software