CUEBES

Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs

arXiv:2602.07729v2 Announce Type: replace Abstract: Reinforcement learning (RL), particularly RL from verifiable reward (RLVR), has become a crucial phase of training large language models (LLMs)...

Psychology Policy

arXiv CS Feb 25

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering

arXiv:2602.07906v2 Announce Type: replace Abstract: Autonomous Machine Learning Engineering (MLE) requires agents to perform sustained, iterative optimization over long horizons. While recent LLM-based agents show...

Engineering Software

arXiv CS Feb 25

Language Modeling and Understanding Through Paraphrase Generation and Detection

arXiv:2602.08274v3 Announce Type: replace Abstract: Language enables humans to share knowledge, reason about the world, and pass on strategies for survival and innovation across generations....

Software Policy

arXiv CS Feb 25

GOT-Edit: Geometry-Aware Generic Object Tracking via Online Model Editing

arXiv:2602.08550v3 Announce Type: replace Abstract: Human perception for effective object tracking in 2D video streams arises from the implicit use of prior 3D knowledge and...

Software Business

arXiv CS Feb 25

The Wisdom of Many Queries: Complexity-Diversity Principle for Dense Retriever Training

arXiv:2602.09448v2 Announce Type: replace Abstract: Prior synthetic query generation for dense retrieval produces one query per document, focusing on quality. We systematically study multi-query synthesis,...

Chemistry Policy

arXiv CS Feb 25

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

arXiv:2602.10693v2 Announce Type: replace Abstract: Training stability remains a central challenge in reinforcement learning (RL) for large language models (LLMs). Policy staleness, asynchronous training, and...

Policy Psychology

arXiv CS Feb 25

Ecological mapping with geospatial foundation models

arXiv:2602.10720v2 Announce Type: replace Abstract: The value of Earth observation foundation models for high-impact ecological applications remains insufficiently characterized. This study is one of the...

Software Business

arXiv CS Feb 25

KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models

arXiv:2602.11184v2 Announce Type: replace Abstract: Mixture of Experts (MoE) models have achieved great success by significantly improving performance while maintaining computational efficiency through sparse expert...

Software Technology

arXiv CS Feb 25

MUSE: Multi-Tenant Model Serving With Seamless Model Updates

arXiv:2602.11776v2 Announce Type: replace Abstract: In binary classification systems, decision thresholds translate model scores into actions. Choosing suitable thresholds relies on the specific distribution of...

Policy Environment

arXiv CS Feb 25

MalTool: Malicious Tool Attacks on LLM Agents

arXiv:2602.12194v2 Announce Type: replace Abstract: In a malicious tool attack, an attacker uploads a malicious tool to a distribution platform; once a user installs the...

Software Cybersecurity

arXiv CS Feb 25

PMG: Parameterized Motion Generator for Human-like Locomotion Control

arXiv:2602.12656v2 Announce Type: replace Abstract: Recent advances in data-driven reinforcement learning and motion tracking have substantially improved humanoid locomotion, yet critical practical challenges remain. In...

Robotics Psychology

arXiv CS Feb 25

DriveMamba: Task-Centric Scalable State Space Model for Efficient End-to-End Autonomous Driving

arXiv:2602.13301v2 Announce Type: replace Abstract: Recent advances towards End-to-End Autonomous Driving (E2E-AD) have often been devoted to integrating modular designs into a unified framework for...

Robotics Software

arXiv CS Feb 25

Sim2Radar: Toward Bridging the Radar Sim-to-Real Gap with VLM-Guided Scene Reconstruction

arXiv:2602.13314v3 Announce Type: replace Abstract: Millimeter-wave (mmWave) radar provides reliable perception in visually degraded indoor environments (e.g., smoke, dust, and low light), but learning-based radar...

Materials Science Environment

arXiv CS Feb 25

Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference

arXiv:2602.13813v2 Announce Type: replace Abstract: We introduce Pawsterior, a variational flow-matching framework for improved and extended simulation-based inference (SBI). Many SBI problems involve posteriors constrained...

Psychology Software

arXiv CS Feb 25

Joint Task Assistance Planning via Nested Branch and Bound (Extended Version)

arXiv:2602.13932v2 Announce Type: replace Abstract: We introduce and study the Joint Task Assistance Planning problem which generalizes prior work on optimizing assistance in robotic collaboration....

Robotics Software

arXiv CS Feb 25

Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness

arXiv:2602.14044v2 Announce Type: replace Abstract: Large language models (LLMs) show strong reasoning abilities across diverse tasks, yet their performance on extended contexts remains inconsistent. While...

Climate & Environment Software

arXiv CS Feb 25

MCPShield: A Security Cognition Layer for Adaptive Trust Calibration in Model Context Protocol Agents

arXiv:2602.14281v3 Announce Type: replace Abstract: The Model Context Protocol (MCP) standardizes tool use for LLM-based agents and enables third-party servers. This openness introduces a security...

Psychology Biology

arXiv CS Feb 25

Silent Inconsistency in Data-Parallel Full Fine-Tuning: Diagnosing Worker-Level Optimization Misalignment

arXiv:2602.14462v2 Announce Type: replace Abstract: Data-parallel (DP) training with synchronous all-reduce is a dominant paradigm for full-parameter fine-tuning of large language models (LLMs). While parameter...

Embedded Systems Artificial Intelligence

arXiv CS Feb 25

Divine Benevolence is an $x^2$: GLUs scale asymptotically faster than MLPs

arXiv:2602.14495v2 Announce Type: replace Abstract: Scaling laws can be understood from ground-up numerical analysis, where traditional function approximation theory can explain shifts in model architecture...

Software Policy

arXiv CS Feb 25

ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies

arXiv:2602.14681v3 Announce Type: replace Abstract: LLM-powered Multi-Agent Systems (MAS) have emerged as an effective approach towards collaborative intelligence, and have attracted wide research interest. Among...

Biology Technology

arXiv CS Feb 25

More than Decision Support: Exploring Patients' Longitudinal Usage of Large Language Models in Real-World Healthcare-Seeking Journeys

arXiv:2602.14733v2 Announce Type: replace Abstract: Large language models (LLMs) have been increasingly adopted to support patients' healthcare-seeking in recent years. While prior patient-centered studies have...

Health Psychology

arXiv CS Feb 25

Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks

arXiv:2602.15997v3 Announce Type: replace Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across five model scales (405K–85M parameters),...

Software World News

arXiv CS Feb 25

AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models

arXiv:2602.16042v2 Announce Type: replace Abstract: As machine learning (ML) continues its rapid expansion, the environmental cost of model training and inference has become a critical...

Climate & Environment Environment

arXiv CS Feb 25

Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation

arXiv:2602.16705v2 Announce Type: replace Abstract: Visual loco-manipulation of arbitrary objects in the wild with humanoid robots requires accurate end-effector (EE) control and a generalizable understanding...

Robotics Apple & Mac