CUEBES

ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models

arXiv:2510.10606v3 Announce Type: replace Abstract: Post-training Large Vision-and-Language Models (LVLMs) typically involves Supervised Fine-Tuning (SFT) for knowledge injection or Reinforcement Learning with Verifiable Rewards (RLVR)...

Policy Biology

arXiv CS Jan 30

LLM$\times$MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System

arXiv:2510.10890v2 Announce Type: replace Abstract: We introduce LLM x MapReduce-V3, a hierarchically modular agent system designed for long-form survey generation. Building on the prior work,...

Software Business

arXiv CS Jan 30

Efficient Test-Time Adaptation through Latent Subspace Coefficients Search

arXiv:2510.11068v2 Announce Type: replace Abstract: Real-world deployment often exposes models to distribution shifts, making test-time adaptation (TTA) critical for robustness. Yet most TTA methods are...

Software World News

arXiv CS Jan 30

Neural Weight Compression for Language Models

arXiv:2510.11234v2 Announce Type: replace Abstract: Efficient storage and transmission of language model weights are increasingly critical as model scale and deployment grow. Yet, most existing...

Neuroscience Software

arXiv CS Jan 30

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

arXiv:2510.11341v3 Announce Type: replace Abstract: General SVG modeling remains challenging due to fragmented datasets, limited transferability of methods across tasks, and the difficulty of handling...

Engineering Policy

arXiv CS Jan 30

Hey Dashboard!: Supporting Voice, Text, and Pointing Modalities in Dashboard Onboarding

arXiv:2510.12386v2 Announce Type: replace Abstract: Visualization dashboards are regularly used for data exploration and analysis, but their complex interactions and interlinked views often require time-consuming...

Materials Science Energy

arXiv CS Jan 30

Repairing Reward Functions with Feedback to Mitigate Reward Hacking

arXiv:2510.13036v2 Announce Type: replace Abstract: Human-designed reward functions for reinforcement learning (RL) agents are frequently misaligned with the humans' true, unobservable objectives, and thus act...

Policy Software

arXiv CS Jan 30

MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding

arXiv:2510.13244v2 Announce Type: replace Abstract: Music is both an auditory and an embodied phenomenon, closely linked to human motion and naturally expressed through dance. However,...

Psychology Engineering

arXiv CS Jan 30

NOSA: Native and Offloadable Sparse Attention

arXiv:2510.13602v2 Announce Type: replace Abstract: Decoding throughput improvements from larger inference batches are limited by GPU memory, which is largely consumed by the key-value (KV)...

Software Policy

arXiv CS Jan 30

Toward Robust Multilingual Adaptation of LLMs for Low-Resource Languages

arXiv:2510.14466v2 Announce Type: replace Abstract: Large language models (LLMs) continue to struggle with low-resource languages, primarily due to limited training data, translation noise, and unstable...

Software Policy

arXiv CS Jan 30

AudioEval: Automatic Dual-Perspective and Multi-Dimensional Evaluation of Text-to-Audio-Generation

arXiv:2510.14570v2 Announce Type: replace Abstract: Text-to-audio (TTA) generation is advancing rapidly, but evaluation remains challenging because human listening studies are expensive and existing automatic metrics...

Policy Biology

arXiv CS Jan 30

MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning

arXiv:2510.16797v2 Announce Type: replace Abstract: We introduce MOSAIC (Masked Objective with Selective Adaptation for In-domain Contrastive learning), a multi-stage framework for domain adaptation of text...

Software Business

arXiv CS Jan 30

Policy Learning with Abstention

arXiv:2510.19672v3 Announce Type: replace Abstract: Policy learning algorithms are widely used in areas such as personalized medicine and advertising to develop individualized treatment regimes. However,...

Software Policy

arXiv CS Jan 30

DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion

arXiv:2510.20766v2 Announce Type: replace Abstract: Diffusion Transformer models can generate images with remarkable fidelity and detail, yet training them at ultra-high resolutions remains extremely costly...

Software Energy

arXiv CS Jan 30

Convergence of Stochastic Gradient Langevin Dynamics in the Lazy Training Regime

arXiv:2510.21245v3 Announce Type: replace Abstract: Continuous-time models provide important insights into the training dynamics of optimization algorithms in deep learning. In this work, we establish...

Mathematics Artificial Intelligence

arXiv CS Jan 30

An Evidence-Based Post-Hoc Adjustment Framework for Anomaly Detection Under Data Contamination

arXiv:2510.21296v2 Announce Type: replace Abstract: Unsupervised anomaly detection (AD) methods typically assume clean training data, yet real-world datasets often contain undetected or mislabeled anomalies, leading...

Software World News

arXiv CS Jan 30

On Uncertainty Calibration for Equivariant Functions

arXiv:2510.21691v4 Announce Type: replace Abstract: Data-sparse settings such as robotic manipulation, molecular physics, and galaxy morphology classification are some of the hardest domains for deep...

Robotics Policy

arXiv CS Jan 30

Penalizing Length: Uncovering Systematic Bias in Quality Estimation Metrics

arXiv:2510.22028v2 Announce Type: replace Abstract: Quality Estimation (QE) metrics are vital in machine translation for reference-free evaluation and as a reward signal in tasks like...

Software Mathematics

arXiv CS Jan 30

Edge Collaborative Gaussian Splatting with Integrated Rendering and Communication

arXiv:2510.22718v2 Announce Type: replace Abstract: Gaussian splatting (GS) struggles with degraded rendering quality on low-cost devices. To address this issue, we present edge collaborative GS...

Energy Policy

arXiv CS Jan 30

FreeFuse: Multi-Subject LoRA Fusion via Adaptive Token-Level Routing at Test Time

arXiv:2510.23515v2 Announce Type: replace Abstract: This paper proposes FreeFuse, a training-free framework for multi-subject text-to-image generation through automatic fusion of multiple subject LoRAs. In contrast...

Software Biology

arXiv CS Jan 30

Think Twice: Branch-and-Rethink Reasoning Reward Model

arXiv:2510.23596v3 Announce Type: replace Abstract: Large language models (LLMs) increasingly rely on thinking models that externalize intermediate steps and allocate extra test-time compute, with think-twice...

Psychology Software

arXiv CS Jan 30

IBNorm: Information-Bottleneck Inspired Normalization for Representation Learning

arXiv:2510.25262v2 Announce Type: replace Abstract: Normalization is fundamental to deep learning, but existing approaches such as BatchNorm, LayerNorm, and RMSNorm are variance-centric by enforcing zero...

Psychology Software

arXiv CS Jan 30

$\pi_\texttt{RL}$: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

arXiv:2510.25889v3 Announce Type: replace Abstract: Vision-Language-Action (VLA) models enable robots to understand and perform complex tasks from multimodal input. Although recent work explores using reinforcement...

Robotics Technology

arXiv CS Jan 30

A Likely Geometry of Generative Models

arXiv:2510.26266v2 Announce Type: replace Abstract: The geometry of generative models serves as the basis for interpolation, model inspection, and more. Unfortunately, most generative models lack...

Mathematics Policy