CUEBES

Analysis of Off-Policy $n$-Step TD-Learning with Linear Function Approximation

arXiv:2502.08941v3 Announce Type: replace Abstract: This paper analyzes multi-step temporal difference (TD)-learning algorithms within the "deadly triad" scenario, characterized by linear function approximation, off-policy learning,...

Policy Psychology

arXiv CS 3d ago

From Contextual Combinatorial Semi-Bandits to Bandit List Classification: Improved Sample Complexity with Sparse Rewards

arXiv:2502.09257v4 Announce Type: replace Abstract: We study the problem of contextual combinatorial semi-bandits, where input contexts are mapped into subsets of size $m$ of a...

Software Policy

arXiv CS 3d ago

Equality of cycle lengths in one- and two-dimensional $\sigma$ automata

arXiv:2502.10898v2 Announce Type: replace Abstract: When the game Lights Out is played according to an algorithm specifying the player's sequence of moves, it can be...

Policy Artificial Intelligence

arXiv CS 3d ago

AdaGC: Improving Training Stability for Large Language Model Pretraining

arXiv:2502.11034v2 Announce Type: replace Abstract: Loss spikes remain a persistent obstacle in large-scale language model pretraining. While previous research has attempted to identify the root...

Hardware Technology

arXiv CS 3d ago

Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering

arXiv:2502.13069v3 Announce Type: replace Abstract: AI agents are increasingly being deployed to automate tasks, often based on underspecified user instructions. Making unwarranted assumptions to compensate...

Engineering Software

arXiv CS 3d ago

VQEL: Enabling Self-Play in Emergent Language Games via Agent-Internal Vector Quantization

arXiv:2503.04940v2 Announce Type: replace Abstract: Emergent Language (EL) focuses on the emergence of communication among artificial agents. Although symbolic communication channels more closely mirror the...

Software Biology

arXiv CS 3d ago

Exploring Interpretability for Visual Prompt Tuning with Cross-layer Concepts

arXiv:2503.06084v2 Announce Type: replace Abstract: Visual prompt tuning offers significant advantages for adapting pre-trained visual foundation models to specific tasks. However, current research provides limited...

Software Policy

arXiv CS 3d ago

Hier-COS: Making Deep Features Hierarchy-aware via Composition of Orthogonal Subspaces

arXiv:2503.07853v2 Announce Type: replace Abstract: Traditional classifiers treat all labels as mutually independent, thereby considering all negative classes to be equally incorrect. This approach fails...

Software World News

arXiv CS 3d ago

SphOR: A Representation Learning Perspective on Open-set Recognition for Identifying Unknown Classes in Deep Learning Models

arXiv:2503.08049v3 Announce Type: replace Abstract: The reliance on Deep Neural Network (DNN)-based classifiers in safety-critical and real-world applications necessitates Open-Set Recognition (OSR). OSR enables the...

Software Technology

arXiv CS 3d ago

Test-Time Training Provably Improves Transformers as In-context Learners

arXiv:2503.11842v2 Announce Type: replace Abstract: Test-time training (TTT) methods explicitly update the weights of a model to adapt to the specific test instance, and they...

Policy Artificial Intelligence

arXiv CS 3d ago

PSGait: Gait Recognition using Parsing Skeleton

arXiv:2503.12047v3 Announce Type: replace Abstract: Gait recognition has emerged as a robust biometric modality due to its non-intrusive nature. Conventional gait recognition methods mainly rely...

Psychology Software

arXiv CS 3d ago

Stable Volume Dissipation for High-Order Finite-Difference and Spectral-Element Methods with the Summation-by-Parts Property

arXiv:2503.12670v3 Announce Type: replace Abstract: The construction of stable, conservative, and accurate volume dissipation is extended to discretizations that possess a generalized summation-by-parts (SBP) property...

Biology Software

arXiv CS 3d ago

VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning

arXiv:2503.13444v3 Announce Type: replace Abstract: Videos, with their unique temporal dimension, demand precise grounded understanding, where answers are directly linked to visual, interpretable evidence. Despite...

Technology Software

arXiv CS 3d ago

KINESIS: Motion Imitation for Human Musculoskeletal Locomotion

arXiv:2503.14637v2 Announce Type: replace Abstract: How do humans move? Advances in reinforcement learning (RL) have produced impressive results in capturing human motion using physics-based humanoid...

Software Policy

arXiv CS 3d ago

ShapeShift: Text-to-Mosaic Synthesis via Semantic Phase-Field Guidance

arXiv:2503.14720v2 Announce Type: replace Abstract: We present ShapeShift, a method for arranging rigid objects into configurations that visually convey semantic concepts specified by natural language....

Software Energy

arXiv CS 3d ago

Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment

arXiv:2503.15937v5 Announce Type: replace Abstract: We propose V-Droid, a mobile GUI task automation agent. Unlike previous mobile agents that utilize Large Language Models (LLMs) as...

Software Robotics

arXiv CS 3d ago

CoBRA: A Universal Strategyproof Confirmation Protocol for Quorum-based Proof-of-Stake Blockchains

arXiv:2503.16783v3 Announce Type: replace Abstract: The security of many Proof-of-Stake (PoS) payment systems relies on quorum-based State Machine Replication (SMR) protocols. While classical analyses assume...

Psychology Software

arXiv CS 3d ago

Can Vision-Language Models Answer Face to Face Questions in the Real-World?

arXiv:2503.19356v3 Announce Type: replace Abstract: AI models have made significant strides in recent years in their ability to describe and answer questions about real-world images....

Robotics World News

arXiv CS 3d ago

Learn by Reasoning: Analogical Weight Generation for Few-Shot Class-Incremental Learning

arXiv:2503.21258v2 Announce Type: replace Abstract: Few-shot class-incremental learning (FSCIL) enables models to learn new classes from limited data while retaining performance on previously learned classes....

Neuroscience Psychology

arXiv CS 3d ago

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

arXiv:2503.23377v2 Announce Type: replace Abstract: This paper introduces JavisDiT, a novel Joint Audio-Video Diffusion Transformer designed for synchronized audio-video generation (JAVG). Based on the powerful...

Energy World News

arXiv CS 3d ago

Autonomous Learning with High-Dimensional Computing Architecture Similar to von Neumann's

arXiv:2503.23608v2 Announce Type: replace Abstract: We model human and animal learning by computing with high-dimensional vectors (H = 10,000 for example). The architecture resembles traditional...

Robotics Technology

arXiv CS 3d ago

Order Matters: On Parameter-Efficient Image-to-Video Probing for Recognizing Nearly Symmetric Actions

arXiv:2503.24298v2 Announce Type: replace Abstract: Fine-grained understanding of human actions is essential for safe and intuitive human-robot interaction. We study the challenge of recognizing nearly...

Robotics World News

arXiv CS 3d ago

Noise-Aware Generalization: Robustness to In-Domain Noise and Out-of-Domain Generalization

arXiv:2504.02996v2 Announce Type: replace Abstract: Methods addressing Learning with Noisy Labels (LNL) and multi-source Domain Generalization (DG) use training techniques to improve downstream task performance...

Technology Software

arXiv CS 3d ago

Meta-DAN: towards an efficient prediction strategy for page-level handwritten text recognition

arXiv:2504.03349v2 Announce Type: replace Abstract: Recent advances in text recognition led to a paradigm shift for page-level recognition, from multi-step segmentation-based approaches to end-to-end attention-based...

Software Psychology