CUEBES

How does information access affect LLM monitors' ability to detect sabotage?

arXiv:2601.21112v1 Announce Type: new Abstract: Frontier language model agents can exhibit misaligned behaviors, including deception, exploiting reward hacks, and pursuing hidden objectives. To control potentially...

Software Artificial Intelligence

arXiv CS Jan 30

Planner-Auditor Twin: Agentic Discharge Planning with FHIR-Based LLM Planning, Guideline Recall, Optional Caching and Self-Improvement

arXiv:2601.21113v1 Announce Type: new Abstract: Objective: Large language models (LLMs) show promise for clinical discharge planning, but their use is constrained by hallucination, omissions, and...

Materials Science Health

arXiv CS Jan 30

Multi-task Code LLMs: Data Mix or Model Merge?

arXiv:2601.21115v1 Announce Type: new Abstract: Recent research advocates deploying smaller, specialized code LLMs in agentic frameworks alongside frontier models, sparking interest in efficient strategies for...

Software Technology

arXiv CS Jan 30

AI-Assisted Engineering Should Track the Epistemic Status and Temporal Validity of Architectural Decisions

arXiv:2601.21116v1 Announce Type: new Abstract: This position paper argues that AI-assisted software engineering requires explicit mechanisms for tracking the epistemic status and temporal validity of...

Engineering Software

arXiv CS Jan 30

An AI Framework for Microanastomosis Motion Assessment

arXiv:2601.21120v1 Announce Type: new Abstract: Proficiency in microanastomosis is a fundamental competency across multiple microsurgical disciplines. These procedures demand exceptional precision and refined technical skills,...

Technology Neuroscience

arXiv CS Jan 30

CUA-Skill: Develop Skills for Computer Using Agent

arXiv:2601.21123v1 Announce Type: new Abstract: Computer-Using Agents (CUAs) aim to autonomously operate computer systems to complete real-world tasks. However, existing agentic systems remain difficult to...

Software Robotics

arXiv CS Jan 30

PhaseCoder: Microphone Geometry-Agnostic Spatial Audio Understanding for Multimodal LLMs

arXiv:2601.21124v1 Announce Type: new Abstract: Current multimodal LLMs process audio as a mono stream, ignoring the rich spatial information essential for embodied AI. Existing spatial...

Software Policy

arXiv CS Jan 30

AI-Augmented Density-Driven Optimal Control (D2OC) for Decentralized Environmental Mapping

arXiv:2601.21126v1 Announce Type: new Abstract: This paper presents an AI-augmented decentralized framework for multi-agent (multi-robot) environmental mapping under limited sensing and communication. While conventional coverage...

Robotics Environment

arXiv CS Jan 30

Beyond a Single Reference: Training and Evaluation with Paraphrases in Sign Language Translation

arXiv:2601.21128v1 Announce Type: new Abstract: Most Sign Language Translation (SLT) corpora pair each signed utterance with a single written-language reference, despite the highly non-isomorphic relationship...

Biology Psychology

arXiv CS Jan 30

WheelArm-Sim: A Manipulation and Navigation Combined Multimodal Synthetic Data Generation Simulator for Unified Control in Assistive Robotics

arXiv:2601.21129v1 Announce Type: new Abstract: Wheelchairs and robotic arms enhance independent living by assisting individuals with upper-body and mobility limitations in their activities of daily...

Robotics Cybersecurity

arXiv CS Jan 30

What You Feel Is Not What They See: On Predicting Self-Reported Emotion from Third-Party Observer Labels

arXiv:2601.21130v1 Announce Type: new Abstract: Self-reported emotion labels capture internal experience, while third-party labels reflect external perception. These perspectives often diverge, limiting the applicability of...

Psychology Health

arXiv CS Jan 30

Large Language Models Naively Recover Ethnicity from Individual Records

arXiv:2601.21132v1 Announce Type: new Abstract: I demonstrate that large language models can infer ethnicity from names with accuracy exceeding that of Bayesian Improved Surname Geocoding...

Software Artificial Intelligence

arXiv CS Jan 30

TRACE: Trajectory Recovery for Continuous Mechanism Evolution in Causal Representation Learning

arXiv:2601.21135v1 Announce Type: new Abstract: Temporal causal representation learning methods assume that causal mechanisms switch instantaneously between discrete domains, yet real-world systems often exhibit continuous...

Biology World News

arXiv CS Jan 30

EnsembleLink: Accurate Record Linkage Without Training Data

arXiv:2601.21138v1 Announce Type: new Abstract: Record linkage, the process of matching records that refer to the same entity across datasets, is essential to empirical social...

Software Politics

arXiv CS Jan 30

Optimization and Mobile Deployment for Anthropocene Neural Style Transfer

arXiv:2601.21141v1 Announce Type: new Abstract: This paper presents AnthropoCam, a mobile-based neural style transfer (NST) system optimized for the visual synthesis of Anthropocene environments. Unlike...

Environment Materials Science

arXiv CS Jan 30

Maxwait: A Generalized Mechanism for Distributed Time-Sensitive Systems

arXiv:2601.21146v1 Announce Type: new Abstract: Distributed time-sensitive systems must balance timing requirements (availability) and consistency in the presence of communication delays and synchronization uncertainty. This...

Software Psychology

arXiv CS Jan 30

Smooth Dynamic Cutoffs for Machine Learning Interatomic Potentials

arXiv:2601.21147v1 Announce Type: new Abstract: Machine learning interatomic potentials (MLIPs) have proven to be wildly useful for molecular dynamics simulations, powering countless drug and materials...

Software Materials Science

arXiv CS Jan 30

BrainStack: Neuro-MoE with Functionally Guided Expert Routing for EEG-Based Language Decoding

arXiv:2601.21148v1 Announce Type: new Abstract: Decoding linguistic information from electroencephalography (EEG) remains challenging due to the brain's distributed and nonlinear organization. We present BrainStack, a...

Neuroscience World News

arXiv CS Jan 30

Mobility-Embedded POIs: Learning What A Place Is and How It Is Used from Human Movement

arXiv:2601.21149v1 Announce Type: new Abstract: Recent progress in geospatial foundation models highlights the importance of learning general-purpose representations for real-world locations, particularly points-of-interest (POIs) where...

Software Biology

arXiv CS Jan 30

Can Neural Networks Learn Small Algebraic Worlds? An Investigation Into the Group-theoretic Structures Learned By Narrow Models Trained To Predict Group Operations

arXiv:2601.21150v1 Announce Type: new Abstract: While a real-world research program in mathematics may be guided by a motivating question, the process of mathematical discovery is...

Mathematics Artificial Intelligence

arXiv CS Jan 30

Learning to Advect: A Neural Semi-Lagrangian Architecture for Weather Forecasting

arXiv:2601.21151v1 Announce Type: new Abstract: Recent machine-learning approaches to weather forecasting often employ a monolithic architecture, where distinct physical mechanisms (advection, transport), diffusion-like mixing, thermodynamic...

Climate & Environment Neuroscience

arXiv CS Jan 30

Bridging the Arithmetic Gap: The Cognitive Complexity Benchmark and Financial-PoT for Robust Financial Reasoning

arXiv:2601.21157v1 Announce Type: new Abstract: While Large Language Models excel at semantic tasks, they face a critical bottleneck in financial quantitative reasoning, frequently suffering from...

Neuroscience Health

arXiv CS Jan 30

Bidirectional Cross-Perception for Open-Vocabulary Semantic Segmentation in Remote Sensing Imagery

arXiv:2601.21159v1 Announce Type: new Abstract: High-resolution remote sensing imagery is characterized by densely distributed land-cover objects and complex boundaries, which places higher demands on both...

Software Energy

arXiv CS Jan 30

A Federated Generalized Expectation-Maximization Algorithm for Mixture Models with an Unknown Number of Components

arXiv:2601.21160v1 Announce Type: new Abstract: We study the problem of federated clustering when the total number of clusters $K$ across clients is unknown, and the...

Software World News