CUEBES

BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation

arXiv:2601.18253v1. Announce Type: new. Abstract: Accurate evaluation of user satisfaction is critical for iterative development of conversational AI. However, for open-ended assistants, traditional A/B testing...

Software Artificial Intelligence

arXiv CS Jan 28

Beyond Retention: Orchestrating Structural Safety and Plasticity in Continual Learning for LLMs

arXiv:2601.18255v1. Announce Type: new. Abstract: Continual learning in Large Language Models (LLMs) faces the critical challenge of balancing stability (retaining old knowledge) and plasticity (learning...

Engineering Software

arXiv CS Jan 28

A Mechanical Wi-Fi Antenna Device for Automatic Orientation Tuning with Bayesian Optimization

arXiv:2601.18256v1. Announce Type: new. Abstract: Wi-Fi access points have been widely deployed in homes, offices, and public spaces. Some APs allow users to adjust the...

Engineering Software

arXiv CS Jan 28

Depth to Anatomy: Learning Internal Organ Locations from Surface Depth Images

arXiv:2601.18260v1. Announce Type: new. Abstract: Automated patient positioning plays an important role in optimizing scanning procedure and improving patient throughput. Leveraging depth information captured by...

Neuroscience Software

arXiv CS Jan 28

FGGM: Fisher-Guided Gradient Masking for Continual Learning

arXiv:2601.18261v1. Announce Type: new. Abstract: Catastrophic forgetting impairs the continuous learning of large language models. We propose Fisher-Guided Gradient Masking (FGGM), a framework that mitigates...

Software Biology

arXiv CS Jan 28

Revisiting Aerial Scene Classification on the AID Benchmark

arXiv:2601.18263v1. Announce Type: new. Abstract: Aerial images play a vital role in urban planning and environmental preservation, as they consist of various structures, representing different...

Software Energy

arXiv CS Jan 28

Neural Network Approximation: A View from Polytope Decomposition

arXiv:2601.18264v1. Announce Type: new. Abstract: Universal approximation theory offers a foundational framework to verify neural network expressiveness, enabling principled utilization in real-world applications. However, most...

Software Neuroscience

arXiv CS Jan 28

Orchestrating Specialized Agents for Trustworthy Enterprise RAG

arXiv:2601.18267v1. Announce Type: new. Abstract: Retrieval-Augmented Generation (RAG) shows promise for enterprise knowledge work, yet it often underperforms in high-stakes decision settings that require deep...

Chemistry Biology

arXiv CS Jan 28

Designing large language model prompts to extract scores from messy text: A shared dataset and challenge

arXiv:2601.18271v1. Announce Type: new. Abstract: In some areas of computing, natural language processing and information science, progress is made by sharing datasets and challenging the...

Artificial Intelligence Software

arXiv CS Jan 28

TEFormer: Structured Bidirectional Temporal Enhancement Modeling in Spiking Transformers

arXiv:2601.18274v1. Announce Type: new. Abstract: In recent years, Spiking Neural Networks (SNNs) have achieved remarkable progress, with Spiking Transformers emerging as a promising architecture for...

Energy Neuroscience

arXiv CS Jan 28

When Nobody Around Is Real: Exploring Public Opinions and User Experiences On the Multi-Agent AI Social Platform

arXiv:2601.18275v1. Announce Type: new. Abstract: Powered by large language models, a new genre of multi-agent social platforms has emerged. Apps such as Social.AI deploy numerous...

Software Energy

arXiv CS Jan 28

What Do Learned Models Measure?

arXiv:2601.18278v1. Announce Type: new. Abstract: In many scientific and data-driven applications, machine learning models are increasingly used as measurement instruments, rather than merely as predictors...

Software Artificial Intelligence

arXiv CS Jan 28

Validation of a Software-Defined 100-Gb/s RDMA Streaming Architecture for Ultrafast Optoacoustic and Ultrasound Imaging

arXiv:2601.18280v1. Announce Type: new. Abstract: Optoacoustic (OA) imaging has emerged as a powerful investigation tool, with demonstrated applicability in oncology, neuroscience, and cardiovascular biology. However,...

Software Technology

arXiv CS Jan 28

Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue

arXiv:2601.18281v1. Announce Type: new. Abstract: End-to-end Spoken Language Models (SLMs) hold great potential for paralinguistic perception, and numerous studies have aimed to enhance their capabilities,...

Biology Software

arXiv CS Jan 28

Think-Augmented Function Calling: Improving LLM Parameter Accuracy Through Embedded Reasoning

arXiv:2601.18282v1. Announce Type: new. Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in function calling for autonomous agents, yet current mechanisms lack explicit reasoning...

Artificial Intelligence Robotics

arXiv CS Jan 28

VissimRL: A Multi-Agent Reinforcement Learning Framework for Traffic Signal Control Based on Vissim

arXiv:2601.18284v1. Announce Type: new. Abstract: Traffic congestion remains a major challenge for urban transportation, leading to significant economic and environmental impacts. Traffic Signal Control (TSC)...

Software

arXiv CS Jan 28

U-Fold: Dynamic Intent-Aware Context Folding for User-Centric Agents

arXiv:2601.18285v1. Announce Type: new. Abstract: Large language model (LLM)-based agents have been successfully deployed in many tool-augmented settings, but their scalability is fundamentally constrained by...

Software Technology

arXiv CS Jan 28

Quest2ROS2: A ROS 2 Framework for Bi-manual VR Teleoperation

arXiv:2601.18289v1. Announce Type: new. Abstract: Quest2ROS2 is an open-source ROS2 framework for bi-manual teleoperation designed to scale robot data collection. Extending Quest2ROS, it overcomes workspace...

Software Robotics

arXiv CS Jan 28

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

arXiv:2601.18292v1. Announce Type: new. Abstract: In recent years, safety risks associated with large language models have become increasingly prominent, highlighting the urgent need to mitigate...

Biology Artificial Intelligence

arXiv CS Jan 28

Reinforcement Learning with Distributed MPC for Fuel-Efficient Platoon Control with Discrete Gear Transitions

arXiv:2601.18294v1. Announce Type: new. Abstract: Cooperative control of groups of autonomous vehicles (AVs), i.e., platoons, is a promising direction to improving the efficiency of autonomous...

Robotics Technology

arXiv CS Jan 28

Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning

arXiv:2601.18296v1. Announce Type: new. Abstract: Temporal Knowledge Graph Question Answering (TKGQA) is inherently challenging, as it requires sophisticated reasoning over dynamic facts with multi-hop dependencies...

Robotics Neuroscience

arXiv CS Jan 28

A Heterogeneous Massive MIMO Technique for Uniform Service in Cellular Networks

arXiv:2601.18298v1. Announce Type: new. Abstract: Traditional cellular networks struggle with poor quality of service (QoS) for cell-edge users, while cell-free (CF) systems offer uniform QoS...

Biology Technology

arXiv CS Jan 28

Gradient-Informed Machine Learning in Electromagnetics

arXiv:2601.18300v1. Announce Type: new. Abstract: Simulation techniques such as the finite element method are essential for designing electrical devices, but their computational cost can be...

Technology Engineering

arXiv CS Jan 28

Contextual Range-View Projection for 3D LiDAR Point Clouds

arXiv:2601.18301v1. Announce Type: new. Abstract: Range-view projection provides an efficient method for transforming 3D LiDAR point clouds into 2D range image representations, enabling effective processing...

Software Artificial Intelligence