CUEBES

Evaluating Actionability in Explainable AI

arXiv:2601.20086v1 Announce Type: new Abstract: A core assumption of Explainable AI (XAI) is that explanations are useful to users -- that is, users will do...

Software Policy

arXiv CS Jan 29

Should I Have Expressed a Different Intent? Counterfactual Generation for LLM-Based Autonomous Control

arXiv:2601.20090v1 Announce Type: new Abstract: Large language model (LLM)-powered agents can translate high-level user intents into plans and actions in an environment. Yet after observing...

Robotics Engineering

arXiv CS Jan 29

Dynamics of Human-AI Collective Knowledge on the Web: A Scalable Model and Insights for Sustainable Growth

arXiv:2601.20099v1 Announce Type: new Abstract: Humans and large language models (LLMs) now co-produce and co-consume the web's shared knowledge archives. Such human-AI collective knowledge ecosystems...

Artificial Intelligence Policy

arXiv CS Jan 29

Taming Toxic Talk: Using chatbots to intervene with users posting toxic comments

arXiv:2601.20100v1 Announce Type: new Abstract: Generative AI chatbots have proven surprisingly effective at persuading people to change their beliefs and attitudes in lab settings. However,...

Psychology Software

arXiv CS Jan 29

Counterfactual Cultural Cues Reduce Medical QA Accuracy in LLMs: Identifier vs Context Effects

arXiv:2601.20102v1 Announce Type: new Abstract: Engineering sustainable and equitable healthcare requires medical language models that do not change clinically correct diagnoses when presented with non-decisive...

Engineering Health

arXiv CS Jan 29

Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis

arXiv:2601.20103v1 Announce Type: new Abstract: Recent advances in reinforcement learning for code generation have made robust environments essential to prevent reward hacking. As LLMs increasingly...

Artificial Intelligence Environment

arXiv CS Jan 29

NucFuseRank: Dataset Fusion and Performance Ranking for Nuclei Instance Segmentation

arXiv:2601.20104v1 Announce Type: new Abstract: Nuclei instance segmentation in hematoxylin and eosin (H&E)-stained images plays an important role in automated histological image analysis, with various...

Artificial Intelligence Software

arXiv CS Jan 29

FFE-Hallu: Hallucinations in Fixed Figurative Expressions: Benchmark of Idioms and Proverbs in the Persian Language

arXiv:2601.20105v1 Announce Type: new Abstract: Figurative language, particularly fixed figurative expressions (FFEs) such as idioms and proverbs, poses persistent challenges for large language models (LLMs)....

Artificial Intelligence Policy

arXiv CS Jan 29

Are We All Using Agents the Same Way? An Empirical Study of Core and Peripheral Developers' Use of Coding Agents

arXiv:2601.20106v1 Announce Type: new Abstract: Autonomous AI agents are transforming software development and redefining how developers collaborate with AI. Prior research shows that the adoption...

Software Robotics

arXiv CS Jan 29

Look in the Middle: Structural Anchor Pruning for Scalable Visual RAG Indexing

arXiv:2601.20107v1 Announce Type: new Abstract: Recent Vision-Language Models (e.g., ColPali) enable fine-grained Visual Document Retrieval (VDR) but incur prohibitive index vector size overheads. Training-free pruning...

Engineering Software

arXiv CS Jan 29

Beyond Bug Fixes: An Empirical Investigation of Post-Merge Code Quality Issues in Agent-Generated Pull Requests

arXiv:2601.20109v1 Announce Type: new Abstract: The increasing adoption of AI coding agents has increased the number of agent-generated pull requests (PRs) merged with little or...

Software Policy

arXiv CS Jan 29

Usage, Effects and Requirements for AI Coding Assistants in the Enterprise: An Empirical Study

arXiv:2601.20112v1 Announce Type: new Abstract: The rise of large language models (LLMs) has accelerated the development of automated techniques and tools for supporting various software...

Software Technology

arXiv CS Jan 29

A Data-Informed Local Subspaces Method for Error-Bounded Lossy Compression of Large-Scale Scientific Datasets

arXiv:2601.20113v1 Announce Type: new Abstract: The growing volume of scientific simulation data presents a significant challenge for storage and transfer. Error-bounded lossy compression has emerged...

Technology Environment

arXiv CS Jan 29

How Much Progress Has There Been in NVIDIA Datacenter GPUs?

arXiv:2601.20115v1 Announce Type: new Abstract: Graphics Processing Units (GPUs) are the state-of-the-art architecture for essential tasks, ranging from rendering 2D/3D graphics to accelerating workloads in...

World News Policy

arXiv CS Jan 29

In-Context Reinforcement Learning From Suboptimal Historical Data

arXiv:2601.20116v1 Announce Type: new Abstract: Transformer models have achieved remarkable empirical successes, largely due to their in-context learning capabilities. Inspired by this, we explore training...

Policy Psychology

arXiv CS Jan 29

A Reinforcement Learning Based Universal Sequence Design for Polar Codes

arXiv:2601.20118v1 Announce Type: new Abstract: To advance Polar code design for 6G applications, we develop a reinforcement learning-based universal sequence design framework that is extensible...

Software Policy

arXiv CS Jan 29

Improving Smoothed Aggregation AMG Robustness on Stretched Mesh Applications

arXiv:2601.20119v1 Announce Type: new Abstract: Strength-of-connection algorithms play a key role in algebraic multigrid (AMG). Specifically, they determine which matrix nonzeros are classified as weak...

Software Biology

arXiv CS Jan 29

Going NUTS with ADVI: Exploring various Bayesian Inference techniques with Facebook Prophet

arXiv:2601.20120v1 Announce Type: new Abstract: Since its introduction, Facebook Prophet has attracted positive attention from both classical statisticians and the Bayesian statistics community. The model...

Technology Software

arXiv CS Jan 29

Membership Inference Attacks Against Fine-tuned Diffusion Language Models

arXiv:2601.20125v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) represent a promising alternative to autoregressive language models, using bidirectional masked token prediction. Yet their susceptibility...

Software Cybersecurity

arXiv CS Jan 29

Rewarding Intellectual Humility: Learning When Not To Answer In Large Language Models

arXiv:2601.20126v1 Announce Type: new Abstract: Large Language Models (LLMs) often produce hallucinated or unverifiable content, undermining their reliability in factual domains. This work investigates Reinforcement...

Software Policy

arXiv CS Jan 29

BengaliSent140: A Large-Scale Bengali Binary Sentiment Dataset for Hate and Non-Hate Speech Classification

arXiv:2601.20129v1 Announce Type: new Abstract: Sentiment analysis for the Bengali language has attracted increasing research interest in recent years. However, progress remains constrained by the...

Psychology Policy

arXiv CS Jan 29

Real-Time Robot Execution with Masked Action Chunking

arXiv:2601.20130v1 Announce Type: new Abstract: Real-time execution is essential for cyber-physical systems such as robots. These systems operate in dynamic real-world environments where even small...

Policy Robotics

arXiv CS Jan 29

Taxonomy of the Retrieval System Framework: Pitfalls and Paradigms

arXiv:2601.20131v1 Announce Type: new Abstract: Designing an embedding retrieval system requires navigating a complex design space of conflicting trade-offs between efficiency and effectiveness. This work...

Neuroscience Software

arXiv CS Jan 29

Control systems for synthetic biology and a case-study in cell fate reprogramming

arXiv:2601.20135v1 Announce Type: new Abstract: This paper gives an overview of the use of control systems engineering in synthetic biology, motivated by applications such as...

Software Biology