CUEBES

Differential syntactic and semantic encoding in LLMs

arXiv:2601.04765v3 Announce Type: replace Abstract: We study how syntactic and semantic information is encoded in inner layer representations of Large Language Models (LLMs), focusing on...

Software Artificial Intelligence

arXiv CS Jan 28

Orchestrating Intelligence: Confidence-Aware Routing for Efficient Multi-Agent Collaboration across Multi-Scale Models

arXiv:2601.04861v2 Announce Type: replace Abstract: While multi-agent systems (MAS) have demonstrated superior performance over single-agent approaches in complex reasoning tasks, they often suffer from significant...

Neuroscience Software

arXiv CS Jan 28

Analyzing Message-Code Inconsistency in AI Coding Agent-Authored Pull Requests

arXiv:2601.04886v2 Announce Type: replace Abstract: Pull request (PR) descriptions generated by AI coding agents are the primary channel for communicating code changes to human reviewers....

Software Artificial Intelligence

arXiv CS Jan 28

Inside Out: Evolving User-Centric Core Memory Trees for Long-Term Personalized Dialogue Systems

arXiv:2601.05171v2 Announce Type: replace Abstract: Existing long-term personalized dialogue systems struggle to reconcile unbounded interaction streams with finite context constraints, often succumbing to memory noise...

Biology Energy

arXiv CS Jan 28

MoE3D: A Mixture-of-Experts Module for 3D Reconstruction

arXiv:2601.05208v3 Announce Type: replace Abstract: We propose a simple yet effective approach to enhance the performance of feed-forward 3D reconstruction models. Existing methods often struggle...

Software Biology

arXiv CS Jan 28

Concurrent Balanced Augmented Trees

arXiv:2601.05225v2 Announce Type: replace Abstract: Augmentation makes search trees tremendously more versatile, allowing them to support efficient aggregation queries, order-statistic queries, and range queries in...

Artificial Intelligence Biology

arXiv CS Jan 28

Coding the Visual World: From Image to Simulation Using Vision Language Models

arXiv:2601.05344v3 Announce Type: replace Abstract: The ability to construct mental models of the world is a central aspect of understanding. Similarly, visual understanding can be...

Software Materials Science

arXiv CS Jan 28

Falsifying Sparse Autoencoder Reasoning Features in Language Models

arXiv:2601.05679v5 Announce Type: replace Abstract: We study how reliably sparse autoencoders (SAEs) support claims about reasoning-related internal features in large language models. We first give...

Software Artificial Intelligence

arXiv CS Jan 28

Overcoming the Float Wall: Verifying Mathematical Laws at $10^{50}$ Scale with BigInt Transformers

arXiv:2601.06117v2 Announce Type: replace Abstract: A central question in artificial intelligence is whether models learn universal laws or merely memorize statistical heuristics. This distinction is...

Software Artificial Intelligence

arXiv CS Jan 28

The Need for a Socially-Grounded Persona Framework for User Simulation

arXiv:2601.07110v2 Announce Type: replace Abstract: Synthetic personas are widely used to condition large language models (LLMs) for social simulation, yet most personas are still constructed...

Engineering Artificial Intelligence

arXiv CS Jan 28

Motion Focus Recognition in Fast-Moving Egocentric Video

arXiv:2601.07154v2 Announce Type: replace Abstract: From Vision-Language-Action (VLA) systems to robotics, existing egocentric datasets primarily focus on action recognition tasks, while largely overlooking the inherent...

Robotics

arXiv CS Jan 28

MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences

arXiv:2601.07251v3 Announce Type: replace Abstract: Recent advancements have expanded the role of Large Language Models in board games from playing agents to creative co-designers. However,...

Artificial Intelligence Biology

arXiv CS Jan 28

Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis

arXiv:2601.07974v2 Announce Type: replace Abstract: AI-text detectors achieve high accuracy on in-domain benchmarks, but often struggle to generalize across different generation conditions such as unseen...

Artificial Intelligence Biology

arXiv CS Jan 28

Generation-Augmented Generation: A Plug-and-Play Framework for Private Knowledge Injection in Large Language Models

arXiv:2601.08209v2 Announce Type: replace Abstract: In domains such as biomedicine, materials, and finance, high-stakes deployment of large language models (LLMs) requires injecting private, domain-specific knowledge...

Materials Science Software

arXiv CS Jan 28

MPCI-Bench: A Benchmark for Multimodal Pairwise Contextual Integrity Evaluation of Language Model Agents

arXiv:2601.08235v3 Announce Type: replace Abstract: As language-model agents evolve from passive chatbots into proactive assistants that handle personal data, evaluating their adherence to social norms...

Cybersecurity

arXiv CS Jan 28

T3: Benchmarking Sycophancy and Skepticism in Causal Judgment

arXiv:2601.08258v2 Announce Type: replace Abstract: We introduce T3 (Testing Trustworthy Thinking), a diagnostic benchmark designed to rigorously evaluate LLM causal judgment across Pearl's Ladder of...

Artificial Intelligence Software

arXiv CS Jan 28

An Efficient Algorithm to Sample Quantum Low-Density Parity-Check Codes

arXiv:2601.08387v2 Announce Type: replace Abstract: In this paper, we present an efficient algorithm to sample random sparse matrices to be used as check matrices for...

Software Quantum Computing

arXiv CS Jan 28

Resisting Manipulative Bots in Meme Coin Copy Trading: A Multi-Agent Approach with Chain-of-Thought Reasoning

arXiv:2601.08641v2 Announce Type: replace Abstract: Copy trading has become the dominant entry strategy in meme coin markets. However, due to the market's extremely illiquid and...

Software Energy

arXiv CS Jan 28

Prism: Towards Lowering User Cognitive Load in LLMs via Complex Intent Understanding

arXiv:2601.08653v2 Announce Type: replace Abstract: Large Language Models are rapidly emerging as web-native interfaces to social platforms. On the social web, users frequently have ambiguous...

Software Neuroscience

arXiv CS Jan 28

Agent Contracts: A Formal Framework for Resource-Bounded Autonomous AI Systems

arXiv:2601.08815v2 Announce Type: replace Abstract: The Contract Net Protocol (1980) introduced coordination through contracts in multi-agent systems. Modern agent protocols standardize connectivity and interoperability; yet,...

Robotics Artificial Intelligence

arXiv CS Jan 28

Layer-Parallel Training for Transformers

arXiv:2601.09026v2 Announce Type: replace Abstract: We present a new training methodology for transformers using a multilevel, layer-parallel approach. Through a neural ODE formulation of transformers,...

Software Artificial Intelligence

arXiv CS Jan 28

OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG

arXiv:2601.09028v2 Announce Type: replace Abstract: The development of large language models (LLMs) has achieved superior performance in a range of downstream tasks, including LLM-based retrieval-augmented...

Software Biology

arXiv CS Jan 28

Exploring the Effects of Generative AI Assistance on Writing Self-Efficacy

arXiv:2601.09033v2 Announce Type: replace Abstract: Generative AI (GenAI) is increasingly used in academic writing, yet its effects on students' writing self-efficacy remain contingent on how...

Artificial Intelligence Biology

arXiv CS Jan 28

Who Fails Where? LLM and Human Error Patterns in Endometriosis Ultrasound Report Extraction

arXiv:2601.09053v2 Announce Type: replace Abstract: In this study, we evaluate a locally deployed large language model (LLM) to convert unstructured endometriosis transvaginal ultrasound (eTVUS) scan reports into...

Engineering Artificial Intelligence