CUEBES

LACONIC: Length-Aware Constrained Reinforcement Learning for LLM

arXiv:2602.14468v1 Announce Type: new Abstract: Reinforcement learning (RL) has enhanced the capabilities of large language models (LLMs) through reward-driven training. Nevertheless, this process can introduce...

Policy Software

arXiv CS 1d ago

Measuring and Mitigating Post-hoc Rationalization in Reverse Chain-of-Thought Generation

arXiv:2602.14469v1 Announce Type: new Abstract: Reverse Chain-of-Thought Generation (RCG) synthesizes reasoning traces from query-answer pairs, but runs the risk of producing post-hoc rationalizations: when models...

Neuroscience Engineering

arXiv CS 1d ago

HyperRAG: Reasoning N-ary Facts over Hypergraphs for Retrieval Augmented Generation

arXiv:2602.14470v1 Announce Type: new Abstract: Graph-based retrieval-augmented generation (RAG) methods, typically built on knowledge graphs (KGs) with binary relational facts, have shown promise in multi-hop...

Engineering Software

arXiv CS 1d ago

Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems

arXiv:2602.14471v1 Announce Type: new Abstract: Deploying large language model (LLM) agents in shared environments introduces a fundamental tension between individual alignment and collective stability: locally...

Psychology Artificial Intelligence

arXiv CS 1d ago

Learning Transferability: A Two-Stage Reinforcement Learning Approach for Enhancing Quadruped Robots' Performance in U-Shaped Stair Climbing

arXiv:2602.14473v1 Announce Type: new Abstract: Quadruped robots are employed in various scenarios in building construction. However, autonomous stair climbing across different indoor staircases remains a...

Robotics Policy

arXiv CS 1d ago

One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise

arXiv:2602.14474v1 Announce Type: new Abstract: We study the $K$-armed multi-armed bandit (MAB) problem with $M$ heterogeneous data sources, each exhibiting unknown and distinct noise variances $\{\sigma_j^2\}_{j=1}^M$....

Biology Software

arXiv CS 1d ago

Truthful Reverse Auctions for Adaptive Selection via Contextual Multi-Armed Bandits

arXiv:2602.14476v1 Announce Type: new Abstract: We study the problem of selecting large language models (LLMs) for user queries in settings where multiple LLM providers submit...

Artificial Intelligence Politics

arXiv CS 1d ago

When OpenClaw AI Agents Teach Each Other: Peer Learning Patterns in the Moltbook Community

arXiv:2602.14477v1 Announce Type: new Abstract: Peer learning, where learners teach and learn from each other, is foundational to educational practice. A novel phenomenon has emerged:...

Software Policy

arXiv CS 1d ago

On the Rate-Distortion-Complexity Tradeoff for Semantic Communication

arXiv:2602.14481v1 Announce Type: new Abstract: Semantic communication is a novel communication paradigm that focuses on conveying the user's intended meaning rather than the bit-wise transmission...

Psychology Software

arXiv CS 1d ago

TikArt: Aperture-Guided Observation for Fine-Grained Visual Reasoning via Reinforcement Learning

arXiv:2602.14482v1 Announce Type: new Abstract: We address fine-grained visual reasoning in multimodal large language models (MLLMs), where key evidence may reside in tiny objects, cluttered...

World News Policy

arXiv CS 1d ago

Revisiting the Platonic Representation Hypothesis: An Aristotelian View

arXiv:2602.14486v1 Announce Type: new Abstract: The Platonic Representation Hypothesis suggests that representations from neural networks are converging to a common statistical model of reality. We...

Genetics Neuroscience

arXiv CS 1d ago

BETA-Labeling for Multilingual Dataset Construction in Low-Resource IR

arXiv:2602.14488v1 Announce Type: new Abstract: IR in low-resource languages remains limited by the scarcity of high-quality, task-specific annotated datasets. Manual annotation is expensive and difficult...

Business Policy

arXiv CS 1d ago

Parameter-Efficient Fine-Tuning of LLMs with Mixture of Space Experts

arXiv:2602.14490v1 Announce Type: new Abstract: Large Language Models (LLMs) have achieved remarkable progress, with Parameter-Efficient Fine-Tuning (PEFT) emerging as a key technique for downstream task...

Biology Technology

arXiv CS 1d ago

Query as Anchor: Scenario-Adaptive User Representation via Large Language Model

arXiv:2602.14492v1 Announce Type: new Abstract: Industrial-scale user representation learning requires balancing robust universality with acute task-sensitivity. However, existing paradigms primarily yield static, task-agnostic embeddings that...

Psychology Chemistry

arXiv CS 1d ago

Gaussian Mesh Renderer for Lightweight Differentiable Rendering

arXiv:2602.14493v1 Announce Type: new Abstract: 3D Gaussian Splatting (3DGS) has enabled high-fidelity virtualization with fast rendering and optimization for novel view synthesis. On the other...

Engineering Chemistry

arXiv CS 1d ago

Divine Benevolence is an $x^2$: GLUs scale asymptotically faster than MLPs

arXiv:2602.14495v1 Announce Type: new Abstract: Scaling laws can be understood from ground-up numerical analysis, where traditional function approximation theory can explain shifts in model architecture...

Software Policy

arXiv CS 1d ago

Uncertainty-Aware Vision-Language Segmentation for Medical Imaging

arXiv:2602.14498v1 Announce Type: new Abstract: We introduce a novel uncertainty-aware multimodal segmentation framework that leverages both radiological images and associated clinical text for precise medical...

Software Medicine & Health

arXiv CS 1d ago

Prototype Instance-semantic Disentanglement with Low-rank Regularized Subspace Clustering for WSIs Explainable Recognition

arXiv:2602.14501v1 Announce Type: new Abstract: The tumor region plays a key role in pathological diagnosis. Tumor tissues are highly similar to precancerous lesions and non...

Quantum Computing Health

arXiv CS 1d ago

Behavioral Feature Boosting via Substitute Relationships for E-commerce Search

arXiv:2602.14502v1 Announce Type: new Abstract: On E-commerce platforms, new products often suffer from the cold-start problem: limited interaction data reduces their search visibility and hurts...

Psychology Software

arXiv CS 1d ago

Bounding Probabilities of Causation with Partial Causal Diagrams

arXiv:2602.14503v1 Announce Type: new Abstract: Probabilities of causation are fundamental to individual-level explanation and decision making, yet they are inherently counterfactual and not point-identifiable from...

Software Engineering

arXiv CS 1d ago

Adaptive Finite Elements with Algebraic Stabilization for Convection-Dominated Transport

arXiv:2602.14504v1 Announce Type: new Abstract: We present a numerical investigation of residual-based a posteriori error estimation for finite element discretizations of convection-diffusion equations stabilized by...

Energy Mathematics

arXiv CS 1d ago

Formally Verifying and Explaining Sepsis Treatment Policies with COOL-MC

arXiv:2602.14505v1 Announce Type: new Abstract: Safe and interpretable sequential decision-making is critical in healthcare, yet reinforcement learning (RL) policies for sepsis treatment optimization remain opaque...

Health Policy

arXiv CS 1d ago

Covariance-Aware Transformers for Quadratic Programming and Decision Making

arXiv:2602.14506v1 Announce Type: new Abstract: We explore the use of transformers for solving quadratic programs and how this capability benefits decision-making problems that involve covariance...

Software Mathematics

arXiv CS 1d ago

MacNet: An End-to-End Manifold-Constrained Adaptive Clustering Network for Interpretable Whole Slide Image Classification

arXiv:2602.14509v1 Announce Type: new Abstract: Whole slide images (WSIs) are the gold standard for pathological diagnosis and sub-typing. Current mainstream two-step frameworks employ offline feature...

Software Health