CUEBES

Bi-Level Prompt Optimization for Multimodal LLM-as-a-Judge

arXiv:2602.11340v1 Announce Type: new Abstract: Large language models (LLMs) have become widely adopted as automated judges for evaluating AI-generated content. Despite their success, aligning LLM-based...

Biology Software

arXiv CS Feb 13

Situated, Dynamic, and Subjective: Envisioning the Design of Theory-of-Mind-Enabled Everyday AI with Industry Practitioners

arXiv:2602.11342v1 Announce Type: new Abstract: Theory of Mind (ToM) -- the ability to infer what others are thinking (e.g., intentions) from observable cues -- is...

Psychology Software

arXiv CS Feb 13

Divide and Learn: Multi-Objective Combinatorial Optimization at Scale

arXiv:2602.11346v1 Announce Type: new Abstract: Multi-objective combinatorial optimization seeks Pareto-optimal solutions over exponentially large discrete spaces, yet existing methods sacrifice generality, scalability, or theoretical guarantees....

Hardware Technology

arXiv CS Feb 13

AgentNoiseBench: Benchmarking Robustness of Tool-Using LLM Agents Under Noisy Condition

arXiv:2602.11348v1 Announce Type: new Abstract: Recent advances in large language models have enabled LLM-based agents to achieve strong performance on a variety of benchmarks. However,...

Environment Psychology

arXiv CS Feb 13

ArtContext: Contextualizing Artworks with Open-Access Art History Articles and Wikidata Knowledge through a LoRA-Tuned CLIP Model

arXiv:2602.11349v1 Announce Type: new Abstract: Many Art History articles discuss artworks in general as well as specific parts of works, such as layout, iconography, or...

Materials Science Software

arXiv CS Feb 13

Structured Hybrid Mechanistic Models for Robust Estimation of Time-Dependent Intervention Outcomes

arXiv:2602.11350v1 Announce Type: new Abstract: Estimating intervention effects in dynamical systems is crucial for outcome optimization. In medicine, such interventions arise in physiological regulation (e.g.,...

Software Policy

arXiv CS Feb 13

Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization

arXiv:2602.11351v1 Announce Type: new Abstract: Proactive large language model (LLM) agents aim to actively plan, query, and interact over multiple turns, enabling efficient task completion...

Software Psychology

arXiv CS Feb 13

Bizarre Love Triangle: Generative AI, Art, and Kitsch

arXiv:2602.11353v1 Announce Type: new Abstract: Generative artificial intelligence (GenAI) has engrossed the mainstream culture, expanded AI's creative user base, and catalyzed economic, legal, and aesthetic...

Technology Policy

arXiv CS Feb 13

ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences

arXiv:2602.11354v1 Announce Type: new Abstract: The literature has witnessed an emerging interest in AI agents for automated assessment of scientific papers. Existing benchmarks focus primarily...

Software Artificial Intelligence

arXiv CS Feb 13

A 16 nm 1.60TOPS/W High Utilization DNN Accelerator with 3D Spatial Data Reuse and Efficient Shared Memory Access

arXiv:2602.11357v1 Announce Type: new Abstract: Achieving high compute utilization across a wide range of AI workloads is crucial for the efficiency of versatile DNN accelerators....

Hardware Technology

arXiv CS Feb 13

When Models Examine Themselves: Vocabulary-Activation Correspondence in Self-Referential Processing

arXiv:2602.11358v1 Announce Type: new Abstract: Large language models produce rich introspective language when prompted for self-examination, but whether this language reflects internal computation or sophisticated...

Engineering Software

arXiv CS Feb 13

Bootstrapping-based Regularisation for Reducing Individual Prediction Instability in Clinical Risk Prediction Models

arXiv:2602.11360v1 Announce Type: new Abstract: Clinical prediction models are increasingly used to support patient care, yet many deep learning-based approaches remain unstable, as their predictions...

Health Artificial Intelligence

arXiv CS Feb 13

Finding the Cracks: Improving LLMs Reasoning with Paraphrastic Probing and Consistency Verification

arXiv:2602.11361v1 Announce Type: new Abstract: Large language models have demonstrated impressive performance across a variety of reasoning tasks. However, their problem-solving ability often declines on...

Policy Artificial Intelligence

arXiv CS Feb 13

Real Life Is Uncertain. Consensus Should Be Too!

arXiv:2602.11362v1 Announce Type: new Abstract: Modern distributed systems rely on consensus protocols to build a fault-tolerant core upon which they can build applications. Consensus protocols are...

Software World News

arXiv CS Feb 13

Preprocessed 3SUM for Unknown Universes with Subquadratic Space

arXiv:2602.11363v1 Announce Type: new Abstract: We consider the classic 3SUM problem: given sets of integers $A, B, C$, determine whether there is a tuple...

Engineering Policy

arXiv CS Feb 13

The Energy of Falsehood: Detecting Hallucinations via Diffusion Model Likelihoods

arXiv:2602.11364v1 Announce Type: new Abstract: Large Language Models (LLMs) frequently hallucinate plausible but incorrect assertions, a vulnerability often missed by uncertainty metrics when models are...

Energy Policy

arXiv CS Feb 13

Interpretive Cultures: Resonance, randomness, and negotiated meaning for AI-assisted tarot divination

arXiv:2602.11367v1 Announce Type: new Abstract: While generative AI tools are increasingly adopted for creative and analytical tasks, their role in interpretive practices, where meaning is...

Biology World News

arXiv CS Feb 13

The Manifold of the Absolute: Religious Perennialism as Generative Inference

arXiv:2602.11368v1 Announce Type: new Abstract: This paper formalizes religious epistemology through the mathematics of Variational Autoencoders. We model religious traditions as distinct generative mappings from...

Software Engineering

arXiv CS Feb 13

A Unified Estimation--Guidance Framework Based on Bayesian Decision Theory

arXiv:2602.11373v1 Announce Type: new Abstract: Using Bayesian decision theory, we modify the perfect-information, differential game-based guidance law (DGL1) to address the inevitable estimation error occurring...

Mathematics Policy

arXiv CS Feb 13

Retrieval-Aware Distillation for Transformer-SSM Hybrids

arXiv:2602.11374v1 Announce Type: new Abstract: State-space models (SSMs) offer efficient sequence modeling but lag behind Transformers on benchmarks that require in-context retrieval. Prior work links...

Policy Space & Astronomy

arXiv CS Feb 13

Modelling Trust and Trusted Systems: A Category Theoretic Approach

arXiv:2602.11376v1 Announce Type: new Abstract: We introduce a category-theoretic framework for modelling trust as applied to trusted computation systems and remote attestation. By formalizing elements,...

Environment Software

arXiv CS Feb 13

Toward Adaptive Non-Intrusive Reduced-Order Models: Design and Challenges

arXiv:2602.11378v1 Announce Type: new Abstract: Projection-based Reduced Order Models (ROMs) are often deployed as static surrogates, which limits their practical utility once a system leaves...

Energy Software

arXiv CS Feb 13

Chemo Hydrodynamic Transceivers for the Internet of Bio-Nano Things, Modeling the Joint Propulsion Transmission trade-off

arXiv:2602.11380v1 Announce Type: new Abstract: The Internet of Bio-Nano Things (IoBNT) requires mobile nanomachines that navigate complex fluids while exchanging molecular signals under external supervision....

Robotics Climate & Environment

arXiv CS Feb 13

Markovian protocols and an upper bound on the extension complexity of the matching polytope

arXiv:2602.11382v1 Announce Type: new Abstract: This paper investigates the extension complexity of polytopes by exploiting the correspondence between non-negative factorizations of slack matrices and randomized...

Policy