CUEBES

Visual Reasoning Benchmark: Evaluating Multimodal LLMs on Classroom-Authentic Visual Problems from Primary Education

arXiv:2602.12196v1 Announce Type: new Abstract: AI models have achieved state-of-the-art results in textual reasoning; however, their ability to reason over spatial and relational structures remains...

Artificial Intelligence Policy

arXiv CS 8h ago

MalTool: Malicious Tool Attacks on LLM Agents

arXiv:2602.12194v1 Announce Type: new Abstract: In a malicious tool attack, an attacker uploads a malicious tool to a distribution platform; once a user installs the...

Software Cybersecurity

arXiv CS 8h ago

Query-focused and Memory-aware Reranker for Long Context Processing

arXiv:2602.12192v1 Announce Type: new Abstract: Built upon the existing analysis of retrieval heads in large language models, we propose an alternative reranking framework that trains...

Software Policy

arXiv CS 8h ago

WaveFormer: Wavelet Embedding Transformer for Biomedical Signals

arXiv:2602.12189v1 Announce Type: new Abstract: Biomedical signal classification presents unique challenges due to long sequences, complex temporal dynamics, and multi-scale frequency patterns that are poorly...

Psychology Neuroscience

arXiv CS 8h ago

SAGEO Arena: A Realistic Environment for Evaluating Search-Augmented Generative Engine Optimization

arXiv:2602.12187v1 Announce Type: new Abstract: Search-Augmented Generative Engines (SAGE) have emerged as a new paradigm for information access, bridging web-scale retrieval with generative capabilities to...

Engineering Environment

arXiv CS 8h ago

Unknown Attack Detection in IoT Networks using Large Language Models: A Robust, Data-efficient Approach

arXiv:2602.12183v1 Announce Type: new Abstract: The rapid evolution of cyberattacks continues to drive the emergence of unknown (zero-day) threats, posing significant challenges for network intrusion...

Artificial Intelligence Biology

arXiv CS 8h ago

Rate-Reliability Tradeoff for Deterministic Identification over Gaussian Channels

arXiv:2602.12182v1 Announce Type: new Abstract: We extend the recent analysis of the rate-reliability tradeoff in deterministic identification (DI) to general linear Gaussian channels, marking the...

Engineering Policy

arXiv CS 8h ago

Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria

arXiv:2602.12181v1 Announce Type: new Abstract: Convex Markov Games (cMGs) were recently introduced as a broad class of multi-agent learning problems that generalize Markov games to...

Software Policy

arXiv CS 8h ago

How Sampling Shapes LLM Alignment: From One-Shot Optima to Iterative Dynamics

arXiv:2602.12180v1 Announce Type: new Abstract: Standard methods for aligning large language models with human preferences learn from pairwise comparisons among sampled candidate responses and regularize...

Policy World News

arXiv CS 8h ago

Systematic Analysis of Penalty-Optimised Illumination Design for Tomographic Volumetric Additive Manufacturing via the Extendable Framework TVAM AID Using the Core Imaging Library

arXiv:2602.12178v1 Announce Type: new Abstract: Tomographic Volumetric Additive Manufacturing (TVAM) is a novel manufacturing method that allows for the fast creation of objects of complex geometry...

Materials Science Energy

arXiv CS 8h ago

EO-VAE: Towards A Multi-sensor Tokenizer for Earth Observation Data

arXiv:2602.12177v1 Announce Type: new Abstract: State-of-the-art generative image and video models rely heavily on tokenizers that compress high-dimensional inputs into more efficient latent representations. While...

Software Biology

arXiv CS 8h ago

Improved Online Algorithms for Inventory Management Problems with Holding and Delay Costs: Riding the Wave Makes Things Simpler, Stronger, & More General

arXiv:2602.12175v1 Announce Type: new Abstract: The Joint Replenishment Problem (JRP) is a classical inventory management problem that aims to model the trade-off between coordinating orders...

Software Policy

arXiv CS 8h ago

SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation

arXiv:2602.12173v1 Announce Type: new Abstract: Vision-language segmentation models such as SAM3 enable flexible, prompt-driven visual grounding, but inherit large, general-purpose text encoders originally designed for...

Software Energy

arXiv CS 8h ago

Pedagogically-Inspired Data Synthesis for Language Model Knowledge Distillation

arXiv:2602.12172v1 Announce Type: new Abstract: Knowledge distillation from Large Language Models (LLMs) to smaller models has emerged as a critical technique for deploying efficient AI...

Artificial Intelligence Technology

arXiv CS 8h ago

Statistical Parsing for Logical Information Retrieval

arXiv:2602.12170v1 Announce Type: new Abstract: In previous work (Coppola, 2024) we introduced the Quantified Boolean Bayesian Network (QBBN), a logical graphical model that implements the...

Software World News

arXiv CS 8h ago

Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

arXiv:2602.12164v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated exceptional reasoning capabilities, and co-evolving paradigms have shown promising results in domains such as...

Psychology Software

arXiv CS 8h ago

Amortized Molecular Optimization via Group Relative Policy Optimization

arXiv:2602.12162v1 Announce Type: new Abstract: Molecular design encompasses tasks ranging from de-novo design to structural alteration of given molecules or fragments. For the latter, state-of-the-art...

Policy Engineering

arXiv CS 8h ago

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

arXiv:2602.12160v1 Announce Type: new Abstract: Recent advancements in foundation models have revolutionized joint audio-video generation. However, existing approaches typically treat human-centric tasks including reference-based audio-video...

Software Biology

arXiv CS 8h ago

3DGSNav: Enhancing Vision-Language Model Reasoning for Object Navigation via Active 3D Gaussian Splatting

arXiv:2602.12159v1 Announce Type: new Abstract: Object navigation is a core capability of embodied intelligence, enabling an agent to locate target objects in unknown environments. Recent...

Psychology Robotics

arXiv CS 8h ago

SafeNeuron: Neuron-Level Safety Alignment for Large Language Models

arXiv:2602.12158v1 Announce Type: new Abstract: Large language models (LLMs) and multimodal LLMs are typically safety-aligned before release to prevent harmful content generation. However, recent studies...

Biology Neuroscience

arXiv CS 8h ago

TexSpot: 3D Texture Enhancement with Spatially-uniform Point Latent Representation

arXiv:2602.12157v1 Announce Type: new Abstract: High-quality 3D texture generation remains a fundamental challenge due to the view-inconsistency inherent in current mainstream multi-view diffusion pipelines. Existing...

Software Psychology

arXiv CS 8h ago

FAIL: Flow Matching Adversarial Imitation Learning for Image Generation

arXiv:2602.12155v1 Announce Type: new Abstract: Post-training of flow matching models (aligning the output distribution with a high-quality target) is mathematically equivalent to imitation learning. While Supervised Fine-Tuning...

Policy Biology

arXiv CS 8h ago

dVoting: Fast Voting for dLLMs

arXiv:2602.12153v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) represent a new paradigm beyond autoregressive modeling, offering competitive performance while naturally enabling a flexible...

Technology Software

arXiv CS 8h ago

OServe: Accelerating LLM Serving via Spatial-Temporal Workload Orchestration

arXiv:2602.12151v1 Announce Type: new Abstract: Serving Large Language Models (LLMs) can benefit immensely from parallelizing both the model and input requests across multiple devices, but...

Artificial Intelligence World News