CUEBES

When Flores Bloomz Wrong: Cross-Direction Contamination in Machine Translation Evaluation

arXiv:2601.20858v1 Announce Type: new Abstract: Large language models (LLMs) can be benchmark-contaminated, resulting in inflated scores that mask memorization as generalization, and in multilingual settings,...

Policy Artificial Intelligence

arXiv CS Jan 29

Evolutionary Strategies lead to Catastrophic Forgetting in LLMs

arXiv:2601.20861v1 Announce Type: new Abstract: One of the biggest missing capabilities in current AI systems is the ability to learn continuously after deployment. Implementing such...

Artificial Intelligence Biology

arXiv CS Jan 29

Critical Transit Infrastructure in Smart Cities and Urban Air Quality: A Multi-City Seasonal Comparison of Ridership and PM2.5

arXiv:2601.19937v1 Announce Type: cross Abstract: Public transit is a critical component of urban mobility and equity, yet mobility and air-quality linkages are rarely operationalized in...

Engineering Environment

arXiv CS Jan 29

MK-SGC-SC: Multiple Kernel guided Sparse Graph Construction in Spectral Clustering for Unsupervised Speaker Diarization

arXiv:2601.19946v1 Announce Type: cross Abstract: Speaker diarization aims to segment audio recordings into regions corresponding to individual speakers. Although unsupervised speaker diarization is inherently challenging,...

Technology Environment

arXiv CS Jan 29

RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation

arXiv:2601.19949v1 Announce Type: cross Abstract: Despite decades of research on reverberant speech, comparing methods remains difficult because most corpora lack per-file acoustic annotations or provide...

Environment Psychology

arXiv CS Jan 29

VoxPrivacy: A Benchmark for Evaluating Interactional Privacy of Speech Language Models

arXiv:2601.19956v1 Announce Type: cross Abstract: As Speech Language Models (SLMs) transition from personal devices to shared, multi-user environments such as smart homes, a new challenge...

Environment Software

arXiv CS Jan 29

Deep Neural Networks as Iterated Function Systems and a Generalization Bound

arXiv:2601.19958v1 Announce Type: cross Abstract: Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability...

Software Neuroscience

arXiv CS Jan 29

Do we really need Self-Attention for Streaming Automatic Speech Recognition?

arXiv:2601.19960v1 Announce Type: cross Abstract: Transformer-based architectures are the most used architectures in many deep learning fields like Natural Language Processing, Computer Vision or Speech...

Software Environment

arXiv CS Jan 29

Global Plane Waves From Local Gaussians: Periodic Charge Densities in a Blink

arXiv:2601.19966v1 Announce Type: cross Abstract: We introduce ELECTRAFI, a fast, end-to-end differentiable model for predicting periodic charge densities in crystalline materials. ELECTRAFI constructs anisotropic Gaussians...

Materials Science Psychology

arXiv CS Jan 29

Exploring the holographic entropy cone via reinforcement learning

arXiv:2601.19979v1 Announce Type: cross Abstract: We develop a reinforcement learning algorithm to study the holographic entropy cone. Given a target entropy vector, our algorithm searches...

Software Policy

arXiv CS Jan 29

FORM Version 5.0

arXiv:2601.19982v1 Announce Type: cross Abstract: We present FORM 5, a major release of the symbolic-manipulation system FORM. Version 5 introduces an integrated diagram generator, based...

Psychology Mathematics

arXiv CS Jan 29

The Sound of Noise: Leveraging the Inductive Bias of Pre-trained Audio Transformers for Glitch Identification in LIGO

arXiv:2601.20034v1 Announce Type: cross Abstract: Transient noise artifacts, or glitches, fundamentally limit the sensitivity of gravitational-wave (GW) interferometers and can mimic true astrophysical signals, particularly...

Technology Psychology

arXiv CS Jan 29

Minimax Rates for Hyperbolic Hierarchical Learning

arXiv:2601.20047v1 Announce Type: cross Abstract: We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard...

Software Policy

arXiv CS Jan 29

Explainable deep learning reveals the physical mechanisms behind the turbulent kinetic energy equation

arXiv:2601.20052v1 Announce Type: cross Abstract: In this work, we investigate the physical mechanisms governing turbulent kinetic energy transport using explainable deep learning (XDL). An XDL...

Energy Engineering

arXiv CS Jan 29

Randomized Feasibility Methods for Constrained Optimization with Adaptive Step Sizes

arXiv:2601.20076v1 Announce Type: cross Abstract: We consider minimizing an objective function subject to constraints defined by the intersection of lower-level sets of convex functions. We...

Software Policy

arXiv CS Jan 29

Rate-induced tipping in a solvable model with the Allee effect

arXiv:2601.20128v1 Announce Type: cross Abstract: We present a novel exactly solvable ordinary differential equation model for rate-induced tipping: a dynamic phenomenon of dynamical systems where...

Software Mathematics

arXiv CS Jan 29

The Interplay Between Domination and Separation in Graphs

arXiv:2601.20153v1 Announce Type: cross Abstract: In the literature, several identification problems in graphs have been studied, of which, the most widely studied are the ones...

Software Policy

arXiv CS Jan 29

Bias-Reduced Estimation of Finite Mixtures: An Application to Latent Group Structures in Panel Data

arXiv:2601.20197v1 Announce Type: cross Abstract: Finite mixture models are widely used in econometric analyses to capture unobserved heterogeneity. This paper shows that maximum likelihood estimation...

Software Health

arXiv CS Jan 29

Quantum statistics from classical simulations via generative Gibbs sampling

arXiv:2601.20228v1 Announce Type: cross Abstract: Accurate simulation of nuclear quantum effects is essential for molecular modeling but expensive using path integral molecular dynamics (PIMD). We...

Quantum Computing Materials Science

arXiv CS Jan 29

Efficient Evaluation of LLM Performance with Statistical Guarantees

arXiv:2601.20251v1 Announce Type: cross Abstract: Exhaustively evaluating many large language models (LLMs) on a large suite of benchmarks is expensive. We cast benchmarking as finite-population...

Policy Software

arXiv CS Jan 29

A Quantum Photonic Approach to Graph Coloring

arXiv:2601.20263v1 Announce Type: cross Abstract: Gaussian Boson Sampling (GBS) is a quantum computational model that leverages linear optics to solve sampling problems believed to be...

Software World News

arXiv CS Jan 29

Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging

arXiv:2601.20269v1 Announce Type: cross Abstract: Machine learning models in high-stakes applications, such as recidivism prediction and automated personnel selection, often exhibit systematic performance disparities across...

Software Artificial Intelligence

arXiv CS Jan 29

Do Whitepaper Claims Predict Market Behavior? Evidence from Cryptocurrency Factor Analysis

arXiv:2601.20336v1 Announce Type: cross Abstract: Cryptocurrency projects articulate value propositions through whitepapers, making claims about functionality and technical capabilities. This study investigates whether these narratives...

Economics Technology

arXiv CS Jan 29

Convergence Analysis of Randomized Subspace Normalized SGD under Heavy-Tailed Noise

arXiv:2601.20399v1 Announce Type: cross Abstract: Randomized subspace methods reduce per-iteration cost; however, in nonconvex optimization, most analyses are expectation-based, and high-probability bounds remain scarce even...

Mathematics Policy