Evolutionary Strategies lead to Catastrophic Forgetting in LLMs
arXiv:2601.20861v1 Announce Type: new Abstract: One of the biggest missing capabilities in current AI systems is the ability to learn continuously after deployment. Implementing such...
arXiv:2601.19937v1 Announce Type: cross Abstract: Public transit is a critical component of urban mobility and equity, yet mobility and air-quality linkages are rarely operationalized in...
arXiv:2601.19946v1 Announce Type: cross Abstract: Speaker diarization aims to segment audio recordings into regions corresponding to individual speakers. Although unsupervised speaker diarization is inherently challenging,...
arXiv:2601.19949v1 Announce Type: cross Abstract: Despite decades of research on reverberant speech, comparing methods remains difficult because most corpora lack per-file acoustic annotations or provide...
arXiv:2601.19956v1 Announce Type: cross Abstract: As Speech Language Models (SLMs) transition from personal devices to shared, multi-user environments such as smart homes, a new challenge...
arXiv:2601.19958v1 Announce Type: cross Abstract: Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability...
arXiv:2601.19960v1 Announce Type: cross Abstract: Transformer-based architectures are the most used architectures in many deep learning fields like Natural Language Processing, Computer Vision or Speech...
arXiv:2601.19966v1 Announce Type: cross Abstract: We introduce ELECTRAFI, a fast, end-to-end differentiable model for predicting periodic charge densities in crystalline materials. ELECTRAFI constructs anisotropic Gaussians...
arXiv:2601.19979v1 Announce Type: cross Abstract: We develop a reinforcement learning algorithm to study the holographic entropy cone. Given a target entropy vector, our algorithm searches...
arXiv:2601.19982v1 Announce Type: cross Abstract: We present FORM 5, a major release of the symbolic-manipulation system FORM. Version 5 introduces an integrated diagram generator, based...
arXiv:2601.20034v1 Announce Type: cross Abstract: Transient noise artifacts, or glitches, fundamentally limit the sensitivity of gravitational-wave (GW) interferometers and can mimic true astrophysical signals, particularly...
arXiv:2601.20047v1 Announce Type: cross Abstract: We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard...
arXiv:2601.20052v1 Announce Type: cross Abstract: In this work, we investigate the physical mechanisms governing turbulent kinetic energy transport using explainable deep learning (XDL). An XDL...
arXiv:2601.20076v1 Announce Type: cross Abstract: We consider minimizing an objective function subject to constraints defined by the intersection of lower-level sets of convex functions. We...
arXiv:2601.20128v1 Announce Type: cross Abstract: We present a novel exactly solvable ordinary differential equation model for rate-induced tipping: a dynamic phenomenon of dynamical systems where...
arXiv:2601.20153v1 Announce Type: cross Abstract: In the literature, several identification problems in graphs have been studied, of which, the most widely studied are the ones...
arXiv:2601.20197v1 Announce Type: cross Abstract: Finite mixture models are widely used in econometric analyses to capture unobserved heterogeneity. This paper shows that maximum likelihood estimation...
arXiv:2601.20228v1 Announce Type: cross Abstract: Accurate simulation of nuclear quantum effects is essential for molecular modeling but expensive using path integral molecular dynamics (PIMD). We...
arXiv:2601.20251v1 Announce Type: cross Abstract: Exhaustively evaluating many large language models (LLMs) on a large suite of benchmarks is expensive. We cast benchmarking as finite-population...
arXiv:2601.20263v1 Announce Type: cross Abstract: Gaussian Boson Sampling (GBS) is a quantum computational model that leverages linear optics to solve sampling problems believed to be...
arXiv:2601.20269v1 Announce Type: cross Abstract: Machine learning models in high-stakes applications, such as recidivism prediction and automated personnel selection, often exhibit systematic performance disparities across...
arXiv:2601.20336v1 Announce Type: cross Abstract: Cryptocurrency projects articulate value propositions through whitepapers, making claims about functionality and technical capabilities. This study investigates whether these narratives...
arXiv:2601.20399v1 Announce Type: cross Abstract: Randomized subspace methods reduce per-iteration cost; however, in nonconvex optimization, most analyses are expectation-based, and high-probability bounds remain scarce even...
arXiv:2601.20447v1 Announce Type: cross Abstract: Enabling natural communication through brain-computer interfaces (BCIs) remains one of the most profound challenges in neuroscience and neurotechnology. While existing...