News

Past, Present, and Future of Bug Tracking in the Generative AI Era

arXiv:2510.08005v2 Announce Type: replace Abstract: Traditional bug-tracking systems rely heavily on manual reporting, reproduction, classification, and resolution, involving multiple stakeholders such as end users, customer...

Software Technology

TaoSR-AGRL: Adaptive Guided Reinforcement Learning Framework for E-commerce Search Relevance

arXiv:2510.08048v3 Announce Type: replace Abstract: Query-product relevance prediction is fundamental to e-commerce search and has become even more critical in the era of AI-powered shopping,...

Technology Software

Coordinates from Context: Using LLMs to Ground Complex Location References

arXiv:2510.08741v2 Announce Type: replace Abstract: Geocoding is the task of linking a location reference to an actual geographic location and is essential for many downstream...

Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation

arXiv:2510.08996v4 Announce Type: replace Abstract: Current benchmarks for evaluating software engineering agents, such as SWE-Bench Verified, are predominantly derived from GitHub issues and fail to...

Software Technology

Value-State Gated Attention for Mitigating Extreme-Token Phenomena in Transformers

arXiv:2510.09017v3 Announce Type: replace Abstract: Large models based on the Transformer architecture are susceptible to extreme-token phenomena, such as attention sinks and value-state drains. These...

Biology

Beyond Single-Granularity Prompts: A Multi-Scale Chain-of-Thought Prompt Learning for Graph

arXiv:2510.09394v4 Announce Type: replace Abstract: The "pre-train, prompt" paradigm, designed to bridge the gap between pre-training tasks and downstream objectives, has been extended from the...

iBERT: Interpretable Embeddings via Sense Decomposition

arXiv:2510.09882v2 Announce Type: replace Abstract: We present iBERT (interpretable-BERT), an encoder to produce inherently interpretable and controllable embeddings - designed to modularize and expose the...

Engineering Software

PairSem: LLM-Guided Pairwise Semantic Matching for Scientific Document Retrieval

arXiv:2510.09897v2 Announce Type: replace Abstract: Scientific document retrieval is a critical task for enabling knowledge discovery and supporting research across diverse domains. However, existing dense...

BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation

arXiv:2510.10157v2 Announce Type: replace Abstract: Multi-LLM systems enhance the creativity of large language models by simulating human collective intelligence but suffer from significant drawbacks, such...

A Style-Based Profiling Framework for Quantifying the Synthetic-to-Real Gap in Autonomous Driving Datasets

arXiv:2510.10203v3 Announce Type: replace Abstract: Ensuring the reliability of autonomous driving perception systems requires extensive environment-based testing, yet real-world execution is often impractical. Synthetic datasets...

Biology Robotics

FAC-FACodec: Controllable Zero-Shot Foreign Accent Conversion with Factorized Speech Codec

arXiv:2510.10785v2 Announce Type: replace Abstract: Previous accent conversion (AC) methods, including foreign accent conversion (FAC), lack explicit control over the degree of modification. Because accent...

Software

Revisiting Model Interpolation for Efficient Reasoning

arXiv:2510.10977v2 Announce Type: replace Abstract: Model merging, typically on Instruct and Thinking models, has shown remarkable performance for efficient reasoning. In this paper, we systematically...

Does LLM Focus on the Right Words? Mitigating Context Bias in LLM-based Recommenders

arXiv:2510.10978v3 Announce Type: replace Abstract: Large language models (LLMs), owing to their extensive open-domain knowledge and semantic reasoning capabilities, have been increasingly integrated into recommender...

Deep Research with Open-Domain Evaluation and Multi-Stage Guardrails for Safety

arXiv:2510.10994v2 Announce Type: replace Abstract: Deep research frameworks have shown promising capabilities in synthesizing comprehensive reports from web sources. While deep research possesses significant potential...

Artificial Intelligence

GeoVLMath: Enhancing Geometry Reasoning in Vision-Language Models via Cross-Modal Reward for Auxiliary Line Creation

arXiv:2510.11020v2 Announce Type: replace Abstract: Auxiliary lines are essential for solving complex geometric problems but remain challenging for large vision-language models (LVLMs). Recent attempts construct...

Biology Software

From Reasoning LLMs to BERT: A Two-Stage Distillation Framework for Search Relevance

arXiv:2510.11056v3 Announce Type: replace Abstract: Query-service relevance prediction in e-commerce search systems faces strict latency requirements that prevent the direct application of Large Language Models...

BLEnD-Vis: Benchmarking Multimodal Cultural Understanding in Vision Language Models

arXiv:2510.11178v2 Announce Type: replace Abstract: As vision-language models (VLMs) are deployed globally, their ability to understand culturally situated knowledge becomes essential. Yet, existing evaluations largely...

Energy Artificial Intelligence

Explainability, risk modeling, and segmentation based customer churn analytics for personalized retention in e-commerce

arXiv:2510.11604v2 Announce Type: replace Abstract: In online retail, customer acquisition typically incurs higher costs than customer retention, motivating firms to invest in churn analytics. However,...

Understanding Parametric Knowledge Injection in Retrieval-Augmented Generation

arXiv:2510.12668v2 Announce Type: replace Abstract: Context-grounded generation underpins many LLM applications, including long-document question answering (QA), conversational personalization, and retrieval-augmented generation (RAG). However, classic token-based...

Tandem Training for Language Models

arXiv:2510.13551v2 Announce Type: replace Abstract: As language models continue to rapidly improve, we can expect their actions and reasoning to become difficult or impossible for...

Artificial Intelligence

PIShield: Detecting Prompt Injection Attacks via Intrinsic LLM Features

arXiv:2510.14005v3 Announce Type: replace Abstract: LLM-integrated applications are vulnerable to prompt injection attacks, where an attacker contaminates the input to inject malicious instructions, causing the...

MathMist: A Parallel Multilingual Benchmark Dataset for Mathematical Problem Solving and Reasoning

arXiv:2510.14305v2 Announce Type: replace Abstract: Mathematical reasoning remains one of the most challenging domains for large language models (LLMs), requiring not only linguistic understanding but...