CUEBES

One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv:2603.03291v1 Announce Type: new Abstract: Reward Models (RMs) are crucial for online alignment of language models (LMs) with human preferences. However, RM-based preference-tuning is vulnerable...

Policy Biology

arXiv CS Mar 5

From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv:2603.03292v1 Announce Type: new Abstract: Large Language Models (LLMs) exhibit high reasoning capacity in medical question-answering, but their tendency to produce hallucinations and outdated knowledge...

Health Biology

arXiv CS Mar 5

SE-Search: Self-Evolving Search Agent via Memory and Dense Reward

arXiv:2603.03293v1 Announce Type: new Abstract: Retrieval augmented generation (RAG) reduces hallucinations and factual errors in large language models (LLMs) by conditioning generation on retrieved external...

Biology Robotics

arXiv CS Mar 5

Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

arXiv:2603.03294v1 Announce Type: new Abstract: Large Language Models show promise for agricultural advisory, yet vanilla models exhibit unsupported recommendations, generic advice lacking specific, actionable detail,...

Artificial Intelligence Software

arXiv CS Mar 5

Language Model Goal Selection Differs from Humans' in an Open-Ended Task

arXiv:2603.03295v1 Announce Type: new Abstract: As large language models (LLMs) get integrated into human decision-making, they are increasingly choosing goals autonomously rather than only completing...

Psychology Software

arXiv CS Mar 5

PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

arXiv:2603.03296v1 Announce Type: new Abstract: Long-term memory is essential for large language model (LLM) agents operating in complex environments, yet existing memory designs are either...

Neuroscience Environment

arXiv CS Mar 5

TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

arXiv:2603.03297v1 Announce Type: new Abstract: Test-time Training enables model adaptation using only test questions and offers a promising paradigm for improving the reasoning ability of...

Biology Psychology

arXiv CS Mar 5

TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

arXiv:2603.03298v1 Announce Type: new Abstract: Large Language Models (LLMs) have improved substantially in alignment, yet their behavior remains highly sensitive to prompt phrasing. This brittleness has...

Engineering Psychology

arXiv CS Mar 5

How LLMs Cite and Why It Matters: A Cross-Model Audit of Reference Fabrication in AI-Assisted Academic Writing and Methods to Detect Phantom Citations

arXiv:2603.03299v1 Announce Type: new Abstract: Large language models (LLMs) have been noted to fabricate scholarly citations, yet the scope of this behavior across providers, domains,...

Biology Psychology

arXiv CS Mar 5

Benchmarking Legal RAG: The Promise and Limits of AI Statutory Surveys

arXiv:2603.03300v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) offers significant potential for legal AI, yet systematic benchmarks are sparse. Prior work introduced LaborBench to benchmark...

Software Policy

arXiv CS Mar 5

From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

arXiv:2603.03301v1 Announce Type: new Abstract: The rapid adoption of large language models (LLMs) has created demand for faster responses and lower costs. Semantic caching, reusing...

Policy Technology

arXiv CS Mar 5

Developing an AI Assistant for Knowledge Management and Workforce Training in State DOTs

arXiv:2603.03302v1 Announce Type: new Abstract: Effective knowledge management is critical for preserving institutional expertise and improving the efficiency of workforce training in state transportation agencies....

Artificial Intelligence Technology

arXiv CS Mar 5

HumanLM: Simulating Users with State Alignment Beats Response Imitation

arXiv:2603.03303v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly used to simulate how specific users respond to a given context, enabling more user-centric...

Software Biology

arXiv CS Mar 5

Knowledge Graph and Hypergraph Transformers with Repository-Attention and Journey-Based Role Transport

arXiv:2603.03304v1 Announce Type: new Abstract: We present a concise architecture for joint training on sentences and structured data while keeping knowledge and language representations separable....

Software World News

arXiv CS Mar 5

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

arXiv:2603.03305v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to generate executable outputs, JSON objects, and API calls, where a single syntax...

Engineering Software

arXiv CS Mar 5

Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

arXiv:2603.03306v1 Announce Type: new Abstract: Recently presented Token-Oriented Object Notation (TOON) aims to replace JSON as a serialization format for passing structured data to LLMs...

Engineering Policy

arXiv CS Mar 5

TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding

arXiv:2603.03307v1 Announce Type: new Abstract: Epistemic Network Analysis (ENA) is a method for investigating the relational structure of concepts in text by representing co-occurring concepts...

Biology Engineering

arXiv CS Mar 5

Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

arXiv:2603.03308v1 Announce Type: new Abstract: How does the conversational past of large language models (LLMs) influence their future performance? Recent work suggests that LLMs are...

Technology Psychology

arXiv CS Mar 5

Combating data scarcity in recommendation services: Integrating cognitive types of VARK and neural network technologies (LLM)

arXiv:2603.03309v1 Announce Type: new Abstract: Cold start scenarios present fundamental obstacles to effective recommendation generation, particularly when dealing with users lacking interaction history or items...

Neuroscience Artificial Intelligence

arXiv CS Mar 5

Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

arXiv:2603.03310v1 Announce Type: new Abstract: Modern large language model (LLM) inference engines optimize throughput and latency under fixed decoding rules, treating generation as a linear...

Psychology Software

arXiv CS Mar 5

The Logovista English-Japanese Machine Translation System

arXiv:2603.03311v1 Announce Type: new Abstract: This paper documents the architecture, development practices, and preserved artifacts of the Logovista English-Japanese machine translation system, a large, explicitly...

Technology Engineering

arXiv CS Mar 5

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

arXiv:2603.03312v1 Announce Type: new Abstract: Decoding natural language from non-invasive EEG signals is a promising yet challenging task. However, current state-of-the-art models remain constrained by...

Software Neuroscience

arXiv CS Mar 5

How does fine-tuning improve sensorimotor representations in large language models?

arXiv:2603.03313v1 Announce Type: new Abstract: Large Language Models (LLMs) exhibit a significant "embodiment gap", where their text-based representations fail to align with human sensorimotor experiences....

Biology Policy

arXiv CS Mar 5

Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

arXiv:2603.03314v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated remarkable and steadily improving performance across a wide range of tasks. However, LLM performance...

Software Psychology