March 2026

Interpretable Perception and Reasoning for Audiovisual Geolocation

digitado ⋅ March 9, 2026

arXiv:2603.05708v1 Announce Type: new Abstract: While recent advances in Multimodal Large Language Models (MLLMs) have improved image-based localization, precise global geolocation remains a formidable challenge due to the inherent ambiguity of visual landscapes and the largely untapped potential of auditory cues. In this paper, we introduce Audiovisual Geolocation, a framework designed to resolve geographic ambiguity through interpretable perception and reasoning. We present AVG, a high-quality global-scale video benchmark for geolocation, comprising 20,000 curated clips across 1,000 distinct locations. […]

technocracy

Reasoning Models Struggle to Control their Chains of Thought

digitado ⋅ March 9, 2026

arXiv:2603.05706v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if models can control what they verbalize in their CoT, it could undermine CoT monitorability. To measure this undesirable capability — CoT controllability — we introduce the CoT-Control evaluation suite, which includes tasks that require models to solve problems while adhering to CoT instructions, e.g., reasoning about a genetics question without using the word […]

technocracy

Querying with Conflicts of Interest

digitado ⋅ March 9, 2026

arXiv:2603.05704v1 Announce Type: new Abstract: Conflicts of interest often arise between data sources and their users regarding how the users’ information needs should be interpreted by the data source. For example, an online product search might be biased towards presenting certain products higher than in its list of results to improve its revenue, which may not follow the user’s desired ranking expressed in their query. The research community has proposed schemes for data systems to implement to ensure […]

technocracy

Towards Robust Retrieval-Augmented Generation Based on Knowledge Graph: A Comparative Analysis

digitado ⋅ March 9, 2026

arXiv:2603.05698v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) was introduced to enhance the capabilities of Large Language Models (LLMs) beyond their encoded prior knowledge. This is achieved by providing LLMs with an external source of knowledge, which helps reduce factual hallucinations and enables access to new information not available during pretraining. However, inconsistent retrieved information can negatively affect LLM responses. The Retrieval-Augmented Generation Benchmark (RGB) was introduced to evaluate the robustness of RAG systems under such conditions. In […]

technocracy

MultiHaystack: Benchmarking Multimodal Retrieval and Reasoning over 40K Images, Videos, and Documents

digitado ⋅ March 9, 2026

arXiv:2603.05697v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) achieve strong performance on benchmarks that evaluate text, image, or video understanding separately. However, these settings do not assess a critical real-world requirement, which involves retrieving relevant evidence from large, heterogeneous multimodal corpora prior to reasoning. Most existing benchmarks restrict retrieval to small, single-modality candidate sets, substantially simplifying the search space and overstating end-to-end reliability. To address this gap, we introduce MultiHaystack, the first benchmark designed to evaluate […]

technocracy

Autonomous Algorithm Discovery for Ptychography via Evolutionary LLM Reasoning

digitado ⋅ March 9, 2026

arXiv:2603.05696v1 Announce Type: new Abstract: Ptychography is a computational imaging technique widely used for high-resolution materials characterization, but high-quality reconstructions often require the use of regularization functions that largely remain manually designed. We introduce Ptychi-Evolve, an autonomous framework that uses large language models (LLMs) to discover and evolve novel regularization algorithms. The framework combines LLM-driven code generation with evolutionary mechanisms, including semantically-guided crossover and mutation. Experiments on three challenging datasets (X-ray integrated circuits, low-dose electron microscopy of apoferritin, […]

technocracy

Warm Starting State-Space Models with Automata Learning

digitado ⋅ March 9, 2026

arXiv:2603.05694v1 Announce Type: new Abstract: We prove that Moore machines can be exactly realized as state-space models (SSMs), establishing a formal correspondence between symbolic automata and these continuous machine learning architectures. These Moore-SSMs preserve both the complete symbolic structure and input-output behavior of the original Moore machine, but operate in Euclidean space. With this correspondence, we compare the training of SSMs with both passive and active automata learning. In recovering automata from the SYNTCOMP benchmark, we show that […]

technocracy

Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks

digitado ⋅ March 9, 2026

arXiv:2603.05692v1 Announce Type: new Abstract: Breakthroughs in the generative AI domain have fueled an explosion of large language model (LLM)-powered applications, whose workloads fundamentally consist of sequences of inferences through transformer architectures. Within this rapidly expanding ecosystem, dense LLMs–those that activate all model parameters for each token generation–form the foundation for advanced expert-based variants. Dense models continue to dominate because of their strong generalization ability, scalability, ease of fine-tuning, and versatility across diverse tasks. In LLM inference systems, […]

technocracy

FreeTxt-Vi: A Benchmarked Vietnamese-English Toolkit for Segmentation, Sentiment, and Summarisation

digitado ⋅ March 9, 2026

arXiv:2603.05690v1 Announce Type: new Abstract: FreeTxt-Vi is a free and open-source web-based toolkit for creating and analysing bilingual Vietnamese-English text collections. Positioned at the intersection of corpus linguistics and natural language processing (NLP), it enables users to build, explore, and interpret free-text data without requiring programming expertise. The system combines corpus analysis features such as concordancing, keyword analysis, word relation exploration, and interactive visualisation with transformer-based NLP components for sentiment analysis and summarisation. […]

technocracy

SecureRAG-RTL: A Retrieval-Augmented, Multi-Agent, Zero-Shot LLM-Driven Framework for Hardware Vulnerability Detection

digitado ⋅ March 9, 2026

arXiv:2603.05689v1 Announce Type: new Abstract: Large language models (LLMs) have shown remarkable capabilities in natural language processing tasks, yet their application in hardware security verification remains limited due to scarcity of publicly available hardware description language (HDL) datasets. This knowledge gap constrains LLM performance in detecting vulnerabilities within HDL designs. To address this challenge, we propose SecureRAG-RTL, a novel Retrieval-Augmented Generation (RAG)-based approach that significantly enhances LLM-based security verification of hardware designs. Our approach integrates domain-specific retrieval with […]
