February 2026

Architecture-Agnostic Curriculum Learning for Document Understanding: Empirical Evidence from Text-Only and Multimodal

digitado ⋅ 26 February 2026

arXiv:2602.21225v1 Announce Type: new Abstract: We investigate whether progressive data scheduling, a curriculum learning strategy that incrementally increases training data exposure (33% → 67% → 100%), yields consistent efficiency gains across architecturally distinct document understanding models. By evaluating BERT (text-only, 110M parameters) and LayoutLMv3 (multimodal, 126M parameters) on the FUNSD and CORD benchmarks, we establish that this schedule reduces wall-clock training time by approximately 33%, commensurate with the reduction from 10.0 to 6.67 effective epoch-equivalents of data. To isolate curriculum […]
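
The epoch-equivalent figures in the abstract can be reproduced with a short calculation. This is only a sketch of the arithmetic: it assumes a budget of 10 epochs split evenly across the three phases, which the truncated abstract does not confirm, and the function name `effective_epochs` is hypothetical.

```python
from fractions import Fraction


def effective_epochs(total_epochs, data_fractions):
    """Full-dataset epoch-equivalents consumed by a phased schedule.

    Assumption: the epoch budget is divided evenly across the phases,
    and each phase trains on the given fraction of the dataset.
    """
    epochs_per_phase = Fraction(total_epochs, len(data_fractions))
    return sum(epochs_per_phase * f for f in data_fractions)


# Assumed schedule: 33% -> 67% -> 100% of the data, taken as exact thirds.
schedule = [Fraction(1, 3), Fraction(2, 3), Fraction(1)]

baseline = effective_epochs(10, [Fraction(1)])  # every epoch sees all data
curriculum = effective_epochs(10, schedule)     # phased exposure
saving = 1 - curriculum / baseline              # relative reduction

print(float(baseline), float(curriculum), float(saving))
```

Under these assumptions the curriculum processes 20/3 (about 6.67) epoch-equivalents instead of 10, a reduction of one third, which is consistent with the reported wall-clock saving of roughly 33%.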


Make Every Draft Count: Hidden State based Speculative Decoding

digitado ⋅ 26 February 2026

arXiv:2602.21224v1 Announce Type: new Abstract: Speculative decoding has emerged as a pivotal technique to accelerate LLM inference by employing a lightweight draft model to generate candidate tokens that are subsequently verified by the target model in parallel. However, while this paradigm successfully increases the arithmetic intensity of memory-bound inference, it causes significant compute inefficiency: the majority of draft tokens fail verification and are discarded, resulting in waste of computation. Motivated by the goal of recollecting this wasted computation, […]
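
The draft-then-verify loop described above, and the waste it produces, can be illustrated with a toy greedy variant. This is a minimal sketch and not the paper's method: the "models" are hypothetical functions from a token list to the next token, verification is written as a sequential loop where a real system would use one parallel forward pass of the target model, and `speculative_step`, `toy_target`, and `toy_draft` are illustrative names.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken,
                     target: NextToken, k: int) -> Tuple[List[int], int]:
    """One greedy draft-and-verify step.

    Returns the tokens to append and the number of discarded draft tokens.
    """
    # 1. The draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposal.append(token)
        ctx.append(token)

    # 2. The target model checks each proposed token in order.
    accepted: List[int] = []
    ctx = list(prefix)
    for i, token in enumerate(proposal):
        expected = target(ctx)
        if token != expected:
            # First mismatch: keep the target's token, drop the rest.
            accepted.append(expected)
            return accepted, k - i
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target(ctx))
    return accepted, 0


def toy_target(ctx: List[int]) -> int:
    # Hypothetical target model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    # Hypothetical draft model: agrees with the target except after a 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


tokens, wasted = speculative_step([1], toy_draft, toy_target, k=4)
print(tokens, wasted)
```

In this toy run the draft proposes four tokens, two are accepted, and two are thrown away. That discarded work is the inefficiency the abstract sets out to recover.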


Measuring Pragmatic Influence in Large Language Model Instructions

digitado ⋅ 26 February 2026

arXiv:2602.21223v1 Announce Type: new Abstract: It is not only what we ask large language models (LLMs) to do that matters, but also how we prompt. Phrases like “This is urgent” or “As your supervisor” can shift model behavior without altering task content. We study this effect as pragmatic framing, contextual cues that shape directive interpretation rather than task specification. While prior work exploits such cues for prompt optimization or probes them as security vulnerabilities, pragmatic framing itself has […]


Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases

digitado ⋅ 26 February 2026

arXiv:2602.21222v1 Announce Type: new Abstract: Parameter efficient fine tuning methods like LoRA have enabled task specific adaptation of large language models, but efficiently composing multiple specialized adapters for unseen tasks remains challenging. We present a novel framework for dynamic LoRA adapter composition that leverages similarity retrieval in vector databases to enable zero-shot generalization across diverse NLP tasks. Our approach constructs a task-aware vector database by embedding training examples from 22 datasets spanning commonsense reasoning, question answering, natural language […]
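
The retrieval idea can be sketched with a tiny in-memory index standing in for a vector database. The abstract is truncated before it describes how retrieved examples are turned into adapter weights, so the weighting rule below (similarity mass per task, normalised to sum to 1) is an assumption made for illustration. The embeddings, task labels, and the function `adapter_weights` are likewise hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def adapter_weights(query: Sequence[float],
                    index: List[Tuple[Sequence[float], str]],
                    k: int) -> Dict[str, float]:
    """Retrieve the k nearest training examples and weight each task's
    adapter by its share of the retrieved similarity mass (assumed rule)."""
    scored = sorted(((cosine(query, emb), task) for emb, task in index),
                    reverse=True)[:k]
    totals: Dict[str, float] = defaultdict(float)
    for similarity, task in scored:
        totals[task] += max(similarity, 0.0)  # ignore negative similarity
    mass = sum(totals.values())
    if mass == 0.0:
        return {}
    return {task: value / mass for task, value in totals.items()}


# Hypothetical index: (embedding of a training example, source task).
index = [
    ([1.0, 0.0], "qa"),
    ([0.9, 0.1], "qa"),
    ([0.0, 1.0], "nli"),
    ([0.1, 0.9], "nli"),
]

top2 = adapter_weights([1.0, 0.0], index, k=2)
top3 = adapter_weights([1.0, 0.0], index, k=3)
print(top2, top3)
```

With `k=2` only the question-answering examples are retrieved, so that adapter receives all the weight. With `k=3` a small share goes to the second adapter.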


Latent Context Compilation: Distilling Long Context into Compact Portable Memory

digitado ⋅ 26 February 2026

arXiv:2602.21221v1 Announce Type: new Abstract: Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that complicate concurrent serving. We propose Latent Context Compilation, a framework that fundamentally shifts context processing from adaptation to compilation. By utilizing a disposable LoRA module as a compiler, we distill long contexts into compact buffer tokens — stateless, […]


Field-Theoretic Memory for AI Agents: Continuous Dynamics for Context Preservation

digitado ⋅ 26 February 2026

arXiv:2602.21220v1 Announce Type: new Abstract: We present a memory system for AI agents that treats stored information as continuous fields governed by partial differential equations rather than discrete entries in a database. The approach draws from classical field theory: memories diffuse through semantic space, decay thermodynamically based on importance, and interact through field coupling in multi-agent scenarios. We evaluate the system on two established long-context benchmarks: LoCoMo (ACL 2024) with 300-turn conversations across 35 sessions, and LongMemEval (ICLR […]
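
To make the "diffuse and decay" picture concrete, the sketch below advances a one-dimensional field by one explicit Euler step of a diffusion equation with a linear decay term. It is a generic numerical illustration, not the paper's system: the abstract does not give its equations, so the 1-D grid, the periodic boundary, the single scalar `decay` rate (the abstract ties decay to importance), and the name `field_step` are all assumptions.

```python
from typing import List


def field_step(u: List[float], diffusion: float, decay: float,
               dt: float = 1.0, dx: float = 1.0) -> List[float]:
    """One explicit Euler step of du/dt = D * d2u/dx2 - decay * u
    on a periodic 1-D grid (assumed discretisation)."""
    n = len(u)
    new_u = []
    for i in range(n):
        left, right = u[(i - 1) % n], u[(i + 1) % n]
        laplacian = (left - 2.0 * u[i] + right) / (dx * dx)
        new_u.append(u[i] + dt * (diffusion * laplacian - decay * u[i]))
    return new_u


# A single "memory" stored at the centre of the grid.
field = [0.0, 0.0, 1.0, 0.0, 0.0]

spread = field_step(field, diffusion=0.1, decay=0.0)  # diffusion only
faded = field_step(field, diffusion=0.1, decay=0.2)   # diffusion and decay
print(spread, sum(spread), sum(faded))
```

With decay switched off the memory spreads to its neighbours while the total stays at 1. With a decay rate of 0.2 the total falls to about 0.8 after one step. The explicit scheme is only stable for small values of `diffusion * dt / dx**2` (at most 0.5 in one dimension).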


Reasoning-Based Personalized Generation for Users with Sparse Data

digitado ⋅ 26 February 2026

arXiv:2602.21219v1 Announce Type: new Abstract: Large Language Model (LLM) personalization holds great promise for tailoring responses by leveraging personal context and history. However, real-world users usually possess sparse interaction histories with limited personal context, such as cold-start users in social platforms and newly registered customers in online E-commerce platforms, compromising the LLM-based personalized generation. To address this challenge, we introduce GraSPer (Graph-based Sparse Personalized Reasoning), a novel framework for enhancing personalized text generation under sparse context. GraSPer first […]


EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors

digitado ⋅ 26 February 2026

arXiv:2602.21218v1 Announce Type: new Abstract: High-quality data is essential for modern machine learning, yet many valuable corpora are sensitive and cannot be freely shared. Synthetic data offers a practical substitute for downstream development, and large language models (LLMs) have emerged as powerful engines for generating it. However, existing private text generation methods are severely inefficient: they are data-intensive, computationally slow, and often require large private corpora or batch sizes to achieve usable quality. We introduce EPSVec, a differentially-private […]


Applied Sociolinguistic AI for Community Development (ASA-CD): A New Scientific Paradigm for Linguistically-Grounded Social Intervention

digitado ⋅ 26 February 2026

arXiv:2602.21217v1 Announce Type: new Abstract: This paper establishes Applied Sociolinguistic AI for Community Development (ASA-CD) as a novel scientific paradigm for addressing community challenges through linguistically grounded, AI-enabled intervention. ASA-CD introduces three key contributions: (1) linguistic biomarkers as computational indicators of discursive fragmentation; (2) development-aligned natural language processing (NLP), an AI optimisation paradigm prioritising collective outcomes; and (3) a standardised five-phase protocol for discursive intervention. A proof-of-concept study, incorporating real-world and synthetic corpora, demonstrates systematic associations between exclusionary […]


EQ-5D Classification Using Biomedical Entity-Enriched Pre-trained Language Models and Multiple Instance Learning

digitado ⋅ 26 February 2026

arXiv:2602.21216v1 Announce Type: new Abstract: The EQ-5D (EuroQol 5-Dimensions) is a standardized instrument for the evaluation of health-related quality of life. In health economics, systematic literature reviews (SLRs) depend on the correct identification of publications that use the EQ-5D, but manual screening of large volumes of scientific literature is time-consuming, error-prone, and inconsistent. In this study, we investigate fine-tuning of general-purpose (BERT) and domain-specific (SciBERT, BioBERT) pre-trained language models (PLMs), enriched with biomedical entity information extracted through scispaCy […]
