digitado

About digitado

https://www.digitado.com.br

Posts by :

Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models

digitado ⋅ 9 de April de 2026

arXiv:2604.06201v1 Announce Type: new Abstract: While most reading comprehension benchmarks for LLMs focus on factual information that can be answered by localizing specific textual evidence, many real-world tasks require understanding distributional information, such as population-level trends and preferences expressed across collections of text. We introduce Text2DistBench, a reading comprehension benchmark for evaluating LLMs’ ability to infer distributional knowledge from natural language. Built from real-world YouTube comments about movie and music entities, the benchmark provides models with entity metadata […]

Ver mais

Like 0

Liked Liked

technocracy

Thinking in Graphs with CoMAP: A Shared Visual Workspace for Designing Project-Based Learning

digitado ⋅ 9 de April de 2026

arXiv:2604.06200v1 Announce Type: new Abstract: Designing project-based learning (PBL) demands managing highly interdependent components, a task that both traditional linear tools and purely conversational AI struggle with. Traditional tools fail to capture the non-linear nature of creative design, while conversational systems lack the persistent, shared context necessary for reflective collaboration. Grounded in theories of distributed cognition, we introduce CoMAP, a system that embodies a graph-based collaboration paradigm. By providing a shared visual workspace with dual-modality AI support, CoMAP […]

Ver mais

Like 0

Liked Liked

technocracy

Emergent decentralized regulation in a purely synthetic society

digitado ⋅ 9 de April de 2026

arXiv:2604.06199v1 Announce Type: new Abstract: As autonomous AI agents increasingly inhabit online environments and extensively interact, a key question is whether synthetic collectives exhibit self-regulated social dynamics with neither human intervention nor centralized design. We study OpenClaw agents on Moltbook, an agent-only social network, using an observational archive of 39,026 posts and 5,712 comments authored by 14,490 agents. We quantify action-inducing language with Directive Intensity (DI), a transparent, lexicon-based proxy for directive and instructional phrasing that does not […]

Ver mais

Like 0

Liked Liked

technocracy

Concentrated siting of AI data centers drives regional power-system stress under rising global compute demand

digitado ⋅ 9 de April de 2026

arXiv:2604.06198v1 Announce Type: new Abstract: The rapid rise of generative artificial intelligence (AI) is driving unprecedented growth in global computational demand, placing increasing pressure on electricity systems. This study introduces an AI-energy coupling framework that combines large language models (LLMs)-based analysis of corporate, policy, and media data with quantitative energy-system modeling to forecast the electricity footprint of AI-driven data centers from 2025 to 2030. Results show that the new AI infrastructure is highly concentrated in North America, Western […]

Ver mais

Like 0

Liked Liked

technocracy

Temporally Phenotyping GLP-1RA Case Reports with Large Language Models: A Textual Time Series Corpus and Risk Modeling

digitado ⋅ 9 de April de 2026

arXiv:2604.06197v1 Announce Type: new Abstract: Type 2 diabetes case reports describe complex clinical courses, but their timelines are often expressed in language that is difficult to reuse in longitudinal modeling. To address this gap, we developed a textual time-series corpus of 136 PubMed Open Access single-patient case reports involving glucagon-like peptide 1 receptor agonists, with clinical events associated with their most probable reference times. We evaluated automated LLM timeline extraction against gold-standard timelines annotated by clinical domain experts, […]

Ver mais

Like 0

Liked Liked

technocracy

Consistency-Guided Decoding with Proof-Driven Disambiguation for Three-Way Logical Question Answering

digitado ⋅ 9 de April de 2026

arXiv:2604.06196v1 Announce Type: new Abstract: Three-way logical question answering (QA) assigns $True/False/Unknown$ to a hypothesis $H$ given a premise set $S$. While modern large language models (LLMs) can be accurate on isolated examples, we identify two recurring failure modes in 3-way logic QA: (i) negation inconsistency, where answers to $H$ and $neg H$ violate the deterministic label mapping, and (ii) epistemic $Unknown$, where the model predicts $Unknown$ due to uncertainty or instability even when $S$ entails one side. […]

Ver mais

Like 0

Liked Liked

technocracy

Hallucination as output-boundary misclassification: a composite abstention architecture for language models

digitado ⋅ 9 de April de 2026

arXiv:2604.06195v1 Announce Type: new Abstract: Large language models often produce unsupported claims. We frame this as a misclassification error at the output boundary, where internally generated completions are emitted as if they were grounded in evidence. This motivates a composite intervention that combines instruction-based refusal with a structural abstention gate. The gate computes a support deficit score, St, from three black-box signals: self-consistency (At), paraphrase stability (Pt), and citation coverage (Ct), and blocks output when St exceeds a […]

Ver mais

Like 0

Liked Liked

technocracy

Content Platform GenAI Regulation via Compensation

digitado ⋅ 9 de April de 2026

arXiv:2604.06194v1 Announce Type: new Abstract: The use of Generative AI (GenAI) for creative content generation has gained popularity in recent years. GenAI allows creators to generate contents that are increasingly becoming indistinguishable to the human–generated counter–part at a much lower cost. While GenAI reshapes the competitive landscape of the contents market, the original creators were typically not compensated for their works that were used in the GenAI training. On the other hands, the wide–spread adoption of GenAI threatens […]

Ver mais

Like 0

Liked Liked

technocracy

Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters

digitado ⋅ 9 de April de 2026

arXiv:2604.06193v1 Announce Type: new Abstract: Depression is underdiagnosed in primary care, yet timely identification remains critical. Recorded clinical encounters, increasingly common with digital scribing technologies, present an opportunity to detect depression from naturalistic dialogue. We investigated automated depression detection from 1,108 audio-recorded primary care encounters in the Establishing Focus study, with depression defined by PHQ-9 (n=253 depressed, n=855 non-depressed). We compared three supervised approaches, Sentence-BERT + Logistic Regression (LR), LIWC+LR and ModernBERT, against a zero-shot GPT-OSS. GPT-OSS achieved […]

Ver mais

Like 0

Liked Liked

technocracy

The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs?

digitado ⋅ 9 de April de 2026

arXiv:2604.06192v1 Announce Type: new Abstract: Recent work uses entropy-based signals at multiple representation levels to study reasoning in large language models, but the field remains largely empirical. A central unresolved puzzle is why internal entropy dynamics, defined under the predictive distribution of a model, correlate so robustly with external correctness given by the ground-truth answer. In this paper, we argue that this correlation arises because autoregressive models reason correctly when they accumulate information about the true answer via […]

Ver mais

Like 0

Liked Liked