digitado

About digitado

https://www.digitado.com.br

Posts by digitado:

The Art of Building Verifiers for Computer Use Agents

digitado ⋅ 9 April 2026

arXiv:2604.06240v1. Abstract: Verifying the success of computer use agent (CUA) trajectories is a critical challenge: without reliable verification, neither evaluation nor training signal can be trusted. In this paper, we present lessons learned from building a best-in-class verifier for web tasks, which we call the Universal Verifier. We design the Universal Verifier around four key principles: 1) constructing rubrics with meaningful, non-overlapping criteria to reduce noise; 2) separating process and outcome rewards that yield complementary signals, […]

technocracy

LLMs Have Made Failure Worth Publishing

digitado ⋅ 9 April 2026

arXiv:2604.06236v1. Abstract: Scientific publishing systematically filters out negative results. We argue that this long-standing asymmetry has become an urgent problem in the era of large language models, which inherit the positive bias of the literature they are trained on, face an impending shortage of high-quality training data, and are increasingly deployed as both research tools and peer reviewers. We analyze three ways in which LLMs have changed the value of failure data and show that […]

technocracy

Negotiating Privacy with Smart Voice Assistants: Risk-Benefit and Control-Acceptance Tensions

digitado ⋅ 9 April 2026

arXiv:2604.06235v1. Abstract: Smart voice assistants (SVAs) are widely adopted by youth, yet privacy decision-making in these environments is often characterized by competing considerations rather than clear-cut preferences. While our prior research has examined privacy risks, benefits, trust, and self-efficacy as distinct predictors of behavior, less attention has been paid to how these factors combine into higher-level tensions that shape privacy outcomes. This study introduces a negotiation-based framework for understanding youth privacy decision-making with SVAs by […]

technocracy

Blind Refusal: Language Models Refuse to Help Users Evade Unjust, Absurd, and Illegitimate Rules

digitado ⋅ 9 April 2026

arXiv:2604.06233v1. Abstract: Safety-trained language models routinely refuse requests for help circumventing rules. But not all rules deserve compliance. When users ask for help evading rules imposed by an illegitimate authority, rules that are deeply unjust or absurd in their content or application, or rules that admit of justified exceptions, refusal is a failure of moral reasoning. We present empirical results documenting this pattern, which we call blind refusal: the tendency of language models […]

technocracy

What Do Humanities Scholars Need? A User Model for Recommendation in Digital Archives

digitado ⋅ 9 April 2026

arXiv:2604.06232v1. Abstract: User models for recommender systems (RecSys) typically assume stable preferences, similarity-based relevance, and session-bounded interactions, assumptions derived from high-volume consumer contexts. This paper investigates these assumptions for humanities scholars working with digital archives. Following a human-centered design approach, we conducted focus groups and analyzed interview data from 18 researchers. Our analysis identifies four dimensions where scholarly information-seeking diverges from common RecSys user modeling: (1) context volatility: preferences shift with research tasks […]

technocracy

Automating Database-Native Function Code Synthesis with LLMs

digitado ⋅ 9 April 2026

arXiv:2604.06231v1. Abstract: Database systems incorporate an ever-growing number of functions in their kernels (a.k.a. database-native functions) for scenarios like new application support and business migration. This growth creates an urgent demand for automatic database-native function synthesis. While recent advances in LLM-based code generation (e.g., Claude Code) show promise, they are too generic for database-specific development. They often hallucinate or overlook critical context because database function synthesis is inherently complex and error-prone, where synthesizing […]

technocracy

Ontology-based knowledge graph infrastructure for interoperable atomistic simulation data

digitado ⋅ 9 April 2026

arXiv:2604.06230v1. Abstract: The reuse of atomistic simulation data is often limited by heterogeneous formats, incomplete metadata, and a lack of standardized representations of workflows and provenance. Here we present an ontology-based infrastructure for representing and integrating atomistic simulation data as a knowledge graph. The approach combines domain ontologies with a software framework that enables data capture both from existing datasets and directly from simulation workflows at the point of generation. Heterogeneous data from multiple sources […]

technocracy

Discoverability matters: Open access models and the translation of science into patents

digitado ⋅ 9 April 2026

arXiv:2604.06229v1. Abstract: Scientific research is a key input into technological innovation, yet not all scientific knowledge is equally mobilized in patents. This paper examines how different scientific publishing models shape both the selection of scientific publications cited in patents and their cognitive alignment with patented technologies. Using large-scale data on non-patent references linking patents to scientific publications, combined with metadata from OpenAlex, we compare the Open Access (OA) structure of patent-cited science to that of […]

technocracy

Probabilistic Language Tries: A Unified Framework for Compression, Decision Policies, and Execution Reuse

digitado ⋅ 9 April 2026

arXiv:2604.06228v1. Abstract: We introduce probabilistic language tries (PLTs), a unified representation that makes explicit the prefix structure implicitly defined by any generative model over sequences. By assigning to each outgoing edge the conditional probability of the corresponding token or action, a PLT simultaneously serves as: (i) an optimal lossless compressor via frequency-weighted interval encoding, generalizing arithmetic coding to model-conditioned distributions; (ii) a policy representation for sequential decision problems including games, search, and robotic control; and […]

technocracy

A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

digitado ⋅ 9 April 2026

arXiv:2604.06227v1. Abstract: Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. This paper makes two contributions. First, we introduce AgriPriceBD, a benchmark dataset of 1,779 daily retail mid-prices for five Bangladeshi commodities (garlic, chickpea, green chilli, cucumber, and sweet pumpkin) spanning July 2020 to June 2025, extracted from government reports via […]
