digitado – Page 190

Dissecting Subjectivity and the “Ground Truth” Illusion in Data Annotation

digitado ⋅ 13 February 2026

arXiv:2602.11318v1 Announce Type: new Abstract: In machine learning, “ground truth” refers to the assumed correct labels used to train and evaluate models. However, the foundational “ground truth” paradigm rests on a positivistic fallacy that treats human disagreement as technical noise rather than a vital sociotechnical signal. This systematic literature review analyzes research published between 2020 and 2025 across seven premier venues: ACL, AIES, CHI, CSCW, EAAMO, FAccT, and NeurIPS, investigating the mechanisms in data annotation practices that facilitate […]

technocracy

Continuous-Utility Direct Preference Optimization

digitado ⋅ 3 February 2026

arXiv:2602.00931v1 Announce Type: new Abstract: Large language model reasoning is often treated as a monolithic capability, relying on binary preference supervision that fails to capture partial progress or fine-grained reasoning quality. We introduce Continuous Utility Direct Preference Optimization (CU-DPO), a framework that aligns models to a portfolio of prompt-based cognitive strategies by replacing binary labels with continuous scores that capture fine-grained reasoning quality. We prove that learning with K strategies yields a Θ(K log K) improvement in sample […]

Colosseum: Auditing Collusion in Cooperative Multi-Agent Systems

digitado ⋅ 18 February 2026

arXiv:2602.15198v1 Announce Type: new Abstract: Multi-agent systems, where LLM agents communicate through free-form language, enable sophisticated coordination for solving complex cooperative tasks. This surfaces a unique safety problem when individual agents form a coalition and collude to pursue secondary goals and degrade the joint objective. In this paper, we present Colosseum, a framework for auditing LLM agents’ collusive behavior in multi-agent settings. We ground how agents cooperate through a Distributed Constraint Optimization Problem (DCOP) and measure collusion via […]

Model Agnostic Differentially Private Causal Inference

digitado ⋅ 2 February 2026

arXiv:2505.19589v3 Announce Type: replace-cross Abstract: Estimating causal effects from observational data is essential in fields such as medicine, economics and social sciences, where privacy concerns are paramount. We propose a general, model-agnostic framework for differentially private estimation of average treatment effects (ATE) that avoids strong structural assumptions on the data-generating process or the models used to estimate propensity scores and conditional outcomes. In contrast to prior work, which enforces differential privacy by directly privatizing these nuisance components, our […]

Open Responses

digitado ⋅ 16 January 2026

This is the standardization effort I’ve most wanted in the world of LLMs: a vendor-neutral specification for the JSON API that clients can use to talk to hosted LLMs. Open Responses aims to provide exactly that as a documented standard, derived from OpenAI’s Responses API. I was hoping for one based on their older Chat Completions API, since so many other products have cloned it already, but basing it on Responses does make sense since that […]

Differentiable Ghost Operators: A Physics-Informed Neural Turing Machine for Cross-Sectional Stock Selection

digitado ⋅ 17 April 2026

Extracting Alpha in extreme low signal-to-noise ratio (SNR) environments, such as the Chinese A-share market, remains a notoriously unsolved challenge for deep learning. Traditional heavily parameterized models, including Transformers, inevitably fall into the “dimensionality disaster,” memorizing market noise rather than fundamental mechanics. To break this overfitting curse, we propose a novel, ultra-lightweight architecture inspired by thermodynamics and Neural Turing Machines (NTMs): the Physics-Informed Ghost Operator. By mapping the cross-sectional stock market into a high-dimensional physical manifold, […]

Low performing pixel correction in computed tomography with unrolled network and synthetic data training

digitado ⋅ 30 January 2026

arXiv:2601.20995v1 Announce Type: new Abstract: Low performance pixels (LPP) in Computed Tomography (CT) detectors lead to ring and streak artifacts in the reconstructed images, making them clinically unusable. In recent years, several solutions have been proposed to correct LPP artifacts, either in the image domain or in the sinogram domain, using supervised deep learning methods. However, these methods require dedicated datasets for training, which are expensive to collect. Moreover, existing approaches focus solely on either image-space or […]

New Adaptive Numerical Methods Based on Dual Formulation of Hyperbolic Conservation Laws

digitado ⋅ 29 January 2026

arXiv:2601.20000v1 Announce Type: new Abstract: In this paper, we propose an adaptive high-order method for hyperbolic systems of conservation laws. The proposed method is based on a dual formulation approach: Two numerical solutions, corresponding to conservative and nonconservative formulations of the same system, are evolved simultaneously. Since nonconservative schemes are known to produce nonphysical weak solutions near discontinuities, we exploit the difference between these two solutions to construct a smoothness indicator (SI). In smooth regions, the difference between […]

TSSR: Two-Stage Swap-Reward-Driven Reinforcement Learning for Character-Level SMILES Generation

digitado ⋅ 8 January 2026

The design of reliable, valid, and diverse molecules is fundamental to modern drug discovery, as improved molecular generation supports efficient exploration of the chemical space for potential drug candidates and reduces the cost of early design efforts. Despite these needs, current chemical language models that generate molecules as SMILES strings are vulnerable to compounding token errors: many samples are unparseable or chemically implausible, and hard constraints meant to prevent failure can restrict exploration. To address this gap, we […]

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

digitado ⋅ 24 February 2026

Large language models (LLMs) are becoming the foundation for autonomous agents that can use tools to solve complex tasks. Reinforcement learning (RL) has emerged as a common approach for injecting such agentic capabilities, but typically under tightly controlled training setups. It often depends on carefully constructed task-solution pairs and substantial human supervision, which creates a fundamental obstacle to open-ended self-evolution toward superintelligent systems. In this paper, we propose the Tool-R0 framework for training general-purpose tool-calling agents from scratch […]
