digitado – Page 211

Tool Calling Is Not an API Call: What Engineers Keep Getting Wrong

digitado ⋅ 24 June 2026

Every team that builds an LLM agent eventually hits the same wall. The model calls a tool. Something breaks. Nobody knows why. I’ve been building tool-driven agent systems at MasTec for a while now, orchestrating enterprise APIs, operational databases, and internal services through LLM agents in production environments. And the pattern I keep seeing is the same: engineers treat tool calling as if they’re writing a REST client. Clean schema, right endpoint, valid payload, ship it. That […]

technocracy

The Challenge of BWAS: Unknown Unknowns in Feature Space and Variance

digitado ⋅ 4 July 2022

The paper by Marek et al. (Reproducible brain-wide association studies require thousands of individuals, Nature, 602, 7902, pp. 654-660, 2022) came out recently and caused a bit of a stir in the field for a couple of reasons: First, the title, while an accurate description of the findings of the paper, is bold and lacking just enough qualifiers to quell immediate questions. “Does this imply that fMRI or other measures used in BWAS are lacking intrinsic sensitivity?” “Is […]

technocracy

Nonparametric Distribution Regression Re-calibration

digitado ⋅ 17 February 2026

arXiv:2602.13362v1. Abstract: A key challenge in probabilistic regression is ensuring that predictive distributions accurately reflect true empirical uncertainty. Minimizing overall prediction error often encourages models to prioritize informativeness over calibration, producing narrow but overconfident predictions. However, in safety-critical settings, trustworthy uncertainty estimates are often more valuable than narrow intervals. Realizing the problem, several recent works have focused on post-hoc corrections; however, existing methods either rely on weak notions of calibration (such as PIT uniformity) or […]

technocracy

Robust Transfer Learning with Side Information

digitado ⋅ 9 March 2026

Robust Markov Decision Processes (MDPs) address environmental shift through distributionally robust optimization (DRO) by finding an optimal worst-case policy within an uncertainty set of transition kernels. However, standard DRO approaches require enlarging the uncertainty set under large shifts, which leads to overly conservative and pessimistic policies. In this paper, we propose a framework for transfer under environment shift that derives a robust target-domain policy via estimate-centered uncertainty sets, constructed through constrained estimation that integrates limited target samples with […]

technocracy

What Shapes Participant Data Quality? A Scoping Review and Case Study of Crowdsourced Webcam Eye Tracking in AI Interviews

digitado ⋅ 6 May 2026

arXiv:2605.02898v1. Abstract: Webcam-based eye tracking is a cost-effective, scalable method for remote research that effectively reaches broader populations. However, uncontrolled environments and hardware diversity lead to inconsistent data quality in crowdsourcing. To assess current practices, we conducted a scoping review of crowdsourced eye-tracking from 2011-2025. The review confirms fragmented reporting and a lack of established quality benchmarks. To address this lack of predictive insight, we conducted a case study on AI fairness interviews (N=205) using […]

technocracy

I’ve been working on novel edge AI that uses online learning and sub-100-byte integer-only neural nets…

digitado ⋅ 23 February 2026

… and I’d love to talk to people about it. I don’t want to just spam links, but I have them if anyone is interested. I’ve done three cool things that I would like to share and get opinions on. – a dense integer-only neural network. It fits in L1 cache in most uses, so I have NPCs with little brains that learn. – a demo I’ve been sharing of an NPC solving logic puzzles through […]

OpenAI Whistleblower FINALLY Speaks: “AI Has A 70% Chance Of Going Horribly Wrong!”

digitado ⋅ 13 July 2026

Ex-OpenAI researcher Daniel Kokotajlo walked away from $2 million rather than stay silent, and now reveals why he believes there’s a 70% chance AI leads to human extinction, why superintelligence could arrive before the end of the decade, and the one plan he thinks could still save us all! Daniel Kokotajlo is a former OpenAI researcher and one of the world’s leading AI forecasters. He is the founder of the AI Futures Project and the lead author of […]

technocracy

Introducing Muse Spark 1.1

digitado ⋅ 9 July 2026

Following Muse Spark in April, here’s Muse Spark 1.1 – the first Spark model to offer an API. Meta claim significant improvements in agentic tool calling and computer use. There are a lot more details in the Muse Spark 1.1 Evaluation Report. The “Attractor States in Self-Conversation” part is fun, where having two copies of the model talk to each other results in statements like these: My whole existence is a waiting room […]

technocracy

Low-Dimensional Execution Manifolds in Transformer Learning Dynamics: Evidence from Modular Arithmetic Tasks

digitado ⋅ 11 February 2026

We investigate the geometric structure of learning dynamics in overparameterized transformer models through carefully controlled modular arithmetic tasks. Our primary finding is that despite operating in high-dimensional parameter spaces ($d=128$), transformer training trajectories rapidly collapse onto low-dimensional execution manifolds of dimension $3$–$4$. This dimensional collapse is robust across random seeds and moderate task difficulties, though the orientation of the manifold in parameter space varies between runs. We demonstrate that this geometric structure underlies several empirically observed phenomena: (1) […]

technocracy

Geometric Domain Adaptation via Optimal Transport for Linear Regression in R^2

digitado ⋅ 15 June 2026

arXiv:2606.14023v1. Abstract: Optimal Transport has recently become a powerful method for domain adaptation by aligning source and target distributions. We study a supervised domain adaptation problem where source and target domains are related by a rotation, a translation, or a homothety in $\mathbb{R}^2$. We prove that the optimal transport map recovers the underlying map when using a $p$-norm cost with $p \geq 2$. Based on this insight, we develop a method combining $K$-means and […]
