April 2026

What’s new in pip 26.1 – lockfiles and dependency cooldowns!

digitado ⋅ 28 April 2026

Richard Si describes an excellent set of upgrades to pip, Python’s default tool for installing dependencies. This version drops support for Python 3.9 – fair enough, since it’s been EOL since October. macOS still ships Python 3.9 as its default python3, so I tried out the new pip version against Python 3.14 like this:

    uv python install 3.14
    mkdir /tmp/experiment
    cd /tmp/experiment
    python3.14 -m venv venv
    source […]

technocracy

Agentic AI in Action — Part 20 — Building an AI Digital Twin Agent for Scenario Simulation

digitado ⋅ 28 April 2026

Most analytics systems today are designed to answer questions about the past. Dashboards and reporting tools help organizations understand what happened, and predictive models attempt to forecast what may happen next. While these capabilities are extremely valuable, decision makers often need something slightly different. They need a way to explore what might happen if a decision is made. In other words, they need the ability to test scenarios before […]

technocracy

RedParrot: Accelerating NL-to-DSL for Business Analytics via Query Semantic Caching

digitado ⋅ 28 April 2026

arXiv:2604.22758v1 Announce Type: new Abstract: Recently, at Xiaohongshu, the rapid expansion of e-commerce and advertising demands real-time business analytics with high accuracy and low latency. To meet this demand, systems typically rely on converting natural language (NL) queries into Domain-Specific Languages (DSLs) to ensure semantic consistency, validation, and portability. However, existing multi-stage LLM pipelines for this NL-to-DSL task suffer from prohibitive latency, high cost, and error propagation, rendering them unsuitable for enterprise-scale deployment. In this paper, we propose […]

technocracy

StratRAG: A Multi-Hop Retrieval Evaluation Dataset for Retrieval-Augmented Generation Systems

digitado ⋅ 28 April 2026

arXiv:2604.22757v1 Announce Type: new Abstract: We introduce StratRAG, an open-source retrieval evaluation dataset for benchmarking Retrieval-Augmented Generation (RAG) systems on multi-hop reasoning tasks under realistic, noisy document-pool conditions. Derived from HotpotQA (distractor setting), StratRAG comprises 2,200 examples across three question types — bridge, comparison, and yes-no — each paired with a pool of 15 candidate documents containing exactly 2 gold documents and 13 topically related distractors. We benchmark three retrieval strategies — BM25, dense retrieval (all-MiniLM-L6-v2), and hybrid […]
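To make the evaluation setup concrete, here is a minimal sketch of how a retriever could be scored on one such example: a pool of 15 candidate documents, 2 of them gold. The excerpt does not say which metrics StratRAG reports, so recall@k is an assumption, and the document ids and the ranking below are hypothetical placeholders rather than data from the benchmark.

```python
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], gold_ids: Set[str], k: int) -> float:
    """Return the fraction of gold documents found in the top-k ranked results."""
    if not gold_ids:
        raise ValueError("gold_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & gold_ids) / len(gold_ids)


# Hypothetical pool of 15 candidate ids with exactly 2 gold documents.
pool = [f"doc{i}" for i in range(15)]
gold = {"doc3", "doc11"}

# Hypothetical ranking produced by some retriever (BM25, dense, or hybrid).
head = ["doc3", "doc7", "doc0", "doc11"]
ranking = head + [d for d in pool if d not in head]

print(recall_at_k(ranking, gold, 2))  # one of two gold docs in the top 2
print(recall_at_k(ranking, gold, 5))  # both gold docs in the top 5
```

Averaging this score over all examples, separately for each retrieval strategy, would give one comparable number per strategy.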

technocracy

Your Reviews Replicate You: LLM-Based Agents as Customer Digital Twins for Conjoint Analysis

digitado ⋅ 28 April 2026

arXiv:2604.22756v1 Announce Type: new Abstract: Conjoint analysis is a cornerstone of market research for estimating consumer preferences; however, traditional methods face persistent challenges regarding time, cost, and respondent fatigue. To address these limitations, this study proposes a framework that utilizes large language model (LLM)-based “customer digital twins (CDT)” as virtual respondents. We identified active users within the Reddit community and aggregated their comprehensive review histories to construct individualized vector databases. By integrating retrieval-augmented generation (RAG) with prompt engineering, […]

technocracy

RADIANT-LLM: an Agentic Retrieval Augmented Generation Framework for Reliable Decision Support in Safety-Critical Nuclear Engineering

digitado ⋅ 28 April 2026

arXiv:2604.22755v1 Announce Type: new Abstract: Reliable decision support in nuclear engineering requires traceable, domain-grounded knowledge retrieval, yet safety and risk analysis workflows remain hampered by fragmented documentation and by hallucination when pre-trained large language models (LLMs) are used in specialized nuclear domains. To address these challenges, this paper presents RADIANT-LLM (Retrieval-Augmented, Domain-Intelligent Agent for Nuclear Technologies using LLM), a multi-modal retrieval-augmented generation (RAG) framework designed for nuclear safety, security, and safeguards applications. The framework uses a local-first, model-agnostic architecture that […]

technocracy

HalalBench: A Multilingual OCR Benchmark for Food Packaging Ingredient Extraction

digitado ⋅ 28 April 2026

arXiv:2604.22754v1 Announce Type: new Abstract: No standardized benchmark exists for evaluating OCR on food packaging, despite its critical role in automated halal food verification. Existing benchmarks target documents or scene text, missing the unique challenges of ingredient labels: curved surfaces, dense multilingual text, and sub-8pt fonts. We present HalalBench, the first open multilingual benchmark for food packaging OCR, comprising 1,043 images (50 real, 993 synthetic) with 36,438 annotations in COCO format spanning 14 languages. We evaluate four engines: […]

technocracy

Introducing talkie: a 13B vintage language model from 1930

digitado ⋅ 28 April 2026

A new project from Nick Levine, David Duvenaud, and Alec Radford (of GPT, GPT-2, Whisper fame). talkie-1930-13b-base (53.1 GB) is a “13B language model trained on 260B tokens of historical pre-1931 English text”. talkie-1930-13b-it (26.6 GB) is a checkpoint “finetuned using a novel dataset of instruction-response pairs extracted from pre-1931 reference works”, designed to power a chat interface. You can try that out here. Both models are Apache 2.0 licensed. […]

technocracy

microsoft/VibeVoice

digitado ⋅ 28 April 2026

VibeVoice is Microsoft’s Whisper-style audio model for speech-to-text, MIT licensed and with speaker diarization built into the model. Microsoft released it on January 21st, 2026, but I hadn’t tried it until today. Here’s a one-liner to run it on a Mac with uv, mlx-audio (by Prince Canuma) and the 5.71GB mlx-community/VibeVoice-ASR-4bit MLX conversion of the 17.3GB VibeVoice-ASR model, in this case against a downloaded copy of my recent podcast appearance with Lenny Rachitsky: uv run --with mlx-audio […]

technocracy

Musk and Altman face off in trial that will determine OpenAI’s future

digitado ⋅ 27 April 2026

A hotly anticipated trial starts this week, where Elon Musk will attempt to prove that OpenAI, under Sam Altman, has abandoned its mission to remain a nonprofit in order to ensure that artificial intelligence serves humanity, and not just billionaires. Many view the lawsuit as a grudge match between Musk—who left OpenAI after serving as an early major donor and advisor—and Altman—who currently runs OpenAI, despite insiders’ allegedly growing distrust in his commitment to the dominant AI firm’s […]
