digitado – Page 498

Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

digitado ⋅ 5 March 2026

arXiv:2603.03310v1. Announce type: new. Abstract: Modern large language model (LLM) inference engines optimize throughput and latency under fixed decoding rules, treating generation as a linear progression in token time. We propose a fundamentally different paradigm: entropic-time inference, where decoding is governed by the flow of uncertainty rather than token index. We introduce a self-organizing inference architecture that jointly couples scheduling, attention sparsification, and sampling temperature under a unified entropy control objective. Our method extends vLLM with entropy-aware scheduling, […]

technocracy

Compressed code: the hidden effects of quantization and distillation on programming tokens

digitado ⋅ 7 January 2026

arXiv:2601.02563v1. Announce type: new. Abstract: Large Language Models (LLMs) have demonstrated exceptional code generation capabilities, yet their token-level mechanisms remain underexplored, particularly in compressed models. Through systematic analysis of programming language token representations, we characterize how programming languages are encoded in LLM tokenizers by analyzing their vocabulary distribution and keyword coverage patterns. We introduce a novel cold-start probability analysis method that provides insights into model behavior without requiring explicit prompts. Additionally, we present a comprehensive evaluation of how […]

technocracy

Great white sharks are overheating

digitado ⋅ 18 April 2026

The evolutionary edge that fueled great white shark dominance for millions of years could soon become its greatest downfall. The ocean’s most iconic predators maintain warmer body temperatures than the surrounding seawater and are paying an increasingly steep price for it. As the oceans warm due to climate change, they now face the risk of potentially fatal overheating, according to a new report in Science. Several large tuna species and sharks, known as “mesothermic” species for the way […]

technocracy

Three ways to differentiate ReLU

digitado ⋅ 30 April 2026

When a function is not differentiable in the classical sense there are multiple ways to compute a generalized derivative. This post will look at three generalizations of the classical derivative, each applied to the ReLU (rectified linear unit) function. The ReLU function is a commonly used activation function for neural networks. It’s also called the ramp function for obvious reasons. The function is simply r(x) = max(0, x).

Pointwise derivative

The pointwise derivative would be 0 for x < 0, 1 […]
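
As a rough illustration of the pointwise case only, here is a minimal Python sketch of r(x) = max(0, x) and its classical derivative. The function names are made up for this example, and returning `None` at x = 0 is just one way to signal that the classical derivative does not exist there. The excerpt is cut off before the other generalizations, so this sketch does not cover them.

```python
from typing import Optional


def relu(x: float) -> float:
    """Ramp function r(x) = max(0, x)."""
    return max(0.0, x)


def relu_pointwise_derivative(x: float) -> Optional[float]:
    """Classical (pointwise) derivative of ReLU.

    Illustrative helper, not taken from the post: returns None at x == 0,
    where the left slope (0) and right slope (1) disagree.
    """
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return None  # not differentiable at the kink


if __name__ == "__main__":
    for x in (-2.0, 0.0, 3.0):
        print(x, relu(x), relu_pointwise_derivative(x))
```

Away from zero the function is linear on each side, so the slope is constant; all the interesting behavior sits at the single point x = 0.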

technocracy

MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation

digitado ⋅ 23 January 2026

arXiv:2601.15487v1. Announce type: new. Abstract: The rapid evolution of Retrieval-Augmented Generation (RAG) toward multimodal, high-stakes enterprise applications has outpaced the development of domain-specific evaluation benchmarks. Existing datasets often rely on general-domain corpora or purely textual retrieval, failing to capture the complexity of specialized technical documents where information is inextricably multimodal and reasoning requires synthesizing disjoint evidence. We address this gap by introducing MiRAGE, a Multiagent framework for RAG systems Evaluation, that leverages a collaborative swarm of specialized […]

technocracy

What Do Agents Learn from Trajectory-SFT: Semantics or Interfaces?

digitado ⋅ 2 February 2026

Large language models are increasingly evaluated as interactive agents, yet standard agent benchmarks conflate two qualitatively distinct sources of success: semantic tool-use and interface-specific interaction pattern memorization. Because both mechanisms can yield identical task success on the original interface, benchmark scores alone are not identifiable evidence of environment-invariant capability. We propose PIPE, a protocol-level evaluation augmentation for diagnosing interface reliance by minimally rewriting environment interfaces while preserving task semantics and execution behavior. Across 16 environments from AgentBench and […]

technocracy

Dual-Stack Migrations: How to Move Petabytes Without Losing Sleep

digitado ⋅ 15 February 2026

Two bridges, one rush hour, zero do-overs

“Cutover weekend” is a fairy tale when you’re migrating thousands of tapes (or billions of objects). Real migrations live in the messy middle: two stacks, two truths, and twice the places for ghosts to hide. The goal isn’t elegance; it’s survivability. You’re not building a bridge and blowing up the old one; you’re running both bridges during rush hour… while replacing deck boards.

TL;DR (for the exec sprinting between status meetings)

You […]

technocracy

Why Network Segmentation Projects Fail

digitado ⋅ 13 April 2026

arXiv:2604.08632v1. Announce type: new. Abstract: Network segmentation is a foundational enterprise security control. Despite its recognized benefits, segmentation initiatives frequently fail in practice, and the field lacks a systematic empirical explanation for why these projects do not achieve their intended outcomes. This paper presents an empirical study of failed segmentation projects based on a survey of 400 U.S.-based network security practitioners. The survey was grounded in a two-part failure framework that separately measures general IT project failure factors […]

technocracy

Built a normalizer so WER stops penalizing formatting differences in STT evals! [P]

digitado ⋅ 23 April 2026

Hey guys! At my company, we’ve been benchmarking STT engines a lot and kept running into the same issue: WER is penalizing formatting differences that have nothing to do with actual recognition quality. “It’s $50” vs “it is fifty dollars”, “3:00PM” vs “3 pm”. Both are perfect transcriptions, but the error rate is terrible. The fix is normalizing both sides before scoring, but every project had a different script doing it slightly differently. So we built a proper library […]
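
To make the problem concrete, here's a minimal sketch of the "normalize both sides, then score" idea in plain Python. This is not the library from the post: `REWRITES`, `normalize`, and `wer` are hypothetical names, and the two-entry rewrite table is a toy stand-in for real number, currency, and contraction handling. WER here is the usual word-level edit distance divided by the reference length.

```python
from typing import Dict, List

# Hypothetical, deliberately tiny rewrite table (an assumption for this sketch).
# A real normalizer needs far broader rules for numbers, times, and currency.
REWRITES: Dict[str, str] = {
    "it's": "it is",
    "$50": "fifty dollars",
}


def normalize(text: str) -> List[str]:
    """Lowercase, trim edge punctuation, and apply the toy rewrite table."""
    tokens: List[str] = []
    for tok in text.lower().split():
        tok = tok.strip(".,!?")
        if tok:
            tokens.extend(REWRITES.get(tok, tok).split())
    return tokens


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "it is fifty dollars"
    hypothesis = "It's $50"
    print("raw WER:       ", wer(reference.split(), hypothesis.split()))
    print("normalized WER:", wer(normalize(reference), normalize(hypothesis)))
```

The key point is that the same `normalize` function runs on both the reference and the hypothesis, so scoring only sees differences that survive normalization.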

technocracy

Copula-Stein Discrepancy: A Generator-Based Stein Operator for Archimedean Dependence

digitado ⋅ 13 January 2026

arXiv:2510.24056v2. Announce type: replace. Abstract: Kernel Stein discrepancies (KSDs) are widely used for goodness-of-fit testing, but standard KSDs can be insensitive to higher-order dependence features such as tail dependence. We introduce the Copula-Stein Discrepancy (CSD), which defines a Stein operator directly on the copula density to target dependence geometry rather than the joint score. For Archimedean copulas, CSD admits a closed-form Stein kernel derived from the scalar generator. We prove that CSD metrizes weak convergence of copula distributions, […]
