Before the First Token: Scale-Dependent Emergence of Hallucination Signals in Autoregressive Language Models
arXiv:2604.13068v1

Abstract: When do large language models decide to hallucinate? Despite the serious consequences of hallucination in healthcare, law, and finance, few formal answers exist. Recent work shows that autoregressive models maintain internal representations distinguishing factual from fictional outputs, but how the timing of these representations varies with model scale remains poorly understood. We study the temporal dynamics of hallucination-indicative internal representations across 7 autoregressive transformers (117M–7B parameters) using three fact-based datasets (TriviaQA, Simple Facts, Biography; 552 labeled examples). […]
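The abstract's premise (that hidden states carry a factual-vs-hallucinated signal that a linear readout can recover) is commonly tested with a linear probe over hidden activations. The sketch below illustrates that methodology only: the hidden states and labels are synthetic stand-ins (in the paper's setting they would come from a transformer layer and the 552 labeled examples), and the difference-of-means probe is one common baseline choice, not necessarily the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: d_model-dimensional "hidden states" for n examples,
# with label 1 = factual, 0 = hallucinated. Real experiments would extract
# these from a transformer layer on labeled QA examples.
d_model = 64
n = 552  # number of labeled examples mentioned in the abstract
labels = rng.integers(0, 2, size=n)

# Inject a weak class-dependent shift so the synthetic data has a signal
# for the probe to find (a stand-in for a real hallucination direction).
direction = rng.normal(size=d_model)
hidden = rng.normal(size=(n, d_model)) + np.outer(labels, 0.5 * direction)

# Simple train/test split.
split = n // 2
X_tr, y_tr = hidden[:split], labels[:split]
X_te, y_te = hidden[split:], labels[split:]

# Difference-of-means ("mass-mean") linear probe: classify each example by
# its projection onto mu_factual - mu_hallucinated, thresholded at the
# midpoint between the two class means.
mu1 = X_tr[y_tr == 1].mean(axis=0)
mu0 = X_tr[y_tr == 0].mean(axis=0)
w = mu1 - mu0
threshold = (mu1 + mu0) @ w / 2
preds = (X_te @ w > threshold).astype(int)
acc = (preds == y_te).mean()
print(f"probe accuracy: {acc:.2f}")
```

Probe accuracy well above chance (0.5) is the standard evidence that the representation linearly encodes the factual/hallucinated distinction; applying the same probe at each token position and layer is how temporal dynamics like those in the abstract are typically mapped out.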