digitado

Connecting the Dots: Surfacing Structure in Documents through AI-Generated Cross-Modal Links

digitado ⋅ 20 February 2026

arXiv:2602.16895v1 Announce Type: new Abstract: Understanding information-dense documents like recipes and scientific papers requires readers to find, interpret, and connect details scattered across text, figures, tables, and other visual elements. These documents are often long and filled with specialized terminology, hindering the ability to locate relevant information or piece together related ideas. Existing tools offer limited support for synthesizing information across media types. As a result, understanding complex material remains cognitively demanding. This paper presents a framework for […]


technocracy

The new frontier of censorship: when governments try to switch off the sky

digitado ⋅ 15 January 2026

At the start of 2026 we are witnessing a historic leap in the way authoritarian regimes try to control and censor access to information. Iran, in the midst of popular protests that challenge theocratic authority, has not limited itself to shutting down fiber-optic and mobile networks: it has gone further, deploying high-power electronic jamming to also neutralize satellite internet access, which until now was considered immune to blocking […]


technocracy

Automatic Construction of Chinese Verb Collostruction Database

digitado ⋅ 9 January 2026

arXiv:2601.04197v1 Announce Type: new Abstract: This paper proposes a fully unsupervised approach to constructing a verb collostruction database for the Chinese language, aimed at complementing LLMs by providing explicit and interpretable rules for application scenarios where explanation and interpretability are indispensable. The paper formally defines a verb collostruction as a projective, rooted, ordered, and directed acyclic graph and employs a series of clustering algorithms to generate collostructions for a given verb from a list of sentences retrieved from […]


technocracy

A Covering Framework for Offline POMDPs Learning using Belief Space Metric

digitado ⋅ 4 March 2026

arXiv:2603.03191v1 Announce Type: new Abstract: In off-policy evaluation (OPE) for partially observable Markov decision processes (POMDPs), an agent must infer hidden states from past observations, which exacerbates both the curse of horizon and the curse of memory in existing OPE methods. This paper introduces a novel covering analysis framework that exploits the intrinsic metric structure of the belief space (distributions over latent states) to relax traditional coverage assumptions. By assuming value-relevant functions are Lipschitz continuous in […]


technocracy

Fréchet regression of multivariate distributions with nonparanormal transport

digitado ⋅ 10 March 2026

arXiv:2603.07014v1 Announce Type: cross Abstract: Regression with distribution-valued responses and Euclidean predictors has gained increasing scientific relevance. While methodology for univariate distributional data has advanced rapidly in recent years, multivariate distributions, which additionally encode dependence across univariate marginals, have received less attention and pose computational and statistical challenges. In this work, we address these challenges with a new regression approach for multivariate distributional responses, in which distributions are modeled within the semiparametric nonparanormal family. By incorporating the nonparanormal […]


technocracy

Scaling Laws of SignSGD in Linear Regression: When Does It Outperform SGD?

digitado ⋅ 3 March 2026

arXiv:2603.02069v1 Announce Type: cross Abstract: We study scaling laws of signSGD under a power-law random features (PLRF) model that accounts for both feature and target decay. We analyze the population risk of a linear model trained with one-pass signSGD on Gaussian-sketched features. We express the risk as a function of model size, training steps, learning rate, and the feature and target decay parameters. Comparing against the SGD risk analyzed by Paquette et al. (2024), we identify a drift-normalization […]
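As a rough illustration of the setting the abstract describes, the sketch below runs one-pass SGD and one-pass signSGD on a small noiseless linear regression problem with Gaussian inputs. It is not the paper's PLRF model: the dimension, step count, learning rate, decaying target `w_true`, and the helper names `sign` and `one_pass` are all assumptions made for this example, and no sketching or power-law feature decay is modeled.

```python
import random


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def one_pass(update, dim=5, steps=2000, lr=0.01, seed=0):
    """One-pass training of a linear model; each sample is seen exactly once.

    `update` maps each gradient coordinate to the step direction:
    the identity gives plain SGD, `sign` gives signSGD.
    Returns the squared distance between learned and true weights.
    """
    rng = random.Random(seed)
    # Hypothetical target with decaying coefficients (an assumption, not the PLRF model).
    w_true = [1.0 / (j + 1) for j in range(dim)]
    w = [0.0] * dim
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # fresh Gaussian sample
        y = sum(a * b for a, b in zip(w_true, x))        # noiseless label
        err = sum(a * b for a, b in zip(w, x)) - y       # prediction error
        grad = [err * xi for xi in x]                    # gradient of 0.5 * err**2
        w = [wi - lr * update(g) for wi, g in zip(w, grad)]
    return sum((a - b) ** 2 for a, b in zip(w, w_true))


sgd_error = one_pass(lambda g: g)   # plain SGD step
sign_error = one_pass(sign)         # signSGD step: only the gradient sign is used
print(sgd_error, sign_error)
```

Because signSGD moves every coordinate by exactly `lr` per step, it stalls in a neighborhood of the optimum whose size scales with the learning rate, whereas plain SGD on this noiseless problem keeps contracting. That difference in step normalization is one reason the two risks can scale differently.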


technocracy

Decentralized Federated Learning by Partial Message Exchange

digitado ⋅ 2 March 2026

Decentralized federated learning (DFL) has emerged as a transformative server-free paradigm that enables collaborative learning over large-scale heterogeneous networks. However, it continues to face fundamental challenges, including data heterogeneity, restrictive assumptions for theoretical analysis, and degraded convergence when standard communication- or privacy-enhancing techniques are applied. To overcome these drawbacks, this paper develops a novel algorithm, PaME (DFL by Partial Message Exchange). The central principle is to allow only randomly selected sparse coordinates to be exchanged between two neighbor […]


technocracy

When GraphDB Ontologies Break: Exploring Embeddings

digitado ⋅ 22 February 2026

Keywords: GraphDB, VectorDB, Ontology, Embedding Vectors. tl;dr Your GraphDB ontology works perfectly in dev, but then production users write “jogging” and it can’t match “running”! Can embeddings fix this? Let’s explore. You’re building a recommender system that matches events and people based on interests. You built out a GraphDB to help with matching, and you’re excited that it passes all the tests and looks ready to launch. You try it out on a couple of people and […]
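The "jogging" versus "running" mismatch can be sketched in a few lines. This is a toy illustration, not the post's implementation: `ONTOLOGY`, `TOY_VECTORS`, `exact_match`, and `nearest_concept` are hypothetical names, and the 3-dimensional vectors are hand-written stand-ins for what a real embedding model would produce.

```python
import math

# Hypothetical ontology: the only interest labels the graph knows about.
ONTOLOGY = {"running", "cycling", "chess"}

# Hand-written toy vectors (an assumption for this example, not real embeddings).
TOY_VECTORS = {
    "running": (0.90, 0.10, 0.00),
    "jogging": (0.85, 0.15, 0.05),
    "cycling": (0.60, 0.70, 0.00),
    "chess": (0.00, 0.10, 0.95),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def exact_match(term):
    """Ontology-style lookup: succeeds only on an exact label match."""
    return term if term in ONTOLOGY else None


def nearest_concept(term):
    """Embedding-style lookup: return the ontology label closest to the term."""
    query = TOY_VECTORS[term]
    return max(ONTOLOGY, key=lambda concept: cosine(query, TOY_VECTORS[concept]))


print(exact_match("jogging"))      # None: the label is not in the ontology
print(nearest_concept("jogging"))  # running: the closest label by cosine similarity
```

The exact lookup fails as soon as a user strays from the ontology's vocabulary, while the similarity lookup always returns some label. In practice that second behavior usually needs a similarity threshold so unrelated terms are not forced onto the nearest concept.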


technocracy

SnapPoint: A Hard Reset for Your Dev Machine

digitado ⋅ 3 February 2026

Most developer machines are not clean. They just look clean. At some point, every dev laptop turns into a dumping ground. You install tools to follow a blog post. You try a framework for a weekend. You switch jobs and inherit a new stack. You uninstall things, but only halfway. Binaries stick around. Config files stay buried in your home directory. Caches grow quietly in the background. Nothing is fully broken, but nothing feels right either. Your terminal […]


technocracy

A Gap Between Decision Trees and Neural Networks

digitado ⋅ 9 January 2026

arXiv:2601.03919v2 Announce Type: replace-cross Abstract: We study when geometric simplicity of decision boundaries, used here as a notion of interpretability, can conflict with accurate approximation of axis-aligned decision trees by shallow neural networks. Decision trees induce rule-based, axis-aligned decision regions (finite unions of boxes), whereas shallow ReLU networks are typically trained as score models whose predictions are obtained by thresholding. We analyze the infinite-width, bounded-norm, single-hidden-layer ReLU class through the Radon total variation ($\mathrm{R}\mathrm{TV}$) seminorm, which controls the […]
