digitado – Page 560

Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks

digitado ⋅ March 16, 2026

Dataset distillation, a training-aware data compression technique, has recently attracted increasing attention as an effective tool for mitigating costs of optimization and data storage. However, progress remains largely empirical. Mechanisms underlying the extraction of task-relevant information from the training process and the efficient encoding of such information into synthetic data points remain elusive. In this paper, we theoretically analyze practical algorithms of dataset distillation applied to the gradient-based training of two-layer neural networks with width $L$. By focusing […]


technocracy

Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets

digitado ⋅ March 26, 2026

arXiv:2405.17573v3. We study Leaky ResNets, which interpolate between ResNets and Fully-Connected nets depending on an ‘effective depth’ hyper-parameter $\tilde{L}$. In the infinite depth limit, we study ‘representation geodesics’ $A_{p}$: continuous paths in representation space (similar to NeuralODEs) from input $p=0$ to output $p=1$ that minimize the parameter norm of the network. We give a Lagrangian and Hamiltonian reformulation, which highlight the importance of two terms: a kinetic energy which favors small layer derivatives $\partial_{p}A_{p}$ […]


technocracy

Quantized Vision-Language Models for Damage Assessment: A Comparative Study of LLaVA-1.5-7B Quantization Levels

digitado ⋅ March 31, 2026

arXiv:2603.26770v1. Bridge infrastructure inspection is a critical but labor-intensive task requiring expert assessment of structural damage such as rebar exposure, cracking, and corrosion. This paper presents a comprehensive study of quantized Vision-Language Models (VLMs) for automated bridge damage assessment, focusing on the trade-offs between description quality, inference speed, and resource requirements. We develop an end-to-end pipeline combining LLaVA-1.5-7B for visual damage analysis, structured JSON extraction, and rule-based priority scoring. To enable deployment on consumer-grade […]


technocracy

Dream2Learn: Structured Generative Dreaming for Continual Learning

digitado ⋅ March 2, 2026

Continual learning requires balancing plasticity and stability while mitigating catastrophic forgetting. Inspired by human dreaming as a mechanism for internal simulation and knowledge restructuring, we introduce Dream2Learn (D2L), a framework in which a model autonomously generates structured synthetic experiences from its own internal representations and uses them for self-improvement. Rather than reconstructing past data as in generative replay, D2L enables a classifier to create novel, semantically distinct dreamed classes that are coherent with its learned knowledge yet do […]


technocracy

HumanMCP: A Human-Like Query Dataset for Evaluating MCP Tool Retrieval Performance

digitado ⋅ March 2, 2026

arXiv:2602.23367v1. Model Context Protocol (MCP) servers contain a collection of thousands of open-source standardized tools, linking LLMs to external systems; however, existing datasets and benchmarks lack realistic, human-like user queries, leaving a critical gap in evaluating the tool usage and ecosystems of MCP servers. Existing datasets often do contain tool descriptions but fail to represent how different users phrase their requests, leading to poor generalization and inflated reliability of certain benchmarks. This paper introduces […]


technocracy

Generalisation of RLHF under Reward Shift and Clipped KL Regularisation

digitado ⋅ February 26, 2026

arXiv:2602.21765v1. Alignment and adaptation in large language models heavily rely on reinforcement learning from human feedback (RLHF); yet, theoretical understanding of its generalisability remains premature, especially when the learned reward could shift, and the KL control is estimated and clipped. To address this issue, we develop generalisation theory for RLHF that explicitly accounts for (1) reward shift: reward models are trained on preference data from earlier or mixed behaviour policies while RLHF optimises the […]


technocracy

How to Make Engineering Knowledge Searchable (A Complete Guide)

digitado ⋅ January 16, 2026

The Invisible Wall in Your Codebase

Imagine a new senior engineer joins your team. They are brilliant, experienced, and eager to push code. But for the first three weeks, their most common contribution is a question: “Hey, does anyone know why we used a custom hook here instead of a library?” or “Where is the doc explaining the database schema?” This is the Unsearchable Knowledge Problem. In most engineering organizations, knowledge exists in fragmented silos that don’t talk […]


technocracy

Bayesian Recovery for Probabilistic Coalition Structures

digitado ⋅ January 12, 2026

arXiv:2601.05273v1. Probabilistic Coalition Structure Generation (PCSG) is NP-hard and can be recast as an $\ell_0$-type sparse recovery problem by representing coalition structures as sparse coefficient vectors over a coalition-incidence design. A natural question is whether standard sparse methods, such as $\ell_1$ relaxations and greedy pursuits, can reliably recover the optimal coalition structure in this setting. We show that the answer is negative in a PCSG-inspired regime where overlapping coalitions generate highly coherent, near-duplicate columns: […]


technocracy

CraniMem: Cranial Inspired Gated and Bounded Memory for Agentic Systems

digitado ⋅ March 18, 2026

arXiv:2603.15642v1. Large language model (LLM) agents are increasingly deployed in long-running workflows, where they must preserve user and task state across many turns. Many existing agent memory systems behave like external databases with ad hoc read/write rules, which can yield unstable retention, limited consolidation, and vulnerability to distractor content. We present CraniMem, a neurocognitively motivated, gated and bounded multi-stage memory design for agentic systems. CraniMem couples goal-conditioned gating and utility tagging with […]


technocracy

LESV: Language Embedded Sparse Voxel Fusion for Open-Vocabulary 3D Scene Understanding

digitado ⋅ April 3, 2026

arXiv:2604.01388v1. Recent advancements in open-vocabulary 3D scene understanding heavily rely on 3D Gaussian Splatting (3DGS) to register vision-language features into 3D space. However, we identify two critical limitations in these approaches: the spatial ambiguity arising from unstructured, overlapping Gaussians, which necessitates probabilistic feature registration, and the multi-level semantic ambiguity caused by pooling features over object-level masks, which dilutes fine-grained details. To address these challenges, we present a novel framework that leverages Sparse Voxel Rasterization […]
