digitado

Navigating the Concept Space of Language Models

digitado ⋅ 26 March 2026

arXiv:2603.23524v1. Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enable mapping to human-interpretable concepts. The current practice for analyzing these features primarily relies on inspecting top-activating examples, manually browsing individual features, or performing semantic search on concepts of interest, which makes exploratory discovery of concepts difficult at scale. In this paper, we present Concept Explorer, a scalable interactive system for post-hoc exploration of SAE features that organizes concept explanations […]

LLM-AUG: Robust Wireless Data Augmentation with In-Context Learning in Large Language Models

digitado ⋅ 20 April 2026

Data scarcity remains a fundamental bottleneck in applying deep learning to wireless communication problems, particularly in scenarios where collecting labeled Radio Frequency (RF) data is expensive, time-consuming, or operationally constrained. This paper proposes LLM-AUG, a data augmentation framework that leverages in-context learning in large language models (LLMs) to generate synthetic training samples directly in a learned embedding space. Unlike conventional generative approaches that require training task-specific models, LLM-AUG performs data generation through structured prompting, enabling rapid adaptation in […]

LegoNet: Memory Footprint Reduction Through Block Weight Clustering

digitado ⋅ 10 March 2026

arXiv:2603.06606v1. Abstract: As the need for neural network-based applications to become more accurate and powerful grows, so too does their size and memory footprint. With embedded devices, whose cache and RAM are limited, this growth hinders their ability to leverage state-of-the-art neural network architectures. In this work, we propose LegoNet, a compression technique that constructs blocks of weights of the entire model regardless of layer type and clusters these induced blocks. Using blocks instead of […]

A Century of Educational Technology from Standardization to Co-Creation and the Epistemological Rupture of Artificial Intelligence

digitado ⋅ 3 April 2026

Introduction: Educational systems have historically adapted to incorporate technological innovations, but the success of these processes may vary greatly. Some tools become embedded in established processes, while others lead to radical transformation. In this paper, we examine the development of educational technology and recommend and critique certain technologies that facilitate existing industrial education approaches and others that require the restructuring of how educational aims and methods are defined and implemented. Purpose: This study aims to evaluate this optimization-restructuring […]

Can We Predict Before Executing Machine Learning Agents?

digitado ⋅ 9 January 2026

Autonomous machine learning agents have revolutionized scientific discovery, yet they remain constrained by a Generate-Execute-Feedback paradigm. Previous approaches suffer from a severe Execution Bottleneck, as hypothesis evaluation relies strictly on expensive physical execution. To bypass these physical constraints, we internalize execution priors to substitute costly runtime checks with instantaneous predictive reasoning, drawing inspiration from World Models. In this work, we formalize the task of Data-centric Solution Preference and construct a comprehensive corpus of 18,438 pairwise comparisons. We demonstrate […]

Representation Geometry as a Diagnostic for Out-of-Distribution Robustness

digitado ⋅ 5 February 2026

arXiv:2602.03951v1. Abstract: Robust generalization under distribution shift remains difficult to monitor and optimize in the absence of target-domain labels, as models with similar in-distribution accuracy can exhibit markedly different out-of-distribution (OOD) performance. While prior work has focused on training-time regularization and low-order representation statistics, little is known about whether the geometric structure of learned embeddings provides reliable post-hoc signals of robustness. We propose a geometry-based diagnostic framework that constructs class-conditional mutual k-nearest-neighbor graphs from in-distribution […]

Simulating Complex Multi-Turn Tool Calling Interactions in Stateless Execution Environments

digitado ⋅ 29 January 2026

arXiv:2601.19914v1. Abstract: Synthetic data has proven itself to be a valuable resource for tuning smaller, cost-effective language models to handle the complexities of multi-turn tool calling conversations. While many frameworks and systems for producing synthetic multi-turn tool calling data have been proposed, prior works have frequently assumed that any tool calling interactions will take place in an execution environment that maintains state. When such an environment is available, this is advantageous as it allows for […]

dReLU Sparsification: Recovering LLM Performance with 150B Token Pretraining

digitado ⋅ 28 February 2026

Table of Links

Abstract and 1. Introduction
Related Work and Background
Analysis
3.1 Limitations about Existing ReLUficatio
3.2 dReLU
Are Neurons in Expert still Sparsely Activated?
dReLU Sparsification
Experiments Results
6.1 Downstream Tasks Performance
6.2 Sparsity of Sparsified Models
Practical Inference Speedup Evaluation
7.1 Experiments Setting
7.2 Pure CPU Inference and 7.3 Hybrid GPU-CPU Inference
7.4 Deploy LLMs on mobile phones
Conclusion and References
A. Appendix / supplemental material
B. Limitation
C. Broader Impact

5 dReLU Sparsification

In […]

From Moments to Models: Graphon-Mixture Learning for Mixup and Contrastive Learning

digitado ⋅ 1 April 2026

arXiv:2510.03690v3. Abstract: Real-world graph datasets often arise from mixtures of populations, where graphs are generated by multiple distinct underlying distributions. In this work, we propose a unified framework that explicitly models graph data as a mixture of probabilistic graph generative models represented by graphons. To characterize and estimate these graphons, we leverage graph moments (motif densities) to cluster graphs generated from the same underlying model. We establish a novel theoretical guarantee, deriving a tighter bound […]

WildfireVLM: AI-powered Analysis for Early Wildfire Detection and Risk Assessment Using Satellite Imagery

digitado ⋅ 17 February 2026

arXiv:2602.13305v1. Abstract: Wildfires are a growing threat to ecosystems, human lives, and infrastructure, with their frequency and intensity rising due to climate change and human activities. Early detection is critical, yet satellite-based monitoring remains challenging due to faint smoke signals, dynamic weather conditions, and the need for real-time analysis over large areas. We introduce WildfireVLM, an AI framework that combines satellite imagery wildfire detection with language-driven risk assessment. We construct a labeled wildfire and smoke […]
