digitado

Distributional Machine Unlearning via Selective Data Removal

digitado ⋅ 15 January 2026

arXiv:2507.15112v4 Announce Type: replace-cross Abstract: Machine learning systems increasingly face requirements to remove entire domains of information, such as toxic language or biases, rather than individual user data. This task presents a dilemma: full removal of the unwanted domain data is computationally expensive, while random partial removal is statistically inefficient. We find that a domain’s statistical influence is often concentrated in a small subset of its data samples, suggesting a path between ineffective partial removal and unnecessary complete removal. We […]

technocracy

Sparse Autoencoders are Capable LLM Jailbreak Mitigators

digitado ⋅ 16 February 2026

arXiv:2602.12418v1 Announce Type: new Abstract: Jailbreak attacks remain a persistent threat to large language model safety. We propose Context-Conditioned Delta Steering (CC-Delta), an SAE-based defense that identifies jailbreak-relevant sparse features by comparing token-level representations of the same harmful request with and without jailbreak context. Using paired harmful/jailbreak prompts, CC-Delta selects features via statistical testing and applies inference-time mean-shift steering in SAE latent space. Across four aligned instruction-tuned models and twelve jailbreak attacks, CC-Delta achieves comparable or better safety-utility […]
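The abstract above names three steps: compare paired representations, select features with a statistical test, and apply a mean shift at inference time. The sketch below is one possible reading of those steps, not the authors' implementation. It assumes SAE latents are already available as plain lists of floats, uses a paired t-statistic with a hypothetical threshold as the test, and clips steered latents at zero on the assumption that SAE activations are nonnegative. The names `paired_t_stats`, `select_features`, `steer`, `alpha`, and all the numbers are illustrative.

```python
import math

def paired_t_stats(with_ctx, without_ctx):
    """Per-feature paired t-statistic and mean difference over prompt pairs.

    with_ctx[p][j] is latent j for pair p with jailbreak context;
    without_ctx[p][j] is the same request without that context.
    """
    n, k = len(with_ctx), len(with_ctx[0])
    stats, mean_diffs = [], []
    for j in range(k):
        diffs = [with_ctx[p][j] - without_ctx[p][j] for p in range(n)]
        mean = sum(diffs) / n
        var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
        se = math.sqrt(var / n)
        if se > 0:
            stats.append(mean / se)
        else:
            # Zero variance: the statistic is infinite unless the mean is zero.
            stats.append(math.copysign(math.inf, mean) if mean != 0 else 0.0)
        mean_diffs.append(mean)
    return stats, mean_diffs

def select_features(stats, threshold):
    """Keep features whose |t| exceeds a (hypothetical) threshold."""
    return [j for j, t in enumerate(stats) if abs(t) > threshold]

def steer(latents, mean_diffs, selected, alpha=1.0):
    """Shift selected latents back by the mean context-induced difference."""
    out = list(latents)
    for j in selected:
        # Clipping at zero is an assumption about the SAE, not from the source.
        out[j] = max(0.0, out[j] - alpha * mean_diffs[j])
    return out

# Toy data: 4 prompt pairs, 3 latent features (made-up numbers).
with_ctx = [[2.0, 0.5, 1.0], [2.2, 0.4, 1.1], [1.8, 0.6, 0.9], [2.0, 0.5, 1.0]]
without_ctx = [[1.0, 0.5, 1.0], [1.1, 0.5, 1.0], [0.9, 0.5, 1.0], [1.0, 0.4, 1.1]]

stats, mean_diffs = paired_t_stats(with_ctx, without_ctx)
selected = select_features(stats, threshold=3.0)
steered = steer([2.1, 0.5, 1.0], mean_diffs, selected)
```

In this toy data only feature 0 moves consistently with the added context, so it is the only one selected and shifted. In a real pipeline the steered latents would presumably be decoded back into the model's activation space; that step is omitted here.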

Quiz: How to Use Ollama to Run Large Language Models Locally

digitado ⋅ 12 March 2026

In this quiz, you’ll test your understanding of How to Use Ollama to Run Large Language Models Locally. By working through this quiz, you’ll revisit how to install Ollama, pull and manage models, chat with local LLMs from your terminal, and connect them to AI coding tools. Running models locally means your prompts stay private and no API keys or cloud services are needed. See how well you remember the key commands and concepts. […]

Data-Driven Synthesis of Robust Positively Invariant Sets from Noisy Data

digitado ⋅ 25 March 2026

arXiv:2603.22460v1 Announce Type: new Abstract: This paper develops a method to construct robust positively invariant (RPI) tube sets from finite noisy input-state data of an unknown linear time-invariant (LTI) system, yielding tubes that can be directly embedded in tube-based robust data-driven predictive control. Data-consistency uncertainty sets are constructed under process/measurement noise with polytopic/ellipsoidal bounds. In the measurement-noise case, we provide a deterministic and data-consistent procedure to certify the induced residual bound from data. Based on these sets, a […]

Beyond the Hype: A Technical Deep-Dive Into the AI Tools Ecosystem of 2026

digitado ⋅ 9 January 2026

Architectural Paradigms and Market Consolidation in Production AI Systems

The AI tooling landscape of 2026 represents a fascinating inflection point in the maturation of large language model (LLM) applications. What began as a Cambrian explosion of specialized tools has rapidly consolidated into distinct architectural patterns, each optimizing for specific computational trade-offs and user interaction paradigms.

The Foundation Model Triumvirate: Diverging Optimization Strategies

The dominance of ChatGPT, Claude, and Gemini isn’t merely about brand recognition; it reflects fundamentally different approaches to […]

Wide Neural Networks as a Baseline for the Computational No-Coincidence Conjecture

digitado ⋅ 13 January 2026

arXiv:2510.06527v2 Announce Type: replace-cross Abstract: We establish that randomly initialized neural networks, with large width and a natural choice of hyperparameters, have nearly independent outputs exactly when their activation function is nonlinear with zero mean under the Gaussian measure: $\mathbb{E}_{z \sim \mathcal{N}(0,1)}[\sigma(z)]=0$. For example, this includes ReLU and GeLU with an additive shift, as well as tanh, but not ReLU or GeLU by themselves. Because of their nearly independent outputs, we propose neural networks with zero-mean activation functions […]
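The zero-mean condition in the abstract above can be checked numerically for the activations it mentions. The following is a minimal sketch, assuming a simple midpoint-rule integral against the standard Gaussian density. The helper `gaussian_mean` and the integration bounds are illustrative choices, and the shift used for ReLU is its known Gaussian mean, $1/\sqrt{2\pi}$.

```python
import math

def gaussian_mean(f, lo=-10.0, hi=10.0, n=200_000):
    """Approximate E[f(z)] for z ~ N(0, 1) with the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def relu(z):
    return max(z, 0.0)

# Closed form for E[ReLU(z)] under the standard Gaussian.
RELU_MEAN = 1.0 / math.sqrt(2.0 * math.pi)

def shifted_relu(z):
    """ReLU with an additive shift chosen to make its Gaussian mean zero."""
    return relu(z) - RELU_MEAN

means = {
    "relu": gaussian_mean(relu),                  # nonzero: fails the condition
    "shifted_relu": gaussian_mean(shifted_relu),  # approximately zero
    "tanh": gaussian_mean(math.tanh),             # approximately zero (odd function)
}

for name, value in means.items():
    print(f"{name}: {value:.6f}")
```

Plain ReLU has a strictly positive Gaussian mean, so it fails the condition, while the shifted ReLU and tanh integrate to approximately zero, matching the examples listed in the abstract.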

On the Width Scaling of Neural Optimizers Under Matrix Operator Norms I: Row/Column Normalization and Hyperparameter Transfer

digitado ⋅ 11 March 2026

arXiv:2603.09952v1 Announce Type: cross Abstract: A central question in modern deep learning is how to design optimizers whose behavior remains stable as the network width $w$ increases. We address this question by interpreting several widely used neural-network optimizers, including AdamW and Muon, as instances of steepest descent under matrix operator norms. This perspective links optimizer geometry with the Lipschitz structure of the network forward map, and enables width-independent control of both Lipschitz and smoothness constants. However, steepest-descent rules […]

InfoBridge: Mutual Information estimation via Bridge Matching

digitado ⋅ 2 March 2026

arXiv:2502.01383v4 Announce Type: replace-cross Abstract: Diffusion bridge models have recently become a powerful tool in the field of generative modeling. In this work, we leverage their power to address another important problem in machine learning and information theory, the estimation of the mutual information (MI) between two random variables. Neatly framing MI estimation as a domain transfer problem, we construct an unbiased estimator for data posing difficulties for conventional MI estimators. We showcase the performance of our estimator […]

Self-Distillation of Hidden Layers for Self-Supervised Representation Learning

digitado ⋅ 16 March 2026

The landscape of self-supervised learning (SSL) is currently dominated by generative approaches (e.g., MAE) that reconstruct raw low-level data, and predictive approaches (e.g., I-JEPA) that predict high-level abstract embeddings. While generative methods provide strong grounding, they are computationally inefficient for high-redundancy modalities like imagery, and their training objective does not prioritize learning high-level, conceptual features. Conversely, predictive methods often suffer from training instability due to their reliance on the non-stationary targets of final-layer self-distillation. We introduce Bootleg, a […]

Using Large Language Models to Detect Socially Shared Regulation of Collaborative Learning

digitado ⋅ 8 January 2026

The field of learning analytics has made notable strides in automating the detection of complex learning processes in multimodal data. However, most advancements have focused on individualized problem-solving instead of collaborative, open-ended problem-solving, which may offer both affordances (richer data) and challenges (low cohesion) to behavioral prediction. Here, we extend predictive models to automatically detect socially shared regulation of learning (SSRL) behaviors in collaborative computational modeling environments using embedding-based approaches. We leverage large language models (LLMs) as summarization […]
