digitado – Page 304

A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy

digitado ⋅ 28 January 2026

arXiv:2601.18939v1. Abstract: Behavioral alignment in large language models (LLMs) is often achieved through broad fine-tuning, which can result in undesired side effects like distributional shift and low interpretability. We propose a method for alignment that identifies and updates only the neurons most responsible for a given behavior, a targeted approach that allows for fine-tuning with significantly less data. Using sparse autoencoders (SAEs) and linear probes, we isolate the 3% of MLP neurons most predictive of […]

technocracy

Adobe Bets on AI Agents Aiming to Improve Business Workflows; This Is How

digitado ⋅ 21 April 2026

With the AI race intensifying, Adobe is shifting away from being just a maker of creative and marketing tools. Now, it wants to power autonomous digital labour. Its latest announcement makes the new direction clear: instead of just adding AI features to its software, Adobe is building systems that can do business tasks themselves. The heart of this move is the introduction of AI “agents” designed to plan, coordinate, and get things done across many workflows, […]

technocracy

Malcolm Ocean: The Parable of the Canoe Sandwich

digitado ⋅ 6 March 2025

Two friends in a canoe discover that sharing a sandwich isn’t really about food—it’s about . When people trust a shared “we,” generosity feels natural; when entitlement, guilt, or ego intrude, the sense of togetherness collapses. Healthy cooperation arises when individuals reveal their needs openly and surrender not to each other, but to the possibility of a caring whole.

technocracy

Iowa county adopts strict zoning rules for data centers, but residents still worry

digitado ⋅ 2 March 2026

PALO, Iowa—There are two restaurants in Palo, not counting the chicken wings and pizza sold at the only gas station in town. All three establishments, including the gas station, stand on the same half-mile stretch of First Street, an artery that divides the marshy floodplain of the Cedar River to the east from hundreds of acres of cornfields on the west. During historic flooding in 2008, the Cedar River surged 10 feet above its previous record, cresting at […]

technocracy

Adaptive Engram Memory System for Indonesian Language Model: Generative AI Based on TOBA LM for Batak and Minang Language

digitado ⋅ 12 March 2026

arXiv:2603.10006v1. Abstract: This study presents TOBA-LM, a trilingual language model based on GPT-2 architecture with 1.2 billion parameters, trained on a corpus encompassing Indonesian, Batak, and Minangkabau using syllabic-agglutinative tokenization. The architecture integrates an Engram Memory mechanism, an adaptive n-gram-based memory system with a 500,000 x 768 embedding table that captures morphological dependencies through bigram and trigram pathways. Empirical results demonstrate a training efficiency of 80%, with the loss value dropping from 6.4 to 1.7996 […]

technocracy

Rodney and Claude Code for Desktop

digitado ⋅ 16 February 2026

I’m a very heavy user of Claude Code on the web, Anthropic’s excellent but poorly named cloud version of Claude Code where everything runs in a container environment managed by them, greatly reducing the risk of anything bad happening to a computer I care about. I don’t use the web interface at all (hence my dislike of the name) – I access it exclusively through their native iPhone and Mac desktop apps. Something I particularly appreciate about the […]

technocracy

New teaser gives us first look at Godzilla Minus Zero

digitado ⋅ 15 April 2026

The Godzilla franchise is going strong in 2026, with Apple TV’s Monarch: Legacy of Monsters (part of Legendary Entertainment’s MonsterVerse) and the pending release of Toho’s Godzilla Minus Zero, the hotly anticipated sequel to 2023’s critically acclaimed, Oscar-winning Godzilla Minus One. Toho unveiled the first short teaser at CinemaCon, and it has now been released online for our viewing pleasure. (Spoilers for Godzilla Minus One below.) Director Takashi Yamazaki wanted to return to Godzilla’s filmic roots with Minus […]

technocracy

Automatic Constraint Policy Optimization based on Continuous Constraint Interpolation Framework for Offline Reinforcement Learning

digitado ⋅ 30 January 2026

Offline Reinforcement Learning (RL) relies on policy constraints to mitigate extrapolation error, where both the constraint form and constraint strength critically shape performance. However, most existing methods commit to a single constraint family: weighted behavior cloning, density regularization, or support constraints, without a unified principle that explains their connections or trade-offs. In this work, we propose Continuous Constraint Interpolation (CCI), a unified optimization framework in which these three constraint families arise as special cases along a common constraint […]

technocracy

Selecting Language Models for Social Science: Start Small, Start Open, and Validate

digitado ⋅ 19 January 2026

arXiv:2601.10926v1. Abstract: Currently, there are thousands of large pretrained language models (LLMs) available to social scientists. How do we select among them? Using validity, reliability, reproducibility, and replicability as guides, we explore the significance of: (1) model openness, (2) model footprint, (3) training data, and (4) model architectures and fine-tuning. While ex-ante tests of validity (i.e., benchmarks) are often privileged in these discussions, we argue that social scientists cannot altogether avoid validating computational measures (ex-post). […]

technocracy

Axiomatic Foundations of Counterfactual Explanations

digitado ⋅ 5 February 2026

arXiv:2602.04028v1. Abstract: Explaining autonomous and intelligent systems is critical in order to improve trust in their decisions. Counterfactuals have emerged as one of the most compelling forms of explanation. They address “why not” questions by revealing how decisions could be altered. Despite the growing literature, most existing explainers focus on a single type of counterfactual and are restricted to local explanations, focusing on individual instances. There has been no systematic study of alternative counterfactual types, […]
