technocracy

Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning

digitado ⋅ April 12, 2026

Credit assignment is a central challenge in reinforcement learning (RL). Classical actor-critic methods address this challenge through fine-grained advantage estimation based on a learned value function. However, learned value models are often avoided in modern large language model (LLM) RL because conventional discriminative critics are difficult to train reliably. We revisit value modeling and argue that this difficulty is partly due to limited expressiveness. In particular, representation complexity theory suggests that value functions can be hard to approximate […]

Empirical Validation of the Classification-Verification Dichotomy for AI Safety Gates

digitado ⋅ April 2, 2026

arXiv:2604.00072v1 Announce Type: cross Abstract: Can classifier-based safety gates maintain reliable oversight as AI systems improve over hundreds of iterations? We provide comprehensive empirical evidence that they cannot. On a self-improving neural controller (d=240), eighteen classifier configurations — spanning MLPs, SVMs, random forests, k-NN, Bayesian classifiers, and deep networks — all fail the dual conditions for safe self-improvement. Three safe RL baselines (CPO, Lyapunov, safety shielding) also fail. Results extend to MuJoCo benchmarks (Reacher-v4 d=496, Swimmer-v4 d=1408, HalfCheetah-v4 […]

Unique structure of elephant whiskers give them built-in sensing “intelligence”

digitado ⋅ February 12, 2026

An elephant’s trunk is a marvelous thing, flexible enough to bend and stretch as it forages for food, but also stiff enough to grasp and maneuver even delicate objects like peanuts or a tortilla chip. That’s because the trunk is highly sensitive when it comes to sensing touch. Scientists have determined that the whiskers lining the trunk are crucial for that sensitivity thanks to their unique structure, amounting to a kind of innate “material intelligence,” according to a […]

HCFT: Hierarchical Convolutional Fusion Transformer for EEG Decoding

digitado ⋅ January 21, 2026

arXiv:2601.12279v1 Announce Type: new Abstract: Electroencephalography (EEG) decoding requires models that can effectively extract and integrate complex temporal, spectral, and spatial features from multichannel signals. To address this challenge, we propose a lightweight and generalizable decoding framework named Hierarchical Convolutional Fusion Transformer (HCFT), which combines dual-branch convolutional encoders and hierarchical Transformer blocks for multi-scale EEG representation learning. Specifically, the model first captures local temporal and spatiotemporal dynamics through time-domain and time-space convolutional branches, and then aligns these features […]

Behind The Platform That The Most-Established Service-Based Companies Use Against Reputation Threats

digitado ⋅ April 17, 2026

April 7, 2026 The reputation infrastructure behind some of the most established service-based companies in the country spent years operating without a single public-facing page. No ads and no open access. The businesses that found their way to Reputations.io did so through quiet introductions from legal counsel, business consultants, and industry advisors who had already seen what the platform could do. What brought them there was almost always the same situation. A company that had spent years building […]

Geodesic Gradient Descent: A Generic and Learning-rate-free Optimizer on Objective Function-induced Manifolds

digitado ⋅ March 10, 2026

arXiv:2603.06651v1 Announce Type: new Abstract: Euclidean gradient descent algorithms barely capture the geometry of objective function-induced hypersurfaces and risk driving update trajectories off the hypersurfaces. Riemannian gradient descent algorithms address these issues but fail to represent complex hypersurfaces via a single classic manifold. We propose geodesic gradient descent (GGD), a generic and learning-rate-free Riemannian gradient descent algorithm. At each iteration, GGD uses an n-dimensional sphere to approximate a local neighborhood on the objective function-induced hypersurface, adapting to arbitrarily […]

Antimemetics: An Essential Field Guide

digitado ⋅ January 10, 2026

Memes have been part of the discourse since Richard Dawkins published The Selfish Gene in 1976. Dawkins defined “memes” as units of cultural transmission—ideas and concepts that spread through imitation. Any idea that replicates by passing from one mind to another is a meme. A meme can be a belief system, a set of behaviors, an ideology, a viral catchphrase, a fashion trend, a cultural artifact, or an urban legend. But if memes are defined by virality, antimemes […]

Exploration Through Introspection: A Self-Aware Reward Model

digitado ⋅ January 8, 2026

arXiv:2601.03389v1 Announce Type: new Abstract: Understanding how artificial agents model internal mental states is central to advancing Theory of Mind in AI. Evidence points to a unified system for self- and other-awareness. We explore this self-awareness by having reinforcement learning agents infer their own internal states in gridworld environments. Specifically, we introduce an introspective exploration component that is inspired by biological pain as a learning signal by utilizing a hidden Markov model to infer “pain-belief” from online observations. […]

Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning

digitado ⋅ February 4, 2026

Low-Rank Adaptation (LoRA) is the prevailing approach for efficient large language model (LLM) fine-tuning. Building on this paradigm, recent studies have proposed alternative initialization strategies and architectural modifications, reporting substantial improvements over vanilla LoRA. However, these gains are often demonstrated under fixed or narrowly tuned hyperparameter settings, despite the known sensitivity of neural networks to training configurations. In this work, we systematically re-evaluate four representative LoRA variants alongside vanilla LoRA through extensive hyperparameter searches. Across mathematical and code […]

Deep Learning Network-Temporal Models For Traffic Prediction

digitado ⋅ March 12, 2026

Time series analysis is critical for emerging network intelligent control and management functions. However, existing statistical and shallow machine learning models have shown limited prediction capabilities on multivariate time series. The intricate topological interdependency and complex temporal patterns in network data demand new modeling approaches. In this paper, based on a systematic multivariate time series model study, we present two deep learning models that aim to learn both temporal patterns and network topological correlations at the same time: […]
