digitado

Stochastic Decision Horizons for Constrained Reinforcement Learning

digitado ⋅ February 4, 2026

Constrained Markov decision processes (CMDPs) provide a principled model for handling constraints, such as safety and other auxiliary objectives, in reinforcement learning. The common approach of using additive-cost constraints and dual variables often hinders off-policy scalability. We propose a Control as Inference formulation based on stochastic decision horizons, where constraint violations attenuate reward contributions and shorten the effective planning horizon via state-action-dependent continuation. This yields survival-weighted objectives that remain replay-compatible for off-policy actor-critic learning. We propose two violation […]
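
The survival-weighting idea in the excerpt above can be illustrated with a small sketch. This is not the paper's algorithm: the function name `survival_weighted_return`, the per-step continuation probabilities, and the convention that a reward at step t is scaled by the continuation probabilities of the earlier steps are all assumptions made for illustration.

```python
from typing import Sequence


def survival_weighted_return(
    rewards: Sequence[float],
    continuations: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Discounted return in which each reward is scaled by the probability
    that the episode is still 'alive' when that reward is collected.

    continuations[t] is a hypothetical state-action-dependent continuation
    probability in [0, 1]; a constraint violation would push it toward 0,
    which attenuates all later rewards and shortens the effective horizon.
    """
    if len(rewards) != len(continuations):
        raise ValueError("rewards and continuations must have the same length")
    total = 0.0
    weight = 1.0  # discount times survival probability accumulated so far
    for reward, cont in zip(rewards, continuations):
        total += weight * reward
        weight *= gamma * cont  # later rewards count only if we continue
    return total
```

With every continuation equal to 1 this reduces to the ordinary discounted return; a continuation of 0 at some step removes all later rewards from the sum.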


technocracy

Built an RL Toy Games Repo (3 Games Trained + 2 In Progress)

digitado ⋅ March 5, 2026

Hey!👋 I’ve got a little RL toybox repo I’ve been messing with on and off and figured I’d finally share it: https://github.com/bzznrc/rl-toybox I first tried this stuff a few years ago following this tutorial on building a LinearQNet-controlled Snake game. Then I tried applying the same to a tiny top-down shooter (“Bang”) and got immediately stuck on issues with the I/O and rewards. Recently I came back to it (thanks Codex) and managed to land an IO + […]


technocracy

From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training

digitado ⋅ February 5, 2026

arXiv:2501.06148v3. Abstract: We study the problem of training neural stochastic differential equations, or diffusion models, to sample from a Boltzmann distribution without access to target samples. Existing methods for training such models enforce time-reversal of the generative and noising processes, using either differentiable simulation or off-policy reinforcement learning (RL). We prove equivalences between families of objectives in the limit of infinitesimal discretization steps, linking entropic RL methods (GFlowNets) with continuous-time objects (partial differential equations and […]


technocracy

Google Gemini vs Anthropic Claude vs OpenAI ChatGPT vs xAI Grok: The Ultimate Comparison

digitado ⋅ March 12, 2026

Introduction to the 2026 AI Arms Race

Welcome to the AI arms race of 2026, and what a race it has become! Three years ago, ChatGPT was a novelty. Today, four AI titans are locked in a war for your attention, your workflow, your trust, and ultimately your subscription dollars. Google Gemini, Anthropic Claude, OpenAI ChatGPT, and xAI Grok have each staked out their territory, sharpened their claws, and are sprinting toward a finish line that keeps […]


technocracy

[R] Fast WTConv: Accelerated Implementation for “Wavelet Convolutions for Large Receptive Fields”

digitado ⋅ February 10, 2026

TL;DR: If you use depthwise convolutions, you may improve performance by using our popular WTConv [Finder et al., ECCV 2024], a simple and widely-used drop-in replacement. WTConv was previously implemented only in PyTorch, but it is now much faster with optimized code for CUDA/MPS/Triton. The WTConv layer, which we proposed in [Finder et al. ECCV 2024], is wavelet-based and serves as a simple drop-in replacement for a depthwise convolution. It increases the effective receptive field and often yields […]


technocracy

GPU Infrastructure Is Becoming an Asset Class — Here’s Why Crypto Investors Are Paying Attention

digitado ⋅ February 19, 2026

Big Tech will spend over $600 billion on AI infrastructure in 2026, a 36% jump from last year. The vast majority of that capital is flowing toward one thing: GPUs. For investors who already understand hardware-based yield and decentralized infrastructure, this represents a familiar opportunity in an unfamiliar market.

The GPU Shortage Isn’t Hype. It’s Structural.

Graphics processing units were originally designed to render video game graphics. Today, they are the computational backbone of artificial intelligence. Every […]


technocracy

Imposing Boundary Conditions on Neural Operators via Learned Function Extensions

digitado ⋅ February 4, 2026

Neural operators have emerged as powerful surrogates for the solution of partial differential equations (PDEs), yet their ability to handle general, highly variable boundary conditions (BCs) remains limited. Existing approaches often fail when the solution operator exhibits strong sensitivity to boundary forcings. We propose a general framework for conditioning neural operators on complex non-homogeneous BCs through function extensions. Our key idea is to map boundary data to latent pseudo-extensions defined over the entire spatial domain, enabling any standard […]


technocracy

Pierre Teilhard de Chardin: The Universal Element

digitado ⋅ January 27, 2025

Teilhard describes the Universal Element—a mysterious reality binding all existence to the Absolute. Through what he calls “cosmic consciousness,” minds perceive unity beneath multiplicity, glimpsing a precious fabric woven through creation. Rejecting both pantheism and mere naturalism, he identifies this element as Christ, the cosmic center who unifies without absorbing, drawing all things toward convergence while preserving their distinct personalities in the growing body of the Pleroma.


technocracy

Examining LLMs Ability to Summarize Code Through Mutation-Analysis

digitado ⋅ February 23, 2026

arXiv:2602.17838v1. Abstract: As developers increasingly rely on LLM-generated code summaries for documentation, testing, and review, it is important to study whether these summaries accurately reflect what the program actually does. LLMs often produce confident descriptions of what the code looks like it should do (intent), while missing subtle edge cases or logic changes that define what it actually does (behavior). We present a mutation-based evaluation methodology that directly tests whether a summary truly matches the […]


technocracy

RL on Mac M1 series?

digitado ⋅ January 13, 2026

Hey everyone, I’m curious whether it’s possible to break into RL research or personal projects in robotics or related areas on a Mac M1 device, aside from the typical gym projects. I know there is the Genesis engine; would that be the only option, or are there other possibilities? Appreciate your thoughts. submitted by /u/Sad-Throat-2384
