digitado – Page 170

Unrewarded Exploration in Large Language Models Reveals Latent Learning from Psychology

digitado ⋅ January 30, 2026

Latent learning, classically theorized by Tolman, shows that biological agents (e.g., rats) can acquire internal representations of their environment without rewards, enabling rapid adaptation once rewards are introduced. In contrast, from a cognitive science perspective, reward learning remains overly dependent on external feedback, limiting flexibility and generalization. Although recent advances in the reasoning capabilities of large language models (LLMs), such as OpenAI-o1 and DeepSeek-R1, mark a significant breakthrough, these models still rely primarily on reward-centric reinforcement learning paradigms. […]

technocracy

OpenFang—The Game-Changing Open Source Agent Operating System That Replaces OpenClaw

digitado ⋅ March 30, 2026

The Most Popular But Flawed OpenClaw Gets a High Security Replacement in OpenFang

On November 25th, 2025, a developer named Peter Steinberger pushed an open-source project called Clawdbot to GitHub. By mid-March, it had spawned over twenty alternatives, triggered a Mac mini shortage in several U.S. cities, and earned its current name, OpenClaw, after two rapid rebrands driven by trademark disputes. That was not a product launch. That was a detonation. And yet, within […]

technocracy

As AI Systems Become More Capable, We Would Like to Enlist their Help to Supervise Other AIs

digitado ⋅ January 16, 2026

Building Harmless AI With Self-Critique and AI Feedback

Authors: Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, […]

technocracy

The Dark Logic of COMEFROM and Human Thought Patterns

digitado ⋅ April 21, 2026

Note: This article was written with the assistance of the free version of Google Gemini AI in quick mode. TL;DR: Scroll to the bottom for the keyword “Summary”. As we know, COMEFROM is a sarcastic joke mocking the evils of GOTO, and almost no seasoned professional software engineer would even consider using it in production in any commercial codebase, so it’s only natural that nearly nobody takes this programming construct seriously. But what if I tell […]

technocracy

PaAgent: Portrait-Aware Image Restoration Agent via Subjective-Objective Reinforcement Learning

digitado ⋅ March 19, 2026

arXiv:2603.17055v1 Announce Type: new Abstract: Image Restoration (IR) agents, leveraging multimodal large language models to perceive degradation and invoke restoration tools, have shown promise in automating IR tasks. However, existing IR agents typically lack an insight summarization mechanism for past interactions, which results in an exhaustive search for the optimal IR tool. To address this limitation, we propose a portrait-aware IR agent, dubbed PaAgent, which incorporates a self-evolving portrait bank for IR tools and Retrieval-Augmented Generation (RAG) to […]

technocracy

INT3 compression+fused metal kernels [R]

digitado ⋅ April 22, 2026

Hey guys, I am a researcher and solo founder. I compress models with INT3 at +0.14 nats and built a 2-bit KV cache for long-horizon tasks. I shipped both (INT3 model + INT2 KV) with custom fused Metal kernels for Mac (M-series). Currently Qwen 7B is available in preview.

# install
brew install reinforceai/spiral/spiral
# chat
spiral-chat

I am optimizing kernels further and working on Triton kernels for GPU support. There is still more room to pack more efficiently, I […]

technocracy

Warm Starts, Cold States: Exploiting Adiabaticity for Variational Ground-States

digitado ⋅ February 9, 2026

arXiv:2602.06137v1 Announce Type: cross Abstract: Reliable preparation of many-body ground states is an essential task in quantum computing, with applications spanning areas from chemistry and materials modeling to quantum optimization and benchmarking. A variety of approaches have been proposed to tackle this problem, including variational methods. However, variational training often struggles to navigate complex energy landscapes, frequently encountering suboptimal local minima or suffering from barren plateaus. In this work, we introduce an iterative strategy for ground-state preparation based […]

technocracy

Evaluating Robustness and Adaptability in Learning-Based Mission Planning for Active Debris Removal

digitado ⋅ February 4, 2026

Autonomous mission planning for Active Debris Removal (ADR) must balance efficiency, adaptability, and strict feasibility constraints on fuel and mission duration. This work compares three planners for the constrained multi-debris rendezvous problem in Low Earth Orbit: a nominal Masked Proximal Policy Optimization (PPO) policy trained under fixed mission parameters, a domain-randomized Masked PPO policy trained across varying mission constraints for improved robustness, and a plain Monte Carlo Tree Search (MCTS) baseline. Evaluations are conducted in a high-fidelity orbital […]

technocracy

Accelerating OpenPangu Inference on NPU via Speculative Decoding

digitado ⋅ March 5, 2026

arXiv:2603.03383v1 Announce Type: new Abstract: To mitigate the Memory Wall bottleneck encountered by Large Language Models (LLMs) during inference on NPU hardware, and to address the scarcity of native support for mainstream speculative decoding algorithms on domestic infrastructure, this study presents an end-to-end speculative inference acceleration scheme for OpenPangu-7B.

technocracy

IceWatch: Forecasting Glacial Lake Outburst Floods (GLOFs) using Multimodal Deep Learning

digitado ⋅ January 18, 2026

Glacial Lake Outburst Floods (GLOFs) pose a serious threat in high mountain regions. They are hazardous to communities, infrastructure, and ecosystems further downstream. The classical methods of GLOF detection and prediction have so far mainly relied on hydrological modeling, threshold-based lake monitoring, and manual satellite image analysis. These approaches suffer from several drawbacks: slow updates, reliance on manual labor, and loss of accuracy when clouds interfere or on-site data are lacking. To tackle these challenges, we present IceWatch: a […]
