digitado

Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics

digitado ⋅ 24 February 2026

Visual reinforcement learning is appealing for robotics but expensive: off-policy methods are sample-efficient yet slow, while on-policy methods parallelize well but waste samples. Recent work has shown that off-policy methods can train faster than on-policy methods in wall-clock time for state-based control. Extending this to vision remains challenging because high-dimensional input images complicate training dynamics and introduce substantial storage and encoding overhead. To address these challenges, we introduce Squint, a visual Soft Actor-Critic method that achieves faster […]


technocracy

Fiducial Exoskeletons: Image-Centric Robot State Estimation

digitado ⋅ 14 January 2026

arXiv:2601.08034v1 (announce type: new). Abstract: We introduce Fiducial Exoskeletons, an image-based reformulation of 3D robot state estimation that replaces cumbersome procedures and motor-centric pipelines with single-image inference. Traditional approaches – especially robot-camera extrinsic estimation – often rely on high-precision actuators and require time-consuming routines such as hand-eye calibration. In contrast, modern learning-based robot control is increasingly trained and deployed from RGB observations on lower-cost hardware. Our key insight is twofold. First, we cast robot state estimation as 6D […]


Evaluating LLMs When They Do Not Know the Answer: Statistical Evaluation of Mathematical Reasoning via Comparative Signals

digitado ⋅ 4 February 2026

arXiv:2602.03061v1 (announce type: cross). Abstract: Evaluating mathematical reasoning in LLMs is constrained by limited benchmark sizes and inherent model stochasticity, yielding high-variance accuracy estimates and unstable rankings across platforms. On difficult problems, an LLM may fail to produce a correct final answer, yet still provide reliable pairwise comparison signals indicating which of two candidate solutions is better. We leverage this observation to design a statistically efficient evaluation framework that combines standard labeled outcomes with pairwise comparison signals obtained […]


Flow Matching for Offline Reinforcement Learning with Discrete Actions

digitado ⋅ 9 February 2026

arXiv:2602.06138v1 (announce type: new). Abstract: Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous action spaces. To address a broader range of offline RL settings, we extend flow matching to a general framework that supports discrete action spaces with multiple objectives. Specifically, we replace continuous flows with continuous-time Markov chains, trained using a Q-weighted flow matching objective. We then extend our […]


jordanhubbard/nanolang

digitado ⋅ 20 January 2026

Plenty of people have mused about what a new programming language specifically designed to be used by LLMs might look like. Jordan Hubbard (co-founder of FreeBSD, with serious stints at Apple and NVIDIA) just released exactly that. A minimal, LLM-friendly programming language with mandatory testing and unambiguous syntax. NanoLang transpiles to C for native performance while providing a clean, modern syntax optimized for both human readability and AI code generation. The syntax strikes me as an interesting […]


AgentTutor: Empowering Personalized Learning with Multi-Turn Interactive Teaching in Intelligent Education Systems

digitado ⋅ 9 January 2026

arXiv:2601.04219v1 (announce type: new). Abstract: The rapid advancement of large-scale language models (LLMs) has shown their potential to transform intelligent education systems (IESs) through automated teaching and learning support applications. However, current IESs often rely on single-turn static question-answering, which fails to assess learners’ cognitive levels, cannot adjust teaching strategies based on real-time feedback, and is limited to providing simple one-off responses. To address these issues, we introduce AgentTutor, a multi-turn interactive intelligent education system to empower personalized […]


Attention? Attention!

digitado ⋅ 24 June 2018

[Updated on 2018-10-28: Add Pointer Network and the link to my implementation of Transformer.] [Updated on 2018-11-06: Add a link to the implementation of Transformer model.] [Updated on 2018-11-18: Add Neural Turing Machines.] [Updated on 2019-07-18: Correct the mistake on using the term “self-attention” when introducing the show-attention-tell paper; moved it to Self-Attention section.] [Updated on 2020-04-07: A follow-up post on improved Transformer models is here.]


The life of a prescription at Amazon Pharmacy

digitado ⋅ 30 September 2024

From pricing estimation and regulatory compliance to inventory management and chatbot assistants, machine learning models help Amazon Pharmacy customers stay healthy and save time and money.

Conversational AI ⋅ Alexandre Alves, Anita Vila ⋅ September 30, 01:32 PM ⋅ October 02, 11:42 AM

Pharmacies play a vital role in ensuring patients' health, but the process of dispensing medications is far more complex than it may appear. At Amazon Pharmacy, we are using artificial […]


The Tragedy of Chain Commons

digitado ⋅ 25 February 2026

arXiv:2602.20341v1 (announce type: new). Abstract: Byzantine Fault Tolerant (BFT) consensus forms the foundation of many modern blockchains striving for both high throughput and low latency. A growing bottleneck is transaction execution and validation on the critical path of consensus, which has led to modular decoupled designs that separate ordering from execution: Consensus orders only metadata, while transactions are executed and validated concurrently. While this approach improves performance, it can leave invalid transactions in the ledger, increasing storage costs […]


Annotated PIM Bibliography

digitado ⋅ 15 January 2026

arXiv:2601.09002v1 (announce type: new). Abstract: Processing in Memory (PIM) and similar terms such as Compute In Memory (CIM), Logic in Memory (LIM), In Memory Computing (IMC), and Near Memory Computing (NMC) have gained attention recently as a potentially “revolutionary new” technique. The truth, however, is that many examples of the technology go back over 60 years. This document attempts to provide an annotated bibliography of PIM technology that attempts to cover the whole time-frame, and is organized to […]
