digitado

technocracy

9 demos of Gemini Omni and Gemini 3.5 in action

digitado ⋅ 30 May 2026

Watch 9 videos showing the capabilities of Gemini Omni and Gemini 3.5, announced at Google I/O 2026.

technocracy

Stochastic Decision Horizons for Constrained Reinforcement Learning

digitado ⋅ 4 February 2026

Constrained Markov decision processes (CMDPs) provide a principled model for handling constraints, such as safety and other auxiliary objectives, in reinforcement learning. The common approach of using additive-cost constraints and dual variables often hinders off-policy scalability. We propose a Control as Inference formulation based on stochastic decision horizons, where constraint violations attenuate reward contributions and shorten the effective planning horizon via state-action-dependent continuation. This yields survival-weighted objectives that remain replay-compatible for off-policy actor-critic learning. We propose two violation […]

technocracy

AQUA: an Agile Process to Develop Quantum Annealing Applications

digitado ⋅ 22 January 2026

arXiv:2601.14501v1 (new). Abstract: Quadratic unconstrained binary optimization (QUBO) is a field of operations research that is attracting growing interest due to the recent availability of quantum hardware targeted at solving QUBO problems. However, practical adoption is hindered by mathematical intricacy, hardware constraints, and a lack of sound software engineering processes for QUBO development. This work presents AQUA (Agile QUantum Annealing), an agile lifecycle for QUBO/QA development created through an industry-academia partnership between NetService S.p.A and the […]

technocracy

ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge–Cloud Speculative LLM Serving

digitado ⋅ 15 April 2026

arXiv:2604.09722v1 (new). Abstract: Speculative decoding enables collaborative Large Language Model (LLM) inference across cloud and edge by separating lightweight token drafting from heavyweight verification. While prior systems show performance and cost benefits, practical deployment requires navigating a large configuration space spanning draft model variants, quantisation levels, speculative lengths, and heterogeneous edge devices. This paper presents ConfigSpec, a configuration-selection framework for distributed speculative LLM serving. ConfigSpec profiles edge devices and draft-target alignment, and models drafting throughput, acceptance […]

technocracy

Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces

digitado ⋅ 15 March 2026

End-to-end autonomous driving (E2E-AD) systems face challenges in lifelong learning, including catastrophic forgetting, difficulty in knowledge transfer across diverse scenarios, and spurious correlations between unobservable confounders and true driving intents. To address these issues, we propose DeLL, a Deconfounded Lifelong Learning framework that integrates a Dirichlet process mixture model (DPMM) with the front-door adjustment mechanism from causal inference. The DPMM is employed to construct two dynamic knowledge spaces: a trajectory knowledge space for clustering explicit driving behaviors and […]

technocracy

Is Gradient Ascent Really Necessary? Memorize to Forget for Machine Unlearning

digitado ⋅ 6 February 2026

For ethical and safe AI, machine unlearning has emerged as a critical topic, aiming to protect sensitive, private, and copyrighted knowledge from misuse. To achieve this goal, it is common to apply gradient ascent (GA) to reverse the training on undesired data. However, such a reversal is prone to catastrophic collapse, which leads to serious performance degradation on general tasks. As a solution, we propose model extrapolation as an alternative to GA, which reaches the counterpart direction in the […]

technocracy

When Domains Interact: Asymmetric and Order-Sensitive Cross-Domain Effects in Reinforcement Learning for Reasoning

digitado ⋅ 1 February 2026

Group Relative Policy Optimization (GRPO) has become a key technique for improving reasoning abilities in large language models, yet its behavior under different domain sequencing strategies is poorly understood. In particular, the impact of sequential (one domain at a time) versus mixed-domain (multiple domains at a time) training in GRPO has not been systematically studied. We provide the first systematic analysis of training-order effects across math, science, logic, and puzzle reasoning tasks. We found (1) single-domain generalization is […]

technocracy

llm-all-models-async 0.1

digitado ⋅ 31 March 2026

Release: llm-all-models-async 0.1. LLM plugins can define new models in both sync and async varieties. The async variants are most common for API-backed models, while sync variants tend to be things that run the model directly within the plugin. My llm-mrchatterbox plugin is sync only. I wanted to try it out with various Datasette LLM features (specifically datasette-enrichments-llm) but Datasette can only use async models. So… I had Claude spin up this plugin that turns sync models into […]

technocracy

The OpenClaw Mess: Why Your Autonomous Agent is a Security Suicide Note.

digitado ⋅ 4 March 2026

Author(s): Mandar Karhade, MD, PhD. Originally published on Towards AI. When 200,000 GitHub stars meet 30,000 exposed instances, it’s time to stop the madness. These 6 Alternatives Might Actually Be Better for You. OpenClaw is the 800-pound gorilla of self-hosted AI assistants with 251K GitHub stars and 23+ channel integrations. But if you’re about to spin it up, stop. NanoClaw gives you container-isolated security with a codebase small enough to actually read. PicoClaw runs on a $10 RISC-V […]

technocracy

AI on the couch: Anthropic gives Claude 20 hours of psychiatry

digitado ⋅ 9 April 2026

The AI company Anthropic released a 244-page “system card” (PDF) this week describing its newest model, Claude Mythos. The model is “our most capable frontier model to date,” the company says, and supposedly is so good that Anthropic has decided “not to make it generally available.” (The company claims that Mythos is too good at finding unknown cybersecurity bugs, and so the model is only being released to select companies like Microsoft and Apple for now.) Whatever the […]
