DQN with Catastrophic Forgetting?
Hi everyone, happy new year! I have a project where I'm training a DQN on pricing and stock decisions. Unfortunately, I seem to be running into some kind of forgetting. When I train with a purely random policy (100% exploration rate) and then evaluate it greedily, the agent actually reaches values better than the fixed policy. The problem arises when I let it train beyond that point, especially after […]
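For context, here is a rough sketch of the setup I'm describing (simplified, not my actual code; the network and helper names are just placeholders). The idea is: phase 1 trains while acting with epsilon = 1.0 (pure random actions, the network still learns from the replay buffer), phase 2 evaluates with epsilon = 0.0 (greedy):

```python
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Placeholder Q-network mapping a state vector to one Q-value per action."""
    def __init__(self, n_states: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def select_action(qnet: QNet, state: torch.Tensor, epsilon: float, n_actions: int) -> int:
    # Epsilon-greedy action selection:
    #   epsilon = 1.0 -> the "pure random exploration" training phase
    #   epsilon = 0.0 -> the greedy evaluation phase
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(qnet(state).argmax().item())
```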