Is RLHF fundamentally broken? Paid labelers rating synthetic scenarios doesn’t seem like real human feedback to me
Every major AI model goes through RLHF: thousands of paid contractors rating AI outputs to teach models what "good" looks like. But here's what bothers me:

- Contractors are paid per task, so they're incentivized to finish fast, not to engage deeply.
- They're rating synthetic scenarios, not real emotional situations.
- They burn out after thousands of repetitive evaluations.

The result is AI that passes every benchmark but fails in real human moments. OpenAI has reportedly spent $100M+ on this process. And […]
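For context on why rater quality matters so much: in standard RLHF, each rater comparison becomes a training pair for a reward model, typically through a Bradley-Terry style pairwise loss. A minimal sketch (the reward values are made-up numbers, not from any real model) of how a single rushed or flipped label inverts the training signal:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss commonly used in RLHF reward modeling:
    # training pushes the reward model to score the rater's pick higher.
    return -math.log(sigmoid(r_chosen - r_rejected))

# Hypothetical scalar rewards the model assigns to two candidate replies.
good, bad = 1.2, -0.3

careful = preference_loss(good, bad)  # rater picked the genuinely better reply
rushed = preference_loss(bad, good)   # rater clicked the wrong one

# A flipped label turns a low loss into a high one, so gradient descent
# now pulls the reward model toward the worse response.
print(careful < rushed)
```

The point of the sketch: the loss has no notion of whether the rater actually engaged with the content. A per-task-paid click and a careful judgment produce identical training signals, which is exactly the failure mode described above.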