March 2026

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

digitado ⋅ March 10, 2026

Reinforcement Learning with Verifiable Rewards (RLVR) has significantly advanced the reasoning capacity of Large Language Models (LLMs). However, RLVR solely relies on final answers as outcome rewards, neglecting the correctness of intermediate reasoning steps. Training on these process-wrong but outcome-correct rollouts can lead to hallucination and answer-copying, severely undermining the model’s generalization and robustness. To address this, we incorporate a Contrastive Learning mechanism into the Policy Optimization (CLIPO) to generalize the RLVR process. By optimizing a contrastive loss […]

Task Aware Modulation Using Representation Learning for Upscaling of Terrestrial Carbon Fluxes

digitado ⋅ March 10, 2026

Accurately upscaling terrestrial carbon fluxes is central to estimating the global carbon budget, yet remains challenging due to the sparse and regionally biased distribution of ground measurements. Existing data-driven upscaling products often fail to generalize beyond observed domains, leading to systematic regional biases and high predictive uncertainty. We introduce Task-Aware Modulation with Representation Learning (TAM-RL), a framework that couples spatio-temporal representation learning with a knowledge-guided encoder-decoder architecture and a loss function derived from the carbon balance equation. Across 150+ flux […]

Trump’s divisive FDA vaccine regulator self-destructs, will exit agency (again)

digitado ⋅ March 10, 2026

For the second time, Vinay Prasad is set to leave the Food and Drug Administration. In a post on social media Friday, FDA Commissioner Marty Makary announced that Prasad will exit in April, adding that he got “a tremendous amount accomplished” during his year at the agency. Prasad’s tenure was generally marked by controversy, but he is departing amid a cluster of self-destructive decisions. Those include a shocking rejection of an mRNA vaccine (which was over the objections […]

NASA and SpaceX disagree about manual controls for lunar lander

digitado ⋅ March 10, 2026

NASA’s inspector general released a new report on Tuesday that examines the space agency’s management of the Human Landing System development contracts signed with SpaceX and Blue Origin. These landers are essential for NASA’s program to land humans on the Moon this decade and then establish a long-term settlement on the lunar surface. However, both NASA and the companies developing the landers have largely been silent about their efforts. For this reason the new report on Human Landing […]

When Learning Rates Go Wrong: Early Structural Signals in PPO Actor-Critic

digitado ⋅ March 10, 2026

Deep Reinforcement Learning systems are highly sensitive to the learning rate (LR), and selecting stable and performant training runs often requires extensive hyperparameter search. In Proximal Policy Optimization (PPO) actor–critic methods, small LR values lead to slow convergence, whereas large LR values may induce instability or collapse. We analyse this phenomenon from the behavior of the hidden neurons in the network using the Overfitting-Underfitting Indicator (OUI), a metric that quantifies the balance of binary activation patterns over a […]

YouTube expands access to deepfake detection tool for journalists and leaders

digitado ⋅ March 10, 2026

These days, YouTube has become a home for AI slop. Every now and then, we come across AI-generated videos, mostly on Shorts. Deepfakes are another concern that YouTube has been trying to deal with. Last year, the company rolled out AI likeness detection to creators in the Partner Program to help them manage unauthorized AI-generated content.

YouTube moves to protect journalists and civic leaders by expanding access to its AI likeness detection tool

Now, […]

Roadmap to learn RL and simulate a self-balancing bipedal robot using MuJoCo. Need to know if I am on the right path or if I am missing something

digitado ⋅ March 10, 2026

Starting with the foundations of RL using Sutton and Barto, gonna try to implement the algorithms using NumPy. Moving on to DRL using the Hugging Face course, Spinning Up by OpenAI, and CleanRL. I think SB3 is used here, but if I'm missing something pls lmk. Finally, MuJoCo along with a custom env.

submitted by /u/ElectricalCamera6046

CarbonBench: A Global Benchmark for Upscaling of Carbon Fluxes Using Zero-Shot Learning

digitado ⋅ March 10, 2026

Accurately quantifying terrestrial carbon exchange is essential for climate policy and carbon accounting, yet models must generalize to ecosystems underrepresented in sparse eddy covariance observations. Despite this challenge being a natural instance of zero-shot spatial transfer learning for time series regression, no standardized benchmark exists to rigorously evaluate model performance across geographically distinct locations with different climate regimes and vegetation types. We introduce CarbonBench, the first benchmark for zero-shot spatial transfer in carbon flux upscaling. CarbonBench comprises over […]

Oracle Is Firing 30,000 People to Pay for AI It Hasn’t Built Yet

digitado ⋅ March 10, 2026

Author(s): Menna Adly. Originally published on Towards AI. If your company is “pivoting to AI,” your job might be funding the pivot. One desk. One cut badge. And a data center full of chalk outlines where the servers were supposed to go. Made By Author. On February 27, 2026, OpenAI closed the largest private funding round in history: $110 billion, backed by Amazon, Nvidia, and SoftBank. Six days later, Oracle confirmed it would cut thousands of workers to […]
