digitado

About digitado

https://www.digitado.com.br

Reason in Chains, Learn in Trees: Self-Rectification and Grafting for Multi-turn Agent Policy Optimization

digitado ⋅ April 8, 2026

Reinforcement learning for Large Language Model agents is often hindered by sparse rewards in multi-step reasoning tasks. Existing approaches like Group Relative Policy Optimization treat sampled trajectories as independent chains, assigning uniform credit to all steps in each chain and ignoring the existence of critical steps that may disproportionately affect the reasoning outcome. In this paper, we propose T-STAR (Tree-structured Self-Taught Agent Rectification), a framework that recovers the latent correlated reward structure across seemingly independent trajectories. Specifically, we consolidate trajectories […]
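The uniform credit assignment that the excerpt attributes to group-relative methods can be illustrated with a small sketch. This is not the T-STAR method and not a reference GRPO implementation; the function names, the population standard deviation, and the `eps` constant are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each trajectory's reward against its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; variants differ
    return [(r - mu) / (sigma + eps) for r in rewards]


def uniform_step_credit(rewards, steps_per_trajectory):
    """Give every step of a trajectory the same group-relative advantage."""
    advantages = group_relative_advantages(rewards)
    return [[a] * n for a, n in zip(advantages, steps_per_trajectory)]


# Four sampled trajectories with sparse 0/1 outcome rewards and
# different lengths (hypothetical values).
credit = uniform_step_credit([1.0, 0.0, 0.0, 1.0], [3, 2, 4, 1])
```

Every step inside one trajectory receives an identical value here, so a single decisive step is indistinguishable from the steps around it, which is the limitation the excerpt points to.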


technocracy

Joint Interference Detection and Identification via Adversarial Multi-task Learning

digitado ⋅ April 8, 2026

Precise interference detection and identification are crucial for enhancing the survivability of communication systems in non-cooperative wireless environments. While deep learning (DL) has advanced this field, existing single-task learning (STL) approaches neglect inherent task correlations. Furthermore, emerging multi-task learning (MTL) methods often lack a theoretical foundation for quantifying and modeling task relationships. To bridge this gap, we establish a theoretically grounded MTL framework for joint interference detection, modulation identification, and interference identification. First, we derive an upper bound […]


technocracy

Energy Saving for Cell-Free Massive MIMO Networks: A Multi-Agent Deep Reinforcement Learning Approach

digitado ⋅ April 8, 2026

This paper focuses on energy savings in downlink operation of cell-free massive MIMO (CF mMIMO) networks under dynamic traffic conditions. We propose a multi-agent deep reinforcement learning (MADRL) algorithm that enables each access point (AP) to autonomously control antenna re-configuration and advanced sleep mode (ASM) selection. After the training process, the proposed framework operates in a fully distributed manner, eliminating the need for centralized control and allowing each AP to dynamically adjust to real-time traffic fluctuations. Simulation results […]


technocracy

DDP-SA: Scalable Privacy-Preserving Federated Learning via Distributed Differential Privacy and Secure Aggregation

digitado ⋅ April 8, 2026

This article presents DDP-SA, a scalable privacy-preserving federated learning framework that jointly leverages client-side local differential privacy (LDP) and full-threshold additive secret sharing (ASS) for secure aggregation. Unlike existing methods that rely solely on differential privacy or on secure multi-party computation (MPC), DDP-SA integrates both techniques to deliver stronger end-to-end privacy guarantees while remaining computationally practical. The framework introduces a two-stage protection mechanism: clients first perturb their local gradients with calibrated Laplace noise, then decompose the noisy gradients […]
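The two stages named in the excerpt (Laplace perturbation, then additive secret sharing) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the DDP-SA protocol: the modulus, the fixed-point scale, the noise calibration `sensitivity / epsilon`, and all function names are hypothetical, and `random.Random` is used only for reproducibility where a real deployment would need a cryptographically secure source.

```python
import random

MODULUS = 2**61 - 1  # hypothetical modulus for share arithmetic
SCALE = 10**6        # hypothetical fixed-point precision


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(gradient, sensitivity, epsilon, rng):
    """Stage 1: add Laplace noise to each gradient coordinate."""
    b = sensitivity / epsilon  # assumed calibration of the noise scale
    return [g + laplace_noise(b, rng) for g in gradient]


def encode(x):
    """Map a float to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v):
    """Map a residue back to a signed float."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value, n_parties, rng):
    """Stage 2: split a residue into n additive shares (all n needed)."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    return sum(shares) % MODULUS


rng = random.Random(0)  # NOT secure; for a repeatable demo only
noisy = perturb([0.5, -1.25], sensitivity=1.0, epsilon=1.0, rng=rng)
shares = [share(encode(g), 3, rng) for g in noisy]
recovered = [decode(reconstruct(s)) for s in shares]
```

Any strict subset of the shares of a coordinate is uniformly random on its own, and only the sum of all of them yields the noisy value, which is what "full-threshold" refers to in this sketch.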


technocracy

With Orion still flying, NASA is nearing key decisions about Artemis III

digitado ⋅ April 8, 2026

NASA’s Artemis II mission has yet to return to Earth—it will do so on Friday evening, splashing down into the Pacific Ocean off the coast of San Diego—but the agency is already nearing some key decisions on the next Artemis mission. The US space agency announced six weeks ago that it was modifying its Artemis timeline to insert a mission before beginning planned lunar landings. This new mission, designated Artemis III and intended to fly in Earth orbit […]


technocracy

Information as Structural Alignment: A Dynamical Theory of Continual Learning

digitado ⋅ April 8, 2026

Catastrophic forgetting is not an engineering failure. It is a mathematical consequence of storing knowledge as global parameter superposition. Existing methods, such as regularization, replay, and frozen subnetworks, add external mechanisms to a shared-parameter substrate. None derives retention from the learning dynamics themselves. This paper introduces the Informational Buildup Framework (IBF), an alternative substrate for continual learning, based on the premise that information is the achievement of structural alignment rather than stored content. In IBF, two equations govern […]


technocracy

Anthropic limits access to Mythos, its new cybersecurity AI model

digitado ⋅ April 8, 2026

Anthropic has launched a new cybersecurity AI model to a select group of customers, including Amazon, Apple, and Microsoft, days after details about the project were leaked online. Its new model, Claude Mythos Preview, would be available only to vetted organizations, including Broadcom, Cisco, and CrowdStrike, Anthropic said on Tuesday. The company added it was also in discussions with the US government about its use. The announcement follows a data leak by the San Francisco start-up last month, […]


technocracy

Epistemic Robust Offline Reinforcement Learning

digitado ⋅ April 8, 2026

Offline reinforcement learning learns policies from fixed datasets without further environment interaction. A key challenge in this setting is epistemic uncertainty, arising from limited or biased data coverage, particularly when the behavior policy systematically avoids certain actions. This can lead to inaccurate value estimates and unreliable generalization. Ensemble-based methods like SAC-N mitigate this by conservatively estimating Q-values using the ensemble minimum, but they require large ensembles and often conflate epistemic with aleatoric uncertainty. To address these limitations, we […]
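The ensemble-minimum idea mentioned for SAC-N can be shown in a few lines. This is a simplified sketch, not the SAC-N algorithm or the method the post proposes: the entropy term of soft actor-critic is omitted, and the function names and numeric values are hypothetical.

```python
def conservative_q(q_estimates):
    """Pessimistic value: the minimum over the Q-ensemble's predictions."""
    return min(q_estimates)


def td_target(reward, gamma, next_q_estimates, done=False):
    """One-step bootstrap target using the ensemble minimum."""
    bootstrap = 0.0 if done else gamma * conservative_q(next_q_estimates)
    return reward + bootstrap


# Ensemble members agree on a well-covered action and disagree on a
# poorly covered one (hypothetical values).
well_covered = td_target(1.0, 0.9, [2.0, 2.1, 1.9])
poorly_covered = td_target(1.0, 0.9, [2.0, 5.0, -1.0])
```

Disagreement among ensemble members pulls the target down, which is why the minimum acts as a penalty on poorly covered actions. The minimum reacts to any spread in the predictions, so it cannot by itself tell epistemic from aleatoric uncertainty.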


technocracy

Smart Tech Korea 2026: Asia’s Rising Gateway for AI, Robotics, and Next-Gen Tech Innovation

digitado ⋅ April 8, 2026

STK 2026 (The 15th Smart Tech Korea), South Korea’s leading B2B technology platform, will take place from June 10 to 12, 2026, marking the exhibition’s largest footprint to date. The 15th edition will highlight transformative technologies reshaping global industries—ranging from AI and robotics to cloud, digital logistics, autonomous manufacturing and advanced security solutions. As an “Only B2B” event, STK 2026 is shaping up to be one of Asia’s most influential platforms for AI and emerging technologies. The 2026 […]


technocracy

Production-Ready Automated ECU Calibration using Residual Reinforcement Learning

digitado ⋅ April 8, 2026

Electronic Control Units (ECUs) have played a pivotal role in transforming motorcars of yore into the modern vehicles we see on our roads today. They actively regulate the actuation of individual components and thus determine the characteristics of the whole system. The behavior of the control functions depends heavily on their calibration parameters, which engineers traditionally design by hand. This is taking place in an environment of rising customer expectations and steadily shorter product development cycles. […]
