digitado

About digitado

https://www.digitado.com.br

Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning

digitado ⋅ April 10, 2026

Offline goal-conditioned reinforcement learning (GCRL) is a practical reinforcement learning paradigm that aims to learn goal-conditioned policies from reward-free offline data. Despite recent advances in hierarchical architectures such as HIQL, long-horizon control in offline GCRL remains challenging due to the limited expressiveness of Gaussian policies and the inability of high-level policies to generate effective subgoals. To address these limitations, we propose the goal-conditioned mean flow policy, which introduces an average velocity field into hierarchical policy modeling for offline […]

technocracy

Most Replayed Moment: Your Thoughts Shape Your Reality! How To Rewrite Limiting Beliefs

digitado ⋅ April 10, 2026

Marisa Peer is a renowned therapist and best-selling author, known for her work in personal growth and the mind-body connection. In this Moments episode, she explores how childhood experiences, shaped by family dynamics and unmet needs, create subconscious beliefs that influence how we see ourselves and the world. Marisa shares practical tools to shift these beliefs and successfully reshape your reality. Listen to the full episode here! Spotify: https://g2ul0.app.link/u9dMae0Kc2b Apple: https://g2ul0.app.link/48NKVd4Kc2b Watch the Episodes On YouTube: https://www.youtube.com/c/%20TheDiaryOfACEO/videos Marisa […]

technocracy

WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning

digitado ⋅ April 10, 2026

Reinforcement learning (RL) in robotics is often limited by the cost and risk of data collection, motivating experience transfer from a source task to a target task. Offline-to-online RL leverages prior data but typically assumes a given fixed dataset and does not address how to generate reliable data for transfer. We propose World Model-based Experience Transfer (WOMBET), a framework that jointly generates and utilizes prior data. WOMBET learns a world model in the source task and generates offline […]

technocracy

Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication

digitado ⋅ April 10, 2026

Multi-agent coordination under partial observability requires agents to share complementary private information. While recent methods optimize messages for intermediate objectives (e.g., reconstruction accuracy or mutual information), rather than decision quality, we introduce SeqComm-DFL, unifying the sequential communication with decision-focused learning for task performance. Our approach features value-aware message generation with sequential Stackelberg conditioning: messages maximize receiver decision quality and are generated in priority order, with agents conditioning on their predecessors. The guidance potential determined by their prosocial ordering. […]

technocracy

Delve into the Applicability of Advanced Optimizers for Multi-Task Learning

digitado ⋅ April 10, 2026

Multi-Task Learning (MTL) is a foundational machine learning problem that has seen extensive development over the past decade. Recently, various optimization-based MTL approaches have been proposed to learn multiple tasks simultaneously by altering the optimization trajectory. Although these methods strive to de-conflict and re-balance tasks, we empirically identify that their effectiveness is often undermined by an overlooked factor when employing advanced optimizers: the instant-derived gradients play only a marginal role in the actual parameter updates. This discrepancy prevents […]

technocracy

2DRL – Box2D reinforcement learning editor

digitado ⋅ April 10, 2026

I’ve been on-and-off working on this project for a few months, just wanted to share it: https://www.2drl.com/ TLDR – It’s kinda like Unity but for reinforcement learning and much more lightweight. It lets you visually design Box2D (2D rigid body physics) gym environments using a drag-and-drop interface. It also has scripting support, so in principle you can define any environment with any custom behaviour. From your scene and script, it will automatically generate the full environment code, which […]

technocracy

A novel hybrid approach for positive-valued DAG learning

digitado ⋅ April 10, 2026

Causal discovery from observational data remains a fundamental challenge in machine learning and statistics, particularly when variables represent inherently positive quantities such as gene expression levels, asset prices, company revenues, or population counts, which often follow multiplicative rather than additive dynamics. We propose the Hybrid Moment-Ratio Scoring (H-MRS) algorithm, a novel method for learning directed acyclic graphs (DAGs) from positive-valued data by combining moment-based scoring with log-scale regression. The key idea is that for positive-valued variables, the moment […]

technocracy

MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security

digitado ⋅ April 10, 2026

arXiv:2604.07551v1. Abstract: The Model Context Protocol (MCP) enables large language models (LLMs) to dynamically discover and invoke third-party tools, significantly expanding agent capabilities while introducing a distinct security landscape. Unlike prompt-only interactions, MCP exposes pre-execution artifacts, shared context, multi-turn workflows, and third-party supply chains to adversarial influence across independently operated components. While recent work has identified MCP-specific attacks and evaluated defenses, existing studies are largely attack-centric or benchmark-driven, providing limited guidance on where mitigation responsibility […]

technocracy

EMSDialog: Synthetic Multi-person Emergency Medical Service Dialogue Generation from Electronic Patient Care Reports via Multi-LLM Agents

digitado ⋅ April 10, 2026

arXiv:2604.07549v1. Abstract: Conversational diagnosis prediction requires models to track evolving evidence in streaming clinical conversations and decide when to commit to a diagnosis. Existing medical dialogue corpora are largely dyadic or lack the multi-party workflow and annotations needed for this setting. We introduce an ePCR-grounded, topic-flow-based multi-agent generation pipeline that iteratively plans, generates, and self-refines dialogues with rule-based factual and topic flow checks. The pipeline yields EMSDialog, a dataset of 4,414 synthetic multi-speaker EMS conversations […]

technocracy

The Day My Chatbot Changed: Characterizing the Mental Health Impacts of Social AI App Updates via Negative User Reviews

digitado ⋅ April 10, 2026

arXiv:2604.07548v1. Abstract: Artificial Intelligence (AI) chatbots are increasingly used for emotional, creative, and social support, leading to sustained and routine user interaction with these systems. As these applications evolve through frequent version updates, changes in functionality or behavior may influence how users evaluate them. However, work on how publicly expressed user feedback varies across app versions in real-world deployment contexts is limited. This study analyzes 210,840 Google Play reviews of the chatbot application Character AI, […]
