January 2026

Learning Policy Representations for Steerable Behavior Synthesis

digitado ⋅ January 29, 2026

Given a Markov decision process (MDP), we seek to learn representations for a range of policies to facilitate behavior steering at test time. As policies of an MDP are uniquely determined by their occupancy measures, we propose modeling policy representations as expectations of state-action feature maps with respect to occupancy measures. We show that these representations can be approximated uniformly for a range of policies using a set-based architecture. Our model encodes a set of state-action samples into […]

technocracy

Quantum-Inspired Reinforcement Learning for Secure and Sustainable AIoT-Driven Supply Chain Systems

digitado ⋅ January 29, 2026

Modern supply chains must balance high-speed logistics with environmental impact and security constraints, prompting a surge of interest in AI-enabled Internet of Things (AIoT) solutions for global commerce. However, conventional supply chain optimization models often overlook crucial sustainability goals and cyber vulnerabilities, leaving systems susceptible to both ecological harm and malicious attacks. To tackle these challenges simultaneously, this work integrates a quantum-inspired reinforcement learning framework that unifies carbon footprint reduction, inventory management, and cryptographic-like security measures. We design […]

technocracy

Knowledge Gradient for Preference Learning

digitado ⋅ January 29, 2026

The knowledge gradient is a popular acquisition function in Bayesian optimization (BO) for optimizing black-box objectives with noisy function evaluations. Many practical settings, however, allow only pairwise comparison queries, yielding a preferential BO problem where direct function evaluations are unavailable. Extending the knowledge gradient to preferential BO is hindered by its computational challenge. At its core, the look-ahead step in the preferential setting requires computing a non-Gaussian posterior, which was previously considered intractable. In this paper, we address […]

technocracy

Federate the Router: Learning Language Model Routers with Sparse and Decentralized Evaluations

digitado ⋅ January 29, 2026

Large language models (LLMs) are increasingly accessed as remotely hosted services by edge and enterprise clients that cannot run frontier models locally. Since models vary widely in capability and price, routing queries to models that balance quality and inference cost is essential. Existing router approaches assume access to centralized query-model evaluation data. However, these data are often fragmented across clients, such as end users and organizations, and are privacy-sensitive, which makes centralizing data infeasible. Additionally, per-client router training […]

technocracy

Gaussian Process Bandit Optimization with Machine Learning Predictions and Application to Hypothesis Generation

digitado ⋅ January 29, 2026

Many real-world optimization problems involve an expensive ground-truth oracle (e.g., human evaluation, physical experiments) and a cheap, low-fidelity prediction oracle (e.g., machine learning models, simulations). Meanwhile, abundant offline data (e.g., past experiments and predictions) are often available and can be used to pretrain powerful predictive models, as well as to provide an informative prior. We propose Prediction-Augmented Gaussian Process Upper Confidence Bound (PA-GP-UCB), a novel Bayesian optimization algorithm that leverages both oracles and offline data to achieve provable […]

technocracy

ZK-HybridFL: Zero-Knowledge Proof-Enhanced Hybrid Ledger for Federated Learning

digitado ⋅ January 29, 2026

Federated learning (FL) enables collaborative model training while preserving data privacy, yet both centralized and decentralized approaches face challenges in scalability, security, and update validation. We propose ZK-HybridFL, a secure decentralized FL framework that integrates a directed acyclic graph (DAG) ledger with dedicated sidechains and zero-knowledge proofs (ZKPs) for privacy-preserving model validation. The framework uses event-driven smart contracts and an oracle-assisted sidechain to verify local model updates without exposing sensitive data. A built-in challenge mechanism efficiently detects adversarial […]

technocracy

Google Project Genie lets you create interactive worlds from a photo or prompt

digitado ⋅ January 29, 2026

Last year, Google showed off Genie 3, an updated version of its AI world model with impressive long-term memory that allowed it to create interactive worlds from a simple text prompt. At the time, Google only provided Genie to a small group of trusted testers. Now, it’s available more widely as Project Genie, but only for those paying for Google’s most expensive AI subscription. World models are exactly what they sound like—an AI that generates a dynamic environment […]

technocracy

Learning Reward Functions for Cooperative Resilience in Multi-Agent Systems

digitado ⋅ January 29, 2026

Multi-agent systems often operate in dynamic and uncertain environments, where agents must not only pursue individual goals but also safeguard collective functionality. This challenge is especially acute in mixed-motive multi-agent systems. This work focuses on cooperative resilience, the ability of agents to anticipate, resist, recover, and transform in the face of disruptions, a critical yet underexplored property in Multi-Agent Reinforcement Learning. We study how reward function design influences resilience in mixed-motive settings and introduce a novel framework that […]

technocracy

Comcast keeps losing customers despite price guarantee and unlimited data

digitado ⋅ January 29, 2026

In April 2025, Comcast President Mike Cavanagh bemoaned that the company’s cable broadband division was “not winning in the marketplace” amid increased competition from fiber and fixed wireless Internet service providers. Cavanagh identified some problems that had been obvious to Comcast customers for many years: Its prices aren’t transparent enough and rise too frequently, and dealing with the company is too difficult. Comcast sought to fix the problems with a five-year price guarantee, one year of free Xfinity […]

technocracy

Task-Uniform Convergence and Backward Transfer in Federated Domain-Incremental Learning with Partial Participation

digitado ⋅ January 29, 2026

Real-world federated systems seldom operate on static data: input distributions drift while privacy rules forbid raw-data sharing. We study this setting as Federated Domain-Incremental Learning (FDIL), where (i) clients are heterogeneous, (ii) tasks arrive sequentially with shifting domains, yet (iii) the label space remains fixed. Two theoretical pillars remain missing for FDIL under realistic deployment: a guarantee of backward knowledge transfer (BKT) and a convergence rate that holds across the sequence of all tasks with partial participation. We […]
