digitado

Controlling Underestimation Bias in Constrained Reinforcement Learning for Safe Exploration

digitado ⋅ 17 January 2026

Constrained Reinforcement Learning (CRL) aims to maximize cumulative rewards while satisfying constraints. However, existing CRL algorithms often encounter significant constraint violations during training, limiting their applicability in safety-critical scenarios. In this paper, we identify the underestimation of the cost value function as a key factor contributing to these violations. To address this issue, we propose the Memory-driven Intrinsic Cost Estimation (MICE) method, which introduces intrinsic costs to mitigate underestimation and control bias to promote safer exploration. Inspired by […]

technocracy

Misinformation Exposure in the Chinese Web: A Cross-System Evaluation of Search Engines, LLMs, and AI Overviews

digitado ⋅ 27 February 2026

arXiv:2602.22221v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly integrated into search services, providing direct answers that can reduce users’ reliance on traditional result pages. Yet their factual reliability in non-English web ecosystems remains poorly understood, particularly when answering real user queries. We introduce a fact-checking dataset of 12,161 Chinese Yes/No questions derived from real-world online search logs and develop a unified evaluation pipeline to compare three information-access paradigms: traditional search engines, standalone LLMs, and AI-generated […]

technocracy

Real-time Multiattribute Bayesian Preference Elicitation with Pairwise Comparison Queries

digitado ⋅ 31 March 2010

Preference elicitation (PE) is an important component of interactive decision support systems that aim to make optimal recommendations to users by actively querying their preferences. In this paper, we outline five principles important for PE in real-world problems: (1) real-time, (2) multiattribute, (3) low cognitive load, (4) robust to noise, and (5) scalable. In light of these requirements, we introduce an approximate PE framework based on TrueSkill for performing efficient closed-form Bayesian updates and query selection for a […]

technocracy

Uncertainty quantification in model discovery by distilling interpretable material constitutive models from Gaussian process posteriors

digitado ⋅ 27 January 2026

arXiv:2510.22345v2 Announce Type: replace-cross Abstract: Constitutive model discovery refers to the task of identifying an appropriate model structure, usually from a predefined model library, while simultaneously inferring its material parameters. The data used for model discovery are measured in mechanical tests and are thus inevitably affected by noise which, in turn, induces uncertainties. Previously proposed methods for uncertainty quantification in model discovery either require the selection of a prior for the material parameters, are restricted to linear coefficients […]

technocracy

Boosting Maximum Entropy Reinforcement Learning via One-Step Flow Matching

digitado ⋅ 2 February 2026

Diffusion policies are expressive yet incur high inference latency. Flow Matching (FM) enables one-step generation, but integrating it into Maximum Entropy Reinforcement Learning (MaxEnt RL) is challenging: the optimal policy is an intractable energy-based distribution, and the efficient log-likelihood estimation required to balance exploration and exploitation suffers from severe discretization bias. We propose Flow-based Log-likelihood-Aware Maximum Entropy RL (FLAME), a principled framework that addresses these challenges. First, we derive a Q-Reweighted FM objective that bypasses partition function estimation […]

technocracy

Marketing & Storytelling: Generate Location-Aware Visual Content with Nano Banana Pro

digitado ⋅ 5 January 2026

Google’s Nano Banana Pro is a game-changer in light image generation, particularly for marketers and storytellers who need fast, highly contextual images. Location-aware visual content that depicts a place, its surroundings, and its cultural cues is set to be one of the strongest features of AI storytelling tools. This is where the Nano Banana image model gets its edge. It combines intelligent imaging with good architecture and gives the creator the power to create visuals […]

technocracy

Geopolitics, Geoeconomics and Risk: A Machine Learning Approach

digitado ⋅ 3 March 2026

arXiv:2510.12416v5 Announce Type: replace Abstract: This paper studies how geopolitical and geoeconomic shocks transmit to sovereign risk. Using a daily panel of CDS spreads, financial variables, and news-based indicators for 42 advanced and emerging economies over 2018-2025, we estimate nonlinear Machine Learning models that capture the interactions and threshold effects through which these shocks operate. A Shapley-Taylor decomposition exactly partitions predicted spreads into four channels: Direct, Global Financial Cycle, Uncertainty, and Local. The decomposition reveals a structural distinction. […]

technocracy

FedHB: Hierarchical Bayesian Federated Learning

digitado ⋅ 3 March 2026

arXiv:2305.04979v2 Announce Type: replace-cross Abstract: We propose a novel hierarchical Bayesian approach to Federated Learning (FL), where our model reasonably describes the generative process of clients’ local data via hierarchical Bayesian modeling: constituting random variables of local models for clients that are governed by a higher-level global variate. Interestingly, the variational inference in our Bayesian model leads to an optimisation problem whose block-coordinate descent solution becomes a distributed algorithm that is separable over clients and allows them not […]

technocracy

Learning Multi-type heterogeneous interacting particle systems

digitado ⋅ 5 February 2026

arXiv:2602.03954v1 Announce Type: new Abstract: We propose a framework for the joint inference of network topology, multi-type interaction kernels, and latent type assignments in heterogeneous interacting particle systems from multi-trajectory data. This learning task is a challenging non-convex mixed-integer optimization problem, which we address through a novel three-stage approach. First, we leverage shared structure across agent interactions to recover a low-rank embedding of the system parameters via matrix sensing. Second, we identify discrete interaction types by clustering within […]

technocracy

Morphological Addressing of Identity Basins in Text-to-Image Diffusion Models

digitado ⋅ 24 February 2026

arXiv:2602.18533v1 Announce Type: new Abstract: We demonstrate that morphological pressure creates navigable gradients at multiple levels of the text-to-image generative pipeline. In Study 1, identity basins in Stable Diffusion 1.5 can be navigated using morphological descriptors (constituent features like “platinum blonde,” “beauty mark,” and “1950s glamour”) without the target’s name or photographs. A self-distillation loop (generating synthetic images from descriptor prompts, then training a LoRA on those outputs) achieves consistent convergence toward a specific identity as measured […]
