technocracy

Non-Markov Multi-Round Conversational Image Generation with History-Conditioned MLLMs

digitado ⋅ 30 January 2026

arXiv:2601.20911v1 Announce Type: new Abstract: Conversational image generation requires a model to follow user instructions across multiple rounds of interaction, grounded in interleaved text and images that accumulate as chat history. While recent multimodal large language models (MLLMs) can generate and edit images, most existing multi-turn benchmarks and training recipes are effectively Markov: the next output depends primarily on the most recent image, enabling shortcut solutions that ignore long-range history. In this work we formalize and target the […]

Calibrated Bayesian Deep Learning for Explainable Decision Support Systems Based on Medical Imaging

digitado ⋅ 12 February 2026

In critical decision support systems based on medical imaging, the reliability of AI-assisted decision-making is as relevant as predictive accuracy. Although deep learning models have demonstrated significant accuracy, they frequently suffer from miscalibration, manifested as overconfidence in erroneous predictions. To facilitate clinical acceptance, it is imperative that models quantify uncertainty in a manner that correlates with prediction correctness, allowing clinicians to identify unreliable outputs for further review. In order to address this necessity, the present paper proposes a […]

Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models

digitado ⋅ 15 January 2026

arXiv:2506.17139v3 Announce Type: replace-cross Abstract: In recent years, diffusion models trained on equilibrium molecular distributions have proven effective for sampling biomolecules. Beyond direct sampling, the score of such a model can also be used to derive the forces that act on molecular systems. However, while classical diffusion sampling usually recovers the training distribution, the corresponding energy-based interpretation of the learned score is often inconsistent with this distribution, even for low-dimensional toy systems. We trace this inconsistency to inaccuracies […]

The Critical Horizon: Inspection Design Principles for Multi-Stage Operations and Deep Reasoning

digitado ⋅ 16 February 2026

arXiv:2602.09394v2 Announce Type: replace Abstract: Manufacturing lines, service journeys, supply chains, and AI reasoning chains share a common challenge: attributing a terminal outcome to the intermediate stage that caused it. We establish an information-theoretic barrier to this credit assignment problem: the signal connecting early steps to final outcomes decays exponentially with depth, creating a critical horizon beyond which reliable learning from endpoint data alone requires exponentially many samples. We prove four results. First, a Signal Decay Bound: sample […]

Evolving Demonstration Optimization for Chain-of-Thought Feature Transformation

digitado ⋅ 12 March 2026

arXiv:2603.09987v1 Announce Type: new Abstract: Feature Transformation (FT) is a core data-centric AI task that improves feature space quality to advance downstream predictive performance. However, discovering effective transformations remains challenging due to the large space of feature-operator combinations. Existing solutions rely on discrete search or latent generation, but they are frequently limited by sample inefficiency, invalid candidates, and redundant generations with limited coverage. Large Language Models (LLMs) offer strong priors for producing valid transformations, but current LLM-based FT […]

Unified Inference Framework for Single and Multi-Player Performative Prediction: Method and Asymptotic Optimality

digitado ⋅ 4 February 2026

arXiv:2602.03049v1 Announce Type: new Abstract: Performative prediction characterizes environments where predictive models alter the very data distributions they aim to forecast, triggering complex feedback loops. While prior research treats single-agent and multi-agent performativity as distinct phenomena, this paper introduces a unified statistical inference framework that bridges these contexts, treating the former as a special case of the latter. Our contribution is two-fold. First, we put forward the Repeated Risk Minimization (RRM) procedure for estimating the performative stability, and […]

I can’t believe text normalization is so underdiscussed in streaming text-to-speech [D]

digitado ⋅ 22 April 2026

Kinda surprises me how little discussion there is about mistakes in streaming TTS models. People look for natural readers, high voice quality, expressive speech. And most models don’t look dumb or fail there. They fail when you give them basic stuff like prices, dates, URLs, promo codes, phone numbers. So I was looking for some info and found a benchmark that compares commercial real-time streaming TTS models in terms of how they pronounce dates, URLs, acronyms, […]

Machines of Faithful Obedience

digitado ⋅ 25 June 2025

[Crossposted on LessWrong] Throughout history, technological and scientific advances have had both good and ill effects, but their overall impact has been overwhelmingly positive. Thanks to scientific progress, most people on earth live longer, healthier, and better than they did centuries or even decades ago. I believe that AI (including AGI and ASI) can do the same and be a positive force for humanity. I also believe that it is possible to solve the “technical alignment” problem and […]

An Improved Privacy and Utility Analysis of Differentially Private SGD with Bounded Domain and Smooth Losses

digitado ⋅ 16 January 2026

arXiv:2502.17772v4 Announce Type: replace-cross Abstract: Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to protect sensitive data during the training of machine learning models, but its privacy guarantee often comes at a large cost of model performance due to the lack of tight theoretical bounds quantifying privacy loss. While recent efforts have achieved more accurate privacy guarantees, they still impose some assumptions prohibited from practical applications, such as convexity and complex parameter requirements, and rarely investigate in-depth […]

SRasP: Self-Reorientation Adversarial Style Perturbation for Cross-Domain Few-Shot Learning

digitado ⋅ 5 March 2026

Cross-Domain Few-Shot Learning (CD-FSL) aims to transfer knowledge from a seen source domain to unseen target domains, serving as a key benchmark for evaluating the robustness and transferability of models. Existing style-based perturbation methods mitigate domain shift but often suffer from gradient instability and convergence to sharp minima. To address these limitations, we propose a novel crop-global style perturbation network, termed Self-Reorientation Adversarial Style Perturbation (SRasP). Specifically, SRasP leverages global semantic guidance to identify incoherent crops, followed by reorienting […]
