digitado

Breaking AR’s Sampling Bottleneck: Provable Acceleration via Diffusion Language Models

digitado ⋅ 9 January 2026

arXiv:2505.21400v2 Announce Type: replace-cross Abstract: Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially, diffusion models allow for parallel sampling, offering a promising path to accelerate generation and eliminate the left-to-right generation constraints. Despite their empirical success, the theoretical understanding of diffusion language models remains underdeveloped. In this work, we develop convergence guarantees for diffusion language models from an […]

technocracy

TR-49 is interactive fiction for fans of deep research rabbit holes

digitado ⋅ 23 January 2026

If you’ve ever fallen down a Wikipedia rabbit hole or spent a pleasant evening digging through college library stacks, you know the joy of a good research puzzle. Every new source and cross-reference you find unlocks an incremental understanding of a previously unknown world, forming a piecemeal tapestry of knowledge that you can eventually look back at as a cohesive and well-known whole. TR-49 takes this research process and operationalizes it into an engrossing and novel piece of […]

Situation Graph Prediction: Structured Perspective Inference for User Modeling

digitado ⋅ 17 February 2026

arXiv:2602.13319v1 Announce Type: new Abstract: Perspective-Aware AI requires modeling evolving internal states (goals, emotions, contexts), not merely preferences. Progress is limited by a data bottleneck: digital footprints are privacy-sensitive and perspective states are rarely labeled. We propose Situation Graph Prediction (SGP), a task that frames perspective modeling as an inverse inference problem: reconstructing structured, ontology-aligned representations of perspective from observable multimodal artifacts. To enable grounding without real labels, we use a structure-first synthetic generation strategy that aligns latent labels and […]

Latent-Space Contrastive Reinforcement Learning for Stable and Efficient LLM Reasoning

digitado ⋅ 24 January 2026

While Large Language Models (LLMs) demonstrate exceptional performance in surface-level text generation, their nature in handling complex multi-step reasoning tasks often remains one of “statistical fitting” rather than systematic logical deduction. Traditional Reinforcement Learning (RL) attempts to mitigate this by introducing a “think-before-speak” paradigm. However, applying RL directly in high-dimensional, discrete token spaces faces three inherent challenges: sample-inefficient rollouts, high gradient estimation variance, and the risk of catastrophic forgetting. To fundamentally address these structural bottlenecks, we propose DeepLatent […]

Best-of-$\infty$ — Asymptotic Performance of Test-Time LLM Ensembling

digitado ⋅ 5 March 2026

arXiv:2509.21091v3 Announce Type: replace Abstract: We study best-of-$N$ for large language models (LLMs) where the selection is based on majority voting. In particular, we analyze the limit $N \to \infty$, which we denote as best-of-$\infty$. While this approach achieves impressive performance in the limit, it requires an infinite test-time budget. To address this, we propose an adaptive generation scheme that selects $N$ based on answer agreement, thereby efficiently allocating inference-time computation. Beyond adaptivity, we extend the framework to […]

Generative AI for Networking

digitado ⋅ 7 January 2026

arXiv:2601.02389v1 Announce Type: new Abstract: Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) are revolutionizing network management systems, paving the way towards fully autonomous and self-optimizing communication systems. These models enable networks to address complex decision-making tasks across both short-term operational scenarios and long-term strategic planning. Through natural language understanding, LLMs can analyze customer inquiries, predict network congestion patterns, and automate troubleshooting processes, leading to more efficient customer support and network maintenance. GenAI can optimize content delivery […]

Amazon launches an AI shopping assistant for the search bar, powered by Alexa+

digitado ⋅ 13 May 2026

Alexa for Shopping is a new personalized AI shopping assistant in the Amazon search bar that replaces its Rufus assistant.

Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety

digitado ⋅ 12 March 2026

arXiv:2603.10044v1 Announce Type: new Abstract: Safety benchmarks evaluate language models in isolation, typically using multiple-choice format; production deployments wrap these models in agentic scaffolds that restructure inputs through reasoning traces, critic agents, and delegation pipelines. We report one of the largest controlled studies of scaffold effects on safety (N = 62,808; six frontier models, four deployment configurations), combining pre-registration, assessor blinding, equivalence testing, and specification curve analysis. Map-reduce scaffolding degrades measured safety (NNH = 14), yet two of […]

DrugPlayGround: Benchmarking Large Language Models and Embeddings for Drug Discovery

digitado ⋅ 7 April 2026

arXiv:2604.02346v1 Announce Type: new Abstract: Large language models (LLMs) are in the ascendancy for research in drug discovery, offering unprecedented opportunities to reshape drug research by accelerating hypothesis generation, optimizing candidate prioritization, and enabling more scalable and cost-effective drug discovery pipelines. However, there is currently a lack of objective assessments of LLM performance to ascertain their advantages and limitations over traditional drug discovery platforms. To tackle this emergent problem, we have developed DrugPlayGround, a framework to evaluate and […]

MerkleSpeech: Public-Key Verifiable, Chunk-Localised Speech Provenance via Perceptual Fingerprints and Merkle Commitments

digitado ⋅ 12 February 2026

arXiv:2602.10166v1 Announce Type: new Abstract: Speech provenance goes beyond detecting whether a watermark is present. Real workflows involve splicing, quoting, trimming, and platform-level transforms that may preserve some regions while altering others. Neural watermarking systems have made strides in robustness and localised detection, but most deployments produce outputs with no third-party verifiable cryptographic proof tying a time segment to an issuer-signed original. Provenance standards like C2PA adopt signed manifests and Merkle-based fragment validation, yet their bindings target encoded […]
