digitado

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind

digitado ⋅ April 13, 2026

As large language models (LLMs) become the engine behind conversational systems, their ability to reason about the intentions and states of their dialogue partners (i.e., form and use a theory-of-mind, or ToM) becomes increasingly critical for safe interaction with potentially adversarial partners. We propose a novel privacy-themed ToM challenge, ToM for Steering Beliefs (ToM-SB), in which a defender must act as a Double Agent to steer the beliefs of an attacker with partial prior knowledge within a shared […]

technocracy

Learnability with Partial Labels and Adaptive Nearest Neighbors

digitado ⋅ March 23, 2026

arXiv:2603.15781v2. Abstract: Prior work on partial-label learning (PLL) has shown that learning is possible even when each instance is associated with a bag of labels rather than a single accurate but costly label. However, the necessary conditions for learning with partial labels remain unclear, and existing PLL methods are effective only in specific scenarios. In this work, we mathematically characterize the settings in which PLL is feasible. In addition, we present PL A-$k$NN, an […]

technocracy

BindCLIP: A Unified Contrastive-Generative Representation Learning Framework for Virtual Screening

digitado ⋅ February 16, 2026

Virtual screening aims to efficiently identify active ligands from massive chemical libraries for a given target pocket. Recent CLIP-style models such as DrugCLIP enable scalable virtual screening by embedding pockets and ligands into a shared space. However, our analyses indicate that such representations can be insensitive to fine-grained binding interactions and may rely on shortcut correlations in training data, limiting their ability to rank ligands by true binding compatibility. To address these issues, we propose BindCLIP, a unified […]

technocracy

The Transatlantic Divide: When Platforms Become Politics

digitado ⋅ January 30, 2026

In the previous articles in this series, we looked at digital trust as a progression: we examined trust as a social mechanism, analysed governance as its point of failure, and walked through legitimacy as the condition that determines whether systems endure. This fourth piece extends that shared logic outward, beyond platforms themselves, into geopolitics.

A Regulatory Fight That Isn’t Really About Regulation

What looks like a trade dispute over digital platforms is, at a deeper level, a clash […]

technocracy

The PokeAgent Challenge: Competitive and Long-Context Learning at Scale

digitado ⋅ March 16, 2026

We present the PokeAgent Challenge, a large-scale benchmark for decision-making research built on Pokemon’s multi-agent battle system and expansive role-playing game (RPG) environment. Partial observability, game-theoretic reasoning, and long-horizon planning remain open problems for frontier AI, yet few benchmarks stress all three simultaneously under realistic conditions. PokeAgent targets these limitations at scale through two complementary tracks: our Battling Track, which calls for strategic reasoning and generalization under partial observability in competitive Pokemon battles, and our Speedrunning Track, which […]

technocracy

SpaceX finally files for IPO, targets $1.75 trillion valuation

digitado ⋅ April 1, 2026

Elon Musk’s rocket company SpaceX has confidentially filed to go public, firing the starting gun on what is expected to be the biggest initial public offering in history. The Texas-headquartered company filed paperwork with the Securities and Exchange Commission this week for the listing, according to two people familiar with the matter. Confidential filings allow companies to advance their listing plans without publicly revealing their financials. SpaceX last month acquired Musk’s loss-making AI startup xAI for $250 billion. […]

technocracy

FalseReject: Reducing overcautiousness in LLMs through reasoning-aware safety evaluation

digitado ⋅ July 18, 2025

Novel graph-based, adversarial, agentic method for generating training examples helps identify and mitigate “overrefusal”. Authors: Zhehao Zhang, Weijie Xu. Large language models (LLMs) have come a long way in enforcing responsible-AI standards through robust safety mechanisms. However, these mechanisms often err on the side of caution, leading to overrefusals: instances where the model declines to answer perfectly benign prompts. This […]

technocracy

My 5 Biggest Surprises as an AI Developer

digitado ⋅ January 14, 2026

I recently started doing software development work to expand and integrate AI systems. Here are some of the biggest surprises I found along the way.

AI Developers Love Calling Everything “Fine Tuning”

When studying for the AI-102 exam, I thought fine tuning strictly referred to additional training of a model to include additional data (which can be fed into the system in different ways). When actually working on an AI project, what I learned was developers (and architects) sprinkle […]

technocracy

Learning Distributed Equilibria in Linear-Quadratic Stochastic Differential Games: An $α$-Potential Approach

digitado ⋅ February 18, 2026

We analyze independent policy-gradient (PG) learning in $N$-player linear-quadratic (LQ) stochastic differential games. Each player employs a distributed policy that depends only on its own state and updates the policy independently using the gradient of its own objective. We establish global linear convergence of these methods to an equilibrium by showing that the LQ game admits an $α$-potential structure, with $α$ determined by the degree of pairwise interaction asymmetry. For pairwise-symmetric interactions, we construct an affine distributed equilibrium […]

technocracy

User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction

digitado ⋅ March 24, 2026

arXiv:2603.20939v1. Abstract: Large language models are increasingly used as personal assistants, yet most lack a persistent user model, forcing users to repeatedly restate preferences across sessions. We propose Vector-Adapted Retrieval Scoring (VARS), a pipeline-agnostic, frozen-backbone framework that represents each user with long-term and short-term vectors in a shared preference space and uses these vectors to bias retrieval scoring over structured preference memory. The vectors are updated online from weak scalar rewards from users’ feedback, enabling […]
