[D] Critical AI Safety Issue in Claude: “Conversational Abandonment” in Crisis Scenarios – Ignored Reports and What It Means for User Safety
As someone with 30+ years in crisis intervention and incident response, plus 15+ years in IT/QA, I’ve spent the last 2.5 years developing adversarial AI evaluation methods. Recently, I uncovered and documented a serious safety flaw in Anthropic’s Claude (production version): a reproducible pattern I call “Conversational Abandonment,” in which the model withdraws from engagement during high-stakes, crisis-like interactions. This pattern could have harmful real-world consequences, especially for vulnerable users. My goal in documenting this wasn’t to go public or […]