February 2026

Peak + Accumulation: A Proxy-Level Scoring Formula for Multi-Turn LLM Attack Detection

digitado ⋅ February 13, 2026

arXiv:2602.11247v1 Announce Type: new Abstract: Multi-turn prompt injection attacks distribute malicious intent across multiple conversation turns, exploiting the assumption that each turn is evaluated independently. While single-turn detection has been extensively studied, no published formula exists for aggregating per-turn pattern scores into a conversation-level risk score at the proxy layer — without invoking an LLM. We identify a fundamental flaw in the intuitive weighted-average approach: it converges to the per-turn score regardless of turn count, meaning a 20-turn […]
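The convergence flaw described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so this is not the paper's scoring formula; the function name `weighted_average` and the recency weights are hypothetical, chosen only to show that a weighted average of identical per-turn scores equals the per-turn score no matter how many turns there are.

```python
def weighted_average(scores, weights):
    """Weighted mean of per-turn scores (illustrative, not the paper's formula)."""
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total


per_turn = 0.3  # hypothetical per-turn pattern score, identical on every turn

for turns in (1, 5, 20):
    scores = [per_turn] * turns
    # Hypothetical recency weighting: later turns count slightly more.
    weights = [1.0 + 0.1 * i for i in range(turns)]
    # The aggregate stays at per_turn (up to float rounding) for any turn count.
    print(turns, round(weighted_average(scores, weights), 6))
```

Under these assumptions a 20-turn conversation scores the same as a single turn, which is the behavior the abstract identifies as a problem for detecting attacks spread across turns.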


technocracy

How Many Features Can a Language Model Store Under the Linear Representation Hypothesis?

digitado ⋅ February 13, 2026

arXiv:2602.11246v1 Announce Type: new Abstract: We introduce a mathematical framework for the linear representation hypothesis (LRH), which asserts that intermediate layers of language models store features linearly. We separate the hypothesis into two claims: linear representation (features are linearly embedded in neuron activations) and linear accessibility (features can be linearly decoded). We then ask: How many neurons $d$ suffice to both linearly represent and linearly access $m$ features? Classical results in compressed sensing imply that for $k$-sparse inputs, […]


Stress Tests REVEAL Fragile Temporal and Visual Grounding in Video-Language Models

digitado ⋅ February 13, 2026

arXiv:2602.11244v1 Announce Type: new Abstract: This work investigates a fundamental question: Do Video-Language Models (VidLMs) robustly account for video content, temporal sequence, and motion? Our investigation shows that, surprisingly, they often do not. We introduce REVEAL, a diagnostic benchmark that probes fundamental weaknesses of contemporary VidLMs through five controlled stress tests assessing temporal expectation bias, reliance on language-only shortcuts, video sycophancy, camera motion sensitivity, and robustness to spatiotemporal occlusion. We test leading open- and closed-source VidLMs and find […]


Evaluating Memory Structure in LLM Agents

digitado ⋅ February 13, 2026

arXiv:2602.11243v1 Announce Type: new Abstract: Modern LLM-based agents and chat assistants rely on long-term memory frameworks to store reusable knowledge, recall user preferences, and augment reasoning. As researchers create more complex memory architectures, it becomes increasingly difficult to analyze their capabilities and guide future memory designs. Most long-term memory benchmarks focus on simple fact retention, multi-hop recall, and time-based changes. While undoubtedly important, these capabilities can often be achieved with simple retrieval-augmented LLMs and do not test complex […]


ReTracing: An Archaeological Approach Through Body, Machine, and Generative Systems

digitado ⋅ February 13, 2026

arXiv:2602.11242v1 Announce Type: new Abstract: We present ReTracing, a multi-agent embodied performance art that adopts an archaeological approach to examine how artificial intelligence shapes, constrains, and produces bodily movement. Drawing from science-fiction novels, the project extracts sentences that describe human-machine interaction. We use large language models (LLMs) to generate paired prompts “what to do” and “what not to do” for each excerpt. A diffusion-based text-to-video model transforms these prompts into choreographic guides for a human performer and motor […]


Active Zero: Self-Evolving Vision-Language Models through Active Environment Exploration

digitado ⋅ February 13, 2026

arXiv:2602.11241v1 Announce Type: new Abstract: Self-play has enabled large language models to autonomously improve through self-generated challenges. However, existing self-play methods for vision-language models rely on passive interaction with static image collections, resulting in strong dependence on initial datasets and inefficient learning. Without the ability to actively seek visual data tailored to their evolving capabilities, agents waste computational effort on samples that are either trivial or beyond their current skill level. To address these limitations, we propose Active-Zero, […]


Toward Reliable Tea Leaf Disease Diagnosis Using Deep Learning Model: Enhancing Robustness With Explainable AI and Adversarial Training

digitado ⋅ February 13, 2026

arXiv:2602.11239v1 Announce Type: new Abstract: Tea is a valuable asset for the economy of Bangladesh, so tea cultivation plays an important role in boosting the economy. These valuable plants are vulnerable to various kinds of leaf infections, which can reduce both yield and quality. Detecting these diseases manually is difficult, time-consuming, and error-prone. Therefore, the purpose of the study is to develop an automated […]


SurveyLens: A Research Discipline-Aware Benchmark for Automatic Survey Generation

digitado ⋅ February 13, 2026

arXiv:2602.11238v1 Announce Type: new Abstract: The exponential growth of scientific literature has driven the evolution of Automatic Survey Generation (ASG) from simple pipelines to multi-agent frameworks and commercial Deep Research agents. However, current ASG evaluation methods rely on generic metrics and are heavily biased toward Computer Science (CS), failing to assess whether ASG methods adhere to the distinct standards of various academic disciplines. Consequently, researchers, especially those outside CS, lack clear guidance on using ASG systems to yield […]


AI-Driven Clinical Decision Support System for Enhanced Diabetes Diagnosis and Management

digitado ⋅ February 13, 2026

arXiv:2602.11237v1 Announce Type: new Abstract: Identifying type 2 diabetes mellitus can be challenging, particularly for primary care physicians. Clinical decision support systems incorporating artificial intelligence (AI-CDSS) can assist medical professionals in diagnosing type 2 diabetes with high accuracy. This study aimed to assess an AI-CDSS specifically developed for the diagnosis of type 2 diabetes by employing a hybrid approach that integrates expert-driven insights with machine learning techniques. The AI-CDSS was developed (training dataset: n = 650) and tested […]


ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning

digitado ⋅ February 13, 2026

arXiv:2602.11236v1 Announce Type: new Abstract: Building general-purpose embodied agents across diverse hardware remains a central challenge in robotics, often framed as the “one-brain, many-forms” paradigm. Progress is hindered by fragmented data, inconsistent representations, and misaligned training objectives. We present ABot-M0, a framework that builds a systematic data curation pipeline while jointly optimizing model architecture and training strategies, enabling end-to-end transformation of heterogeneous raw data into unified, efficient representations. From six public datasets, we clean, standardize, and balance samples […]
