March 2026

Collision-Aware Vision-Language Learning for End-to-End Driving with Multimodal Infraction Datasets

digitado ⋅ March 30, 2026

arXiv:2603.25946v1 Announce Type: new Abstract: High infraction rates remain the primary bottleneck for end-to-end (E2E) autonomous driving, as evidenced by the low driving scores on the CARLA Leaderboard. Despite collision-related infractions being the dominant failure mode in closed-loop evaluations, collision-aware representation learning has received limited attention. To address this gap, we first develop a Video-Language-Augmented Anomaly Detector (VLAAD), leveraging a Multiple Instance Learning (MIL) formulation to obtain stable, temporally localized collision signals for proactive prediction. To transition these […]

technocracy

Can Small Models Reason About Legal Documents? A Comparative Study

digitado ⋅ March 30, 2026

arXiv:2603.25944v1 Announce Type: new Abstract: Large language models show promise for legal applications, but deploying frontier models raises concerns about cost, latency, and data privacy. We evaluate whether sub-10B parameter models can serve as practical alternatives by testing nine models across three legal benchmarks (ContractNLI, CaseHOLD, and ECtHR) using five prompting strategies (direct, chain-of-thought, few-shot, BM25 RAG, and dense RAG). Across 405 experiments with three random seeds per configuration, we find that a Mixture-of-Experts model activating only 3B […]

Enormous Fluid Antenna Systems (E-FAS) under Correlated Surface-Wave Leakage: Physical Layer Security

digitado ⋅ March 30, 2026

arXiv:2603.25943v1 Announce Type: new Abstract: Enormous fluid antenna systems (E-FAS) have recently emerged as a surface-wave (SW)-enabled architecture that can induce controllable large-scale channel gains through guided electromagnetic routing. This paper develops a secrecy analysis framework for E-FAS-assisted downlink transmission with practical pilot-based channel estimation. We consider a multiple-input single-output (MISO) wiretap setting in which the base station (BS) performs minimum mean-square-error (MMSE) channel estimation and adopts maximum-ratio transmission (MRT) with artificial noise (AN). To capture the leakage […]

Reinforcing Structured Chain-of-Thought for Video Understanding

digitado ⋅ March 30, 2026

arXiv:2603.25942v1 Announce Type: new Abstract: Multi-modal Large Language Models (MLLMs) show promise in video understanding. However, their reasoning often suffers from thinking drift and weak temporal comprehension, even when enhanced by Reinforcement Learning (RL) techniques like Group Relative Policy Optimization (GRPO). Moreover, existing RL methods usually depend on Supervised Fine-Tuning (SFT), which requires costly Chain-of-Thought (CoT) annotation and multi-stage training, and enforces fixed reasoning paths, limiting MLLMs’ ability to generalize and potentially inducing bias. To overcome these limitations, […]

Towards a new PGD strategy for the simulation of slender structures

digitado ⋅ March 30, 2026

arXiv:2603.25940v1 Announce Type: new Abstract: Effective models for slender structures derived from well-known plate (or shell) theories are justified within the limit of a small thickness, and may therefore prove limited for intermediate slenderness. On the other hand, direct 3D simulation of such structures is sub-optimal because it does not take advantage of the presence of small dimensions in some directions and is sometimes too costly and ill-conditioned. In this context, the Proper Generalized Decomposition (PGD) method, a […]

Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned

digitado ⋅ March 30, 2026

arXiv:2603.25937v1 Announce Type: new Abstract: Visual Navigation Models (VNMs) promise generalizable robot navigation by learning from large-scale visual demonstrations. Despite growing real-world deployment, existing evaluations rely almost exclusively on success rate (whether the robot reaches its goal), which conceals trajectory quality, collision behavior, and robustness to environmental change. We present a real-world evaluation of five state-of-the-art VNMs (GNM, ViNT, NoMaD, NaviBridger, and CrossFormer) across two robot platforms and five environments spanning indoor and outdoor settings. Beyond success rate, […]

DenseSwinV2: Channel Attentive Dual Branch CNN Transformer Learning for Cassava Leaf Disease Classification

digitado ⋅ March 30, 2026

arXiv:2603.25935v1 Announce Type: new Abstract: This work presents a new Hybrid Dense SwinV2, a two-branch framework that jointly leverages densely connected convolutional features and hierarchical customized Swin Transformer V2 (SwinV2) representations for cassava disease classification. The proposed framework captures high-resolution local features through its DenseNet branch, preserving fine structural cues while allowing effective gradient flow. Concurrently, the customized SwinV2 models global contextual dependencies through shifted-window self-attention, which enables the capture […]

Explore LLM-enabled Tools to Facilitate Imaginal Exposure Exercises for Social Anxiety

digitado ⋅ March 30, 2026

arXiv:2603.25933v1 Announce Type: new Abstract: Social anxiety (SA) is a prevalent mental health challenge that significantly impacts daily social interactions. Imaginal Exposure (IE), a Cognitive Behavioral Therapy (CBT) technique involving imagined anxiety-provoking scenarios, is effective but underutilized, in part because traditional IE homework requires clients to construct and sustain clinically relevant fear narratives. In this work, we explore the feasibility of an LLM-enabled tool that supports IE by generating vivid, personalized exposure scripts. We first co-designed ImaginalExpoBot with […]

To Use or Not to Use: Investigating Student Perceptions of Faculty Generative AI Usage in Higher Education

digitado ⋅ March 30, 2026

arXiv:2603.25932v1 Announce Type: new Abstract: While Generative AI (GenAI) has been rapidly integrated into higher education, existing research has primarily focused on regulating student use. As a result, student perspectives on faculty adoption of GenAI remain unexplored. In this study, we analyzed survey responses from 156 undergraduate and graduate students to examine their attitudes toward both student and faculty use of GenAI. We classified students into four groups based on their attitudes, including GenAI Optimists, Student Support Group, Faculty Support […]

DiReCT: Disentangled Regularization of Contrastive Trajectories for Physics-Refined Video Generation

digitado ⋅ March 30, 2026

arXiv:2603.25931v1 Announce Type: new Abstract: Flow-matching video generators produce temporally coherent, high-fidelity outputs yet routinely violate elementary physics because their reconstruction objectives penalize per-frame deviations without distinguishing physically consistent dynamics from impossible ones. Contrastive flow matching offers a principled remedy by pushing apart velocity-field trajectories of differing conditions, but we identify a fundamental obstacle in the text-conditioned video setting: semantic-physics entanglement. Because natural-language prompts couple scene content with physical behavior, naive negative sampling draws conditions whose velocity fields […]
