digitado

Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains

digitado ⋅ 17 February 2026

arXiv:2602.13235v1 Announce Type: new Abstract: Visual Retrieval-Augmented Generation (VRAG) enhances Vision-Language Models (VLMs) by incorporating external visual documents to address a given query. Existing VRAG frameworks usually depend on rigid, pre-defined external tools to extend the perceptual capabilities of VLMs, typically by explicitly separating visual perception from subsequent reasoning processes. However, this decoupled design can lead to unnecessary loss of visual information, particularly when image-based operations such as cropping are applied. In this paper, we propose Lang2Act, which […]

technocracy

Intersecting spheres and GPS

digitado ⋅ 14 April 2026

If you know the distance d to a satellite, you can compute a circle of points that passes through your location. That’s because you’re at the intersection of two spheres—the earth’s surface and a sphere of radius d centered on the satellite—and the intersection of two spheres is a circle. Said another way, one observation of a satellite determines a circle of possible locations. If you know the distance to a second satellite as well, then you can […]
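The geometric claim above, that two intersecting spheres meet in a circle, can be sketched numerically. The following is a minimal illustration, not code from the original post: the function name `sphere_intersection_circle`, the spherical-earth model, and the sample numbers are all assumptions chosen for the example.

```python
import math

def sphere_intersection_circle(c1, r1, c2, r2):
    """Return (center, radius, unit_normal) of the circle where two spheres
    intersect, or None if they do not meet in a proper circle
    (disjoint, nested, concentric, or merely tangent)."""
    diff = [b - a for a, b in zip(c1, c2)]
    d = math.sqrt(sum(x * x for x in diff))  # distance between the centers
    if d == 0 or d >= r1 + r2 or d <= abs(r1 - r2):
        return None
    normal = [x / d for x in diff]  # unit vector from c1 toward c2
    # Signed distance from c1 to the plane that contains the circle.
    a = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    radius = math.sqrt(max(0.0, r1 * r1 - a * a))
    center = [p + a * n for p, n in zip(c1, normal)]
    return center, radius, normal

# Hypothetical numbers: a spherical earth of radius 6371 km centered at the
# origin, a satellite 26571 km from the earth's center, and a measured
# distance of 21000 km from the receiver to the satellite.
earth_center, earth_radius = (0.0, 0.0, 0.0), 6371.0
satellite, measured_distance = (0.0, 0.0, 26571.0), 21000.0
print(sphere_intersection_circle(earth_center, earth_radius,
                                 satellite, measured_distance))
```

Every point on the returned circle lies on the earth's surface and at the measured distance from the satellite, which is the "circle of possible locations" described in the text.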

technocracy

PCGamer Article Performance Audit

digitado ⋅ 22 March 2026

Research: PCGamer Article Performance Audit. Stuart Breckenridge pointed out that "PC Gamer Recommends RSS Readers in a 37MB Article That Just Keeps Downloading", highlighting a truly horrifying example of web bloat that added up to 100s more MBs thanks to auto-playing video ads. I decided to have Claude Code for web use Rodney to investigate the page. Tags: web-performance, rodney

technocracy

CrossHGL: A Text-Free Foundation Model for Cross-Domain Heterogeneous Graph Learning

digitado ⋅ 29 March 2026

Heterogeneous graph representation learning (HGRL) is essential for modeling complex systems with diverse node and edge types. However, most existing methods are limited to closed-world settings with shared schemas and feature spaces, hindering cross-domain generalization. While recent graph foundation models improve transferability, they often target homogeneous graphs, rely on domain-specific schemas, or require rich textual attributes. Consequently, text-free and few-shot cross-domain HGRL remains underexplored. To address this, we propose CrossHGL, a foundation framework that preserves and transfers multi-relational […]

technocracy

A Near-Raw Talking-Head Video Dataset for Various Computer Vision Tasks

digitado ⋅ 31 March 2026

arXiv:2603.26763v1 Announce Type: new Abstract: Talking-head videos constitute a predominant content type in real-time communication, yet publicly available datasets for video processing research in this domain remain scarce and limited in signal fidelity. In this paper, we open-source a near-raw dataset of 847 talking-head recordings (approximately 212 minutes), each 15 s in duration, captured from 805 participants using 446 unique consumer webcam devices in their natural environments. All recordings are stored using the FFV1 lossless codec, preserving the camera-native […]

technocracy

SLAP: Slapband-based Autonomous Perching Drone with Failure Recovery for Vertical Tree Trunks

digitado ⋅ 6 January 2026

arXiv:2601.00238v1 Announce Type: new Abstract: Perching allows unmanned aerial vehicles (UAVs) to reduce energy consumption, remain anchored for surface sampling operations, or stably survey their surroundings. Previous efforts for perching on vertical surfaces have predominantly focused on lightweight mechanical design solutions with relatively scant system-level integration. Furthermore, perching strategies for vertical surfaces commonly require high-speed, aggressive landing operations that are dangerous for a surveyor drone with sensitive electronics onboard. This work presents the preliminary investigation of a perching […]

technocracy

Towards Verified and Targeted Explanations through Formal Methods

digitado ⋅ 18 April 2026

arXiv:2604.14209v1 Announce Type: new Abstract: As deep neural networks are deployed in safety-critical domains such as autonomous driving and medical diagnosis, stakeholders need explanations that are interpretable but also trustworthy with formal guarantees. Existing XAI methods fall short: heuristic attribution techniques (e.g., LIME, Integrated Gradients) highlight influential features but offer no mathematical guarantees about decision boundaries, while formal methods verify robustness yet remain untargeted, analyzing the nearest boundary regardless of whether it represents a critical risk. In safety-critical […]

technocracy

Human-Centered Ambient and Wearable Sensing for Automated Monitoring in Dementia Care: A Scoping Review

digitado ⋅ 9 March 2026

arXiv:2603.05516v1 Announce Type: new Abstract: We conducted a scoping review to map the rapidly evolving landscape of wearable and ambient sensing technologies for monitoring people with dementia across home and institutional settings. We analyzed empirical sensing studies (2015-2025) to identify and inform future technical and human-centered design requirements. Five key implementation principles emerge: (1) human-centered design involving all stakeholders to augment rather than replace caregivers; (2) personalized, adaptable solutions that support autonomy across settings and severity levels instead […]

technocracy

Quoting Kellan Elliott-McCrea

digitado ⋅ 25 February 2026

It’s also reasonable for people who entered technology in the last couple of decades because it was good job, or because they enjoyed coding to look at this moment with a real feeling of loss. That feeling of loss though can be hard to understand emotionally for people my age who entered tech because we were addicted to feeling of agency it gave us. The web was objectively awful as a technology, and genuinely amazing, and nobody got […]

technocracy

Flow-based Generative Modeling of Potential Outcomes and Counterfactuals

digitado ⋅ 16 April 2026

arXiv:2505.16051v4 Announce Type: replace Abstract: Predicting potential and counterfactual outcomes from observational data is central to individualized decision-making, particularly in clinical settings where treatment choices must be tailored to each patient rather than guided solely by population averages. We propose PO-Flow, a continuous normalizing flow (CNF) framework for causal inference that jointly models potential outcome distributions and factual-conditioned counterfactual outcomes. Trained via flow matching, PO-Flow provides a unified approach to individualized potential outcome prediction, conditional average treatment effect […]
