technocracy

Understanding Pruning Regimes in Vision-Language Models Through Domain-Aware Layer Selection

digitado ⋅ 24 March 2026

arXiv:2603.20275v1 (announce type: new). Abstract: Transformer-based vision-language models (VLMs) contain substantial depth redundancy, yet the effect of removing specific decoder layers remains poorly understood, especially for domains that require tight coupling between perception and multi-step reasoning. We study structured decoder layer pruning through the lens of domain-aware activation similarity, measuring how strongly each layer transforms representations for math versus non-math inputs. This yields simple math-aware, non-math-aware, and mixed ranking criteria that identify layers whose input-output activations change least […]
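The idea of ranking layers by how little they change their input can be illustrated with a minimal sketch. It is not the paper's implementation: the choice of cosine similarity, the function names, and the `math_weight` blend used for the "mixed" criterion are all assumptions made for illustration, and the activations are plain Python lists rather than real model tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def layer_scores(layer_io):
    """Mean input-output similarity per layer.

    layer_io[i] is a list of (input_vector, output_vector) pairs
    collected at layer i over a set of examples.
    """
    return [
        sum(cosine_similarity(x, y) for x, y in pairs) / len(pairs)
        for pairs in layer_io
    ]


def rank_layers_for_pruning(math_io, non_math_io, math_weight=0.5):
    """Return layer indices, most redundant first.

    math_weight is a hypothetical knob: 1.0 gives a math-aware ranking,
    0.0 a non-math-aware ranking, and values in between a mixed one.
    """
    math_scores = layer_scores(math_io)
    other_scores = layer_scores(non_math_io)
    mixed = [
        math_weight * m + (1 - math_weight) * o
        for m, o in zip(math_scores, other_scores)
    ]
    # A higher similarity means the layer changes its input less.
    return sorted(range(len(mixed)), key=lambda i: mixed[i], reverse=True)
```

Under these assumptions, the first indices returned are the candidates for removal, and changing `math_weight` shows how the candidate set shifts between domains.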


Functionalization of Situated Robots via Vapour

digitado ⋅ 31 March 2026

arXiv:2603.26752v1 (announce type: new). Abstract: Tight matching with the environment is key to effective robot operation in complex settings. Situated robots that build their bodies in situ (e.g. by spinning) are uniquely positioned to exploit their surroundings, yet functionalization of these structures remains an integration challenge – multimaterial spinning requires complex spinneret multiplexing, and mixture doping is limited by additive availability and chemical stability. We propose instead using materials available in the environment to functionalize in situ spun […]


Towards OOD Generalization in Dynamic Graphs via Causal Invariant Learning

digitado ⋅ 2 March 2026

Although dynamic graph neural networks (DyGNNs) have demonstrated promising capabilities, most existing methods ignore out-of-distribution (OOD) shifts that commonly exist in dynamic graphs. Dynamic graph OOD generalization is non-trivial due to the following challenges: 1) Identifying invariant and variant patterns amid complex graph evolution, 2) Capturing the intrinsic evolution rationale from these patterns, and 3) Ensuring model generalization across diverse OOD shifts despite limited data distribution observations. Although several attempts have been made to tackle these challenges, none […]


Personalized Federated Sequential Recommender

digitado ⋅ 25 March 2026

arXiv:2603.22349v1 (announce type: new). Abstract: In the domain of consumer electronics, personalized sequential recommendation has emerged as a central task. Current methodologies in this field are largely centered on modeling user behavior and have achieved notable performance. Nevertheless, the inherent quadratic computational complexity typical of most existing approaches often leads to inefficiencies that hinder real-time recommendation. Moreover, these methods face challenges in being effectively adapted to the personalized requirements of users across diverse scenarios. To tackle these issues, […]


Shape-Adaptive Conditional Calibration for Conformal Prediction via Minimax Optimization

digitado ⋅ 25 March 2026

arXiv:2603.23374v1 (announce type: cross). Abstract: Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterization of conditional coverage through marginal moment restrictions, we introduce Minimax Optimization Predictive Inference (MOPI), a framework that generalizes prior work by optimizing over a flexible class of set-valued mappings during the calibration phase, rather than simply calibrating a fixed sublevel set. This minimax formulation effectively circumvents the structural […]
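For context, the baseline the abstract contrasts with, calibrating a fixed sublevel set, can be sketched as standard split conformal prediction. This is not MOPI and makes no attempt at the minimax step; the function names and the dictionary of per-label scores are assumptions chosen for illustration.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold with the usual finite-sample correction.

    calibration_scores are nonconformity scores on held-out data;
    alpha is the target miscoverage rate (e.g. 0.1 for 90% coverage).
    """
    n = len(calibration_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must be everything.
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_set(candidate_scores, threshold):
    """Fixed sublevel set: keep every label whose score is at most the threshold.

    candidate_scores is a hypothetical mapping from label to nonconformity score.
    """
    return [label for label, score in candidate_scores.items() if score <= threshold]
```

The threshold here is a single scalar shared by all inputs, which gives marginal rather than conditional coverage; that limitation is what the abstract's set-valued mappings are meant to address.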


ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech

digitado ⋅ 18 January 2026

Learning representative embeddings for different types of speaking styles, such as emotion, age, and gender, is critical for both recognition tasks (e.g., cognitive computing and human-computer interaction) and generative tasks (e.g., style-controllable speech generation). In this work, we introduce ParaMETA, a unified and flexible framework for learning and controlling speaking styles directly from speech. Unlike existing methods that rely on single-task models or cross-modal alignment, ParaMETA learns disentangled, task-specific embeddings by projecting speech into dedicated subspaces for each […]


Building Aether: Architectural Breakdown of a Local-First P2P Messenger

digitado ⋅ 6 April 2026

Most “secure” messengers today still rely on centralized infrastructure. Whether it’s for signaling, metadata storage, or push notifications, there is almost always a server sitting between you and your recipient. With Aether, I wanted to take a different route. The goal was to build a strictly local-first software architecture. If two devices are on the same network, they should be able to discover each other and communicate directly—no cloud, no central databases, and no intermediary nodes. Here is […]


Enhancing Renal Tumor Malignancy Prediction: Deep Learning with Automatic 3D CT Organ Focused Attention

digitado ⋅ 27 February 2026

arXiv:2602.22381v1 (announce type: new). Abstract: Accurate prediction of malignancy in renal tumors is crucial for informing clinical decisions and optimizing treatment strategies. However, existing imaging modalities lack the necessary accuracy to reliably predict malignancy before surgical intervention. While deep learning has shown promise in malignancy prediction using 3D CT images, traditional approaches often rely on manual segmentation to isolate the tumor region and reduce noise, which enhances predictive performance. Manual segmentation, however, is labor-intensive, costly, and dependent on […]


Tencent Advertising Algorithm Challenge 2025: All-Modality Generative Recommendation

digitado ⋅ 8 April 2026

arXiv:2604.04976v1 (announce type: new). Abstract: Generative recommender systems are rapidly emerging as a new paradigm for recommendation, where collaborative identifiers and/or multi-modal content are mapped into discrete token spaces and user behavior is modelled with autoregressive sequence models. Despite progress on multi-modal recommendation datasets, there is still a lack of public benchmarks that jointly offer large-scale, realistic and fully all-modality data designed specifically for generative recommendation (GR) in industrial advertising. To foster research in this direction, we organised […]


Crafting the Eyes for Thinking Machines: The “White Box” VLM

digitado ⋅ 7 February 2026

“In a voyage to build an open foundation for enthusiasts — to brainstorm and invent, rather than becoming sheep in the herd who call VLMs ‘expensive black boxes’ and settle for whatever crumbs enterprises toss over the wall.” The Manifesto We reject the “Black Box.” We refuse to treat computer vision as a magic API call. We demand to see the gears turning. We build to understand. We are not chasing the highest benchmark score on day one; we are chasing the […]
