Adaptive Multi-Objective Tiered Storage Configuration for KV Cache in LLM Service
arXiv:2603.08739v1

Abstract: The memory-for-computation paradigm of KV caching is essential for accelerating large language model (LLM) inference serving, but limited GPU high-bandwidth memory (HBM) capacity motivates offloading the KV cache to cheaper external storage tiers. While this expands capacity, it introduces the challenge of dynamically managing heterogeneous storage resources to balance cost, throughput, and latency under varying workloads. We formulate this as a multi-objective optimization problem: identifying the Pareto frontier across these metrics within the […]
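
The multi-objective formulation can be made concrete with a minimal sketch: given candidate tiered-storage configurations scored on cost, throughput, and latency, the Pareto frontier is simply the non-dominated subset. The configuration names, objective units, and numbers below are illustrative assumptions, not values from the paper; in practice each placement would be profiled under the observed workload.

from dataclasses import dataclass

@dataclass(frozen=True)
class TierConfig:
    name: str          # hypothetical label for a storage-tier assignment
    cost: float        # cost of the tier mix (minimize)
    throughput: float  # tokens/s served (maximize)
    latency: float     # p99 latency in ms (minimize)

def dominates(a: TierConfig, b: TierConfig) -> bool:
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one."""
    no_worse = (a.cost <= b.cost and a.latency <= b.latency
                and a.throughput >= b.throughput)
    strictly_better = (a.cost < b.cost or a.latency < b.latency
                       or a.throughput > b.throughput)
    return no_worse and strictly_better

def pareto_frontier(configs: list[TierConfig]) -> list[TierConfig]:
    """Return the non-dominated subset of candidate configurations."""
    return [c for c in configs
            if not any(dominates(other, c) for other in configs)]

if __name__ == "__main__":
    # Illustrative numbers only (assumed, not from the paper).
    candidates = [
        TierConfig("all-HBM",      cost=9.0, throughput=5200, latency=18),
        TierConfig("HBM+DRAM",     cost=5.5, throughput=4600, latency=25),
        TierConfig("HBM+DRAM+SSD", cost=3.2, throughput=3900, latency=41),
        TierConfig("all-SSD",      cost=1.1, throughput=1200, latency=140),
        TierConfig("DRAM-heavy",   cost=5.8, throughput=4400, latency=30),  # dominated by HBM+DRAM
    ]
    for cfg in pareto_frontier(candidates):
        print(cfg)

An adaptive controller in the spirit of the abstract would re-run such a frontier computation as the workload shifts and pick a point on it according to the operator's cost/latency preference.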