technocracy

CHIME: Chiplet-based Heterogeneous Near-Memory Acceleration for Edge Multimodal LLM Inference

digitado ⋅ 29 January 2026

arXiv:2601.19908v1 Announce Type: new Abstract: The proliferation of large language models (LLMs) is accelerating the integration of multimodal assistants into edge devices, where inference is executed under stringent latency and energy constraints, often exacerbated by intermittent connectivity. These challenges become particularly acute in the context of multimodal LLMs (MLLMs), as high-dimensional visual inputs are transformed into extensive token sequences, thereby inflating the key-value (KV) cache and imposing substantial data movement overheads to the LLM backbone. To address these […]

FedPSA: Modeling Behavioral Staleness in Asynchronous Federated Learning

digitado ⋅ 17 February 2026

Asynchronous Federated Learning (AFL) has emerged as a significant research area in recent years. By not waiting for slower clients and executing the training process concurrently, it achieves faster training speed compared to traditional federated learning. However, due to the staleness introduced by the asynchronous process, its performance may degrade in some scenarios. Existing methods often use the round difference between the current model and the global model as the sole measure of staleness, which is coarse-grained and […]

A second order regret bound for NormalHedge

digitado ⋅ 10 February 2026

arXiv:2602.08151v1 Announce Type: cross Abstract: We consider the problem of prediction with expert advice for “easy” sequences. We show that a variant of NormalHedge enjoys a second-order $\epsilon$-quantile regret bound of $O\big(\sqrt{V_T \log(V_T/\epsilon)}\big)$ when $V_T > \log N$, where $V_T$ is the cumulative second moment of instantaneous per-expert regret averaged with respect to a natural distribution determined by the algorithm. The algorithm is motivated by a continuous time limit using Stochastic Differential Equations. The discrete time analysis […]

Robotic Assembly Using Deep Reinforcement Learning

digitado ⋅ 21 October 2020

Introduction. Disclaimer: this article is cross-posted from the PyTorch Medium blog. One of the most exciting advances pushing the frontier of Artificial Intelligence (AI) in recent years is Deep Reinforcement Learning (DRL). DRL belongs to the family of machine learning algorithms. It assumes that intelligent machines can learn from their actions, much as humans learn from experience. Recent years have seen some impressive real-world applications of DRL. The […]

Nuclear Norm Regularized Estimation of Panel Regression Models

digitado ⋅ 10 February 2026

arXiv:1810.10987v5 Announce Type: replace-cross Abstract: In this paper we investigate panel regression models with interactive fixed effects. We propose two new estimation methods that are based on minimizing convex objective functions. The first method minimizes the sum of squared residuals with a nuclear (trace) norm regularization. The second method minimizes the nuclear norm of the residuals. We establish the consistency of the two resulting estimators. Those estimators have a very important computational advantage compared to the existing least […]

[D] We tested the same INT8 model on 5 Snapdragon chipsets. Accuracy ranged from 93% to 71%. Same weights, same ONNX file.

digitado ⋅ 18 February 2026

We’ve been doing on-device accuracy testing across multiple Snapdragon SoCs and the results have been eye-opening. Same model. Same quantization. Same ONNX export. Deployed to 5 different chipsets:

Device                 Accuracy
Snapdragon 8 Gen 3     91.8%
Snapdragon 8 Gen 2     89.1%
Snapdragon 7s Gen 2    84.3%
Snapdragon 6 Gen 1     79.6%
Snapdragon 4 Gen 2     71.2%

Cloud benchmark reported 94.2%. The spread comes down to three things we’ve observed:

NPU precision handling: INT8 rounding behavior differs across Hexagon […]
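The per-device figures quoted in this teaser can be expressed as gaps against the reported cloud benchmark. The snippet below is a minimal sketch that uses only the numbers from the post; the helper name `accuracy_gaps` and the variable names are illustrative assumptions, not part of the original write-up.

```python
# Accuracy figures (percent) copied from the post teaser.
CLOUD_ACCURACY = 94.2  # reported cloud benchmark

device_accuracy = {
    "Snapdragon 8 Gen 3": 91.8,
    "Snapdragon 8 Gen 2": 89.1,
    "Snapdragon 7s Gen 2": 84.3,
    "Snapdragon 6 Gen 1": 79.6,
    "Snapdragon 4 Gen 2": 71.2,
}


def accuracy_gaps(cloud, per_device):
    """Return {device: gap to the cloud figure in percentage points}, rounded to 1 decimal."""
    return {name: round(cloud - acc, 1) for name, acc in per_device.items()}


gaps = accuracy_gaps(CLOUD_ACCURACY, device_accuracy)

# Difference between the best and worst on-device results, in percentage points.
spread = round(max(device_accuracy.values()) - min(device_accuracy.values()), 1)

for name, gap in gaps.items():
    print(f"{name}: {gap} points below the cloud benchmark")
print(f"Device-to-device spread: {spread} points")
```

Read this way, the smallest gap to the cloud figure is on the Snapdragon 8 Gen 3 and the largest on the Snapdragon 4 Gen 2.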

Entropy-Aware Structural Alignment for Zero-Shot Handwritten Chinese Character Recognition

digitado ⋅ 5 February 2026

arXiv:2602.03913v1 Announce Type: new Abstract: Zero-shot Handwritten Chinese Character Recognition (HCCR) aims to recognize unseen characters by leveraging radical-based semantic compositions. However, existing approaches often treat characters as flat radical sequences, neglecting the hierarchical topology and the uneven information density of different components. To address these limitations, we propose an Entropy-Aware Structural Alignment Network that bridges the visual-semantic gap through information-theoretic modeling. First, we introduce an Information Entropy Prior to dynamically modulate positional embeddings via multiplicative interaction, acting […]

It’s Time to Take Back CTRL

digitado ⋅ 8 December 2025

Technology is supercharging the attack on democracy by making it easier to spy on people, block free speech, and control what we do. The Electronic Frontier Foundation’s activists, lawyers, and technologists are fighting back. Join the movement to Take Back CTRL. Take Back CTRL is EFF’s new website to give you insight into the ways that technology has become the veins and arteries of rising global authoritarianism. It’s not just because […]

[R] Multi-Modal Reasoning with

digitado ⋅ 22 February 2026

Hi everyone, Cosmos-Reason2 is a recent Qwen3-VL-based multimodal reasoning model designed for physical AI tasks. However, it has been limited to powerful devices like DGX Spark, H100, GB200 and Jetson AGX Thor. We have deployed Cosmos-Reason2-2B under an 8GB memory constraint (Jetson Orin Nano) using model compression and inference optimizations, enabling text, image, and video reasoning. HF link with models, instructions, and benchmarks: https://huggingface.co/embedl/Cosmos-Reason2-2B-W4A16. Interested to hear any feedback, or others' experience deploying VLM reasoning models on memory-constrained […]

[D] How to get credits to run experiments on closed source models as a student researcher.

digitado ⋅ 2 March 2026

Hello! I am working on building and evaluating frontier models on a benchmark. The task is overall pretty reasoning intensive, and ends up consuming a lot of tokens. For reference, in our pilot tests, for Gemini 3.1 Pro, the average output tokens were around 30k and GPT 5.2 runs for around 15 minutes. I would need to evaluate the models on around 900 questions. What would be the best way to get credits for this? submitted by /u/Exciting_Wonder67 […]
