Simplifying Deep Temporal Difference Learning
tl;dr: The authors propose PQN, a simplified deep online Q-learning algorithm that uses very small replay buffers. Normalization and parallelized sampling from vectorized environments stabilize training without the need for a huge replay buffer. PQN is competitive with more complex methods such as Rainbow, PPO-RNN, and QMix while being 50x faster than traditional DQN.

Introduction

Temporal difference (TD) methods can be simple and efficient, but are notoriously unstable when combined with neural networks or off-policy sampling. All of the following […]
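To make the tl;dr concrete, here is a minimal sketch of the recipe it describes: online Q-learning across vectorized environments, with layer normalization inside the Q-network and no separate target network or large replay buffer. This is an illustrative toy using gymnasium and PyTorch, not the authors' PQN implementation; the network layout, hyperparameters, and names like `QNet` are assumptions.

```python
# Illustrative sketch only: LayerNorm Q-network trained online from parallel envs.
import gymnasium as gym
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

envs = gym.vector.SyncVectorEnv([lambda: gym.make("CartPole-v1")] * 8)
n_actions = int(envs.single_action_space.n)
q = QNet(envs.single_observation_space.shape[0], n_actions)
opt = torch.optim.Adam(q.parameters(), lr=3e-4)
gamma, eps = 0.99, 0.1

obs, _ = envs.reset(seed=0)
obs = torch.as_tensor(obs, dtype=torch.float32)
for step in range(1000):
    # Epsilon-greedy action for every parallel environment.
    with torch.no_grad():
        greedy = q(obs).argmax(dim=-1)
    random_a = torch.randint(n_actions, greedy.shape)
    actions = torch.where(torch.rand(greedy.shape) < eps, random_a, greedy)

    next_obs, rewards, terms, truncs, _ = envs.step(actions.numpy())
    next_obs = torch.as_tensor(next_obs, dtype=torch.float32)
    rewards = torch.as_tensor(rewards, dtype=torch.float32)
    done = torch.as_tensor(terms, dtype=torch.float32)

    # One-step TD target from the same online network (no target network,
    # no replay buffer); episode autoreset details are glossed over here.
    with torch.no_grad():
        target = rewards + gamma * (1 - done) * q(next_obs).max(dim=-1).values
    q_sa = q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q_sa, target)

    opt.zero_grad()
    loss.backward()
    opt.step()
    obs = next_obs
```

The point of the sketch is the structure of the update, not the specifics: each gradient step consumes a fresh batch of transitions gathered in parallel, and the LayerNorm layers play the stabilizing role that a target network and large replay buffer play in classic DQN.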