LLM API Token Caching: The 90% Cost Reduction Feature When Building AI Applications
Author(s): Nikhil

Originally published on Towards AI.

If you've used Claude, GPT-4, or any modern LLM API and you are not caching your system prompt, or any prompt segment that is static and doesn't change between API calls, you've been spending far more than necessary on token processing.

Cost comparison: 10x savings on cached token reads

Token caching provides substantial cost benefits by allowing reuse of […]
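As a concrete illustration of the idea, here is a minimal sketch using Anthropic's prompt caching, where a static system prompt is marked with cache_control so that repeat calls read those tokens from cache at the discounted rate. The model name and prompt text are placeholder assumptions; check the provider docs for currently supported models and minimum cacheable prompt sizes.

```python
# Minimal prompt-caching sketch with the Anthropic Python SDK.
# Model name and prompt text below are placeholders, not recommendations.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A long, static system prompt: the part worth caching across calls.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant for Acme Corp. Follow the policies below. ..."
) * 50

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": STATIC_SYSTEM_PROMPT,
            # Marks the prompt prefix up to this point as cacheable;
            # subsequent calls within the cache TTL are billed at the
            # cheaper cache-read rate for these tokens.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    # Only this dynamic part changes per request and is billed at full price.
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)

# The usage object reports cache activity: cache_creation_input_tokens on
# the first call, cache_read_input_tokens on later cache hits.
print(response.usage)
```

Running this twice in quick succession shows the split: the first call pays a one-time cache-write premium, while the second reports most of its input as cache_read_input_tokens, which is where the roughly 10x per-token saving on reads comes from.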