How fast is 10 tokens per second really?

digitado ⋅ May 20, 2026

A neat little HTML app by Mike Veerman (source code here) that simulates LLM token output at speeds from 5 to 800 tokens per second.

Useful if you see a model advertised as “30 tokens/second” and want to get a feel for what that actually looks like.
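To put rough numbers on that feeling, here is a minimal Python sketch of the same idea. It is not Mike Veerman's implementation: the function names `token_delay`, `estimated_duration`, and `simulate_stream` are hypothetical, and it assumes that whitespace-separated words stand in for tokens and that tokens arrive at a perfectly even rate. Real tokens are often shorter than words and real output is burstier.

```python
import time


def token_delay(tokens_per_second: float) -> float:
    """Seconds to wait between tokens at a given output speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return 1.0 / tokens_per_second


def estimated_duration(n_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to emit n_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return n_tokens / tokens_per_second


def simulate_stream(text, tokens_per_second, sleep=time.sleep, write=None):
    """Print text one pseudo-token at a time, pausing between tokens.

    Assumption: each whitespace-separated word counts as one token.
    `sleep` and `write` are injectable so the pacing can be checked
    without actually waiting. Returns the number of tokens emitted.
    """
    if write is None:
        def write(chunk):
            print(chunk, end="", flush=True)

    delay = token_delay(tokens_per_second)
    tokens = text.split()
    for token in tokens:
        write(token + " ")
        sleep(delay)
    write("\n")
    return len(tokens)
```

Under these assumptions, `estimated_duration(300, 30)` returns 10.0, so a 300-token answer at 30 tokens/second takes about ten seconds to finish. Calling `simulate_stream(some_text, 30)` in a terminal shows the pacing directly.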

Via Hacker News

Tags: ai, generative-ai, llms

