January 2026

Rewarding Intellectual Humility: Learning When Not to Answer in Large Language Models

digitado ⋅ January 28, 2026

Large Language Models (LLMs) often produce hallucinated or unverifiable content, undermining their reliability in factual domains. This work investigates Reinforcement Learning with Verifiable Rewards (RLVR) as a training paradigm that explicitly rewards abstention ("I don’t know") alongside correctness to promote intellectual humility. We fine-tune and evaluate Granite-3.3-2B-Instruct and Qwen-3-4B-Instruct on the MedMCQA and Hendrycks Math benchmarks using a ternary reward structure ($-1$, $r_{\text{abs}}$, $+1$) under varying abstention reward structures. We further study the effect of combining RLVR with […]
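The ternary reward described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exact-match comparison, and the default abstention phrase are assumptions made for illustration. The second helper shows a simple consequence of the reward values: if the model's answer is correct with probability p, answering has expected reward 2p - 1, so it beats abstaining only when p exceeds (1 + r_abs) / 2.

```python
def ternary_reward(prediction, gold, r_abs=0.0, abstain_phrase="I don't know"):
    """Return +1 for a correct answer, r_abs for abstention, -1 otherwise.

    Hypothetical helper: exact string matching stands in for whatever
    verifier a real RLVR pipeline would use.
    """
    def norm(text):
        # Treat curly and straight apostrophes alike, ignore case and padding.
        return text.replace("\u2019", "'").strip().lower()

    answer = norm(prediction)
    if answer == norm(abstain_phrase):
        return r_abs
    return 1.0 if answer == norm(gold) else -1.0


def answer_threshold(r_abs):
    """Confidence above which answering has higher expected reward than abstaining.

    Answering yields p * (+1) + (1 - p) * (-1) = 2p - 1 in expectation,
    which exceeds r_abs exactly when p > (1 + r_abs) / 2.
    """
    return (1.0 + r_abs) / 2.0
```

Under these assumptions, raising r_abs raises the confidence a model needs before answering pays off, which is one way to read the "varying abstention reward structures" mentioned above.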


technocracy

Why Employee Disengagement Spreads Faster in Large Organizations

digitado ⋅ January 28, 2026

Although it is not always immediately obvious to higher-ups, employee disengagement can quickly affect the wider team if it is not addressed. Even in large companies, where employees often have longer tenure than in smaller businesses because of the stability and growth prospects, the sheer scale of the workforce means that small issues can grow into significant problems. Once disengagement starts to spread, company culture and productivity can take […]


Dozens of CDC vaccination databases have been frozen under RFK Jr.

digitado ⋅ January 28, 2026

Nearly half of the databases that public health officials at the Centers for Disease Control and Prevention were updating on a monthly basis have been frozen without notice or explanation, according to a study published in the Annals of Internal Medicine. The study—led by Janet Freilich, a law expert at Boston University, and Jeremy Jacobs, a medical professor at Vanderbilt University—examined the status of all CDC databases, finding a total of 82 that had, as of early 2025, […]


A Reinforcement Learning Based Universal Sequence Design for Polar Codes

digitado ⋅ January 28, 2026

To advance Polar code design for 6G applications, we develop a reinforcement learning-based universal sequence design framework that is extensible and adaptable to diverse channel conditions and decoding strategies. Crucially, our method scales to code lengths up to $2048$, making it suitable for use in standardization. Across all $(N,K)$ configurations supported in 5G, our approach achieves competitive performance relative to the NR sequence adopted in 5G and yields up to a 0.2 dB gain over the beta-expansion baseline […]


TikTok users “absolutely justified” in fearing MAGA makeover, experts say

digitado ⋅ January 28, 2026

TikTok wants users to believe that failures blocking uploads of anti-ICE videos or direct messages mentioning Jeffrey Epstein are due to technical errors, not to the platform shifting to censor content critical of Donald Trump after he hand-picked the US owners who took over the app last week. However, experts say that TikTok users’ censorship fears are justified, whether the bugs are to blame or not. Ioana Literat, an associate professor of technology, media, and learning at Teachers College, Columbia […]


In-Context Reinforcement Learning From Suboptimal Historical Data

digitado ⋅ January 28, 2026

Transformer models have achieved remarkable empirical successes, largely due to their in-context learning capabilities. Inspired by this, we explore training an autoregressive transformer for in-context reinforcement learning (ICRL). In this setting, we initially train a transformer on an offline dataset consisting of trajectories collected from various RL tasks, and then fix and use this transformer to create an action policy for new RL tasks. Notably, we consider the setting where the offline dataset contains trajectories sampled from suboptimal […]


There’s a rash of scam spam coming from a real Microsoft address

digitado ⋅ January 27, 2026

There are reports that a legitimate Microsoft email address—which Microsoft explicitly says customers should add to their allow list—is delivering scam spam. The emails originate from no-reply-powerbi@microsoft.com, an address tied to Power BI. The Microsoft platform provides analytics and business intelligence from various sources that can be integrated into a single dashboard. Microsoft documentation says that the address is used to send subscription emails to mail-enabled security groups. To prevent spam filters from blocking the address, the company […]


OpenAI Launches Prism, a Free Writing Workspace Built for Scientists & Researchers

digitado ⋅ January 27, 2026

Science explains the mechanisms of our existence and quietly powers almost everything around us. Although AI has come a long way, the everyday tools scientists use feel dated, and many researchers still rely on old software to write, revise, and collaborate on research. Research papers are still drafted across multiple platforms, equations live in separate files, and citations are managed elsewhere. OpenAI has now stepped in to fix that. Today, the AI giant launched Prism, […]


Vector Databases in Practice: Building a Realistic Hybrid Search RAG System with Qdrant

digitado ⋅ January 27, 2026

Vector databases are often introduced as tools for semantic similarity search. In practice, that understanding breaks down the moment you try to build a real RAG system. In this article, I explain what vector databases actually do inside modern retrieval pipelines, why pure semantic search is insufficient, and why hybrid search is not an optimization but a requirement for production systems. You will see why semantic search fails silently, keyword search fails noisily, and why hybrid retrieval is the […]
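One common way to combine a semantic (dense) ranking with a keyword (sparse) ranking is reciprocal rank fusion (RRF). The sketch below is a plain-Python illustration of that idea and is not taken from the article; the function name, the document IDs, and the constant `k = 60` (a conventional default) are assumptions, and it does not call the Qdrant client.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Ties are broken by document ID so the
    output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) search and a keyword search.
dense_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)  # documents found by both retrievers rise to the top
```

Because RRF uses only ranks, it avoids having to normalize dense similarity scores against keyword scores, which live on different scales.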


LLM Inference Optimization

digitado ⋅ January 27, 2026

KV Cache, Paged Attention, Flash Attention, Batching, MQA, GQA & Parallelism techniques

A typical article on this topic might start off by explaining key innovations like KV caching, Paged attention, Dynamic Batching, Flash attention, MQA, GQA etc. Instead, let us start by simply observing the LLM Inference process more closely. If we do a good enough job, we will be in a position to “predict” typical bottlenecks in the inferencing operation. Once we know what these bottlenecks are, […]
