technocracy

CryptoAnalystBench: Failures in Multi-Tool Long-Form LLM Analysis

digitado ⋅ 13 February 2026

arXiv:2602.11304v1. Abstract: Modern analyst agents must reason over complex, high-token inputs, including dozens of retrieved documents, tool outputs, and time-sensitive data. While prior work has produced tool-calling benchmarks and examined factuality in knowledge-augmented systems, relatively little work studies their intersection: settings where LLMs must integrate large volumes of dynamic, structured and unstructured multi-tool outputs. We investigate LLM failure modes in this regime using crypto as a representative high-data-density […]

technocracy

Introducing Amazon Bedrock global cross-Region inference for Anthropic’s Claude models in the Middle East Regions (UAE and Bahrain)

digitado ⋅ 24 February 2026

We’re excited to announce the availability of Anthropic’s Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, and Claude Haiku 4.5 through Amazon Bedrock global cross-Region inference for customers operating in the Middle East. This launch helps organizations in the Middle East access Anthropic’s latest Claude models on Amazon Bedrock while benefiting from global, highly available inference routing across the AWS network. With global cross-Region inference, you can scale inference workloads seamlessly, improve resiliency, […]

technocracy

Are we tokenmaxxing our way to nowhere?

digitado ⋅ 17 April 2026

The gap between AI insiders and everyone else is widening, and the spending, suspicion, and even new vocabulary are starting to show it. While OpenAI is busy buying up everything from finance apps to talk shows, a certain shoe company just rebranded as an AI infrastructure play, and Anthropic unveiled a model it says is too powerful to release publicly …but apparently not too […]

technocracy

EPA moves to stop considering economic benefits of cleaner air

digitado ⋅ 13 January 2026

If you were to do a cost-benefit analysis of your lunch, it would be pretty difficult to do the calculation without the sandwich. But it appears that the US Environmental Protection Agency (EPA) is moving in this same direction—removing the benefit—when it comes to air pollution regulations. According to a New York Times report based on internal emails and documents—and demonstrated by a recently produced analysis on the EPA website—the EPA is changing its cost-benefit analysis process for […]

technocracy

Reddit will require “fishy” accounts to verify they are run by a human

digitado ⋅ 25 March 2026

Reddit will require accounts that exhibit “automated or otherwise fishy behavior” to verify that a human runs them, Reddit CEO Steve Huffman said in a Reddit post today. The verification process aims to keep unwanted bots from flooding Reddit at a time when AI bots are poised to take over the Internet. “As AI becomes a bigger part of the Internet, we want to make sure that when you’re on Reddit, you know when you’re talking to a […]

technocracy

Time-TK: A Multi-Offset Temporal Interaction Framework Combining Transformer and Kolmogorov-Arnold Networks for Time Series Forecasting

digitado ⋅ 13 February 2026

arXiv:2602.11190v1. Abstract: Time series forecasting is crucial for the World Wide Web and represents a core technical challenge in ensuring the stable and efficient operation of modern web services, such as intelligent transportation and website throughput. However, we have found that existing methods typically employ a strategy of embedding each time step as an independent token. This paradigm introduces a fundamental information bottleneck when processing long sequences, the root cause of which is that independent […]

technocracy

The UK Has It Wrong on Digital ID. Here’s Why.

digitado ⋅ 8 December 2025

In late September, the United Kingdom’s Prime Minister Keir Starmer announced his government’s plans to introduce a new digital ID scheme in the country to take effect before the end of the Parliament (no later than August 2029). The scheme will, according to the Prime Minister, “cut the faff” in proving people’s identities by creating a virtual ID on personal devices with information like people’s name, date of birth, nationality or residency status, and photo to verify their […]

technocracy

PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis

digitado ⋅ 4 March 2026

arXiv:2603.02268v1. Abstract: EEG foundation models are typically pretrained on narrow-source clinical archives and evaluated on benchmarks from the same ecosystem, leaving unclear whether representations encode neural physiology or recording-distribution artifacts. We introduce PRISM (Population Representative Invariant Signal Model), a masked autoencoder ablated along two axes (pretraining population and downstream adaptation), with architecture and preprocessing fixed. We compare a narrow-source EU/US corpus (TUH + PhysioNet) against a geographically diverse pool augmented with multi-center South […]

technocracy

The Median is Easier than it Looks: Approximation with a Constant-Depth, Linear-Width ReLU Network

digitado ⋅ 10 February 2026

arXiv:2602.07219v1. Abstract: We study the approximation of the median of $d$ inputs using ReLU neural networks. We present depth-width tradeoffs under several settings, culminating in a constant-depth, linear-width construction that achieves exponentially small approximation error with respect to the uniform distribution over the unit hypercube. By further establishing a general reduction from the maximum to the median, our results break a barrier suggested by prior work on the maximum function, which indicated that linear width […]

technocracy

A better training method for reinforcement learning with human feedback

digitado ⋅ 2 May 2025

Contrasting training pairs with large reward differences mitigate spurious correlations and improve performance of direct-alignment algorithms by as much as 20% to 40%. (Machine learning; Sailik Sengupta, Saket Dingliwal; May 02, 09:00 AM; May 13, 02:56 PM.) Reinforcement learning with human feedback (RLHF) is the standard method for aligning large language models (LLMs) with human preferences such as the preferences for nontoxic language and factually accurate responses. Recently, one of […]
