digitado

Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering

digitado ⋅ 6 February 2026

arXiv:2602.04924v1 Abstract: We present a formal problem formulation for \textit{Reliable} Audio-Visual Question Answering ($\mathcal{R}$-AVQA), where we prefer abstention over answering incorrectly. While recent AVQA models have high accuracy, their ability to identify when they are likely wrong and their consequent abstention from answering remain underexplored areas of research. To fill this gap, we explore several approaches and then propose Adaptive Confidence Refinement (ACR), a lightweight method to further enhance the performance of $\mathcal{R}$-AVQA. Our key […]
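The idea of preferring abstention to a wrong answer can be illustrated with a generic confidence-threshold baseline. The sketch below is not ACR, whose details are cut off in the excerpt; it assumes a model that emits one logit per candidate answer and abstains when the top softmax probability falls below a threshold. The names `answer_or_abstain` and `coverage_and_risk` and the example answers are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_or_abstain(
    logits: Sequence[float], answers: Sequence[str], threshold: float
) -> Optional[str]:
    """Return the top answer, or None (abstain) if its probability is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return answers[best] if probs[best] >= threshold else None


def coverage_and_risk(
    predictions: Sequence[Optional[str]], gold: Sequence[str]
) -> Tuple[float, float]:
    """Coverage = fraction of questions answered; risk = error rate among those answered."""
    answered = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(answered) / len(gold)
    risk = sum(p != g for p, g in answered) / len(answered) if answered else 0.0
    return coverage, risk


if __name__ == "__main__":
    answers = ["piano", "violin", "drum"]  # hypothetical answer vocabulary
    print(answer_or_abstain([4.0, 0.0, 0.0], answers, threshold=0.8))  # confident
    print(answer_or_abstain([1.0, 0.9, 0.8], answers, threshold=0.8))  # abstains
```

Raising the threshold trades coverage for lower risk, which is the trade-off a reliable question-answering setup has to manage.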


technocracy

LEMUR: Learned Multi-Vector Retrieval

digitado ⋅ 29 January 2026

Multi-vector representations generated by late interaction models, such as ColBERT, enable superior retrieval quality compared to single-vector representations in information retrieval applications. In multi-vector retrieval systems, both queries and documents are encoded using one embedding for each token, and similarity between queries and documents is measured by the MaxSim similarity measure. However, the improved recall of multi-vector retrieval comes at the expense of significantly increased latency. This necessitates designing efficient approximate nearest neighbor search (ANNS) algorithms for multi-vector […]
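The MaxSim measure mentioned above can be sketched in a few lines. This is a brute-force illustration of the scoring rule used by late interaction models such as ColBERT (for each query token, take the best dot product against any document token, then sum), not the LEMUR method itself. The toy two-dimensional embeddings and the names `maxsim` and `rank` are assumptions made for the example.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum over query tokens of the maximum dot product with any document token."""
    if not doc_tokens:
        raise ValueError("document must contain at least one token embedding")
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: Sequence[Vector], docs: Dict[str, Sequence[Vector]]) -> List[str]:
    """Exhaustively score every document and return names sorted by descending MaxSim."""
    scores = {name: maxsim(query_tokens, tokens) for name, tokens in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]  # two query token embeddings (toy values)
    docs = {
        "doc_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        "doc_b": [[1.0, 0.0], [0.9, 0.1]],
    }
    print(rank(query, docs))
```

Scoring every document this way costs one dot product per query-token and document-token pair, which is the latency problem that approximate search methods aim to reduce.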


technocracy

The Risks of Letting AI Direct Conversations

digitado ⋅ 2 March 2026

LLMs ask questions differently than humans, and that affects how executives use these tools to make decisions.


technocracy

SWE-bench February 2026 leaderboard update

digitado ⋅ 19 February 2026

SWE-bench is one of the benchmarks that the labs love to list in their model releases. The official leaderboard is infrequently updated but they just did a full run of it against the current generation of models, which is notable because it’s always good to see benchmark results like this that weren’t self-reported by the labs. The fresh results are for their “Bash Only” benchmark, which runs their mini-swe-bench agent (~9,000 lines of […]


technocracy

The Depth Delusion: Why Transformers Should Be Wider, Not Deeper

digitado ⋅ 30 January 2026

arXiv:2601.20994v1 Abstract: Neural scaling laws describe how language model loss decreases with parameters and data, but treat architecture as interchangeable: a billion parameters could arise from a shallow-wide model (10 layers & 8,192 hidden dimension) or a deep-narrow one (80 layers & 2,048 hidden dimension). We propose architecture-conditioned scaling laws decomposing this dependence, finding that optimal depth scales as D* ~ C^0.12 while optimal width scales as W* ~ C^0.34, meaning width should grow 2.8x faster […]
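The quoted exponents can be checked with a little arithmetic. The sketch below assumes that the "2.8x" figure refers to the ratio of the two exponents (0.34 / 0.12), which is one plausible reading of the truncated sentence, and it shows how much the optimal depth and width would grow under these power laws when compute C increases tenfold. The helper name `growth_factor` is hypothetical.

```python
DEPTH_EXPONENT = 0.12  # D* ~ C^0.12, as quoted in the abstract
WIDTH_EXPONENT = 0.34  # W* ~ C^0.34, as quoted in the abstract


def growth_factor(compute_multiplier: float, exponent: float) -> float:
    """Multiplicative growth of a quantity scaling as C^exponent when C is multiplied."""
    return compute_multiplier ** exponent


# Ratio of exponents: how much faster width grows than depth on a log scale.
exponent_ratio = WIDTH_EXPONENT / DEPTH_EXPONENT

# Growth of optimal depth and width for a 10x increase in compute.
depth_x10 = growth_factor(10.0, DEPTH_EXPONENT)
width_x10 = growth_factor(10.0, WIDTH_EXPONENT)

if __name__ == "__main__":
    print(f"exponent ratio: {exponent_ratio:.2f}")
    print(f"10x compute -> depth x{depth_x10:.3f}, width x{width_x10:.3f}")
```

Under these exponents, ten times more compute raises the optimal depth by roughly a third while roughly doubling the optimal width.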


technocracy

Creating GIFs in Python using Pillow (PIL Fork)

digitado ⋅ 21 August 2018

I was working on a personal project the other day and I needed to create some images (frames) and save them as a playable GIF. Working in Python, I expected to find an easy solution fast, but oh boy did it take me too long to find it. Here I am now, creating a blog post to help future people looking to create GIFs in Python.


technocracy

Attention? Attention!

digitado ⋅ 24 June 2018

[Updated on 2018-10-28: Add Pointer Network and the link to my implementation of Transformer.] [Updated on 2018-11-06: Add a link to the implementation of Transformer model.] [Updated on 2018-11-18: Add Neural Turing Machines.] [Updated on 2019-07-18: Correct the mistake on using the term “self-attention” when introducing the show-attention-tell paper; moved it to Self-Attention section.] [Updated on 2020-04-07: A follow-up post on improved Transformer models is here.]


technocracy

Federal data underscores meteoric rise of streaming subscription prices in 2025

digitado ⋅ 14 January 2026

The prices that Americans paid for subscription- and rental-based access to video streaming services and video games increased 29 percent from December 2024 to December 2025, according to data that the US Department of Labor’s Bureau of Labor Statistics (BLS) released on Tuesday. According to the BLS, the Consumer Price Index for All Urban Consumers (CPI-U), which BLS says represents over 90 percent of the US population across the country, for all items “increased 2.7 percent before seasonal […]


technocracy

Full-Batch Gradient Descent Outperforms One-Pass SGD: Sample Complexity Separation in Single-Index Learning

digitado ⋅ 2 February 2026

It is folklore that reusing training data more than once can improve the statistical efficiency of gradient-based learning. However, beyond linear regression, the theoretical advantage of full-batch gradient descent (GD, which always reuses all the data) over one-pass stochastic gradient descent (online SGD, which uses each data point only once) remains unclear. In this work, we consider learning a $d$-dimensional single-index model with a quadratic activation, for which it is known that one-pass SGD requires $n \gtrsim d \log d$ […]


technocracy

The Transatlantic Divide: When Platforms Become Politics

digitado ⋅ 30 January 2026

In the previous articles in this series, we looked at digital trust as a progression: we examined trust as a social mechanism, analysed governance as its point of failure, and walked through legitimacy as the condition that determines whether systems endure. This fourth piece extends that shared logic outward, beyond platforms themselves, into geopolitics.

A Regulatory Fight That Isn’t Really About Regulation

What looks like a trade dispute over digital platforms is, at a deeper level, a clash […]
