digitado

About digitado

https://www.digitado.com.br

Reason Tuning Qwen2.5-0.5B-Instruct on GSM8K dataset using GRPO written from scratch

digitado ⋅ 1 April 2026

So, I have been trying to reason-tune a Qwen2.5-0.5B-Instruct model on the GSM8K math dataset on my Mac mini cluster for some time, using a GRPO implementation I wrote from scratch. It's just reward hacking. Why? Because the correct-answer reward signal is too shallow: it only rewards a correct final answer, with nothing in between. So I added a format reward so that the rewards, and thus the advantages, don't become near […]
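A minimal sketch of the idea in this excerpt, not the author's code: in GRPO, advantages are computed by normalizing rewards within a group of sampled completions, so a group whose rewards are all identical yields all-zero advantages. The function names, the `<think>`/`<answer>` tag format, and the 0.5 format bonus are assumptions made for illustration only.

```python
import re
import statistics


def correctness_reward(completion: str, gold: str) -> float:
    """Return 1.0 if the last number in the completion equals the gold answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return 1.0 if numbers and float(numbers[-1]) == float(gold) else 0.0


def format_reward(completion: str) -> float:
    """Return a small bonus when the completion follows an assumed tag format."""
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return 0.5 if re.search(pattern, completion, re.DOTALL) else 0.0


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Every completion is wrong: a correctness-only reward gives no signal.
    print(group_advantages([0.0, 0.0, 0.0, 0.0]))

    # One completion at least follows the format: advantages are now non-zero.
    print(group_advantages([0.5, 0.0, 0.0, 0.0]))
```

With a correctness-only reward, a small model that rarely solves a problem produces many all-zero groups, and those groups contribute no gradient. A denser reward term makes such groups informative, though it also gives the policy something other than correctness to optimize.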

technocracy

Learning Shared Representations for Multi-Task Linear Bandits

digitado ⋅ 1 April 2026

Multi-task representation learning is an approach that learns shared latent representations across related tasks, facilitating knowledge transfer and improving sample efficiency. This paper introduces a novel approach to multi-task representation learning in linear bandits. We consider a setting with $T$ concurrent linear bandit tasks, each with feature dimension $d$, that share a common latent representation of dimension $r \ll \min\{d,T\}$, capturing their underlying relatedness. We propose a new Optimism in the Face of Uncertainty Linear (OFUL) algorithm that […]
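One common way to formalize a shared low-rank representation in multi-task linear bandits is sketched below. The symbols $B$, $w_t$, $x_{t,n}$, $y_{t,n}$, and $\eta_{t,n}$ are assumptions for illustration; the excerpt does not state the paper's exact model.

```latex
\theta_t = B\,w_t, \qquad
B \in \mathbb{R}^{d \times r},\quad w_t \in \mathbb{R}^{r},\quad r \ll \min\{d, T\},
\qquad
y_{t,n} = x_{t,n}^{\top} B\, w_t + \eta_{t,n}, \quad t = 1, \dots, T .
```

Under this assumed model the $T$ tasks jointly have $dr + Tr$ free parameters instead of $dT$, which is where the gain in sample efficiency would come from when $r$ is small.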

technocracy

AI Model Develops Object Recognition Without Human Guidance

digitado ⋅ 1 April 2026

Authors: Mathilde Caron (Facebook AI Research, Inria); Hugo Touvron (Facebook AI Research, Sorbonne University); Ishan Misra (Facebook AI Research); Herve Jegou (Facebook AI Research); Julien Mairal (Inria); Piotr Bojanowski (Facebook AI Research); Armand Joulin (Facebook AI Research). Abstract: In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) [19] that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, we […]

technocracy

Structural Coercion and the AI Workplace

digitado ⋅ 1 April 2026

Part 3: Why Huxley aged better than Orwell on this question, and what the adoption curve is actually measuring. “A really efficient totalitarian state would be one in which the all-powerful executive of political bosses and their army of managers control a population of slaves who do not have to be coerced, because they love their servitude.” (Aldous Huxley, foreword to Brave New World, 1946.) Nobody is forcing you. The system doesn’t need to. That’s the insight at […]

technocracy

MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding

digitado ⋅ 1 April 2026

With the rapid growth of e-commerce, exploring general representations rather than task-specific ones has attracted increasing attention. Although recent multimodal large language models (MLLMs) have driven significant progress in product understanding, they are typically employed as feature extractors that implicitly encode product information into global embeddings, thereby limiting their ability to capture fine-grained attributes. Therefore, we argue that leveraging the reasoning capabilities of MLLMs to explicitly model fine-grained product attributes holds significant potential. Nevertheless, achieving this goal remains […]

technocracy

The Evolution of Mobile Networks from 5G to 6G

digitado ⋅ 1 April 2026

Owing to the ever-increasing demand for faster communication networks, a new 6G technology is expected to emerge in the near future. The 6th-generation mobile communication network is expected to further improve and enhance the existing networks. Mobile networks have evolved dramatically over the years: 1G was voice-based, and 2G progressed further to text messages. 3G enabled users to exchange multimedia information such as videos, music, and images. 4G ensured […]

technocracy

Building a Long-Running Conversational AI Agent with Intelligent Context Management

digitado ⋅ 1 April 2026

Author(s): Jageen Shukla. Originally published on Towards AI. Learn how to build an AI agent that remembers unlimited conversation history using Redis, ChromaDB vector search, and intelligent context management. Full source code is available on GitHub, and you can read this blog for free from here. Imagine having a conversation with an AI assistant that truly remembers your past discussions. Not just the last few messages, but meaningful context from conversations that happened days or even weeks ago. This is […]

technocracy

Improving Efficiency of GPU Kernel Optimization Agents using a Domain-Specific Language and Speed-of-Light Guidance

digitado ⋅ 1 April 2026

arXiv:2603.29010v1. Abstract: Optimizing GPU kernels with LLM agents is an iterative process over a large design space. Every candidate must be generated, compiled, validated, and profiled, so fewer trials will save both runtime and cost. We make two key observations. First, the abstraction level that agents operate at is important. If it is too low, the LLM wastes reasoning on low-impact details. If it is too high, it may miss important optimization choices. Second, agents […]

technocracy

MEDiC: Multi-objective Exploration of Distillation from CLIP

digitado ⋅ 1 April 2026

arXiv:2603.29009v1. Abstract: Masked image modeling (MIM) methods typically operate in either raw pixel space (reconstructing masked patches) or latent feature space (aligning with a pre-trained teacher). We present MEDiC (Multi-objective Exploration of Distillation from CLIP), a framework that combines both spaces in a single pipeline through three complementary objectives: patch-level token distillation from a frozen CLIP encoder, global CLS alignment, and pixel reconstruction via a lightweight decoder. We conduct a systematic investigation of the design […]

technocracy

Gleanmer: A 6 mW SoC for Real-Time 3D Gaussian Occupancy Mapping

digitado ⋅ 1 April 2026

arXiv:2603.29005v1. Abstract: High-fidelity 3D occupancy mapping is essential for many edge-based applications (such as AR/VR and autonomous navigation) but is limited by power constraints. We present Gleanmer, a system on chip (SoC) with an accelerator for GMMap, a 3D occupancy map using Gaussians. Through algorithm-hardware co-optimizations for direct computation and efficient reuse of these compact Gaussians, Gleanmer reduces construction and query energy by up to 63% and 81%, respectively. Approximate computation on Gaussians reduces accelerator […]
