SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks
arXiv:2603.24755v1 Announce Type: new Abstract: Software development is iterative, yet agentic coding benchmarks overwhelmingly evaluate single-shot solutions against complete specifications. Code can pass the test suite but become progressively harder to extend. Recent iterative benchmarks attempt to close this gap, but constrain the agent’s design decisions too tightly to faithfully measure how code quality shapes future extensions. We introduce SlopCodeBench, a language-agnostic benchmark comprising 20 problems and 93 checkpoints, in which agents repeatedly extend their own prior solutions […]
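
To make the evaluation protocol the abstract describes concrete, here is a minimal sketch of an iterative checkpoint loop in which an agent repeatedly extends its own prior solution. This is an illustration only: the names (Checkpoint, run_agent, evaluate_problem) and the harness shape are assumptions, not SlopCodeBench's actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Checkpoint:
    spec: str                      # incremental requirements revealed at this step
    tests: Callable[[str], bool]   # test suite run against the current solution

def evaluate_problem(checkpoints: list[Checkpoint],
                     run_agent: Callable[[str, str], str]) -> list[bool]:
    """Drive one problem: the agent repeatedly extends its own prior solution.

    `run_agent(spec, prior_solution)` returns an updated solution. Each
    checkpoint's tests only see the code the agent itself produced so far,
    so early design choices constrain every later extension.
    """
    solution = ""            # the agent starts from an empty codebase
    results = []
    for cp in checkpoints:
        solution = run_agent(cp.spec, solution)   # extend, don't restart
        results.append(cp.tests(solution))        # record pass/fail here
    return results
```

Under this kind of setup, per-checkpoint pass rates over a long horizon, rather than a single final test run, are what expose code that "passes the test suite but becomes progressively harder to extend."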