digitado

Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos

digitado ⋅ 2 March 2026

arXiv:2602.23543v1. Abstract: We introduce Synthetic Visual Genome 2 (SVG2), a large-scale panoptic video scene graph dataset. SVG2 contains over 636K videos with 6.6M objects, 52.0M attributes, and 6.7M relations, providing an order-of-magnitude increase in scale and diversity over prior spatio-temporal scene graph datasets. To create SVG2, we design a fully automated pipeline that combines multi-scale panoptic segmentation, online-offline trajectory tracking with automatic new-object discovery, per-trajectory semantic parsing, and GPT-5-based spatio-temporal relation inference. Building on this […]

technocracy

B-DENSE: Branching For Dense Ensemble Network Learning

digitado ⋅ 19 February 2026

arXiv:2602.15971v1. Abstract: Inspired by non-equilibrium thermodynamics, diffusion models have achieved state-of-the-art performance in generative modeling. However, their iterative sampling nature results in high inference latency. While recent distillation techniques accelerate sampling, they discard intermediate trajectory steps. This sparse supervision leads to a loss of structural information and introduces significant discretization errors. To mitigate this, we propose B-DENSE, a novel framework that leverages multi-branch trajectory alignment. We modify the student architecture to output $K$-fold expanded channels, […]

technocracy

EM Algorithm

digitado ⋅ 5 March 2020

Preliminaries: Gaussian distribution, log-likelihood, calculus (partial derivative, Lagrange multiplier).

EM Algorithm for Gaussian Mixture

Analysis: Maximum likelihood cannot be applied directly to the Gaussian mixture model because of the severe defects we came across in ‘Maximum Likelihood of Gaussian Mixtures’. Inspired by K-means, a two-step algorithm was developed. The objective function is the log-likelihood function:

\[
\begin{aligned}
\ln \Pr(\mathbf{x}|\boldsymbol{\pi},\boldsymbol{\mu},\Sigma)&=\ln \left(\prod_{n=1}^{N}\sum_{j=1}^{K}\pi_j\mathcal{N}(\mathbf{x}_n|\boldsymbol{\mu}_j,\Sigma_j)\right)\\
&=\sum_{n=1}^{N}\ln \sum_{j=1}^{K}\pi_j\mathcal{N}(\mathbf{x}_n|\boldsymbol{\mu}_j,\Sigma_j)
\end{aligned}\tag{1}
\]
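As a rough illustration of the objective in equation (1), the sketch below evaluates the Gaussian mixture log-likelihood with NumPy. It is not taken from the original post: the function names `log_gaussian` and `gmm_log_likelihood`, the toy data, and the use of the log-sum-exp trick for numerical stability are all assumptions made for this example.

```python
import numpy as np


def log_gaussian(X, mu, Sigma):
    """Log density ln N(x_n | mu, Sigma) for every row x_n of X (shape N x D)."""
    D = X.shape[1]
    diff = X - mu
    # Cholesky factor Sigma = L L^T gives both the determinant and the
    # Mahalanobis distance without forming an explicit inverse.
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, diff.T)            # shape D x N
    maha = np.sum(z ** 2, axis=0)             # (x - mu)^T Sigma^{-1} (x - mu)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (D * np.log(2.0 * np.pi) + log_det + maha)


def gmm_log_likelihood(X, pi, mus, Sigmas):
    """Equation (1): sum_n ln sum_j pi_j N(x_n | mu_j, Sigma_j)."""
    # log_terms[n, j] = ln pi_j + ln N(x_n | mu_j, Sigma_j)
    log_terms = np.stack(
        [np.log(pi[j]) + log_gaussian(X, mus[j], Sigmas[j]) for j in range(len(pi))],
        axis=1,
    )
    # Log-sum-exp over components j (assumed here for stability; not in the source).
    m = log_terms.max(axis=1, keepdims=True)
    per_point = m[:, 0] + np.log(np.sum(np.exp(log_terms - m), axis=1))
    return float(np.sum(per_point))


# Toy usage with hypothetical parameters: two 2-D components.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))
pi_demo = np.array([0.3, 0.7])
mus_demo = [np.zeros(2), np.ones(2)]
Sigmas_demo = [np.eye(2), 2.0 * np.eye(2)]
print(gmm_log_likelihood(X_demo, pi_demo, mus_demo, Sigmas_demo))
```

The logarithm sits outside the inner sum over components, so the expression does not decompose per component the way a single Gaussian's log-likelihood does; this is one reason an iterative two-step scheme is used instead of a closed-form maximizer.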

technocracy

Robot Photographer (Part IV, Final)

digitado ⋅ 18 September 2013

The story of Luke, an autonomous robot photographer, is officially completed and accepted to ICRA 2014! The source code of the implemented robot photographer can be found at https://github.com/manfredzab/robot-photographer and the short description is below. If you need more technical information, take a look at my master’s thesis, which contains all of the gory theoretical and implementation details, spread out over 140 pages. All in all, developing Luke has been amazingly fun, but now it’s time to move […]

technocracy

DISPO: Enhancing Training Efficiency and Stability in Reinforcement Learning for Large Language Model Mathematical Reasoning

digitado ⋅ 1 February 2026

Reinforcement learning with verifiable rewards has emerged as a promising paradigm for enhancing the reasoning capabilities of large language models, particularly in mathematics. Current approaches in this domain present a clear trade-off: PPO-style methods (e.g., GRPO/DAPO) offer training stability but exhibit slow learning trajectories due to their trust-region constraints on policy updates, while REINFORCE-style approaches (e.g., CISPO) demonstrate improved learning efficiency but suffer from performance instability as they clip importance sampling weights while still permitting non-zero gradients outside […]

technocracy

SciDataCopilot: An Agentic Data Preparation Framework for AGI-driven Scientific Discovery

digitado ⋅ 11 February 2026

arXiv:2602.09132v1. Abstract: The current landscape of AI for Science (AI4S) is predominantly anchored in large-scale textual corpora, where generative AI systems excel at hypothesis generation, literature search, and multi-modal reasoning. However, a critical bottleneck for accelerating closed-loop scientific discovery remains the utilization of raw experimental data. Characterized by extreme heterogeneity, high specificity, and deep domain expertise requirements, raw data possess neither direct semantic alignment with linguistic representations nor structural homogeneity suitable for a unified embedding […]

technocracy

Predicting Multi-Drug Resistance in Bacterial Isolates Through Performance Comparison and LIME-based Interpretation of Classification Models

digitado ⋅ 27 February 2026

arXiv:2602.22400v1. Abstract: The rise of Antimicrobial Resistance, particularly Multi-Drug Resistance (MDR), presents a critical challenge for clinical decision-making due to limited treatment options and delays in conventional susceptibility testing. This study proposes an interpretable machine learning framework to predict MDR in bacterial isolates using clinical features and antibiotic susceptibility patterns. Five classification models were evaluated, including Logistic Regression, Random Forest, AdaBoost, XGBoost, and LightGBM. The models were trained on a curated dataset of 9,714 isolates, […]

technocracy

AstRL: Analog and Mixed-Signal Circuit Synthesis with Deep Reinforcement Learning

digitado ⋅ 12 February 2026

Analog and mixed-signal (AMS) integrated circuits (ICs) lie at the core of modern computing and communications systems. However, despite the continued rise in design complexity, advances in AMS automation remain limited. This reflects the central challenge in developing a generalized optimization method applicable across diverse circuit design spaces, many of which are distinct, constrained, and non-differentiable. To address this, our work casts circuit design as a graph generation problem and introduces a novel method of AMS synthesis driven […]

technocracy

PiC-BNN: A 128-kbit 65 nm Processing-in-CAM-Based End-to-End Binary Neural Network Accelerator

digitado ⋅ 29 January 2026

arXiv:2601.19920v1. Abstract: Binary Neural Networks (BNNs), where weights and activations are constrained to binary values (+1, -1), are a highly efficient alternative to traditional neural networks. Unfortunately, typical BNNs, while binarizing linear layers (matrix-vector multiplication), still implement other network layers (batch normalization, softmax, output layer, and sometimes the input layer of a convolutional neural network) in full precision. This limits the area and energy benefits and requires architectural support for full precision operations. We propose […]

technocracy

Introduction to Generative AI

digitado ⋅ 18 January 2023

Generative artificial intelligence saw an incredible surge in popularity in 2022. Big Think has called it ‘the technology of the year’, and judging from the amount of attention and VC support generative AI startups have been gaining this year, this claim is more than justified. Moreover, tech experts say that in the next few years the development of generative AI will not slow down but will accelerate rapidly, expanding into ever more fields. In this […]
