technocracy

SemRep: Generative Code Representation Learning with Code Transformations

digitado ⋅ 13 March 2026

Code transformation is a foundational capability in the software development process, where its effectiveness relies on constructing a high-quality code representation to characterize the input code semantics and guide the transformation. Existing approaches treat code transformation as an end-to-end learning task, leaving the construction of the representation needed for semantic reasoning implicit in model weights or relying on rigid compiler-level abstractions. We present SemRep, a framework that improves code transformation through generative code representation learning. Our key insight […]

Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem

digitado ⋅ 5 February 2026

arXiv:2602.03969v1. The emergence of large language models (LLMs) represents a significant technological shift within the scientific ecosystem, particularly within the field of artificial intelligence (AI). This paper examines structural changes in the AI research landscape using a dataset of arXiv preprints (cs.AI) from 2021 through 2025. Given the rapid pace of AI development, the preprint ecosystem has become a critical barometer for real-time scientific shifts, often preceding formal peer-reviewed publication by months or years. […]

Fuel Consumption Prediction: A Comparative Analysis of Machine Learning Paradigms

digitado ⋅ 22 March 2026

The automotive industry is under growing pressure to reduce its environmental impact, requiring accurate predictive modeling to support sustainable engineering design. This study examines the factors that determine vehicle fuel consumption from the seminal Motor Trend dataset, identifying the governing physical factors of efficiency through rigorous quantitative analysis. Methodologically, the research uses data sanitization, statistical outlier elimination, and in-depth Exploratory Data Analysis (EDA) to curb the occurrence of multicollinearity between powertrain features. A comparative analysis of machine learning […]

The Context Advantage: How Palantir AIP Operates the Modern Enterprise

digitado ⋅ 15 January 2026

Author(s): Sainath Palla. Originally published on Towards AI. Over the last couple of years, most conversations about AI have focused on model size, speed, or how many parameters a system can fit into memory. These are useful metrics, but they do not explain why some organisations see operational results while others remain stuck in experimentation. The difference is not the model. The difference is context. It is similar to how we once compared phones by processor speed. Faster […]

Convergence Rate of a Functional Learning Method for Contextual Stochastic Optimization

digitado ⋅ 13 March 2026

We consider a stochastic optimization problem involving two random variables: a context variable $X$ and a dependent variable $Y$. The objective is to minimize the expected value of a nonlinear loss functional applied to the conditional expectation $\mathbb{E}[f(X, Y, \beta) \mid X]$, where $f$ is a nonlinear function and $\beta$ represents the decision variables. We focus on the practically important setting in which direct sampling from the conditional distribution of $Y \mid X$ is infeasible, and only a stream […]

Rethinking Multimodal Fusion for Time Series: Auxiliary Modalities Need Constrained Fusion

digitado ⋅ 25 March 2026

arXiv:2603.22372v1. Recent advances in multimodal learning have motivated the integration of auxiliary modalities such as text or vision into time series (TS) forecasting. However, most existing methods provide limited gains, often improving performance only in specific datasets or relying on architecture-specific designs that limit generalization. In this paper, we show that multimodal models with naive fusion strategies (e.g., simple addition or concatenation) often underperform unimodal TS models, which we attribute to the uncontrolled integration […]

Edges Are All You Need: Robust Gait Recognition via Label-Free Structure

digitado ⋅ 9 March 2026

arXiv:2603.05537v1. Gait recognition is a non-intrusive biometric technique for security applications, yet existing studies are dominated by silhouette- and parsing-based representations. Silhouettes are sparse and miss internal structural details, limiting discriminability. Parsing enriches silhouettes with part-level structures, but relies heavily on upstream human parsers (e.g., label granularity and boundary precision), leading to unstable performance across datasets and sometimes even inferior results to silhouettes. We revisit gait representations from a structural perspective and describe a […]

VDLM: Variable Diffusion LMs via Robust Latent-to-Text Rendering

digitado ⋅ 19 February 2026

arXiv:2602.15870v1. Autoregressive language models decode left-to-right with irreversible commitments, limiting revision during multi-step reasoning. We propose VDLM, a modular variable diffusion language model that separates semantic planning from text rendering. VDLM applies LLaDA-style masked diffusion over semantic variable embeddings to enable iterative refinement in latent space, then post-trains the planner with trajectory-aware optimization using embedding-space rewards and values, avoiding text decoding inside the RL loop. To convert planned embeddings back to text, we use […]

LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models

digitado ⋅ 22 January 2026

arXiv:2601.14330v1. Concept erasure aims to suppress sensitive content in diffusion models, but recent studies show that erased concepts can still be reawakened, revealing vulnerabilities in erasure methods. Existing reawakening methods mainly rely on prompt-level optimization to manipulate sampling trajectories, neglecting other generative factors, which limits a comprehensive understanding of the underlying dynamics. In this paper, we model the generation process as an implicit function to enable a comprehensive theoretical analysis of multiple factors, including […]

RobuMTL: Enhancing Multi-Task Learning Robustness Against Weather Conditions

digitado ⋅ 19 January 2026

arXiv:2601.10921v1. Robust Multi-Task Learning (MTL) is crucial for autonomous systems operating in real-world environments, where adverse weather conditions can severely degrade model performance and reliability. In this paper, we introduce RobuMTL, a novel architecture designed to adaptively address visual degradation by dynamically selecting task-specific hierarchical Low-Rank Adaptation (LoRA) modules and a LoRA expert squad based on input perturbations in a mixture-of-experts fashion. Our framework enables adaptive specialization based on input characteristics, improving robustness across […]
