digitado – Page 22

Can Adversarial Code Comments Fool AI Security Reviewers — Large-Scale Empirical Study of Comment-Based Attacks and Defenses Against LLM Code Analysis

digitado ⋅ 20 February 2026

arXiv:2602.16741v1. Abstract: AI-assisted code review is widely used to detect vulnerabilities before production release. Prior work shows that adversarial prompt manipulation can degrade large language model (LLM) performance in code generation. We test whether similar comment-based manipulation misleads LLMs during vulnerability detection. We build a 100-sample benchmark across Python, JavaScript, and Java, each paired with eight comment variants ranging from no comments to adversarial strategies such as authority spoofing and technical deception. Eight frontier models, […]


technocracy

Choice-Model-Assisted Q-learning for Delayed-Feedback Revenue Management

digitado ⋅ 2 February 2026

We study reinforcement learning for revenue management with delayed feedback, where a substantial fraction of value is determined by customer cancellations and modifications observed days after booking. We propose choice-model-assisted RL: a calibrated discrete choice model is used as a fixed partial world model to impute the delayed component of the learning target at decision time. In the fixed-model deployment regime, we prove that tabular Q-learning with model-imputed targets converges to an $O(\varepsilon/(1-\gamma))$ neighborhood of the optimal Q-function, […]


technocracy

Scaling Strategy, Not Compute: A Stand-Alone, Open-Source StarCraft II Benchmark for Accessible Reinforcement Learning Research

digitado ⋅ 10 March 2026

arXiv:2603.06608v1. Abstract: The research community lacks a middle ground between StarCraft II's full game and its mini-games. The full game's sprawling state-action space renders reward signals sparse and noisy, but in mini-games simple agents saturate performance. This complexity gap hinders steady curriculum design and prevents researchers from experimenting with modern Reinforcement Learning algorithms in RTS environments under realistic compute budgets. To fill this gap, we present the Two-Bridge Map Suite, the first entry in an open-source […]


technocracy

Asymptotic Behavior of Multi-Task Learning: Implicit Regularization and Double Descent Effects

digitado ⋅ 5 March 2026

Multi-task learning seeks to improve the generalization error by leveraging the common information shared by multiple related tasks. One challenge in multi-task learning is identifying formulations capable of uncovering the common information shared between different but related tasks. This paper provides a precise asymptotic analysis of a popular multi-task formulation associated with misspecified perceptron learning models. The main contribution of this paper is to precisely determine the reasons behind the benefits gained from combining multiple related tasks. Specifically, […]


technocracy

Opportunistic Scheduling for Optimal Spot Instance Savings in the Cloud

digitado ⋅ 21 January 2026

arXiv:2601.12266v1. Abstract: We study the problem of scheduling delay-sensitive jobs over spot and on-demand cloud instances to minimize average cost while meeting an average delay constraint. Jobs arrive as a general stochastic process, and incur different costs based on the instance type. This work provides the first analytical treatment of this problem using tools from queuing theory, stochastic processes, and optimization. We derive cost expressions for general policies, prove queue length one is optimal for […]


technocracy

Leveraging Generative AI for Enhancing Domain-Driven Software Design

digitado ⋅ 30 January 2026

arXiv:2601.20909v1. Abstract: Domain-Driven Design (DDD) is a key framework for developing customer-oriented software, focusing on the precise modeling of an application’s domain. Traditionally, metamodels that describe these domains are created manually by system designers, forming the basis for iterative software development. This paper explores the partial automation of metamodel generation using generative AI, particularly for producing domain-specific JSON objects. By training a model on real-world DDD project data, we demonstrate that generative AI can produce […]


technocracy

Conversational Context Classification: A Representation Engineering Approach

digitado ⋅ 21 January 2026

arXiv:2601.12286v1. Abstract: The increasing prevalence of Large Language Models (LLMs) demands effective safeguards for their operation, particularly concerning their tendency to generate out-of-context responses. A key challenge is accurately detecting when LLMs stray from expected conversational norms, manifesting as topic shifts, factual inaccuracies, or outright hallucinations. Traditional anomaly detection is difficult to apply directly to contextual semantics. This paper outlines our experiment in exploring the use of Representation Engineering (RepE) and One-Class Support Vector Machine (OCSVM) […]


technocracy

Sun Finance automates ID extraction and fraud detection with generative AI on AWS

digitado ⋅ 30 April 2026

This post was co-authored with Krišjānis Kočāns, Kaspars Magaznieks, and Sergei Kiriasov from Sun Finance Group. If you process identity documents at scale (loan applications, account openings, compliance checks), you’ve likely hit the same wall: traditional optical character recognition (OCR) gets you partway there, but extraction errors still push a large share of applications into manual review queues. Add fraud detection to the mix, and the manual workload compounds. Sun Finance, a Latvian fintech founded in 2017, operates as a technology-first […]


technocracy

Automated Classification of Research Papers Toward Sustainable Development Goals: A Boolean Query-Based Computational Framework

digitado ⋅ 27 January 2026

arXiv:2601.16988v1. Abstract: The rapid expansion of scholarly publications across diverse disciplines has made it increasingly difficult to systematically evaluate how research contributes to the United Nations Sustainable Development Goals (SDGs). Manual domain classification of research articles by experts is impractical given the number of publications: it is expensive in time and may be inconsistent across human reviewers. This paper proposes an automated, rule-based computational model for classifying research papers […]


technocracy

Identity, Cooperation and Framing Effects within Groups of Real and Simulated Humans

digitado ⋅ 26 January 2026

arXiv:2601.16355v1. Abstract: Humans act via a nuanced process that depends on rational deliberation as well as on identity and contextual factors. In this work, we study how large language models (LLMs) can simulate human action in the context of social dilemma games. While prior work has focused on “steering” (weak binding) of chat models to simulate personas, we analyze here how deep binding of base models with extended backstories leads to more faithful replication of […]
