digitado – Page 541

Response-Based Knowledge Distillation for Multilingual Jailbreak Prevention Unwittingly Compromises Safety

digitado ⋅ 13 February 2026

arXiv:2602.11157v1. Abstract: Large language models (LLMs) are increasingly deployed worldwide, yet their safety alignment remains predominantly English-centric. This leaves vulnerabilities in non-English contexts, especially in low-resource languages. We introduce a novel application of knowledge distillation (KD) to multilingual jailbreak prevention and examine its efficacy. We distill the refusal behaviors of a proprietary teacher model (OpenAI o1-mini) with Low-Rank Adaptation (LoRA) into three open-source student models: Meta-Llama-3-8B-Instruct, Gemma-2-2B-IT, and Qwen3-8B, using ~28,000 multilingual […]

technocracy

Supervisor, not overseer

digitado ⋅ 12 February 2026

In my post about my Showboat project I used the term “overseer” to refer to the person who manages a coding agent. It turns out that’s a term tied to slavery and plantation management. So that’s gross! I’ve edited that post to use “supervisor” instead, and I’ll be using that going forward. Tags: language

technocracy

Self-Supervised Learning for Speaker Recognition: A study and review

digitado ⋅ 11 February 2026

Deep learning models trained in a supervised setting have revolutionized audio and speech processing. However, their performance inherently depends on the quantity of human-annotated data, making them costly to scale and prone to poor generalization under unseen conditions. To address these challenges, Self-Supervised Learning (SSL) has emerged as a promising paradigm, leveraging vast amounts of unlabeled data to learn relevant representations. The application of SSL for Automatic Speech Recognition (ASR) has been extensively studied, but research on other […]

technocracy

Demystifying Low-Rank Knowledge Distillation in Large Language Models: Convergence, Generalization, and Information-Theoretic Guarantees

digitado ⋅ 25 March 2026

arXiv:2603.22355v1. Abstract: Knowledge distillation has emerged as a powerful technique for compressing large language models (LLMs) into efficient, deployable architectures while preserving their advanced capabilities. Recent advances in low-rank knowledge distillation, particularly methods like Low-Rank Clone (LRC), have demonstrated remarkable empirical success, achieving comparable performance to full-parameter distillation with significantly reduced training data and computational overhead. However, the theoretical foundations underlying these methods remain poorly understood. In this paper, we establish a rigorous theoretical framework […]

technocracy

Precision Switching Schedule for Efficient Control Implementations

digitado ⋅ 3 March 2026

arXiv:2603.00616v1. Abstract: Modern cyber-physical systems, such as automotive control, rely on feedback controllers that regulate the system towards a desired setpoint. In practice, however, the controller must also be scheduled efficiently on resource-constrained processors, where the choice of numerical precision for the controller implementation directly affects both control quality and computational cost. This trade-off is critical: higher precision improves control performance but increases runtime, while lower precision executes faster on the processor but may degrade overall […]

technocracy

Diffusion Sequence Models for Generative In-Context Meta-Learning of Robot Dynamics

digitado ⋅ 15 April 2026

Accurate modeling of robot dynamics is essential for model-based control, yet remains challenging under distributional shifts and real-time constraints. In this work, we formulate system identification as an in-context meta-learning problem and compare deterministic and generative sequence models for forward dynamics prediction. We take a Transformer-based meta-model as a strong deterministic baseline and introduce two complementary diffusion-based approaches to this setting: (i) inpainting diffusion (Diffuser), which learns the joint input-observation distribution, and (ii) conditioned diffusion models (CNN and […]

technocracy

datasette-referrer-policy 0.1

digitado ⋅ 6 May 2026

Release: datasette-referrer-policy 0.1. The OpenStreetMap tiles on the Datasette global-power-plants demo weren’t displaying correctly. This turned out to be caused by two bugs. The first was that the CAPTCHA I added to that site a few weeks ago was triggering for the .json fetch requests used by the map plugin, and since those weren’t HTML the user was never asked to solve it. Here’s the fix. The second was that OpenStreetMap quite reasonably block tile requests from […]

technocracy

TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning

digitado ⋅ 8 January 2026

Travel planning is a sophisticated decision-making process that requires synthesizing multifaceted information to construct itineraries. However, existing travel planning approaches face several challenges: (1) pruning candidate points of interest (POIs) while maintaining a high recall rate is difficult; (2) a single reasoning path restricts exploration of the feasible solution space; (3) simultaneously optimizing hard and soft constraints remains a significant difficulty. To address these challenges, we propose TourPlanner, a comprehensive framework featuring multi-path reasoning […]

technocracy

From Imitation to Discrimination: Progressive Curriculum Learning for Robust Web Navigation

digitado ⋅ 14 April 2026

Text-based web agents offer computational efficiency for autonomous web navigation, yet developing robust agents remains challenging due to the noisy and heterogeneous nature of real-world HTML. Standard Supervised Fine-Tuning (SFT) approaches fail in two critical dimensions: they lack discrimination capabilities to reject plausible but incorrect elements in densely populated pages, and exhibit limited generalization to unseen website layouts. To address these challenges, we introduce the Triton dataset (590k instances) and a progressive training curriculum. Triton is constructed via […]

technocracy

Small Symbols, Big Risks: Exploring Emoticon Semantic Confusion in Large Language Models

digitado ⋅ 14 January 2026

arXiv:2601.07885v1. Abstract: Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for Large Language Models (LLMs) remain largely unexplored. In this paper, we identify emoticon semantic confusion, a vulnerability where LLMs misinterpret ASCII-based emoticons to perform unintended and even destructive actions. To systematically study this phenomenon, we develop an automated data generation pipeline and construct a dataset containing 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and […]
