January 2026

EMoE: Eigenbasis-Guided Routing for Mixture-of-Experts

digitado ⋅ 21 de January de 2026

arXiv:2601.12137v1 Announce Type: new Abstract: The relentless scaling of deep learning models has led to unsustainable computational demands, positioning Mixture-of-Experts (MoE) architectures as a promising path towards greater efficiency. However, MoE models are plagued by two fundamental challenges: 1) a load imbalance problem known as the“rich get richer” phenomenon, where a few experts are over-utilized, and 2) an expert homogeneity problem, where experts learn redundant representations, negating their purpose. Current solutions typically employ an auxiliary load-balancing loss that, […]

Ver mais

Like 0

Liked Liked

technocracy

CoSMeTIC: Zero-Knowledge Computational Sparse Merkle Trees with Inclusion-Exclusion Proofs for Clinical Research

digitado ⋅ 21 de January de 2026

arXiv:2601.12136v1 Announce Type: new Abstract: Analysis of clinical data is a cornerstone of biomedical research with applications in areas such as genomic testing and response characterization of therapeutic drugs. Maintaining strict privacy controls is essential because such data typically contains personally identifiable health information of patients. At the same time, regulatory compliance often requires study managers to demonstrate the integrity and authenticity of participant data used in analyses. Balancing these competing requirements, privacy preservation and verifiable accountability, remains […]

Ver mais

Like 0

Liked Liked

technocracy

Human-Human-AI Triadic Programming: Uncovering the Role of AI Agent and the Value of Human Partner in Collaborative Learning

digitado ⋅ 21 de January de 2026

arXiv:2601.12134v1 Announce Type: new Abstract: As AI assistance becomes embedded in programming practice, researchers have increasingly examined how these systems help learners generate code and work more efficiently. However, these studies often position AI as a replacement for human collaboration and overlook the social and learning-oriented aspects that emerge in collaborative programming. Our work introduces human-human-AI (HHAI) triadic programming, where an AI agent serves as an additional collaborator rather than a substitute for a human partner. Through a […]

Ver mais

Like 0

Liked Liked

technocracy

Bengali Text Classification: An Evaluation of Large Language Model Approaches

digitado ⋅ 21 de January de 2026

arXiv:2601.12132v1 Announce Type: new Abstract: Bengali text classification is a Significant task in natural language processing (NLP), where text is categorized into predefined labels. Unlike English, Bengali faces challenges due to the lack of extensive annotated datasets and pre-trained language models. This study explores the effectiveness of large language models (LLMs) in classifying Bengali newspaper articles. The dataset used, obtained from Kaggle, consists of articles from Prothom Alo, a major Bangladeshi newspaper. Three instruction-tuned LLMs LLaMA 3.1 8B […]

Ver mais

Like 0

Liked Liked

technocracy

SolarGPT-QA: A Domain-Adaptive Large Language Model for Educational Question Answering in Space Weather and Heliophysics

digitado ⋅ 21 de January de 2026

arXiv:2601.12131v1 Announce Type: new Abstract: Solar activity, including solar flares, coronal mass ejections (CMEs), and geomagnetic storms, can significantly impact satellites, aviation, power grids, data centers, and space missions. Extreme solar events can cause substantial economic damage if not predicted in advance, highlighting the importance of accurate forecasting and effective education in space science. Although large language models (LLMs) perform well on general tasks, they often lack domain-specific knowledge and pedagogical capability to clearly explain complex space science […]

Ver mais

Like 0

Liked Liked

technocracy

UniMo: Unified Motion Generation and Understanding with Chain of Thought

digitado ⋅ 21 de January de 2026

arXiv:2601.12126v1 Announce Type: new Abstract: Existing 3D human motion generation and understanding methods often exhibit limited interpretability, restricting effective mutual enhancement between these inherently related tasks. While current unified frameworks based on large language models (LLMs) leverage linguistic priors, they frequently encounter challenges in semantic alignment and task coherence. Moreover, the next-token prediction paradigm in LLMs is ill-suited for motion sequences, causing cumulative prediction errors. To address these limitations, we propose UniMo, a novel framework that integrates motion-language […]

Ver mais

Like 0

Liked Liked

technocracy

SynQP: A Framework and Metrics for Evaluating the Quality and Privacy Risk of Synthetic Data

digitado ⋅ 21 de January de 2026

arXiv:2601.12124v1 Announce Type: new Abstract: The use of synthetic data in health applications raises privacy concerns, yet the lack of open frameworks for privacy evaluations has slowed its adoption. A major challenge is the absence of accessible benchmark datasets for evaluating privacy risks, due to difficulties in acquiring sensitive data. To address this, we introduce SynQP, an open framework for benchmarking privacy in synthetic data generation (SDG) using simulated sensitive data, ensuring that original data remains confidential. We […]

Ver mais

Like 0

Liked Liked

technocracy

Statistical Reinforcement Learning in the Real World: A Survey of Challenges and Future Directions

digitado ⋅ 21 de January de 2026

Reinforcement learning (RL) has achieved remarkable success in real-world decision-making across diverse domains, including gaming, robotics, online advertising, public health, and natural language processing. Despite these advances, a substantial gap remains between RL research and its deployment in many practical settings. Two recurring challenges often underlie this gap. First, many settings offer limited opportunity for the agent to interact extensively with the target environment due to practical constraints. Second, many target environments often undergo substantial changes, requiring redesign […]

Ver mais

Like 0

Liked Liked

technocracy

Gated DeltaNet: The “Surgical Eraser” Solving Linear Attention’s Memory Problem

digitado ⋅ 21 de January de 2026

The “Infinite Floor” vs. The “Whiteboard” Imagine a librarian (the Model) trying to answer questions based on a massive stack of books (the Context) Standard Transformers (Attention) are like a librarian who lays every single page out on an infinite floor. To answer a question, they look at every page simultaneously. It’s perfect, but as the books pile up, the floor runs out of space, and the librarian collapses from exhaustion (Quadratic Complexity O(L²)). RNNs and Mamba (State Space […]

Ver mais

Like 0

Liked Liked

technocracy

Stop Using Self-Joins: How Using GroupBy and Filters Instead Can Save Massive Time and Cost in…

digitado ⋅ 21 de January de 2026

Stop Using Self-Joins: How Using GroupBy and Filters Instead Can Save Massive Time and Cost in PySpark When working with large datasets in PySpark, it’s rather common in a notebook to see a table joined to itself using an inner join. While this approach is straightforward and intuitive, it often comes with a steep price: long runtimes, excessive shuffles, and inflated compute costs. In many real-world scenarios, you can replace a self–inner join with a groupBy + aggregation + filter […]

Ver mais

Like 0

Liked Liked