A Technical Report on the Second Place Solution for the CIKM 2025 AnalytiCup Competition
arXiv:2601.05259v1 Announce Type: new
Abstract: In this work, we address the challenge of multilingual category relevance judgment in e-commerce search, where traditional ensemble-based systems improve accuracy but at the cost of heavy training, inference, and maintenance complexity. To overcome this limitation, we propose a simplified yet effective framework that leverages prompt engineering with Chain-of-Thought task decomposition to guide reasoning within a single large language model. Specifically, our approach decomposes the relevance judgment process into four interpretable subtasks: translation, intent understanding, category matching, and relevance judgment — and fine-tunes a base model (Qwen2.5-14B) using Low-Rank Adaptation (LoRA) for efficient adaptation. This design not only reduces computational and storage overhead but also enhances interpretability by explicitly structuring the model’s reasoning path. Experimental results show that our single-model framework achieves competitive accuracy and high inference efficiency, processing 20 samples per second on a single A100 GPU. In the CIKM 2025 AnalytiCup Competition Proposals, our method achieved 0.8902 on the public leaderboard and 0.8889 on the private leaderboard, validating the effectiveness and robustness of the proposed approach. These results highlight that structured prompting combined with lightweight fine-tuning can outperform complex ensemble systems, offering a new paradigm for scalable industrial AI applications.