MemeChain: A Multimodal Cross-Chain Dataset for Meme Coin Forensics and Risk Analysis
arXiv:2601.22185v1 Announce Type: new
Abstract: The meme coin ecosystem has grown into one of the most active yet least observable segments of the cryptocurrency market, characterized by extreme churn, minimal project commitment, and widespread fraudulent behavior. While countless meme coins are deployed across multiple blockchains, they rely heavily on off-chain web and social infrastructure to signal legitimacy. These very signals are largely absent from existing datasets, which are often limited to single-chain data or lack the multimodal artifacts required for comprehensive risk modeling.
To address this gap, we introduce MemeChain, a large-scale, open-source, cross-chain dataset comprising 34,988 meme coins across Ethereum, BNB Smart Chain, Solana, and Base. MemeChain integrates on-chain data with off-chain artifacts, including website HTML source code, token logos, and linked social media accounts, enabling multimodal and forensic study of meme coin projects. Analysis of the dataset shows that visual branding is frequently omitted in low-effort deployments, and many projects lack a functional website. Moreover, we quantify the ecosystem’s extreme volatility, identifying 1,801 tokens (5.15%) that cease all trading activity within just 24 hours of launch. By providing unified cross-chain coverage and rich off-chain context, MemeChain serves as a foundational resource for research in financial forensics, multimodal anomaly detection, and automated scam prevention in the meme coin ecosystem.