Hypergraph Partitioning for Bibliometric Term Blocking: An Application of the KaHyPar Framework to IEEE Xplore Data
Graph-based approaches, which dominate bibliometric analysis, capture only pairwise relationships. Relationships that involve several entities at once, such as co-authorship, citation lists, or keyword sets, are therefore artificially simplified. Hypergraphs model such multi-way relationships directly, which improves the accuracy of the analysis. This study applies the KaHyPar framework to partition IEEE Xplore terms into blocks, with sets of terms treated as hyperedges. It is the first in a series of articles on hypergraph-based term blocking techniques. The bibliometric data cover 2021 to 2025 and were exported from the IEEE Xplore database using the search terms “Artificial Intelligence,” “Blockchain,” “Data Science,” “Deep Learning,” “Image Processing,” “Internet of Things,” “Anomaly Detection,” and “Machine Learning.” After removing duplicates and excluding 16 records with empty “IEEE Terms” fields, 38,157 records remained for analysis. Partitioning IEEE Terms records with KaHyPar is effective, but it requires rigorous preparation of the data in the .hgr format. This complexity helps explain why hypergraphs remain underutilized in scientometrics compared with more accessible tools such as VOSviewer. The proposed significance criterion, based on a term’s occurrence frequency within the hyperedges associated with a block, yielded easily interpretable results. Future studies should investigate the impact of KaHyPar parameters and evaluate alternative frameworks such as HYPE and Mt-KaHyPar, as well as other metrics of term significance.
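To make the data preparation step concrete, the following Python sketch shows one way the “IEEE Terms” field of each record could be turned into a hyperedge and serialized in the hMetis-style .hgr layout that KaHyPar reads: a header line with the number of hyperedges and the number of vertices, followed by one line of 1-based vertex ids per hyperedge. It is an illustration under stated assumptions, not the pipeline used in the study. The function names, the “;” separator, and the rule that links a hyperedge to the block holding most of its vertices are all hypothetical choices made for this example.

```python
from collections import Counter


def build_hypergraph(term_fields, sep=";"):
    """Turn each record's term string into a hyperedge of 1-based vertex ids.

    Assumption: terms inside one field are separated by `sep`.
    """
    vertex_id = {}   # term -> 1-based vertex id, assigned on first sight
    hyperedges = []  # one sorted list of vertex ids per non-empty record
    for field in term_fields:
        terms = sorted({t.strip() for t in field.split(sep) if t.strip()})
        if not terms:
            continue  # records with an empty term field are excluded
        edge = sorted(vertex_id.setdefault(t, len(vertex_id) + 1) for t in terms)
        hyperedges.append(edge)
    return vertex_id, hyperedges


def to_hgr(num_vertices, hyperedges):
    """Serialize an unweighted hypergraph in the hMetis-style .hgr layout."""
    lines = [f"{len(hyperedges)} {num_vertices}"]
    lines += [" ".join(map(str, edge)) for edge in hyperedges]
    return "\n".join(lines) + "\n"


def term_frequency_by_block(hyperedges, block_of):
    """Count, per block, how often each of its vertices occurs in hyperedges.

    `block_of` maps a vertex id to a block id (for example, read from a
    partitioner's output). Assumption for this sketch: a hyperedge is
    associated with the block that holds most of its vertices.
    """
    freq = {}
    for edge in hyperedges:
        majority = Counter(block_of[v] for v in edge).most_common(1)[0][0]
        counts = freq.setdefault(majority, Counter())
        counts.update(v for v in edge if block_of[v] == majority)
    return freq


if __name__ == "__main__":
    # Toy records standing in for exported "IEEE Terms" fields.
    records = [
        "Deep learning;Training;Feature extraction",
        "Blockchains;Security",
        "",
        "Training;Deep learning",
    ]
    vertex_id, hyperedges = build_hypergraph(records)
    print(to_hgr(len(vertex_id), hyperedges), end="")

    # Hypothetical two-block partition of the five vertices.
    block_of = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
    print(term_frequency_by_block(hyperedges, block_of))
```

For the toy records, the header line of the generated file is `3 5`: three hyperedges, because the empty record is dropped, and five distinct terms. Ranking the per-block counters by frequency gives a simple reading of which terms characterize a block, in the spirit of the significance criterion described above.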