Autoencoder-Enhanced Hierarchical Mondrian Anonymization via Latent Representations

digitado ⋅ February 3, 2026

Releasing structured microdata requires balancing utility against privacy under group-based disclosure risks. We propose AE-LRHMA, a hybrid anonymization framework that performs Mondrian-style hierarchical partitioning in an autoencoder-learned latent space and integrates local (k,e)-microaggregation. To explicitly control sensitive-value concentration and diversity within each equivalence class, we introduce a tunable constraint set consisting of k, a maximum sensitive proportion threshold, and an optional sensitive-entropy threshold. When enabled, the entropy threshold acts as a hard gate; otherwise, entropy enters split scoring as a soft term. The anonymized output is generated via standard interval/set generalization in the original space. Experiments on the Adult and Bank Marketing datasets show that AE-LRHMA yields lower information loss and more stable group structures than representative baselines under comparable settings. We further report linkage-attack-oriented risk metrics to empirically characterize relative disclosure trends, without claiming formal guarantees such as differential privacy.
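The constraint set described in the abstract can be illustrated with a minimal sketch of a per-class admissibility check. This is not the authors' implementation. The function names, the use of base-2 Shannon entropy, and the reading of the entropy threshold as a lower bound are all assumptions made for illustration. Only the hard-gate use of entropy is shown, and the soft scoring term is omitted.

```python
import math
from collections import Counter


def sensitive_entropy(values):
    """Shannon entropy (bits) of the sensitive values in one class (assumed definition)."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def max_sensitive_proportion(values):
    """Share of the most frequent sensitive value in one class."""
    counts = Counter(values)
    return max(counts.values()) / len(values)


def satisfies_constraints(values, k, max_prop, min_entropy=None):
    """Hypothetical hard-gate check for one candidate equivalence class.

    values      -- sensitive attribute values of the records in the class
    k           -- minimum class size
    max_prop    -- maximum allowed proportion of any single sensitive value
    min_entropy -- optional lower bound on sensitive entropy (None disables it)
    """
    if len(values) < k:
        return False
    if max_sensitive_proportion(values) > max_prop:
        return False
    if min_entropy is not None and sensitive_entropy(values) < min_entropy:
        return False
    return True
```

In a Mondrian-style partitioner, a check of this kind would typically be applied to both halves of a candidate split, and the split rejected if either half fails.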
