PDF: PUF-based DNN Fingerprinting for Knowledge Distillation Traceability
arXiv:2602.23587v1 Announce Type: new
Abstract: Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms while suffering minimal performance degradation. However, this paradigm could lead to various security risks, especially model theft. Existing defenses against model theft, such as watermarking and secure enclaves, focus primarily on identity authentication and incur significant resource costs. Aiming to provide post-theft accountability and traceability, we propose a novel fingerprinting framework that superimposes device-specific Physical Unclonable Function (PUF) signatures onto teacher logits during distillation. Compared with watermarking or secure enclaves, our approach is lightweight, requires no architectural changes, and enables traceability of any leaked or cloned model. Since the signatures are based on PUFs, this framework is robust against reverse engineering and tampering attacks. In this framework, the signature recovery process consists of two stages: first a neural network-based decoder and then a Hamming distance decoder. Furthermore, we also propose a bit compression scheme to support a large number of devices. Experiment results demonstrate that our framework achieves high key recovery rate and negligible accuracy loss while allowing a tunable trade-off between these two key metrics. These results show that the proposed framework is a practical and robust solution for protecting distilled models.