[P] Using SHAP to explain Unsupervised Anomaly Detection on PCA-anonymized data (Credit Card Fraud). Is this a valid approach for a thesis?
Hello everyone,
I’m currently working on a project for my BSc dissertation focused on XAI for Fraud Detection. I have some concerns about my dataset and I am looking for thoughts from the community.
I’m using the Kaggle Credit Card Fraud dataset where 28 of the features (V1-V28) are the result of a PCA transformation.
I am taking an unsupervised approach: I train a Stacked Autoencoder and flag a transaction as fraud when its reconstruction error is high.
I am using SHAP to explain why the Autoencoder flags a specific transaction. Specifically, I’ve written a custom function so that SHAP explains the model’s reconstruction error (Mean Squared Error) rather than its raw outputs.
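For illustration, here is a minimal sketch of the idea of attributing a reconstruction error to input features. Everything in it is an assumption rather than the actual thesis code: the trained Stacked Autoencoder is replaced by a toy linear projection, the data is synthetic with 4 features instead of V1-V28, and the attributions are exact Shapley values computed by brute-force enumeration against a mean baseline instead of calling the SHAP library. The names `reconstruction_error` and `exact_shapley` are hypothetical.

```python
import itertools
import math

import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained autoencoder (assumption): a linear
# "autoencoder" that keeps only the top-2 principal directions.
X_train = rng.normal(size=(500, 4)) @ np.diag([3.0, 2.0, 0.5, 0.2])
mean = X_train.mean(axis=0)
_, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
components = vt[:2]  # shape (2, 4)


def reconstruct(X):
    """Encode to 2 dimensions, then decode back to the input space."""
    Z = (X - mean) @ components.T
    return Z @ components + mean


def reconstruction_error(X):
    """Scalar output to explain: per-row MSE between input and reconstruction."""
    X = np.atleast_2d(X)
    return np.mean((X - reconstruct(X)) ** 2, axis=1)


def exact_shapley(f, x, baseline):
    """Exact Shapley values of f at x; absent features take baseline values.

    Enumerates every feature subset, so it is only feasible for a few features.
    """
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = (
                math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                idx = np.array(subset, dtype=int)
                without_i = baseline.copy()
                without_i[idx] = x[idx]
                with_i = without_i.copy()
                with_i[i] = x[i]
                # Marginal contribution of feature i given this subset.
                phi[i] += weight * (f(with_i)[0] - f(without_i)[0])
    return phi


# A made-up transaction with an unusual value in the third feature.
x_flagged = np.array([0.5, -0.3, 4.0, 0.1])
baseline = mean.copy()
phi = exact_shapley(reconstruction_error, x_flagged, baseline)
print("per-feature attribution of the reconstruction error:", phi)
```

The attributions sum to the difference between the flagged transaction's error and the baseline's error, which is what allows a statement like "most of this error comes from V14 and V17". With a real model, the same kind of scalar wrapper could be handed to `shap.KernelExplainer` together with a background sample, since brute-force enumeration does not scale to 30 features.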
My concern is that, since the features are PCA-transformed, I can’t say, for example, “the model flagged this because of the location”. I can only say “the model flagged this because of a signature in V14 and V17”.
I would love to hear your thoughts on whether this “abstract interpretability” is a legitimate contribution, or whether the PCA transformation makes the XAI side of the project useless.
submitted by /u/LeaveTrue7987