[R] External validation keeps killing my ML models (lab-generated vs external lab data) — looking for academic collaborators
Hey folks,
I’m working on an ML/DL project involving 1D biological signal data (spectral-like signals). I’m running into a problem that I know exists in theory but is brutal in practice — external validation collapse.
Here’s the situation:
- When I train/test within the same dataset (80/20 split, k-fold CV), performance is consistently strong
- PCA + LDA → good separation
- Classical ML → solid metrics
- DL → also performs well
- The moment I test on truly external data, performance drops hard.
Important detail:
- Training data was generated by one operator in the lab
- External data was generated independently by another operator (same lab, different batch conditions)
- Signals are biologically present, but clearly distribution-shifted
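For concreteness, the evaluation that actually matches this setting is group-aware CV, where folds are split by operator/batch instead of at random, so the internal numbers resemble the external test rather than flattering it. A minimal sketch with scikit-learn; `X`, `y`, and `groups` here are synthetic stand-ins, and a binary task is assumed for the AUC scoring. With only two operators it reduces to train-on-one / test-on-the-other, but it scales as more batches come in:

```python
# Minimal sketch: group-aware CV where each fold holds out one operator/batch,
# so the internal score mimics the external-validation setting instead of a
# random 80/20 split. X, y, groups are synthetic stand-ins for illustration.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))         # stand-in for the 1D signals
y = rng.integers(0, 2, size=200)       # stand-in labels (binary task assumed)
groups = np.repeat([0, 1, 2, 3], 50)   # stand-in operator/batch IDs

pipe = make_pipeline(StandardScaler(), PCA(n_components=10),
                     LinearDiscriminantAnalysis())

# Every sample from one operator/batch is held out together per fold.
scores = cross_val_score(pipe, X, y, groups=groups,
                         cv=LeaveOneGroupOut(), scoring="roc_auc")
print("per-operator AUC:", scores.round(3), "mean:", scores.mean().round(3))
```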
I’ve tried:
- PCA, LDA, multiple ML algorithms
- Threshold tuning (Youden’s J, recalibration); a minimal sketch follows this list
- Converting 1D signals into 2D representations (e.g., spider/radar RGB plots) inspired by recent papers; also sketched below
- DL pipelines on these transformed inputs
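For reference, the Youden’s J step above is just: pick the threshold that maximizes TPR − FPR on a held-out calibration split, then reuse it on the external set. A minimal sketch with placeholder arrays and sklearn’s roc_curve:

```python
# Minimal sketch of Youden's J threshold tuning on a calibration split.
# val_labels/val_probs and ext_probs are synthetic stand-ins for illustration.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
val_labels = rng.integers(0, 2, size=300)
val_probs = np.clip(val_labels * 0.3 + rng.normal(0.4, 0.2, size=300), 0, 1)
ext_probs = rng.uniform(size=100)      # stand-in for external-set scores

fpr, tpr, thresholds = roc_curve(val_labels, val_probs)
best_threshold = thresholds[np.argmax(tpr - fpr)]   # J = TPR - FPR

# Reuse the tuned threshold on the external predictions.
ext_preds = (ext_probs >= best_threshold).astype(int)
print(f"Youden-optimal threshold: {best_threshold:.3f}")
```

The catch, of course, is that the threshold is still chosen on internal data, so if the score distribution shifts between operators, the operating point shifts with it.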
Nothing generalizes the way internal CV suggests it should.
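And for anyone curious what I mean by the spider/radar conversion: roughly, wrap the 1D signal around a polar axis and rasterize it to an RGB array for the DL pipeline. A rough sketch assuming matplotlib’s Agg backend; the published variants differ in colouring, resolution, and normalization:

```python
# Rough sketch of the 1D -> 2D radar/spider transform: wrap the signal around
# a polar axis and rasterize the figure to an RGB array a CNN can consume.
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

def signal_to_radar_rgb(signal, size_px=224):
    theta = np.linspace(0, 2 * np.pi, len(signal), endpoint=False)
    fig = plt.figure(figsize=(size_px / 100, size_px / 100), dpi=100)
    ax = fig.add_subplot(111, projection="polar")
    ax.fill(theta, signal, alpha=0.7)
    ax.set_axis_off()
    fig.canvas.draw()
    rgb = np.asarray(fig.canvas.buffer_rgba())[..., :3].copy()
    plt.close(fig)
    return rgb  # (size_px, size_px, 3) uint8 image

demo = signal_to_radar_rgb(np.abs(np.sin(np.linspace(0, 6 * np.pi, 180))))
print(demo.shape, demo.dtype)
```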
What’s frustrating (and validating?) is that most published papers don’t evaluate on truly external datasets, which now makes complete sense to me.
I’m not looking for a magic hack — I’m interested in:
- Proper ways to handle domain shift / batch effects (one baseline sketched after this list)
- Honest modeling strategies for external generalization
- Whether this should be framed as a methodological limitation rather than a “failed model”
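On the domain shift / batch effects point, one standard baseline I’d want to sanity-check with someone is CORAL-style covariance alignment (Sun et al., 2016): re-colour the training features so their second-order statistics match the external batch before fitting anything. A rough sketch, assuming unlabeled external signals are available at fit time; `X_train` / `X_ext` below are synthetic stand-ins:

```python
# Rough sketch of CORAL (correlation alignment) as a batch-effect baseline:
# whiten the training features with their own covariance, then re-colour them
# with the covariance of the (unlabeled) external batch before fitting a model.
import numpy as np

def _sym_matrix_power(m, power, eps=1e-5):
    # Matrix power of a symmetric PSD matrix via eigendecomposition (regularized).
    vals, vecs = np.linalg.eigh(m + eps * np.eye(m.shape[0]))
    return vecs @ np.diag(vals ** power) @ vecs.T

def coral_align(X_src, X_tgt):
    c_src = np.cov(X_src, rowvar=False)
    c_tgt = np.cov(X_tgt, rowvar=False)
    return X_src @ _sym_matrix_power(c_src, -0.5) @ _sym_matrix_power(c_tgt, 0.5)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 60))                    # stand-in internal signals
X_ext = rng.normal(loc=0.5, scale=2.0, size=(80, 60))   # stand-in shifted batch
X_train_aligned = coral_align(X_train, X_ext)
# ...then fit the usual PCA/LDA (or other) pipeline on X_train_aligned.
```

It only matches second-order statistics, so it won’t fix nonlinear batch effects, and it assumes you’re allowed to see unlabeled external data at training time, which is its own honesty question.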
If you’re an academic / researcher who has dealt with:
- External validation failures
- Batch effects in biological signal data
- Domain adaptation or robust ML
I’d genuinely love to discuss and potentially collaborate. There’s scope for methodological contribution, and I’m open to adding contributors as co-authors if there’s meaningful input.
Happy to share more technical details privately.
Thanks — and yeah, ML is humbling 😅