Robust NIRS regression baseline — SNV scatter correction, first-derivative SG smoothing, 12-component PLS.
Status: unvalidated
A robust, general-purpose regression baseline for near-infrared spectra.
Savitzky-Golay first derivative with window_length=11, polyorder=2; sharpens absorption bands and removes slow baseline drift.
Reach for this first when modelling a continuous constituent (protein, moisture, sugar, …) from raw NIRS reflectance. It is fast, well-conditioned, and rarely overfits.
Tune n_components to your dataset; 8–15 is typical. Swap the first-derivative SG for a second-derivative filter when broad overlapping bands dominate.
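To make the two preprocessing steps concrete, here is a minimal pure-Python sketch of SNV and of a first-derivative Savitzky-Golay filter with the settings used here (window_length=11, polyorder=2). It is an illustration, not the nirs4all implementation: the function names `snv` and `savgol_first_derivative` are hypothetical, the sample standard deviation (n - 1) is an assumption, and edge points are dropped rather than interpolated.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    mu = mean(spectrum)
    # Assumption: sample std (n - 1). Some implementations use the population std.
    sigma = stdev(spectrum)
    return [(value - mu) / sigma for value in spectrum]


def savgol_first_derivative(spectrum, window_length=11):
    """First-derivative Savitzky-Golay filter for a quadratic fit (polyorder=2).

    For a symmetric window and polyorder 1 or 2, the derivative weight at
    offset k is k / sum(k**2). Only interior points are returned, so the
    output is window_length - 1 samples shorter than the input.
    """
    half = window_length // 2
    offsets = range(-half, half + 1)
    norm = sum(k * k for k in offsets)  # 110 for window_length=11
    weights = [k / norm for k in offsets]
    return [
        sum(w * spectrum[centre + k] for w, k in zip(weights, offsets))
        for centre in range(half, len(spectrum) - half)
    ]


# Toy spectrum: a quadratic "band" on top of a constant offset.
raw = [0.5 + 0.01 * i * i for i in range(30)]
corrected = snv(raw)
derivative = savgol_first_derivative(corrected)
```

The derivative is expressed per sample index, not per nanometre. A flat spectrum has zero standard deviation and would need a guard before calling `snv`.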
{
  "name": "SNV + Savitzky-Golay + PLS",
  "pipeline": [
    {"class": "nirs4all.operators.transforms.StandardNormalVariate"},
    {"class": "nirs4all.operators.transforms.SavitzkyGolay", "params": {"window_length": 11, "polyorder": 2, "deriv": 1}},
    {"class": "sklearn.preprocessing.MinMaxScaler"},
    {"y_processing": {"class": "sklearn.preprocessing.MinMaxScaler"}},
    {"class": "sklearn.model_selection.ShuffleSplit", "params": {"n_splits": 5, "test_size": 0.25, "random_state": 42}},
    {"model": {"class": "sklearn.cross_decomposition.PLSRegression", "params": {"n_components": 12}}, "name": "PLS-12"}
  ]
}
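The definition above mixes three entry shapes: plain steps with a "class" key, a target transform wrapped in "y_processing", and the estimator wrapped in "model". A small standard-library sketch of how one might walk it is shown here. The helper `describe_step` and the role label "step" are hypothetical names chosen for illustration; they are not part of nirs4all.

```python
import json

PIPELINE_JSON = """
{
  "name": "SNV + Savitzky-Golay + PLS",
  "pipeline": [
    {"class": "nirs4all.operators.transforms.StandardNormalVariate"},
    {"class": "nirs4all.operators.transforms.SavitzkyGolay",
     "params": {"window_length": 11, "polyorder": 2, "deriv": 1}},
    {"class": "sklearn.preprocessing.MinMaxScaler"},
    {"y_processing": {"class": "sklearn.preprocessing.MinMaxScaler"}},
    {"class": "sklearn.model_selection.ShuffleSplit",
     "params": {"n_splits": 5, "test_size": 0.25, "random_state": 42}},
    {"model": {"class": "sklearn.cross_decomposition.PLSRegression",
               "params": {"n_components": 12}},
     "name": "PLS-12"}
  ]
}
"""


def describe_step(step):
    """Return (role, class_path, params) for one pipeline entry."""
    # Wrapped entries keep their class and params one level down.
    for role in ("model", "y_processing"):
        if role in step:
            spec = step[role]
            return role, spec["class"], spec.get("params", {})
    # Plain entries carry "class" (and optionally "params") directly.
    return "step", step["class"], step.get("params", {})


definition = json.loads(PIPELINE_JSON)
steps = [describe_step(entry) for entry in definition["pipeline"]]

for role, class_path, params in steps:
    short_name = class_path.rsplit(".", 1)[-1]
    print(f"{role:<14}{short_name:<24}{params}")
```

Reading the definition this way makes it easy to spot the value to change when tuning, for example the `n_components` entry under "model".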
# Python
import nirs4all_repository as n4r
pipe = n4r.get("snv_savgol_pls")
config = pipe.to_nirs4all()  # ready for nirs4all.run() / predict()

# any language: read the index, fetch + verify
curl https://repository.nirs4all.org/data/index.json
curl https://repository.nirs4all.org/data/pipelines/snv_savgol_pls/pipeline.json
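The same fetch can be done from Python with only the standard library. The page does not say how the "verify" step works, so the sketch below assumes the caller has a SHA-256 hex digest for the file from some trusted source; that assumption, and the helper names `fetch_bytes`, `sha256_matches`, and `load_pipeline`, are illustrative and not part of the repository's documented interface.

```python
import hashlib
import json
from urllib.request import urlopen

BASE_URL = "https://repository.nirs4all.org/data"


def fetch_bytes(url, timeout=10):
    """Download a URL and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def sha256_matches(payload, expected_hex):
    """Compare the SHA-256 of payload with an expected hex digest."""
    return hashlib.sha256(payload).hexdigest() == expected_hex.strip().lower()


def load_pipeline(pipeline_id, expected_sha256=None):
    """Fetch pipeline.json for one pipeline and optionally check its digest.

    Assumption: expected_sha256 comes from a trusted source chosen by the
    caller; where the repository publishes such a digest is not stated here.
    """
    payload = fetch_bytes(f"{BASE_URL}/pipelines/{pipeline_id}/pipeline.json")
    if expected_sha256 is not None and not sha256_matches(payload, expected_sha256):
        raise ValueError(f"checksum mismatch for pipeline {pipeline_id!r}")
    return json.loads(payload)
```

Calling `load_pipeline("snv_savgol_pls")` performs the network request; nothing is downloaded when the module is merely imported.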