
SNV · Savitzky–Golay · PLS

Robust NIRS regression baseline — SNV scatter correction, first-derivative SG smoothing, 12-component PLS.

unvalidated

Overview

A robust, general-purpose regression baseline for near-infrared spectra.

What it does

  1. Standard Normal Variate (SNV): removes multiplicative scatter and additive offset from each spectrum.
  2. Savitzky–Golay (1st derivative), window_length=11, polyorder=2: sharpens absorption bands and removes slow baseline drift.
  3. Min–max scaling of features and target.
  4. PLS regression with 12 latent components, selected by 5-split shuffle cross-validation.

When to use it

Reach for this first when modelling a continuous constituent (protein, moisture, sugar, …) from raw NIRS reflectance. It is fast, well-conditioned, and rarely overfits.

Notes

Tune n_components to your dataset; 8–15 is typical. Swap the first-derivative SG for a second-derivative filter when broad overlapping bands dominate.
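
Both adjustments can be expressed as small edits to the recipe itself. The sketch below uses only the standard library and the recipe shown on this page; `make_variant` is a hypothetical helper, and renaming the model step to `PLS-<k>` merely follows the `PLS-12` naming seen in the recipe.

```python
import copy
import json

# The recipe from this page, embedded so the example is self-contained.
RECIPE = json.loads("""
{
  "name": "SNV + Savitzky-Golay + PLS",
  "pipeline": [
    {"class": "nirs4all.operators.transforms.StandardNormalVariate"},
    {"class": "nirs4all.operators.transforms.SavitzkyGolay",
     "params": {"window_length": 11, "polyorder": 2, "deriv": 1}},
    {"class": "sklearn.preprocessing.MinMaxScaler"},
    {"y_processing": {"class": "sklearn.preprocessing.MinMaxScaler"}},
    {"class": "sklearn.model_selection.ShuffleSplit",
     "params": {"n_splits": 5, "test_size": 0.25, "random_state": 42}},
    {"model": {"class": "sklearn.cross_decomposition.PLSRegression",
               "params": {"n_components": 12}},
     "name": "PLS-12"}
  ]
}
""")


def make_variant(recipe, n_components=None, deriv=None):
    """Return a modified copy of the recipe; the original is left untouched."""
    variant = copy.deepcopy(recipe)
    for step in variant["pipeline"]:
        # Swap the derivative order on the Savitzky-Golay step.
        if deriv is not None and step.get("class", "").endswith("SavitzkyGolay"):
            step.setdefault("params", {})["deriv"] = deriv
        # Change the number of latent components on the model step.
        if n_components is not None and "model" in step:
            step["model"].setdefault("params", {})["n_components"] = n_components
            step["name"] = f"PLS-{n_components}"
    return variant


# One candidate recipe per component count in the typical 8-15 range.
variants = [make_variant(RECIPE, n_components=k) for k in range(8, 16)]

# Second-derivative variant for spectra dominated by broad overlapping bands.
second_deriv = make_variant(RECIPE, deriv=2)
print(json.dumps(second_deriv["pipeline"][1]))
```

Each variant can then be evaluated with the same splitter and the best-scoring one kept; a second derivative with `polyorder=2` is the highest derivative order that polynomial degree allows.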

Recipe

{
  "name": "SNV + Savitzky-Golay + PLS",
  "pipeline": [
    {"class": "nirs4all.operators.transforms.StandardNormalVariate"},
    {"class": "nirs4all.operators.transforms.SavitzkyGolay", "params": {"window_length": 11, "polyorder": 2, "deriv": 1}},
    {"class": "sklearn.preprocessing.MinMaxScaler"},
    {"y_processing": {"class": "sklearn.preprocessing.MinMaxScaler"}},
    {"class": "sklearn.model_selection.ShuffleSplit", "params": {"n_splits": 5, "test_size": 0.25, "random_state": 42}},
    {"model": {"class": "sklearn.cross_decomposition.PLSRegression", "params": {"n_components": 12}}, "name": "PLS-12"}
  ]
}
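
To make the structure of the recipe easier to read, the following sketch labels each pipeline entry by its apparent role. The role names and the rules in `step_role` are assumptions inferred from the keys and class paths visible in this recipe, not a documented nirs4all schema.

```python
import json

RECIPE_JSON = """
{
  "name": "SNV + Savitzky-Golay + PLS",
  "pipeline": [
    {"class": "nirs4all.operators.transforms.StandardNormalVariate"},
    {"class": "nirs4all.operators.transforms.SavitzkyGolay",
     "params": {"window_length": 11, "polyorder": 2, "deriv": 1}},
    {"class": "sklearn.preprocessing.MinMaxScaler"},
    {"y_processing": {"class": "sklearn.preprocessing.MinMaxScaler"}},
    {"class": "sklearn.model_selection.ShuffleSplit",
     "params": {"n_splits": 5, "test_size": 0.25, "random_state": 42}},
    {"model": {"class": "sklearn.cross_decomposition.PLSRegression",
               "params": {"n_components": 12}},
     "name": "PLS-12"}
  ]
}
"""


def step_role(step):
    """Heuristic role label, inferred only from keys and class paths."""
    if "model" in step:
        return "model"
    if "y_processing" in step:
        return "target transform"
    if ".model_selection." in step.get("class", ""):
        return "splitter"
    return "feature transform"


def step_class(step):
    """Return the dotted class path, unwrapping 'model' / 'y_processing' entries."""
    inner = step.get("model") or step.get("y_processing") or step
    return inner["class"]


recipe = json.loads(RECIPE_JSON)
summary = [(step_role(s), step_class(s).rsplit(".", 1)[-1]) for s in recipe["pipeline"]]
for role, cls in summary:
    print(f"{role:17s} {cls}")
```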

Use it

# Python
import nirs4all_repository as n4r
pipe = n4r.get("snv_savgol_pls")
config = pipe.to_nirs4all()  # ready for nirs4all.run() / predict()

# any language: read the index, fetch + verify
curl https://repository.nirs4all.org/data/index.json
curl https://repository.nirs4all.org/data/pipelines/snv_savgol_pls/pipeline.json
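
The same two requests can be made from Python without the `nirs4all_repository` package. This is a minimal standard-library sketch: the URL pattern in `pipeline_url` is generalised from the single example above and may not hold for every entry, the layout of `index.json` is not described on this page, and the "verify" step is left out because its mechanism is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://repository.nirs4all.org/data"


def index_url():
    """URL of the repository index, as used in the curl example."""
    return f"{BASE_URL}/index.json"


def pipeline_url(pipeline_id):
    """URL of one recipe; pattern assumed from the snv_savgol_pls example."""
    return f"{BASE_URL}/pipelines/{pipeline_id}/pipeline.json"


def fetch_json(url, timeout=10.0):
    """Download a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With network access, `fetch_json(pipeline_url("snv_savgol_pls"))` should return the recipe shown under Recipe as a Python dict.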

Metadata

framework: nirs4all
kind: recipe
task: regression
version: 1.0.0
license: CeCILL-2.1 OR AGPL-3.0-or-later
trust: official
tags: pls, snv, savitzky_golay, baseline, preprocessing
authors: Gregory Beurier
reference: regression_demo

Expected metrics

test · rmse: 0.15 ± 0.04
val · rmse: 0.12 ± 0.03
status: unvalidated

Provenance

Download

recipe: pipeline.json
descriptor: descriptor.yaml