Artifact Rejection

EEG is a remarkably sensitive measurement—microvolts of electrical potential at the scalp, generated by the coordinated activity of millions of cortical neurons. Unfortunately, the brain is not the only thing generating electrical signals in the head. Eye movements produce potentials ten times larger than cortical activity. Jaw clenching sends EMG bursts across the entire scalp. The heart’s electrical cycle propagates through the body to every electrode. Even sweat glands generate slow voltage drifts.

Artifact rejection is the process of separating what the brain is doing from what everything else is doing. The Coherence Workstation uses a two-layer approach: Artifact Subspace Reconstruction (ASR) handles continuous, large-amplitude artifacts before ICA, and a post-ICA epoch filter handles residual contamination after ICA components have been removed.

Layer 1: Artifact Subspace Reconstruction (ASR)

ASR is a statistical method for identifying and reconstructing time segments where the signal departs dramatically from what clean EEG should look like. It works by learning the statistical structure of clean data from the recording itself, then flagging any segment whose variance exceeds that clean baseline by more than a threshold measured in standard deviations.

preprocessing:
  asr:
    cutoff: 20
    window_length: 0.5

Cutoff: 20 standard deviations. This is the threshold for what counts as “too large.” ASR computes the principal components of the EEG signal in sliding windows and flags any component whose variance exceeds 20 SD of the calibration data. The flagged subspace is reconstructed from the remaining (clean) components—the artifact is replaced with a statistical estimate of what the brain signal would have been without the artifact, rather than simply being deleted.

This threshold is moderately conservative. Lower values (10–15 SD) are more aggressive—they’ll catch subtler artifacts but risk removing genuine high-amplitude brain activity (epileptiform discharges, high-alpha bursts). Higher values (25–30 SD) only catch extreme artifacts like large eye movements or electrode pops. The default of 20 SD reflects a clinical balance: aggressive enough to remove the artifacts that would destabilize ICA, conservative enough to preserve legitimate neural transients.

Window length: 0.5 seconds. ASR processes the data in 500 ms sliding windows. This is short enough to catch transient artifacts (a single eye blink lasts 200–400 ms) but long enough to capture sufficient statistical structure for reliable component estimation.
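The windowed thresholding described above can be sketched in a few lines of NumPy. This is a deliberately simplified illustration, not the Coherence Workstation's implementation and not a full ASR. It assumes a separately supplied clean calibration segment, uses non-overlapping windows, and only flags windows without reconstructing them. The function name `flag_windows` and its arguments are hypothetical.

```python
import numpy as np

def flag_windows(data, calib, sfreq, cutoff=20.0, window_length=0.5):
    """Flag windows whose component RMS exceeds mean + cutoff * SD of calibration.

    data, calib: arrays of shape (n_channels, n_samples).
    Returns a boolean array with one entry per non-overlapping window of `data`.
    """
    win = int(round(window_length * sfreq))

    # Principal axes of the calibration data (eigenvectors of its covariance).
    calib_mean = calib.mean(axis=1, keepdims=True)
    _, axes = np.linalg.eigh(np.cov(calib - calib_mean))

    def window_rms(x):
        comps = axes.T @ (x - calib_mean)            # project onto principal axes
        n_win = comps.shape[1] // win                # drop any incomplete last window
        comps = comps[:, : n_win * win].reshape(comps.shape[0], n_win, win)
        return np.sqrt((comps ** 2).mean(axis=2))    # (n_components, n_windows)

    # Baseline statistics of windowed RMS in the clean calibration data.
    calib_rms = window_rms(calib)
    mu = calib_rms.mean(axis=1, keepdims=True)
    sd = calib_rms.std(axis=1, keepdims=True)

    # A window is flagged if any component is more than `cutoff` SDs above baseline.
    z = (window_rms(data) - mu) / sd
    return (z > cutoff).any(axis=0)
```

Lowering `cutoff` in this sketch flags more windows, which mirrors the aggressive-versus-conservative trade-off described above.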

What ASR Does and Doesn’t Do

ASR is not a replacement for ICA. It handles large-amplitude transient artifacts—eye blinks, electrode pops, sudden movement—that could distort ICA decomposition if left in the data. It does not handle ongoing low-amplitude contamination like sustained muscle tension, cardiac artifact, or slow electrode drift. Those are better addressed by ICA, which can separate them as independent components based on their spatial and temporal signatures.

Think of ASR as a bouncer at the door. It removes the obvious troublemakers before the more sophisticated analysis (ICA) begins. This matters because ICA decomposition is sensitive to outliers—a few seconds of extreme artifact can dominate a component and prevent the algorithm from finding the subtler sources (brain, ongoing EMG, cardiac) that we actually care about classifying.

Layer 2: Post-ICA Epoch Filtering

After ICA has removed identified artifact components (see ICA), the signal is substantially cleaner—but not perfectly clean. Residual contamination can remain from components that were borderline (kept as brain with moderate confidence) or from artifact types that don’t decompose well into independent components.

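The post-ICA epoch filter addresses this residual contamination by applying three simple amplitude-based criteria: a broadband voltage limit, a slow-wave (delta) limit, and a fast-wave (beta/low-gamma) limit. Each flagged point is then padded on both sides: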

artifact_rejection:
  voltage:
    enabled: true
    threshold_uv: 75.0
  slow_wave:
    enabled: true
    freq_low: 0.0
    freq_high: 4.0
    threshold_uv: 20.0
  fast_wave:
    enabled: true
    freq_low: 20.0
    freq_high: 35.0
    threshold_uv: 5.0
  padding_ms:
    before: 250
    after: 250

Voltage Threshold (75 µV)

Any time point where the absolute amplitude at any channel exceeds 75 µV is flagged. Post-ICA, a 75 µV signal is almost certainly not cortical—resting EEG rarely exceeds 50 µV after artifact component removal. This catches large transients that survived ASR and ICA, including residual eye movements and electrode contact artifacts.
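As a minimal sketch of this rule, assuming the cleaned data is available as a NumPy array in microvolts with shape (channels, samples). The function name `flag_voltage` is hypothetical.

```python
import numpy as np

def flag_voltage(data_uv, threshold_uv=75.0):
    """Return a per-sample mask: True where any channel's |amplitude| exceeds the threshold.

    data_uv: array of shape (n_channels, n_samples), in microvolts.
    """
    return (np.abs(data_uv) > threshold_uv).any(axis=0)

# Example: one channel dips to -80 µV at sample 4, so only that sample is flagged.
example = np.zeros((3, 10))
example[1, 4] = -80.0
example_mask = flag_voltage(example)
```

A single offending channel is enough to flag the time point for all channels, which matches the "at any channel" wording above.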

Slow-Wave Filter (Delta, 0–4 Hz, 20 µV)

The data is filtered to the delta range (0–4 Hz), and any time point where the filtered signal exceeds 20 µV is flagged. Because the lower edge is 0 Hz, this is effectively a low-pass filter at 4 Hz rather than a true bandpass, so very slow drift is retained in the filtered signal. This targets residual eye movement artifact, which has its strongest spectral signature in the delta band. It also catches slow electrode drift and sweat artifact. The 20 µV threshold is set relative to the expected amplitude of genuine cortical delta activity after ICA cleaning, which is typically under 15 µV in a clean resting recording.

Fast-Wave Filter (Beta/Low Gamma, 20–35 Hz, 5 µV)

The data is bandpass-filtered to the 20–35 Hz range, and any time point exceeding 5 µV is flagged. This targets residual muscle artifact (EMG), which produces broadband power concentrated in the beta and gamma ranges. The 5 µV threshold is low because genuine cortical beta activity is small-amplitude—typically 2–3 µV at the scalp. Anything substantially above that in the 20–35 Hz range is likely muscular.
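The slow-wave and fast-wave checks follow the same pattern: restrict the signal to a band, then threshold it. The sketch below illustrates that pattern under stated assumptions. The text does not say which filter design the pipeline uses, so a brick-wall FFT filter stands in for it here, and the threshold is assumed to apply to the absolute filtered amplitude. `band_limit` and `flag_band` are hypothetical names.

```python
import numpy as np

def band_limit(data_uv, sfreq, freq_low, freq_high):
    """Keep only spectral content in [freq_low, freq_high] Hz (brick-wall FFT filter)."""
    n = data_uv.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / sfreq)
    spectrum = np.fft.rfft(data_uv, axis=-1)
    # Zero every frequency bin outside the requested band.
    spectrum[..., (freqs < freq_low) | (freqs > freq_high)] = 0.0
    return np.fft.irfft(spectrum, n=n, axis=-1)

def flag_band(data_uv, sfreq, freq_low, freq_high, threshold_uv):
    """Per-sample mask: True where any channel's band-limited |amplitude| exceeds the threshold."""
    filtered = band_limit(data_uv, sfreq, freq_low, freq_high)
    return (np.abs(filtered) > threshold_uv).any(axis=0)

# Example: a 10 µV, 10 Hz rhythm is ignored by the fast-wave check,
# while an 8 µV, 25 Hz oscillation trips the 5 µV threshold.
sfreq = 250.0
t = np.arange(500) / sfreq
alpha = 10.0 * np.sin(2 * np.pi * 10.0 * t)[np.newaxis, :]
emg_like = 8.0 * np.sin(2 * np.pi * 25.0 * t)[np.newaxis, :]
alpha_mask = flag_band(alpha, sfreq, 20.0, 35.0, 5.0)
emg_mask = flag_band(emg_like, sfreq, 20.0, 35.0, 5.0)
```

With `freq_low=0.0` no low-frequency bins are removed, so the slow-wave configuration behaves as a low-pass filter in this sketch.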

Padding (±250 ms)

Every flagged time point is expanded by 250 ms in both directions. Artifacts don’t start and stop instantaneously—the signal is typically contaminated for a brief period before and after the peak violation. The padding ensures that the transition zones around artifacts are also excluded from analysis.
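A minimal sketch of the padding step, assuming flags are held as a per-sample boolean NumPy array. The function name `pad_mask` is hypothetical, and the parameter names mirror the `padding_ms` config keys.

```python
import numpy as np

def pad_mask(bad, sfreq, before_ms=250, after_ms=250):
    """Extend every flagged sample by before_ms earlier and after_ms later."""
    n_before = int(round(before_ms * sfreq / 1000.0))
    n_after = int(round(after_ms * sfreq / 1000.0))
    padded = np.zeros_like(bad, dtype=bool)
    for idx in np.flatnonzero(bad):
        # Slices are clipped at the recording boundaries.
        padded[max(idx - n_before, 0) : idx + n_after + 1] = True
    return padded

# Example: at 100 Hz, one flagged sample grows to 25 + 1 + 25 = 51 samples.
flags = np.zeros(100, dtype=bool)
flags[50] = True
padded_flags = pad_mask(flags, sfreq=100)
```

The loop keeps the sketch easy to read. A production version would more likely dilate the mask in a single vectorized pass.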

How Flagged Segments Are Used

The epoch filter doesn’t delete data. It produces a map of “good” and “bad” time segments that downstream stages use to select clean data for analysis. Spectral analysis computes PSDs only from good segments. Connectivity analysis extracts epochs only from good segments. The flagged segments are stored in the stage output and can be reviewed in the desktop application.

This approach—flagging rather than deleting—preserves the original data and makes the cleaning process transparent. The clinician can see exactly what was removed and why, and can adjust the thresholds if the defaults are too aggressive or too permissive for a particular recording.

Segment Algebra

When multiple filters flag different time regions, the pipeline computes the union of all flagged segments. A time point flagged by any filter is excluded, regardless of whether other filters consider it clean. Overlapping or adjacent flagged regions are merged into contiguous segments, and the padding is applied to the merged result. Regions that come to overlap once padded are treated as a single segment.

The output is a list of contiguous “good data” time ranges—segments where no filter detected contamination. These ranges are used by all downstream stages that require clean data. The total duration of good data, expressed as a percentage of the recording length, is reported in the preprocessing output as a data quality metric.
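The union-and-invert logic can be sketched as follows, assuming each filter produces a per-sample boolean mask of the same length. Padding is omitted for brevity, and the names `mask_to_ranges` and `good_segments` are hypothetical.

```python
import numpy as np

def mask_to_ranges(mask, sfreq):
    """Convert a boolean mask to a list of (start_s, end_s) ranges where it is True."""
    edged = np.concatenate(([False], mask, [False]))
    # Indices where the mask changes value: even entries are starts, odd are ends.
    edges = np.flatnonzero(edged[1:] != edged[:-1])
    return [(float(s) / sfreq, float(e) / sfreq)
            for s, e in zip(edges[0::2], edges[1::2])]

def good_segments(bad_masks, sfreq):
    """Union the bad masks, then return the good ranges and percent of good data."""
    bad = np.logical_or.reduce(bad_masks)    # flagged by any filter
    good = ~bad
    percent_good = 100.0 * good.sum() / good.size
    return mask_to_ranges(good, sfreq), percent_good

# Example: 10 s at 10 Hz, with overlapping voltage and slow-wave flags.
n = 100
voltage = np.zeros(n, dtype=bool); voltage[20:30] = True
slow = np.zeros(n, dtype=bool); slow[25:40] = True
fast = np.zeros(n, dtype=bool); fast[70:80] = True
ranges, percent_good = good_segments([voltage, slow, fast], sfreq=10.0)
```

Because the union is taken on per-sample masks, overlapping or adjacent flagged regions merge automatically before the good ranges are extracted.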