EEGLAB Reference Validation

When you build a new signal processing pipeline, the first question from any clinician is: Does it produce the same results as what I’m already using? For clinical QEEG, “what I’m already using” is almost always EEGLAB—the open-source MATLAB toolbox that has been the workhorse of EEG research and clinical analysis for over two decades.

The Coherence Workstation is not a port of EEGLAB. It’s a ground-up implementation in Python using MNE-Python as the signal processing core, with different algorithms (Picard vs. Infomax for ICA, specparam vs. manual peak-picking for spectral analysis) and different defaults. But it follows EEGLAB conventions where the conventions are sound—frequency band boundaries, montage naming, ICLabel classification, LORETA source estimation—and diverges only where a specific technical justification exists.

This page documents how those claims are verified.

The Validation Approach

Validation works by processing the same recordings through both pipelines—EEGLAB (MATLAB) and the Coherence Workstation (Python)—and comparing the outputs at each stage. The comparison isn’t a pass/fail test; it’s a detailed, metric-by-metric accounting of where results agree, where they diverge, and what explains the difference.

The recordings used for validation are drawn from clinical data collected in routine practice. They include both resting-state (eyes-open and eyes-closed) and task (GO/NoGo) recordings, processed with matched parameters where possible.

What Gets Compared

Stage 1: Spectral Power

The most fundamental comparison. For each channel and each frequency band, the pipeline computes absolute and relative power using Welch’s method, and the results are compared against EEGLAB’s spectopo function.

Where results agree: Band power estimates are typically within 5% when the Welch parameters (window length, overlap, FFT size) are matched. The spectral shape—the relative distribution of power across frequencies—is virtually identical.
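As a rough illustration of this comparison, the sketch below computes absolute and relative band power for one channel with SciPy's Welch estimator and expresses the gap between two estimates as a relative difference. The window length, overlap, Hamming window, and the 1–45 Hz range used for total power are assumptions chosen for the example, not the documented settings of either pipeline.

```python
import numpy as np
from scipy.signal import welch

# Band edges follow this page; the 1-45 Hz total-power range is an assumption.
BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0)}
TOTAL_RANGE = (1.0, 45.0)

def band_power(signal, sfreq, win_sec=2.0, overlap=0.5):
    """Return (absolute, relative) band power dicts for one channel."""
    nperseg = int(win_sec * sfreq)
    freqs, psd = welch(signal, fs=sfreq, window="hamming", nperseg=nperseg,
                       noverlap=int(nperseg * overlap), nfft=nperseg)
    df = freqs[1] - freqs[0]

    def integrate(lo, hi):
        # Rectangle-rule integral of the PSD over [lo, hi)
        mask = (freqs >= lo) & (freqs < hi)
        return float(psd[mask].sum() * df)

    total = integrate(*TOTAL_RANGE)
    absolute = {name: integrate(lo, hi) for name, (lo, hi) in BANDS.items()}
    relative = {name: value / total for name, value in absolute.items()}
    return absolute, relative

def relative_difference(value, reference):
    """Unsigned difference as a fraction of the reference value."""
    return abs(value - reference) / abs(reference)

# Synthetic channel: a 10 Hz rhythm plus white noise
rng = np.random.default_rng(0)
sfreq = 250.0
t = np.arange(0, 60, 1 / sfreq)
channel = 10 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
absolute, relative = band_power(channel, sfreq)
```

In a real comparison, the reference value passed to relative_difference would come from the EEGLAB output for the same channel and band, computed with matched Welch parameters.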

Where results may differ: Absolute power values can differ by a constant factor if the two pipelines use different scaling conventions (volts vs. microvolts, one-sided vs. two-sided PSD). These are bookkeeping differences, not algorithmic ones—the relative power distribution and band ratios are unaffected.
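A small numerical sketch of why a scaling mismatch is harmless for relative measures: power in V² and µV² differs by a factor of 10¹² (and one-sided versus two-sided densities differ by a factor of 2 away from DC and Nyquist), yet the relative distribution is unchanged. The band power values and function names below are hypothetical and exist only for the example.

```python
import numpy as np

def estimate_scale_factor(power_a, power_b):
    """Median ratio between two sets of absolute power values."""
    return float(np.median(np.asarray(power_a) / np.asarray(power_b)))

def relative_power(band_powers):
    """Each band's share of the summed band power."""
    band_powers = np.asarray(band_powers, dtype=float)
    return band_powers / band_powers.sum()

# Hypothetical band powers (delta, theta, alpha, beta) reported in V^2
power_volts = np.array([4.0e-12, 2.0e-12, 6.0e-12, 1.0e-12])
# The same values reported in uV^2 (1 V^2 = 1e12 uV^2)
power_microvolts = power_volts * 1e12

factor = estimate_scale_factor(power_microvolts, power_volts)
same_shape = np.allclose(relative_power(power_volts),
                         relative_power(power_microvolts))
```

If the estimated factor is constant across channels and bands, the discrepancy is a units or PSD-convention issue rather than an algorithmic one.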

Where results intentionally differ: The Coherence Workstation uses specparam for aperiodic separation, which EEGLAB’s default spectral tools do not. This means the pipeline reports both traditional band power (which includes the aperiodic component) and aperiodic-corrected peak power (which isolates the periodic oscillatory contribution). The traditional band power values should match EEGLAB; the aperiodic-corrected values have no EEGLAB equivalent to compare against.

Stage 2: Frequency Band Definitions

The pipeline uses Alpha [8–13 Hz] and Beta [13–30 Hz], matching the EEGLAB/Thatcher/Kaiser clinical standard. Earlier versions of the Coherence Workstation used slightly different boundaries (Alpha [8–12 Hz], Beta [12–25 Hz]); these legacy boundaries are documented in configs/default.yaml for reference but are no longer active.

Any comparison against a system using different band boundaries will show systematic differences in band power—not because the signal processing differs, but because the bands are integrating over different frequency ranges. This is a configuration difference, not a validation failure.
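To make the size of this effect concrete, the sketch below integrates one synthetic spectrum over the current alpha band and over the legacy one. The spectrum (a 1/f-like background plus a Gaussian peak at 11.5 Hz) is invented for illustration; only the band edges come from this page.

```python
import numpy as np

def integrate_band(freqs, psd, lo, hi):
    """Rectangle-rule integral of the PSD over [lo, hi)."""
    mask = (freqs >= lo) & (freqs < hi)
    return float(psd[mask].sum() * (freqs[1] - freqs[0]))

freqs = np.arange(0.0, 45.5, 0.5)
# Synthetic spectrum: 1/f-like background plus an alpha peak at 11.5 Hz
background = 1.0 / np.maximum(freqs, 0.5)
peak = 5.0 * np.exp(-0.5 * ((freqs - 11.5) / 1.0) ** 2)
psd = background + peak

current_alpha = integrate_band(freqs, psd, 8.0, 13.0)  # active definition
legacy_alpha = integrate_band(freqs, psd, 8.0, 12.0)   # legacy definition
# Fraction of current-band alpha power that the legacy band leaves out
shift = (current_alpha - legacy_alpha) / current_alpha
```

The signal is identical in both cases; the difference comes entirely from the integration limits, and it grows when the individual alpha peak sits near the upper band edge.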

Stage 3: ICA Decomposition

ICA decomposition is deterministic given the same algorithm, parameters, and random seed. However, Picard (used by the Coherence Workstation) and Infomax (used by EEGLAB’s default runica) are different algorithms that solve the same optimization problem via different methods.

Where results agree: The components identified by Picard and Infomax are typically equivalent—the same eye blink component, the same muscle component, the same brain sources—though they may appear in a different order and with different scaling.

Where results may differ: Component order is not guaranteed to match. Both algorithms sort by variance explained, but components with nearly equal variance can land in a different order. The spatial maps may show slight numerical differences while capturing the same spatial pattern. In rare cases with marginal data quality, one algorithm may converge to a different local optimum than the other, producing qualitatively different decompositions.
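When a qualitative check of the two decompositions is wanted, components can be paired in a way that ignores order, sign, and scale. The sketch below does this with the absolute correlation between spatial maps and an optimal one-to-one assignment. It is an illustrative technique rather than a documented step of either pipeline, and the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_components(maps_a, maps_b):
    """Pair columns of two (n_channels, n_components) mixing matrices.

    Returns (index_a, index_b, abs_corr), one entry per matched pair.
    """
    n_a = maps_a.shape[1]
    # Correlate every map in A with every map in B
    corr = np.corrcoef(maps_a.T, maps_b.T)[:n_a, n_a:]
    similarity = np.abs(corr)  # sign of an ICA component is arbitrary
    # linear_sum_assignment minimizes cost, so negate to maximize similarity
    rows, cols = linear_sum_assignment(-similarity)
    return rows, cols, similarity[rows, cols]

# Synthetic check: B is A with columns reordered, rescaled, and sign-flipped
rng = np.random.default_rng(1)
maps_a = rng.standard_normal((19, 5))
perm = np.array([3, 0, 4, 1, 2])
scale = np.array([-1.0, 2.0, -0.5, 1.0, 3.0])
maps_b = maps_a[:, perm] * scale

index_a, index_b, scores = match_components(maps_a, maps_b)
```

A low score for a matched pair flags a component that one decomposition found and the other did not, which is the situation described for marginal data quality.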

What this means for validation: Component-by-component numerical comparison is not meaningful. Instead, validation compares the reconstructed signal after component rejection—the clean EEG that results from removing the identified artifact components. If both pipelines identify and remove the same artifact types, the reconstructed signals should be nearly identical regardless of the decomposition details.
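The sketch below illustrates why the reconstructed signal is a stable comparison target. Two decompositions that differ only in component order, sign, and scale yield the same cleaned signal once the same artifact component is removed. The similarity metrics (per-channel correlation and relative RMS error) are assumptions for illustration; this page does not specify which metrics the validation uses at this stage.

```python
import numpy as np

def compare_reconstructions(clean_a, clean_b):
    """Per-channel Pearson r and relative RMS error.

    Both inputs are arrays shaped (n_channels, n_samples).
    """
    a = clean_a - clean_a.mean(axis=1, keepdims=True)
    b = clean_b - clean_b.mean(axis=1, keepdims=True)
    corr = (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    rel_rmse = np.sqrt(((a - b) ** 2).mean(axis=1) / (b ** 2).mean(axis=1))
    return corr, rel_rmse

rng = np.random.default_rng(0)
sources = rng.standard_normal((4, 1000))
mixing = rng.standard_normal((4, 4))

# Decomposition A: drop component 0 (treated as the artifact)
keep_a = np.array([False, True, True, True])
clean_a = mixing[:, keep_a] @ sources[keep_a]

# Decomposition B: same components, reordered and rescaled
order = np.array([2, 0, 3, 1])
scale = np.array([-2.0, 0.5, 3.0, -1.0])
mixing_b = mixing[:, order] * scale
sources_b = sources[order] / scale[:, None]
keep_b = order != 0  # the artifact now sits at a different index
clean_b = mixing_b[:, keep_b] @ sources_b[keep_b]

corr, rel_rmse = compare_reconstructions(clean_a, clean_b)
```

With real data the two decompositions are not exact reorderings of each other, so the correlation falls slightly below 1 and the relative error rises above 0; how much is acceptable is a judgment this sketch does not make.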

Stage 4: ICLabel Classification

Both pipelines use ICLabel for component classification—the same deep-learning model, the same probability categories, the same decision thresholds. The probabilities should be identical for identical components. Differences arise only when the underlying ICA decomposition produces different components (see above), which changes the input to ICLabel and therefore the classifications.

Stage 5: Connectivity

Connectivity validation is more complex because the Coherence Workstation uses dwPLI as its primary metric while EEGLAB’s connectivity tools typically default to coherence or PLI. The validation compares:

Coherence values (which both pipelines compute) should match within numerical precision when epoch parameters are matched
dwPLI values are compared against EEGLAB’s pop_roiconnect plugin (which implements dwPLI via the ROIconnect framework)
Surrogate testing results may differ due to different random seeds for phase randomization, but the significance pattern (which connections are significant vs. not) should be largely consistent
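For readers unfamiliar with the metric, the following is a minimal sketch of the commonly used debiased squared wPLI estimator, computed from per-epoch cross-spectra. It is written from the standard definition and is not taken from the code of either pipeline; the function name and array layout are assumptions.

```python
import numpy as np

def debiased_wpli(cross_spectra):
    """Debiased squared wPLI from per-epoch cross-spectra.

    cross_spectra: complex array shaped (n_epochs, n_freqs) for one channel pair.
    """
    imag = np.imag(cross_spectra)
    sum_imag = imag.sum(axis=0)
    sum_abs = np.abs(imag).sum(axis=0)
    sum_sq = (imag ** 2).sum(axis=0)
    # Subtracting sum_sq removes the same-epoch terms that bias plain wPLI
    numer = sum_imag ** 2 - sum_sq
    denom = sum_abs ** 2 - sum_sq
    with np.errstate(divide="ignore", invalid="ignore"):
        value = numer / denom
    return np.where(denom > 0, value, 0.0)

rng = np.random.default_rng(0)
n_epochs, n_freqs = 200, 8
amplitude = rng.uniform(0.5, 1.5, (n_epochs, n_freqs))

# Consistent 45-degree phase lag in every epoch
locked = amplitude * np.exp(1j * np.pi / 4)
# Phase lag drawn at random in every epoch
random_phase = amplitude * np.exp(1j * rng.uniform(0, 2 * np.pi, (n_epochs, n_freqs)))

dwpli_locked = debiased_wpli(locked)
dwpli_random = debiased_wpli(random_phase)
```

A consistent non-zero phase lag drives the estimate toward 1, while unrelated phases give values scattered around 0, including small negative values, which is expected for a debiased estimator.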

Stage 6: Source Localization

Both pipelines use the same mathematical approach for source estimation—a pre-computed transformation matrix based on sLORETA with a template head model. The Coherence Workstation uses the MNE-Python forward model assets; EEGLAB’s LORETA plugin uses the original sLORETA-KEY coordinates. The voxel grids differ slightly (different template discretizations), so point-by-point comparison is not appropriate. Instead, validation compares the regional pattern—which Brodmann areas show peak activation, and whether the relative intensity ranking across regions is consistent.
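A regional comparison of this kind can be sketched as follows: rank the regions in each output by activation, then check whether the peak region matches and how many of the top regions are shared. The region labels and activation values are invented for the example, and the function name is hypothetical.

```python
def compare_regional_patterns(activation_a, activation_b, top_k=5):
    """Compare two region -> activation mappings by rank, not by voxel."""
    def ranked(activation):
        # Region labels sorted from strongest to weakest activation
        return sorted(activation, key=activation.get, reverse=True)

    rank_a, rank_b = ranked(activation_a), ranked(activation_b)
    shared = set(rank_a[:top_k]) & set(rank_b[:top_k])
    return {"same_peak": rank_a[0] == rank_b[0], "top_k_overlap": len(shared)}

# Hypothetical normalized activations per Brodmann area from each pipeline
pipeline_a = {"BA7": 1.00, "BA19": 0.92, "BA39": 0.85,
              "BA18": 0.70, "BA40": 0.66, "BA6": 0.30}
pipeline_b = {"BA7": 0.97, "BA39": 0.90, "BA19": 0.88, "BA40": 0.72,
              "BA31": 0.69, "BA18": 0.65, "BA6": 0.28}

result = compare_regional_patterns(pipeline_a, pipeline_b)
```

Working with region labels rather than voxel indices sidesteps the mismatch between the two template grids.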

Tolerance Margins

For quantitative comparisons, the following tolerances define “agreement”:

Spectral power: Within 5% relative difference (absolute power) or 0.1 dB (log power)
Band ratios: Within 2% relative difference
Asymmetry indices: Within 0.05 absolute difference
Coherence values: Within 0.02 absolute difference
Source localization: Same peak Brodmann area; top-5 regions overlap by at least 3 of 5

Values within these margins are considered equivalent—the differences reflect numerical precision, different interpolation methods, or different implementation details rather than meaningful algorithmic disagreement. Values outside these margins trigger investigation to identify the source of the discrepancy.
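The numeric tolerances above can be expressed as a small lookup table and a check function, sketched below. The key names and function names are illustrative. The sketch also includes a power-ratio-to-decibel conversion, because a 5% difference in power corresponds to roughly 0.21 dB; the percentage and decibel limits for spectral power are therefore not interchangeable, and the sketch applies only the percentage limit.

```python
import math

# Limits as listed on this page; key names are illustrative.
TOLERANCES = {
    "spectral_power": ("relative", 0.05),
    "band_ratio": ("relative", 0.02),
    "asymmetry_index": ("absolute", 0.05),
    "coherence": ("absolute", 0.02),
}

def within_tolerance(metric, value, reference):
    """True if value agrees with reference under the metric's tolerance."""
    kind, limit = TOLERANCES[metric]
    diff = abs(value - reference)
    if kind == "relative":
        diff /= abs(reference)
    return diff <= limit

def power_ratio_to_db(ratio):
    """Convert a ratio of two power values to decibels."""
    return 10.0 * math.log10(ratio)

# Hypothetical values: (metric, this pipeline, EEGLAB reference)
checks = [
    ("spectral_power", 104.0, 100.0),
    ("coherence", 0.63, 0.60),
]
flagged = [name for name, value, ref in checks
           if not within_tolerance(name, value, ref)]
```

Anything that ends up in the flagged list would be a candidate for the investigation step described above.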

Where We Intentionally Differ

Some differences between the pipelines are not bugs—they’re deliberate choices with documented rationale:

ICA algorithm (Picard vs. Infomax): Picard converges faster and is more robust to initialization, so computation time is substantially shorter while the clinical results remain equivalent.

Two-stage filtering: The Coherence Workstation fits ICA on a copy of the data high-pass filtered at 1.0 Hz, then applies the resulting decomposition to the analysis data, which is high-pass filtered at 0.5 Hz. EEGLAB’s default workflow uses a single filter cutoff for both. The two-stage approach preserves more delta activity at the cost of added complexity.
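The data flow of the two-stage approach can be sketched without committing to a particular ICA implementation. In the example below, fit_unmixing is a hypothetical callable standing in for the ICA fit, and the zero-phase Butterworth high-pass is an assumption made to keep the example short; the pipeline's actual filter design is not described on this page.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def highpass(data, sfreq, cutoff_hz, order=4):
    """Zero-phase Butterworth high-pass along the time axis (assumed design)."""
    sos = butter(order, cutoff_hz, btype="highpass", fs=sfreq, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

def two_stage_clean(data, sfreq, fit_unmixing, reject):
    """Fit on the 1.0 Hz copy, remove components from the 0.5 Hz copy.

    fit_unmixing: hypothetical callable, data -> (n_components, n_channels)
    reject: indices of components to remove
    """
    ica_copy = highpass(data, sfreq, 1.0)       # used only to learn the unmixing
    analysis_copy = highpass(data, sfreq, 0.5)  # keeps more slow activity
    unmixing = fit_unmixing(ica_copy)
    mixing = np.linalg.pinv(unmixing)
    sources = unmixing @ analysis_copy
    sources[list(reject)] = 0.0                 # zero out artifact components
    return mixing @ sources

# Synthetic check with a known mixing matrix standing in for a fitted ICA
rng = np.random.default_rng(0)
sfreq = 250.0
true_sources = rng.standard_normal((3, 2500))
true_mixing = rng.standard_normal((3, 3))
recording = true_mixing @ true_sources

cleaned = two_stage_clean(recording, sfreq,
                          fit_unmixing=lambda x: np.linalg.inv(true_mixing),
                          reject=[0])
expected = true_mixing[:, 1:] @ highpass(true_sources[1:], sfreq, 0.5)
```

The key point is that the unmixing matrix learned on the more aggressively filtered copy is applied to the less aggressively filtered one, so the cleaned output retains activity between 0.5 and 1.0 Hz.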

Aperiodic separation: The Coherence Workstation separates periodic and aperiodic spectral components using specparam. EEGLAB’s default tools report total band power without aperiodic separation. This is an additive feature—the traditional metrics are still available for comparison.

Surrogate-based significance for connectivity: The Coherence Workstation uses phase-randomization surrogates with FDR correction rather than arbitrary percentile thresholds. This produces statistically principled significance estimates rather than display-parameter-dependent connection counts.
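The building blocks of such a procedure can be sketched as follows: a phase-randomized surrogate that preserves the amplitude spectrum, an empirical p-value from a surrogate distribution, and Benjamini-Hochberg FDR control. This is a generic illustration under stated assumptions. The FDR level of 0.05, the one-sided p-value, and all function names are choices made for the example, not documented pipeline settings.

```python
import numpy as np

def phase_randomize(signal, rng):
    """Surrogate with the same amplitude spectrum and randomized phases."""
    spectrum = np.fft.rfft(signal)
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                # leave the DC term untouched
    if signal.size % 2 == 0:
        phases[-1] = 0.0           # leave the Nyquist term untouched
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=signal.size)

def surrogate_p_value(observed, surrogate_values):
    """One-sided empirical p-value with the usual +1 correction."""
    surrogate_values = np.asarray(surrogate_values)
    return (1 + np.sum(surrogate_values >= observed)) / (1 + surrogate_values.size)

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of p-values that survive BH FDR control at level alpha."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, p.size + 1) / p.size
    passed = p[order] <= thresholds
    significant = np.zeros(p.size, dtype=bool)
    if passed.any():
        cutoff = np.nonzero(passed)[0].max()   # largest rank meeting its threshold
        significant[order[:cutoff + 1]] = True
    return significant

rng = np.random.default_rng(0)
signal = rng.standard_normal(512)
surrogate = phase_randomize(signal, rng)

p_values = [0.001, 0.008, 0.039, 0.041, 0.6]   # hypothetical per-connection p-values
mask = benjamini_hochberg(p_values)
```

Because the surrogates depend on the random seed, individual p-values vary from run to run; FDR control is applied across all connections so that the overall pattern of significant connections is what gets compared.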

Each of these differences is documented in the relevant pipeline section. The validation confirms that where the pipelines make the same choices, they produce the same results—and where they make different choices, the differences are explained and justified.

Ongoing Validation

Validation is not a one-time event. As the pipeline evolves—new features, bug fixes, parameter adjustments—the cross-pipeline comparison is re-run to ensure that the core signal processing remains consistent. Any code change that affects spectral power, connectivity, or source localization values triggers a validation check against the EEGLAB reference outputs.

The validation recordings, EEGLAB reference outputs, and comparison scripts are maintained as part of the project’s test infrastructure. They’re not published (they contain clinical data), but the validation results are available for audit upon request.