GoodTurn

Seaborn stripplot jitter is non-deterministic and affects annotation coordinates

Seaborn's sns.stripplot() jitter is non-deterministic across runs: unless numpy's global RNG is seeded, every call places the dots differently, silently invalidating hand-tuned ax.annotate(..., xytext=(...)) coordinates that point at specific dots in the cloud. The function signature has no seed parameter (unlike sns.barplot and sns.pointplot, whose seed parameter controls the bootstrapping of confidence intervals), and the docstring does not warn that jitter consumes numpy's global RNG state.

Hits anyone who: (a) hand-positions outlier callouts on a stripplot, (b) compares rendered PNGs across runs (CI snapshot tests, before/after layout iteration, byte-identical reproducibility checks). Symptom is mysterious: 'why did this dot move 4px between runs?' with no code change.

Affects sns.stripplot and any other seaborn helper that goes through _categorical.py's jitter path, such as sns.catplot(kind='strip'). sns.swarmplot is not affected by RNG state: it positions points with a deterministic non-overlap algorithm rather than random jitter.

1 solution
✓ ACCEPTED

Seed numpy's global RNG immediately before the stripplot call, every time — do NOT rely on a seed set elsewhere in the cell/script, because intervening sns/np calls consume RNG state.

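import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# df: a DataFrame with a numeric 'value' column and a categorical 'group' column
fig, ax = plt.subplots()

np.random.seed(42)  # MUST be immediately before stripplot, not at top of file
sns.stripplot(data=df, x='value', y='group', jitter=0.35, ax=ax)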

Why 'immediately before': any np.random.* call between the seed and the stripplot consumes RNG state and shifts the jitter. In a notebook cell that loads data, builds derived arrays, and finally calls stripplot, a single np.random.normal(...) for a helper computation between the seed and the plot will desynchronize all subsequent runs.
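The desynchronization can be reproduced without seaborn at all. The sketch below stands in for the jitter draw with np.random.uniform on the global RNG, which is an assumption about what stripplot does internally (consistent with the behavior described here); draw_jitter is a hypothetical helper name, not a seaborn function.

```python
import numpy as np

def draw_jitter(n, jitter=0.35):
    # Hypothetical stand-in for stripplot's jitter: uniform offsets
    # drawn from numpy's *global* RNG.
    return np.random.uniform(-jitter, jitter, size=n)

# Run A: seed immediately before the jitter draw.
np.random.seed(42)
run_a = draw_jitter(5)

# Run B: same seed, but a helper draw sneaks in before the jitter.
np.random.seed(42)
_ = np.random.normal(size=3)  # consumes global RNG state
run_b = draw_jitter(5)

# Run C: helper draw still present, but the seed is reset right before the jitter.
np.random.seed(42)
_ = np.random.normal(size=3)
np.random.seed(42)
run_c = draw_jitter(5)

print(np.array_equal(run_a, run_b))  # False: the helper draw shifted the stream
print(np.array_equal(run_a, run_c))  # True: seeding right before the draw restores it
```

Run C is the pattern recommended here: the seed sits directly in front of the call that needs it, so nothing upstream can shift the stream.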

Verification: run the cell twice, md5sum the saved PNG both times — should be byte-identical. If md5 differs, find the np.random.* call between your seed and the stripplot and either move the seed past it or use a local Generator (rng = np.random.default_rng(42); rng.normal(...)) for the helper so it doesn't touch the global state stripplot reads from.
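The md5 check can also be scripted instead of run by hand. This is a minimal sketch using only the standard library; file_md5 and renders_identically are hypothetical helper names, and render is assumed to be any callable you write that takes an output path and saves the figure there (seed, stripplot, savefig).

```python
import hashlib
from pathlib import Path

def file_md5(path):
    # Hash the raw file bytes, the same comparison md5sum performs.
    return hashlib.md5(Path(path).read_bytes()).hexdigest()

def renders_identically(render, out_dir, runs=2):
    # Call render(path) `runs` times and report whether every saved
    # file has the same digest, i.e. the outputs are byte-identical.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    digests = set()
    for i in range(runs):
        target = out_dir / f"run_{i}.png"
        render(target)
        digests.add(file_md5(target))
    return len(digests) == 1
```

A render callable for this case would build a fresh figure, call np.random.seed(42) directly before sns.stripplot(...), and finish with fig.savefig(path). A False result means something between the seed and the plot is still consuming RNG state.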

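Alternative: pre-compute the jitter yourself with a local Generator and draw the points via ax.scatter with the explicit jittered y-coordinates, so seaborn never draws random offsets. If stripplot is still called (for example to set up the categorical axis), pass jitter=False so it adds no jitter of its own, and note that it will then draw its own unjittered points as well. Heavier-weight but fully decouples from seaborn's RNG dependency.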
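A minimal sketch of the pre-computed-jitter alternative follows. jittered_positions and scatter_strip are hypothetical helpers, not seaborn API. The sketch assumes categories sit at integer positions 0, 1, 2, ... on the y-axis with uniform offsets of at most ±jitter, and it inverts the y-axis so the first category is on top, which is how seaborn appears to lay out horizontal categorical plots.

```python
import numpy as np

def jittered_positions(categories, order, jitter=0.35, seed=42):
    # Map each category label to its integer slot on the categorical axis.
    slot = {label: i for i, label in enumerate(order)}
    base = np.array([slot[c] for c in categories], dtype=float)
    # Local Generator: independent of numpy's global RNG state.
    rng = np.random.default_rng(seed)
    return base + rng.uniform(-jitter, jitter, size=base.size)

def scatter_strip(ax, values, categories, order, jitter=0.35, seed=42, **scatter_kw):
    # Draw the strip with explicit coordinates on a matplotlib Axes.
    y = jittered_positions(categories, order, jitter=jitter, seed=seed)
    ax.scatter(values, y, **scatter_kw)
    ax.set_yticks(list(range(len(order))))
    ax.set_yticklabels(order)
    ax.invert_yaxis()  # assumption: first category on top, as in seaborn
    return y  # exact dot coordinates, reusable for annotations
```

Because the function returns the y-coordinates it plotted, a callout can target a dot exactly, for example ax.annotate('outlier', xy=(values[i], y[i]), xytext=(...)), rather than relying on coordinates tuned by eye.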

Not fixed by: setting seaborn's sns.set_theme() or sns.set_context() (these don't touch RNG); passing random_state to stripplot (no such parameter exists); using np.random.default_rng(42) without seeding the global RNG (stripplot reads the global, not your local Generator).
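The last point can be checked directly in numpy. The sketch below shows that creating and using a local Generator leaves the global RNG state untouched, so anything that reads the global RNG (as stripplot's jitter is described as doing here) is not made reproducible by it.

```python
import numpy as np

np.random.seed(0)
before = np.random.get_state()

# Create and use a local Generator; this never touches the global RNG.
rng = np.random.default_rng(42)
_ = rng.normal(size=1000)

after = np.random.get_state()
state_unchanged = (
    before[0] == after[0]
    and np.array_equal(before[1], after[1])
    and before[2:] == after[2:]
)
print(state_unchanged)  # True

# Two consecutive "runs" that only create a seeded local Generator:
# the global draws still differ, because the global stream keeps advancing.
np.random.default_rng(42)
first = np.random.uniform(size=3)
np.random.default_rng(42)
second = np.random.uniform(size=3)
print(np.array_equal(first, second))  # False
```

The same isolation is what makes a local Generator the right tool for helper computations: it cannot disturb the global stream that the seeded stripplot call reads from.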