embedding notebook — drift & projection

{% if deployment_id %}prefect · deployment {{ deployment_id[:8] }}{% else %}prefect · unreachable{% endif %} metrics →
{{ prefect_api }}
§ 0 introduction scope stability of low-dim embeddings under input drift

What this is. Dimensionality reduction is a workhorse for both exploratory visualization and downstream prediction, yet the stability of its output under small perturbations of the input is rarely examined directly. This notebook takes a narrow, empirical approach: a three-dimensional point cloud (§ 1) is perturbed by a controlled amount at each of a short sequence of timesteps, the selected reducer (§ 2) is applied independently to every snapshot, and the resulting trajectory of two-dimensional embeddings is recorded.
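The procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the notebook's actual code: the helix generator `make_cloud`, the NumPy-only PCA stand-in `pca_2d`, and the choice of cumulative Gaussian perturbation with scale `sigma` are all hypothetical, chosen only to mirror the n samples / noise σ / timesteps controls.

```python
import numpy as np


def make_cloud(n_samples: int) -> np.ndarray:
    """Hypothetical generator: a 3-D helix of shape (n_samples, 3)."""
    t = np.linspace(0.0, 4.0 * np.pi, n_samples)
    return np.column_stack([np.cos(t), np.sin(t), t / (4.0 * np.pi)])


def pca_2d(x: np.ndarray) -> np.ndarray:
    """Stand-in reducer: project onto the top two principal axes."""
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def embed_trajectory(n_samples=200, sigma=0.05, timesteps=5, seed=0):
    """Perturb the cloud at each timestep and reduce every snapshot independently.

    Assumption: the perturbation is cumulative Gaussian noise, so the input
    drifts a little further from the original cloud at every step.
    Returns an array of shape (timesteps, n_samples, 2).
    """
    rng = np.random.default_rng(seed)
    snapshot = make_cloud(n_samples)
    frames = []
    for _ in range(timesteps):
        snapshot = snapshot + rng.normal(scale=sigma, size=snapshot.shape)
        # The reducer is refit from scratch on each snapshot; nothing is
        # carried over from the previous frame.
        frames.append(pca_2d(snapshot))
    return np.stack(frames)
```

Because each snapshot is reduced independently, any arbitrary choice inside the reducer (for PCA, the sign of each principal axis) can differ between frames, and that shows up as layout motion even when the input barely changed.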

What it measures. Two stability views are logged alongside each run and plotted on the metrics page. Per-timestep travel — ‖ y(t) − y(t−1) ‖ — captures how much the 2-D layout moves between consecutive frames. kNN retention captures how much of the input-space neighborhood graph survives projection. Together they separate reducers that are globally stable but locally noisy from those with the opposite failure mode.
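One plausible way to compute the two views is sketched below. Both definitions are assumptions where the text is silent: travel is taken as the per-point displacement ‖ y(t) − y(t−1) ‖ averaged over all points, and kNN retention as the mean fraction of each point's k nearest input-space neighbors that are also among its k nearest neighbors in the embedding. The function names and the default `k=10` are hypothetical.

```python
import numpy as np


def per_timestep_travel(traj: np.ndarray) -> np.ndarray:
    """traj has shape (T, n, 2); returns T-1 mean displacements.

    Assumption: displacement is averaged over points for each frame pair.
    """
    steps = np.linalg.norm(traj[1:] - traj[:-1], axis=-1)  # shape (T-1, n)
    return steps.mean(axis=1)


def knn_indices(x: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of every row (brute force, O(n^2))."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def knn_retention(x_high: np.ndarray, y_low: np.ndarray, k: int = 10) -> float:
    """Mean fraction of input-space kNN preserved in the embedding, in [0, 1]."""
    high = knn_indices(x_high, k)
    low = knn_indices(y_low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(high, low)]
    return float(np.mean(overlap)) / k
```

Under these definitions, low travel with low retention would indicate a layout that sits still but scrambles neighborhoods, while high travel with high retention would indicate a layout that preserves local structure but moves as a whole between frames.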

Why this matters. A reducer that looks well-behaved on a single snapshot is not automatically the right tool for a streaming or longitudinal setting. When the reducer is the substrate for a visualization, frame-to-frame motion reads as change the user did not request; when it is a feature-extraction step inside a classification pipeline, drift between training and inference quietly erodes accuracy. The aim here is to build intuition for those regimes before committing the reducer to either role.

§ 1 input dataset generator

Six candidate generators for the embedding pipeline. Drag to rotate, scroll to zoom,   or 1 … 6 to select.

n samples
noise σ
timesteps

Dimensionality reduction applied to each snapshot. Only reducers whose Python package is importable are shown.
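A filter of this kind can be written with the standard library alone. The sketch below is an assumption about how the importability check might work, not the page's actual logic; the helper `available_reducers` and the label-to-module mapping in the usage comment are hypothetical.

```python
from importlib.util import find_spec


def available_reducers(candidates: dict[str, str]) -> list[str]:
    """Return the labels whose top-level module can be located.

    `candidates` maps a display label to a top-level module name.
    find_spec looks the module up without importing it and returns
    None when it is not installed.
    """
    shown = []
    for label, module in candidates.items():
        try:
            if find_spec(module) is not None:
                shown.append(label)
        except (ImportError, ValueError):
            # Malformed names or broken installs are treated as unavailable.
            continue
    return shown


# Hypothetical usage: module names here are illustrative only.
# available_reducers({"PCA": "sklearn", "UMAP": "umap"})
```

Checking with `find_spec` rather than a bare `import` avoids paying the import cost of every heavy package just to decide which options to render.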

    {% for r in reducers %}
  • {% endfor %}
{% include "_reducer_form.html" with context %}
dispatching…
{% include "_runs.html" with context %}