Dimension Reduction Lab

A Python project exploring various dimension reduction techniques using Prefect for workflow orchestration.

Overview

This project serves as an experimental sandbox for studying dimensionality reduction and embedding algorithms within a reproducible environment. The primary goal is to evaluate and compare different techniques (like UMAP, t-SNE, PaCMAP, and TriMap) while focusing on their stability characteristics, particularly in the context of changing or drifting data distributions. By leveraging Prefect's workflow management capabilities, we can systematically analyze how these algorithms perform across arbitrary datasets, track their behavior over time, and measure their sensitivity to various hyperparameters and data perturbations.
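As an illustration of the kind of stability check this sandbox targets, the sketch below embeds the same dataset twice with different random seeds and scores how far apart the two layouts are. It is illustrative only: the dataset, the choice of t-SNE, and the Procrustes disparity metric are assumptions for the example, not the repo's actual flows or metrics.

```python
# Illustrative stability check: embed the same data with two seeds
# and measure layout disagreement (not code from this repository).
from scipy.spatial import procrustes
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import TSNE

# A small swiss roll keeps the example fast.
X, _ = make_swiss_roll(n_samples=200, random_state=0)

# Same data, same hyperparameters, different seeds.
emb_a = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
emb_b = TSNE(n_components=2, perplexity=30, random_state=1).fit_transform(X)

# Procrustes disparity: 0 means the layouts agree up to
# rotation, scaling, and translation.
_, _, disparity = procrustes(emb_a, emb_b)
print(f"seed-to-seed disparity: {disparity:.3f}")
```

Running sweeps like this across algorithms, seeds, and perturbed datasets is exactly the kind of repetitive experiment that workflow orchestration makes tractable.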

Requirements

The project's key dependencies are pinned in requirements-frozen.txt.

Package Management

This project uses uv, a fast Python package installer and resolver written in Rust, as its package manager. The requirements-frozen.txt file was generated with uv to ensure reproducible dependencies.

To update dependencies:

uv pip compile pyproject.toml --all-extras -o requirements-frozen.txt

Keep --all-extras to pin every optional dependency group, or replace it with --extra &lt;name&gt; to pin a single one. See the pyproject.toml file for the available groups.

This project uses Prefect for workflow orchestration, chosen for its lightweight approach to launching experiments from a UI and its compatibility with single-node deployments.