diff --git a/Makefile b/Makefile
index a748bc1..0afdc16 100644
--- a/Makefile
+++ b/Makefile
@@ -29,13 +29,10 @@ clean:
 	@rm -rf output/
 	@rm -rf checkpoints/
 
-compress: plots/progress_35845_sm.png plots/progress_680065_sm.png
+compress: plots/progress_136013_sm.png
 
-plots/progress_35845_sm.png: plots/progress_35845.png
-	@convert -resize 33% plots/progress_35845.png plots/progress_35845_sm.png
-
-plots/progress_680065_sm.png: plots/progress_680065.png
-	@convert -resize 33% plots/progress_680065.png plots/progress_680065_sm.png
+plots/progress_136013_sm.png: plots/progress_136013.png
+	@convert -resize 33% plots/progress_136013.png plots/progress_136013_sm.png
 
 install:
 	pip install -r requirements.txt
diff --git a/README.md b/README.md
index 5210675..f6e3f36 100644
--- a/README.md
+++ b/README.md
@@ -59,14 +59,11 @@ The approach demonstrated can be extended to other metrics or features beyond geodesic distance.
 After training, the model should be able to understand the similarity between cities based on their geodesic distances. You can inspect the evaluation plots generated by the `eval.py` script to see the improvement in similarity scores before and after training.
 
-After five epochs, the model no longer treats the terms as unrelated:
-![Evaluation plot](./plots/progress_35845_sm.png)
+After one epoch, we can see the model has learned to correlate our desired quantities:
 
-After ten epochs, we can see the model has learned to correlate our desired quantities:
-![Evaluation plot](./plots/progress_680065_sm.png)
+![Evaluation plot](./plots/progress_136013_sm.png)
 
-
-*The above plots are examples showing the relationship between geodesic distance and the similarity between the embedded vectors (1 = more similar), for 10,000 randomly selected pairs of US cities (re-sampled for each image).*
+*The above plot is an example showing the relationship between geodesic distance and the similarity between the embedded vectors (1 = more similar), for 10,000 randomly selected pairs of US cities (re-sampled for each image).*
 
 *Note the (vertical) "gap" we see in the image, corresponding to the size of the continental United States (~5,000 km)*
@@ -86,6 +83,6 @@ There are several potential improvements and extensions to the current model:
 # Notes
 
 - Generating the data took about 13 minutes (for 3269 US cities) on 8-cores (Intel 9700K), yielding 2,720,278 records (combinations of cities).
-- Training on an Nvidia 3090 FE takes about an hour per epoch with an 80/20 test/train split. Batch size is 16, so there were 136,014 steps per epoch
-- Evaluation on the above hardware took about 15 minutes for 20 epochs at 10k samples each.
+- Training on an Nvidia 3090 FE takes about an hour per epoch with an 80/20 train/test split and batch size 16, so there were 136,014 steps per epoch. With a batch size 16× larger (256), each epoch took about 14 minutes.
+- Evaluation (generating plots) on the above hardware took about 15 minutes for 20 epochs at 10k samples each.
 - **WARNING**: _It is unclear how the model performs on sentences, as it was trained and evaluated only on word-pairs._ See improvement (5) above.
diff --git a/plots/progress_136013_sm.png b/plots/progress_136013_sm.png
new file mode 100644
index 0000000..b70a651
Binary files /dev/null and b/plots/progress_136013_sm.png differ
diff --git a/plots/progress_35845_sm.png b/plots/progress_35845_sm.png
deleted file mode 100644
index ac9144b..0000000
Binary files a/plots/progress_35845_sm.png and /dev/null differ
diff --git a/plots/progress_680065_sm.png b/plots/progress_680065_sm.png
deleted file mode 100644
index 0427d25..0000000
Binary files a/plots/progress_680065_sm.png and /dev/null differ
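The evaluation plot referenced in this patch correlates two quantities: the geodesic distance between a pair of US cities and the cosine similarity of their embedded vectors (1 = more similar). The repository's actual data-generation and evaluation code is not part of this diff; the sketch below illustrates the two quantities under stated assumptions — a haversine formula as a spherical approximation to the true geodesic, and New York/Los Angeles coordinates chosen purely for illustration.

```python
import math

def geodesic_km(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance: a spherical approximation
    # (Earth radius ~6371 km) to the true geodesic distance.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cosine_similarity(u, v):
    # 1 = same direction, 0 = orthogonal, -1 = opposite.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# New York -> Los Angeles: roughly 3,900-4,000 km great-circle.
nyc_la = geodesic_km(40.7128, -74.0060, 34.0522, -118.2437)
print(f"NYC-LA geodesic distance: {nyc_la:.0f} km")
print(f"Identical vectors: {cosine_similarity([1.0, 2.0], [1.0, 2.0]):.3f}")
```

A trained model in this setup would map city names whose geodesic distance is small to vectors whose cosine similarity is near 1, which is the relationship the progress plots visualize.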