From 5332c75f21b2a8851b53a71d15ee937170ba368e Mon Sep 17 00:00:00 2001
From: mm
Date: Thu, 4 May 2023 10:22:17 +0000
Subject: [PATCH] details in readme

---
 README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5a4521a..22f153c 100644
--- a/README.md
+++ b/README.md
@@ -13,4 +13,7 @@
 A particularly useful addition to the dataset here:
 - airports: they (more/less) have unique codes, and this semantic understanding would be helpful for search engines.
 - aliases for cities: the dataset used for city data (lat/lon) contains a pretty exhaustive list of aliases for the cities. It would be good to generate examples of these with a distance of 0 and train the model on this knowledge.
-see `Makefile` for instructions.
+# notes
+- see `Makefile` for instructions.
+- Generating the data took about 13 minutes (for 3269 US cities) on 8 cores (Intel 9700K), yielding 2,720,278 records (combinations of cities).
+- Training on an Nvidia 3090 FE takes about an hour per epoch with an 80/20 train/test split. Batch size is 16, so there were 136,014 steps per epoch.
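
Review note: the steps-per-epoch figure in the new README lines can be checked with a few lines of arithmetic. The sketch below assumes the 80% share of the split is the training set and that the final partial batch is kept rather than dropped. The helper name `steps_per_epoch` is hypothetical and does not come from the repository.

```python
import math


def steps_per_epoch(n_records: int, train_fraction: float, batch_size: int) -> int:
    """Number of optimizer steps in one pass over the training share of the data.

    Assumes the last, partially filled batch is still processed (hence the ceil).
    """
    n_train = round(n_records * train_fraction)
    return math.ceil(n_train / batch_size)


# Figures quoted in the README notes: 2,720,278 records, batch size 16.
steps = steps_per_epoch(2_720_278, 0.8, 16)
print(steps)
```

Under these assumptions the result agrees with the 136,014 steps quoted in the notes, which only works out if 80% of the records are used for training.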