5 changed files with 11 additions and 11 deletions
--- a/6
+++ b/6
@@ -29,13 +29,13 @@ clean:
 	@rm -rf output/
 	@rm -rf checkpoints/
-compress: plots/progress_35845_sm.png plots/progress_136013_sm.png
+compress: plots/progress_35845_sm.png plots/progress_680065_sm.png
 plots/progress_35845_sm.png: plots/progress_35845.png
 	@convert -resize 33% plots/progress_35845.png plots/progress_35845_sm.png
-plots/progress_136013_sm.png: plots/progress_136013.png
+plots/progress_680065_sm.png: plots/progress_680065.png
-	@convert -resize 33% plots/progress_136013.png plots/progress_136013_sm.png
+	@convert -resize 33% plots/progress_680065.png plots/progress_680065_sm.png
 install:
 	pip install -r requirements.txt
--- a/README.md
+++ b/README.md
@@ -59,11 +59,11 @@ The approach demonstrated can be extended to other metrics or features beyond ge
 After training, the model should be able to understand the similarity between cities based on their geodesic distances.
 You can inspect the evaluation plots generated by the `eval.py` script to see the improvement in similarity scores before and after training.
-Early on in the first epoch, the model no longer treats the terms as totally unrelated:
+After five epochs, the model no longer treats the terms as unrelated:
 ![Evaluation plot](./plots/progress_35845_sm.png)
-After one full epoch, we can see the model has learned to correlate our desired quantities:
+After ten epochs, we can see the model has learned to correlate our desired quantities:
-![Evaluation plot](./plots/progress_136013_sm.png)
+![Evaluation plot](./plots/progress_680065_sm.png)
 *The above plots are examples showing the relationship between geodesic distance and the similarity between the embedded vectors (1 = more similar), for 10,000 randomly selected pairs of US cities (re-sampled for each image).*
--- a/plots/progress_136013_sm.png
+++ b/plots/progress_136013_sm.png
--- a/plots/progress_680065_sm.png
+++ b/plots/progress_680065_sm.png
--- a/train.py
+++ b/train.py
@@ -55,28 +55,28 @@ train_examples, val_examples = train_test_split(
 # validation examples can be something like templated sentences
 # that maintain the same distance as the cities (same context)
 # should probably add training examples like that too if needed
-BATCH_SIZE = 16 * 16
+batch_size = 16
 num_examples = len(train_examples)
-steps_per_epoch = num_examples // BATCH_SIZE
+steps_per_epoch = num_examples // batch_size
 print(f"\nHead of training data (size: {num_examples}):")
 print(train_data[:10], "\n")
 # Create DataLoaders for train and validation datasets
-train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=BATCH_SIZE)
+train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
 print("TRAINING")
 # Configure the training arguments
 training_args = {
    "output_path": "./output",
    # "evaluation_steps": steps_per_epoch,  # already evaluates at the end of each epoch
-    "epochs": 10,
+    "epochs": 20,
    "warmup_steps": 500,
    "optimizer_params": {"lr": 2e-5},
    # "weight_decay": 0,  # not sure if this helps but works fine without setting it.
    "scheduler": "WarmupLinear",
    "save_best_model": True,
-    "checkpoint_path": "./checkpoints",
+    "checkpoint_path": "./checkpoints_absmax_split",
    "checkpoint_save_steps": steps_per_epoch,
    "checkpoint_save_total_limit": 100,
 }