Compare commits

3 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| mm | f14481bbad | update plots to reflect epochs used | 2023-05-05 02:01:51 +00:00 |
| mm | 03313d3904 | bigger batch size | 2023-05-05 01:45:50 +00:00 |
| mm | 948c337ec2 | batchsize | 2023-05-05 00:54:43 +00:00 |
5 changed files with 11 additions and 11 deletions


```diff
@@ -29,13 +29,13 @@ clean:
 	@rm -rf output/
 	@rm -rf checkpoints/
-compress: plots/progress_35845_sm.png plots/progress_680065_sm.png
+compress: plots/progress_35845_sm.png plots/progress_136013_sm.png
 plots/progress_35845_sm.png: plots/progress_35845.png
 	@convert -resize 33% plots/progress_35845.png plots/progress_35845_sm.png
-plots/progress_680065_sm.png: plots/progress_680065.png
-	@convert -resize 33% plots/progress_680065.png plots/progress_680065_sm.png
+plots/progress_136013_sm.png: plots/progress_136013.png
+	@convert -resize 33% plots/progress_136013.png plots/progress_136013_sm.png
 install:
 	pip install -r requirements.txt
```
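The compress targets shell out to ImageMagick's `convert` to write 33%-scale `_sm` copies of the plots. As a rough illustration only (the Makefile actually uses ImageMagick; Pillow is not a stated repo dependency), the same resize could be done in Python:

```python
# Illustrative equivalent of `convert -resize 33% in.png out_sm.png`,
# using Pillow instead of ImageMagick.
from PIL import Image

def compress_plot(src: str, dst: str, scale: float = 0.33) -> None:
    """Write a scaled-down copy of the plot at src to dst."""
    with Image.open(src) as im:
        resized = im.resize((int(im.width * scale), int(im.height * scale)))
        resized.save(dst)

compress_plot("plots/progress_35845.png", "plots/progress_35845_sm.png")
```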


```diff
@@ -59,11 +59,11 @@ The approach demonstrated can be extended to other metrics or features beyond ge
 After training, the model should be able to understand the similarity between cities based on their geodesic distances.
 You can inspect the evaluation plots generated by the `eval.py` script to see the improvement in similarity scores before and after training.
-After five epochs, the model no longer treats the terms as unrelated:
+Early on in the first epoch, the model no longer treats the terms as totally unrelated:
 ![Evaluation plot](./plots/progress_35845_sm.png)
-After ten epochs, we can see the model has learned to correlate our desired quantities:
-![Evaluation plot](./plots/progress_680065_sm.png)
+After one full epoch, we can see the model has learned to correlate our desired quantities:
+![Evaluation plot](./plots/progress_136013_sm.png)
 *The above plots are examples showing the relationship between geodesic distance and the similarity between the embedded vectors (1 = more similar), for 10,000 randomly selected pairs of US cities (re-sampled for each image).*
```

Binary file not shown. (After: 230 KiB)

Binary file not shown. (Before: 224 KiB)


```diff
@@ -55,28 +55,28 @@ train_examples, val_examples = train_test_split(
 # validation examples can be something like templated sentences
 # that maintain the same distance as the cities (same context)
 # should probably add training examples like that too if needed
-batch_size = 16
+BATCH_SIZE = 16 * 16
 num_examples = len(train_examples)
-steps_per_epoch = num_examples // batch_size
+steps_per_epoch = num_examples // BATCH_SIZE
 print(f"\nHead of training data (size: {num_examples}):")
 print(train_data[:10], "\n")
 # Create DataLoaders for train and validation datasets
-train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
+train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=BATCH_SIZE)
 print("TRAINING")
 # Configure the training arguments
 training_args = {
     "output_path": "./output",
     # "evaluation_steps": steps_per_epoch,  # already evaluates at the end of each epoch
-    "epochs": 20,
+    "epochs": 10,
     "warmup_steps": 500,
     "optimizer_params": {"lr": 2e-5},
     # "weight_decay": 0,  # not sure if this helps but works fine without setting it.
     "scheduler": "WarmupLinear",
     "save_best_model": True,
-    "checkpoint_path": "./checkpoints_absmax_split",
+    "checkpoint_path": "./checkpoints",
     "checkpoint_save_steps": steps_per_epoch,
     "checkpoint_save_total_limit": 100,
 }
```
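The keys in `training_args` match the keyword arguments of `SentenceTransformer.fit()` in pre-3.0 sentence-transformers, so the dict is presumably unpacked straight into `fit()`. A minimal sketch under that assumption; the base model, loss, and example pair are stand-ins, not taken from the repo:

```python
# Hedged sketch: how a training_args dict like the one above would be
# consumed by sentence-transformers' fit() API. Base model, loss, and the
# city pair below are assumptions for illustration.
from sentence_transformers import InputExample, SentenceTransformer, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed base model
train_loss = losses.CosineSimilarityLoss(model)

# Illustrative stand-ins for the repo's city-pair training examples
train_examples = [InputExample(texts=["Portland, OR", "Eugene, OR"], label=0.9)]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=1)

training_args = {"output_path": "./output", "epochs": 10, "warmup_steps": 500,
                 "optimizer_params": {"lr": 2e-5}, "scheduler": "WarmupLinear"}

# fit() accepts these keys directly as keyword arguments
model.fit(train_objectives=[(train_dataloader, train_loss)], **training_args)
```

Note the arithmetic behind the "bigger batch size" commit: `BATCH_SIZE = 16 * 16` is an effective batch of 256 examples, which shrinks `steps_per_epoch` by the same factor; since `checkpoint_save_steps` is set to `steps_per_epoch`, a checkpoint is still written once per epoch.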