Add optional warmup step for training.

Enabled by setting `warmup_epochs`; the warmup epochs use the `warmup_lr` learning rate. The purpose is to bring the net into a reasonably stable state first, so that the gradients are smaller during the early stages of training and don't "accidentally" break the net.
2026-05-20 05:07:46 +00:00 · 2021-03-25 14:41:24 +01:00
parent bbe338b9fc
commit 876902070d
2 changed files with 57 additions and 6 deletions
@@ -20,12 +20,16 @@ Currently the following options are available:

 `epochs` - the number of weight update cycles (epochs) to train the network for. One such cycle is `epoch_size` positions. If not specified then the training will loop forever.

+`warmup_epochs` - the number of epochs to "pretrain" the net for, using the `warmup_lr` learning rate. Default: 0.
+
 `epoch_size` - the number of positions per epoch. Should be kept fairly low, as the current implementation loads all positions of an epoch into memory before processing them. The default is already high enough. The epoch size is not tied to validation or net serialization; there are more specific options for those. Default: 1000000

 `basedir` - the base directory for the paths. Default: "" (current directory)

 `lr` - initial learning rate. Default: 1.

+`warmup_lr` - the learning rate to use during warmup epochs. Default: 0.1.
+
 `use_draw_games_in_training` - either 0 or 1. If 1 then draws will be used in training too. Default: 1.

 `use_draw_games_in_validation` - either 0 or 1. If 1 then draws will be used in validation too. Default: 1.
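
As a rough illustration of how `warmup_epochs`, `warmup_lr` and `lr` relate, the sketch below picks a learning rate per epoch. It is not the trainer's actual code: the function name `learning_rate_for_epoch` is hypothetical, and it assumes that warmup epochs run first at a constant `warmup_lr` and that regular training then starts at `lr`. Whether warmup epochs count toward `epochs`, and how `lr` changes later in training, is not covered here.

```python
def learning_rate_for_epoch(epoch, lr=1.0, warmup_epochs=0, warmup_lr=0.1):
    """Return the learning rate for a 0-based epoch index.

    Assumption (not taken from the trainer source): the first
    `warmup_epochs` epochs use a constant `warmup_lr`, and every
    epoch after that uses `lr`. Defaults mirror the documented ones.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < warmup_epochs:
        return warmup_lr  # still warming up
    return lr  # regular training


# With two warmup epochs, the first two epochs use warmup_lr.
schedule = [learning_rate_for_epoch(e, warmup_epochs=2) for e in range(4)]
print(schedule)  # [0.1, 0.1, 1.0, 1.0]
```

With the default `warmup_epochs` of 0 the function always returns `lr`, which matches the option being optional.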