A second batch of code reorganization.

This commit is contained in:
Tomasz Sobczyk
2020-09-07 23:26:38 +02:00
committed by nodchip
parent 832c414b0d
commit 1482e5215a
6 changed files with 96 additions and 150 deletions
+28 -28
View File
@@ -27,30 +27,6 @@
// SGD looking only at the sign of the gradient. It requires less memory, but the accuracy is...
// #define SGD_UPDATE
// ----------------------
// Settings for learning
// ----------------------
// mini-batch size.
// Calculate the gradient by combining this number of phases.
// If you make it smaller, the number of update_weights() will increase and the convergence will be faster. The gradient is incorrect.
// If you increase it, the number of update_weights() decreases, so the convergence will be slow. The slope will come out accurately.
// I don't think you need to change this value in most cases.
#define LEARN_MINI_BATCH_SIZE (1000 * 1000 * 1)
// The number of phases to read from the file at one time. After reading this much, shuffle.
// It is better to have a certain size, but this number x 40 bytes x 3 times as much memory is consumed. 400MB*3 is consumed in the 10M phase.
// Must be a multiple of THREAD_BUFFER_SIZE(=10000).
#define LEARN_SFEN_READ_SIZE (1000 * 1000 * 10)
// Saving interval of evaluation function at learning. Save each time you learn this number of phases.
// Needless to say, the longer the saving interval, the shorter the learning time.
// Folder name is incremented for each save like 0/, 1/, 2/...
// By default, once every 1 billion phases.
#define LEARN_EVAL_SAVE_INTERVAL (1000000000ULL)
// ----------------------
// Select the objective function
@@ -79,10 +55,6 @@
// debug settings for learning
// ----------------------
// Reduce the output of rmse during learning to 1 for this number of times.
// rmse calculation is done in one thread, so it takes some time, so reducing the output is effective.
#define LEARN_RMSE_OUTPUT_INTERVAL 1
// ----------------------
// learning from zero vector
@@ -205,6 +177,34 @@ typedef float LearnFloatType;
namespace Learner
{
// ----------------------
// Settings for learning
// ----------------------
// mini-batch size.
// Calculate the gradient by combining this number of phases.
// If you make it smaller, the number of update_weights() will increase and the convergence will be faster. The gradient is incorrect.
// If you increase it, the number of update_weights() decreases, so the convergence will be slow. The slope will come out accurately.
// I don't think you need to change this value in most cases.
constexpr std::size_t LEARN_MINI_BATCH_SIZE = 1000 * 1000 * 1;
// The number of phases to read from the file at one time. After reading this much, shuffle.
// It is better to have a certain size, but this number x 40 bytes x 3 times as much memory is consumed. 400MB*3 is consumed in the 10M phase.
// Must be a multiple of THREAD_BUFFER_SIZE(=10000).
constexpr std::size_t LEARN_SFEN_READ_SIZE = 1000 * 1000 * 10;
// Saving interval of evaluation function at learning. Save each time you learn this number of phases.
// Needless to say, the longer the saving interval, the shorter the learning time.
// Folder name is incremented for each save like 0/, 1/, 2/...
// By default, once every 1 billion phases.
constexpr std::size_t LEARN_EVAL_SAVE_INTERVAL = 1000000000ULL;
// Reduce the output of rmse during learning to 1 for this number of times.
// rmse calculation is done in one thread, so it takes some time, so reducing the output is effective.
constexpr std::size_t LEARN_RMSE_OUTPUT_INTERVAL = 1;
//Structure in which PackedSfen and evaluation value are integrated
// If you write different contents for each option, it will be a problem when reusing the teacher game
// For the time being, write all the following members regardless of the options.