mirror of
https://github.com/opelly27/Stockfish.git
synced 2026-05-20 10:57:43 +00:00
Merge remote-tracking branch 'remotes/origin/master' into trainer
@@ -32,9 +32,182 @@ Additional options:
|
|||||||
|
|
||||||
- `blas=[yes/no]` - whether to use an external BLAS library. Default is `no`. Using an external BLAS library may significantly improve learning performance; by default, OpenBLAS is expected to be installed.
|
- `blas=[yes/no]` - whether to use an external BLAS library. Default is `no`. Using an external BLAS library may significantly improve learning performance; by default, OpenBLAS is expected to be installed.
|
||||||
|
|
||||||
## Training Guide
|
* #### Ponder
|
||||||
### Generating Training Data
|
Let Stockfish ponder its next move while the opponent is thinking.
|
||||||
To generate training data from the classical evaluation, run the `gensfen` command with the "Use NNUE" option set to "false". The given example shows generation in its simplest form; more commands are available.
|
|
||||||
|
* #### MultiPV
|
||||||
|
Output the N best lines (principal variations, PVs) when searching.
|
||||||
|
Leave at 1 for best performance.
|
||||||
|
|
||||||
|
* #### Use NNUE
|
||||||
|
Toggle between the NNUE and classical evaluation functions. If set to "true",
|
||||||
|
the network parameters must be available to load from file (see also EvalFile),
|
||||||
|
if they are not embedded in the binary.
|
||||||
|
|
||||||
|
* #### EvalFile
|
||||||
|
The name of the file of the NNUE evaluation parameters. Depending on the GUI the
|
||||||
|
filename might have to include the full path to the folder/directory that contains the file.
|
||||||
|
Other locations, such as the directory that contains the binary and the working directory,
|
||||||
|
are also searched.
|
||||||
|
|
||||||
|
* #### UCI_AnalyseMode
|
||||||
|
An option handled by your GUI.
|
||||||
|
|
||||||
|
* #### UCI_Chess960
|
||||||
|
An option handled by your GUI. If true, Stockfish will play Chess960.
|
||||||
|
|
||||||
|
* #### UCI_ShowWDL
|
||||||
|
If enabled, show approximate WDL statistics as part of the engine output.
|
||||||
|
These WDL numbers model expected game outcomes for a given evaluation and
|
||||||
|
game ply for engine self-play at fishtest LTC conditions (60+0.6s per game).
|
||||||
|
|
||||||
|
* #### UCI_LimitStrength
|
||||||
|
Enable weaker play aiming for an Elo rating as set by UCI_Elo. This option overrides Skill Level.
|
||||||
|
|
||||||
|
* #### UCI_Elo
|
||||||
|
If enabled by UCI_LimitStrength, aim for an engine strength of the given Elo.
|
||||||
|
This Elo rating has been calibrated at a time control of 60s+0.6s and anchored to CCRL 40/4.
|
||||||
|
|
||||||
|
* #### Skill Level
|
||||||
|
Lower the Skill Level in order to make Stockfish play weaker (see also UCI_LimitStrength).
|
||||||
|
Internally, MultiPV is enabled, and with a certain probability depending on the Skill Level a
|
||||||
|
weaker move will be played.
|
||||||
|
|
||||||
|
* #### SyzygyPath
|
||||||
|
Path to the folders/directories storing the Syzygy tablebase files. Multiple
|
||||||
|
directories are to be separated by ";" on Windows and by ":" on Unix-based
|
||||||
|
operating systems. Do not use spaces around the ";" or ":".
|
||||||
|
|
||||||
|
Example: `C:\tablebases\wdl345;C:\tablebases\wdl6;D:\tablebases\dtz345;D:\tablebases\dtz6`
|
||||||
|
|
||||||
|
It is recommended to store .rtbw files on an SSD. There is no loss in storing
|
||||||
|
the .rtbz files on a regular HD. It is recommended to verify all md5 checksums
|
||||||
|
of the downloaded tablebase files (`md5sum -c checksum.md5`) as corruption will
|
||||||
|
lead to engine crashes.
|
||||||
|
|
||||||
|
* #### SyzygyProbeDepth
|
||||||
|
Minimum remaining search depth for which a position is probed. Set this option
|
||||||
|
to a higher value to probe less aggressively if you experience too much slowdown
|
||||||
|
(in terms of nps) due to TB probing.
|
||||||
|
|
||||||
|
* #### Syzygy50MoveRule
|
||||||
|
Disable to let fifty-move rule draws detected by Syzygy tablebase probes count
|
||||||
|
as wins or losses. This is useful for ICCF correspondence games.
|
||||||
|
|
||||||
|
* #### SyzygyProbeLimit
|
||||||
|
Limit Syzygy tablebase probing to positions with at most this many pieces left
|
||||||
|
(including kings and pawns).
|
||||||
|
|
||||||
|
* #### Contempt
|
||||||
|
A positive value for contempt favors middle game positions and avoids draws,
|
||||||
|
effective for the classical evaluation only.
|
||||||
|
|
||||||
|
* #### Analysis Contempt
|
||||||
|
By default, contempt is set to prefer the side to move. Set this option to "White"
|
||||||
|
or "Black" to analyse with contempt for that side, or "Off" to disable contempt.
|
||||||
|
|
||||||
|
* #### Move Overhead
|
||||||
|
Assume a time delay of x ms due to network and GUI overheads. This is useful to
|
||||||
|
avoid losses on time in those cases.
|
||||||
|
|
||||||
|
* #### Slow Mover
|
||||||
|
Lower values will make Stockfish take less time in games, higher values will
|
||||||
|
make it think longer.
|
||||||
|
|
||||||
|
* #### nodestime
|
||||||
|
Tells the engine to use nodes searched instead of wall time to account for
|
||||||
|
elapsed time. Useful for engine testing.
|
||||||
|
|
||||||
|
* #### Clear Hash
|
||||||
|
Clear the hash table.
|
||||||
|
|
||||||
|
* #### Debug Log File
|
||||||
|
Write all communication to and from the engine into a text file.
|
||||||
|
|
||||||
|
## A note on classical and NNUE evaluation
|
||||||
|
|
||||||
|
Both approaches assign a value to a position that is used in alpha-beta (PVS) search
|
||||||
|
to find the best move. The classical evaluation computes this value as a function
|
||||||
|
of various chess concepts, handcrafted by experts, tested and tuned using fishtest.
|
||||||
|
The NNUE evaluation computes this value with a neural network based on basic
|
||||||
|
inputs (e.g. piece positions only). The network is optimized and trained
|
||||||
|
on the evaluations of millions of positions at moderate search depth.
|
||||||
|
|
||||||
|
The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
|
||||||
|
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
|
||||||
|
of the neural network need to be updated after a typical chess move.
|
||||||
|
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
|
||||||
|
tools to train and develop the NNUE networks.
|
||||||
|
|
||||||
|
On CPUs supporting modern vector instructions (avx2 and similar), the NNUE evaluation
|
||||||
|
results in stronger playing strength, even if the nodes per second computed by the engine
|
||||||
|
is somewhat lower (roughly 60% of nps is typical).
|
||||||
|
|
||||||
|
Note that the NNUE evaluation depends on the Stockfish binary and the network parameter
|
||||||
|
file (see EvalFile). Not every parameter file is compatible with a given Stockfish binary.
|
||||||
|
The default value of the EvalFile UCI option is the name of a network that is guaranteed
|
||||||
|
to be compatible with that binary.
|
||||||
|
|
||||||
|
## What to expect from Syzygybases?
|
||||||
|
|
||||||
|
If the engine is searching a position that is not in the tablebases (e.g.
|
||||||
|
a position with 8 pieces), it will access the tablebases during the search.
|
||||||
|
If the engine reports a very large score (typically 153.xx), this means
|
||||||
|
that it has found a winning line into a tablebase position.
|
||||||
|
|
||||||
|
If the engine is given a position to search that is in the tablebases, it
|
||||||
|
will use the tablebases at the beginning of the search to preselect all
|
||||||
|
good moves, i.e. all moves that preserve the win or preserve the draw while
|
||||||
|
taking into account the 50-move rule.
|
||||||
|
It will then perform a search only on those moves. **The engine will not move
|
||||||
|
immediately**, unless there is only a single good move. **The engine likely
|
||||||
|
will not report a mate score even if the position is known to be won.**
|
||||||
|
|
||||||
|
It is therefore clear that this behaviour is not identical to what one might
|
||||||
|
be used to with Nalimov tablebases. There are technical reasons for this
|
||||||
|
difference, the main technical reason being that Nalimov tablebases use the
|
||||||
|
DTM metric (distance-to-mate), while Syzygybases use a variation of the
|
||||||
|
DTZ metric (distance-to-zero, zero meaning any move that resets the 50-move
|
||||||
|
counter). This special metric is one of the reasons that Syzygybases are
|
||||||
|
more compact than Nalimov tablebases, while still storing all information
|
||||||
|
needed for optimal play and in addition being able to take into account
|
||||||
|
the 50-move rule.
|
||||||
|
|
||||||
|
## Large Pages
|
||||||
|
|
||||||
|
Stockfish supports large pages on Linux and Windows. Large pages make
|
||||||
|
the hash access more efficient, improving the engine speed, especially
|
||||||
|
on large hash sizes. Typical increases are 5..10% in terms of nps, but
|
||||||
|
speed increases up to 30% have been measured. The support is
|
||||||
|
automatic. Stockfish attempts to use large pages when available and
|
||||||
|
will fall back to regular memory allocation when this is not the case.
|
||||||
|
|
||||||
|
### Support on Linux
|
||||||
|
|
||||||
|
Large page support on Linux is obtained by the Linux kernel
|
||||||
|
transparent huge pages functionality. Typically, transparent huge pages
|
||||||
|
are already enabled and no configuration is needed.
|
||||||
|
|
||||||
|
### Support on Windows
|
||||||
|
|
||||||
|
The use of large pages requires "Lock Pages in Memory" privilege. See
|
||||||
|
[Enable the Lock Pages in Memory Option (Windows)](https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows)
|
||||||
|
on how to enable this privilege. Logout/login may be needed
|
||||||
|
afterwards. Due to memory fragmentation, it may not always be
|
||||||
|
possible to allocate large pages even when enabled. A reboot
|
||||||
|
might alleviate this problem. To determine whether large pages
|
||||||
|
are in use, see the engine log.
|
||||||
|
|
||||||
|
## Compiling Stockfish yourself from the sources
|
||||||
|
|
||||||
|
Stockfish has support for 32 or 64-bit CPUs, certain hardware
|
||||||
|
instructions, big-endian machines such as Power PC, and other platforms.
|
||||||
|
|
||||||
|
On Unix-like systems, it should be easy to compile Stockfish
|
||||||
|
directly from the source code with the included Makefile in the folder
|
||||||
|
`src`. In general it is recommended to run `make help` to see a list of make
|
||||||
|
targets with corresponding descriptions.
|
||||||
|
|
||||||
```
|
```
|
||||||
uci
|
uci
|
||||||
setoption name Use NNUE value false
|
setoption name Use NNUE value false
|
||||||
|
|||||||
+26
-18
@@ -56,7 +56,7 @@ namespace Eval {
|
|||||||
return UseNNUEMode::False;
|
return UseNNUEMode::False;
|
||||||
}
|
}
|
||||||
|
|
||||||
void init_NNUE() {
|
void NNUE::init() {
|
||||||
|
|
||||||
useNNUE = nnue_mode_from_option(Options["Use NNUE"]);
|
useNNUE = nnue_mode_from_option(Options["Use NNUE"]);
|
||||||
if (useNNUE == UseNNUEMode::False)
|
if (useNNUE == UseNNUEMode::False)
|
||||||
@@ -81,8 +81,8 @@ namespace Eval {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// verify_NNUE() verifies that the last net used was loaded successfully
|
/// NNUE::verify() verifies that the last net used was loaded successfully
|
||||||
void verify_NNUE() {
|
void NNUE::verify() {
|
||||||
|
|
||||||
string eval_file = string(Options["EvalFile"]);
|
string eval_file = string(Options["EvalFile"]);
|
||||||
|
|
||||||
@@ -984,24 +984,32 @@ make_v:
|
|||||||
/// evaluation of the position from the point of view of the side to move.
|
/// evaluation of the position from the point of view of the side to move.
|
||||||
|
|
||||||
Value Eval::evaluate(const Position& pos) {
|
Value Eval::evaluate(const Position& pos) {
|
||||||
if (useNNUE == UseNNUEMode::Pure) {
|
|
||||||
return NNUE::evaluate(pos);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Use classical eval if there is a large imbalance
|
Value v;
|
||||||
// If there is a moderate imbalance, use classical eval with probability (1/8),
|
|
||||||
// as derived from the node counter.
|
|
||||||
bool useClassical = abs(eg_value(pos.psq_score())) * 16 > NNUEThreshold1 * (16 + pos.rule50_count());
|
|
||||||
bool classical = (useNNUE == UseNNUEMode::False)
|
|
||||||
|| useClassical
|
|
||||||
|| (abs(eg_value(pos.psq_score())) > PawnValueMg / 4 && !(pos.this_thread()->nodes & 0xB));
|
|
||||||
Value v = classical ? Evaluation<NO_TRACE>(pos).value()
|
|
||||||
: NNUE::evaluate(pos);
|
|
||||||
|
|
||||||
if ( useClassical
|
if (Eval::useNNUE == UseNNUEMode::Pure) {
|
||||||
&& useNNUE != UseNNUEMode::False
|
|
||||||
&& abs(v) * 16 < NNUEThreshold2 * (16 + pos.rule50_count()))
|
|
||||||
v = NNUE::evaluate(pos);
|
v = NNUE::evaluate(pos);
|
||||||
|
}
|
||||||
|
else if (Eval::useNNUE == UseNNUEMode::False)
|
||||||
|
v = Evaluation<NO_TRACE>(pos).value();
|
||||||
|
else
|
||||||
|
{
|
||||||
|
// scale and shift NNUE for compatibility with search and classical evaluation
|
||||||
|
auto adjusted_NNUE = [&](){ return NNUE::evaluate(pos) * 5 / 4 + Tempo; };
|
||||||
|
|
||||||
|
// if there is PSQ imbalance use classical eval, with small probability if it is small
|
||||||
|
Value psq = Value(abs(eg_value(pos.psq_score())));
|
||||||
|
int r50 = 16 + pos.rule50_count();
|
||||||
|
bool largePsq = psq * 16 > (NNUEThreshold1 + pos.non_pawn_material() / 64) * r50;
|
||||||
|
bool classical = largePsq || (psq > PawnValueMg / 4 && !(pos.this_thread()->nodes & 0xB));
|
||||||
|
|
||||||
|
v = classical ? Evaluation<NO_TRACE>(pos).value() : adjusted_NNUE();
|
||||||
|
|
||||||
|
// if the classical eval is small and imbalance large, use NNUE nevertheless.
|
||||||
|
if ( largePsq
|
||||||
|
&& abs(v) * 16 < NNUEThreshold2 * r50)
|
||||||
|
v = adjusted_NNUE();
|
||||||
|
}
|
||||||
|
|
||||||
// Damp down the evaluation linearly when shuffling
|
// Damp down the evaluation linearly when shuffling
|
||||||
v = v * (100 - pos.rule50_count()) / 100;
|
v = v * (100 - pos.rule50_count()) / 100;
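The selection logic in the new `Eval::evaluate()` can be read in isolation. The following is a minimal sketch, not the engine's code: `EvalInputs`, `prefers_classical`, `adjusted_nnue`, `damp`, and the three constants are hypothetical stand-ins, and the constant values are placeholders because the diff does not show the real definitions. The mask `0xB` has three bits set, so `!(nodes & 0xB)` holds for one in eight node counts, which matches the "probability (1/8)" wording of the removed comment.

```cpp
#include <cstdint>

// Placeholder values, assumed for illustration only; the diff does not show
// the engine's definitions of NNUEThreshold1, NNUEThreshold2 or PawnValueMg.
constexpr int kThreshold1  = 550;
constexpr int kPawnValueMg = 126;

// Hypothetical bundle of the position properties the hunk reads.
struct EvalInputs {
    int psq;              // abs(eg_value(pos.psq_score()))
    int nonPawnMaterial;  // pos.non_pawn_material()
    int rule50;           // pos.rule50_count()
    std::uint64_t nodes;  // pos.this_thread()->nodes
};

// True when the hybrid branch would try the classical evaluation first.
bool prefers_classical(const EvalInputs& in) {
    const int  r50      = 16 + in.rule50;
    const bool largePsq = in.psq * 16 > (kThreshold1 + in.nonPawnMaterial / 64) * r50;
    // Small imbalance: use classical only when the three masked bits are all zero.
    return largePsq || (in.psq > kPawnValueMg / 4 && !(in.nodes & 0xB));
}

// Scale and shift of the raw network output; tempo is passed in explicitly.
int adjusted_nnue(int raw, int tempo) { return raw * 5 / 4 + tempo; }

// Linear damping applied to the final value as the fifty-move counter grows.
int damp(int v, int rule50) { return v * (100 - rule50) / 100; }
```

With these placeholder constants, a PSQ imbalance of 100 at node count 0 selects the classical path and at node count 1 does not, `adjusted_nnue(100, 28)` yields 153, and `damp(200, 50)` yields 100.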
|
||||||
|
|||||||
+3
-3
@@ -38,8 +38,6 @@ namespace Eval {
|
|||||||
|
|
||||||
extern UseNNUEMode useNNUE;
|
extern UseNNUEMode useNNUE;
|
||||||
extern std::string eval_file_loaded;
|
extern std::string eval_file_loaded;
|
||||||
void init_NNUE();
|
|
||||||
void verify_NNUE();
|
|
||||||
|
|
||||||
// The default net name MUST follow the format nn-[SHA256 first 12 digits].nnue
|
// The default net name MUST follow the format nn-[SHA256 first 12 digits].nnue
|
||||||
// for the build process (profile-build and fishtest) to work. Do not change the
|
// for the build process (profile-build and fishtest) to work. Do not change the
|
||||||
@@ -49,7 +47,9 @@ namespace Eval {
|
|||||||
namespace NNUE {
|
namespace NNUE {
|
||||||
|
|
||||||
Value evaluate(const Position& pos);
|
Value evaluate(const Position& pos);
|
||||||
bool load_eval(std::string streamName, std::istream& stream);
|
bool load_eval(std::string name, std::istream& stream);
|
||||||
|
void init();
|
||||||
|
void verify();
|
||||||
|
|
||||||
} // namespace NNUE
|
} // namespace NNUE
|
||||||
|
|
||||||
|
|||||||
@@ -1168,7 +1168,7 @@ namespace Learner
|
|||||||
<< " detect_draw_by_insufficient_mating_material = " << detect_draw_by_insufficient_mating_material << endl;
|
<< " detect_draw_by_insufficient_mating_material = " << detect_draw_by_insufficient_mating_material << endl;
|
||||||
|
|
||||||
// Show if the training data generator uses NNUE.
|
// Show if the training data generator uses NNUE.
|
||||||
Eval::verify_NNUE();
|
Eval::NNUE::verify();
|
||||||
|
|
||||||
Threads.main()->ponder = false;
|
Threads.main()->ponder = false;
|
||||||
|
|
||||||
|
|||||||
+4
-4
@@ -1841,7 +1841,7 @@ namespace Learner
|
|||||||
|
|
||||||
if (use_convert_plain)
|
if (use_convert_plain)
|
||||||
{
|
{
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
cout << "convert_plain.." << endl;
|
cout << "convert_plain.." << endl;
|
||||||
convert_plain(filenames, output_file_name);
|
convert_plain(filenames, output_file_name);
|
||||||
return;
|
return;
|
||||||
@@ -1849,7 +1849,7 @@ namespace Learner
|
|||||||
|
|
||||||
if (use_convert_bin)
|
if (use_convert_bin)
|
||||||
{
|
{
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
cout << "convert_bin.." << endl;
|
cout << "convert_bin.." << endl;
|
||||||
convert_bin(
|
convert_bin(
|
||||||
filenames,
|
filenames,
|
||||||
@@ -1870,7 +1870,7 @@ namespace Learner
|
|||||||
|
|
||||||
if (use_convert_bin_from_pgn_extract)
|
if (use_convert_bin_from_pgn_extract)
|
||||||
{
|
{
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
cout << "convert_bin_from_pgn-extract.." << endl;
|
cout << "convert_bin_from_pgn-extract.." << endl;
|
||||||
convert_bin_from_pgn_extract(
|
convert_bin_from_pgn_extract(
|
||||||
filenames,
|
filenames,
|
||||||
@@ -1938,7 +1938,7 @@ namespace Learner
|
|||||||
cout << "init.." << endl;
|
cout << "init.." << endl;
|
||||||
|
|
||||||
// Read evaluation function parameters
|
// Read evaluation function parameters
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
|
|
||||||
Threads.main()->ponder = false;
|
Threads.main()->ponder = false;
|
||||||
|
|
||||||
|
|||||||
@@ -12,7 +12,7 @@ void MultiThink::go_think()
|
|||||||
// Read evaluation function, etc.
|
// Read evaluation function, etc.
|
||||||
// In the case of the learn command, the value of the evaluation function may be corrected after reading the evaluation function, so
|
// In the case of the learn command, the value of the evaluation function may be corrected after reading the evaluation function, so
|
||||||
// Skip memory corruption check.
|
// Skip memory corruption check.
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
|
|
||||||
// Call the derived class's init().
|
// Call the derived class's init().
|
||||||
init();
|
init();
|
||||||
|
|||||||
+1
-1
@@ -45,7 +45,7 @@ int main(int argc, char* argv[]) {
|
|||||||
Endgames::init();
|
Endgames::init();
|
||||||
Threads.set(size_t(Options["Threads"]));
|
Threads.set(size_t(Options["Threads"]));
|
||||||
Search::clear(); // After threads are up
|
Search::clear(); // After threads are up
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
|
|
||||||
UCI::loop(argc, argv);
|
UCI::loop(argc, argv);
|
||||||
|
|
||||||
|
|||||||
+25
-32
@@ -357,27 +357,11 @@ void std_aligned_free(void* ptr) {
|
|||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
/// aligned_ttmem_alloc() will return suitably aligned memory, if possible using large pages.
|
/// aligned_large_pages_alloc() will return suitably aligned memory, if possible using large pages.
|
||||||
/// The returned pointer is the aligned one, while the mem argument is the one that needs
|
|
||||||
/// to be passed to free. With c++17 some of this functionality could be simplified.
|
|
||||||
|
|
||||||
#if defined(__linux__) && !defined(__ANDROID__)
|
#if defined(_WIN32)
|
||||||
|
|
||||||
void* aligned_ttmem_alloc(size_t allocSize, void*& mem) {
|
static void* aligned_large_pages_alloc_win(size_t allocSize) {
|
||||||
|
|
||||||
constexpr size_t alignment = 2 * 1024 * 1024; // assumed 2MB page sizes
|
|
||||||
size_t size = ((allocSize + alignment - 1) / alignment) * alignment; // multiple of alignment
|
|
||||||
if (posix_memalign(&mem, alignment, size))
|
|
||||||
mem = nullptr;
|
|
||||||
#if defined(MADV_HUGEPAGE)
|
|
||||||
madvise(mem, allocSize, MADV_HUGEPAGE);
|
|
||||||
#endif
|
|
||||||
return mem;
|
|
||||||
}
|
|
||||||
|
|
||||||
#elif defined(_WIN64)
|
|
||||||
|
|
||||||
static void* aligned_ttmem_alloc_large_pages(size_t allocSize) {
|
|
||||||
|
|
||||||
HANDLE hProcessToken { };
|
HANDLE hProcessToken { };
|
||||||
LUID luid { };
|
LUID luid { };
|
||||||
@@ -422,12 +406,13 @@ static void* aligned_ttmem_alloc_large_pages(size_t allocSize) {
|
|||||||
return mem;
|
return mem;
|
||||||
}
|
}
|
||||||
|
|
||||||
void* aligned_ttmem_alloc(size_t allocSize, void*& mem) {
|
void* aligned_large_pages_alloc(size_t allocSize) {
|
||||||
|
|
||||||
static bool firstCall = true;
|
static bool firstCall = true;
|
||||||
|
void* mem;
|
||||||
|
|
||||||
// Try to allocate large pages
|
// Try to allocate large pages
|
||||||
mem = aligned_ttmem_alloc_large_pages(allocSize);
|
mem = aligned_large_pages_alloc_win(allocSize);
|
||||||
|
|
||||||
// Suppress info strings on the first call. The first call occurs before 'uci'
|
// Suppress info strings on the first call. The first call occurs before 'uci'
|
||||||
// is received and in that case this output confuses some GUIs.
|
// is received and in that case this output confuses some GUIs.
|
||||||
@@ -449,23 +434,31 @@ void* aligned_ttmem_alloc(size_t allocSize, void*& mem) {
|
|||||||
|
|
||||||
#else
|
#else
|
||||||
|
|
||||||
void* aligned_ttmem_alloc(size_t allocSize, void*& mem) {
|
void* aligned_large_pages_alloc(size_t allocSize) {
|
||||||
|
|
||||||
constexpr size_t alignment = 64; // assumed cache line size
|
#if defined(__linux__)
|
||||||
size_t size = allocSize + alignment - 1; // allocate some extra space
|
constexpr size_t alignment = 2 * 1024 * 1024; // assumed 2MB page size
|
||||||
mem = malloc(size);
|
#else
|
||||||
void* ret = reinterpret_cast<void*>((uintptr_t(mem) + alignment - 1) & ~uintptr_t(alignment - 1));
|
constexpr size_t alignment = 4096; // assumed small page size
|
||||||
return ret;
|
#endif
|
||||||
|
|
||||||
|
// round up to multiples of alignment
|
||||||
|
size_t size = ((allocSize + alignment - 1) / alignment) * alignment;
|
||||||
|
void *mem = std_aligned_alloc(alignment, size);
|
||||||
|
#if defined(MADV_HUGEPAGE)
|
||||||
|
madvise(mem, size, MADV_HUGEPAGE);
|
||||||
|
#endif
|
||||||
|
return mem;
|
||||||
}
|
}
|
||||||
|
|
||||||
#endif
|
#endif
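The non-Windows branch rounds the requested size up to a whole number of pages before allocating. The arithmetic alone can be sketched as below; `round_up` is a hypothetical helper name, not a function from the patch.

```cpp
#include <cstddef>

// Round allocSize up to the next multiple of alignment.
// alignment must be non-zero; it need not be a power of two for this formula.
constexpr std::size_t round_up(std::size_t allocSize, std::size_t alignment) {
    return ((allocSize + alignment - 1) / alignment) * alignment;
}
```

For example, a request of 3 MiB with the assumed 2 MiB page size becomes 4 MiB, while a request that is already a multiple of the alignment is left unchanged. Passing the rounded `size` rather than `allocSize` to `madvise` means the advice covers the whole allocated range.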
|
||||||
|
|
||||||
|
|
||||||
/// aligned_ttmem_free() will free the previously allocated ttmem
|
/// aligned_large_pages_free() will free the previously allocated ttmem
|
||||||
|
|
||||||
#if defined(_WIN64)
|
#if defined(_WIN32)
|
||||||
|
|
||||||
void aligned_ttmem_free(void* mem) {
|
void aligned_large_pages_free(void* mem) {
|
||||||
|
|
||||||
if (mem && !VirtualFree(mem, 0, MEM_RELEASE))
|
if (mem && !VirtualFree(mem, 0, MEM_RELEASE))
|
||||||
{
|
{
|
||||||
@@ -478,8 +471,8 @@ void aligned_ttmem_free(void* mem) {
|
|||||||
|
|
||||||
#else
|
#else
|
||||||
|
|
||||||
void aligned_ttmem_free(void *mem) {
|
void aligned_large_pages_free(void *mem) {
|
||||||
free(mem);
|
std_aligned_free(mem);
|
||||||
}
|
}
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
+2
-2
@@ -39,8 +39,8 @@ void prefetch(void* addr);
|
|||||||
void start_logger(const std::string& fname);
|
void start_logger(const std::string& fname);
|
||||||
void* std_aligned_alloc(size_t alignment, size_t size);
|
void* std_aligned_alloc(size_t alignment, size_t size);
|
||||||
void std_aligned_free(void* ptr);
|
void std_aligned_free(void* ptr);
|
||||||
void* aligned_ttmem_alloc(size_t size, void*& mem);
|
void* aligned_large_pages_alloc(size_t size); // memory aligned by page size, min alignment: 4096 bytes
|
||||||
void aligned_ttmem_free(void* mem); // nop if mem == nullptr
|
void aligned_large_pages_free(void* mem); // nop if mem == nullptr
|
||||||
|
|
||||||
void dbg_hit_on(bool b);
|
void dbg_hit_on(bool b);
|
||||||
void dbg_hit_on(bool c, bool b);
|
void dbg_hit_on(bool c, bool b);
|
||||||
|
|||||||
@@ -30,7 +30,7 @@
|
|||||||
|
|
||||||
namespace Eval::NNUE {
|
namespace Eval::NNUE {
|
||||||
|
|
||||||
uint32_t kpp_board_index[PIECE_NB][COLOR_NB] = {
|
const uint32_t kpp_board_index[PIECE_NB][COLOR_NB] = {
|
||||||
// convention: W - us, B - them
|
// convention: W - us, B - them
|
||||||
// viewed from other side, W and B are reversed
|
// viewed from other side, W and B are reversed
|
||||||
{ PS_NONE, PS_NONE },
|
{ PS_NONE, PS_NONE },
|
||||||
@@ -52,7 +52,7 @@ namespace Eval::NNUE {
|
|||||||
};
|
};
|
||||||
|
|
||||||
// Input feature converter
|
// Input feature converter
|
||||||
AlignedPtr<FeatureTransformer> feature_transformer;
|
LargePagePtr<FeatureTransformer> feature_transformer;
|
||||||
|
|
||||||
// Evaluation function
|
// Evaluation function
|
||||||
AlignedPtr<Network> network;
|
AlignedPtr<Network> network;
|
||||||
@@ -79,14 +79,22 @@ namespace Eval::NNUE {
|
|||||||
std::memset(pointer.get(), 0, sizeof(T));
|
std::memset(pointer.get(), 0, sizeof(T));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
void Initialize(LargePagePtr<T>& pointer) {
|
||||||
|
|
||||||
|
static_assert(alignof(T) <= 4096, "aligned_large_pages_alloc() may fail for such a big alignment requirement of T");
|
||||||
|
pointer.reset(reinterpret_cast<T*>(aligned_large_pages_alloc(sizeof(T))));
|
||||||
|
std::memset(pointer.get(), 0, sizeof(T));
|
||||||
|
}
|
||||||
|
|
||||||
// Read evaluation function parameters
|
// Read evaluation function parameters
|
||||||
template <typename T>
|
template <typename T>
|
||||||
bool ReadParameters(std::istream& stream, const AlignedPtr<T>& pointer) {
|
bool ReadParameters(std::istream& stream, T& reference) {
|
||||||
|
|
||||||
std::uint32_t header;
|
std::uint32_t header;
|
||||||
header = read_little_endian<std::uint32_t>(stream);
|
header = read_little_endian<std::uint32_t>(stream);
|
||||||
if (!stream || header != T::GetHashValue()) return false;
|
if (!stream || header != T::GetHashValue()) return false;
|
||||||
return pointer->ReadParameters(stream);
|
return reference.ReadParameters(stream);
|
||||||
}
|
}
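Changing the parameter from `const AlignedPtr<T>&` to `T&` lets one template serve objects owned by either pointer type, since the caller simply dereferences whatever it holds. A minimal sketch of that idea follows; `DemoLayer`, `hash_value`, `read_parameters`, and `read_le32` are hypothetical names chosen for illustration and are not the engine's API.

```cpp
#include <cstdint>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Read a 32-bit little-endian value byte by byte, independent of host endianness.
inline std::uint32_t read_le32(std::istream& stream) {
    unsigned char bytes[4] = {0, 0, 0, 0};
    stream.read(reinterpret_cast<char*>(bytes), 4);
    return  std::uint32_t(bytes[0])
          | std::uint32_t(bytes[1]) << 8
          | std::uint32_t(bytes[2]) << 16
          | std::uint32_t(bytes[3]) << 24;
}

// Hypothetical component exposing the two members the template relies on.
struct DemoLayer {
    static constexpr std::uint32_t hash_value() { return 0x12345678u; }
    char payload = 0;
    bool read_parameters(std::istream& stream) {
        stream.read(&payload, 1);
        return !stream.fail();
    }
};

// Taking T& keeps the function independent of how the object is owned.
template <typename T>
bool read_parameters(std::istream& stream, T& reference) {
    const std::uint32_t header = read_le32(stream);
    if (!stream || header != T::hash_value()) return false;  // wrong or missing header
    return reference.read_parameters(stream);
}
```

A caller holding a `std::unique_ptr<DemoLayer>` with any deleter passes `*ptr`, and a caller with a stack object passes it directly; neither needs its own overload.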
|
||||||
|
|
||||||
// write evaluation function parameters
|
// write evaluation function parameters
|
||||||
@@ -97,6 +105,13 @@ namespace Eval::NNUE {
|
|||||||
return pointer->WriteParameters(stream);
|
return pointer->WriteParameters(stream);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
bool WriteParameters(std::ostream& stream, const LargePagePtr<T>& pointer) {
|
||||||
|
constexpr std::uint32_t header = T::GetHashValue();
|
||||||
|
stream.write(reinterpret_cast<const char*>(&header), sizeof(header));
|
||||||
|
return pointer->WriteParameters(stream);
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace Detail
|
} // namespace Detail
|
||||||
|
|
||||||
// Initialize the evaluation function parameters
|
// Initialize the evaluation function parameters
|
||||||
@@ -138,8 +153,8 @@ namespace Eval::NNUE {
|
|||||||
std::string architecture;
|
std::string architecture;
|
||||||
if (!ReadHeader(stream, &hash_value, &architecture)) return false;
|
if (!ReadHeader(stream, &hash_value, &architecture)) return false;
|
||||||
if (hash_value != kHashValue) return false;
|
if (hash_value != kHashValue) return false;
|
||||||
if (!Detail::ReadParameters(stream, feature_transformer)) return false;
|
if (!Detail::ReadParameters(stream, *feature_transformer)) return false;
|
||||||
if (!Detail::ReadParameters(stream, network)) return false;
|
if (!Detail::ReadParameters(stream, *network)) return false;
|
||||||
return stream && stream.peek() == std::ios::traits_type::eof();
|
return stream && stream.peek() == std::ios::traits_type::eof();
|
||||||
}
|
}
|
||||||
// write evaluation function parameters
|
// write evaluation function parameters
|
||||||
@@ -162,7 +177,7 @@ namespace Eval::NNUE {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Load eval, from a file stream or a memory stream
|
// Load eval, from a file stream or a memory stream
|
||||||
bool load_eval(std::string streamName, std::istream& stream) {
|
bool load_eval(std::string name, std::istream& stream) {
|
||||||
|
|
||||||
Initialize();
|
Initialize();
|
||||||
|
|
||||||
@@ -171,7 +186,7 @@ namespace Eval::NNUE {
|
|||||||
std::cout << "info string SkipLoadingEval set to true, Net not loaded!" << std::endl;
|
std::cout << "info string SkipLoadingEval set to true, Net not loaded!" << std::endl;
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
fileName = streamName;
|
fileName = name;
|
||||||
return ReadParameters(stream);
|
return ReadParameters(stream);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -40,11 +40,22 @@ namespace Eval::NNUE {
|
|||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
struct LargePageDeleter {
|
||||||
|
void operator()(T* ptr) const {
|
||||||
|
ptr->~T();
|
||||||
|
aligned_large_pages_free(ptr);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
template <typename T>
|
template <typename T>
|
||||||
using AlignedPtr = std::unique_ptr<T, AlignedDeleter<T>>;
|
using AlignedPtr = std::unique_ptr<T, AlignedDeleter<T>>;
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
using LargePagePtr = std::unique_ptr<T, LargePageDeleter<T>>;
|
||||||
|
|
||||||
// Input feature converter
|
// Input feature converter
|
||||||
extern AlignedPtr<FeatureTransformer> feature_transformer;
|
extern LargePagePtr<FeatureTransformer> feature_transformer;
|
||||||
|
|
||||||
// Evaluation function
|
// Evaluation function
|
||||||
extern AlignedPtr<Network> network;
|
extern AlignedPtr<Network> network;
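`LargePagePtr` pairs `std::unique_ptr` with a deleter that first runs the destructor and then returns the raw block to the matching free function. The pattern can be sketched on its own as below. Everything here is a stand-in: `demo_alloc` and `demo_free` merely wrap `malloc`/`free` in place of `aligned_large_pages_alloc`/`aligned_large_pages_free`, and `make_demo` uses placement new, whereas the patch's `Initialize` casts the pointer and zeroes the memory with `memset`.

```cpp
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical stand-ins for the large-page allocator pair.
inline void* demo_alloc(std::size_t size) { return std::malloc(size); }
inline void  demo_free(void* mem)         { std::free(mem); }

template <typename T>
struct DemoDeleter {
    void operator()(T* ptr) const {
        ptr->~T();       // storage did not come from new, so destroy explicitly
        demo_free(ptr);  // then release the raw block with the matching free
    }
};

template <typename T>
using DemoPtr = std::unique_ptr<T, DemoDeleter<T>>;

// Small type that counts live instances so the deleter's effect is observable.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Allocate raw storage, construct a T in it, and hand ownership to DemoPtr.
template <typename T>
DemoPtr<T> make_demo() {
    void* mem = demo_alloc(sizeof(T));
    if (!mem) throw std::bad_alloc();
    return DemoPtr<T>(new (mem) T());
}
```

Creating a `DemoPtr<Tracked>` via `make_demo<Tracked>()` raises `Tracked::alive` to 1, and letting the pointer go out of scope brings it back to 0 without any call to `delete`.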
|
||||||
|
|||||||
@@ -113,7 +113,7 @@ namespace Eval::NNUE {
|
|||||||
PS_END2 = 12 * SQUARE_NB + 1
|
PS_END2 = 12 * SQUARE_NB + 1
|
||||||
};
|
};
|
||||||
|
|
||||||
extern uint32_t kpp_board_index[PIECE_NB][COLOR_NB];
|
extern const uint32_t kpp_board_index[PIECE_NB][COLOR_NB];
|
||||||
|
|
||||||
// Type of input feature after conversion
|
// Type of input feature after conversion
|
||||||
using TransformedFeatureType = std::uint8_t;
|
using TransformedFeatureType = std::uint8_t;
|
||||||
|
|||||||
+10
-12
@@ -217,7 +217,7 @@ void MainThread::search() {
|
|||||||
Time.init(Limits, us, rootPos.game_ply());
|
Time.init(Limits, us, rootPos.game_ply());
|
||||||
TT.new_search();
|
TT.new_search();
|
||||||
|
|
||||||
Eval::verify_NNUE();
|
Eval::NNUE::verify();
|
||||||
|
|
||||||
if (rootMoves.empty())
|
if (rootMoves.empty())
|
||||||
{
|
{
|
||||||
@@ -454,10 +454,7 @@ void Thread::search() {
|
|||||||
++failedHighCnt;
|
++failedHighCnt;
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
|
||||||
++rootMoves[pvIdx].bestMoveCount;
|
|
||||||
break;
|
break;
|
||||||
}
|
|
||||||
|
|
||||||
delta += delta / 4 + 5;
|
delta += delta / 4 + 5;
|
||||||
|
|
||||||
@@ -1146,7 +1143,7 @@ moves_loop: // When in check, search starts from here
|
|||||||
// Step 16. Reduced depth search (LMR, ~200 Elo). If the move fails high it will be
|
// Step 16. Reduced depth search (LMR, ~200 Elo). If the move fails high it will be
|
||||||
// re-searched at full depth.
|
// re-searched at full depth.
|
||||||
if ( depth >= 3
|
if ( depth >= 3
|
||||||
&& moveCount > 1 + 2 * rootNode + 2 * (PvNode && abs(bestValue) < 2)
|
&& moveCount > 1 + 2 * rootNode
|
||||||
&& ( !captureOrPromotion
|
&& ( !captureOrPromotion
|
||||||
|| moveCountPruning
|
|| moveCountPruning
|
||||||
|| ss->staticEval + PieceValue[EG][pos.captured_piece()] <= alpha
|
|| ss->staticEval + PieceValue[EG][pos.captured_piece()] <= alpha
|
||||||
@@ -1213,14 +1210,14 @@ moves_loop: // When in check, search starts from here
|
|||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
// Increase reduction for captures/promotions if late move and at low depth
|
// Increase reduction for captures/promotions if late move and at low depth
|
||||||
if (depth < 8 && moveCount > 2)
|
if (depth < 8 && moveCount > 2)
|
||||||
r++;
|
r++;
|
||||||
|
|
||||||
// Unless giving check, this capture is likely bad
|
// Unless giving check, this capture is likely bad
|
||||||
if ( !givesCheck
|
if ( !givesCheck
|
||||||
&& ss->staticEval + PieceValue[EG][pos.captured_piece()] + 213 * depth <= alpha)
|
&& ss->staticEval + PieceValue[EG][pos.captured_piece()] + 213 * depth <= alpha)
|
||||||
r++;
|
r++;
|
||||||
}
|
}
|
||||||
|
|
||||||
Depth d = std::clamp(newDepth - r, 1, newDepth);
|
Depth d = std::clamp(newDepth - r, 1, newDepth);
|
||||||
@@ -1567,6 +1564,7 @@ moves_loop: // When in check, search starts from here
|
|||||||
[pos.moved_piece(move)]
|
[pos.moved_piece(move)]
|
||||||
[to_sq(move)];
|
[to_sq(move)];
|
||||||
|
|
||||||
|
// CounterMove based pruning
|
||||||
if ( !captureOrPromotion
|
if ( !captureOrPromotion
|
||||||
&& Search::prune_at_shallow_depth
|
&& Search::prune_at_shallow_depth
|
||||||
&& moveCount
|
&& moveCount
|
||||||
|
|||||||
@@ -73,7 +73,6 @@ struct RootMove {
|
|||||||
Value previousScore = -VALUE_INFINITE;
|
Value previousScore = -VALUE_INFINITE;
|
||||||
int selDepth = 0;
|
int selDepth = 0;
|
||||||
int tbRank = 0;
|
int tbRank = 0;
|
||||||
int bestMoveCount = 0;
|
|
||||||
Value tbScore;
|
Value tbScore;
|
||||||
std::vector<Move> pv;
|
std::vector<Move> pv;
|
||||||
};
|
};
|
||||||
|
|||||||
+10
-10
@@ -236,16 +236,16 @@ Thread* ThreadPool::get_best_thread() const {
|
|||||||
votes[th->rootMoves[0].pv[0]] +=
|
votes[th->rootMoves[0].pv[0]] +=
|
||||||
(th->rootMoves[0].score - minScore + 14) * int(th->completedDepth);
|
(th->rootMoves[0].score - minScore + 14) * int(th->completedDepth);
|
||||||
|
|
||||||
if (abs(bestThread->rootMoves[0].score) >= VALUE_TB_WIN_IN_MAX_PLY)
|
if (abs(bestThread->rootMoves[0].score) >= VALUE_TB_WIN_IN_MAX_PLY)
|
||||||
{
|
{
|
||||||
// Make sure we pick the shortest mate / TB conversion or stave off mate the longest
|
// Make sure we pick the shortest mate / TB conversion or stave off mate the longest
|
||||||
if (th->rootMoves[0].score > bestThread->rootMoves[0].score)
|
if (th->rootMoves[0].score > bestThread->rootMoves[0].score)
|
||||||
bestThread = th;
|
bestThread = th;
|
||||||
}
|
}
|
||||||
else if ( th->rootMoves[0].score >= VALUE_TB_WIN_IN_MAX_PLY
|
else if ( th->rootMoves[0].score >= VALUE_TB_WIN_IN_MAX_PLY
|
||||||
|| ( th->rootMoves[0].score > VALUE_TB_LOSS_IN_MAX_PLY
|
|| ( th->rootMoves[0].score > VALUE_TB_LOSS_IN_MAX_PLY
|
||||||
&& votes[th->rootMoves[0].pv[0]] > votes[bestThread->rootMoves[0].pv[0]]))
|
&& votes[th->rootMoves[0].pv[0]] > votes[bestThread->rootMoves[0].pv[0]]))
|
||||||
bestThread = th;
|
bestThread = th;
|
||||||
}
|
}
|
||||||
|
|
||||||
return bestThread;
|
return bestThread;
|
||||||
|
|||||||
+4
-3
@@ -67,11 +67,12 @@ void TranspositionTable::resize(size_t mbSize) {
|
|||||||
|
|
||||||
Threads.main()->wait_for_search_finished();
|
Threads.main()->wait_for_search_finished();
|
||||||
|
|
||||||
aligned_ttmem_free(mem);
|
aligned_large_pages_free(table);
|
||||||
|
|
||||||
clusterCount = mbSize * 1024 * 1024 / sizeof(Cluster);
|
clusterCount = mbSize * 1024 * 1024 / sizeof(Cluster);
|
||||||
table = static_cast<Cluster*>(aligned_ttmem_alloc(clusterCount * sizeof(Cluster), mem));
|
|
||||||
if (!mem)
|
table = static_cast<Cluster*>(aligned_large_pages_alloc(clusterCount * sizeof(Cluster)));
|
||||||
|
if (!table)
|
||||||
{
|
{
|
||||||
std::cerr << "Failed to allocate " << mbSize
|
std::cerr << "Failed to allocate " << mbSize
|
||||||
<< "MB for transposition table." << std::endl;
|
<< "MB for transposition table." << std::endl;
|
||||||
|
|||||||
@@ -73,7 +73,7 @@ class TranspositionTable {
|
|||||||
static_assert(sizeof(Cluster) == 32, "Unexpected Cluster size");
|
static_assert(sizeof(Cluster) == 32, "Unexpected Cluster size");
|
||||||
|
|
||||||
public:
|
public:
|
||||||
~TranspositionTable() { aligned_ttmem_free(mem); }
|
~TranspositionTable() { aligned_large_pages_free(table); }
|
||||||
void new_search() { generation8 += 8; } // Lower 3 bits are used by PV flag and Bound
|
void new_search() { generation8 += 8; } // Lower 3 bits are used by PV flag and Bound
|
||||||
TTEntry* probe(const Key key, bool& found) const;
|
TTEntry* probe(const Key key, bool& found) const;
|
||||||
int hashfull() const;
|
int hashfull() const;
|
||||||
@@ -91,7 +91,6 @@ private:
|
|||||||
|
|
||||||
size_t clusterCount;
|
size_t clusterCount;
|
||||||
Cluster* table;
|
Cluster* table;
|
||||||
void* mem;
|
|
||||||
uint8_t generation8; // Size must be not bigger than TTEntry::genBound8
|
uint8_t generation8; // Size must be not bigger than TTEntry::genBound8
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|||||||
+2
-2
@@ -47,7 +47,7 @@ const char* StartFEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
|
|||||||
void test_cmd(Position& pos, istringstream& is)
|
void test_cmd(Position& pos, istringstream& is)
|
||||||
{
|
{
|
||||||
// Initialize as it may be searched.
|
// Initialize as it may be searched.
|
||||||
Eval::init_NNUE();
|
Eval::NNUE::init();
|
||||||
|
|
||||||
std::string param;
|
std::string param;
|
||||||
is >> param;
|
is >> param;
|
||||||
@@ -100,7 +100,7 @@ namespace {
|
|||||||
Position p;
|
Position p;
|
||||||
p.set(pos.fen(), Options["UCI_Chess960"], &states->back(), Threads.main());
|
p.set(pos.fen(), Options["UCI_Chess960"], &states->back(), Threads.main());
|
||||||
|
|
||||||
Eval::verify_NNUE();
|
Eval::NNUE::verify();
|
||||||
|
|
||||||
sync_cout << "\n" << Eval::trace(p) << sync_endl;
|
sync_cout << "\n" << Eval::trace(p) << sync_endl;
|
||||||
}
|
}
|
||||||
|
|||||||
+2
-2
@@ -41,8 +41,8 @@ void on_hash_size(const Option& o) { TT.resize(size_t(o)); }
|
|||||||
void on_logger(const Option& o) { start_logger(o); }
|
void on_logger(const Option& o) { start_logger(o); }
|
||||||
void on_threads(const Option& o) { Threads.set(size_t(o)); }
|
void on_threads(const Option& o) { Threads.set(size_t(o)); }
|
||||||
void on_tb_path(const Option& o) { Tablebases::init(o); }
|
void on_tb_path(const Option& o) { Tablebases::init(o); }
|
||||||
void on_use_NNUE(const Option& ) { Eval::init_NNUE(); }
|
void on_use_NNUE(const Option& ) { Eval::NNUE::init(); }
|
||||||
void on_eval_file(const Option& ) { Eval::init_NNUE(); }
|
void on_eval_file(const Option& ) { Eval::NNUE::init(); }
|
||||||
void on_prune_at_shallow_depth(const Option& o) {
|
void on_prune_at_shallow_depth(const Option& o) {
|
||||||
Search::prune_at_shallow_depth = o;
|
Search::prune_at_shallow_depth = o;
|
||||||
}
|
}
|
||||||
|
|||||||