using github actions, create a prerelease for the latest commit to master.
As such a development version will be available on github, in addition to the latest release.
closes https://github.com/official-stockfish/Stockfish/pull/4622
No functional change
Implemented LEB128 (de)compression for the feature transformer.
Reduces embedded network size from 70 MiB to 39 Mib.
The new nn-78bacfcee510.nnue corresponds to the master net compressed.
closes https://github.com/official-stockfish/Stockfish/pull/4617
No functional change
Use block sparse input for the first fully connected layer on architectures with at least SSSE3.
Depending on the CPU architecture, this yields a speedup of up to 10%, e.g.
```
Result of 100 runs of 'bench 16 1 13 default depth NNUE'
base (...ockfish-base) = 959345 +/- 7477
test (...ckfish-patch) = 1054340 +/- 9640
diff = +94995 +/- 3999
speedup = +0.0990
P(speedup > 0) = 1.0000
CPU: 8 x AMD Ryzen 7 5700U with Radeon Graphics
Hyperthreading: on
```
Passed STC:
https://tests.stockfishchess.org/tests/view/6485aa0965ffe077ca12409c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 8864 W: 2479 L: 2223 D: 4162
Ptnml(0-2): 13, 829, 2504, 1061, 25
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
closes https://github.com/official-stockfish/Stockfish/pull/4612
No functional change
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
7. max-epoch 800, end-lambda 0.7, skip 27, nn-0dd1cebea573 dataset
ep619 became nn-ea57bea57e32.nnue
For the last retraining (step7):
python3 easy_train.py
--experiment-name L1-1536-Re6-masterShuffled-ep639-sk27-Re5-leela-dfrc-v2-T77toT80small-Re4-masterShuffled-ep659-Re3-sameAs-Re2-leela96-dfrc99-16t-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-Re1-LeelaFarseer-new-T77T79 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 27 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 800 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re5-leela-dfrc-v2-T77toT80small-epoch639.nnue \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
For preparing the step6 leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack dataset:
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
test77-dec2021-16tb7p.no-db.min-mar2023.binpack \
test78-janfeb2022-16tb7p.no-db.min-mar2023.binpack \
test79-apr2022-16tb7p-filter-v6-dd.binpack \
test80-apr2022-16tb7p.no-db.min-mar2023.binpack \
/data/leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
The unfiltered Leela data used for the step6 dataset can be found at:
https://robotmoon.com/nnue-training-data
Local elo at 25k nodes per move:
nn-epoch619.nnue : 2.3 +/- 1.9
Passed STC:
https://tests.stockfishchess.org/tests/view/6480d43c6e6ce8d9fc6d7cc8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 40992 W: 11017 L: 10706 D: 19269
Ptnml(0-2): 113, 4400, 11170, 4689, 124
Passed LTC:
https://tests.stockfishchess.org/tests/view/648119ac6e6ce8d9fc6d8208
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129174 W: 35059 L: 34579 D: 59536
Ptnml(0-2): 66, 12548, 38868, 13050, 55
closes https://github.com/official-stockfish/Stockfish/pull/4611
bench: 2370027
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
```bash
python3 easy_train.py \
--experiment-name L1-1536-Re4-leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr-shuffled-sk28 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 28 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 960 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re3-nn-epoch439.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
```
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
```bash
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
filt-v6-dd-min/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
test80-apr2023-2tb7p.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack
```
T80 apr2023 data was converted using lc0-rescorer with ~2tb of tablebases and can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch559.nnue : 25.7 +/- 1.6
Passed STC:
https://tests.stockfishchess.org/tests/view/647cd3b87cf638f0f53f9cbb
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 59200 W: 16000 L: 15660 D: 27540
Ptnml(0-2): 159, 6488, 15996, 6768, 189
Passed LTC:
https://tests.stockfishchess.org/tests/view/647d58de726f6b400e4085d8
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 58800 W: 16002 L: 15657 D: 27141
Ptnml(0-2): 44, 5607, 17748, 5962, 39
closes https://github.com/official-stockfish/Stockfish/pull/4606
bench 2141197
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
For training from scratch:
python3 easy_train.py \
--experiment-name new-L1-1536-T77T79-filter-v6dd \
--training-dataset /data/T77T79-filter-v6-dd.min.binpack \
--max_epoch 400 \
--lambda 1.0 \
--start-from-engine-test-net False \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/Stockfish/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
Retraining commands were similar to each other. For the 3rd retraining run:
python3 easy_train.py \
--experiment-name L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-Re3-sameData \
--training-dataset /data/leela96-dfrc99-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--early-fen-skipping 28 \
--max_epoch 800 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-nn-epoch799.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
The T77+T79 data used is a subset of the master dataset available at:
https://robotmoon.com/nnue-training-data/
T60T70wIsRightFarseerT60T74T75T76.binpack is available at:
https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch759.nnue : 26.9 +/- 1.6
Failed STC
https://tests.stockfishchess.org/tests/view/64742485d29264e4cfa75f97
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 13728 W: 3588 L: 3829 D: 6311
Ptnml(0-2): 71, 1661, 3610, 1482, 40
Failing LTC
https://tests.stockfishchess.org/tests/view/64752d7c4a36543c4c9f3618
LLR: -1.91 (-2.94,2.94) <0.50,2.50>
Total: 35424 W: 9522 L: 9603 D: 16299
Ptnml(0-2): 24, 3579, 10585, 3502, 22
Passed VLTC 180+1.8
https://tests.stockfishchess.org/tests/view/64752df04a36543c4c9f3638
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 47616 W: 13174 L: 12863 D: 21579
Ptnml(0-2): 13, 4261, 14952, 4566, 16
Passed VLTC SMP 60+0.6 th 8
https://tests.stockfishchess.org/tests/view/647446ced29264e4cfa761e5
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 19942 W: 5694 L: 5451 D: 8797
Ptnml(0-2): 6, 1504, 6707, 1749, 5
closes https://github.com/official-stockfish/Stockfish/pull/4593
bench 2222567
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
LTC 60+0.6:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 217434 W: 58382 L: 58363 D: 100689
Ptnml(0-2): 66, 21075, 66419, 21088, 69
https://tests.stockfishchess.org/tests/live_elo/6465d077f3b1a4e86c318d6c
closes https://github.com/official-stockfish/Stockfish/pull/4576
bench: 3190961
Created by retraining nn-dabb1ed23026.nnue with a dataset composed of:
* The previous best dataset (nn-1ceb1a57d117.nnue dataset)
* Adding de-duplicated T80 data from feb2023 and the last 10 days of jan2023, filtered with v6-dd
Initially trained with the same options as the recent master net (nn-1ceb1a57d117.nnue).
Around epoch 890, training was manually stopped and max epoch increased to 1000.
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The same v6-dd filtering and binpack minimizer was used for preparing the recent nn-1ceb1a57d117.nnue dataset.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-jan2022-3of3-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack
```
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch919.nnue : 2.6 +/- 2.8
Passed STC vs. nn-dabb1ed23026.nnue
https://tests.stockfishchess.org/tests/view/644420df94ff3db5625f2af5
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 125960 W: 33898 L: 33464 D: 58598
Ptnml(0-2): 351, 13920, 34021, 14320, 368
Passed LTC vs. nn-1ceb1a57d117.nnue
https://tests.stockfishchess.org/tests/view/64469f128d30316529b3dc46
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 24544 W: 6817 L: 6542 D: 11185
Ptnml(0-2): 8, 2252, 7488, 2505, 19
closes https://github.com/official-stockfish/Stockfish/pull/4546
bench 3714847
* Extending v6 filtering to data from T77 dec2021, T79 may2022, and T80 nov2022
* Reducing the number of duplicate positions, prioritizing position scores seen later in time
* Using a binpack minimizer to reduce the overall data size
Trained the same way as the previous master net, aside from the dataset changes:
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The new v6-dd filtering reduces duplicate positions by iterating over hourly data files within leela test runs, starting with the most recent, then keeping positions the first time they're seen and ignoring positions that are seen again. This ordering was done with the assumption that position scores seen later in time are generally more accurate than scores seen earlier in the test run. Positions are de-duplicated based on piece orientations, the first token in fen strings.
The binpack minimizer was run with default settings after first merging monthly data into single binpacks.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack
```
The code for v6-dd filtering is available along with training data preparation scripts at:
https://github.com/linrock/nnue-data
Links for downloading the training data components:
https://robotmoon.com/nnue-training-data/
The binpack minimizer is from: #4447
Local elo at 25k nodes per move:
nn-epoch859.nnue : 1.2 +/- 2.6
Passed STC:
https://tests.stockfishchess.org/tests/view/643aad7db08900ff1bc5a832
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 565040 W: 150225 L: 149162 D: 265653
Ptnml(0-2): 1875, 62137, 153229, 63608, 1671
Passed LTC:
https://tests.stockfishchess.org/tests/view/643ecf2fa43cf30e719d2042
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 1014840 W: 274645 L: 272456 D: 467739
Ptnml(0-2): 515, 98565, 306970, 100956, 414
closes https://github.com/official-stockfish/Stockfish/pull/4545
bench 3476305
This idea is a result of my second condition combination tuning for reductions:
https://tests.stockfishchess.org/tests/view/643ed5573806eca398f06d61
There were used two parameters per combination: one for the 'sign' of the first and the second condition in a combination. Values >= 50 indicate using a condition directly and values <= -50 means use the negation of a condition.
Each condition pair (X,Y) had two occurances dependent of the order of the two conditions:
- if X < Y the parameters used for more reduction
- if X > Y the parameters used for less reduction
- if X = Y then only one condition is present and A[X][X][0]/A[X][X][1] stands for using more/less reduction for only this condition.
The parameter pair A[7][2][0] (value = -94.70) and A[7][2][1] (value = 93.60) was one of the strongest signals with values near 100/-100.
Here condition nr. 7 was '(ss+1)->cutoffCnt > 3' and condition nr. 2 'move == ttMove'. For condition nr. 7 the negation is used because A[7][2][0] is negative.
This translates finally to less reduction (because 7 > 2) for tt moves if child cutoffs <= 3.
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 65728 W: 17704 L: 17358 D: 30666
Ptnml(0-2): 184, 7092, 18008, 7354, 226
https://tests.stockfishchess.org/tests/view/643ff767ef2529086a7ed042
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 139200 W: 37776 L: 37282 D: 64142
Ptnml(0-2): 58, 13241, 42509, 13733, 59
https://tests.stockfishchess.org/tests/view/6440bfa9ef2529086a7edbc7
closes https://github.com/official-stockfish/Stockfish/pull/4538
Bench: 3548023
This patch is a simplification of my recent elo gainer.
Logically the Elo gainer didn't make much sense and this patch simplifies it into smth more logical.
Instead of assigning negative bonuses to all non-first moves that enter PV nodes
we assign positive bonuses in full depth search after LMR only for moves that
will result in a fail high - thus not assigning positive bonuses
for moves that will go to pv search - so doing "almost" the same as we do in master now for them.
Logic differs for some other moves, though, but this removes some lines of code.
Passed STC:
https://tests.stockfishchess.org/tests/view/642cf5cf77ff3301150dc5ec
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 409320 W: 109124 L: 109308 D: 190888
Ptnml(0-2): 1149, 45385, 111751, 45251, 1124
Passed LTC:
https://tests.stockfishchess.org/tests/view/642fe75d20eb941419bde200
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 260336 W: 70280 L: 70303 D: 119753
Ptnml(0-2): 99, 25236, 79528, 25199, 106
closes https://github.com/official-stockfish/Stockfish/pull/4522
Bench: 4286815
Since bestValue becomes value and beta - alpha is always non-negative,
extraReduction is always false, hence it has no effect.
This patch includes small changes to improve readability.
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
The current implementation generates warnings on MSVC. However, we have
no real use cases for double-typed UCI option values now. Also parameter
tuning only accepts following three types:
int, Value, Score
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang, and yields good performance.
As before, currently only linux is supported.
closes https://github.com/official-stockfish/Stockfish/pull/4478
No functional change
Made advanced Windows API calls (those from Advapi32.dll) dynamically
linked to avoid link errors when compiling using
Intel icx compiler for Windows.
https://github.com/official-stockfish/Stockfish/pull/4467
No functional change
this makes it easier to compile under MSVC, even though we recommend gcc/clang for production compiles at the moment.
In Win32 API, by default, most null-terminated character strings arguments are of wchar_t (UTF16, formerly UCS16-LE) type, i.e. 2 bytes (at least) per character. So, src/misc.cpp should have proper type. Respectively, for src/syzygy/tbprobe.cpp, in Widows, file paths should be std::wstring rather than std::string. However, this requires a very big number of changes, since the config files are also keeping the 8-bit-per-character std::string strings. Therefore, just one change of using 8-byte-per-character CreateFileA make it compile under MSVC.
closes https://github.com/official-stockfish/Stockfish/pull/4438
No functional change
Created by retraining the master net with these modifications:
* New filtering methods for existing data from T80 sep+oct2022, T79 apr2022, T78 jun+jul+aug+sep2022, T77 dec2021
* Adding new filtered data from T80 aug2022 and T78 apr+may2022
* Increasing early-fen-skipping from 28 to 30
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3-sk30 \
--training-dataset /data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--max_epoch 900 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The v3 filtering used for data from T77dec 2021 differs from v2 filtering in that:
* To improve binpack compression, positions after ply 28 were skipped during training by setting position scores to VALUE_NONE (32002) instead of removing them entirely
* All early-game positions with ply <= 28 were removed to maximize binpack compression
* Only bestmove captures at d6pv2 search were skipped, not 2nd bestmove captures
* Binpack compression was repaired for the remaining positions by effectively replacing bestmoves with "played moves" to maintain contiguous sequences of positions in the training game data
After improving binpack compression, The T77 dec2021 data size was reduced from 95G to 19G.
The v6 filtering used for data from T80augsepoctT79aprT78aprtosep 2022 differs from v2 in that:
* All positions with only one legal move were removed
* Tighter score differences at d6pv2 search were used to remove more positions with only one good move than before
* d6pv2 search was not used to remove positions where the best 2 moves were captures
```
python3 interleave_binpacks.py \
nn-547-dataset/leela96-eval-filt-v2.binpack \
nn-547-dataset/dfrc99-eval-filt-v2.binpack \
nn-547-dataset/test80-nov2022-12tb7p-eval-filt-v2-d6.binpack \
nn-547-dataset/T79-may2022-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-nov2021-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.binpack \
filt-v6/test80-oct2022-16tb7p-filter-v6.binpack \
filt-v6/test79-apr2022-16tb7p-filter-v6.binpack \
filt-v6/test78-aprmay2022-16tb7p-filter-v6.binpack \
filt-v6/test78-junjulaug2022-16tb7p-filter-v6.binpack \
filt-v6/test78-sep2022-16tb7p-filter-v6.binpack \
filt-v3/test77-dec2021-16tb7p-filt-v3.binpack \
/data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack
```
The code for the new data filtering methods is available at:
https://github.com/linrock/Stockfish/tree/nnue-data-v3/nnue-data
The code for giving hexword names to .nnue files is at:
https://github.com/linrock/nnue-namer
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch779.nnue : 0.6 +/- 3.1
Passed STC:
https://tests.stockfishchess.org/tests/view/64212412db43ab2ba6f8efb0
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 82256 W: 22185 L: 21809 D: 38262
Ptnml(0-2): 286, 9065, 22067, 9407, 303
Passed LTC:
https://tests.stockfishchess.org/tests/view/64223726db43ab2ba6f91d6c
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 30840 W: 8437 L: 8149 D: 14254
Ptnml(0-2): 14, 2891, 9323, 3177, 15
closes https://github.com/official-stockfish/Stockfish/pull/4465
bench 5101970
This patch simplifies initialization of statScore to "always set it up to 0" instead of setting it up to 0 two plies deeper.
Reason for why it was done in previous way partially was because of LMR usage of previous statScore which was simplified long time ago so it makes sense to make in more simple there.
Passed STC:
https://tests.stockfishchess.org/tests/view/641a86d1db43ab2ba6f7b31d
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 115648 W: 30895 L: 30764 D: 53989
Ptnml(0-2): 368, 12741, 31473, 12876, 366
Passed LTC:
https://tests.stockfishchess.org/tests/view/641b1c31db43ab2ba6f7d17a
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 175576 W: 47122 L: 47062 D: 81392
Ptnml(0-2): 91, 17077, 53390, 17141, 89
closes https://github.com/official-stockfish/Stockfish/pull/4460
bench 5081969
Patch analyzes field after SEE exchanges concluded with a recapture by
the opponent:
if opponent Queen/Rook/King results under attack after the exchanges, we
consider the move sharp and don't prune it.
Important note:
By accident I forgot to adjust 'occupied' when the king takes part in
the exchanges.
As result of this a move is considered sharp too, when opponent king
apparently can evade check by recapturing.
Surprisingly this seems contribute to patch's strength.
STC:
https://tests.stockfishchess.org/tests/view/640b16132644b62c33947397
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 116400 W: 31239 L: 30817 D: 54344
Ptnml(0-2): 350, 12742, 31618, 13116, 374
LTC:
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 177600 W: 47988 L: 47421 D: 82191
Ptnml(0-2): 62, 16905, 54317, 17436, 80
closes https://github.com/official-stockfish/Stockfish/pull/4453
bench: 5012145
Since st is a member of position we don't need to pass it separately as
parameter.
While being there also remove some line in pos_is_ok, where
a copy of StateInfo was made by using default copy constructor and
then verified it's correctedness by doing a memcmp.
There is no point in doing that.
Passed non-regression test
https://tests.stockfishchess.org/tests/view/64098d562644b62c33942b35
LLR: 3.24 (-2.94,2.94) <-1.75,0.25>
Total: 548960 W: 145834 L: 146134 D: 256992
Ptnml(0-2): 1617, 57652, 156261, 57314, 1636
closes https://github.com/official-stockfish/Stockfish/pull/4444
No functional change
Keep incbin.h with the same mode as the other source files.
A mode diff might show up when working with patch files or sending the source code between devices.
This patch should fix such behaviour.
closes https://github.com/official-stockfish/Stockfish/pull/4442
No functional change
in a some of cases movepicker returned some moves more than once which lead
to them being searched more than once. This bug was possible because of how
we use queen promotions - they are generated as a captures but are not
included in position function which checks if move is a capture. Thus if
any refutation (killer or countermove) was a queen promotion it was
searched twice - once as a capture and one as a refutation.
This patch affects various things, namely stats assignments for queen promotions
and other moves if best move is queen promotion,
also some heuristics in search and qsearch.
With this patch every queen promotion is now considered a capture.
After this patch number of found duplicated moves is 0 during normal 13 depth bench run.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f77e01e74a12625bcd87d7
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 80920 W: 21455 L: 21289 D: 38176
Ptnml(0-2): 198, 8839, 22241, 8963, 219
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f7e020e74a12625bcd9a76
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 89712 W: 23674 L: 23533 D: 42505
Ptnml(0-2): 24, 8737, 27202, 8860, 33
closes https://github.com/official-stockfish/Stockfish/pull/4405
bench 4681731
Call the recently added hint function for NNUE accumulator update after a failed probcut search.
In this case we already searched at least some captures and tt move which, however, is not sufficient for a cutoff.
So it seems we have a greater chance that the full search will also have no cutoff and hence all moves have to be searched.
STC: https://tests.stockfishchess.org/tests/view/63fa74a4e74a12625bce1823
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 70096 W: 18770 L: 18423 D: 32903
Ptnml(0-2): 191, 7342, 19654, 7651, 210
To be sure that we have no heavy interaction retest on top of #4410.
Rebased STC: https://tests.stockfishchess.org/tests/view/63fb2f62e74a12625bce3b03
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 137688 W: 36790 L: 36349 D: 64549
Ptnml(0-2): 397, 14373, 38919, 14702, 453
closes https://github.com/official-stockfish/Stockfish/pull/4411
No functional change
Credits to Stefan Geschwentner (locutus2) showing that the hint
is useful on PvNodes. In contrast to his test,
this version avoids to use the hint when in check.
I believe checking positions aren't good candidates for the hint
because:
- evasion moves are rather few, so a checking pos. has much less childs
than a normal position
- if the king has to move the NNUE eval can't use incremental updates,
so the child nodes have to do a full refresh anyway.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f9c5b1e74a12625bcdf585
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 124472 W: 33268 L: 32846 D: 58358
Ptnml(0-2): 350, 12986, 35170, 13352, 378
closes https://github.com/official-stockfish/Stockfish/pull/4410
no functional change
Params found with the nevergrad TBPSA optimizer via nevergrad4sf modified to:
* use SPRT LLR with fishtest STC elo gainer bounds [0, 2] as the objective function
* increase the game batch size after each new optimal point is found
The params were the optimal point after TBPSA iteration 7 and 160 nevergrad evaluations with:
* initial batch size of 96 games per evaluation
* batch size increase of 64 games after each iteration
* a budget of 512 evaluations
* TC: fixed 1.5 million nodes per move, no time limit
nevergrad4sf enables optimizing stockfish params with TBPSA:
https://github.com/vondele/nevergrad4sf
Using pentanomial game results with smaller game batch sizes was inspired by:
Use of SPRT LLR calculated from pentanomial game results as the objective function was an experiment at maximizing the information from game batches to reduce the computational cost for TBPSA to converge on good parameters.
For the exact code used to find the params:
https://github.com/linrock/tuning-fork
Passed STC:
https://tests.stockfishchess.org/tests/view/63f4ef5ee74a12625bcd114a
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 66552 W: 17736 L: 17390 D: 31426
Ptnml(0-2): 164, 7229, 18166, 7531, 186
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f56028e74a12625bcd2550
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 71264 W: 19150 L: 18787 D: 33327
Ptnml(0-2): 23, 6728, 21771, 7083, 27
closes https://github.com/official-stockfish/Stockfish/pull/4401
bench 3687580
This patch introduces `hint_common_parent_position()` to signal that potentially several child nodes will require an NNUE eval. By populating explicitly the accumulator, these subsequent evaluations can be performed more efficiently.
This was based on the observation that calculating the evaluation in an excluded move position yielded a significant Elo gain, even though the evaluation itself was already available (work by pb00067).
Sopel wrote the code to perform just the accumulator update. This PR is based on cleaned up code that
passed STC:
https://tests.stockfishchess.org/tests/view/63f62f9be74a12625bcd4aa0
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
and in an the earlier (equivalent) version
passed STC:
https://tests.stockfishchess.org/tests/view/63f3c3fee74a12625bcce2a6
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 47552 W: 12786 L: 12467 D: 22299
Ptnml(0-2): 120, 5107, 12997, 5438, 114
passed LTC:
https://tests.stockfishchess.org/tests/view/63f45cc2e74a12625bccfa63
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
closes https://github.com/official-stockfish/Stockfish/pull/4402
Bench: 3726250
The sdot instruction computes (and accumulates) a signed dot product,
which is quite handy for Stockfish's NNUE code. The instruction is
optional for Armv8.2 and Armv8.3, and mandatory for Armv8.4 and above.
The commit adds a new 'arm-dotprod' architecture with enabled dot
product support. It also enables dot product support for the existing
'apple-silicon' architecture, which is at least Armv8.5.
The following local speed test was performed on an Apple M1 with
ARCH=apple-silicon. I had to remove CPU pinning from the benchmark
script. However, the results were still consistent: Checking both
binaries against themselves reported a speedup of +0.0000 and +0.0005,
respectively.
```
Result of 100 runs
==================
base (...ish.037ef3e1) = 1917997 +/- 7152
test (...fish.dotprod) = 2159682 +/- 9066
diff = +241684 +/- 2923
speedup = +0.1260
P(speedup > 0) = 1.0000
CPU: 10 x arm
Hyperthreading: off
```
Fixes#4193
closes https://github.com/official-stockfish/Stockfish/pull/4400
No functional change
Created by retraining the master net on a dataset composed of:
* Most of the previous best dataset filtered to remove positions likely having only one good move
* Adding training data from Leela T77 dec2021 rescored with 16tb of 7-piece tablebases
Trained with end lambda 0.7 and max epoch 900. Positions with ply <= 28 were removed from most of the previous best dataset before training began. A new nnue-pytorch trainer param for skipping early plies was used to skip plies <= 24 in the unfiltered and additional Leela T77 parts of the dataset.
```
python easy_train.py \
--experiment-name leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb-lambda7-sk24 \
--training-dataset /data/leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/easy-train-early-fen-skipping \
--early-fen-skipping 24 \
--gpus "0," \
--start-from-engine-test-net True \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 900
```
The depth6 multipv2 search filtering method is the same as the one used for filtering recent best datasets, with a lower eval difference threshold to remove slightly more positions than before. These parts of the dataset were filtered:
* 96% of T60T70wIsRightFarseerT60T74T75T76.binpack
* 99% of dfrc_n5000.binpack
* T80 oct + nov 2022 data, no positions with castling flags, rescored with ~600gb 7p tablebases
* T79 apr + may 2022 data, rescored with 12tb 7p tablebases
* T60 nov + dec 2021 data, rescored with 12tb 7p tablebases
These parts of the dataset were not filtered. Positions with ply <= 24 were skipped during training:
* T78 aug + sep 2022 data, rescored with 12tb 7p tablebases
* 84% of T77 dec 2021 data, rescored with 16tb 7p tablebases
The code and exact evaluation thresholds used for data filtering can be found at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff-t2/src/filter
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch859.nnue : 3.5 +/ 1.2
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
https://tests.stockfishchess.org/tests/view/63dfeefc73223e7f52ad769f
Total: 219744 W: 58572 L: 58002 D: 103170
Ptnml(0-2): 609, 24446, 59284, 24832, 701
Passed LTC:
https://tests.stockfishchess.org/tests/view/63e268fc73223e7f52ade7b6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 91256 W: 24528 L: 24121 D: 42607
Ptnml(0-2): 48, 8863, 27390, 9288, 39
closes https://github.com/official-stockfish/Stockfish/pull/4387
bench 3841998
This patch is a simplification / code normalisation in qsearch.
Adds steps in comments the same way we have in search;
Makes a separate "pruning" stage instead of heuristics randomly being spread over qsearch code;
Reorders pruning heuristics from least taxing ones to more taxing ones;
Removes repeated check for best value not being mated, instead uses 1 check - thus removes some lines of code.
Moves prefetch and move setup after pruning - makes no sense to do them if move will actually get pruned.
Passed non-regression test:
https://tests.stockfishchess.org/tests/view/63dd2c5ff9a50a69252c1413
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 113504 W: 29898 L: 29770 D: 53836
Ptnml(0-2): 287, 11861, 32327, 11991, 286
https://github.com/official-stockfish/Stockfish/pull/4382
Non-functional change.
PR consists of 2 improvements on nodes with excludeMove:
1. Remove xoring the posKey with make_key(excludedMove)
Since we never call tte->save anymore with excludedMove,
the unique left purpose of the xoring was to avoid a TT hit.
Nevertheless on a normal bench run this produced ~25 false positives
(key collisions)
To avoid that we now forbid early TT cutoff's with excludeMove
Maybe these accesses to TT with xored key caused useless misses
in the CPU caches (L1, L2 ...)
Now doing the probe with the same key as the enclosing search does,
should hit the CPU cache.
2. Don't probe Tablebases with excludedMove.
This can't be tested on fishtest, but it's obvious that
tablebases don't deliver any information about suboptimal moves.
Side note:
Very surprisingly it looks like we cannot use static eval's from
TT since they slightly differ over time due to changing optimism.
Attempts to use static eval's from TT did loose about 13 ELO.
This is something about to investigate.
LTC: https://tests.stockfishchess.org/tests/view/63dc0f8de9d4cdfbe672d0c6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 44736 W: 12046 L: 11733 D: 20957
Ptnml(0-2): 12, 4212, 13617, 4505, 22
An analogue of this passed STC & LTC
see PR #4374 (thanks Dubslow for reviewing!)
closes https://github.com/official-stockfish/Stockfish/pull/4380
Bench: 4758694
This patch adds more debugging slots up to 32 per type and provide tools
to calculate standard deviation and Pearson's correlation coefficient.
However, due to slot being 0 at default, dbg_hit_on(c, b) has to be removed.
Initial idea from snicolet/Stockfish@d8ab604
closes https://github.com/official-stockfish/Stockfish/pull/4354
No functional change
Current master prunes all moves with negative SEE values in qsearch.
This patch sets constant negative threshold thus allowing some moves with negative SEE values to be searched.
Value of threshold is completely arbitrary and can be tweaked - also it as function of depth can be tried.
Original idea by author of Alexandria engine.
Passed STC
https://tests.stockfishchess.org/tests/view/63d79a59a67dd929a5564976
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 34864 W: 9392 L: 9086 D: 16386
Ptnml(0-2): 113, 3742, 9429, 4022, 126
Passed LTC
https://tests.stockfishchess.org/tests/view/63d8074aa67dd929a5565bc2
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 91616 W: 24532 L: 24126 D: 42958
Ptnml(0-2): 32, 8840, 27662, 9238, 36
closes https://github.com/official-stockfish/Stockfish/pull/4376
Bench: 4010877
update the WLD model with about 400M positions extracted from recent LTC games after the net updates.
This ensures that the 50% win rate is again at 1.0 eval.
closes https://github.com/official-stockfish/Stockfish/pull/4373
No functional change.
Beyond the simplification, this could be considered a bugfix from a certain point of view.
However, the effect is very subtle and essentially impossible for users to notice.
5372f81cc8 added about 2 Elo at LTC, but only for second and later `go` commands; now, with
this patch, the first `go` command will also benefit from that gain. Games under time
controls are unaffected (as per the tests).
STC: https://tests.stockfishchess.org/tests/view/63c3d291330c0d3d051d48a8
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 473792 W: 124858 L: 125104 D: 223830
Ptnml(0-2): 1338, 49653, 135063, 49601, 1241
LTC: https://tests.stockfishchess.org/tests/view/63c8cd56a83c702aac083bc9
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 290728 W: 76926 L: 76978 D: 136824
Ptnml(0-2): 106, 27987, 89221, 27953, 97
closes https://github.com/official-stockfish/Stockfish/pull/4361
bench 4208265
This patch results in search values for a TB win/loss to be reported in a way that does not change with normalization, i.e. will be consistent over time.
A value of 200.00 pawns is now reported upon entering a TB won position. Values smaller than 200.00 relate to the distance in plies from the root to the probed position position,
with 1 cp being 1 ply distance.
closes https://github.com/official-stockfish/Stockfish/pull/4353
No functional change
Created by retraining the master net with Leela T78 data from Aug+Sep 2022 added to the previous best dataset. Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 750, training was manually paused and max epoch increased to 950 before resuming. The additional Leela training data from T78 was prepared in the same way as the previous best dataset.
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
While the local elo ratings during this experiment were much lower than in recent master nets, several later epochs had a consistent elo above zero, and this was hypothesized to represent potential strength at slower time controls.
Local elo at 25k nodes per move
leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7
nn-epoch819.nnue : 0.4 +/- 1.1 (nn-bc24c101ada0.nnue)
nn-epoch799.nnue : 0.3 +/- 1.2
nn-epoch759.nnue : 0.3 +/- 1.1
nn-epoch839.nnue : 0.2 +/- 1.4
Passed STC
https://tests.stockfishchess.org/tests/view/63cabf6f0eefe8694a0c6013
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 41608 W: 11161 L: 10848 D: 19599
Ptnml(0-2): 116, 4496, 11281, 4781, 130
Passed LTC
https://tests.stockfishchess.org/tests/view/63cb1856344bb01c191af263
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 76760 W: 20517 L: 20137 D: 36106
Ptnml(0-2): 34, 7435, 23070, 7799, 42
closes https://github.com/official-stockfish/Stockfish/pull/4351
bench 3941848
Bit-shifting is a single instruction, and should be faster than an array lookup
on supported architectures. Besides (ever so slightly) speeding up the
conversion of a square into a bitboard, we may see minor general performance
improvements due to preserving more of the CPU's existing cache.
passed STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 47280 W: 12469 L: 12271 D: 22540
Ptnml(0-2): 128, 4893, 13402, 5087, 130
https://tests.stockfishchess.org/tests/view/63c5cfe618c20f4929c5fe46
Small speedup locally:
```
Result of 20 runs
==================
base (./stockfish.master ) = 1752135 +/- 10943
test (./stockfish.patch ) = 1763939 +/- 10818
diff = +11804 +/- 4731
speedup = +0.0067
P(speedup > 0) = 1.0000
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
```
Closes https://github.com/official-stockfish/Stockfish/pull/4343
Bench: 4106793
The accumulator should be an earlyclobber because it is written before
all input operands are read. Otherwise, the asm code computes a wrong
result if the accumulator shares a register with one of the other input
operands (which happens if we pass in the same expression for the
accumulator and the operand).
Closes https://github.com/official-stockfish/Stockfish/pull/4339
No functional change
Created by retraining the master net on a dataset composed of:
* The Leela-dfrc_n5000.binpack dataset filtered with depth6 multipv2 search to remove positions with only one good move, in addition to removing positions where either of the two best moves are captures
* The same Leela T80 oct+nov 2022 training data used in recent best datasets
* Additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022
Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 780, training was manually paused and max epoch increased to 920 before resuming.
During depth6 multipv2 data filtering, positions were considered to have only one good move if the score of the best move was significantly better than the 2nd best move in a way that changes the outcome of the game:
* the best move leads to a significant advantage while the 2nd best move equalizes or loses
* the best move is about equal while the 2nd best move loses
The modified stockfish branch and exact score thresholds used for filtering are at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff/src/filter
About 95% of the Leela portion and 96% of the DFRC portion of the Leela-dfrc_n5000.binpack dataset was filtered. Unfiltered parts of the dataset were left out.
The additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022 was WDL-rescored with about 12TB of syzygy 7-piece tablebases where the material difference is less than around 6 pawns. Best moves were exported to .plain data files during data conversion with the lc0 rescorer.
The exact training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move
experiment_leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7
run_0/nn-epoch899.nnue : 3.8 +/- 1.6
Passed STC
https://tests.stockfishchess.org/tests/view/63bed1f540aa064159b9c89b
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 103344 W: 27392 L: 26991 D: 48961
Ptnml(0-2): 333, 11223, 28099, 11744, 273
Passed LTC
https://tests.stockfishchess.org/tests/view/63c010415705810de2deb3ec
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 21712 W: 5891 L: 5619 D: 10202
Ptnml(0-2): 12, 2022, 6511, 2304, 7
closes https://github.com/official-stockfish/Stockfish/pull/4338
bench 4106793
Removed sprintf() which generated a warning, because of security reasons.
Replace NULL with nullptr
Replace typedef with using
Do not inherit from std::vector. Use composition instead.
optimize mutex-unlocking
closes https://github.com/official-stockfish/Stockfish/pull/4327
No functional change
If a global function has no previous declaration, either the declaration
is missing in the corresponding header file or the function should be
declared static. Static functions are local to the translation unit,
which allows the compiler to apply some optimizations earlier (when
compiling the translation unit rather than during link-time
optimization).
The commit enables the warning for gcc, clang, and mingw. It also fixes
the reported warnings by declaring the functions static or by adding a
header file (benchmark.h).
closes https://github.com/official-stockfish/Stockfish/pull/4325
No functional change
This is a later epoch (epoch 859) from the same experiment run that trained yesterday's master net nn-60fa44e376d9.nnue (epoch 779). The experiment was manually paused around epoch 790 and unpaused with max epoch increased to 900 mainly to get more local elo data without letting the GPU idle.
nn-60fa44e376d9.nnue is from #4314
nn-335a9b2d8a80.nnue is from #4295
Local elo vs. nn-335a9b2d8a80.nnue at 25k nodes per move:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
run_0/nn-epoch779.nnue (nn-60fa44e376d9.nnue) : 5.0 +/- 1.2
run_0/nn-epoch859.nnue (nn-a3dc078bafc7.nnue) : 5.6 +/- 1.6
Passed STC vs. nn-335a9b2d8a80.nnue
https://tests.stockfishchess.org/tests/view/63ae10495bd1e5f27f13d94f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 37536 W: 10088 L: 9781 D: 17667
Ptnml(0-2): 110, 4006, 10223, 4325, 104
An LTC test vs. nn-335a9b2d8a80.nnue was paused due to nn-60fa44e376d9.nnue passing LTC first:
https://tests.stockfishchess.org/tests/view/63ae5d34331d5fca5113703b
Passed LTC vs. nn-60fa44e376d9.nnue
https://tests.stockfishchess.org/tests/view/63af1e41465d2b022dbce4e7
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 148704 W: 39672 L: 39155 D: 69877
Ptnml(0-2): 59, 14443, 44843, 14936, 71
closes https://github.com/official-stockfish/Stockfish/pull/4319
bench 3984365
Created by retraining the master net on the previous best dataset with additional filtering. No new data was added.
More of the Leela-dfrc_n5000.binpack part of the dataset was pre-filtered with depth6 multipv2 search to remove bestmove captures. About 93% of the previous Leela/SF data and 99% of the SF dfrc data was filtered. Unfiltered parts of the dataset were left out. The new Leela T80 oct+nov data is the same as before. All early game positions with ply count <= 28 were skipped during training by modifying the training data loader in nnue-pytorch.
Trained in a similar way as recent master nets, with a different nnue-pytorch branch for early ply skipping:
python3 easy_train.py \
--experiment-name=leela93-dfrc99-filt-only-T80-oct-nov-skip28 \
--training-dataset=/data/leela93-dfrc99-filt-only-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--nnue-pytorch-branch=linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--network-testing-threads 20 \
--num-workers 6
For the exact training data used: https://robotmoon.com/nnue-training-data/
Details about the previous best dataset: #4295
Local testing at a fixed 25k nodes:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
Local Elo: run_0/nn-epoch779.nnue : 5.1 +/- 1.5
Passed STC
https://tests.stockfishchess.org/tests/view/63adb3acae97a464904fd4e8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 36504 W: 9847 L: 9538 D: 17119
Ptnml(0-2): 108, 3981, 9784, 4252, 127
Passed LTC
https://tests.stockfishchess.org/tests/view/63ae0ae25bd1e5f27f13d884
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 36592 W: 10017 L: 9717 D: 16858
Ptnml(0-2): 17, 3461, 11037, 3767, 14
closes https://github.com/official-stockfish/Stockfish/pull/4314
bench 4015511
In both modified methods, the variable 'result' is checked to detect
whether the probe operation failed. However, the variable is not
initialized on all paths, so the check might test an uninitialized
value.
A test position (with TB) is given by:
position fen 3K1k2/R7/8/8/8/8/8/R6Q w - - 0 1 moves a1b1 f8g8 b1a1 g8f8 a1b1 f8g8 b1a1
This is now fixed by always initializing the variable.
closes https://github.com/official-stockfish/Stockfish/pull/4309
No functional change
Created by retraining the master net with a combination of:
the previous best dataset (Leela-dfrc_n5000.binpack), with about half the dataset filtered using depth6 multipv2 search to throw away positions where either of the 2 best moves are captures
Leela T80 Oct and Nov training data rescored with best moves, adding ~9.5 billion positions
Trained effectively the same way as the previous master net:
python3 easy_train.py \
--experiment-name=leela-dfrc-filtered-T80-oct-nov \
--training-dataset=/data/leela-dfrc-filtered-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--auto-exit-timeout-on-training-finished=900 \
--network-testing-threads 20 \
--num-workers 6
Local testing at a fixed 25k nodes:
experiments/experiment_leela-dfrc-filtered-T80-oct-nov/training/run_0/nn-epoch779.nnue
localElo: run_0/nn-epoch779.nnue : 4.7 +/- 3.1
The new Leela T80 part of the dataset was prepared by downloading test80 training data from all of Oct 2022 and Nov 2022, rescoring with syzygy 6-piece tablebases and ~600 GB of 7-piece tablebases, saving best moves to exported .plain files, removing all positions with castling flags, then converting to binpacks and using interleave_binpacks.py to merge them together. Scripts used in this data conversion process are available at:
https://github.com/linrock/lc0-data-converter
Filtering binpack data using depth6 multipv2 search was done by modifying transform.cpp in the tools branch:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-no-rescore
Links for downloading the training data (total size: 338 GB) are available at:
https://robotmoon.com/nnue-training-data/
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 30544 W: 8244 L: 7947 D: 14353
Ptnml(0-2): 93, 3243, 8302, 3542, 92
https://tests.stockfishchess.org/tests/view/63a0d377264a0cf18f86f82b
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 32464 W: 8866 L: 8573 D: 15025
Ptnml(0-2): 19, 3054, 9794, 3345, 20
https://tests.stockfishchess.org/tests/view/63a10bc9fb452d3c44b1e016
closes https://github.com/official-stockfish/Stockfish/pull/4295
Bench 3554904
Instead of allowing .depend for specific build-related targets, filter
non-build-related targets (i.e. help, clean) so that other targets can
normally execute .depend target.
closes https://github.com/official-stockfish/Stockfish/pull/4293
No functional change
If ttMove is doubly extended, we allow a depth growth of the remaining moves.
The idea is to get a more realistic score comparison, because of the depth
difference. We take some care to avoid this extension for high depths,
in order to avoid the cost, since the search result is supposed
to be more accurate in this case.
This pull request includes some small cleanups.
STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 60256 W: 16189 L: 15848 D: 28219
Ptnml(0-2): 182, 6546, 16330, 6889, 181
https://tests.stockfishchess.org/tests/view/639109a1792a529ae8f27777
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 106232 W: 28487 L: 28053 D: 49692
Ptnml(0-2): 46, 10224, 32145, 10652, 49
https://tests.stockfishchess.org/tests/view/63914cba792a529ae8f282ee
closes https://github.com/official-stockfish/Stockfish/pull/4271
Bench: 3622368
Add a constraint so that the dependency build only occurs when users
actually run build tasks.
This fixes a bug on some systems where gcc/g++ is not available.
closes https://github.com/official-stockfish/Stockfish/pull/4255
No functional change
fixes the lowerbound/upperbound output by avoiding
scores outside the alpha,beta bracket. Since SF search
uses fail-soft we can't simply take the returned value
as score.
closes https://github.com/official-stockfish/Stockfish/pull/4259
No functional change
Official release version of Stockfish 15.1
Bench: 3467381
---
Today, we have the pleasure to announce Stockfish 15.1.
As usual, downloads will be freely available at stockfishchess.org/download
*Elo gain and competition results*
With this release, version 5 of the NNUE neural net architecture has
been introduced, and the training data has been extended to include
Fischer random chess (FRC) positions. As a result, Elo gains are largest
for FRC, reaching up to 50 Elo for doubly randomized FRC[1] (DFRC).
More importantly, also for standard chess this release progressed and
will win two times more game pairs than it loses[2] against
Stockfish 15. Stockfish continues to win in a dominating way[3] all
chess engine tournaments, including the TCEC Superfinal, Cup, FRC, DFRC,
and Swiss as well as the CCC Bullet, Blitz, and Rapid events.
*New evaluation*
This release also introduces a new convention for the evaluation that
is reported by search. An evaluation of +1 is now no longer tied to the
value of one pawn, but to the likelihood of winning the game. With
a +1 evaluation, Stockfish has now a 50% chance of winning the game
against an equally strong opponent. This convention scales down
evaluations a bit compared to Stockfish 15 and allows for consistent
evaluations in the future.
*ChessBase settlement*
In this release period, the Stockfish team has successfully enforced
its GPL license against ChessBase. This has been an intense process that
included filing a lawsuit[4], a court hearing[5], and finally
negotiating a settlement[6] that established that ChessBase infringed on
the license by not distributing the Stockfish derivatives Fat Fritz 2
and Houdini 6 as free software, and that ensures ChessBase will respect
the Free Software principles in the future. This settlement has been
covered by major chess sites (see e.g. lichess.org[7] and chess.com[8]),
and we are proud that it has been hailed as a ‘historic violation
settlement[9]’ by the Software Freedom Conservancy.
*Thank you*
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the
fishtest testing framework and programmers to contribute to the
project[10].
The Stockfish team
[1] https://tests.stockfishchess.org/tests/view/638a6170d2b9c924c4c62cb4
[2] https://tests.stockfishchess.org/tests/view/638a4dd7d2b9c924c4c6297b
[3] https://en.wikipedia.org/wiki/Stockfish_(chess)#Competition_results
[4] https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/
[5] https://stockfishchess.org/blog/2022/public-court-hearing-soon/
[6] https://stockfishchess.org/blog/2022/chessbase-stockfish-agreement/
[7] https://lichess.org/blog/Y3u1mRAAACIApBVn/settlement-reached-in-stockfish-v-chessbase
[8] https://www.chess.com/news/view/chessbase-stockfish-reach-settlement
[9] https://sfconservancy.org/news/2022/nov/28/sfc-named-trusted-party-in-gpl-case/
[10] https://stockfishchess.org/get-involved/
If multiple threads have the same best move,
pick the thread with the largest contribution to the confidence vote.
This thread will later be used to display PV, so this patch is
about user-friendliness and/or least surprises, it non-functional for playing strenght.
closes https://github.com/official-stockfish/Stockfish/pull/4246
No functional change
fixes the lowerbound/upperbound output by taking the alpha,beta bracket
into account also if a bestThread is selected that is different from the master thread.
Instead of keeping track which bounds where used in the specific search,
in this version we simply store the quality (exact, upperbound,
lowerbound) of the score along with the actual score as information on
rootMove.
closes https://github.com/official-stockfish/Stockfish/pull/4239
No functional change
This updates the WDL model based on the LTC statistics (2M games).
Relatively small change, note that this also adjusts the NormalizeToPawnValue (now 361),
to keep win prob at 50% for 100cp.
closes https://github.com/official-stockfish/Stockfish/pull/4236
No functional change.
Github Actions allows us to use up to 20 workers.
This way we can launch multiple different checks
at the same time and optimize the overall time
the CI takes a bit.
closes https://github.com/official-stockfish/Stockfish/pull/4223
No functional change
For development versions of Stockfish, the version will now look like
dev-20221107-dca9a0533
indicating a development version, the date of the last commit,
and the git SHA of that commit. If git is not available,
the fallback is the date of compilation. Releases will continue to be
versioned as before.
Additionally, this PR extends the CI to create binary artifacts,
i.e. pushes to master will automatically build Stockfish and upload
the binaries to github.
closes https://github.com/official-stockfish/Stockfish/pull/4220
No functional change
Normalizes the internal value as reported by evaluate or search
to the UCI centipawn result used in output. This value is derived from
the win_rate_model() such that Stockfish outputs an advantage of
"100 centipawns" for a position if the engine has a 50% probability to win
from this position in selfplay at fishtest LTC time control.
The reason to introduce this normalization is that our evaluation is, since NNUE,
no longer related to the classical parameter PawnValueEg (=208). This leads to
the current evaluation changing quite a bit from release to release, for example,
the eval needed to have 50% win probability at fishtest LTC (in cp and internal Value):
June 2020 : 113cp (237)
June 2021 : 115cp (240)
April 2022 : 134cp (279)
July 2022 : 167cp (348)
With this patch, a 100cp advantage will have a fixed interpretation,
i.e. a 50% win chance. To keep this value steady, it will be needed to update the win_rate_model()
from time to time, based on fishtest data. This analysis can be performed with
a set of scripts currently available at https://github.com/vondele/WLD_model
fixes https://github.com/official-stockfish/Stockfish/issues/4155
closes https://github.com/official-stockfish/Stockfish/pull/4216
No functional change
Joint work by Ofek Shochat and Stéphane Nicolet.
passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 93288 W: 24996 L: 24601 D: 43691
Ptnml(0-2): 371, 10263, 24989, 10642, 379
https://tests.stockfishchess.org/tests/view/63448f4f4bc7650f07541987
passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 84168 W: 22771 L: 22377 D: 39020
Ptnml(0-2): 47, 8181, 25234, 8575, 47
https://tests.stockfishchess.org/tests/view/6345186d4bc7650f07542fbd
================
It seems there are two effects with this patch:
effect A :
If Stockfish is winning at root, we have optimism > 0 for all leaves in
the search tree where Stockfish is to move. There, if (psq - nnue) > 0
(ie if the advantage is more materialistic than positional), then the
product D = optimism * (psq - nnue) will be positive, nnueComplexity will
increase, and the eval will increase from SF point of view.
So the effect A is that if Stockfish is winning at root, she will slightly
favor in the search tree (in other words, search more) the positions where
she can convert her advantage via materialist means.
effect B :
If Stockfish is losing at root, we have optimism > 0 for all leaves in
the search tree where the opponent is to move. There, if (psq - nnue) < 0
(ie if the opponent advantage is more positional than materialistic), then
the product D = optimism * (psq-nnue) will be negative, nnueComplexity will
decrease, and the eval will decrease from the opponent point of view.
So the effect B is that Stockfish will slightly favor in the search tree
(search more) the branches where she can defend by slowly reducing the
opponent positional advantage.
=================
closes https://github.com/official-stockfish/Stockfish/pull/4195
bench: 4673898
relatively soon servers with 512 threads will be available 'quite commonly',
anticipate even more threads, and increase our current maximum from 512 to 1024.
closes https://github.com/official-stockfish/Stockfish/pull/4152
No functional change.
If the elapsed time is close to the available time, the time management thread can signal that the next iterations should be searched at the same depth (Threads.increaseDepth = false). While the rootDepth increases, the adjustedDepth is kept constant with the searchAgainCounter.
In exceptional cases, when threading is used and the master thread, which controls the time management, signals to not increaseDepth, but by itself takes a long time to finish the iteration, the helper threads can search repeatedly at the same depth. This search finishes more and more quickly, leading to helper threads that report a rootDepth of MAX_DEPTH (245). The latter is not optimal as it is confusing for the user, stops search on these threads, and leads to an incorrect bias in the thread voting scheme. Probably with only a small impact on strength.
This behavior was observed almost two years ago,
see https://github.com/official-stockfish/Stockfish/issues/2717
This patch fixes#2717 by ensuring the effective depth increases at once every four iterations,
even in increaseDepth is false.
Depth 245 searches (for non-trivial positions) were indeed absent with this patch,
but frequent with master in the tests below:
https://discord.com/channels/435943710472011776/813919248455827515/994872720800088095
Total pgns: 2173
Base: 2867
Patch: 0
it passed non-regression testing in various setups:
SMP STC:
https://tests.stockfishchess.org/tests/view/62bfecc96178ffe6394ba036
LLR: 2.94 (-2.94,2.94) <-2.25,0.25>
Total: 37288 W: 10171 L: 10029 D: 17088
Ptnml(0-2): 75, 3777, 10793, 3929, 70
SMP LTC:
https://tests.stockfishchess.org/tests/view/62c08f6f49b62510394be066
LLR: 2.94 (-2.94,2.94) <-2.25,0.25>
Total: 190568 W: 52125 L: 52186 D: 86257
Ptnml(0-2): 70, 17854, 59504, 17779, 77
LTC:
https://tests.stockfishchess.org/tests/view/62c08b6049b62510394bdfb6
LLR: 2.96 (-2.94,2.94) <-2.25,0.25>
Total: 48120 W: 13204 L: 13083 D: 21833
Ptnml(0-2): 54, 4458, 14919, 4571, 58
Special thanks to miguel-I, Disservin, ruicoelhopedro and others for analysing the problem,
the data, and coming up with the key insight, needed to fix this longstanding issue.
closes https://github.com/official-stockfish/Stockfish/pull/4104
Bench: 5182295
First things first...
this PR is being made from court. Today, Tord and Stéphane, with broad support
of the developer community are defending their complaint, filed in Munich, against ChessBase.
With their products Houdini 6 and Fat Fritz 2, both Stockfish derivatives,
ChessBase violated repeatedly the Stockfish GPLv3 license. Tord and Stéphane have terminated
their license with ChessBase permanently. Today we have the opportunity to present
our evidence to the judge and enforce that termination. To read up, have a look at our blog post
https://stockfishchess.org/blog/2022/public-court-hearing-soon/ and
https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/
This PR introduces a net trained with an enhanced data set and a modified loss function in the trainer.
A slight adjustment for the scaling was needed to get a pass on standard chess.
passed STC:
https://tests.stockfishchess.org/tests/view/62c0527a49b62510394bd610
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 135008 W: 36614 L: 36152 D: 62242
Ptnml(0-2): 640, 15184, 35407, 15620, 653
passed LTC:
https://tests.stockfishchess.org/tests/view/62c17e459e7d9997a12d458e
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 28864 W: 8007 L: 7749 D: 13108
Ptnml(0-2): 47, 2810, 8466, 3056, 53
Local testing at a fixed 25k nodes resulted in
Test run1026/easy_train_data/experiments/experiment_2/training/run_0/nn-epoch799.nnue
localElo: 4.2 +- 1.6
The real strength of the net is in FRC and DFRC chess where it gains significantly.
Tested at STC with slightly different scaling:
FRC:
https://tests.stockfishchess.org/tests/view/62c13a4002ba5d0a774d20d4
Elo: 29.78 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 2007 L: 1152 D: 6841
Ptnml(0-2): 31, 686, 2804, 1355, 124
nElo: 59.24 +-6.9 (95%) PairsRatio: 2.06
DFRC:
https://tests.stockfishchess.org/tests/view/62c13a5702ba5d0a774d20d9
Elo: 55.25 +-3.9 (95%) LOS: 100.0%
Total: 10000 W: 2984 L: 1407 D: 5609
Ptnml(0-2): 51, 636, 2266, 1779, 268
nElo: 96.95 +-7.2 (95%) PairsRatio: 2.98
Tested at LTC with identical scaling:
FRC:
https://tests.stockfishchess.org/tests/view/62c26a3c9e7d9997a12d6caf
Elo: 16.20 +-2.5 (95%) LOS: 100.0%
Total: 10000 W: 1192 L: 726 D: 8082
Ptnml(0-2): 10, 403, 3727, 831, 29
nElo: 44.12 +-6.7 (95%) PairsRatio: 2.08
DFRC:
https://tests.stockfishchess.org/tests/view/62c26a539e7d9997a12d6cb2
Elo: 40.94 +-3.0 (95%) LOS: 100.0%
Total: 10000 W: 2215 L: 1042 D: 6743
Ptnml(0-2): 10, 410, 3053, 1451, 76
nElo: 92.77 +-6.9 (95%) PairsRatio: 3.64
This is due to the mixing in a significant fraction of DFRC training data in the final training round. The net is
trained using the easy_train.py script in the following way:
```
python easy_train.py \
--training-dataset=../Leela-dfrc_n5000.binpack \
--experiment-name=2 \
--nnue-pytorch-branch=vondele/nnue-pytorch/lossScan4 \
--additional-training-arg=--param-index=2 \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--start-from-engine-test-net True \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--auto-exit-timeout-on-training-finished=900 \
--network-testing-threads 8 \
--num-workers 12
```
where the data set used (Leela-dfrc_n5000.binpack) is a combination of our previous best data set (mix of Leela and some SF data) and DFRC data, interleaved to form:
The data is available in https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG?usp=sharing
Leela mix: https://drive.google.com/file/d/1JUkMhHSfgIYCjfDNKZUMYZt6L5I7Ra6G/view?usp=sharing
DFRC: https://drive.google.com/file/d/17vDaff9LAsVo_1OfsgWAIYqJtqR8aHlm/view?usp=sharing
The training branch used is
https://github.com/vondele/nnue-pytorch/commits/lossScan4
A PR to the main trainer repo will be made later. This contains a revised loss function, now computing the loss from the score based on the win rate model, which is a more accurate representation than what we had before. Scaling constants are tweaked there as well.
closes https://github.com/official-stockfish/Stockfish/pull/4100
Bench: 5186781
The speedup is around 0.25% using gcc 11.3.1 (bmi2, nnue bench, depth 16
and 23) while it is neutral using clang (same conditions).
According to `perf` that integer division was one of the most time-consuming
instructions in search (gcc disassembly).
Passed STC:
https://tests.stockfishchess.org/tests/view/628a17fe24a074e5cd59b3aa
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 22232 W: 5992 L: 5751 D: 10489
Ptnml(0-2): 88, 2235, 6218, 2498, 77
yellow LTC:
https://tests.stockfishchess.org/tests/view/628a35d7ccae0450e35106f7
LLR: -2.95 (-2.94,2.94) <0.50,3.00>
Total: 320168 W: 85853 L: 85326 D: 148989
Ptnml(0-2): 185, 29698, 99821, 30165, 215
This patch also suggests that UHO STC is sensible to small speedups (< 0.50%).
closes https://github.com/official-stockfish/Stockfish/pull/4032
No functional change
This patch provides command line flags `--help` and `--license` as well as the corresponding `help` and `license` commands.
```
$ ./stockfish --help
Stockfish 200522 by the Stockfish developers (see AUTHORS file)
Stockfish is a powerful chess engine and free software licensed under the GNU GPLv3.
Stockfish is normally used with a separate graphical user interface (GUI).
Stockfish implements the universal chess interface (UCI) to exchange information.
For further information see https://github.com/official-stockfish/Stockfish#readme
or the corresponding README.md and Copying.txt files distributed with this program.
```
The idea is to provide a minimal help that links to the README.md file,
not replicating information that is already available elsewhere.
We use this opportunity to explicitly report the license as well.
closes https://github.com/official-stockfish/Stockfish/pull/4027
No functional change.
train a net using training data with a
heavier weight on positions having 16 pieces on the board. More specifically,
with a relative weight of `i * (32-i)/(16 * 16)+1` (where i is the number of pieces on the board).
This is done with the trainer branch https://github.com/glinscott/nnue-pytorch/pull/173
The command used is:
```
python train.py $datafile $datafile $restarttype $restartfile --gpus 1 --threads 4 --num-workers 12 --random-fen-skipping=3 --batch-size 16384 --progress_bar_refresh_rate 300 --smart-fen-skipping --features=HalfKAv2_hm^ --lambda=1.00 --max_epochs=$epochs --seed $RANDOM --default_root_dir exp/run_$i
```
The datafile is T60T70wIsRightFarseerT60T74T75T76.binpack, the restart is from the master net.
passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 22728 W: 6197 L: 5945 D: 10586
Ptnml(0-2): 105, 2453, 6001, 2695, 110
https://tests.stockfishchess.org/tests/view/625cf944ff677a888877cd90
passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 35664 W: 9535 L: 9264 D: 16865
Ptnml(0-2): 30, 3524, 10455, 3791, 32
https://tests.stockfishchess.org/tests/view/625d3c32ff677a888877d7ca
closes https://github.com/official-stockfish/Stockfish/pull/3989
Bench: 7269563
Official release version of Stockfish 15
Bench: 8129754
---
A new major release of Stockfish is now available at https://stockfishchess.org
Stockfish 15 continues to push the boundaries of chess, providing unrivalled
analysis and playing strength. In our testing, Stockfish 15 is ahead of
Stockfish 14 by 36 Elo points and wins nine times more game pairs than it
loses[1].
Improvements to the engine have made it possible for Stockfish to end up
victorious in tournaments at all sorts of time controls ranging from bullet to
classical and even at Fischer random chess[2]. At CCC, Stockfish won all of
the latest tournaments: CCC 16 Bullet, Blitz and Rapid, CCC 960 championship,
and the CCC 17 Rapid. At TCEC, Stockfish won the Season 21, Cup 9, FRC 4 and
in the current Season 22 superfinal, at the time of writing, has won 16 game
pairs and not yet lost a single one.
This progress is the result of a dedicated team of developers that comes up
with new ideas and improvements. For Stockfish 15, we tested nearly 13000
different changes and retained the best 200. These include the fourth
generation of our NNUE network architecture, as well as various search
improvements. To perform these tests, contributors provide CPU time for
testing, and in the last year, they have collectively played roughly a
billion chess games. In the last few years, our distributed testing
framework, Fishtest, has been operated superbly and has been developed and
improved extensively. This work by Pasquale Pigazzini, Tom Vijlbrief, Michel
Van den Bergh, and various other developers[3] is an essential part of the
success of the Stockfish project.
Indeed, the Stockfish project builds on a thriving community of enthusiasts
to offer a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the Fishtest
testing framework and programmers to contribute to the project[4].
The Stockfish team
[1] https://tests.stockfishchess.org/tests/view/625d156dff677a888877d1be
[2] https://en.wikipedia.org/wiki/Stockfish_(chess)#Competition_results
[3] https://github.com/glinscott/fishtest/blob/master/AUTHORS
[4] https://stockfishchess.org/get-involved/
This patch chooses the delta value (which skews the nnue evaluation between positional and materialistic)
depending on the material: If the material is low, delta will be higher and the evaluation is shifted
to the positional value. If the material is high, the evaluation will be shifted to the psqt value.
I don't think slightly negative values of delta should be a concern.
Passed STC:
https://tests.stockfishchess.org/tests/view/62418513b3b383e86185766f
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 28808 W: 7832 L: 7564 D: 13412
Ptnml(0-2): 147, 3186, 7505, 3384, 182
Passed LTC:
https://tests.stockfishchess.org/tests/view/62419137b3b383e861857842
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 58632 W: 15776 L: 15450 D: 27406
Ptnml(0-2): 42, 5889, 17149, 6173, 63
closes https://github.com/official-stockfish/Stockfish/pull/3971
Bench: 7588855
This idea is a mix of koivisto idea of threat history and heuristic that
was simplified some time ago in LMR - decreasing reduction for moves that evade a capture.
Instead of doing so in LMR this patch does it in movepicker - to do this it
calculates squares that are attacked by different piece types and pieces that are located
on this squares and boosts up weight of moves that make this pieces land on a square that is not under threat.
Boost is greater for pieces with bigger material values.
Special thanks to koivisto and seer authors for explaining me ideas behind threat history.
Passed STC:
https://tests.stockfishchess.org/tests/view/62406e473b32264b9aa1478b
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 19816 W: 5320 L: 5072 D: 9424
Ptnml(0-2): 86, 2165, 5172, 2385, 100
Passed LTC:
https://tests.stockfishchess.org/tests/view/62407f2e3b32264b9aa149c8
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 51200 W: 13805 L: 13500 D: 23895
Ptnml(0-2): 44, 5023, 15164, 5322, 47
closes https://github.com/official-stockfish/Stockfish/pull/3970
bench 7736491
Via the ttPv flag an implicit tree of current and former PV nodes is maintained. In addition this tree is grown or shrinked at the leafs dependant on the search results. But now the shrinking step has been removed.
As the frequency of ttPv nodes decreases with depth the shown scaling behavior (STC barely passed but LTC scales well) of the tests was expected.
STC:
LLR: 2.93 (-2.94,2.94) <-2.25,0.25>
Total: 270408 W: 71593 L: 71785 D: 127030
Ptnml(0-2): 1339, 31024, 70630, 30912, 1299
https://tests.stockfishchess.org/tests/view/622fbf9dc9e950cbfc2376d6
LTC:
LLR: 2.96 (-2.94,2.94) <-2.25,0.25>
Total: 34368 W: 9135 L: 8992 D: 16241
Ptnml(0-2): 28, 3423, 10135, 3574, 24
https://tests.stockfishchess.org/tests/view/62305257c9e950cbfc238964
closes https://github.com/official-stockfish/Stockfish/pull/3963
Bench: 7044203
This commit generalizes the feature transform to use vec_t macros
that are architecture defined instead of using a seperate code path for each one.
It should make some old architectures (MMX, including improvements by Fanael) faster
and make further such improvements easier in the future.
Includes some corrections to CI for mingw.
closes https://github.com/official-stockfish/Stockfish/pull/3955
closes https://github.com/official-stockfish/Stockfish/pull/3928
No functional change
- set the variable only for the required tests to keep simple the yml file
- use NDK 21.x until will be fixed the Stockfish static build problem
with NDK 23.x
- set the test for armv7, armv7-neon, armv8 builds:
- use armv7a-linux-androideabi21-clang++ compiler for armv7 armv7-neon
- enforce a static build
- silence the Warning for the unused compilation flag "-pie" with
the static build, otherwise the Github workflow stops
- use qemu to bench the build and get the signature
Many thanks to @pschneider1968 that made all the hard work with NDK :)
closes https://github.com/official-stockfish/Stockfish/pull/3924
No functional change
This patch is a result of tuning done by user @candirufish after 150k games.
Since the tuned values were really interesting and touched heuristics
that are known for their non-linear scaling I decided to run limited
games LTC match, even if the STC test was really bad (which was expected).
After seeing the results of the LTC match, I also run a VLTC (very long
time control) SPRTtest, which passed.
The main difference is in extensions: this patch allows much more
singular/double extensions, both in terms of allowing them at lower
depths and with lesser margins.
Failed STC:
https://tests.stockfishchess.org/tests/view/620d66643ec80158c0cd3b46
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 4968 W: 1194 L: 1398 D: 2376
Ptnml(0-2): 47, 633, 1294, 497, 13
Performed well at LTC in a fixed-length match:
https://tests.stockfishchess.org/tests/view/620d66823ec80158c0cd3b4a
ELO: 3.36 +-1.8 (95%) LOS: 100.0%
Total: 30000 W: 7966 L: 7676 D: 14358
Ptnml(0-2): 36, 2936, 8755, 3248, 25
Passed VLTC SPRT test:
https://tests.stockfishchess.org/tests/view/620da11a26f5b17ec884f939
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 4400 W: 1326 L: 1127 D: 1947
Ptnml(0-2): 13, 309, 1348, 526, 4
closes https://github.com/official-stockfish/Stockfish/pull/3937
Bench: 6318903
Architecture:
The diagram of the "SFNNv4" architecture:
https://user-images.githubusercontent.com/8037982/153455685-cbe3a038-e158-4481-844d-9d5fccf5c33a.png
The most important architectural changes are the following:
* 1024x2 [activated] neurons are pairwise, elementwise multiplied (not quite pairwise due to implementation details, see diagram), which introduces a non-linearity that exhibits similar benefits to previously tested sigmoid activation (quantmoid4), while being slightly faster.
* The following layer has therefore 2x less inputs, which we compensate by having 2 more outputs. It is possible that reducing the number of outputs might be beneficial (as we had it as low as 8 before). The layer is now 1024->16.
* The 16 outputs are split into 15 and 1. The 1-wide output is added to the network output (after some necessary scaling due to quantization differences). The 15-wide is activated and follows the usual path through a set of linear layers. The additional 1-wide output is at least neutral, but has shown a slightly positive trend in training compared to networks without it (all 16 outputs through the usual path), and allows possibly an additional stage of lazy evaluation to be introduced in the future.
Additionally, the inference code was rewritten and no longer uses a recursive implementation. This was necessitated by the splitting of the 16-wide intermediate result into two, which was impossible to do with the old implementation with ugly hacks. This is hopefully overall for the better.
First session:
The first session was training a network from scratch (random initialization). The exact trainer used was slightly different (older) from the one used in the second session, but it should not have a measurable effect. The purpose of this session is to establish a strong network base for the second session. Small deviations in strength do not harm the learnability in the second session.
The training was done using the following command:
python3 train.py \
/home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
/home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
--gpus "$3," \
--threads 4 \
--num-workers 4 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--gamma=0.992 \
--lr=8.75e-4 \
--max_epochs=400 \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
Every 20th net was saved and its playing strength measured against some baseline at 25k nodes per move with pure NNUE evaluation (modified binary). The exact setup is not important as long as it's consistent. The purpose is to sift good candidates from bad ones.
The dataset can be found https://drive.google.com/file/d/1UQdZN_LWQ265spwTBwDKo0t1WjSJKvWY/view
Second session:
The second training session was done starting from the best network (as determined by strength testing) from the first session. It is important that it's resumed from a .pt model and NOT a .ckpt model. The conversion can be performed directly using serialize.py
The LR schedule was modified to use gamma=0.995 instead of gamma=0.992 and LR=4.375e-4 instead of LR=8.75e-4 to flatten the LR curve and allow for longer training. The training was then running for 800 epochs instead of 400 (though it's possibly mostly noise after around epoch 600).
The training was done using the following command:
The training was done using the following command:
python3 train.py \
/data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
/data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
--gpus "$3," \
--threads 4 \
--num-workers 4 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--gamma=0.995 \
--lr=4.375e-4 \
--max_epochs=800 \
--resume-from-model /data/sopel/nnue/nnue-pytorch-training/data/exp295/nn-epoch399.pt \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$run_id
In particular note that we now use lambda=1.0 instead of lambda=0.8 (previous nets), because tests show that WDL-skipping introduced by vondele performs better with lambda=1.0. Nets were being saved every 20th epoch. In total 16 runs were made with these settings and the best nets chosen according to playing strength at 25k nodes per move with pure NNUE evaluation - these are the 4 nets that have been put on fishtest.
The dataset can be found either at ftp://ftp.chessdb.cn/pub/sopel/data_sf/T60T70wIsRightFarseerT60T74T75T76.binpack in its entirety (download might be painfully slow because hosted in China) or can be assembled in the following way:
Get the https://github.com/official-stockfish/Stockfish/blob/5640ad48ae5881223b868362c1cbeb042947f7b4/script/interleave_binpacks.py script.
Download T60T70wIsRightFarseer.binpack https://drive.google.com/file/d/1_sQoWBl31WAxNXma2v45004CIVltytP8/view
Download farseerT74.binpack http://trainingdata.farseer.org/T74-May13-End.7z
Download farseerT75.binpack http://trainingdata.farseer.org/T75-June3rd-End.7z
Download farseerT76.binpack http://trainingdata.farseer.org/T76-Nov10th-End.7z
Run python3 interleave_binpacks.py T60T70wIsRightFarseer.binpack farseerT74.binpack farseerT75.binpack farseerT76.binpack T60T70wIsRightFarseerT60T74T75T76.binpack
Tests:
STC: https://tests.stockfishchess.org/tests/view/6203fb85d71106ed12a407b7
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 16952 W: 4775 L: 4521 D: 7656
Ptnml(0-2): 133, 1818, 4318, 2076, 131
LTC: https://tests.stockfishchess.org/tests/view/62041e68d71106ed12a40e85
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 14944 W: 4138 L: 3907 D: 6899
Ptnml(0-2): 21, 1499, 4202, 1728, 22
closes https://github.com/official-stockfish/Stockfish/pull/3927
Bench: 4919707
Most credits for this patch should go to @candirufish.
Based on his big search tuning (1M games at 20+0.1s)
https://tests.stockfishchess.org/tests/view/61fc7a6ed508ec6a1c9f4b7d
with some hand polishing on top of it, which includes :
a) correcting trend sigmoid - for some reason original tuning resulted in it being negative. This heuristic was proven to be worth some elo for years so reversing it sign is probably some random artefact;
b) remove changes to continuation history based pruning - this heuristic historically was really good at providing green STCs and then failing at LTC miserably if we tried to make it more strict, original tuning was done at short time control and thus it became more strict - which doesn't scale to longer time controls;
c) remove changes to improvement - not really indended :).
passed STC
https://tests.stockfishchess.org/tests/view/6203526e88ae2c84271c2ee2
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 16840 W: 4604 L: 4363 D: 7873
Ptnml(0-2): 82, 1780, 4449, 2033, 76
passed LTC
https://tests.stockfishchess.org/tests/view/620376e888ae2c84271c35d4
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 17232 W: 4771 L: 4542 D: 7919
Ptnml(0-2): 14, 1655, 5048, 1886, 13
closes https://github.com/official-stockfish/Stockfish/pull/3926
bench 5030992
have maximal compatibility on legacy target arch, now supporting AMD Athlon
The old behavior can anyway be selected by the user if needed, for example
make -j profile-build ARCH=x86-32 sse=yes
fixes#3904
closes https://github.com/official-stockfish/Stockfish/pull/3918
No functional change
For cross-compiling to Android on windows, the Makefile needs some tweaks.
Tested with Android NDK 23.1.7779620 and 21.4.7075529, using
Windows 10 with clean MSYS2 environment (i.e. no MINGW/GCC/Clang
toolchain in PATH) and Fedora 35, with build target:
build ARCH=armv8 COMP=ndk
The resulting binary runs fine inside Droidfish on my Samsung
Galaxy Note20 Ultra and Samsung Galaxy Tab S7+
Other builds tested to exclude regressions: MINGW64/Clang64 build
on Windows; MINGW64 cross build, native Clang and GCC builds on Fedora.
wiki docs https://github.com/glinscott/fishtest/wiki/Cross-compiling-Stockfish-for-Android-on-Windows-and-Linux
closes https://github.com/official-stockfish/Stockfish/pull/3901
No functional change
A Windows Native Build (WNB) can be done:
- on Windows, using a recent mingw-w64 g++/clang compiler
distributed by msys2, cygwin and others
- on Linux, using mingw-w64 g++ to cross compile
Improvements:
- check for a WNB in a proper way and set a variable to simplify the code
- set the proper EXE for a WNB
- use the proper name for the mingw-w64 clang compiler
- use the static linking for a WNB
- use wine to make a PGO cross compile on Linux (also with Intel SDE)
- enable the LTO build for mingw-w64 g++ compiler
- set `lto=auto` to use the make's job server, if available, or otherwise
to fall back to autodetection of the number of CPU threads
- clean up all the temporary LTO files saved in the local directory
Tested on:
- msys2 MINGW64 (g++), UCRT64 (g++), MINGW32 (g++), CLANG64 (clang)
environments
- cygwin mingw-w64 g++
- Ubuntu 18.04 & 21.10 mingw-w64 PGO cross compile (also with Intel SDE)
closes#3891
No functional change
This updates estimates from 2yr ago #2401, and adds missing terms.
All tests run at 10+0.1 (STC), 20000 games, error bars +- 1.8 Elo, book 8moves_v3.png.
A table of Elo values with the links to the corresponding tests can be found at the PR
closes https://github.com/official-stockfish/Stockfish/pull/3868
Non-functional Change
This is a reintroduction of an idea that was simplified away approximately 1 year ago.
There are some tweaks to it :
a) exclude promotions;
b) exclude Pv Nodes from it - Pv Nodes logic for captures is really different from non Pv nodes so it makes a lot of sense;
c) use a big grain of capture history - idea is taken from my recent patches in futility pruning.
passed STC
https://tests.stockfishchess.org/tests/view/61bd90f857a0d0f327c373b7
LLR: 2.96 (-2.94,2.94) <0.00,2.50>
Total: 86640 W: 22474 L: 22110 D: 42056
Ptnml(0-2): 268, 9732, 22963, 10082, 275
passed LTC
https://tests.stockfishchess.org/tests/view/61be094457a0d0f327c38aa3
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 23240 W: 6079 L: 5838 D: 11323
Ptnml(0-2): 14, 2261, 6824, 2512, 9
https://github.com/official-stockfish/Stockfish/pull/3864
bench 4493723
This patch is a follow up of previous 2 patches that introduced more reductions for PV nodes with low delta and more pruning for nodes with low delta. Instead of writing separate heuristics now it adjust reductions based on delta / rootDelta - it allows to remove 3 separate adjustements of pruning/LMR in different places and also makes reduction dependence on delta and rootDelta smoother. Also now it works for all pruning heuristics and not just 2.
Passed STC
https://tests.stockfishchess.org/tests/view/61ba9b6c57a0d0f327c2d48b
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 79192 W: 20513 L: 20163 D: 38516
Ptnml(0-2): 238, 8900, 21024, 9142, 292
passed LTC
https://tests.stockfishchess.org/tests/view/61baf77557a0d0f327c2eb8e
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 158400 W: 41134 L: 40572 D: 76694
Ptnml(0-2): 101, 16372, 45745, 16828, 154
closes https://github.com/official-stockfish/Stockfish/pull/3862
bench 4651538
This idea is somewhat similar to extentions in LMR but has a different flavour.
If result of LMR was really good - thus exceeded alpha by some pretty
big given margin, we can extend move after LMR in full depth search with 0 window.
The idea is that this move is probably a fail high with somewhat of a big
probability so extending it makes a lot of sense
passed STC
https://tests.stockfishchess.org/tests/view/61ad45ea56fcf33bce7d74b7
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 59680 W: 15531 L: 15215 D: 28934
Ptnml(0-2): 193, 6711, 15734, 6991, 211
passed LTC
https://tests.stockfishchess.org/tests/view/61ad9ff356fcf33bce7d8646
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 59104 W: 15321 L: 14992 D: 28791
Ptnml(0-2): 53, 6023, 17065, 6364, 47
closes https://github.com/official-stockfish/Stockfish/pull/3838
bench 4881329
This patch optimizes the NEON implementation in two ways.
The activation layer after the feature transformer is rewritten to make it easier for the compiler to see through dependencies and unroll. This in itself is a minimal, but a positive improvement. Other architectures could benefit from this too in the future. This is not an algorithmic change.
The affine transform for large matrices (first layer after FT) on NEON now utilizes the same optimized code path as >=SSSE3, which makes the memory accesses more sequential and makes better use of the available registers, which allows for code that has longer dependency chains.
Benchmarks from Redshift#161, profile-build with apple clang
george@Georges-MacBook-Air nets % ./stockfish-b82d93 bench 2>&1 | tail -4 (current master)
===========================
Total time (ms) : 2167
Nodes searched : 4667742
Nodes/second : 2154011
george@Georges-MacBook-Air nets % ./stockfish-7377b8 bench 2>&1 | tail -4 (this patch)
===========================
Total time (ms) : 1842
Nodes searched : 4667742
Nodes/second : 2534061
This is a solid 18% improvement overall, larger in a bench with NNUE-only, not mixed.
Improvement is also observed on armv7-neon (Raspberry Pi, and older phones), around 5% speedup.
No changes for architectures other than NEON.
closes https://github.com/official-stockfish/Stockfish/pull/3837
No functional changes.
Initialize continuation history with a slighlty negative value -71 instead of zero.
The idea is, because the most history entries will be later negative anyway, to shift
the starting values a little bit in the "correct" direction. Of course the effect of
initialization dimishes with greater depth so I had the apprehension that the LTC test
would be difficult to pass, but it passed.
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 34520 W: 9076 L: 8803 D: 16641
Ptnml(0-2): 136, 3837, 9047, 4098, 142
https://tests.stockfishchess.org/tests/view/61aa52e39e8855bba1a3776b
LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.00>
Total: 75568 W: 19620 L: 19254 D: 36694
Ptnml(0-2): 44, 7773, 21796, 8115, 56
https://tests.stockfishchess.org/tests/view/61aa87d39e8855bba1a383a5
closes https://github.com/official-stockfish/Stockfish/pull/3834
Bench: 4674029
In their infinite wisdom, Intel axed AVX512 from Alder Lake
chips (well, not entirely, but we kind of want to use the Gracemont
cores for chess!) but still added VNNI support.
Confusingly enough, this is not the same as VNNI256 support.
This adds a specific AVX-VNNI target that will use this AVX-VNNI
mode, by prefixing the VNNI instructions with the appropriate VEX
prefix, and avoiding AVX512 usage.
This is about 1% faster on P cores:
Result of 20 runs
==================
base (./clang-bmi2 ) = 3306337 +/- 7519
test (./clang-vnni ) = 3344226 +/- 7388
diff = +37889 +/- 4153
speedup = +0.0115
P(speedup > 0) = 1.0000
But a nice 3% faster on E cores:
Result of 20 runs
==================
base (./clang-bmi2 ) = 1938054 +/- 28257
test (./clang-vnni ) = 1994606 +/- 31756
diff = +56552 +/- 3735
speedup = +0.0292
P(speedup > 0) = 1.0000
This was measured on Clang 13. GCC 11.2 appears to generate
worse code for Alder Lake, though the speedup on the E cores
is similar.
It is possible to run the engine specifically on the P or E using binding,
for example in linux it is possible to use (for an 8 P + 8 E setup like i9-12900K):
taskset -c 0-15 ./stockfish
taskset -c 16-23 ./stockfish
where the first call binds to the P-cores and the second to the E-cores.
closes https://github.com/official-stockfish/Stockfish/pull/3824
No functional change
This patch is a result of refining of tuning vondele did after
new net passed and some hand-made values adjustements - excluding
changes in other pruning heuristics and rounding value of history
divisor to the nearest power of 2.
With this patch futility pruning becomes more aggressive and
history influence on it is doubled again.
passed STC
https://tests.stockfishchess.org/tests/view/61a2c4c1a26505c2278c150d
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 33848 W: 8841 L: 8574 D: 16433
Ptnml(0-2): 100, 3745, 8988, 3970, 121
passed LTC
https://tests.stockfishchess.org/tests/view/61a327ffa26505c2278c26d9
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 22272 W: 5856 L: 5614 D: 10802
Ptnml(0-2): 12, 2230, 6412, 2468, 14
closes https://github.com/official-stockfish/Stockfish/pull/3814
bench 6302543
This idea is somewhat of a respin of smth we had in futility pruning and that was simplified away - dependence of it not only on static evaluation of position but also on move history heuristics.
Instead of aborting it when they are high there we use fraction of their sum to adjust static eval pruning criteria.
passed STC
https://tests.stockfishchess.org/tests/view/619bd438c0a4ea18ba95a27d
LLR: 2.93 (-2.94,2.94) <0.00,2.50>
Total: 113704 W: 29284 L: 28870 D: 55550
Ptnml(0-2): 357, 12884, 30044, 13122, 445
passed LTC
https://tests.stockfishchess.org/tests/view/619cb8f0c0a4ea18ba95a334
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 147136 W: 37307 L: 36770 D: 73059
Ptnml(0-2): 107, 15279, 42265, 15804, 113
closes https://github.com/official-stockfish/Stockfish/pull/3805
bench 6777918
retire msvc support and corresponding CI. No active development happens on msvc,
and build is much slower or wrong.
gcc (mingw) is our toolchain of choice also on windows, and the latter is tested.
No functional change
Current master implements a scaling of the raw NNUE output value with a formula
equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies
between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature
allows Stockfish to keep material on the board when she thinks she has the advantage,
and to seek exchanges and simplifications when she thinks she has to defend.
This patch slightly offsets the turning point between these two strategies, by adding
to Stockfish's evaluation a small "optimism" value before actually doing the scaling.
The effect is that SF will play a little bit more risky, trying to keep the tension a
little bit longer when she is defending, and keeping even more material on the board
when she has an advantage.
We note that this patch is similar in spirit to the old "Contempt" idea we used to have
in classical Stockfish, but this implementation differs in two key points:
a) it has been tested as an Elo-gainer against master;
b) the values output by the search are not changed on average by the implementation
(in other words, the optimism value changes the tension/exchange strategy, but a
displayed value of 1.0 pawn has the same signification before and after the patch).
See the old comment https://github.com/official-stockfish/Stockfish/pull/1361#issuecomment-359165141
for some images illustrating the ideas.
-------
finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b
passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877
-------
How to continue from there?
a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value
could be tweaked to try to gain more Elo, so the parameters of the sigmoid function
in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible
using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92
b) in a similar vein, with two recents patches affecting the scaling of the NNUE
evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning
of the NNUE network;
c) this patch will tend to keep tension in middlegame a little bit longer, so any
patch improving the defensive aspect of play via search extensions in risky,
tactical positions would be welcome.
-------
closes https://github.com/official-stockfish/Stockfish/pull/3797
Bench: 6184852
Starting with Windows Build 20348 the behavior of the numa API has been changed:
https://docs.microsoft.com/en-us/windows/win32/procthread/numa-support
Old code only worked because there was probably a limit on how many
cores/threads can reside within one NUMA node, and the OS creates extra NUMA
nodes when necessary, however the actual mechanism of core binding is
done by "Processor Groups"(https://docs.microsoft.com/en-us/windows/win32/procthread/processor-groups). With a newer OS, one NUMA node can have many
such "Processor Groups" and we should just consistently use the number
of groups to bind the threads instead of deriving the topology from
the number of NUMA nodes.
This change is required to spread threads on all cores on Windows 11 with
a 3990X CPU. It has only 1 NUMA node with 2 groups of 64 threads each.
closes https://github.com/official-stockfish/Stockfish/pull/3787
No functional change.
In case the evaluation at root is large, discourage the use of lazyEval.
This fixes https://github.com/official-stockfish/Stockfish/issues/3772
or at least improves it significantly. In this case, poor play with large
odds can be observed, in extreme cases leading to a loss despite large
advantage:
r1bq1b1r/ppp3p1/3p1nkp/n3p3/2B1P2N/2NPB3/PPP2PPP/R3K2R b KQ - 5 9
With this patch the poor move is only considered up to depth 13, in master
up to depth 28.
The patch did not pass at LTC with Elo gainer bounds, but with slightly
positive Elo nevertheless (95% LOS).
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 40368 W: 10318 L: 10041 D: 20009
Ptnml(0-2): 103, 4493, 10725, 4750, 113
https://tests.stockfishchess.org/tests/view/61800ad259e71df00dcc420d
LTC:
LLR: -2.94 (-2.94,2.94) <0.50,3.00>
Total: 212288 W: 52997 L: 52692 D: 106599
Ptnml(0-2): 112, 22038, 61549, 22323, 122
https://tests.stockfishchess.org/tests/view/618050d959e71df00dcc426d
closes https://github.com/official-stockfish/Stockfish/pull/3780
Bench: 7127040
Maintain for each root move an exponential average of the search value with a weight ratio of 2:1 (new value vs old values). Then the average score is used as the center of the initial aspiration window instead of the previous score.
Stats indicate (see PR) that the deviation for previous score is in general greater than using average score, so later seems a better estimation of the next search value. This is probably the reason this patch succeded besides smoothing the sometimes wild swings in search score. An additional observation is that at higher depth previous score is above but average score below zero. So for average score more/less fail/low highs should be occur than previous score.
STC:
LLR: 2.97 (-2.94,2.94) <0.00,2.50>
Total: 59792 W: 15106 L: 14792 D: 29894
Ptnml(0-2): 144, 6718, 15869, 7010, 155
https://tests.stockfishchess.org/tests/view/61841612d7a085ad008eef06
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 46448 W: 11835 L: 11537 D: 23076
Ptnml(0-2): 21, 4756, 13374, 5050, 23
https://tests.stockfishchess.org/tests/view/618463abd7a085ad008eef3e
closes https://github.com/official-stockfish/Stockfish/pull/3776
Bench: 6719976
Currently we handle the UCI_Elo with a double randomization. This
seems not necessary and a bit involuted.
This patch removes the first randomization and unifies the 2 cases.
closes https://github.com/official-stockfish/Stockfish/pull/3769
No functional change.
To help with debugging, the worker sends the output of
stderr (suitable truncated) to the action log on the
server, in case a build fails. For this to work it is
important that there is no spurious output to stderr.
closes https://github.com/official-stockfish/Stockfish/pull/3773
No functional change
Official release version of Stockfish 14.1
Bench: 6334068
---
Today, we have the pleasure to announce Stockfish 14.1.
As usual, downloads will be freely available at stockfishchess.org/download [1].
With Stockfish 14.1 our users get access to the strongest chess engine
available today. In the period leading up to this release, Stockfish
convincingly won several chess engine tournaments, including the TCEC 21
superfinal, the TCEC Cup 9, and the Computer Chess Championship for
Fischer Random Chess (Chess960). In the latter tournament, Stockfish
was undefeated in 599 out of 600 games played.
Compared to Stockfish 14, this release introduces a more advanced NNUE
architecture and various search improvements. In self play testing, using
a book of balanced openings, Stockfish 14.1 wins three times more game
pairs than it loses [2]. At this high level, draws are very common, so the
Elo difference to Stockfish 14 is about 17 Elo. The NNUE evaluation method,
introduced to top level chess with Stockfish 12 about one year ago [3],
has now been adopted by several other strong CPU based chess engines.
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust,
widely available, and very strong. We invite our chess fans to join the
fishtest testing framework and programmers to contribute to the project [4].
Stay safe and enjoy chess!
The Stockfish team
[1] https://stockfishchess.org/download/
[2] https://tests.stockfishchess.org/tests/view/6175c320af70c2be1788fa2b
[3] https://github.com/official-stockfish/Stockfish/discussions/3628
[4] https://stockfishchess.org/get-involved/
Idea of this patch is the following: in case we already have four moves that
exceeded alpha in the current node, the probability of finding fifth should
be reasonably low. Note that four is completely arbitrary - there could and
probably should be some tweaks, both in tweaking best move count threshold
for more reductions and tweaking how they work - for example making more
reductions with best move count linearly.
passed STC:
https://tests.stockfishchess.org/tests/view/615f614783dd501a05b0aee2
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 141816 W: 36056 L: 35686 D: 70074
Ptnml(0-2): 499, 15131, 39273, 15511, 494
passed LTC:
https://tests.stockfishchess.org/tests/view/615fdff683dd501a05b0af35
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 68536 W: 17221 L: 16891 D: 34424
Ptnml(0-2): 38, 6573, 20725, 6885, 47
closes https://github.com/official-stockfish/Stockfish/pull/3736
Bench: 6131513
This patch updates the stat_bonus() function (used in the history tables to
help move ordering), keeping the same quadratic for small depths but changing
the values for depth >= 9:
The old bonus formula was increasing from zero at depth 1 to 4100 at depth 14,
then used the strange, small value of 73 for all depths >= 15.
The new bonus formula increases from 0 at depth 1 to 2000 at depth 8, then
keeps 2000 for all depths >= 8.
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 169624 W: 42875 L: 42454 D: 84295
Ptnml(0-2): 585, 19340, 44557, 19729, 601
https://tests.stockfishchess.org/tests/view/615bd69e9d256038a969b97c
passed LTC:
LLR: 3.07 (-2.94,2.94) <0.50,3.50>
Total: 37336 W: 9456 L: 9191 D: 18689
Ptnml(0-2): 20, 3810, 10747, 4067, 24
https://tests.stockfishchess.org/tests/view/615c75d99d256038a969b9b2
closes https://github.com/official-stockfish/Stockfish/pull/3731
Bench: 6261865
When playing games in MultiPV mode we must take care to only track the
best move changing for the first PV line. Otherwise, SF will spend most
of its time for the initial moves after the book exit.
This has been observed and reported on Discord, but can also be seen in
games played in Stefan Pohl's MultiPV experiment.
Tested with MultiPV=4.
STC:
https://tests.stockfishchess.org/tests/view/615c24b59d256038a969b990
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 1744 W: 694 L: 447 D: 603
Ptnml(0-2): 32, 125, 358, 278, 79
LTC:
https://tests.stockfishchess.org/tests/view/615c31769d256038a969b993
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 2048 W: 723 L: 525 D: 800
Ptnml(0-2): 10, 158, 511, 314, 31
closes https://github.com/official-stockfish/Stockfish/pull/3729
Bench: 5714575
Respin of multi-thread idea that was simplified away recently: basically doing
more reductions with thread count since Lazy SMP naturally widens search. With
drawish book this idea got simplified away but with less drawish book it again
gains elo, maybe trying to reinstall other ideas that were simplified away
previously can be beneficial.
passed STC
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 39736 W: 10205 L: 9986 D: 19545
Ptnml(0-2): 45, 4254, 11064, 4447, 58
https://tests.stockfishchess.org/tests/view/615750702d02f48db3961b00
passed LTC
LLR: 2.97 (-2.94,2.94) <0.50,3.50>
Total: 60352 W: 15530 L: 15218 D: 29604
Ptnml(0-2): 24, 5900, 18016, 6212, 24
https://tests.stockfishchess.org/tests/view/6157d8935488e26ea5eace7f
closes https://github.com/official-stockfish/Stockfish/pull/3724
Bench 5714575
Idea is to extend some quiet ttMoves if a lot of things indicate that
the transposition table move is going to be a good move:
1) move being a killer - so being the best move in nearby node;
2) reply continuation history is really good.
This is basically saying that move is good "in general" in this position,
that it is a good reply to the opponent move and that it was the best in
this position somewhere in search - so extending it makes a lot of sense.
In general in past year we had a lot of extensions of different types,
maybe there is something more in it :)
passed STC
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 42944 W: 10932 L: 10695 D: 21317
Ptnml(0-2): 141, 4869, 11210, 5116, 136
https://tests.stockfishchess.org/tests/view/614cca8e7bdc23e77ceb89f0
passed LTC
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 156848 W: 39473 L: 38893 D: 78482
Ptnml(0-2): 125, 16327, 44913, 16961, 98
https://tests.stockfishchess.org/tests/view/614cf93d7bdc23e77ceb8a13
closes https://github.com/official-stockfish/Stockfish/pull/3719
Bench: 5714575
In master, during singular move analysis, when both the transposition value
and a reduced search for the other moves seem to indicate a fail high, we
heuristically prune the whole subtree and return an fail high score.
This patch is a little bit more cautious in this case, and instead of the
risky cutoff, we now search the ttMove with a reduced depth (by two plies).
STC:
https://tests.stockfishchess.org/tests/view/614dafe07bdc23e77ceb8a89
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 46728 W: 11909 L: 11666 D: 23153
Ptnml(0-2): 181, 5288, 12168, 5561, 166
LTC:
https://tests.stockfishchess.org/tests/view/614dc84abe4c07e0ecac3c95
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 74520 W: 18809 L: 18450 D: 37261
Ptnml(0-2): 45, 7735, 21346, 8084, 50
closes https://github.com/official-stockfish/Stockfish/pull/3718
Bench: 5499262
This patch relax a little bit the condition for doubly singular moves
(ie moves that are so forced that we think that they deserve a local
double extension of the search). We lower the margin and allow up to
six such double extensions in the path between the root and the critical
node.
Original idea by Siad Daboul (@TopoIogist) in PR #3709
Tested with the previous commit:
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 33048 W: 8458 L: 8236 D: 16354
Ptnml(0-2): 120, 3701, 8660, 3923, 120
https://tests.stockfishchess.org/tests/view/614b24347bdc23e77ceb88fe
passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 54176 W: 13712 L: 13406 D: 27058
Ptnml(0-2): 36, 5653, 15399, 5969, 31
https://tests.stockfishchess.org/tests/view/614b3b727bdc23e77ceb8911
closes https://github.com/official-stockfish/Stockfish/pull/3714
Bench: 5792377
This patch detects some search explosions (due to double extensions in
search.cpp) which can happen in some pathological positions, and takes
measures to ensure progress in search even for these pathological situations.
While a small number of double extensions can be useful during search
(for example to resolve a tactical sequence), a sustained regime of
double extensions leads to search explosion and a non-finishing search.
See the discussion in https://github.com/official-stockfish/Stockfish/pull/3544
and the issue https://github.com/official-stockfish/Stockfish/issues/3532 .
The implemented algorithm is the following:
a) at each node during search, store the current depth in the stack.
Double extensions are by definition levels of the stack where the
depth at ply N is strictly higher than depth at ply N-1.
b) during search, calculate for each thread a running average of the
number of double extensions in the last 4096 visited nodes.
c) if one thread has more than 2% of double extensions for a sustained
period of time (6 millions consecutive nodes, or about 4 seconds on
my iMac), we decide that this thread is in an explosion state and
we calm down this thread by preventing it to do any double extension
for the next 6 millions nodes.
To calculate the running averages, we also introduced a auxiliary class
generalizing the computations of ttHitAverage variable we already had in
code. The implementation uses an exponential moving average of period 4096
and resolution 1/1024, and all computations are done with integers for
efficiency.
-----------
Example where the patch solves a search explosion:
```
./stockfish
ucinewgame
position fen 8/Pk6/8/1p6/8/P1K5/8/6B1 w - - 37 130
go infinite
```
This algorithm does not affect search in normal, non-pathological positions.
We verified, for instance, that the usual bench is unchanged up to depth 20
at least, and that the node numbers are unchanged for a search of the starting
position at depth 32.
-------------
See https://github.com/official-stockfish/Stockfish/pull/3714
Bench: 5575265
This patch introduces extension for captures and promotions. Every capture or
promotion that is not the first move in the list gets extended at PvNodes and
cutNodes. Special thanks to @locutus2 - all my previous attepmts that failed
on this idea were done only for PvNodes - idea to include also cutNodes was
based on his latest passed patch.
STC
https://tests.stockfishchess.org/tests/view/6134abf325b9b35584838574
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 188920 W: 47754 L: 47304 D: 93862
Ptnml(0-2): 595, 21754, 49344, 22140, 627
LTC
https://tests.stockfishchess.org/tests/view/613521de25b9b355848385d7
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 8768 W: 2283 L: 2098 D: 4387
Ptnml(0-2): 7, 866, 2452, 1053, 6
closes https://github.com/official-stockfish/Stockfish/pull/3692
bench: 5564555
The new network caused some issues initially due to the very narrow neuron set between the first two FC layers. Necessary changes were hacked together to make it work. This patch is a mature approach to make the affine transform code faster, more readable, and easier to maintain should the layer sizes change again.
The following changes were made:
* ClippedReLU always produces a multiple of 32 outputs. This is about as good of a solution for AffineTransform's SIMD requirements as it can get without a bigger rewrite.
* All self-contained simd helpers are moved to a separate file (simd.h). Inline asm is utilized to work around GCC's issues with code generation and register assignment. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101693, https://godbolt.org/z/da76fY1n7
* AffineTransform has 2 specializations. While it's more lines of code due to the boilerplate, the logic in both is significantly reduced, as these two are impossible to nicely combine into one.
1) The first specialization is for cases when there's >=128 inputs. It uses a different approach to perform the affine transform and can make full use of AVX512 without any edge cases. Furthermore, it has higher theoretical throughput because less loads are needed in the hot path, requiring only a fixed amount of instructions for horizontal additions at the end, which are amortized by the large number of inputs.
2) The second specialization is made to handle smaller layers where performance is still necessary but edge cases need to be handled. AVX512 implementation for this was ommited by mistake, a remnant from the temporary implementation for the new... This could be easily reintroduced if needed. A slightly more detailed description of both implementations is in the code.
Overall it should be a minor speedup, as shown on fishtest:
passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 51520 W: 4074 L: 3888 D: 43558
Ptnml(0-2): 111, 3136, 19097, 3288, 128
and various tests shown in the pull request
closes https://github.com/official-stockfish/Stockfish/pull/3663
No functional change
Introduces a new NNUE network architecture and associated network parameters
The summary of the changes:
* Position for each perspective mirrored such that the king is on e..h files. Cuts the feature transformer size in half, while preserving enough knowledge to be good. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.b40q4rb1w7on.
* The number of neurons after the feature transformer increased two-fold, to 1024x2. This is possibly mostly due to the now very optimized feature transformer update code.
* The number of neurons after the second layer is reduced from 16 to 8, to reduce the speed impact. This, perhaps surprisingly, doesn't harm the strength much. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.6qkocr97fezq
The AffineTransform code did not work out-of-the box with the smaller number of neurons after the second layer, so some temporary changes have been made to add a special case for InputDimensions == 8. Also additional 0 padding is added to the output for some archs that cannot process inputs by <=8 (SSE2, NEON). VNNI uses an implementation that can keep all outputs in the registers while reducing the number of loads by 3 for each 16 inputs, thanks to the reduced number of output neurons. However GCC is particularily bad at optimization here (and perhaps why the current way the affine transform is done even passed sprt) (see https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit# for details) and more work will be done on this in the following days. I expect the current VNNI implementation to be improved and extended to other architectures.
The network was trained with a slightly modified version of the pytorch trainer (https://github.com/glinscott/nnue-pytorch); the changes are in https://github.com/glinscott/nnue-pytorch/pull/143
The training utilized 2 datasets.
dataset A - https://drive.google.com/file/d/1VlhnHL8f-20AXhGkILujnNXHwy9T-MQw/view?usp=sharing
dataset B - as described in https://github.com/official-stockfish/Stockfish/commit/ba01f4b95448bcb324755f4dd2a632a57c6e67bc
The training process was as following:
train on dataset A for 350 epochs, take the best net in terms of elo at 20k nodes per move (it's fine to take anything from later stages of training).
convert the .ckpt to .pt
--resume-from-model from the .pt file, train on dataset B for <600 epochs, take the best net. Lambda=0.8, applied before the loss function.
The first training command:
python3 train.py \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--max_epochs=600 \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
The second training command:
python3 serialize.py \
--features=HalfKAv2_hm^ \
../nnue-pytorch-training/experiment_131/run_6/default/version_0/checkpoints/epoch-499.ckpt \
../nnue-pytorch-training/experiment_$1/base/base.pt
python3 train.py \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=0.8 \
--max_epochs=600 \
--resume-from-model ../nnue-pytorch-training/experiment_$1/base/base.pt \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
STC: https://tests.stockfishchess.org/tests/view/611120b32a8a49ac5be798c4
LLR: 2.97 (-2.94,2.94) <-0.50,2.50>
Total: 22480 W: 2434 L: 2251 D: 17795
Ptnml(0-2): 101, 1736, 7410, 1865, 128
LTC: https://tests.stockfishchess.org/tests/view/611152b32a8a49ac5be798ea
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9776 W: 442 L: 333 D: 9001
Ptnml(0-2): 5, 295, 4180, 402, 6
closes https://github.com/official-stockfish/Stockfish/pull/3646
bench: 5189338
This patch improves the codegen in the AffineTransform::forward function for architectures >=SSSE3. Current code works directly on memory and the compiler cannot see that the stores through outptr do not alias the loads through weights and input32. The solution implemented is to perform the affine transform with local variables as accumulators and only store the result to memory at the end. The number of accumulators required is OutputDimensions / OutputSimdWidth, which means that for the 1024->16 affine transform it requires 4 registers with SSSE3, 2 with AVX2, 1 with AVX512. It also cuts the number of stores required by NumRegs * 256 for each node evaluated. The local accumulators are expected to be assigned to registers, but even if this cannot be done in some case due to register pressure it will help the compiler to see that there is no aliasing between the loads and stores and may still result in better codegen.
See https://godbolt.org/z/59aTKbbYc for codegen comparison.
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 140328 W: 10635 L: 10358 D: 119335
Ptnml(0-2): 302, 8339, 52636, 8554, 333
closes https://github.com/official-stockfish/Stockfish/pull/3634
No functional change
combined work by Serio Vieri, Michael Byrne, and Jonathan D (aka SFisGod) based on top of previous developments, by restarts from good nets.
Sergio generated the net https://tests.stockfishchess.org/api/nn/nn-d8609abe8caf.nnue:
The initial net nn-d8609abe8caf.nnue is trained by generating around 16B of training data from the last master net nn-9e3c6298299a.nnue, then trained, continuing from the master net, with lambda=0.2 and sampling ratio of 1. Starting with LR=2e-3, dropping LR with a factor of 0.5 until it reaches LR=5e-4. in_scaling is set to 361. No other significant changes made to the pytorch trainer.
Training data gen command (generates in chunks of 200k positions):
generate_training_data min_depth 9 max_depth 11 count 200000 random_move_count 10 random_move_max_ply 80 random_multi_pv 12 random_multi_pv_diff 100 random_multi_pv_depth 8 write_min_ply 10 eval_limit 1500 book noob_3moves.epd output_file_name gendata/$(date +"%Y%m%d-%H%M")_${HOSTNAME}.binpack
PyTorch trainer command (Note that this only trains for 20 epochs, repeatedly train until convergence):
python train.py --features "HalfKAv2^" --max_epochs 20 --smart-fen-skipping --random-fen-skipping 500 --batch-size 8192 --default_root_dir $dir --seed $RANDOM --threads 4 --num-workers 32 --gpus $gpuids --track_grad_norm 2 --gradient_clip_val 0.05 --lambda 0.2 --log_every_n_steps 50 $resumeopt $data $val
See https://github.com/sergiovieri/Stockfish/tree/tools_mod/rl for the scripts used to generate data.
Based on that Michael generated nn-76a8a7ffb820.nnue in the following way:
The net being submitted was trained with the pytorch trainer: https://github.com/glinscott/nnue-pytorch
python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 30 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --auto_lr_find True --lambda=1.0 --max_epochs=240 --seed %random%%random% --default_root_dir exp/run_109 --resume-from-model ./pt/nn-d8609abe8caf.pt
This run is thus started from Segio Vieri's net nn-d8609abe8caf.nnue
all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing
Each set was concatenated together - making one large Wrong_NNUE 2 binpack and one large Training so the were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack
model.py modifications:
loss = torch.pow(torch.abs(p - q), 2.6).mean()
LR = 8.0e-5 calculated as follows: 1.5e-3*(.992^360) - the idea here was to take a highly trained net and just use all.binpack as a finishing micro refinement touch for the last 2 Elo or so. This net was discovered on the 59th epoch.
optimizer = ranger.Ranger(train_params, betas=(.90, 0.999), eps=1.0e-7, gc_loc=False, use_gc=False)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.992)
For this micro optimization, I had set the period to "5" in train.py. This changes the checkpoint output so that every 5th checkpoint file is created
The final touches were to adjust the NNUE scale, as was done by Jonathan in tests running at the same time.
passed LTC
https://tests.stockfishchess.org/tests/view/60fa45aed8a6b65b2f3a77a4
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 53040 W: 1732 L: 1575 D: 49733
Ptnml(0-2): 14, 1432, 23474, 1583, 17
passed STC
https://tests.stockfishchess.org/tests/view/60f9fee2d8a6b65b2f3a7775
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 37928 W: 3178 L: 3001 D: 31749
Ptnml(0-2): 100, 2446, 13695, 2623, 100.
closes https://github.com/official-stockfish/Stockfish/pull/3626
Bench: 5169957
The main idea is that illegal moves influencing search or
qsearch obviously can't be any sort of good. The only reason
why initially legality checks for search and qsearch were done
after they actually can influence some heuristics is because
legality check is expensive computationally. Eventually in
search it was moved to the place where it makes sure that
illegal moves can't influence search.
This patch shows that the same can be done for qsearch + it
passed STC with elo-gaining bounds + it removes 3 lines of code
because one no longer needs to increment/decrement movecount
on illegal moves.
passed STC with elo-gaining bounds
https://tests.stockfishchess.org/tests/view/60f20aefd1189bed71812da0
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 61512 W: 4688 L: 4492 D: 52332
Ptnml(0-2): 139, 3730, 22848, 3874, 165
The same version functionally but with moving condition ever earlier
passed LTC with simplification bounds.
https://tests.stockfishchess.org/tests/view/60f292cad1189bed71812de9
LLR: 2.98 (-2.94,2.94) <-2.50,0.50>
Total: 60944 W: 1724 L: 1685 D: 57535
Ptnml(0-2): 11, 1556, 27298, 1597, 10
closes https://github.com/official-stockfish/Stockfish/pull/3618
bench 4709569
Not all linux users will have libatomic installed.
When using clang as the system compiler with compiler-rt as the default
runtime library instead of libgcc, atomic builtins may be provided by compiler-rt.
This change allows such users to pass RTLIB=compiler-rt to make sure
the build doesn't error out on the missing (unnecessary) libatomic.
closes https://github.com/official-stockfish/Stockfish/pull/3597
No functional change
Official release version of Stockfish 14
Bench: 4770936
---
Today, we have the pleasure to announce Stockfish 14.
As usual, downloads will be freely available at https://stockfishchess.org
The engine is now significantly stronger than just a few months ago,
and wins four times more game pairs than it loses against the previous
release version [0]. Stockfish 14 is now at least 400 Elo ahead of
Stockfish 7, a top engine in 2016 [1]. During the last five years,
Stockfish has thus gained about 80 Elo per year.
Stockfish 14 evaluates positions more accurately than Stockfish 13 as
a result of two major steps forward in defining and training the
efficiently updatable neural network (NNUE) that provides the evaluation
for positions.
First, the collaboration with the Leela Chess Zero team - announced
previously [2] - has come to fruition. The LCZero team has provided a
collection of billions of positions evaluated by Leela that we have
combined with billions of positions evaluated by Stockfish to train the
NNUE net that powers Stockfish 14. The fact that we could use and combine
these datasets freely was essential for the progress made and demonstrates
the power of open source and open data [3].
Second, the architecture of the NNUE network was significantly updated:
the new network is not only larger, but more importantly, it deals better
with large material imbalances and can specialize for multiple phases of
the game [4]. A new project, kick-started by Gary Linscott and
Tomasz Sobczyk, led to a GPU accelerated net trainer written in
pytorch.[5] This tool allows for training high-quality nets in a couple
of hours.
Finally, this release features some search refinements, minor bug
fixes and additional improvements. For example, Stockfish is now about
90 Elo stronger for chess960 (Fischer random chess) at short time control.
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the fishtest
testing framework and programmers to contribute to the project on
github [6].
Stay safe and enjoy chess!
The Stockfish team
[0] https://tests.stockfishchess.org/tests/view/60dae5363beab81350aca077
[1] https://nextchessmove.com/dev-builds
[2] https://stockfishchess.org/blog/2021/stockfish-13/
[3] https://lczero.org/blog/2021/06/the-importance-of-open-data/
[4] https://github.com/official-stockfish/Stockfish/commit/e8d64af1
[5] https://github.com/glinscott/nnue-pytorch/
[6] https://stockfishchess.org/get-involved/
In the so-called "hybrid" method of evaluation of current master, we use the
classical eval (because of its speed) instead of the NNUE eval when the classical
material balance approximation hints that the position is "winning enough" to
rely on the classical eval.
This trade-off idea between speed and accuracy works well in general, but in
some fortress positions the classical eval is just bad. So in shuffling branches
of the search tree, we (slowly) increase the thresehold so that eventually we
don't trust classical anymore and switch to NNUE evaluation.
This patch increases that threshold faster, so that we switch to NNUE quicker
in shuffling branches. Idea is to incite Stockfish to spend less time in fortresses
lines in the search tree, and spend more time searching the critical lines.
passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 47872 W: 3908 L: 3720 D: 40244
Ptnml(0-2): 122, 3053, 17419, 3199, 143
https://tests.stockfishchess.org/tests/view/60cef34b457376eb8bcab79d
passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 73616 W: 2326 L: 2143 D: 69147
Ptnml(0-2): 21, 1940, 32705, 2119, 23
https://tests.stockfishchess.org/tests/view/60cf6d842114332881e73528
Retested at LTC against lastest master:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 18264 W: 642 L: 532 D: 17090
Ptnml(0-2): 6, 479, 8055, 583, 9
https://tests.stockfishchess.org/tests/view/60d18cd540925195e7a6c351
closes https://github.com/official-stockfish/Stockfish/pull/3578
Bench: 5139233
This patch removes the UCI option for setting Contempt in classical evaluation.
It is exactly equivalent to using Contempt=0 for the UCI contempt value and keeping
the dynamic part in the algo (renaming this dynamic part `trend` to better describe
what it does). We have tried quite hard to implement a working Contempt feature for
NNUE but nothing really worked, so it is probably time to give up.
Interested chess fans wishing to keep playing with the UCI option for Contempt and
use it with the classical eval are urged to download the version tagged "SF_Classical"
of Stockfish (dated 31 July 2020), as it was the last version where our search
algorithm was tuned for the classical eval and is probably our strongest classical
player ever: https://github.com/official-stockfish/Stockfish/tags
Passed STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 72904 W: 6228 L: 6175 D: 60501
Ptnml(0-2): 221, 5006, 25971, 5007, 247
https://tests.stockfishchess.org/tests/view/60c98bf9457376eb8bcab18d
Passed LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 45168 W: 1601 L: 1547 D: 42020
Ptnml(0-2): 38, 1331, 19786, 1397, 32
https://tests.stockfishchess.org/tests/view/60c9c7fa457376eb8bcab1bb
closes https://github.com/official-stockfish/Stockfish/pull/3575
Bench: 4947716
This patch increase the weight of pawns and pieces from 28 to 32
in the scaling formula we apply to the output of the NNUE pure eval.
Increasing this gradient for pawns and pieces means that Stockfish
will try a little harder to keep material when she has the advantage,
and try a little bit harder to escape into an endgame when she is
under pressure.
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 53168 W: 4371 L: 4177 D: 44620
Ptnml(0-2): 160, 3389, 19283, 3601, 151
https://tests.stockfishchess.org/tests/view/60cefd1d457376eb8bcab7ab
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 10888 W: 386 L: 288 D: 10214
Ptnml(0-2): 3, 260, 4821, 356, 4
https://tests.stockfishchess.org/tests/view/60cf709d2114332881e7352b
closes https://github.com/official-stockfish/Stockfish/pull/3571
Bench: 4965430
trained with the Python command
c:\nnue>python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 300 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --lambda=1.0 --max_epochs=440 --seed %random%%random% --default_root_dir exp/run_10 --resume-from-model ./pt/nn-3b20abec10c1.pt
`
all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing
Each set was concatenated together - making one large Wrong_NNUE 2 binpack and one large Training so the were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack .
Net nn-3b20abec10c1.nnue was chosen as the --resume-from-model with the idea that through learning, the manually hex edited values will be learned and will not need to be manually adjusted going forward. They would also be fine tuned by the learning process.
passed STC:
https://tests.stockfishchess.org/tests/view/60cdf91e457376eb8bcab66f
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 18256 W: 1639 L: 1479 D: 15138
Ptnml(0-2): 59, 1179, 6505, 1313, 72
passed LTC:
https://tests.stockfishchess.org/tests/view/60ce2166457376eb8bcab6e1
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 18792 W: 654 L: 542 D: 17596
Ptnml(0-2): 9, 490, 8291, 592, 14
closes https://github.com/official-stockfish/Stockfish/pull/3570
Bench: 5020972
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3569
No functional change
move to github actions to replace travis CI.
First version, testing on linux using gcc and clang.
gcc build with sanitizers and valgrind.
No functional change
Optimization of vondele's nn-33c9d39e5eb6.nnue using SPSA
https://tests.stockfishchess.org/tests/view/60ca68be457376eb8bcab28b
Setting: ck values are default based on how large the parameters are
The new values for this net are the raw values at the end of the tuning (80k games)
The significant changes are in buckets 1 and 2 (5-12 pieces) so the main difference is in playing endgames if we compare it to nn-33c9. There is also change in bucket 7 (29-32 pieces) but not as substantial as the changes in buckets 1 and 2. If we interpret the changes based on an experiment a few months ago, this new net plays more optimistically during endgames and less optimistically during openings.
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 49504 W: 4246 L: 4053 D: 41205
Ptnml(0-2): 140, 3282, 17749, 3407, 174
https://tests.stockfishchess.org/tests/view/60cbd752457376eb8bcab478
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 88720 W: 4926 L: 4651 D: 79143
Ptnml(0-2): 105, 4048, 35793, 4295, 119
https://tests.stockfishchess.org/tests/view/60cc7828457376eb8bcab4fa
closes https://github.com/official-stockfish/Stockfish/pull/3566
Bench: 4758885
This net was created by @pleomati, who manually edited with an hex editor
10 values randomly chosen in the LCSFNet10 net (nn-6ad41a9207d0.nnue) to
create this one. The LCSFNet10 net was trained by Joost VandeVondele from
a dataset combining Stockfish games and Leela games (16x10^9 positions from
SF self-play at depth 9, and 6.3x10^9 positions from Leela games, so overall
72% of Stockfish positions and 28% of Leela positions).
passed STC 10+0.1:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 50888 W: 5881 L: 5654 D: 39353
Ptnml(0-2): 281, 4290, 16085, 4497, 291
https://tests.stockfishchess.org/tests/view/60cbfa68457376eb8bcab49a
passed LTC 60+0.6:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 25480 W: 1498 L: 1338 D: 22644
Ptnml(0-2): 36, 1155, 10193, 1325, 31
https://tests.stockfishchess.org/tests/view/60cc4af8457376eb8bcab4d4
closes https://github.com/official-stockfish/Stockfish/pull/3564
Bench: 4904930
This reverts commit "Fix for Cygwin's environment build-profile", as it was
giving errors for "make clean" on some Windows environments. See comments in
https://github.com/official-stockfish/Stockfish/commit/68bf362ea2385a641be9f5ed9ce2acdf55a1ecf1
Possibly somebody can propose a solution that would fix Cygwin builds and
not break on other system too, stay tuned! :-)
No functional change
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3463
No functional change
of a root move leading to a 3-fold repetition.
With this small fix a draw ranking and thus a draw score is being applied.
This works for both, ranking by dtz or wdl tables.
Fixes https://github.com/official-stockfish/Stockfish/issues/3542
(No functional change without TBs.)
Bench: 4877339
This net is the result of training on data used by the Leela project. More precisely,
we shuffled T60 and T74 data kindly provided by borg (for different Tnn, the data is
a result of Leela selfplay with differently sized Leela nets).
The data is available at vondele's google drive:
https://drive.google.com/drive/folders/1mftuzYdl9o6tBaceR3d_VBQIrgKJsFpl.
The Leela data comes in small chunks of .binpack files. To shuffle them, we simply
used a small python script to randomly rename the files, and then concatenated them
using `cat`. As validation data we picked a file of T60 data. We will further investigate
T74 data.
The training for the NNUE architecture used 200 epochs with the Python trainer from
the Stockfish project. Unlike the previous run we tried with this data, this run does
not have adjusted scaling — not because we didn't want to, but because we forgot.
However, this training randomly skips 40% more positions than previous run. The loss
was very spiky and decreased slower than it does usually.
Training loss: https://github.com/official-stockfish/images/blob/main/training-loss-8e47cf062333.png
Validation loss: https://github.com/official-stockfish/images/blob/main/validation-loss-8e47cf062333.png
This is the exact training command:
python train.py --smart-fen-skipping --random-fen-skipping 14 --batch-size 16384 --threads 4 --num-workers 4 --gpus 1 trainingdata\training_data.binpack validationdata\val.binpack
---
10k STC result:
ELO: 3.61 +-3.3 (95%) LOS: 98.4%
Total: 10000 W: 1241 L: 1137 D: 7622
Ptnml(0-2): 68, 841, 3086, 929, 76
https://tests.stockfishchess.org/tests/view/60c67e50457376eb8bcaae70
10k LTC result:
ELO: 2.71 +-2.4 (95%) LOS: 98.8%
Total: 10000 W: 659 L: 581 D: 8760
Ptnml(0-2): 22, 485, 3900, 579, 14
https://tests.stockfishchess.org/tests/view/60c69deb457376eb8bcaae98
Passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9648 W: 685 L: 545 D: 8418
Ptnml(0-2): 22, 448, 3740, 596, 18
https://tests.stockfishchess.org/tests/view/60c6d41c457376eb8bcaaecf
---
closes https://github.com/official-stockfish/Stockfish/pull/3550
Bench: 4877339
Compute optimal register count for feature transformer accumulation dynamically.
This also introduces a change where AVX512 would only use 8 registers instead of 16
(now possible due to a 2x increase in feature transformer size).
closes https://github.com/official-stockfish/Stockfish/pull/3543
No functional change
This patch restricts LMR extensions (of non-transposition table moves) from being
used when the transposition table move was extended by two plies via singular
extension. This may serve to limit search explosions in certain positions.
This makes a lot of sense because the precondition for the tt-move to have been
singular extended by two plies is that the result of the alternate search (with
excluded the tt-move) has been a hard fail low: it is natural to later search less
for non tt-moves in this situation.
The current state of depth/extensions/reductions management is getting quite tricky
in our search algo, see https://github.com/official-stockfish/Stockfish/pull/3546#issuecomment-860174549
for some discussion. Suggestions welcome!
Passed STC
https://tests.stockfishchess.org/tests/view/60c3f293457376eb8bcaac8d
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 117984 W: 9698 L: 9430 D: 98856
Ptnml(0-2): 315, 7708, 42703, 7926, 340
passed LTC
https://tests.stockfishchess.org/tests/view/60c46ea5457376eb8bcaacc7
LLR: 2.97 (-2.94,2.94) <0.50,3.50>
Total: 11280 W: 401 L: 302 D: 10577
Ptnml(0-2): 2, 271, 4998, 364, 5
closes https://github.com/official-stockfish/Stockfish/pull/3546
Bench: 4709974
Load feature transformer weights in bulk on little-endian machines.
This is in particular useful to test new nets with c-chess-cli,
see https://github.com/lucasart/c-chess-cli/issues/44
```
$ time ./stockfish.exe uci
Before : 0m0.914s
After : 0m0.483s
```
No functional change
Cleaner vector code structure in feature transformer. This patch just
regroups the parts of the inner loop for each SIMD instruction set.
Tested for non-regression:
LLR: 2.96 (-2.94,2.94) <-2.50,0.50>
Total: 115760 W: 9835 L: 9831 D: 96094
Ptnml(0-2): 326, 7776, 41715, 7694, 369
https://tests.stockfishchess.org/tests/view/60b96b39457376eb8bcaa26e
It would be nice if a future patch could use some of the macros at
the top of the file to unify the code between the distincts SIMD
instruction sets (of course, unifying the Relu will be the challenge).
closes https://github.com/official-stockfish/Stockfish/pull/3506
No functional change
This simplification patch implements two changes:
1. it simplifies away the so-called "lazy" path in the NNUE evaluation internals,
where we trusted the psqt head alone to avoid the costly "positional" head in
some cases;
2. it raises a little bit the NNUEThreshold1 in evaluate.cpp (from 682 to 800),
which increases the limit where we switched from NNUE eval to Classical eval.
Both effects increase the number of positional evaluations done by our new net
architecture, but the results of our tests below seem to indicate that the loss
of speed will be compensated by the gain of eval quality.
STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 26280 W: 2244 L: 2137 D: 21899
Ptnml(0-2): 72, 1755, 9405, 1810, 98
https://tests.stockfishchess.org/tests/view/60ae73f112066fd299795a51
LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 20592 W: 750 L: 677 D: 19165
Ptnml(0-2): 9, 614, 8980, 681, 12
https://tests.stockfishchess.org/tests/view/60ae88e812066fd299795a82
closes https://github.com/official-stockfish/Stockfish/pull/3503
Bench: 3817907
Definition of the lazy threshold moved to evaluate.cpp where all others are.
Lazy threshold only used for real searches, not used for the "eval" call.
This preserves the purity of NNUE evaluation, which is useful to verify
consistency between the engine and the NNUE trainer.
closes https://github.com/official-stockfish/Stockfish/pull/3499
No functional change
Our new nets output two values for the side to move in the last layer.
We can interpret the first value as a material evaluation of the
position, and the second one as the dynamic, positional value of the
location of pieces.
This patch changes the balance for the (materialist, positional) parts
of the score from (128, 128) to (121, 135) when the piece material is
equal between the two players, but keeps the standard (128, 128) balance
when one player is at least an exchange up.
Passed STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 15936 W: 1421 L: 1266 D: 13249
Ptnml(0-2): 37, 1037, 5694, 1134, 66
https://tests.stockfishchess.org/tests/view/60a82df9ce8ea25a3ef0408f
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 13904 W: 516 L: 410 D: 12978
Ptnml(0-2): 4, 374, 6088, 484, 2
https://tests.stockfishchess.org/tests/view/60a8bbf9ce8ea25a3ef04101
closes https://github.com/official-stockfish/Stockfish/pull/3492
Bench: 3856635
The Tempo variable was introduced 10 years ago in our search because the
classical evaluation function was antisymmetrical in White and Black by design
to gain speed:
Eval(White to play) = -Eval(Black to play)
Nowadays our neural networks know which side is to play in a position when
they evaluate a position and are trained on real games, so the neural network
encodes the advantage of moving as an output of search. This patch shows that
the Tempo variable is not necessary anymore.
STC:
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 33512 W: 2805 L: 2709 D: 27998
Ptnml(0-2): 80, 2209, 12095, 2279, 93
https://tests.stockfishchess.org/tests/view/60a44ceace8ea25a3ef03d30
LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 53920 W: 1807 L: 1760 D: 50353
Ptnml(0-2): 16, 1617, 23650, 1658, 19
https://tests.stockfishchess.org/tests/view/60a477f0ce8ea25a3ef03d49
We also tried a match (20000 games) at STC using purely classical, result was neutral:
https://tests.stockfishchess.org/tests/view/60a4eebcce8ea25a3ef03db5
Note: there are two locations left in search.cpp where we assume antisymmetry
of evaluation (in relation with a speed optimization for null moves in lines
770 and 1439), but as the values are just used for heuristic pruning this
approximation should not hurt too much because the order of magnitude is still
true most of the time.
closes https://github.com/official-stockfish/Stockfish/pull/3481
Bench: 4015864
This improves the speed of NNUE by a bit on old hardware that code path
is intended for, like a Pentium III 1.13 GHz:
10 repeats of "./stockfish bench 16 1 13 default depth NNUE":
Before:
54 642 504 897 cycles (± 0.12%)
62 301 937 829 instructions (± 0.03%)
After:
54 320 821 928 cycles (± 0.13%)
62 084 742 699 instructions (± 0.02%)
Speed of go depth 20 from startpos:
Before: 53103 nps
After: 53856 nps
closes https://github.com/official-stockfish/Stockfish/pull/3476
No functional change.
This patch increases lmrDepth threshold for continuation history based pruning in search.
This part of code for a long time was known to be really TC sensitive - decreasing
this threshold easily passed lower time controls but failed badly at LTC,
on the other hand it increase was part of a tuning that resulted
in being negative at STC but was +12 elo at 180+1.8.
After recent simplification of special conditions that sometimes
increase it from 4 to 5 it was logical to overall test at longer
time controls if 5 is better than 4 with deeper searches.
reduces strenght on STC
https://tests.stockfishchess.org/tests/view/60a3a8bbce8ea25a3ef03c74
ELO: -2.57 +-2.0 (95%) LOS: 0.6%
Total: 20000 W: 1820 L: 1968 D: 16212
Ptnml(0-2): 68, 1582, 6836, 1458, 56
Passed LTC with STC bounds
https://tests.stockfishchess.org/tests/view/60a027395085663412d090ce
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 175256 W: 6774 L: 6548 D: 161934
Ptnml(0-2): 91, 5808, 75604, 6034, 91
Passed VLTC with LTC bounds
https://tests.stockfishchess.org/tests/view/60a2bccce229097940a037a7
LLR: 2.96 (-2.94,2.94) <0.50,3.50>
Total: 65736 W: 1224 L: 1092 D: 63420
Ptnml(0-2): 5, 1012, 30706, 1136, 9
closes https://github.com/official-stockfish/Stockfish/pull/3473
bench 3689330
- Comment for Countemove pruning -> Continuation history
- Fix comment in input_slice.h
- Shorter lines in Makefile
- Comment for scale factor
- Fix comment for pinners in see_ge()
- Change Thread.id() signature to size_t
- Trailing space in reprosearch.sh
- Add Douglas Matos Gomes to the AUTHORS file
- Introduce comment for undo_null_move()
- Use Stockfish coding style for export_net()
- Change date in AUTHORS file
closes https://github.com/official-stockfish/Stockfish/pull/3416
No functional change
e2k (Elbrus 2000) - this is a VLIW/EPIC architecture,
the like Intel Itanium (IA-64) architecture.
The architecture has half native / half software support
for most Intel/AMD SIMD (e.g. MMX/SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2/AES/AVX/AVX2 & 3DNow!/SSE4a/XOP/FMA4) via intrinsics.
https://en.wikipedia.org/wiki/Elbrus_2000
closes https://github.com/official-stockfish/Stockfish/pull/3425
No functional change
This PR adds an ability to export any currently loaded network.
The export_net command now takes an optional filename parameter.
If the loaded net is not the embedded net the filename parameter is required.
Two changes were required to support this:
* the "architecture" string, which is really just a some kind of description in the net, is now saved into netDescription on load and correctly saved on export.
* the AffineTransform scrambles weights for some architectures and sparsifies them, such that retrieving the index is hard. This is solved by having a temporary scrambled<->unscrambled index lookup table when loading the network, and the actual index is saved for each individual weight that makes it to canSaturate16. This increases the size of the canSaturate16 entries by 6 bytes.
closes https://github.com/official-stockfish/Stockfish/pull/3456
No functional change
This patch broadens and simplifies definition of PvNode that is likely to fail low.
New definition can be described as following "If node was already researched
at depth >= current depth and failed low there" which is more logical than the
previous version and takes less space + allows to not recompute it every time during move loop.
Passed simplification STC
https://tests.stockfishchess.org/tests/view/609148bf95e7f1852abd2e82
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 20128 W: 1865 L: 1751 D: 16512
Ptnml(0-2): 63, 1334, 7165, 1430, 72
Passed simplification LTC
https://tests.stockfishchess.org/tests/view/6091691295e7f1852abd2e8b
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 95128 W: 3498 L: 3481 D: 88149
Ptnml(0-2): 41, 2956, 41549, 2981, 37
closes https://github.com/official-stockfish/Stockfish/pull/3455
Bench: 3933037
Introduce variable tempo for nnue depending on logarithm of estimated
strength, where strength is the product of time and number of threads.
The original idea here was that NNUE is best with a slightly different
tempo value to classical, since its style of play is slightly different.
It turns out that the best tempo for NNUE varies with strength of play,
so a formula is used which gives about 19 for STC and 24 for LTC under
current fishtest settings.
STC 10+0.1:
LLR: 2.94 (-2.94,2.94) {-0.20,1.10}
Total: 120816 W: 11155 L: 10861 D: 98800
Ptnml(0-2): 406, 8728, 41933, 8848, 493
https://tests.stockfishchess.org/tests/view/60735b3a8141753378960534
LTC 60+0.6:
LLR: 2.94 (-2.94,2.94) {0.20,0.90}
Total: 35688 W: 1392 L: 1234 D: 33062
Ptnml(0-2): 23, 1079, 15473, 1255, 14
https://tests.stockfishchess.org/tests/view/6073ffbc814175337896057f
Passed non-regression SMP test at LTC 20+0.2 (8 threads):
LLR: 2.95 (-2.94,2.94) {-0.70,0.20}
Total: 11008 W: 317 L: 267 D: 10424
Ptnml(0-2): 2, 245, 4962, 291, 4
https://tests.stockfishchess.org/tests/view/60749ea881417533789605a4
closes https://github.com/official-stockfish/Stockfish/pull/3426
Bench 4075325
A lot of optimizations happend since the NNUE was introduced
and since then some parts of the code were left unused. This
got to the point where asserts were have to be made just to
let people know that modifying something will not have any
effects or may even break everything due to the assumptions
being made. Removing these parts removes those inexisting
"false dependencies". Additionally:
* append_changed_indices now takes the king pos and stateinfo
explicitly, no more misleading pos parameter
* IndexList is removed in favor of a generic ValueList.
Feature transformer just instantiates the type it needs.
* The update cost and refresh requirement is deferred to the
feature set once again, but now doesn't go through the whole
FeatureSet machinery and just calls HalfKP directly.
* accumulator no longer has a singular dimension.
* The PS constants and the PieceSquareIndex array are made local
to the HalfKP feature set because they are specific to it and
DO differ for other feature sets.
* A few names are changed to more descriptive
Passed STC non-regression:
https://tests.stockfishchess.org/tests/view/608421dd95e7f1852abd2790
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 180008 W: 16186 L: 16258 D: 147564
Ptnml(0-2): 587, 12593, 63725, 12503, 596
closes https://github.com/official-stockfish/Stockfish/pull/3441
No functional change
NNUE evaluation is incapable of recognizing trivially drawn bishop endgames
(the wrong-colored rook pawn), which are in fact ubiquitous and stock standard
in chess analysis. Switching off NNUE evaluation in KBPs vs KPs endgames is
a measure that stops Stockfish from trading down to a drawn version of these
endings when we presumably have advantage. The patch is able to edge over master
in endgame positions.
Patch tested for Elo gain with the "endgame.epd" book, and verified for
non-regression with our usual book (see the pull request for details).
STC:
LLR: 2.93 (-2.94,2.94) {-0.20,1.10}
Total: 33232 W: 6655 L: 6497 D: 20080
Ptnml(0-2): 4, 2342, 11769, 2494, 7
https://tests.stockfishchess.org/tests/view/6074a52981417533789605b8
LTC:
LLR: 2.93 (-2.94,2.94) {0.20,0.90}
Total: 159056 W: 29799 L: 29378 D: 99879
Ptnml(0-2): 7, 9004, 61085, 9425, 7
https://tests.stockfishchess.org/tests/view/6074c39a81417533789605ca
Closes https://github.com/official-stockfish/Stockfish/pull/3427
Bench: 4503918
blah
This patch changes the pop_lsb() signature from Square pop_lsb(Bitboard*) to
Square pop_lsb(Bitboard&). This is more idomatic for C++ style signatures.
Passed a non-regression STC test:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 21280 W: 1928 L: 1847 D: 17505
Ptnml(0-2): 71, 1427, 7558, 1518, 66
https://tests.stockfishchess.org/tests/view/6053a1e22433018de7a38e2f
We have verified that the generated binary is identical on gcc-10.
Closes https://github.com/official-stockfish/Stockfish/pull/3404
No functional change.
our net currently is not trained on FRC games, and so doesn't know about the important pattern of a bishop that is cornered in FRC.
This patch introduces a term we have in the classical evaluation for this case, and adds it to the NNUE eval.
Since fishtest doesn't support FRC right now, the patch was tested locally at STC conditions,
starting from the book of FRC starting positions.
Score of master vs patch: 993 - 2226 - 6781 [0.438] 10000
Which corresponds to approximately 40 Elo
The patch passes non-regression testing for traditional chess (where it adds one branch).
passed STC:
https://tests.stockfishchess.org/tests/view/604fa2532433018de7a38b67
LLR: 2.95 (-2.94,2.94) {-1.25,0.25}
Total: 30560 W: 2701 L: 2636 D: 25223
Ptnml(0-2): 88, 2056, 10921, 2133, 82
passed STC also in an earlier version:
https://tests.stockfishchess.org/tests/view/604f61282433018de7a38b4d
closes https://github.com/official-stockfish/Stockfish/pull/3398
No functional change
We remark that in current master, most of our use cases for between_bb() can be
optimized if the second parameter of the function is added to the segment. So we
change the definition of between_bb(s1, s2) such that it excludes s1 but includes s2.
We also use a precomputed array for between_bb() for another small speed gain
(see https://tests.stockfishchess.org/tests/view/604d09f72433018de7a389fb).
Passed STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 18736 W: 1746 L: 1607 D: 15383
Ptnml(0-2): 61, 1226, 6644, 1387, 50
https://tests.stockfishchess.org/tests/view/60428c84ddcba5f0627bb6e4
Yellow LTC:
LTC:
LLR: -3.00 (-2.94,2.94) {0.25,1.25}
Total: 39144 W: 1431 L: 1413 D: 36300
Ptnml(0-2): 13, 1176, 17184, 1178, 21
https://tests.stockfishchess.org/tests/view/605128702433018de7a38ca1
Closes https://github.com/official-stockfish/Stockfish/pull/3397
---------
Verified for correctness by running perft on the following position:
./stockfish
position fen 4rrk1/1p1nq3/p7/2p1P1pp/3P2bp/3Q1Bn1/PPPB4/1K2R1NR w - - 40 21
go perft 6
Nodes searched: 6136386434
--------
No functional change
The codebase contains multiple functions returning by const-value.
This patch is a small cleanup making those function returns
by value instead, removing the const specifier.
closes https://github.com/official-stockfish/Stockfish/pull/3328
No functional change
We introduce a metric for each internal node in search, called DistanceFromPV.
This distance indicated how far the current node is from the principal variation.
We then use this distance to search the nodes which are close to the PV a little
deeper (up to 4 plies deeper than the PV): this improves the quality of the search
at these nodes and bring better updates for the PV during search.
STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 54936 W: 5047 L: 4850 D: 45039
Ptnml(0-2): 183, 3907, 19075, 4136, 167
https://tests.stockfishchess.org/tests/view/6037b88e7f517a561bc4a392
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 49608 W: 1880 L: 1703 D: 46025
Ptnml(0-2): 22, 1514, 21555, 1691, 22
https://tests.stockfishchess.org/tests/view/6038271b7f517a561bc4a3cb
Closes https://github.com/official-stockfish/Stockfish/pull/3369
Bench: 5037279
The idea of this patch can be described as follows: if we are in check
and the transposition table move is a capture that returns a value
far above beta, we can assume that the opponent just blundered a piece
by giving check, and we return the transposition table value. This is
similar to the usual probCut logic for quiet moves, but with a different
threshold.
Passed STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 33440 W: 3056 L: 2891 D: 27493
Ptnml(0-2): 110, 2338, 11672, 2477, 123
https://tests.stockfishchess.org/tests/view/602cd1087f517a561bc49bda
Passed LTC
LLR: 2.98 (-2.94,2.94) {0.25,1.25}
Total: 10072 W: 401 L: 309 D: 9362
Ptnml(0-2): 2, 288, 4365, 378, 3
https://tests.stockfishchess.org/tests/view/602ceea57f517a561bc49bf0
The committed version has an additional fix to never return unproven wins
in the tablebase range or the mate range. This fix passed tests for non-
regression at STC and LTC:
STC:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 26240 W: 2354 L: 2280 D: 21606
Ptnml(0-2): 85, 1763, 9372, 1793, 107
https://tests.stockfishchess.org/tests/view/602d86a87f517a561bc49c7a
LTC:
LLR: 2.95 (-2.94,2.94) {-0.75,0.25}
Total: 35304 W: 1299 L: 1256 D: 32749
Ptnml(0-2): 14, 1095, 15395, 1130, 18
https://tests.stockfishchess.org/tests/view/602d98d17f517a561bc49c83
Closes https://github.com/official-stockfish/Stockfish/pull/3362
Bench: 3830215
Official release version of Stockfish 13
Bench: 3766422
-----
It is our pleasure to release Stockfish 13 to chess fans worldwide.
As usual, downloads are freely available at
https://stockfishchess.org
The Stockfish project builds on a thriving community of enthusiasts
who contribute their expertise, time, and resources to build a free
and open-source chess engine that is robust, widely available, and
very strong. We would like to thank them all!
The good news first: from now on, our users can expect more frequent
high-quality releases of Stockfish! Sadly, this decision has been
triggered by the start of sales of the Fat Fritz 2 engine by ChessBase,
which is a copy of a very recent development version of Stockfish
with minor modifications. We refer to our statement on Fat Fritz 2[1]
and a community blog[2] for further information.
This version of Stockfish is significantly stronger than any of its
predecessors. Stockfish 13 outperforms Stockfish 12 by at least
35 Elo[3]. When playing against a one-year-old Stockfish, it wins 60
times more game pairs than it loses[4]. This release features an NNUE
network retrained on billions of positions, much faster network
evaluation code, and significantly improved search heuristics, as
well as additional evaluation tweaks. In the course of its development,
this version has won the superfinals of the TCEC Season 19 and
TCEC Season 20.
Going forward, the Leela Chess Zero and Stockfish teams will join
forces to demonstrate our commitment to open source chess engines and
training tools, and open data. We are convinced that our free and
open-source chess engines serve the chess community very well.
Stay safe and enjoy chess!
The Stockfish team
[1] https://blog.stockfishchess.org/post/643239805544792064/statement-on-fat-fritz-2
[2] https://lichess.org/blog/YCvy7xMAACIA8007/fat-fritz-2-is-a-rip-off
[3] https://tests.stockfishchess.org/tests/view/602bcccf7f517a561bc49b11
[4] https://tests.stockfishchess.org/tests/view/600fbb9c735dd7f0f0352d59
• reorder some sections of the README file
• add reference to the AUTHORS file
• rename Syzygybases to Syzygy tablebases
• add pointer to the Discord channel
• more precise info about the GPLv3 licence
No functional change
Do not decrease reduction at pv-nodes which are likely to fail low.
The idea of this patch can be described as following: during the search, if a node
on the principal variation was re-searched in non-pv search and this re-search got
a value which was much lower than alpha, then we can assume that this pv-node is
likely to fail low again, and we can reduce more aggressively at this node.
Passed STC
https://tests.stockfishchess.org/tests/view/6023a5fa7f517a561bc49638
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 70288 W: 6443 L: 6223 D: 57622
Ptnml(0-2): 239, 5022, 24436, 5174, 273
Passed LTC
https://tests.stockfishchess.org/tests/view/6023f2617f517a561bc49661
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 105656 W: 4048 L: 3748 D: 97860
Ptnml(0-2): 67, 3312, 45761, 3630, 58
Closes https://github.com/official-stockfish/Stockfish/pull/3349
Bench: 3766422
This patch removes some magic numbers in TT bit management and introduce proper
constants in the code, to improve documentation and ease further modifications.
No function change
It's about 1% speedup for Stockfish.
Result of 100 runs
==================
base (...fish_clang12) = 1946851 +/- 3717
test (./stockfish ) = 1967276 +/- 3408
diff = +20425 +/- 2438
speedup = +0.0105
P(speedup > 0) = 1.0000
Thanks to David Major for making me aware of this part
of LLVM development.
closes https://github.com/official-stockfish/Stockfish/pull/3346
No functional change
This patch give a small bonus to incite the attacking side to keep more
pawns on the board.
A consequence of this bonus is that Stockfish will tend to play positions
slightly more closed on average than master, especially when it believes
that it has an advantage.
To lower the risk of blockades where Stockfish start shuffling without
progress, we also implement a progressive decrease of the evaluation
value with the 50 moves counter (along with the necessary aging of the
transposition table to reduce the impact of the Graph History Interaction
problem): since the evaluation decreases during shuffling phases, the
engine will tend to examine the consequences of pawn breaks faster during
the search.
Passed STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 26184 W: 2406 L: 2252 D: 21526
Ptnml(0-2): 85, 1784, 9223, 1892, 108
https://tests.stockfishchess.org/tests/view/600cc08b735dd7f0f0352c06
Passed LCT:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 199768 W: 7695 L: 7191 D: 184882
Ptnml(0-2): 85, 6478, 86269, 6952, 100
https://tests.stockfishchess.org/tests/view/600ccd28735dd7f0f0352c10
Closes https://github.com/official-stockfish/Stockfish/pull/3321
Bench: 3988915
Give an additional penalty of S(20, 10) for any doubled pawn if none of
the opponent's pawns is facing any of our
- pawns or
- pawn attacks;
that means, each of their pawns can push at least one square without
being captured.
This ignores their non-pawns pieces and attacks.
One possible justification: Their pawns' ability to push freely provides
options to react to our threats by changing their pawn structure. Our
doubled pawns however will likely lead to an exploitable weakness, even
if the pawn structure is not yet fixed.
Note that the notion of "their pawns not being fixed" is symmetric for
both players: If all of their pawns can push freely so can ours. All
pawns being freely pushable might just be an early-game-indicator.
However, it can trigger during endgame pawns races, where doubled pawns
are especially hindering, too.
LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 134976 W: 17964 L: 17415 D: 99597
Ptnml(0-2): 998, 12702, 39619, 13091, 1078
https://tests.stockfishchess.org/tests/view/5ffdd5316019e097de3ef281
STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 35640 W: 7219 L: 6904 D: 21517
Ptnml(0-2): 645, 4096, 8084, 4289, 706
https://tests.stockfishchess.org/tests/view/5ffda4a16019e097de3ef265
closes https://github.com/official-stockfish/Stockfish/pull/3302
Bench: 4363873
This change simplifies control flow in the generate_moves function which ensures the compiler doesn't duplicate work due to possibly not resolving pureness of the function calls. Also the biggest change is the removal of the unnecessary condition checking for empty b in a convoluted way. The rationale for removal of this condition is that computing attacks_bb with occupancy is not much more costly than computing pseudo attacks and overall the condition (also being likely unpredictable) is a pessimisation.
This is inspired by previous changes by @BM123499.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 88040 W: 8172 L: 7931 D: 71937
Ptnml(0-2): 285, 6128, 30957, 6361, 289
https://tests.stockfishchess.org/tests/view/5ffc28386019e097de3ef1c7
closes https://github.com/official-stockfish/Stockfish/pull/3300
No functional change.
Changed name from Bad Outpost to Uncontested Outpost
Scale Uncontested Outpost with number of pawns + Decrease Bishop PSQT values and general tuning
Credits for the decrease of the Bishop PSQT values: Fauzi
Credits for scaling Uncontested Outpost with number of pawns: Lolligerhans
Credits for the tunings: Fauzi
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 32040 W: 6593 L: 6281 D: 19166
Ptnml(0-2): 596, 3713, 7095, 4015, 601
https://tests.stockfishchess.org/tests/view/5ffa43026019e097de3ef0f2
Passed LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 84376 W: 11395 L: 10950 D: 62031
Ptnml(0-2): 652, 7930, 24623, 8287, 696
https://tests.stockfishchess.org/tests/view/5ffa6e7b6019e097de3ef0fd
closes https://github.com/official-stockfish/Stockfish/pull/3296
Bench: 4287509
- "discovered check" (instead of "discovery check")
- "en passant" (instead of "en-passant")
- "pseudo-legal" before a noun (instead of "pseudo legal")
- "3-fold" (instead of "3fold")
closes https://github.com/official-stockfish/Stockfish/pull/3294
No functional change.
In a recent patch we added comparing capture history to a number for LMR of captures.
Calling it via thisThread-> is not needed since capture history was already declared by this time -
so removing makes code slightly shorter and easier to follow.
closes https://github.com/official-stockfish/Stockfish/pull/3297
No functional change.
Size of the weights in the last layer is less than 512 bits. It leads to wrong data access for AVX512. There is no error because in current implementation it is guaranteed that there is an array of zeros after weights so zero multiplied by something is returned and sum is correct. It is a mistake that can lead to unexpected bugs in the future. Used AVX2 instructions for smaller input size.
No measurable slowdown on avx512.
closes https://github.com/official-stockfish/Stockfish/pull/3298
No functional change.
Reordered weights in such a way that accumulated sum fits to output.
Weights are grouped in blocks of four elements because four
int8 (weight type) corresponds to one int32 (output type).
No horizontal additions.
Grouped AVX512, AVX2 and SSSE3 implementations.
Repeated code was removed.
An earlier version passed STC:
LLR: 2.97 (-2.94,2.94) {-0.25,1.25}
Total: 15336 W: 1495 L: 1355 D: 12486
Ptnml(0-2): 44, 1054, 5350, 1158, 62
https://tests.stockfishchess.org/tests/view/5ff60e106019e097de3eefd5
Speedup depends on the architecture, up to 4% measured on a NNUE only bench.
closes https://github.com/official-stockfish/Stockfish/pull/3287
No functional change
The idea of this patch is pretty simple - we already do more reductions
for non-PV and root nodes in case of stable best move for depth > 10.
This patch makes us do so if root depth if > 10 instead, which
is logical since best move changes (thus instability of it) is
counted at root, so it makes a lot of sense to use depth of the root.
passed STC
https://tests.stockfishchess.org/tests/view/5fd643271ac16912018885c5
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 13232 W: 1308 L: 1169 D: 10755
Ptnml(0-2): 39, 935, 4535, 1062, 45
passed LTC
https://tests.stockfishchess.org/tests/view/5fd68db11ac16912018885f0
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 14024 W: 565 L: 463 D: 12996
Ptnml(0-2): 3, 423, 6062, 517, 7
closes https://github.com/official-stockfish/Stockfish/pull/3263
Bench: 4050630
Improves throughput by summing 2 intermediate dot products using 16 bit addition before upconverting to 32 bit.
Potential saturation is detected and the code-path is avoided in this case.
The saturation can't happen with the current nets,
but nets can be constructed that trigger this check.
STC https://tests.stockfishchess.org/tests/view/5fd40a861ac1691201888479
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 25544 W: 2451 L: 2296 D: 20797
Ptnml(0-2): 92, 1761, 8925, 1888, 106
about 5% speedup
closes https://github.com/official-stockfish/Stockfish/pull/3261
No functional change
Imbalance tables tweaked to contain MiddleGame and Endgame values, instead of a single value.
The idea started from Fisherman, which requested my help to tune the values back in June/July,
so I tuned the values back then, and we were able to accomplish good results,
but not enough to pass both STC and LTC tests.
So after the recent changes, I decided to give it another shot, and I am glad that it was a successful attempt.
A special thanks goes also to mstembera, which notified me a simple way to let the patch perform a little better.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 115976 W: 23124 L: 22695 D: 70157
Ptnml(0-2): 2074, 13652, 26285, 13725, 2252
https://tests.stockfishchess.org/tests/view/5fc92d2d42a050a89f02ccc8
Passed LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 156304 W: 20617 L: 20024 D: 115663
Ptnml(0-2): 1138, 14647, 46084, 15050, 1233
https://tests.stockfishchess.org/tests/view/5fc9fee142a050a89f02cd3e
closes https://github.com/official-stockfish/Stockfish/pull/3255
Bench: 4278746
This appears to be slightly faster than using a comparison against zero
to compute the high bits, on both old (like Pentium III) and new (like
Zen 2) hardware.
closes https://github.com/official-stockfish/Stockfish/pull/3254
No functional change.
The idea of this patch can be described as following: we update static
history stats based on comparison of the static evaluations of the
position before and after the move. If the move increases static evaluation
it's assigned positive bonus, if it decreases static evaluation
it's assigned negative bonus. These stats are used in movepicker
to sort quiet moves.
passed STC
https://tests.stockfishchess.org/tests/view/5fca4c0842a050a89f02cd66
LLR: 3.00 (-2.94,2.94) {-0.25,1.25}
Total: 78152 W: 7409 L: 7171 D: 63572
Ptnml(0-2): 303, 5695, 26873, 5871, 334
passed LTC
https://tests.stockfishchess.org/tests/view/5fca6be442a050a89f02cd75
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 40240 W: 1602 L: 1441 D: 37197
Ptnml(0-2): 19, 1306, 17305, 1475, 15
closes https://github.com/official-stockfish/Stockfish/pull/3253
bench 3845156
Include scaling change as suggested by Dietrich Kappe,
the one who trained net for Komodo. According to him,
some nets may require different scaling in order to utilize its full strength.
STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 99856 W: 9669 L: 9401 D: 80786
Ptnml(0-2): 374, 7468, 34037, 7614, 435
https://tests.stockfishchess.org/tests/view/5fc2697642a050a89f02c8ec
LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 29840 W: 1220 L: 1081 D: 27539
Ptnml(0-2): 10, 969, 12827, 1100, 14
https://tests.stockfishchess.org/tests/view/5fc2ea5142a050a89f02c957
Bench: 3561701
This patch removes the incrementally updated piece lists from the Position object.
This has been tried before but always failed. My reasons for trying again are:
* 32-bit systems (including phones) are now much less important than they were some years ago (and are absent from fishtest);
* NNUE may have made SF less finely tuned to the order in which moves were generated.
STC:
LLR: 2.94 (-2.94,2.94) {-1.25,0.25}
Total: 55272 W: 5260 L: 5216 D: 44796
Ptnml(0-2): 208, 4147, 18898, 4159, 224
https://tests.stockfishchess.org/tests/view/5fc2986a42a050a89f02c926
LTC:
LLR: 2.96 (-2.94,2.94) {-0.75,0.25}
Total: 16600 W: 673 L: 608 D: 15319
Ptnml(0-2): 14, 533, 7138, 604, 11
https://tests.stockfishchess.org/tests/view/5fc2f98342a050a89f02c95c
closes https://github.com/official-stockfish/Stockfish/pull/3247
Bench: 3940967
+-----------------+
| . . . . . . . . | All files are closed. Some files are
| . . . . . o o . | more valuable for rooks, because
| . . . . o . . o | they might open in the future.
| . . . o x . . x |
| o . o x . x x . |
| x o x . . . . . | x our pawns
| . x . . . . . . | o their pawns
| . . . . . . . . | ^ rooks are scored higher on these files
+-----------------+
^ ^
Files containing none of our own pawns are open or half-open (otherwise
they are closed). Rooks on (half-)open files recieve a bonus for the
future potential to act along all ranks.
This commit refines the (relative) penalty of rooks on closed files.
Files that contain one of our blocked pawns are considered less likely
to open in the future; rooks on these files are now penalized stronger.
This bonus does not generally correlate with mobility. If the condition
is sufficiently refined in the future, it may be beneficial to adjust or
override mobility scores in some cases.
LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 494384 W: 71565 L: 70231 D: 352588
Ptnml(0-2): 3907, 48050, 142118, 49036, 4081
https://tests.stockfishchess.org/tests/view/5fb9312e67cbf42301d6afb9
LTC (non-regression w/ book noob_3moves.epd)
LLR: 2.95 (-2.94,2.94) {-0.75,0.25}
Total: 208520 W: 27044 L: 26937 D: 154539
Ptnml(0-2): 1557, 19850, 61391, 19853, 1609
https://tests.stockfishchess.org/tests/view/5fc01ced67cbf42301d6b3df
STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 98392 W: 20269 L: 19868 D: 58255
Ptnml(0-2): 1804, 11297, 22589, 11706, 1800
https://tests.stockfishchess.org/tests/view/5fb7f88a67cbf42301d6af10
closes https://github.com/official-stockfish/Stockfish/pull/3242
Bench: 3682630
in affine transform for AVX512/AVX2/SSSE3
The idea is to initialize sum with the first element instead of zero.
Reduce one add_epi32 and one set_zero SIMD instructions for each output dimension.
sum = 0; for i = 1 to n sum += a[i] ->
sum = a[1]; for i = 2 to n sum += a[i]
STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 69048 W: 7024 L: 6799 D: 55225
Ptnml(0-2): 260, 5175, 23458, 5342, 289
https://tests.stockfishchess.org/tests/view/5faf2cf467cbf42301d6aa06
closes https://github.com/official-stockfish/Stockfish/pull/3227
No functional change.
For the feature transformer the code is analogical to AVX2 since there was room for easy adaptation of wider simd registers.
For the smaller affine transforms that have 32 byte stride we keep 2 columns in one zmm register. We also unroll more aggressively so that in the end we have to do 16 parallel horizontal additions on ymm slices each consisting of 4 32-bit integers. The slices are embedded in 8 zmm registers.
These changes provide about 1.5% speedup for AVX-512 builds.
Closes https://github.com/official-stockfish/Stockfish/pull/3218
No functional change.
Using no searching time in case of a single legal move is not beneficial from
a strength point of view, and this special case can be easily removed:
STC:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 22472 W: 2458 L: 2357 D: 17657
Ptnml(0-2): 106, 1733, 7453, 1842, 102
https://tests.stockfishchess.org/tests/view/5f926cbc81eda81bd78cb6df
LTC:
LLR: 2.94 (-2.94,2.94) {-0.75,0.25}
Total: 37880 W: 1736 L: 1682 D: 34462
Ptnml(0-2): 22, 1392, 16057, 1448, 21
https://tests.stockfishchess.org/tests/view/5f92a26081eda81bd78cb6fe
The advantage of using the normal time management for a single legal move is that scores
reported for that move are reasonable, not searching leads to artifacts during games
(see e.g. https://tcec-chess.com/#div=sf&game=96&season=19)
The disadvantage of using normal time management of a single legal move is that thinking
times can be unnaturally long, making it 'painful to watch' in online tournaments.
This patch uses normal time management, but caps the used time to 500ms.
This should lead to reasonable scores, and be hardly perceptible.
closes https://github.com/official-stockfish/Stockfish/pull/3195
closes https://github.com/official-stockfish/Stockfish/pull/3183
variant of a patch suggested by SFisGOD
No functional change.
Only do countermove based pruning in qsearch if we already have a move with a better score than a TB loss.
This patch fixes a bug (started as 843a961) that incorrectly prunes moves if in check,
and adds an assert to make sure no wrong mate scores are given in the future.
It replaces a no-op moveCount check with a check for bestValue.
Initially discussed in #3171 and later in #3199, #3198 and #3210.
This PR effectively closes#3171
It also likely fixes#3196 where this causes user visible incorrect TB scores,
which probably result from these incorrect mate scores.
Passed STC and LTC non-regression tests.
https://tests.stockfishchess.org/tests/view/5f9ef8dabca9bf35bae7f648
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 21672 W: 2339 L: 2230 D: 17103
Ptnml(0-2): 126, 1689, 7083, 1826, 112
https://tests.stockfishchess.org/tests/view/5f9f0caebca9bf35bae7f666
LLR: 2.97 (-2.94,2.94) {-0.75,0.25}
Total: 33152 W: 1551 L: 1485 D: 30116
Ptnml(0-2): 27, 1308, 13832, 1390, 19
closes https://github.com/official-stockfish/Stockfish/pull/3214
Bench: 3625915
A non-functional speedup. Unroll the loops going over
the output dimensions in the affine transform layers by
a factor of 4 and perform 4 horizontal additions at a time.
Instead of doing naive horizontal additions on each vector
separately use hadd and shuffling between vectors to reduce
the number of instructions by using all lanes for all stages
of the horizontal adds.
passed STC of the initial version:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 17808 W: 1914 L: 1756 D: 14138
Ptnml(0-2): 76, 1330, 5948, 1460, 90
https://tests.stockfishchess.org/tests/view/5f9d516f6a2c112b60691da3
passed STC of the final version after cleanup:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 16296 W: 1750 L: 1595 D: 12951
Ptnml(0-2): 72, 1192, 5479, 1319, 86
https://tests.stockfishchess.org/tests/view/5f9df5776a2c112b60691de3
closes https://github.com/official-stockfish/Stockfish/pull/3203
No functional change
This patch was inspired by c065abd which updates the accumulator,
if possible, based on the accumulator of two plies back if
the accumulator of the preceding ply is not available.
With this patch we look back even further in the position history
in an attempt to reduce the number of complete recomputations.
When we find a usable accumulator for the position N plies back,
we also update the accumulator of the position N-1 plies back
because that accumulator is most likely to be helpful later
when evaluating positions in sibling branches.
By not updating all intermediate accumulators immediately,
we avoid doing too much work that is not certain to be useful.
Overall, roughly 2-3% speedup.
This patch makes the code more specific to the net architecture,
changing input features of the net will require additional changes
to the incremental update code as discussed in the PR #3193 and #3191.
Passed STC:
https://tests.stockfishchess.org/tests/view/5f9056712c92c7fe3a8c60d0
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 10040 W: 1116 L: 968 D: 7956
Ptnml(0-2): 42, 722, 3365, 828, 63
closes https://github.com/official-stockfish/Stockfish/pull/3193
No functional change.
Idea of this patch can be described as following - in case we have consecutive fail highs and we reach late enough moves at root node probability of remaining quiet moves being able to produce even bigger value than moves that produced previous cutoff (so ones that should be high in move ordering but now they fail to produce beta cutoff because we actually reached high move count) should be quiet small so we can reduce them more.
passed STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 53392 W: 5681 L: 5474 D: 42237
Ptnml(0-2): 214, 4104, 17894, 4229, 255
https://tests.stockfishchess.org/tests/view/5f88501adcdad978fe8c527e
passed LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 59136 W: 2773 L: 2564 D: 53799
Ptnml(0-2): 30, 2117, 25078, 2300, 43
https://tests.stockfishchess.org/tests/view/5f884dbfdcdad978fe8c527a
closes https://github.com/official-stockfish/Stockfish/pull/3184
Bench: 4066972
We now include the total pawn count in the scaling factor for the output
of the NNUE evaluation network. This should have the effect of trying to
keep more pawns when SF has the advantage, but exchange them when she
is defending.
Thanks to Alexander Pagel (Lolligerhans) for the idea of using the
value of pawns to ease the comparison with the rest of the material
estimation.
Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 15072 W: 1700 L: 1539 D: 11833
Ptnml(0-2): 65, 1202, 4845, 1355, 69
https://tests.stockfishchess.org/tests/view/5f7235a63b22d6afa50699b3
Passed LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.25}
Total: 25880 W: 1270 L: 1124 D: 23486
Ptnml(0-2): 23, 980, 10788, 1126, 23
https://tests.stockfishchess.org/tests/view/5f723b483b22d6afa5069a99
closes https://github.com/official-stockfish/Stockfish/pull/3164
Bench: 3776081
Idea is that division by fraction of 2 is slightly faster than by other numbers so parameters are adjusted in a way that division in null move pruning depth reduction features dividing by 256 instead of dividing by 213.
Other than this patch is almost non-functional - difference starts to exist by depth 133.
passed STC
https://tests.stockfishchess.org/tests/view/5f70dd943b22d6afa50693c5
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 57048 W: 6616 L: 6392 D: 44040
Ptnml(0-2): 304, 4583, 18531, 4797, 309
passed LTC
https://tests.stockfishchess.org/tests/view/5f7180db3b22d6afa506941f
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 45960 W: 2419 L: 2229 D: 41312
Ptnml(0-2): 43, 1779, 19137, 1987, 34
closes https://github.com/official-stockfish/Stockfish/pull/3159
bench 3789924
Current master uses a constant scale factor of 5/4 = 1.25 for the output
of the NNUE network, for compatibility with search and classical evaluation.
We modify this scale factor to make it dependent on the phase of the game,
going from about 1.5 in the opening to 1.0 for pure pawn endgames.
This helps Stockfish to avoid exchanges of pieces (heavy pieces in particular)
when she has the advantage, keeping more material on the board when attacking.
Passed STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 14744 W: 1771 L: 1599 D: 11374
Ptnml(0-2): 87, 1184, 4664, 1344, 93
https://tests.stockfishchess.org/tests/view/5f6fb0a63b22d6afa506904f
Passed LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 8912 W: 512 L: 393 D: 8007
Ptnml(0-2): 7, 344, 3637, 459, 9
https://tests.stockfishchess.org/tests/view/5f6fcf533b22d6afa5069066
closes https://github.com/official-stockfish/Stockfish/pull/3154
Bench: 3943952
- Clean signature of functions in namespace NNUE
- Add comment for countermove based pruning
- Remove bestMoveCount variable
- Add const qualifier to kpp_board_index array
- Fix spaces in get_best_thread()
- Fix indention in capture LMR code in search.cpp
- Rename TtmemDeleter to LargePageDeleter
Closes https://github.com/official-stockfish/Stockfish/pull/3063
No functional change
Use TT memory functions to allocate memory for the NNUE weights. This
should provide a small speed-up on systems where large pages are not
automatically used, including Windows and some Linux distributions.
Further, since we now have a wrapper for std::aligned_alloc(), we can
simplify the TT memory management a bit:
- We no longer need to store separate pointers to the hash table and
its underlying memory allocation.
- We also get to merge the Linux-specific and default implementations
of aligned_ttmem_alloc().
Finally, we'll enable the VirtualAlloc code path with large page
support also for Win32.
STC: https://tests.stockfishchess.org/tests/view/5f66595823a84a47b9036fba
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 14896 W: 1854 L: 1686 D: 11356
Ptnml(0-2): 65, 1224, 4742, 1312, 105
closes https://github.com/official-stockfish/Stockfish/pull/3081
No functional change.
This PR sets the "comp" variable simply to "clang",
which seems to be more consistent and allows a small simplification.
The PR also moves the section that sets "profile_make" and "profile_use" to after the NDK section,
which ensures that these variables are now set correctly for NDK/clang.
closes https://github.com/official-stockfish/Stockfish/pull/3121
No functional change
NNUE appears to provide a more stable eval than the classic eval,
so the time use dependencies on bestMoveChanges, fallingEval,
etc may need to change to make the best use of available time.
This change doubles the effect of totBestMoveChanges when giving
more time because the choice of best move is unstable.
STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 101928 W: 11995 L: 11698 D: 78235 Elo +0.78
Ptnml(0-2): 592, 8707, 32103, 8936, 626
https://tests.stockfishchess.org/tests/view/5f538a462d02727c56b36cec
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 186392 W: 10383 L: 9877 D: 166132 Elo +0.81
Ptnml(0-2): 207, 8370, 75539, 8870, 210
https://tests.stockfishchess.org/tests/view/5f54a9712d02727c56b36d5a
closes https://github.com/official-stockfish/Stockfish/pull/3119
Bench 4222126
No need to initialize StatScore at rootNode. Current Logic is redundant because at subsequent levels the grandchildren statScore is initialized to zero.
closes https://github.com/official-stockfish/Stockfish/pull/3122
Non functional change.
Restore the default NNUE setting (enabled) after a bench command.
This also makes the resulting program settings independent of the
number of FENs that are being benched.
Fixes issue #3112.
closes https://github.com/official-stockfish/Stockfish/pull/3113
No functional change.
This fixes#3108 and removes some NNUE code that is currently not used.
At the moment, do_null_move() copies the accumulator from the previous
state into the new state, which is correct. It then clears the "computed_score"
flag because the side to move has changed, and with the other side to move
NNUE will return a completely different evaluation (normally with changed
sign but also with different NNUE-internal tempo bonus).
The problem is that do_null_move() clears the wrong flag. It clears the
computed_score flag of the old state, not of the new state. It turns out
that this almost never affects the search. For example, fixing it does not
change the current bench (but it does change the previous bench). This is
because the search code usually avoids calling evaluate() after a null move.
This PR corrects do_null_move() by removing the computed_score flag altogether.
The flag is not needed because nnue_evaluate() is never called twice on a position.
This PR also removes some unnecessary {}s and inserts a few blank lines
in the modified NNUE files in line with SF coding style.
Resulf ot STC non-regression test:
LLR: 2.95 (-2.94,2.94) {-1.25,0.25}
Total: 26328 W: 3118 L: 3012 D: 20198
Ptnml(0-2): 126, 2208, 8397, 2300, 133
https://tests.stockfishchess.org/tests/view/5f553ccc2d02727c56b36db1
closes https://github.com/official-stockfish/Stockfish/pull/3109
bench: 4109324
- git log HEAD | grep "\b[Bb]ench[ :]\+[0-9]\{7\}" | head -n 1 | sed "s/[^0-9]*\([0-9]*\).*/\1/g" > git_sig
- export benchref=$(cat git_sig)
- echo "Reference bench:" $benchref
# Compiler version string
- $COMPILER -v
# test help target
- make help
# Verify bench number against various builds
- export CXXFLAGS="-Werror -D_GLIBCXX_DEBUG"
- make clean && make -j2 ARCH=x86-64-modern optimize=no debug=yes build && ../tests/signature.sh $benchref
- export CXXFLAGS="-Werror"
- make clean && make -j2 ARCH=x86-64-modern build && ../tests/signature.sh $benchref
- make clean && make -j2 ARCH=x86-64-ssse3 build && ../tests/signature.sh $benchref
- make clean && make -j2 ARCH=x86-64-sse3-popcnt build && ../tests/signature.sh $benchref
- make clean && make -j2 ARCH=x86-64 build && ../tests/signature.sh $benchref
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=general-64 build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32 optimize=no debug=yes build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32-sse41-popcnt build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32-sse2 build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32 build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=general-32 build && ../tests/signature.sh $benchref; fi
# workaround: exclude a custom version of llvm+clang, which doesn't find llvm-profdata on ubuntu
- if [[ "$TRAVIS_OS_NAME" != "linux" || "$COMP" == "gcc" ]]; then make clean && make -j2 ARCH=x86-64-modern profile-build && ../tests/signature.sh $benchref; fi
# compile only for some more advanced architectures (might not run in travis)
- make clean && make -j2 ARCH=x86-64-avx2 build
- make clean && make -j2 ARCH=x86-64-bmi2 build
- make clean && make -j2 ARCH=x86-64-avx512 build
- make clean && make -j2 ARCH=x86-64-vnni512 build
- make clean && make -j2 ARCH=x86-64-vnni256 build
#
# Check perft and reproducible search
- make clean && make -j2 ARCH=x86-64-modern build
- ../tests/perft.sh
- ../tests/reprosearch.sh
#
# Valgrind
#
- export CXXFLAGS="-O1 -fno-inline"
- if [ -x "$(command -v valgrind )" ]; then make clean && make -j2 ARCH=x86-64-modern debug=yes optimize=no build > /dev/null && ../tests/instrumented.sh --valgrind; fi
- if [ -x "$(command -v valgrind )" ]; then ../tests/instrumented.sh --valgrind-thread; fi
#
# Sanitizer
#
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-64-modern sanitize=undefined optimize=no debug=yes build > /dev/null && ../tests/instrumented.sh --sanitizer-undefined; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-64-modern sanitize=thread optimize=no debug=yes build > /dev/null && ../tests/instrumented.sh --sanitizer-thread; fi
are already enabled and no configuration is needed.
### Support on Windows
The use of large pages requires "Lock Pages in Memory" privilege. See
[Enable the Lock Pages in Memory Option (Windows)](https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows)
on how to enable this privilege. Logout/login may be needed
afterwards. Due to memory fragmentation, it may not always be
possible to allocate large pages even when enabled. A reboot
might alleviate this problem. To determine whether large pages
are in use, see the engine log.
## Compiling Stockfish yourself from the sources
Stockfish has support for 32 or 64-bit CPUs, certain hardware
instructions, big-endian machines such as Power PC, and other platforms.
On Unix-like systems, it should be easy to compile Stockfish
directly from the source code with the included Makefile in the folder
`src`. In general it is recommended to run `make help` to see a list of make
targets with corresponding descriptions.
On Unix-like systems, it should be easy to compile Stockfish directly from the
source code with the included Makefile in the folder `src`. In general, it is
recommended to run `make help` to see a list of make targets with corresponding
descriptions.
```
cd src
make help
make build ARCH=x86-64-modern
make net
cd src
make -j build ARCH=x86-64-modern
```
When not using the Makefile to compile (for instance with Microsoft MSVC) you
need to manually set/unset some switches in the compiler command line; see
file *types.h* for a quick reference.
Detailed compilation instructions for all platforms can be found in our
[documentation][wiki-compile-link].
When reporting an issue or a bug, please tell us which version and
compiler you used to create your executable. These informations can
be found by typing the following commands in a console:
```
./stockfish compiler
```
## Understanding the code base and participating in the project
Stockfish's improvement over the last couple of years has been a great
community effort. There are a few ways to help contribute to its growth.
## Contributing
### Donating hardware
Improving Stockfish requires a massive amount of testing. You can donate
your hardware resources by installing the [Fishtest Worker](https://github.com/glinscott/fishtest/wiki/Running-the-worker:-overview)
and view the current tests on [Fishtest](https://tests.stockfishchess.org/tests).
Improving Stockfish requires a massive amount of testing. You can donate your
hardware resources by installing the [Fishtest Worker][worker-link] and viewing
the current tests on [Fishtest][fishtest-link].
### Improving the code
If you want to help improve the code, there are several valuable resources:
* [In this wiki,](https://www.chessprogramming.org) many techniques used in
In the [chessprogramming wiki][programming-link], many techniques used in
Stockfish are explained with a lot of background information.
The [section on Stockfish][programmingsf-link] describes many features
and techniques used by Stockfish. However, it is generic rather than
focused on Stockfish's precise implementation.
* [The section on Stockfish](https://www.chessprogramming.org/Stockfish)
describes many features and techniques used by Stockfish. However, it is
generic rather than being focused on Stockfish's precise implementation.
Nevertheless, a helpful resource.
* The latest source can always be found on [GitHub](https://github.com/official-stockfish/Stockfish).
Discussions about Stockfish take place in the [FishCooking](https://groups.google.com/forum/#!forum/fishcooking)
group and engine testing is done on [Fishtest](https://tests.stockfishchess.org/tests).
If you want to help improve Stockfish, please read this [guideline](https://github.com/glinscott/fishtest/wiki/Creating-my-first-test)
The engine testing is done on [Fishtest][fishtest-link].
If you want to help improve Stockfish, please read this [guideline][guideline-link]
first, where the basics of Stockfish development are explained.
Discussions about Stockfish take place these days mainly in the Stockfish
[Discord server][discord-link]. This is also the best place to ask questions
about the codebase and how to improve it.
## Terms of use
Stockfish is free, and distributed under the**GNU General Public License version 3**
(GPL v3). Essentially, this means that you are free to do almost exactly
what you want with the program, including distributing it among your
friends, making it available for download from your web site, selling
it (either by itself or as part of some bigger software package), or
using it as the starting point for a software project of your own.
Stockfish is free and distributed under the
[**GNU General Public License version 3**][license-link] (GPL v3). Essentially,
this means you are free to do almost exactly what you want with the program,
including distributing it among your friends, making it available for download
from your website, selling it (either by itself or as part of some bigger
software package), or using it as the starting point for a software project of
your own.
The only real limitation is that whenever you distribute Stockfish in
some way, you must always include the full source code, or a pointer
to where the source code can be found. If you make any changes to the
source code, these changes must also be made available under the GPL.
The only real limitation is that whenever you distribute Stockfish in some way,
you MUST always include the license and the full source code (or a pointer to
where the source code can be found) to generate the exact binary you are
distributing. If you make any changes to the source code, these changes must
also be made available under GPL v3.
For full details, read the copy of the GPL v3 found in the file named
# repeat two short games, separated by ucinewgame.
# repeat two short games, separated by ucinewgame.
# with go nodes $nodes they should result in exactly
# the same node count for each iteration.
cat << EOF > repeat.exp
@@ -43,7 +43,7 @@ cat << EOF > repeat.exp
expect eof
EOF
# to increase the likelyhood of finding a non-reproducible case,
# to increase the likelihood of finding a non-reproducible case,
# the allowed number of nodes are varied systematically
for i in `seq 1 20`
do
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.