using github actions, create a prerelease for the latest commit to master.
As such a development version will be available on github, in addition to the latest release.
closes https://github.com/official-stockfish/Stockfish/pull/4622
No functional change
Implemented LEB128 (de)compression for the feature transformer.
Reduces embedded network size from 70 MiB to 39 Mib.
The new nn-78bacfcee510.nnue corresponds to the master net compressed.
closes https://github.com/official-stockfish/Stockfish/pull/4617
No functional change
Use block sparse input for the first fully connected layer on architectures with at least SSSE3.
Depending on the CPU architecture, this yields a speedup of up to 10%, e.g.
```
Result of 100 runs of 'bench 16 1 13 default depth NNUE'
base (...ockfish-base) = 959345 +/- 7477
test (...ckfish-patch) = 1054340 +/- 9640
diff = +94995 +/- 3999
speedup = +0.0990
P(speedup > 0) = 1.0000
CPU: 8 x AMD Ryzen 7 5700U with Radeon Graphics
Hyperthreading: on
```
Passed STC:
https://tests.stockfishchess.org/tests/view/6485aa0965ffe077ca12409c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 8864 W: 2479 L: 2223 D: 4162
Ptnml(0-2): 13, 829, 2504, 1061, 25
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
closes https://github.com/official-stockfish/Stockfish/pull/4612
No functional change
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
7. max-epoch 800, end-lambda 0.7, skip 27, nn-0dd1cebea573 dataset
ep619 became nn-ea57bea57e32.nnue
For the last retraining (step7):
python3 easy_train.py
--experiment-name L1-1536-Re6-masterShuffled-ep639-sk27-Re5-leela-dfrc-v2-T77toT80small-Re4-masterShuffled-ep659-Re3-sameAs-Re2-leela96-dfrc99-16t-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-Re1-LeelaFarseer-new-T77T79 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 27 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 800 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re5-leela-dfrc-v2-T77toT80small-epoch639.nnue \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
For preparing the step6 leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack dataset:
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
test77-dec2021-16tb7p.no-db.min-mar2023.binpack \
test78-janfeb2022-16tb7p.no-db.min-mar2023.binpack \
test79-apr2022-16tb7p-filter-v6-dd.binpack \
test80-apr2022-16tb7p.no-db.min-mar2023.binpack \
/data/leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
The unfiltered Leela data used for the step6 dataset can be found at:
https://robotmoon.com/nnue-training-data
Local elo at 25k nodes per move:
nn-epoch619.nnue : 2.3 +/- 1.9
Passed STC:
https://tests.stockfishchess.org/tests/view/6480d43c6e6ce8d9fc6d7cc8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 40992 W: 11017 L: 10706 D: 19269
Ptnml(0-2): 113, 4400, 11170, 4689, 124
Passed LTC:
https://tests.stockfishchess.org/tests/view/648119ac6e6ce8d9fc6d8208
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129174 W: 35059 L: 34579 D: 59536
Ptnml(0-2): 66, 12548, 38868, 13050, 55
closes https://github.com/official-stockfish/Stockfish/pull/4611
bench: 2370027
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
```bash
python3 easy_train.py \
--experiment-name L1-1536-Re4-leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr-shuffled-sk28 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 28 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 960 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re3-nn-epoch439.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
```
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
```bash
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
filt-v6-dd-min/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
test80-apr2023-2tb7p.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack
```
T80 apr2023 data was converted using lc0-rescorer with ~2tb of tablebases and can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch559.nnue : 25.7 +/- 1.6
Passed STC:
https://tests.stockfishchess.org/tests/view/647cd3b87cf638f0f53f9cbb
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 59200 W: 16000 L: 15660 D: 27540
Ptnml(0-2): 159, 6488, 15996, 6768, 189
Passed LTC:
https://tests.stockfishchess.org/tests/view/647d58de726f6b400e4085d8
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 58800 W: 16002 L: 15657 D: 27141
Ptnml(0-2): 44, 5607, 17748, 5962, 39
closes https://github.com/official-stockfish/Stockfish/pull/4606
bench 2141197
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
For training from scratch:
python3 easy_train.py \
--experiment-name new-L1-1536-T77T79-filter-v6dd \
--training-dataset /data/T77T79-filter-v6-dd.min.binpack \
--max_epoch 400 \
--lambda 1.0 \
--start-from-engine-test-net False \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/Stockfish/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
Retraining commands were similar to each other. For the 3rd retraining run:
python3 easy_train.py \
--experiment-name L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-Re3-sameData \
--training-dataset /data/leela96-dfrc99-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--early-fen-skipping 28 \
--max_epoch 800 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-nn-epoch799.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
The T77+T79 data used is a subset of the master dataset available at:
https://robotmoon.com/nnue-training-data/
T60T70wIsRightFarseerT60T74T75T76.binpack is available at:
https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch759.nnue : 26.9 +/- 1.6
Failed STC
https://tests.stockfishchess.org/tests/view/64742485d29264e4cfa75f97
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 13728 W: 3588 L: 3829 D: 6311
Ptnml(0-2): 71, 1661, 3610, 1482, 40
Failing LTC
https://tests.stockfishchess.org/tests/view/64752d7c4a36543c4c9f3618
LLR: -1.91 (-2.94,2.94) <0.50,2.50>
Total: 35424 W: 9522 L: 9603 D: 16299
Ptnml(0-2): 24, 3579, 10585, 3502, 22
Passed VLTC 180+1.8
https://tests.stockfishchess.org/tests/view/64752df04a36543c4c9f3638
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 47616 W: 13174 L: 12863 D: 21579
Ptnml(0-2): 13, 4261, 14952, 4566, 16
Passed VLTC SMP 60+0.6 th 8
https://tests.stockfishchess.org/tests/view/647446ced29264e4cfa761e5
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 19942 W: 5694 L: 5451 D: 8797
Ptnml(0-2): 6, 1504, 6707, 1749, 5
closes https://github.com/official-stockfish/Stockfish/pull/4593
bench 2222567
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
LTC 60+0.6:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 217434 W: 58382 L: 58363 D: 100689
Ptnml(0-2): 66, 21075, 66419, 21088, 69
https://tests.stockfishchess.org/tests/live_elo/6465d077f3b1a4e86c318d6c
closes https://github.com/official-stockfish/Stockfish/pull/4576
bench: 3190961
Created by retraining nn-dabb1ed23026.nnue with a dataset composed of:
* The previous best dataset (nn-1ceb1a57d117.nnue dataset)
* Adding de-duplicated T80 data from feb2023 and the last 10 days of jan2023, filtered with v6-dd
Initially trained with the same options as the recent master net (nn-1ceb1a57d117.nnue).
Around epoch 890, training was manually stopped and max epoch increased to 1000.
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The same v6-dd filtering and binpack minimizer was used for preparing the recent nn-1ceb1a57d117.nnue dataset.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-jan2022-3of3-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack
```
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch919.nnue : 2.6 +/- 2.8
Passed STC vs. nn-dabb1ed23026.nnue
https://tests.stockfishchess.org/tests/view/644420df94ff3db5625f2af5
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 125960 W: 33898 L: 33464 D: 58598
Ptnml(0-2): 351, 13920, 34021, 14320, 368
Passed LTC vs. nn-1ceb1a57d117.nnue
https://tests.stockfishchess.org/tests/view/64469f128d30316529b3dc46
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 24544 W: 6817 L: 6542 D: 11185
Ptnml(0-2): 8, 2252, 7488, 2505, 19
closes https://github.com/official-stockfish/Stockfish/pull/4546
bench 3714847
* Extending v6 filtering to data from T77 dec2021, T79 may2022, and T80 nov2022
* Reducing the number of duplicate positions, prioritizing position scores seen later in time
* Using a binpack minimizer to reduce the overall data size
Trained the same way as the previous master net, aside from the dataset changes:
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The new v6-dd filtering reduces duplicate positions by iterating over hourly data files within leela test runs, starting with the most recent, then keeping positions the first time they're seen and ignoring positions that are seen again. This ordering was done with the assumption that position scores seen later in time are generally more accurate than scores seen earlier in the test run. Positions are de-duplicated based on piece orientations, the first token in fen strings.
The binpack minimizer was run with default settings after first merging monthly data into single binpacks.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack
```
The code for v6-dd filtering is available along with training data preparation scripts at:
https://github.com/linrock/nnue-data
Links for downloading the training data components:
https://robotmoon.com/nnue-training-data/
The binpack minimizer is from: #4447
Local elo at 25k nodes per move:
nn-epoch859.nnue : 1.2 +/- 2.6
Passed STC:
https://tests.stockfishchess.org/tests/view/643aad7db08900ff1bc5a832
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 565040 W: 150225 L: 149162 D: 265653
Ptnml(0-2): 1875, 62137, 153229, 63608, 1671
Passed LTC:
https://tests.stockfishchess.org/tests/view/643ecf2fa43cf30e719d2042
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 1014840 W: 274645 L: 272456 D: 467739
Ptnml(0-2): 515, 98565, 306970, 100956, 414
closes https://github.com/official-stockfish/Stockfish/pull/4545
bench 3476305
This idea is a result of my second condition combination tuning for reductions:
https://tests.stockfishchess.org/tests/view/643ed5573806eca398f06d61
There were used two parameters per combination: one for the 'sign' of the first and the second condition in a combination. Values >= 50 indicate using a condition directly and values <= -50 means use the negation of a condition.
Each condition pair (X,Y) had two occurances dependent of the order of the two conditions:
- if X < Y the parameters used for more reduction
- if X > Y the parameters used for less reduction
- if X = Y then only one condition is present and A[X][X][0]/A[X][X][1] stands for using more/less reduction for only this condition.
The parameter pair A[7][2][0] (value = -94.70) and A[7][2][1] (value = 93.60) was one of the strongest signals with values near 100/-100.
Here condition nr. 7 was '(ss+1)->cutoffCnt > 3' and condition nr. 2 'move == ttMove'. For condition nr. 7 the negation is used because A[7][2][0] is negative.
This translates finally to less reduction (because 7 > 2) for tt moves if child cutoffs <= 3.
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 65728 W: 17704 L: 17358 D: 30666
Ptnml(0-2): 184, 7092, 18008, 7354, 226
https://tests.stockfishchess.org/tests/view/643ff767ef2529086a7ed042
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 139200 W: 37776 L: 37282 D: 64142
Ptnml(0-2): 58, 13241, 42509, 13733, 59
https://tests.stockfishchess.org/tests/view/6440bfa9ef2529086a7edbc7
closes https://github.com/official-stockfish/Stockfish/pull/4538
Bench: 3548023
This patch is a simplification of my recent elo gainer.
Logically the Elo gainer didn't make much sense and this patch simplifies it into smth more logical.
Instead of assigning negative bonuses to all non-first moves that enter PV nodes
we assign positive bonuses in full depth search after LMR only for moves that
will result in a fail high - thus not assigning positive bonuses
for moves that will go to pv search - so doing "almost" the same as we do in master now for them.
Logic differs for some other moves, though, but this removes some lines of code.
Passed STC:
https://tests.stockfishchess.org/tests/view/642cf5cf77ff3301150dc5ec
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 409320 W: 109124 L: 109308 D: 190888
Ptnml(0-2): 1149, 45385, 111751, 45251, 1124
Passed LTC:
https://tests.stockfishchess.org/tests/view/642fe75d20eb941419bde200
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 260336 W: 70280 L: 70303 D: 119753
Ptnml(0-2): 99, 25236, 79528, 25199, 106
closes https://github.com/official-stockfish/Stockfish/pull/4522
Bench: 4286815
Since bestValue becomes value and beta - alpha is always non-negative,
extraReduction is always false, hence it has no effect.
This patch includes small changes to improve readability.
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
The current implementation generates warnings on MSVC. However, we have
no real use cases for double-typed UCI option values now. Also parameter
tuning only accepts following three types:
int, Value, Score
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang, and yields good performance.
As before, currently only linux is supported.
closes https://github.com/official-stockfish/Stockfish/pull/4478
No functional change
Made advanced Windows API calls (those from Advapi32.dll) dynamically
linked to avoid link errors when compiling using
Intel icx compiler for Windows.
https://github.com/official-stockfish/Stockfish/pull/4467
No functional change
this makes it easier to compile under MSVC, even though we recommend gcc/clang for production compiles at the moment.
In Win32 API, by default, most null-terminated character strings arguments are of wchar_t (UTF16, formerly UCS16-LE) type, i.e. 2 bytes (at least) per character. So, src/misc.cpp should have proper type. Respectively, for src/syzygy/tbprobe.cpp, in Widows, file paths should be std::wstring rather than std::string. However, this requires a very big number of changes, since the config files are also keeping the 8-bit-per-character std::string strings. Therefore, just one change of using 8-byte-per-character CreateFileA make it compile under MSVC.
closes https://github.com/official-stockfish/Stockfish/pull/4438
No functional change
Created by retraining the master net with these modifications:
* New filtering methods for existing data from T80 sep+oct2022, T79 apr2022, T78 jun+jul+aug+sep2022, T77 dec2021
* Adding new filtered data from T80 aug2022 and T78 apr+may2022
* Increasing early-fen-skipping from 28 to 30
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3-sk30 \
--training-dataset /data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--max_epoch 900 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The v3 filtering used for data from T77dec 2021 differs from v2 filtering in that:
* To improve binpack compression, positions after ply 28 were skipped during training by setting position scores to VALUE_NONE (32002) instead of removing them entirely
* All early-game positions with ply <= 28 were removed to maximize binpack compression
* Only bestmove captures at d6pv2 search were skipped, not 2nd bestmove captures
* Binpack compression was repaired for the remaining positions by effectively replacing bestmoves with "played moves" to maintain contiguous sequences of positions in the training game data
After improving binpack compression, The T77 dec2021 data size was reduced from 95G to 19G.
The v6 filtering used for data from T80augsepoctT79aprT78aprtosep 2022 differs from v2 in that:
* All positions with only one legal move were removed
* Tighter score differences at d6pv2 search were used to remove more positions with only one good move than before
* d6pv2 search was not used to remove positions where the best 2 moves were captures
```
python3 interleave_binpacks.py \
nn-547-dataset/leela96-eval-filt-v2.binpack \
nn-547-dataset/dfrc99-eval-filt-v2.binpack \
nn-547-dataset/test80-nov2022-12tb7p-eval-filt-v2-d6.binpack \
nn-547-dataset/T79-may2022-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-nov2021-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.binpack \
filt-v6/test80-oct2022-16tb7p-filter-v6.binpack \
filt-v6/test79-apr2022-16tb7p-filter-v6.binpack \
filt-v6/test78-aprmay2022-16tb7p-filter-v6.binpack \
filt-v6/test78-junjulaug2022-16tb7p-filter-v6.binpack \
filt-v6/test78-sep2022-16tb7p-filter-v6.binpack \
filt-v3/test77-dec2021-16tb7p-filt-v3.binpack \
/data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack
```
The code for the new data filtering methods is available at:
https://github.com/linrock/Stockfish/tree/nnue-data-v3/nnue-data
The code for giving hexword names to .nnue files is at:
https://github.com/linrock/nnue-namer
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch779.nnue : 0.6 +/- 3.1
Passed STC:
https://tests.stockfishchess.org/tests/view/64212412db43ab2ba6f8efb0
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 82256 W: 22185 L: 21809 D: 38262
Ptnml(0-2): 286, 9065, 22067, 9407, 303
Passed LTC:
https://tests.stockfishchess.org/tests/view/64223726db43ab2ba6f91d6c
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 30840 W: 8437 L: 8149 D: 14254
Ptnml(0-2): 14, 2891, 9323, 3177, 15
closes https://github.com/official-stockfish/Stockfish/pull/4465
bench 5101970
This patch simplifies initialization of statScore to "always set it up to 0" instead of setting it up to 0 two plies deeper.
Reason for why it was done in previous way partially was because of LMR usage of previous statScore which was simplified long time ago so it makes sense to make in more simple there.
Passed STC:
https://tests.stockfishchess.org/tests/view/641a86d1db43ab2ba6f7b31d
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 115648 W: 30895 L: 30764 D: 53989
Ptnml(0-2): 368, 12741, 31473, 12876, 366
Passed LTC:
https://tests.stockfishchess.org/tests/view/641b1c31db43ab2ba6f7d17a
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 175576 W: 47122 L: 47062 D: 81392
Ptnml(0-2): 91, 17077, 53390, 17141, 89
closes https://github.com/official-stockfish/Stockfish/pull/4460
bench 5081969
Patch analyzes field after SEE exchanges concluded with a recapture by
the opponent:
if opponent Queen/Rook/King results under attack after the exchanges, we
consider the move sharp and don't prune it.
Important note:
By accident I forgot to adjust 'occupied' when the king takes part in
the exchanges.
As result of this a move is considered sharp too, when opponent king
apparently can evade check by recapturing.
Surprisingly this seems contribute to patch's strength.
STC:
https://tests.stockfishchess.org/tests/view/640b16132644b62c33947397
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 116400 W: 31239 L: 30817 D: 54344
Ptnml(0-2): 350, 12742, 31618, 13116, 374
LTC:
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 177600 W: 47988 L: 47421 D: 82191
Ptnml(0-2): 62, 16905, 54317, 17436, 80
closes https://github.com/official-stockfish/Stockfish/pull/4453
bench: 5012145
Since st is a member of position we don't need to pass it separately as
parameter.
While being there also remove some line in pos_is_ok, where
a copy of StateInfo was made by using default copy constructor and
then verified it's correctedness by doing a memcmp.
There is no point in doing that.
Passed non-regression test
https://tests.stockfishchess.org/tests/view/64098d562644b62c33942b35
LLR: 3.24 (-2.94,2.94) <-1.75,0.25>
Total: 548960 W: 145834 L: 146134 D: 256992
Ptnml(0-2): 1617, 57652, 156261, 57314, 1636
closes https://github.com/official-stockfish/Stockfish/pull/4444
No functional change
Keep incbin.h with the same mode as the other source files.
A mode diff might show up when working with patch files or sending the source code between devices.
This patch should fix such behaviour.
closes https://github.com/official-stockfish/Stockfish/pull/4442
No functional change
in a some of cases movepicker returned some moves more than once which lead
to them being searched more than once. This bug was possible because of how
we use queen promotions - they are generated as a captures but are not
included in position function which checks if move is a capture. Thus if
any refutation (killer or countermove) was a queen promotion it was
searched twice - once as a capture and one as a refutation.
This patch affects various things, namely stats assignments for queen promotions
and other moves if best move is queen promotion,
also some heuristics in search and qsearch.
With this patch every queen promotion is now considered a capture.
After this patch number of found duplicated moves is 0 during normal 13 depth bench run.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f77e01e74a12625bcd87d7
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 80920 W: 21455 L: 21289 D: 38176
Ptnml(0-2): 198, 8839, 22241, 8963, 219
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f7e020e74a12625bcd9a76
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 89712 W: 23674 L: 23533 D: 42505
Ptnml(0-2): 24, 8737, 27202, 8860, 33
closes https://github.com/official-stockfish/Stockfish/pull/4405
bench 4681731
Call the recently added hint function for NNUE accumulator update after a failed probcut search.
In this case we already searched at least some captures and tt move which, however, is not sufficient for a cutoff.
So it seems we have a greater chance that the full search will also have no cutoff and hence all moves have to be searched.
STC: https://tests.stockfishchess.org/tests/view/63fa74a4e74a12625bce1823
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 70096 W: 18770 L: 18423 D: 32903
Ptnml(0-2): 191, 7342, 19654, 7651, 210
To be sure that we have no heavy interaction retest on top of #4410.
Rebased STC: https://tests.stockfishchess.org/tests/view/63fb2f62e74a12625bce3b03
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 137688 W: 36790 L: 36349 D: 64549
Ptnml(0-2): 397, 14373, 38919, 14702, 453
closes https://github.com/official-stockfish/Stockfish/pull/4411
No functional change
Credits to Stefan Geschwentner (locutus2) showing that the hint
is useful on PvNodes. In contrast to his test,
this version avoids to use the hint when in check.
I believe checking positions aren't good candidates for the hint
because:
- evasion moves are rather few, so a checking pos. has much less childs
than a normal position
- if the king has to move the NNUE eval can't use incremental updates,
so the child nodes have to do a full refresh anyway.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f9c5b1e74a12625bcdf585
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 124472 W: 33268 L: 32846 D: 58358
Ptnml(0-2): 350, 12986, 35170, 13352, 378
closes https://github.com/official-stockfish/Stockfish/pull/4410
no functional change
Params found with the nevergrad TBPSA optimizer via nevergrad4sf modified to:
* use SPRT LLR with fishtest STC elo gainer bounds [0, 2] as the objective function
* increase the game batch size after each new optimal point is found
The params were the optimal point after TBPSA iteration 7 and 160 nevergrad evaluations with:
* initial batch size of 96 games per evaluation
* batch size increase of 64 games after each iteration
* a budget of 512 evaluations
* TC: fixed 1.5 million nodes per move, no time limit
nevergrad4sf enables optimizing stockfish params with TBPSA:
https://github.com/vondele/nevergrad4sf
Using pentanomial game results with smaller game batch sizes was inspired by:
Use of SPRT LLR calculated from pentanomial game results as the objective function was an experiment at maximizing the information from game batches to reduce the computational cost for TBPSA to converge on good parameters.
For the exact code used to find the params:
https://github.com/linrock/tuning-fork
Passed STC:
https://tests.stockfishchess.org/tests/view/63f4ef5ee74a12625bcd114a
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 66552 W: 17736 L: 17390 D: 31426
Ptnml(0-2): 164, 7229, 18166, 7531, 186
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f56028e74a12625bcd2550
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 71264 W: 19150 L: 18787 D: 33327
Ptnml(0-2): 23, 6728, 21771, 7083, 27
closes https://github.com/official-stockfish/Stockfish/pull/4401
bench 3687580
This patch introduces `hint_common_parent_position()` to signal that potentially several child nodes will require an NNUE eval. By populating explicitly the accumulator, these subsequent evaluations can be performed more efficiently.
This was based on the observation that calculating the evaluation in an excluded move position yielded a significant Elo gain, even though the evaluation itself was already available (work by pb00067).
Sopel wrote the code to perform just the accumulator update. This PR is based on cleaned up code that
passed STC:
https://tests.stockfishchess.org/tests/view/63f62f9be74a12625bcd4aa0
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
and in an the earlier (equivalent) version
passed STC:
https://tests.stockfishchess.org/tests/view/63f3c3fee74a12625bcce2a6
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 47552 W: 12786 L: 12467 D: 22299
Ptnml(0-2): 120, 5107, 12997, 5438, 114
passed LTC:
https://tests.stockfishchess.org/tests/view/63f45cc2e74a12625bccfa63
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
closes https://github.com/official-stockfish/Stockfish/pull/4402
Bench: 3726250
The sdot instruction computes (and accumulates) a signed dot product,
which is quite handy for Stockfish's NNUE code. The instruction is
optional for Armv8.2 and Armv8.3, and mandatory for Armv8.4 and above.
The commit adds a new 'arm-dotprod' architecture with enabled dot
product support. It also enables dot product support for the existing
'apple-silicon' architecture, which is at least Armv8.5.
The following local speed test was performed on an Apple M1 with
ARCH=apple-silicon. I had to remove CPU pinning from the benchmark
script. However, the results were still consistent: Checking both
binaries against themselves reported a speedup of +0.0000 and +0.0005,
respectively.
```
Result of 100 runs
==================
base (...ish.037ef3e1) = 1917997 +/- 7152
test (...fish.dotprod) = 2159682 +/- 9066
diff = +241684 +/- 2923
speedup = +0.1260
P(speedup > 0) = 1.0000
CPU: 10 x arm
Hyperthreading: off
```
Fixes#4193
closes https://github.com/official-stockfish/Stockfish/pull/4400
No functional change
Created by retraining the master net on a dataset composed of:
* Most of the previous best dataset filtered to remove positions likely having only one good move
* Adding training data from Leela T77 dec2021 rescored with 16tb of 7-piece tablebases
Trained with end lambda 0.7 and max epoch 900. Positions with ply <= 28 were removed from most of the previous best dataset before training began. A new nnue-pytorch trainer param for skipping early plies was used to skip plies <= 24 in the unfiltered and additional Leela T77 parts of the dataset.
```
python easy_train.py \
--experiment-name leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb-lambda7-sk24 \
--training-dataset /data/leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/easy-train-early-fen-skipping \
--early-fen-skipping 24 \
--gpus "0," \
--start-from-engine-test-net True \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 900
```
The depth6 multipv2 search filtering method is the same as the one used for filtering recent best datasets, with a lower eval difference threshold to remove slightly more positions than before. These parts of the dataset were filtered:
* 96% of T60T70wIsRightFarseerT60T74T75T76.binpack
* 99% of dfrc_n5000.binpack
* T80 oct + nov 2022 data, no positions with castling flags, rescored with ~600gb 7p tablebases
* T79 apr + may 2022 data, rescored with 12tb 7p tablebases
* T60 nov + dec 2021 data, rescored with 12tb 7p tablebases
These parts of the dataset were not filtered. Positions with ply <= 24 were skipped during training:
* T78 aug + sep 2022 data, rescored with 12tb 7p tablebases
* 84% of T77 dec 2021 data, rescored with 16tb 7p tablebases
The code and exact evaluation thresholds used for data filtering can be found at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff-t2/src/filter
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch859.nnue : 3.5 +/ 1.2
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
https://tests.stockfishchess.org/tests/view/63dfeefc73223e7f52ad769f
Total: 219744 W: 58572 L: 58002 D: 103170
Ptnml(0-2): 609, 24446, 59284, 24832, 701
Passed LTC:
https://tests.stockfishchess.org/tests/view/63e268fc73223e7f52ade7b6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 91256 W: 24528 L: 24121 D: 42607
Ptnml(0-2): 48, 8863, 27390, 9288, 39
closes https://github.com/official-stockfish/Stockfish/pull/4387
bench 3841998
This patch is a simplification / code normalisation in qsearch.
Adds steps in comments the same way we have in search;
Makes a separate "pruning" stage instead of heuristics randomly being spread over qsearch code;
Reorders pruning heuristics from least taxing ones to more taxing ones;
Removes repeated check for best value not being mated, instead uses 1 check - thus removes some lines of code.
Moves prefetch and move setup after pruning - makes no sense to do them if move will actually get pruned.
Passed non-regression test:
https://tests.stockfishchess.org/tests/view/63dd2c5ff9a50a69252c1413
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 113504 W: 29898 L: 29770 D: 53836
Ptnml(0-2): 287, 11861, 32327, 11991, 286
https://github.com/official-stockfish/Stockfish/pull/4382
Non-functional change.
PR consists of 2 improvements on nodes with excludeMove:
1. Remove xoring the posKey with make_key(excludedMove)
Since we never call tte->save anymore with excludedMove,
the unique left purpose of the xoring was to avoid a TT hit.
Nevertheless on a normal bench run this produced ~25 false positives
(key collisions)
To avoid that we now forbid early TT cutoff's with excludeMove
Maybe these accesses to TT with xored key caused useless misses
in the CPU caches (L1, L2 ...)
Now doing the probe with the same key as the enclosing search does,
should hit the CPU cache.
2. Don't probe Tablebases with excludedMove.
This can't be tested on fishtest, but it's obvious that
tablebases don't deliver any information about suboptimal moves.
Side note:
Very surprisingly it looks like we cannot use static eval's from
TT since they slightly differ over time due to changing optimism.
Attempts to use static eval's from TT did loose about 13 ELO.
This is something about to investigate.
LTC: https://tests.stockfishchess.org/tests/view/63dc0f8de9d4cdfbe672d0c6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 44736 W: 12046 L: 11733 D: 20957
Ptnml(0-2): 12, 4212, 13617, 4505, 22
An analogue of this passed STC & LTC
see PR #4374 (thanks Dubslow for reviewing!)
closes https://github.com/official-stockfish/Stockfish/pull/4380
Bench: 4758694
This patch adds more debugging slots up to 32 per type and provide tools
to calculate standard deviation and Pearson's correlation coefficient.
However, due to slot being 0 at default, dbg_hit_on(c, b) has to be removed.
Initial idea from snicolet/Stockfish@d8ab604
closes https://github.com/official-stockfish/Stockfish/pull/4354
No functional change
Current master prunes all moves with negative SEE values in qsearch.
This patch sets constant negative threshold thus allowing some moves with negative SEE values to be searched.
Value of threshold is completely arbitrary and can be tweaked - also it as function of depth can be tried.
Original idea by author of Alexandria engine.
Passed STC
https://tests.stockfishchess.org/tests/view/63d79a59a67dd929a5564976
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 34864 W: 9392 L: 9086 D: 16386
Ptnml(0-2): 113, 3742, 9429, 4022, 126
Passed LTC
https://tests.stockfishchess.org/tests/view/63d8074aa67dd929a5565bc2
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 91616 W: 24532 L: 24126 D: 42958
Ptnml(0-2): 32, 8840, 27662, 9238, 36
closes https://github.com/official-stockfish/Stockfish/pull/4376
Bench: 4010877
update the WLD model with about 400M positions extracted from recent LTC games after the net updates.
This ensures that the 50% win rate is again at 1.0 eval.
closes https://github.com/official-stockfish/Stockfish/pull/4373
No functional change.
Beyond the simplification, this could be considered a bugfix from a certain point of view.
However, the effect is very subtle and essentially impossible for users to notice.
5372f81cc8 added about 2 Elo at LTC, but only for second and later `go` commands; now, with
this patch, the first `go` command will also benefit from that gain. Games under time
controls are unaffected (as per the tests).
STC: https://tests.stockfishchess.org/tests/view/63c3d291330c0d3d051d48a8
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 473792 W: 124858 L: 125104 D: 223830
Ptnml(0-2): 1338, 49653, 135063, 49601, 1241
LTC: https://tests.stockfishchess.org/tests/view/63c8cd56a83c702aac083bc9
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 290728 W: 76926 L: 76978 D: 136824
Ptnml(0-2): 106, 27987, 89221, 27953, 97
closes https://github.com/official-stockfish/Stockfish/pull/4361
bench 4208265
This patch results in search values for a TB win/loss to be reported in a way that does not change with normalization, i.e. will be consistent over time.
A value of 200.00 pawns is now reported upon entering a TB won position. Values smaller than 200.00 relate to the distance in plies from the root to the probed position position,
with 1 cp being 1 ply distance.
closes https://github.com/official-stockfish/Stockfish/pull/4353
No functional change
Created by retraining the master net with Leela T78 data from Aug+Sep 2022 added to the previous best dataset. Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 750, training was manually paused and max epoch increased to 950 before resuming. The additional Leela training data from T78 was prepared in the same way as the previous best dataset.
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
While the local elo ratings during this experiment were much lower than in recent master nets, several later epochs had a consistent elo above zero, and this was hypothesized to represent potential strength at slower time controls.
Local elo at 25k nodes per move
leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7
nn-epoch819.nnue : 0.4 +/- 1.1 (nn-bc24c101ada0.nnue)
nn-epoch799.nnue : 0.3 +/- 1.2
nn-epoch759.nnue : 0.3 +/- 1.1
nn-epoch839.nnue : 0.2 +/- 1.4
Passed STC
https://tests.stockfishchess.org/tests/view/63cabf6f0eefe8694a0c6013
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 41608 W: 11161 L: 10848 D: 19599
Ptnml(0-2): 116, 4496, 11281, 4781, 130
Passed LTC
https://tests.stockfishchess.org/tests/view/63cb1856344bb01c191af263
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 76760 W: 20517 L: 20137 D: 36106
Ptnml(0-2): 34, 7435, 23070, 7799, 42
closes https://github.com/official-stockfish/Stockfish/pull/4351
bench 3941848
Bit-shifting is a single instruction, and should be faster than an array lookup
on supported architectures. Besides (ever so slightly) speeding up the
conversion of a square into a bitboard, we may see minor general performance
improvements due to preserving more of the CPU's existing cache.
passed STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 47280 W: 12469 L: 12271 D: 22540
Ptnml(0-2): 128, 4893, 13402, 5087, 130
https://tests.stockfishchess.org/tests/view/63c5cfe618c20f4929c5fe46
Small speedup locally:
```
Result of 20 runs
==================
base (./stockfish.master ) = 1752135 +/- 10943
test (./stockfish.patch ) = 1763939 +/- 10818
diff = +11804 +/- 4731
speedup = +0.0067
P(speedup > 0) = 1.0000
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
```
Closes https://github.com/official-stockfish/Stockfish/pull/4343
Bench: 4106793
The accumulator should be an earlyclobber because it is written before
all input operands are read. Otherwise, the asm code computes a wrong
result if the accumulator shares a register with one of the other input
operands (which happens if we pass in the same expression for the
accumulator and the operand).
Closes https://github.com/official-stockfish/Stockfish/pull/4339
No functional change
Created by retraining the master net on a dataset composed of:
* The Leela-dfrc_n5000.binpack dataset filtered with depth6 multipv2 search to remove positions with only one good move, in addition to removing positions where either of the two best moves are captures
* The same Leela T80 oct+nov 2022 training data used in recent best datasets
* Additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022
Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 780, training was manually paused and max epoch increased to 920 before resuming.
During depth6 multipv2 data filtering, positions were considered to have only one good move if the score of the best move was significantly better than the 2nd best move in a way that changes the outcome of the game:
* the best move leads to a significant advantage while the 2nd best move equalizes or loses
* the best move is about equal while the 2nd best move loses
The modified stockfish branch and exact score thresholds used for filtering are at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff/src/filter
About 95% of the Leela portion and 96% of the DFRC portion of the Leela-dfrc_n5000.binpack dataset was filtered. Unfiltered parts of the dataset were left out.
The additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022 was WDL-rescored with about 12TB of syzygy 7-piece tablebases where the material difference is less than around 6 pawns. Best moves were exported to .plain data files during data conversion with the lc0 rescorer.
The exact training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move
experiment_leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7
run_0/nn-epoch899.nnue : 3.8 +/- 1.6
Passed STC
https://tests.stockfishchess.org/tests/view/63bed1f540aa064159b9c89b
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 103344 W: 27392 L: 26991 D: 48961
Ptnml(0-2): 333, 11223, 28099, 11744, 273
Passed LTC
https://tests.stockfishchess.org/tests/view/63c010415705810de2deb3ec
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 21712 W: 5891 L: 5619 D: 10202
Ptnml(0-2): 12, 2022, 6511, 2304, 7
closes https://github.com/official-stockfish/Stockfish/pull/4338
bench 4106793
Removed sprintf() which generated a warning, because of security reasons.
Replace NULL with nullptr
Replace typedef with using
Do not inherit from std::vector. Use composition instead.
optimize mutex-unlocking
closes https://github.com/official-stockfish/Stockfish/pull/4327
No functional change
If a global function has no previous declaration, either the declaration
is missing in the corresponding header file or the function should be
declared static. Static functions are local to the translation unit,
which allows the compiler to apply some optimizations earlier (when
compiling the translation unit rather than during link-time
optimization).
The commit enables the warning for gcc, clang, and mingw. It also fixes
the reported warnings by declaring the functions static or by adding a
header file (benchmark.h).
closes https://github.com/official-stockfish/Stockfish/pull/4325
No functional change
This is a later epoch (epoch 859) from the same experiment run that trained yesterday's master net nn-60fa44e376d9.nnue (epoch 779). The experiment was manually paused around epoch 790 and unpaused with max epoch increased to 900 mainly to get more local elo data without letting the GPU idle.
nn-60fa44e376d9.nnue is from #4314
nn-335a9b2d8a80.nnue is from #4295
Local elo vs. nn-335a9b2d8a80.nnue at 25k nodes per move:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
run_0/nn-epoch779.nnue (nn-60fa44e376d9.nnue) : 5.0 +/- 1.2
run_0/nn-epoch859.nnue (nn-a3dc078bafc7.nnue) : 5.6 +/- 1.6
Passed STC vs. nn-335a9b2d8a80.nnue
https://tests.stockfishchess.org/tests/view/63ae10495bd1e5f27f13d94f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 37536 W: 10088 L: 9781 D: 17667
Ptnml(0-2): 110, 4006, 10223, 4325, 104
An LTC test vs. nn-335a9b2d8a80.nnue was paused due to nn-60fa44e376d9.nnue passing LTC first:
https://tests.stockfishchess.org/tests/view/63ae5d34331d5fca5113703b
Passed LTC vs. nn-60fa44e376d9.nnue
https://tests.stockfishchess.org/tests/view/63af1e41465d2b022dbce4e7
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 148704 W: 39672 L: 39155 D: 69877
Ptnml(0-2): 59, 14443, 44843, 14936, 71
closes https://github.com/official-stockfish/Stockfish/pull/4319
bench 3984365
Created by retraining the master net on the previous best dataset with additional filtering. No new data was added.
More of the Leela-dfrc_n5000.binpack part of the dataset was pre-filtered with depth6 multipv2 search to remove bestmove captures. About 93% of the previous Leela/SF data and 99% of the SF dfrc data was filtered. Unfiltered parts of the dataset were left out. The new Leela T80 oct+nov data is the same as before. All early game positions with ply count <= 28 were skipped during training by modifying the training data loader in nnue-pytorch.
Trained in a similar way as recent master nets, with a different nnue-pytorch branch for early ply skipping:
python3 easy_train.py \
--experiment-name=leela93-dfrc99-filt-only-T80-oct-nov-skip28 \
--training-dataset=/data/leela93-dfrc99-filt-only-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--nnue-pytorch-branch=linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--network-testing-threads 20 \
--num-workers 6
For the exact training data used: https://robotmoon.com/nnue-training-data/
Details about the previous best dataset: #4295
Local testing at a fixed 25k nodes:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
Local Elo: run_0/nn-epoch779.nnue : 5.1 +/- 1.5
Passed STC
https://tests.stockfishchess.org/tests/view/63adb3acae97a464904fd4e8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 36504 W: 9847 L: 9538 D: 17119
Ptnml(0-2): 108, 3981, 9784, 4252, 127
Passed LTC
https://tests.stockfishchess.org/tests/view/63ae0ae25bd1e5f27f13d884
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 36592 W: 10017 L: 9717 D: 16858
Ptnml(0-2): 17, 3461, 11037, 3767, 14
closes https://github.com/official-stockfish/Stockfish/pull/4314
bench 4015511
In both modified methods, the variable 'result' is checked to detect
whether the probe operation failed. However, the variable is not
initialized on all paths, so the check might test an uninitialized
value.
A test position (with TB) is given by:
position fen 3K1k2/R7/8/8/8/8/8/R6Q w - - 0 1 moves a1b1 f8g8 b1a1 g8f8 a1b1 f8g8 b1a1
This is now fixed by always initializing the variable.
closes https://github.com/official-stockfish/Stockfish/pull/4309
No functional change
Created by retraining the master net with a combination of:
the previous best dataset (Leela-dfrc_n5000.binpack), with about half the dataset filtered using depth6 multipv2 search to throw away positions where either of the 2 best moves are captures
Leela T80 Oct and Nov training data rescored with best moves, adding ~9.5 billion positions
Trained effectively the same way as the previous master net:
python3 easy_train.py \
--experiment-name=leela-dfrc-filtered-T80-oct-nov \
--training-dataset=/data/leela-dfrc-filtered-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--auto-exit-timeout-on-training-finished=900 \
--network-testing-threads 20 \
--num-workers 6
Local testing at a fixed 25k nodes:
experiments/experiment_leela-dfrc-filtered-T80-oct-nov/training/run_0/nn-epoch779.nnue
localElo: run_0/nn-epoch779.nnue : 4.7 +/- 3.1
The new Leela T80 part of the dataset was prepared by downloading test80 training data from all of Oct 2022 and Nov 2022, rescoring with syzygy 6-piece tablebases and ~600 GB of 7-piece tablebases, saving best moves to exported .plain files, removing all positions with castling flags, then converting to binpacks and using interleave_binpacks.py to merge them together. Scripts used in this data conversion process are available at:
https://github.com/linrock/lc0-data-converter
Filtering binpack data using depth6 multipv2 search was done by modifying transform.cpp in the tools branch:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-no-rescore
Links for downloading the training data (total size: 338 GB) are available at:
https://robotmoon.com/nnue-training-data/
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 30544 W: 8244 L: 7947 D: 14353
Ptnml(0-2): 93, 3243, 8302, 3542, 92
https://tests.stockfishchess.org/tests/view/63a0d377264a0cf18f86f82b
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 32464 W: 8866 L: 8573 D: 15025
Ptnml(0-2): 19, 3054, 9794, 3345, 20
https://tests.stockfishchess.org/tests/view/63a10bc9fb452d3c44b1e016
closes https://github.com/official-stockfish/Stockfish/pull/4295
Bench 3554904
Instead of allowing .depend for specific build-related targets, filter
non-build-related targets (i.e. help, clean) so that other targets can
normally execute .depend target.
closes https://github.com/official-stockfish/Stockfish/pull/4293
No functional change
If ttMove is doubly extended, we allow a depth growth of the remaining moves.
The idea is to get a more realistic score comparison, because of the depth
difference. We take some care to avoid this extension for high depths,
in order to avoid the cost, since the search result is supposed
to be more accurate in this case.
This pull request includes some small cleanups.
STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 60256 W: 16189 L: 15848 D: 28219
Ptnml(0-2): 182, 6546, 16330, 6889, 181
https://tests.stockfishchess.org/tests/view/639109a1792a529ae8f27777
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 106232 W: 28487 L: 28053 D: 49692
Ptnml(0-2): 46, 10224, 32145, 10652, 49
https://tests.stockfishchess.org/tests/view/63914cba792a529ae8f282ee
closes https://github.com/official-stockfish/Stockfish/pull/4271
Bench: 3622368
Add a constraint so that the dependency build only occurs when users
actually run build tasks.
This fixes a bug on some systems where gcc/g++ is not available.
closes https://github.com/official-stockfish/Stockfish/pull/4255
No functional change
fixes the lowerbound/upperbound output by avoiding
scores outside the alpha,beta bracket. Since SF search
uses fail-soft we can't simply take the returned value
as score.
closes https://github.com/official-stockfish/Stockfish/pull/4259
No functional change
Official release version of Stockfish 15.1
Bench: 3467381
---
Today, we have the pleasure to announce Stockfish 15.1.
As usual, downloads will be freely available at stockfishchess.org/download
*Elo gain and competition results*
With this release, version 5 of the NNUE neural net architecture has
been introduced, and the training data has been extended to include
Fischer random chess (FRC) positions. As a result, Elo gains are largest
for FRC, reaching up to 50 Elo for doubly randomized FRC[1] (DFRC).
More importantly, also for standard chess this release progressed and
will win two times more game pairs than it loses[2] against
Stockfish 15. Stockfish continues to win in a dominating way[3] all
chess engine tournaments, including the TCEC Superfinal, Cup, FRC, DFRC,
and Swiss as well as the CCC Bullet, Blitz, and Rapid events.
*New evaluation*
This release also introduces a new convention for the evaluation that
is reported by search. An evaluation of +1 is now no longer tied to the
value of one pawn, but to the likelihood of winning the game. With
a +1 evaluation, Stockfish has now a 50% chance of winning the game
against an equally strong opponent. This convention scales down
evaluations a bit compared to Stockfish 15 and allows for consistent
evaluations in the future.
*ChessBase settlement*
In this release period, the Stockfish team has successfully enforced
its GPL license against ChessBase. This has been an intense process that
included filing a lawsuit[4], a court hearing[5], and finally
negotiating a settlement[6] that established that ChessBase infringed on
the license by not distributing the Stockfish derivatives Fat Fritz 2
and Houdini 6 as free software, and that ensures ChessBase will respect
the Free Software principles in the future. This settlement has been
covered by major chess sites (see e.g. lichess.org[7] and chess.com[8]),
and we are proud that it has been hailed as a ‘historic violation
settlement[9]’ by the Software Freedom Conservancy.
*Thank you*
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the
fishtest testing framework and programmers to contribute to the
project[10].
The Stockfish team
[1] https://tests.stockfishchess.org/tests/view/638a6170d2b9c924c4c62cb4
[2] https://tests.stockfishchess.org/tests/view/638a4dd7d2b9c924c4c6297b
[3] https://en.wikipedia.org/wiki/Stockfish_(chess)#Competition_results
[4] https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/
[5] https://stockfishchess.org/blog/2022/public-court-hearing-soon/
[6] https://stockfishchess.org/blog/2022/chessbase-stockfish-agreement/
[7] https://lichess.org/blog/Y3u1mRAAACIApBVn/settlement-reached-in-stockfish-v-chessbase
[8] https://www.chess.com/news/view/chessbase-stockfish-reach-settlement
[9] https://sfconservancy.org/news/2022/nov/28/sfc-named-trusted-party-in-gpl-case/
[10] https://stockfishchess.org/get-involved/
If multiple threads have the same best move,
pick the thread with the largest contribution to the confidence vote.
This thread will later be used to display PV, so this patch is
about user-friendliness and/or least surprises, it non-functional for playing strenght.
closes https://github.com/official-stockfish/Stockfish/pull/4246
No functional change
fixes the lowerbound/upperbound output by taking the alpha,beta bracket
into account also if a bestThread is selected that is different from the master thread.
Instead of keeping track which bounds where used in the specific search,
in this version we simply store the quality (exact, upperbound,
lowerbound) of the score along with the actual score as information on
rootMove.
closes https://github.com/official-stockfish/Stockfish/pull/4239
No functional change
This updates the WDL model based on the LTC statistics (2M games).
Relatively small change, note that this also adjusts the NormalizeToPawnValue (now 361),
to keep win prob at 50% for 100cp.
closes https://github.com/official-stockfish/Stockfish/pull/4236
No functional change.
Github Actions allows us to use up to 20 workers.
This way we can launch multiple different checks
at the same time and optimize the overall time
the CI takes a bit.
closes https://github.com/official-stockfish/Stockfish/pull/4223
No functional change
For development versions of Stockfish, the version will now look like
dev-20221107-dca9a0533
indicating a development version, the date of the last commit,
and the git SHA of that commit. If git is not available,
the fallback is the date of compilation. Releases will continue to be
versioned as before.
Additionally, this PR extends the CI to create binary artifacts,
i.e. pushes to master will automatically build Stockfish and upload
the binaries to github.
closes https://github.com/official-stockfish/Stockfish/pull/4220
No functional change
Normalizes the internal value as reported by evaluate or search
to the UCI centipawn result used in output. This value is derived from
the win_rate_model() such that Stockfish outputs an advantage of
"100 centipawns" for a position if the engine has a 50% probability to win
from this position in selfplay at fishtest LTC time control.
The reason to introduce this normalization is that our evaluation is, since NNUE,
no longer related to the classical parameter PawnValueEg (=208). This leads to
the current evaluation changing quite a bit from release to release, for example,
the eval needed to have 50% win probability at fishtest LTC (in cp and internal Value):
June 2020 : 113cp (237)
June 2021 : 115cp (240)
April 2022 : 134cp (279)
July 2022 : 167cp (348)
With this patch, a 100cp advantage will have a fixed interpretation,
i.e. a 50% win chance. To keep this value steady, it will be needed to update the win_rate_model()
from time to time, based on fishtest data. This analysis can be performed with
a set of scripts currently available at https://github.com/vondele/WLD_model
fixes https://github.com/official-stockfish/Stockfish/issues/4155
closes https://github.com/official-stockfish/Stockfish/pull/4216
No functional change
Joint work by Ofek Shochat and Stéphane Nicolet.
passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 93288 W: 24996 L: 24601 D: 43691
Ptnml(0-2): 371, 10263, 24989, 10642, 379
https://tests.stockfishchess.org/tests/view/63448f4f4bc7650f07541987
passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 84168 W: 22771 L: 22377 D: 39020
Ptnml(0-2): 47, 8181, 25234, 8575, 47
https://tests.stockfishchess.org/tests/view/6345186d4bc7650f07542fbd
================
It seems there are two effects with this patch:
effect A :
If Stockfish is winning at root, we have optimism > 0 for all leaves in
the search tree where Stockfish is to move. There, if (psq - nnue) > 0
(ie if the advantage is more materialistic than positional), then the
product D = optimism * (psq - nnue) will be positive, nnueComplexity will
increase, and the eval will increase from SF point of view.
So the effect A is that if Stockfish is winning at root, she will slightly
favor in the search tree (in other words, search more) the positions where
she can convert her advantage via materialist means.
effect B :
If Stockfish is losing at root, we have optimism > 0 for all leaves in
the search tree where the opponent is to move. There, if (psq - nnue) < 0
(ie if the opponent advantage is more positional than materialistic), then
the product D = optimism * (psq-nnue) will be negative, nnueComplexity will
decrease, and the eval will decrease from the opponent point of view.
So the effect B is that Stockfish will slightly favor in the search tree
(search more) the branches where she can defend by slowly reducing the
opponent positional advantage.
=================
closes https://github.com/official-stockfish/Stockfish/pull/4195
bench: 4673898
relatively soon servers with 512 threads will be available 'quite commonly',
anticipate even more threads, and increase our current maximum from 512 to 1024.
closes https://github.com/official-stockfish/Stockfish/pull/4152
No functional change.
If the elapsed time is close to the available time, the time management thread can signal that the next iterations should be searched at the same depth (Threads.increaseDepth = false). While the rootDepth increases, the adjustedDepth is kept constant with the searchAgainCounter.
In exceptional cases, when threading is used and the master thread, which controls the time management, signals to not increaseDepth, but by itself takes a long time to finish the iteration, the helper threads can search repeatedly at the same depth. This search finishes more and more quickly, leading to helper threads that report a rootDepth of MAX_DEPTH (245). The latter is not optimal as it is confusing for the user, stops search on these threads, and leads to an incorrect bias in the thread voting scheme. Probably with only a small impact on strength.
This behavior was observed almost two years ago,
see https://github.com/official-stockfish/Stockfish/issues/2717
This patch fixes#2717 by ensuring the effective depth increases at once every four iterations,
even in increaseDepth is false.
Depth 245 searches (for non-trivial positions) were indeed absent with this patch,
but frequent with master in the tests below:
https://discord.com/channels/435943710472011776/813919248455827515/994872720800088095
Total pgns: 2173
Base: 2867
Patch: 0
it passed non-regression testing in various setups:
SMP STC:
https://tests.stockfishchess.org/tests/view/62bfecc96178ffe6394ba036
LLR: 2.94 (-2.94,2.94) <-2.25,0.25>
Total: 37288 W: 10171 L: 10029 D: 17088
Ptnml(0-2): 75, 3777, 10793, 3929, 70
SMP LTC:
https://tests.stockfishchess.org/tests/view/62c08f6f49b62510394be066
LLR: 2.94 (-2.94,2.94) <-2.25,0.25>
Total: 190568 W: 52125 L: 52186 D: 86257
Ptnml(0-2): 70, 17854, 59504, 17779, 77
LTC:
https://tests.stockfishchess.org/tests/view/62c08b6049b62510394bdfb6
LLR: 2.96 (-2.94,2.94) <-2.25,0.25>
Total: 48120 W: 13204 L: 13083 D: 21833
Ptnml(0-2): 54, 4458, 14919, 4571, 58
Special thanks to miguel-I, Disservin, ruicoelhopedro and others for analysing the problem,
the data, and coming up with the key insight, needed to fix this longstanding issue.
closes https://github.com/official-stockfish/Stockfish/pull/4104
Bench: 5182295
First things first...
this PR is being made from court. Today, Tord and Stéphane, with broad support
of the developer community are defending their complaint, filed in Munich, against ChessBase.
With their products Houdini 6 and Fat Fritz 2, both Stockfish derivatives,
ChessBase violated repeatedly the Stockfish GPLv3 license. Tord and Stéphane have terminated
their license with ChessBase permanently. Today we have the opportunity to present
our evidence to the judge and enforce that termination. To read up, have a look at our blog post
https://stockfishchess.org/blog/2022/public-court-hearing-soon/ and
https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/
This PR introduces a net trained with an enhanced data set and a modified loss function in the trainer.
A slight adjustment for the scaling was needed to get a pass on standard chess.
passed STC:
https://tests.stockfishchess.org/tests/view/62c0527a49b62510394bd610
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 135008 W: 36614 L: 36152 D: 62242
Ptnml(0-2): 640, 15184, 35407, 15620, 653
passed LTC:
https://tests.stockfishchess.org/tests/view/62c17e459e7d9997a12d458e
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 28864 W: 8007 L: 7749 D: 13108
Ptnml(0-2): 47, 2810, 8466, 3056, 53
Local testing at a fixed 25k nodes resulted in
Test run1026/easy_train_data/experiments/experiment_2/training/run_0/nn-epoch799.nnue
localElo: 4.2 +- 1.6
The real strength of the net is in FRC and DFRC chess where it gains significantly.
Tested at STC with slightly different scaling:
FRC:
https://tests.stockfishchess.org/tests/view/62c13a4002ba5d0a774d20d4
Elo: 29.78 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 2007 L: 1152 D: 6841
Ptnml(0-2): 31, 686, 2804, 1355, 124
nElo: 59.24 +-6.9 (95%) PairsRatio: 2.06
DFRC:
https://tests.stockfishchess.org/tests/view/62c13a5702ba5d0a774d20d9
Elo: 55.25 +-3.9 (95%) LOS: 100.0%
Total: 10000 W: 2984 L: 1407 D: 5609
Ptnml(0-2): 51, 636, 2266, 1779, 268
nElo: 96.95 +-7.2 (95%) PairsRatio: 2.98
Tested at LTC with identical scaling:
FRC:
https://tests.stockfishchess.org/tests/view/62c26a3c9e7d9997a12d6caf
Elo: 16.20 +-2.5 (95%) LOS: 100.0%
Total: 10000 W: 1192 L: 726 D: 8082
Ptnml(0-2): 10, 403, 3727, 831, 29
nElo: 44.12 +-6.7 (95%) PairsRatio: 2.08
DFRC:
https://tests.stockfishchess.org/tests/view/62c26a539e7d9997a12d6cb2
Elo: 40.94 +-3.0 (95%) LOS: 100.0%
Total: 10000 W: 2215 L: 1042 D: 6743
Ptnml(0-2): 10, 410, 3053, 1451, 76
nElo: 92.77 +-6.9 (95%) PairsRatio: 3.64
This is due to the mixing in a significant fraction of DFRC training data in the final training round. The net is
trained using the easy_train.py script in the following way:
```
python easy_train.py \
--training-dataset=../Leela-dfrc_n5000.binpack \
--experiment-name=2 \
--nnue-pytorch-branch=vondele/nnue-pytorch/lossScan4 \
--additional-training-arg=--param-index=2 \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--start-from-engine-test-net True \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--auto-exit-timeout-on-training-finished=900 \
--network-testing-threads 8 \
--num-workers 12
```
where the data set used (Leela-dfrc_n5000.binpack) is a combination of our previous best data set (mix of Leela and some SF data) and DFRC data, interleaved to form:
The data is available in https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG?usp=sharing
Leela mix: https://drive.google.com/file/d/1JUkMhHSfgIYCjfDNKZUMYZt6L5I7Ra6G/view?usp=sharing
DFRC: https://drive.google.com/file/d/17vDaff9LAsVo_1OfsgWAIYqJtqR8aHlm/view?usp=sharing
The training branch used is
https://github.com/vondele/nnue-pytorch/commits/lossScan4
A PR to the main trainer repo will be made later. This contains a revised loss function, now computing the loss from the score based on the win rate model, which is a more accurate representation than what we had before. Scaling constants are tweaked there as well.
closes https://github.com/official-stockfish/Stockfish/pull/4100
Bench: 5186781
The speedup is around 0.25% using gcc 11.3.1 (bmi2, nnue bench, depth 16
and 23) while it is neutral using clang (same conditions).
According to `perf` that integer division was one of the most time-consuming
instructions in search (gcc disassembly).
Passed STC:
https://tests.stockfishchess.org/tests/view/628a17fe24a074e5cd59b3aa
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 22232 W: 5992 L: 5751 D: 10489
Ptnml(0-2): 88, 2235, 6218, 2498, 77
yellow LTC:
https://tests.stockfishchess.org/tests/view/628a35d7ccae0450e35106f7
LLR: -2.95 (-2.94,2.94) <0.50,3.00>
Total: 320168 W: 85853 L: 85326 D: 148989
Ptnml(0-2): 185, 29698, 99821, 30165, 215
This patch also suggests that UHO STC is sensible to small speedups (< 0.50%).
closes https://github.com/official-stockfish/Stockfish/pull/4032
No functional change
This patch provides command line flags `--help` and `--license` as well as the corresponding `help` and `license` commands.
```
$ ./stockfish --help
Stockfish 200522 by the Stockfish developers (see AUTHORS file)
Stockfish is a powerful chess engine and free software licensed under the GNU GPLv3.
Stockfish is normally used with a separate graphical user interface (GUI).
Stockfish implements the universal chess interface (UCI) to exchange information.
For further information see https://github.com/official-stockfish/Stockfish#readme
or the corresponding README.md and Copying.txt files distributed with this program.
```
The idea is to provide a minimal help that links to the README.md file,
not replicating information that is already available elsewhere.
We use this opportunity to explicitly report the license as well.
closes https://github.com/official-stockfish/Stockfish/pull/4027
No functional change.
train a net using training data with a
heavier weight on positions having 16 pieces on the board. More specifically,
with a relative weight of `i * (32-i)/(16 * 16)+1` (where i is the number of pieces on the board).
This is done with the trainer branch https://github.com/glinscott/nnue-pytorch/pull/173
The command used is:
```
python train.py $datafile $datafile $restarttype $restartfile --gpus 1 --threads 4 --num-workers 12 --random-fen-skipping=3 --batch-size 16384 --progress_bar_refresh_rate 300 --smart-fen-skipping --features=HalfKAv2_hm^ --lambda=1.00 --max_epochs=$epochs --seed $RANDOM --default_root_dir exp/run_$i
```
The datafile is T60T70wIsRightFarseerT60T74T75T76.binpack, the restart is from the master net.
passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 22728 W: 6197 L: 5945 D: 10586
Ptnml(0-2): 105, 2453, 6001, 2695, 110
https://tests.stockfishchess.org/tests/view/625cf944ff677a888877cd90
passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 35664 W: 9535 L: 9264 D: 16865
Ptnml(0-2): 30, 3524, 10455, 3791, 32
https://tests.stockfishchess.org/tests/view/625d3c32ff677a888877d7ca
closes https://github.com/official-stockfish/Stockfish/pull/3989
Bench: 7269563
Official release version of Stockfish 15
Bench: 8129754
---
A new major release of Stockfish is now available at https://stockfishchess.org
Stockfish 15 continues to push the boundaries of chess, providing unrivalled
analysis and playing strength. In our testing, Stockfish 15 is ahead of
Stockfish 14 by 36 Elo points and wins nine times more game pairs than it
loses[1].
Improvements to the engine have made it possible for Stockfish to end up
victorious in tournaments at all sorts of time controls ranging from bullet to
classical and even at Fischer random chess[2]. At CCC, Stockfish won all of
the latest tournaments: CCC 16 Bullet, Blitz and Rapid, CCC 960 championship,
and the CCC 17 Rapid. At TCEC, Stockfish won the Season 21, Cup 9, FRC 4 and
in the current Season 22 superfinal, at the time of writing, has won 16 game
pairs and not yet lost a single one.
This progress is the result of a dedicated team of developers that comes up
with new ideas and improvements. For Stockfish 15, we tested nearly 13000
different changes and retained the best 200. These include the fourth
generation of our NNUE network architecture, as well as various search
improvements. To perform these tests, contributors provide CPU time for
testing, and in the last year, they have collectively played roughly a
billion chess games. In the last few years, our distributed testing
framework, Fishtest, has been operated superbly and has been developed and
improved extensively. This work by Pasquale Pigazzini, Tom Vijlbrief, Michel
Van den Bergh, and various other developers[3] is an essential part of the
success of the Stockfish project.
Indeed, the Stockfish project builds on a thriving community of enthusiasts
to offer a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the Fishtest
testing framework and programmers to contribute to the project[4].
The Stockfish team
[1] https://tests.stockfishchess.org/tests/view/625d156dff677a888877d1be
[2] https://en.wikipedia.org/wiki/Stockfish_(chess)#Competition_results
[3] https://github.com/glinscott/fishtest/blob/master/AUTHORS
[4] https://stockfishchess.org/get-involved/
This patch chooses the delta value (which skews the nnue evaluation between positional and materialistic)
depending on the material: If the material is low, delta will be higher and the evaluation is shifted
to the positional value. If the material is high, the evaluation will be shifted to the psqt value.
I don't think slightly negative values of delta should be a concern.
Passed STC:
https://tests.stockfishchess.org/tests/view/62418513b3b383e86185766f
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 28808 W: 7832 L: 7564 D: 13412
Ptnml(0-2): 147, 3186, 7505, 3384, 182
Passed LTC:
https://tests.stockfishchess.org/tests/view/62419137b3b383e861857842
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 58632 W: 15776 L: 15450 D: 27406
Ptnml(0-2): 42, 5889, 17149, 6173, 63
closes https://github.com/official-stockfish/Stockfish/pull/3971
Bench: 7588855
This idea is a mix of koivisto idea of threat history and heuristic that
was simplified some time ago in LMR - decreasing reduction for moves that evade a capture.
Instead of doing so in LMR this patch does it in movepicker - to do this it
calculates squares that are attacked by different piece types and pieces that are located
on this squares and boosts up weight of moves that make this pieces land on a square that is not under threat.
Boost is greater for pieces with bigger material values.
Special thanks to koivisto and seer authors for explaining me ideas behind threat history.
Passed STC:
https://tests.stockfishchess.org/tests/view/62406e473b32264b9aa1478b
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 19816 W: 5320 L: 5072 D: 9424
Ptnml(0-2): 86, 2165, 5172, 2385, 100
Passed LTC:
https://tests.stockfishchess.org/tests/view/62407f2e3b32264b9aa149c8
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 51200 W: 13805 L: 13500 D: 23895
Ptnml(0-2): 44, 5023, 15164, 5322, 47
closes https://github.com/official-stockfish/Stockfish/pull/3970
bench 7736491
Via the ttPv flag an implicit tree of current and former PV nodes is maintained. In addition this tree is grown or shrinked at the leafs dependant on the search results. But now the shrinking step has been removed.
As the frequency of ttPv nodes decreases with depth the shown scaling behavior (STC barely passed but LTC scales well) of the tests was expected.
STC:
LLR: 2.93 (-2.94,2.94) <-2.25,0.25>
Total: 270408 W: 71593 L: 71785 D: 127030
Ptnml(0-2): 1339, 31024, 70630, 30912, 1299
https://tests.stockfishchess.org/tests/view/622fbf9dc9e950cbfc2376d6
LTC:
LLR: 2.96 (-2.94,2.94) <-2.25,0.25>
Total: 34368 W: 9135 L: 8992 D: 16241
Ptnml(0-2): 28, 3423, 10135, 3574, 24
https://tests.stockfishchess.org/tests/view/62305257c9e950cbfc238964
closes https://github.com/official-stockfish/Stockfish/pull/3963
Bench: 7044203
This commit generalizes the feature transform to use vec_t macros
that are architecture defined instead of using a seperate code path for each one.
It should make some old architectures (MMX, including improvements by Fanael) faster
and make further such improvements easier in the future.
Includes some corrections to CI for mingw.
closes https://github.com/official-stockfish/Stockfish/pull/3955
closes https://github.com/official-stockfish/Stockfish/pull/3928
No functional change
- set the variable only for the required tests to keep simple the yml file
- use NDK 21.x until will be fixed the Stockfish static build problem
with NDK 23.x
- set the test for armv7, armv7-neon, armv8 builds:
- use armv7a-linux-androideabi21-clang++ compiler for armv7 armv7-neon
- enforce a static build
- silence the Warning for the unused compilation flag "-pie" with
the static build, otherwise the Github workflow stops
- use qemu to bench the build and get the signature
Many thanks to @pschneider1968 that made all the hard work with NDK :)
closes https://github.com/official-stockfish/Stockfish/pull/3924
No functional change
This patch is a result of tuning done by user @candirufish after 150k games.
Since the tuned values were really interesting and touched heuristics
that are known for their non-linear scaling I decided to run limited
games LTC match, even if the STC test was really bad (which was expected).
After seeing the results of the LTC match, I also run a VLTC (very long
time control) SPRTtest, which passed.
The main difference is in extensions: this patch allows much more
singular/double extensions, both in terms of allowing them at lower
depths and with lesser margins.
Failed STC:
https://tests.stockfishchess.org/tests/view/620d66643ec80158c0cd3b46
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 4968 W: 1194 L: 1398 D: 2376
Ptnml(0-2): 47, 633, 1294, 497, 13
Performed well at LTC in a fixed-length match:
https://tests.stockfishchess.org/tests/view/620d66823ec80158c0cd3b4a
ELO: 3.36 +-1.8 (95%) LOS: 100.0%
Total: 30000 W: 7966 L: 7676 D: 14358
Ptnml(0-2): 36, 2936, 8755, 3248, 25
Passed VLTC SPRT test:
https://tests.stockfishchess.org/tests/view/620da11a26f5b17ec884f939
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 4400 W: 1326 L: 1127 D: 1947
Ptnml(0-2): 13, 309, 1348, 526, 4
closes https://github.com/official-stockfish/Stockfish/pull/3937
Bench: 6318903
Architecture:
The diagram of the "SFNNv4" architecture:
https://user-images.githubusercontent.com/8037982/153455685-cbe3a038-e158-4481-844d-9d5fccf5c33a.png
The most important architectural changes are the following:
* 1024x2 [activated] neurons are pairwise, elementwise multiplied (not quite pairwise due to implementation details, see diagram), which introduces a non-linearity that exhibits similar benefits to previously tested sigmoid activation (quantmoid4), while being slightly faster.
* The following layer has therefore 2x less inputs, which we compensate by having 2 more outputs. It is possible that reducing the number of outputs might be beneficial (as we had it as low as 8 before). The layer is now 1024->16.
* The 16 outputs are split into 15 and 1. The 1-wide output is added to the network output (after some necessary scaling due to quantization differences). The 15-wide is activated and follows the usual path through a set of linear layers. The additional 1-wide output is at least neutral, but has shown a slightly positive trend in training compared to networks without it (all 16 outputs through the usual path), and allows possibly an additional stage of lazy evaluation to be introduced in the future.
Additionally, the inference code was rewritten and no longer uses a recursive implementation. This was necessitated by the splitting of the 16-wide intermediate result into two, which was impossible to do with the old implementation with ugly hacks. This is hopefully overall for the better.
First session:
The first session was training a network from scratch (random initialization). The exact trainer used was slightly different (older) from the one used in the second session, but it should not have a measurable effect. The purpose of this session is to establish a strong network base for the second session. Small deviations in strength do not harm the learnability in the second session.
The training was done using the following command:
python3 train.py \
/home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
/home/sopel/nnue/nnue-pytorch-training/data/nodes5000pv2_UHO.binpack \
--gpus "$3," \
--threads 4 \
--num-workers 4 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--gamma=0.992 \
--lr=8.75e-4 \
--max_epochs=400 \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
Every 20th net was saved and its playing strength measured against some baseline at 25k nodes per move with pure NNUE evaluation (modified binary). The exact setup is not important as long as it's consistent. The purpose is to sift good candidates from bad ones.
The dataset can be found https://drive.google.com/file/d/1UQdZN_LWQ265spwTBwDKo0t1WjSJKvWY/view
Second session:
The second training session was done starting from the best network (as determined by strength testing) from the first session. It is important that it's resumed from a .pt model and NOT a .ckpt model. The conversion can be performed directly using serialize.py
The LR schedule was modified to use gamma=0.995 instead of gamma=0.992 and LR=4.375e-4 instead of LR=8.75e-4 to flatten the LR curve and allow for longer training. The training was then running for 800 epochs instead of 400 (though it's possibly mostly noise after around epoch 600).
The training was done using the following command:
The training was done using the following command:
python3 train.py \
/data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
/data/sopel/nnue/nnue-pytorch-training/data/T60T70wIsRightFarseerT60T74T75T76.binpack \
--gpus "$3," \
--threads 4 \
--num-workers 4 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--gamma=0.995 \
--lr=4.375e-4 \
--max_epochs=800 \
--resume-from-model /data/sopel/nnue/nnue-pytorch-training/data/exp295/nn-epoch399.pt \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$run_id
In particular note that we now use lambda=1.0 instead of lambda=0.8 (previous nets), because tests show that WDL-skipping introduced by vondele performs better with lambda=1.0. Nets were being saved every 20th epoch. In total 16 runs were made with these settings and the best nets chosen according to playing strength at 25k nodes per move with pure NNUE evaluation - these are the 4 nets that have been put on fishtest.
The dataset can be found either at ftp://ftp.chessdb.cn/pub/sopel/data_sf/T60T70wIsRightFarseerT60T74T75T76.binpack in its entirety (download might be painfully slow because hosted in China) or can be assembled in the following way:
Get the https://github.com/official-stockfish/Stockfish/blob/5640ad48ae5881223b868362c1cbeb042947f7b4/script/interleave_binpacks.py script.
Download T60T70wIsRightFarseer.binpack https://drive.google.com/file/d/1_sQoWBl31WAxNXma2v45004CIVltytP8/view
Download farseerT74.binpack http://trainingdata.farseer.org/T74-May13-End.7z
Download farseerT75.binpack http://trainingdata.farseer.org/T75-June3rd-End.7z
Download farseerT76.binpack http://trainingdata.farseer.org/T76-Nov10th-End.7z
Run python3 interleave_binpacks.py T60T70wIsRightFarseer.binpack farseerT74.binpack farseerT75.binpack farseerT76.binpack T60T70wIsRightFarseerT60T74T75T76.binpack
Tests:
STC: https://tests.stockfishchess.org/tests/view/6203fb85d71106ed12a407b7
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 16952 W: 4775 L: 4521 D: 7656
Ptnml(0-2): 133, 1818, 4318, 2076, 131
LTC: https://tests.stockfishchess.org/tests/view/62041e68d71106ed12a40e85
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 14944 W: 4138 L: 3907 D: 6899
Ptnml(0-2): 21, 1499, 4202, 1728, 22
closes https://github.com/official-stockfish/Stockfish/pull/3927
Bench: 4919707
Most credits for this patch should go to @candirufish.
Based on his big search tuning (1M games at 20+0.1s)
https://tests.stockfishchess.org/tests/view/61fc7a6ed508ec6a1c9f4b7d
with some hand polishing on top of it, which includes :
a) correcting trend sigmoid - for some reason original tuning resulted in it being negative. This heuristic was proven to be worth some elo for years so reversing it sign is probably some random artefact;
b) remove changes to continuation history based pruning - this heuristic historically was really good at providing green STCs and then failing at LTC miserably if we tried to make it more strict, original tuning was done at short time control and thus it became more strict - which doesn't scale to longer time controls;
c) remove changes to improvement - not really indended :).
passed STC
https://tests.stockfishchess.org/tests/view/6203526e88ae2c84271c2ee2
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 16840 W: 4604 L: 4363 D: 7873
Ptnml(0-2): 82, 1780, 4449, 2033, 76
passed LTC
https://tests.stockfishchess.org/tests/view/620376e888ae2c84271c35d4
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 17232 W: 4771 L: 4542 D: 7919
Ptnml(0-2): 14, 1655, 5048, 1886, 13
closes https://github.com/official-stockfish/Stockfish/pull/3926
bench 5030992
have maximal compatibility on legacy target arch, now supporting AMD Athlon
The old behavior can anyway be selected by the user if needed, for example
make -j profile-build ARCH=x86-32 sse=yes
fixes#3904
closes https://github.com/official-stockfish/Stockfish/pull/3918
No functional change
For cross-compiling to Android on windows, the Makefile needs some tweaks.
Tested with Android NDK 23.1.7779620 and 21.4.7075529, using
Windows 10 with clean MSYS2 environment (i.e. no MINGW/GCC/Clang
toolchain in PATH) and Fedora 35, with build target:
build ARCH=armv8 COMP=ndk
The resulting binary runs fine inside Droidfish on my Samsung
Galaxy Note20 Ultra and Samsung Galaxy Tab S7+
Other builds tested to exclude regressions: MINGW64/Clang64 build
on Windows; MINGW64 cross build, native Clang and GCC builds on Fedora.
wiki docs https://github.com/glinscott/fishtest/wiki/Cross-compiling-Stockfish-for-Android-on-Windows-and-Linux
closes https://github.com/official-stockfish/Stockfish/pull/3901
No functional change
A Windows Native Build (WNB) can be done:
- on Windows, using a recent mingw-w64 g++/clang compiler
distributed by msys2, cygwin and others
- on Linux, using mingw-w64 g++ to cross compile
Improvements:
- check for a WNB in a proper way and set a variable to simplify the code
- set the proper EXE for a WNB
- use the proper name for the mingw-w64 clang compiler
- use the static linking for a WNB
- use wine to make a PGO cross compile on Linux (also with Intel SDE)
- enable the LTO build for mingw-w64 g++ compiler
- set `lto=auto` to use the make's job server, if available, or otherwise
to fall back to autodetection of the number of CPU threads
- clean up all the temporary LTO files saved in the local directory
Tested on:
- msys2 MINGW64 (g++), UCRT64 (g++), MINGW32 (g++), CLANG64 (clang)
environments
- cygwin mingw-w64 g++
- Ubuntu 18.04 & 21.10 mingw-w64 PGO cross compile (also with Intel SDE)
closes#3891
No functional change
This updates estimates from 2yr ago #2401, and adds missing terms.
All tests run at 10+0.1 (STC), 20000 games, error bars +- 1.8 Elo, book 8moves_v3.png.
A table of Elo values with the links to the corresponding tests can be found at the PR
closes https://github.com/official-stockfish/Stockfish/pull/3868
Non-functional Change
This is a reintroduction of an idea that was simplified away approximately 1 year ago.
There are some tweaks to it :
a) exclude promotions;
b) exclude Pv Nodes from it - Pv Nodes logic for captures is really different from non Pv nodes so it makes a lot of sense;
c) use a big grain of capture history - idea is taken from my recent patches in futility pruning.
passed STC
https://tests.stockfishchess.org/tests/view/61bd90f857a0d0f327c373b7
LLR: 2.96 (-2.94,2.94) <0.00,2.50>
Total: 86640 W: 22474 L: 22110 D: 42056
Ptnml(0-2): 268, 9732, 22963, 10082, 275
passed LTC
https://tests.stockfishchess.org/tests/view/61be094457a0d0f327c38aa3
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 23240 W: 6079 L: 5838 D: 11323
Ptnml(0-2): 14, 2261, 6824, 2512, 9
https://github.com/official-stockfish/Stockfish/pull/3864
bench 4493723
This patch is a follow up of previous 2 patches that introduced more reductions for PV nodes with low delta and more pruning for nodes with low delta. Instead of writing separate heuristics now it adjust reductions based on delta / rootDelta - it allows to remove 3 separate adjustements of pruning/LMR in different places and also makes reduction dependence on delta and rootDelta smoother. Also now it works for all pruning heuristics and not just 2.
Passed STC
https://tests.stockfishchess.org/tests/view/61ba9b6c57a0d0f327c2d48b
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 79192 W: 20513 L: 20163 D: 38516
Ptnml(0-2): 238, 8900, 21024, 9142, 292
passed LTC
https://tests.stockfishchess.org/tests/view/61baf77557a0d0f327c2eb8e
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 158400 W: 41134 L: 40572 D: 76694
Ptnml(0-2): 101, 16372, 45745, 16828, 154
closes https://github.com/official-stockfish/Stockfish/pull/3862
bench 4651538
This idea is somewhat similar to extentions in LMR but has a different flavour.
If result of LMR was really good - thus exceeded alpha by some pretty
big given margin, we can extend move after LMR in full depth search with 0 window.
The idea is that this move is probably a fail high with somewhat of a big
probability so extending it makes a lot of sense
passed STC
https://tests.stockfishchess.org/tests/view/61ad45ea56fcf33bce7d74b7
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 59680 W: 15531 L: 15215 D: 28934
Ptnml(0-2): 193, 6711, 15734, 6991, 211
passed LTC
https://tests.stockfishchess.org/tests/view/61ad9ff356fcf33bce7d8646
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 59104 W: 15321 L: 14992 D: 28791
Ptnml(0-2): 53, 6023, 17065, 6364, 47
closes https://github.com/official-stockfish/Stockfish/pull/3838
bench 4881329
This patch optimizes the NEON implementation in two ways.
The activation layer after the feature transformer is rewritten to make it easier for the compiler to see through dependencies and unroll. This in itself is a minimal, but a positive improvement. Other architectures could benefit from this too in the future. This is not an algorithmic change.
The affine transform for large matrices (first layer after FT) on NEON now utilizes the same optimized code path as >=SSSE3, which makes the memory accesses more sequential and makes better use of the available registers, which allows for code that has longer dependency chains.
Benchmarks from Redshift#161, profile-build with apple clang
george@Georges-MacBook-Air nets % ./stockfish-b82d93 bench 2>&1 | tail -4 (current master)
===========================
Total time (ms) : 2167
Nodes searched : 4667742
Nodes/second : 2154011
george@Georges-MacBook-Air nets % ./stockfish-7377b8 bench 2>&1 | tail -4 (this patch)
===========================
Total time (ms) : 1842
Nodes searched : 4667742
Nodes/second : 2534061
This is a solid 18% improvement overall, larger in a bench with NNUE-only, not mixed.
Improvement is also observed on armv7-neon (Raspberry Pi, and older phones), around 5% speedup.
No changes for architectures other than NEON.
closes https://github.com/official-stockfish/Stockfish/pull/3837
No functional changes.
Initialize continuation history with a slighlty negative value -71 instead of zero.
The idea is, because the most history entries will be later negative anyway, to shift
the starting values a little bit in the "correct" direction. Of course the effect of
initialization dimishes with greater depth so I had the apprehension that the LTC test
would be difficult to pass, but it passed.
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 34520 W: 9076 L: 8803 D: 16641
Ptnml(0-2): 136, 3837, 9047, 4098, 142
https://tests.stockfishchess.org/tests/view/61aa52e39e8855bba1a3776b
LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.00>
Total: 75568 W: 19620 L: 19254 D: 36694
Ptnml(0-2): 44, 7773, 21796, 8115, 56
https://tests.stockfishchess.org/tests/view/61aa87d39e8855bba1a383a5
closes https://github.com/official-stockfish/Stockfish/pull/3834
Bench: 4674029
In their infinite wisdom, Intel axed AVX512 from Alder Lake
chips (well, not entirely, but we kind of want to use the Gracemont
cores for chess!) but still added VNNI support.
Confusingly enough, this is not the same as VNNI256 support.
This adds a specific AVX-VNNI target that will use this AVX-VNNI
mode, by prefixing the VNNI instructions with the appropriate VEX
prefix, and avoiding AVX512 usage.
This is about 1% faster on P cores:
Result of 20 runs
==================
base (./clang-bmi2 ) = 3306337 +/- 7519
test (./clang-vnni ) = 3344226 +/- 7388
diff = +37889 +/- 4153
speedup = +0.0115
P(speedup > 0) = 1.0000
But a nice 3% faster on E cores:
Result of 20 runs
==================
base (./clang-bmi2 ) = 1938054 +/- 28257
test (./clang-vnni ) = 1994606 +/- 31756
diff = +56552 +/- 3735
speedup = +0.0292
P(speedup > 0) = 1.0000
This was measured on Clang 13. GCC 11.2 appears to generate
worse code for Alder Lake, though the speedup on the E cores
is similar.
It is possible to run the engine specifically on the P or E using binding,
for example in linux it is possible to use (for an 8 P + 8 E setup like i9-12900K):
taskset -c 0-15 ./stockfish
taskset -c 16-23 ./stockfish
where the first call binds to the P-cores and the second to the E-cores.
closes https://github.com/official-stockfish/Stockfish/pull/3824
No functional change
This patch is a result of refining of tuning vondele did after
new net passed and some hand-made values adjustements - excluding
changes in other pruning heuristics and rounding value of history
divisor to the nearest power of 2.
With this patch futility pruning becomes more aggressive and
history influence on it is doubled again.
passed STC
https://tests.stockfishchess.org/tests/view/61a2c4c1a26505c2278c150d
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 33848 W: 8841 L: 8574 D: 16433
Ptnml(0-2): 100, 3745, 8988, 3970, 121
passed LTC
https://tests.stockfishchess.org/tests/view/61a327ffa26505c2278c26d9
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 22272 W: 5856 L: 5614 D: 10802
Ptnml(0-2): 12, 2230, 6412, 2468, 14
closes https://github.com/official-stockfish/Stockfish/pull/3814
bench 6302543
This idea is somewhat of a respin of smth we had in futility pruning and that was simplified away - dependence of it not only on static evaluation of position but also on move history heuristics.
Instead of aborting it when they are high there we use fraction of their sum to adjust static eval pruning criteria.
passed STC
https://tests.stockfishchess.org/tests/view/619bd438c0a4ea18ba95a27d
LLR: 2.93 (-2.94,2.94) <0.00,2.50>
Total: 113704 W: 29284 L: 28870 D: 55550
Ptnml(0-2): 357, 12884, 30044, 13122, 445
passed LTC
https://tests.stockfishchess.org/tests/view/619cb8f0c0a4ea18ba95a334
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 147136 W: 37307 L: 36770 D: 73059
Ptnml(0-2): 107, 15279, 42265, 15804, 113
closes https://github.com/official-stockfish/Stockfish/pull/3805
bench 6777918
retire msvc support and corresponding CI. No active development happens on msvc,
and build is much slower or wrong.
gcc (mingw) is our toolchain of choice also on windows, and the latter is tested.
No functional change
Current master implements a scaling of the raw NNUE output value with a formula
equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies
between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature
allows Stockfish to keep material on the board when she thinks she has the advantage,
and to seek exchanges and simplifications when she thinks she has to defend.
This patch slightly offsets the turning point between these two strategies, by adding
to Stockfish's evaluation a small "optimism" value before actually doing the scaling.
The effect is that SF will play a little bit more risky, trying to keep the tension a
little bit longer when she is defending, and keeping even more material on the board
when she has an advantage.
We note that this patch is similar in spirit to the old "Contempt" idea we used to have
in classical Stockfish, but this implementation differs in two key points:
a) it has been tested as an Elo-gainer against master;
b) the values output by the search are not changed on average by the implementation
(in other words, the optimism value changes the tension/exchange strategy, but a
displayed value of 1.0 pawn has the same signification before and after the patch).
See the old comment https://github.com/official-stockfish/Stockfish/pull/1361#issuecomment-359165141
for some images illustrating the ideas.
-------
finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b
passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877
-------
How to continue from there?
a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value
could be tweaked to try to gain more Elo, so the parameters of the sigmoid function
in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible
using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92
b) in a similar vein, with two recents patches affecting the scaling of the NNUE
evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning
of the NNUE network;
c) this patch will tend to keep tension in middlegame a little bit longer, so any
patch improving the defensive aspect of play via search extensions in risky,
tactical positions would be welcome.
-------
closes https://github.com/official-stockfish/Stockfish/pull/3797
Bench: 6184852
Starting with Windows Build 20348 the behavior of the numa API has been changed:
https://docs.microsoft.com/en-us/windows/win32/procthread/numa-support
Old code only worked because there was probably a limit on how many
cores/threads can reside within one NUMA node, and the OS creates extra NUMA
nodes when necessary, however the actual mechanism of core binding is
done by "Processor Groups"(https://docs.microsoft.com/en-us/windows/win32/procthread/processor-groups). With a newer OS, one NUMA node can have many
such "Processor Groups" and we should just consistently use the number
of groups to bind the threads instead of deriving the topology from
the number of NUMA nodes.
This change is required to spread threads on all cores on Windows 11 with
a 3990X CPU. It has only 1 NUMA node with 2 groups of 64 threads each.
closes https://github.com/official-stockfish/Stockfish/pull/3787
No functional change.
In case the evaluation at root is large, discourage the use of lazyEval.
This fixes https://github.com/official-stockfish/Stockfish/issues/3772
or at least improves it significantly. In this case, poor play with large
odds can be observed, in extreme cases leading to a loss despite large
advantage:
r1bq1b1r/ppp3p1/3p1nkp/n3p3/2B1P2N/2NPB3/PPP2PPP/R3K2R b KQ - 5 9
With this patch the poor move is only considered up to depth 13, in master
up to depth 28.
The patch did not pass at LTC with Elo gainer bounds, but with slightly
positive Elo nevertheless (95% LOS).
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.50>
Total: 40368 W: 10318 L: 10041 D: 20009
Ptnml(0-2): 103, 4493, 10725, 4750, 113
https://tests.stockfishchess.org/tests/view/61800ad259e71df00dcc420d
LTC:
LLR: -2.94 (-2.94,2.94) <0.50,3.00>
Total: 212288 W: 52997 L: 52692 D: 106599
Ptnml(0-2): 112, 22038, 61549, 22323, 122
https://tests.stockfishchess.org/tests/view/618050d959e71df00dcc426d
closes https://github.com/official-stockfish/Stockfish/pull/3780
Bench: 7127040
Maintain for each root move an exponential average of the search value with a weight ratio of 2:1 (new value vs old values). Then the average score is used as the center of the initial aspiration window instead of the previous score.
Stats indicate (see PR) that the deviation for previous score is in general greater than using average score, so later seems a better estimation of the next search value. This is probably the reason this patch succeded besides smoothing the sometimes wild swings in search score. An additional observation is that at higher depth previous score is above but average score below zero. So for average score more/less fail/low highs should be occur than previous score.
STC:
LLR: 2.97 (-2.94,2.94) <0.00,2.50>
Total: 59792 W: 15106 L: 14792 D: 29894
Ptnml(0-2): 144, 6718, 15869, 7010, 155
https://tests.stockfishchess.org/tests/view/61841612d7a085ad008eef06
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 46448 W: 11835 L: 11537 D: 23076
Ptnml(0-2): 21, 4756, 13374, 5050, 23
https://tests.stockfishchess.org/tests/view/618463abd7a085ad008eef3e
closes https://github.com/official-stockfish/Stockfish/pull/3776
Bench: 6719976
Currently we handle the UCI_Elo with a double randomization. This
seems not necessary and a bit involuted.
This patch removes the first randomization and unifies the 2 cases.
closes https://github.com/official-stockfish/Stockfish/pull/3769
No functional change.
To help with debugging, the worker sends the output of
stderr (suitable truncated) to the action log on the
server, in case a build fails. For this to work it is
important that there is no spurious output to stderr.
closes https://github.com/official-stockfish/Stockfish/pull/3773
No functional change
Official release version of Stockfish 14.1
Bench: 6334068
---
Today, we have the pleasure to announce Stockfish 14.1.
As usual, downloads will be freely available at stockfishchess.org/download [1].
With Stockfish 14.1 our users get access to the strongest chess engine
available today. In the period leading up to this release, Stockfish
convincingly won several chess engine tournaments, including the TCEC 21
superfinal, the TCEC Cup 9, and the Computer Chess Championship for
Fischer Random Chess (Chess960). In the latter tournament, Stockfish
was undefeated in 599 out of 600 games played.
Compared to Stockfish 14, this release introduces a more advanced NNUE
architecture and various search improvements. In self play testing, using
a book of balanced openings, Stockfish 14.1 wins three times more game
pairs than it loses [2]. At this high level, draws are very common, so the
Elo difference to Stockfish 14 is about 17 Elo. The NNUE evaluation method,
introduced to top level chess with Stockfish 12 about one year ago [3],
has now been adopted by several other strong CPU based chess engines.
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust,
widely available, and very strong. We invite our chess fans to join the
fishtest testing framework and programmers to contribute to the project [4].
Stay safe and enjoy chess!
The Stockfish team
[1] https://stockfishchess.org/download/
[2] https://tests.stockfishchess.org/tests/view/6175c320af70c2be1788fa2b
[3] https://github.com/official-stockfish/Stockfish/discussions/3628
[4] https://stockfishchess.org/get-involved/
Idea of this patch is the following: in case we already have four moves that
exceeded alpha in the current node, the probability of finding fifth should
be reasonably low. Note that four is completely arbitrary - there could and
probably should be some tweaks, both in tweaking best move count threshold
for more reductions and tweaking how they work - for example making more
reductions with best move count linearly.
passed STC:
https://tests.stockfishchess.org/tests/view/615f614783dd501a05b0aee2
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 141816 W: 36056 L: 35686 D: 70074
Ptnml(0-2): 499, 15131, 39273, 15511, 494
passed LTC:
https://tests.stockfishchess.org/tests/view/615fdff683dd501a05b0af35
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 68536 W: 17221 L: 16891 D: 34424
Ptnml(0-2): 38, 6573, 20725, 6885, 47
closes https://github.com/official-stockfish/Stockfish/pull/3736
Bench: 6131513
This patch updates the stat_bonus() function (used in the history tables to
help move ordering), keeping the same quadratic for small depths but changing
the values for depth >= 9:
The old bonus formula was increasing from zero at depth 1 to 4100 at depth 14,
then used the strange, small value of 73 for all depths >= 15.
The new bonus formula increases from 0 at depth 1 to 2000 at depth 8, then
keeps 2000 for all depths >= 8.
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 169624 W: 42875 L: 42454 D: 84295
Ptnml(0-2): 585, 19340, 44557, 19729, 601
https://tests.stockfishchess.org/tests/view/615bd69e9d256038a969b97c
passed LTC:
LLR: 3.07 (-2.94,2.94) <0.50,3.50>
Total: 37336 W: 9456 L: 9191 D: 18689
Ptnml(0-2): 20, 3810, 10747, 4067, 24
https://tests.stockfishchess.org/tests/view/615c75d99d256038a969b9b2
closes https://github.com/official-stockfish/Stockfish/pull/3731
Bench: 6261865
When playing games in MultiPV mode we must take care to only track the
best move changing for the first PV line. Otherwise, SF will spend most
of its time for the initial moves after the book exit.
This has been observed and reported on Discord, but can also be seen in
games played in Stefan Pohl's MultiPV experiment.
Tested with MultiPV=4.
STC:
https://tests.stockfishchess.org/tests/view/615c24b59d256038a969b990
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 1744 W: 694 L: 447 D: 603
Ptnml(0-2): 32, 125, 358, 278, 79
LTC:
https://tests.stockfishchess.org/tests/view/615c31769d256038a969b993
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 2048 W: 723 L: 525 D: 800
Ptnml(0-2): 10, 158, 511, 314, 31
closes https://github.com/official-stockfish/Stockfish/pull/3729
Bench: 5714575
Respin of multi-thread idea that was simplified away recently: basically doing
more reductions with thread count since Lazy SMP naturally widens search. With
drawish book this idea got simplified away but with less drawish book it again
gains elo, maybe trying to reinstall other ideas that were simplified away
previously can be beneficial.
passed STC
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 39736 W: 10205 L: 9986 D: 19545
Ptnml(0-2): 45, 4254, 11064, 4447, 58
https://tests.stockfishchess.org/tests/view/615750702d02f48db3961b00
passed LTC
LLR: 2.97 (-2.94,2.94) <0.50,3.50>
Total: 60352 W: 15530 L: 15218 D: 29604
Ptnml(0-2): 24, 5900, 18016, 6212, 24
https://tests.stockfishchess.org/tests/view/6157d8935488e26ea5eace7f
closes https://github.com/official-stockfish/Stockfish/pull/3724
Bench 5714575
Idea is to extend some quiet ttMoves if a lot of things indicate that
the transposition table move is going to be a good move:
1) move being a killer - so being the best move in nearby node;
2) reply continuation history is really good.
This is basically saying that move is good "in general" in this position,
that it is a good reply to the opponent move and that it was the best in
this position somewhere in search - so extending it makes a lot of sense.
In general in past year we had a lot of extensions of different types,
maybe there is something more in it :)
passed STC
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 42944 W: 10932 L: 10695 D: 21317
Ptnml(0-2): 141, 4869, 11210, 5116, 136
https://tests.stockfishchess.org/tests/view/614cca8e7bdc23e77ceb89f0
passed LTC
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 156848 W: 39473 L: 38893 D: 78482
Ptnml(0-2): 125, 16327, 44913, 16961, 98
https://tests.stockfishchess.org/tests/view/614cf93d7bdc23e77ceb8a13
closes https://github.com/official-stockfish/Stockfish/pull/3719
Bench: 5714575
In master, during singular move analysis, when both the transposition value
and a reduced search for the other moves seem to indicate a fail high, we
heuristically prune the whole subtree and return an fail high score.
This patch is a little bit more cautious in this case, and instead of the
risky cutoff, we now search the ttMove with a reduced depth (by two plies).
STC:
https://tests.stockfishchess.org/tests/view/614dafe07bdc23e77ceb8a89
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 46728 W: 11909 L: 11666 D: 23153
Ptnml(0-2): 181, 5288, 12168, 5561, 166
LTC:
https://tests.stockfishchess.org/tests/view/614dc84abe4c07e0ecac3c95
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 74520 W: 18809 L: 18450 D: 37261
Ptnml(0-2): 45, 7735, 21346, 8084, 50
closes https://github.com/official-stockfish/Stockfish/pull/3718
Bench: 5499262
This patch relax a little bit the condition for doubly singular moves
(ie moves that are so forced that we think that they deserve a local
double extension of the search). We lower the margin and allow up to
six such double extensions in the path between the root and the critical
node.
Original idea by Siad Daboul (@TopoIogist) in PR #3709
Tested with the previous commit:
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 33048 W: 8458 L: 8236 D: 16354
Ptnml(0-2): 120, 3701, 8660, 3923, 120
https://tests.stockfishchess.org/tests/view/614b24347bdc23e77ceb88fe
passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 54176 W: 13712 L: 13406 D: 27058
Ptnml(0-2): 36, 5653, 15399, 5969, 31
https://tests.stockfishchess.org/tests/view/614b3b727bdc23e77ceb8911
closes https://github.com/official-stockfish/Stockfish/pull/3714
Bench: 5792377
This patch detects some search explosions (due to double extensions in
search.cpp) which can happen in some pathological positions, and takes
measures to ensure progress in search even for these pathological situations.
While a small number of double extensions can be useful during search
(for example to resolve a tactical sequence), a sustained regime of
double extensions leads to search explosion and a non-finishing search.
See the discussion in https://github.com/official-stockfish/Stockfish/pull/3544
and the issue https://github.com/official-stockfish/Stockfish/issues/3532 .
The implemented algorithm is the following:
a) at each node during search, store the current depth in the stack.
Double extensions are by definition levels of the stack where the
depth at ply N is strictly higher than depth at ply N-1.
b) during search, calculate for each thread a running average of the
number of double extensions in the last 4096 visited nodes.
c) if one thread has more than 2% of double extensions for a sustained
period of time (6 millions consecutive nodes, or about 4 seconds on
my iMac), we decide that this thread is in an explosion state and
we calm down this thread by preventing it to do any double extension
for the next 6 millions nodes.
To calculate the running averages, we also introduced a auxiliary class
generalizing the computations of ttHitAverage variable we already had in
code. The implementation uses an exponential moving average of period 4096
and resolution 1/1024, and all computations are done with integers for
efficiency.
-----------
Example where the patch solves a search explosion:
```
./stockfish
ucinewgame
position fen 8/Pk6/8/1p6/8/P1K5/8/6B1 w - - 37 130
go infinite
```
This algorithm does not affect search in normal, non-pathological positions.
We verified, for instance, that the usual bench is unchanged up to depth 20
at least, and that the node numbers are unchanged for a search of the starting
position at depth 32.
-------------
See https://github.com/official-stockfish/Stockfish/pull/3714
Bench: 5575265
This patch introduces extension for captures and promotions. Every capture or
promotion that is not the first move in the list gets extended at PvNodes and
cutNodes. Special thanks to @locutus2 - all my previous attepmts that failed
on this idea were done only for PvNodes - idea to include also cutNodes was
based on his latest passed patch.
STC
https://tests.stockfishchess.org/tests/view/6134abf325b9b35584838574
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 188920 W: 47754 L: 47304 D: 93862
Ptnml(0-2): 595, 21754, 49344, 22140, 627
LTC
https://tests.stockfishchess.org/tests/view/613521de25b9b355848385d7
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 8768 W: 2283 L: 2098 D: 4387
Ptnml(0-2): 7, 866, 2452, 1053, 6
closes https://github.com/official-stockfish/Stockfish/pull/3692
bench: 5564555
The new network caused some issues initially due to the very narrow neuron set between the first two FC layers. Necessary changes were hacked together to make it work. This patch is a mature approach to make the affine transform code faster, more readable, and easier to maintain should the layer sizes change again.
The following changes were made:
* ClippedReLU always produces a multiple of 32 outputs. This is about as good of a solution for AffineTransform's SIMD requirements as it can get without a bigger rewrite.
* All self-contained simd helpers are moved to a separate file (simd.h). Inline asm is utilized to work around GCC's issues with code generation and register assignment. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101693, https://godbolt.org/z/da76fY1n7
* AffineTransform has 2 specializations. While it's more lines of code due to the boilerplate, the logic in both is significantly reduced, as these two are impossible to nicely combine into one.
1) The first specialization is for cases when there's >=128 inputs. It uses a different approach to perform the affine transform and can make full use of AVX512 without any edge cases. Furthermore, it has higher theoretical throughput because less loads are needed in the hot path, requiring only a fixed amount of instructions for horizontal additions at the end, which are amortized by the large number of inputs.
2) The second specialization is made to handle smaller layers where performance is still necessary but edge cases need to be handled. AVX512 implementation for this was ommited by mistake, a remnant from the temporary implementation for the new... This could be easily reintroduced if needed. A slightly more detailed description of both implementations is in the code.
Overall it should be a minor speedup, as shown on fishtest:
passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 51520 W: 4074 L: 3888 D: 43558
Ptnml(0-2): 111, 3136, 19097, 3288, 128
and various tests shown in the pull request
closes https://github.com/official-stockfish/Stockfish/pull/3663
No functional change
Introduces a new NNUE network architecture and associated network parameters
The summary of the changes:
* Position for each perspective mirrored such that the king is on e..h files. Cuts the feature transformer size in half, while preserving enough knowledge to be good. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.b40q4rb1w7on.
* The number of neurons after the feature transformer increased two-fold, to 1024x2. This is possibly mostly due to the now very optimized feature transformer update code.
* The number of neurons after the second layer is reduced from 16 to 8, to reduce the speed impact. This, perhaps surprisingly, doesn't harm the strength much. See https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit#heading=h.6qkocr97fezq
The AffineTransform code did not work out-of-the box with the smaller number of neurons after the second layer, so some temporary changes have been made to add a special case for InputDimensions == 8. Also additional 0 padding is added to the output for some archs that cannot process inputs by <=8 (SSE2, NEON). VNNI uses an implementation that can keep all outputs in the registers while reducing the number of loads by 3 for each 16 inputs, thanks to the reduced number of output neurons. However GCC is particularily bad at optimization here (and perhaps why the current way the affine transform is done even passed sprt) (see https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY/edit# for details) and more work will be done on this in the following days. I expect the current VNNI implementation to be improved and extended to other architectures.
The network was trained with a slightly modified version of the pytorch trainer (https://github.com/glinscott/nnue-pytorch); the changes are in https://github.com/glinscott/nnue-pytorch/pull/143
The training utilized 2 datasets.
dataset A - https://drive.google.com/file/d/1VlhnHL8f-20AXhGkILujnNXHwy9T-MQw/view?usp=sharing
dataset B - as described in https://github.com/official-stockfish/Stockfish/commit/ba01f4b95448bcb324755f4dd2a632a57c6e67bc
The training process was as following:
train on dataset A for 350 epochs, take the best net in terms of elo at 20k nodes per move (it's fine to take anything from later stages of training).
convert the .ckpt to .pt
--resume-from-model from the .pt file, train on dataset B for <600 epochs, take the best net. Lambda=0.8, applied before the loss function.
The first training command:
python3 train.py \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
../nnue-pytorch-training/data/large_gensfen_multipvdiff_100_d9.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=1.0 \
--max_epochs=600 \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
The second training command:
python3 serialize.py \
--features=HalfKAv2_hm^ \
../nnue-pytorch-training/experiment_131/run_6/default/version_0/checkpoints/epoch-499.ckpt \
../nnue-pytorch-training/experiment_$1/base/base.pt
python3 train.py \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
../nnue-pytorch-training/data/michael_commit_b94a65.binpack \
--gpus "$3," \
--threads 1 \
--num-workers 1 \
--batch-size 16384 \
--progress_bar_refresh_rate 20 \
--smart-fen-skipping \
--random-fen-skipping 3 \
--features=HalfKAv2_hm^ \
--lambda=0.8 \
--max_epochs=600 \
--resume-from-model ../nnue-pytorch-training/experiment_$1/base/base.pt \
--default_root_dir ../nnue-pytorch-training/experiment_$1/run_$2
STC: https://tests.stockfishchess.org/tests/view/611120b32a8a49ac5be798c4
LLR: 2.97 (-2.94,2.94) <-0.50,2.50>
Total: 22480 W: 2434 L: 2251 D: 17795
Ptnml(0-2): 101, 1736, 7410, 1865, 128
LTC: https://tests.stockfishchess.org/tests/view/611152b32a8a49ac5be798ea
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9776 W: 442 L: 333 D: 9001
Ptnml(0-2): 5, 295, 4180, 402, 6
closes https://github.com/official-stockfish/Stockfish/pull/3646
bench: 5189338
This patch improves the codegen in the AffineTransform::forward function for architectures >=SSSE3. Current code works directly on memory and the compiler cannot see that the stores through outptr do not alias the loads through weights and input32. The solution implemented is to perform the affine transform with local variables as accumulators and only store the result to memory at the end. The number of accumulators required is OutputDimensions / OutputSimdWidth, which means that for the 1024->16 affine transform it requires 4 registers with SSSE3, 2 with AVX2, 1 with AVX512. It also cuts the number of stores required by NumRegs * 256 for each node evaluated. The local accumulators are expected to be assigned to registers, but even if this cannot be done in some case due to register pressure it will help the compiler to see that there is no aliasing between the loads and stores and may still result in better codegen.
See https://godbolt.org/z/59aTKbbYc for codegen comparison.
passed STC:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 140328 W: 10635 L: 10358 D: 119335
Ptnml(0-2): 302, 8339, 52636, 8554, 333
closes https://github.com/official-stockfish/Stockfish/pull/3634
No functional change
combined work by Serio Vieri, Michael Byrne, and Jonathan D (aka SFisGod) based on top of previous developments, by restarts from good nets.
Sergio generated the net https://tests.stockfishchess.org/api/nn/nn-d8609abe8caf.nnue:
The initial net nn-d8609abe8caf.nnue is trained by generating around 16B of training data from the last master net nn-9e3c6298299a.nnue, then trained, continuing from the master net, with lambda=0.2 and sampling ratio of 1. Starting with LR=2e-3, dropping LR with a factor of 0.5 until it reaches LR=5e-4. in_scaling is set to 361. No other significant changes made to the pytorch trainer.
Training data gen command (generates in chunks of 200k positions):
generate_training_data min_depth 9 max_depth 11 count 200000 random_move_count 10 random_move_max_ply 80 random_multi_pv 12 random_multi_pv_diff 100 random_multi_pv_depth 8 write_min_ply 10 eval_limit 1500 book noob_3moves.epd output_file_name gendata/$(date +"%Y%m%d-%H%M")_${HOSTNAME}.binpack
PyTorch trainer command (Note that this only trains for 20 epochs, repeatedly train until convergence):
python train.py --features "HalfKAv2^" --max_epochs 20 --smart-fen-skipping --random-fen-skipping 500 --batch-size 8192 --default_root_dir $dir --seed $RANDOM --threads 4 --num-workers 32 --gpus $gpuids --track_grad_norm 2 --gradient_clip_val 0.05 --lambda 0.2 --log_every_n_steps 50 $resumeopt $data $val
See https://github.com/sergiovieri/Stockfish/tree/tools_mod/rl for the scripts used to generate data.
Based on that Michael generated nn-76a8a7ffb820.nnue in the following way:
The net being submitted was trained with the pytorch trainer: https://github.com/glinscott/nnue-pytorch
python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 30 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --auto_lr_find True --lambda=1.0 --max_epochs=240 --seed %random%%random% --default_root_dir exp/run_109 --resume-from-model ./pt/nn-d8609abe8caf.pt
This run is thus started from Segio Vieri's net nn-d8609abe8caf.nnue
all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing
Each set was concatenated together - making one large Wrong_NNUE 2 binpack and one large Training so the were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack
model.py modifications:
loss = torch.pow(torch.abs(p - q), 2.6).mean()
LR = 8.0e-5 calculated as follows: 1.5e-3*(.992^360) - the idea here was to take a highly trained net and just use all.binpack as a finishing micro refinement touch for the last 2 Elo or so. This net was discovered on the 59th epoch.
optimizer = ranger.Ranger(train_params, betas=(.90, 0.999), eps=1.0e-7, gc_loc=False, use_gc=False)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.992)
For this micro optimization, I had set the period to "5" in train.py. This changes the checkpoint output so that every 5th checkpoint file is created
The final touches were to adjust the NNUE scale, as was done by Jonathan in tests running at the same time.
passed LTC
https://tests.stockfishchess.org/tests/view/60fa45aed8a6b65b2f3a77a4
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 53040 W: 1732 L: 1575 D: 49733
Ptnml(0-2): 14, 1432, 23474, 1583, 17
passed STC
https://tests.stockfishchess.org/tests/view/60f9fee2d8a6b65b2f3a7775
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 37928 W: 3178 L: 3001 D: 31749
Ptnml(0-2): 100, 2446, 13695, 2623, 100.
closes https://github.com/official-stockfish/Stockfish/pull/3626
Bench: 5169957
The main idea is that illegal moves influencing search or
qsearch obviously can't be any sort of good. The only reason
why initially legality checks for search and qsearch were done
after they actually can influence some heuristics is because
legality check is expensive computationally. Eventually in
search it was moved to the place where it makes sure that
illegal moves can't influence search.
This patch shows that the same can be done for qsearch + it
passed STC with elo-gaining bounds + it removes 3 lines of code
because one no longer needs to increment/decrement movecount
on illegal moves.
passed STC with elo-gaining bounds
https://tests.stockfishchess.org/tests/view/60f20aefd1189bed71812da0
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 61512 W: 4688 L: 4492 D: 52332
Ptnml(0-2): 139, 3730, 22848, 3874, 165
The same version functionally but with moving condition ever earlier
passed LTC with simplification bounds.
https://tests.stockfishchess.org/tests/view/60f292cad1189bed71812de9
LLR: 2.98 (-2.94,2.94) <-2.50,0.50>
Total: 60944 W: 1724 L: 1685 D: 57535
Ptnml(0-2): 11, 1556, 27298, 1597, 10
closes https://github.com/official-stockfish/Stockfish/pull/3618
bench 4709569
Not all linux users will have libatomic installed.
When using clang as the system compiler with compiler-rt as the default
runtime library instead of libgcc, atomic builtins may be provided by compiler-rt.
This change allows such users to pass RTLIB=compiler-rt to make sure
the build doesn't error out on the missing (unnecessary) libatomic.
closes https://github.com/official-stockfish/Stockfish/pull/3597
No functional change
Official release version of Stockfish 14
Bench: 4770936
---
Today, we have the pleasure to announce Stockfish 14.
As usual, downloads will be freely available at https://stockfishchess.org
The engine is now significantly stronger than just a few months ago,
and wins four times more game pairs than it loses against the previous
release version [0]. Stockfish 14 is now at least 400 Elo ahead of
Stockfish 7, a top engine in 2016 [1]. During the last five years,
Stockfish has thus gained about 80 Elo per year.
Stockfish 14 evaluates positions more accurately than Stockfish 13 as
a result of two major steps forward in defining and training the
efficiently updatable neural network (NNUE) that provides the evaluation
for positions.
First, the collaboration with the Leela Chess Zero team - announced
previously [2] - has come to fruition. The LCZero team has provided a
collection of billions of positions evaluated by Leela that we have
combined with billions of positions evaluated by Stockfish to train the
NNUE net that powers Stockfish 14. The fact that we could use and combine
these datasets freely was essential for the progress made and demonstrates
the power of open source and open data [3].
Second, the architecture of the NNUE network was significantly updated:
the new network is not only larger, but more importantly, it deals better
with large material imbalances and can specialize for multiple phases of
the game [4]. A new project, kick-started by Gary Linscott and
Tomasz Sobczyk, led to a GPU accelerated net trainer written in
pytorch.[5] This tool allows for training high-quality nets in a couple
of hours.
Finally, this release features some search refinements, minor bug
fixes and additional improvements. For example, Stockfish is now about
90 Elo stronger for chess960 (Fischer random chess) at short time control.
The Stockfish project builds on a thriving community of enthusiasts
(thanks everybody!) that contribute their expertise, time, and resources
to build a free and open-source chess engine that is robust, widely
available, and very strong. We invite our chess fans to join the fishtest
testing framework and programmers to contribute to the project on
github [6].
Stay safe and enjoy chess!
The Stockfish team
[0] https://tests.stockfishchess.org/tests/view/60dae5363beab81350aca077
[1] https://nextchessmove.com/dev-builds
[2] https://stockfishchess.org/blog/2021/stockfish-13/
[3] https://lczero.org/blog/2021/06/the-importance-of-open-data/
[4] https://github.com/official-stockfish/Stockfish/commit/e8d64af1
[5] https://github.com/glinscott/nnue-pytorch/
[6] https://stockfishchess.org/get-involved/
In the so-called "hybrid" method of evaluation of current master, we use the
classical eval (because of its speed) instead of the NNUE eval when the classical
material balance approximation hints that the position is "winning enough" to
rely on the classical eval.
This trade-off idea between speed and accuracy works well in general, but in
some fortress positions the classical eval is just bad. So in shuffling branches
of the search tree, we (slowly) increase the thresehold so that eventually we
don't trust classical anymore and switch to NNUE evaluation.
This patch increases that threshold faster, so that we switch to NNUE quicker
in shuffling branches. Idea is to incite Stockfish to spend less time in fortresses
lines in the search tree, and spend more time searching the critical lines.
passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 47872 W: 3908 L: 3720 D: 40244
Ptnml(0-2): 122, 3053, 17419, 3199, 143
https://tests.stockfishchess.org/tests/view/60cef34b457376eb8bcab79d
passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 73616 W: 2326 L: 2143 D: 69147
Ptnml(0-2): 21, 1940, 32705, 2119, 23
https://tests.stockfishchess.org/tests/view/60cf6d842114332881e73528
Retested at LTC against lastest master:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 18264 W: 642 L: 532 D: 17090
Ptnml(0-2): 6, 479, 8055, 583, 9
https://tests.stockfishchess.org/tests/view/60d18cd540925195e7a6c351
closes https://github.com/official-stockfish/Stockfish/pull/3578
Bench: 5139233
This patch removes the UCI option for setting Contempt in classical evaluation.
It is exactly equivalent to using Contempt=0 for the UCI contempt value and keeping
the dynamic part in the algo (renaming this dynamic part `trend` to better describe
what it does). We have tried quite hard to implement a working Contempt feature for
NNUE but nothing really worked, so it is probably time to give up.
Interested chess fans wishing to keep playing with the UCI option for Contempt and
use it with the classical eval are urged to download the version tagged "SF_Classical"
of Stockfish (dated 31 July 2020), as it was the last version where our search
algorithm was tuned for the classical eval and is probably our strongest classical
player ever: https://github.com/official-stockfish/Stockfish/tags
Passed STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 72904 W: 6228 L: 6175 D: 60501
Ptnml(0-2): 221, 5006, 25971, 5007, 247
https://tests.stockfishchess.org/tests/view/60c98bf9457376eb8bcab18d
Passed LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 45168 W: 1601 L: 1547 D: 42020
Ptnml(0-2): 38, 1331, 19786, 1397, 32
https://tests.stockfishchess.org/tests/view/60c9c7fa457376eb8bcab1bb
closes https://github.com/official-stockfish/Stockfish/pull/3575
Bench: 4947716
This patch increase the weight of pawns and pieces from 28 to 32
in the scaling formula we apply to the output of the NNUE pure eval.
Increasing this gradient for pawns and pieces means that Stockfish
will try a little harder to keep material when she has the advantage,
and try a little bit harder to escape into an endgame when she is
under pressure.
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 53168 W: 4371 L: 4177 D: 44620
Ptnml(0-2): 160, 3389, 19283, 3601, 151
https://tests.stockfishchess.org/tests/view/60cefd1d457376eb8bcab7ab
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 10888 W: 386 L: 288 D: 10214
Ptnml(0-2): 3, 260, 4821, 356, 4
https://tests.stockfishchess.org/tests/view/60cf709d2114332881e7352b
closes https://github.com/official-stockfish/Stockfish/pull/3571
Bench: 4965430
trained with the Python command
c:\nnue>python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 300 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --lambda=1.0 --max_epochs=440 --seed %random%%random% --default_root_dir exp/run_10 --resume-from-model ./pt/nn-3b20abec10c1.pt
`
all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing
Each set was concatenated together - making one large Wrong_NNUE 2 binpack and one large Training so the were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE.binpack closer to equal weighting with the Training_Data binpack .
Net nn-3b20abec10c1.nnue was chosen as the --resume-from-model with the idea that through learning, the manually hex edited values will be learned and will not need to be manually adjusted going forward. They would also be fine tuned by the learning process.
passed STC:
https://tests.stockfishchess.org/tests/view/60cdf91e457376eb8bcab66f
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 18256 W: 1639 L: 1479 D: 15138
Ptnml(0-2): 59, 1179, 6505, 1313, 72
passed LTC:
https://tests.stockfishchess.org/tests/view/60ce2166457376eb8bcab6e1
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 18792 W: 654 L: 542 D: 17596
Ptnml(0-2): 9, 490, 8291, 592, 14
closes https://github.com/official-stockfish/Stockfish/pull/3570
Bench: 5020972
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3569
No functional change
move to github actions to replace travis CI.
First version, testing on linux using gcc and clang.
gcc build with sanitizers and valgrind.
No functional change
Optimization of vondele's nn-33c9d39e5eb6.nnue using SPSA
https://tests.stockfishchess.org/tests/view/60ca68be457376eb8bcab28b
Setting: ck values are default based on how large the parameters are
The new values for this net are the raw values at the end of the tuning (80k games)
The significant changes are in buckets 1 and 2 (5-12 pieces) so the main difference is in playing endgames if we compare it to nn-33c9. There is also change in bucket 7 (29-32 pieces) but not as substantial as the changes in buckets 1 and 2. If we interpret the changes based on an experiment a few months ago, this new net plays more optimistically during endgames and less optimistically during openings.
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 49504 W: 4246 L: 4053 D: 41205
Ptnml(0-2): 140, 3282, 17749, 3407, 174
https://tests.stockfishchess.org/tests/view/60cbd752457376eb8bcab478
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 88720 W: 4926 L: 4651 D: 79143
Ptnml(0-2): 105, 4048, 35793, 4295, 119
https://tests.stockfishchess.org/tests/view/60cc7828457376eb8bcab4fa
closes https://github.com/official-stockfish/Stockfish/pull/3566
Bench: 4758885
This net was created by @pleomati, who manually edited with an hex editor
10 values randomly chosen in the LCSFNet10 net (nn-6ad41a9207d0.nnue) to
create this one. The LCSFNet10 net was trained by Joost VandeVondele from
a dataset combining Stockfish games and Leela games (16x10^9 positions from
SF self-play at depth 9, and 6.3x10^9 positions from Leela games, so overall
72% of Stockfish positions and 28% of Leela positions).
passed STC 10+0.1:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 50888 W: 5881 L: 5654 D: 39353
Ptnml(0-2): 281, 4290, 16085, 4497, 291
https://tests.stockfishchess.org/tests/view/60cbfa68457376eb8bcab49a
passed LTC 60+0.6:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 25480 W: 1498 L: 1338 D: 22644
Ptnml(0-2): 36, 1155, 10193, 1325, 31
https://tests.stockfishchess.org/tests/view/60cc4af8457376eb8bcab4d4
closes https://github.com/official-stockfish/Stockfish/pull/3564
Bench: 4904930
This reverts commit "Fix for Cygwin's environment build-profile", as it was
giving errors for "make clean" on some Windows environments. See comments in
https://github.com/official-stockfish/Stockfish/commit/68bf362ea2385a641be9f5ed9ce2acdf55a1ecf1
Possibly somebody can propose a solution that would fix Cygwin builds and
not break on other system too, stay tuned! :-)
No functional change
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3463
No functional change
of a root move leading to a 3-fold repetition.
With this small fix a draw ranking and thus a draw score is being applied.
This works for both, ranking by dtz or wdl tables.
Fixes https://github.com/official-stockfish/Stockfish/issues/3542
(No functional change without TBs.)
Bench: 4877339
This net is the result of training on data used by the Leela project. More precisely,
we shuffled T60 and T74 data kindly provided by borg (for different Tnn, the data is
a result of Leela selfplay with differently sized Leela nets).
The data is available at vondele's google drive:
https://drive.google.com/drive/folders/1mftuzYdl9o6tBaceR3d_VBQIrgKJsFpl.
The Leela data comes in small chunks of .binpack files. To shuffle them, we simply
used a small python script to randomly rename the files, and then concatenated them
using `cat`. As validation data we picked a file of T60 data. We will further investigate
T74 data.
The training for the NNUE architecture used 200 epochs with the Python trainer from
the Stockfish project. Unlike the previous run we tried with this data, this run does
not have adjusted scaling — not because we didn't want to, but because we forgot.
However, this training randomly skips 40% more positions than previous run. The loss
was very spiky and decreased slower than it does usually.
Training loss: https://github.com/official-stockfish/images/blob/main/training-loss-8e47cf062333.png
Validation loss: https://github.com/official-stockfish/images/blob/main/validation-loss-8e47cf062333.png
This is the exact training command:
python train.py --smart-fen-skipping --random-fen-skipping 14 --batch-size 16384 --threads 4 --num-workers 4 --gpus 1 trainingdata\training_data.binpack validationdata\val.binpack
---
10k STC result:
ELO: 3.61 +-3.3 (95%) LOS: 98.4%
Total: 10000 W: 1241 L: 1137 D: 7622
Ptnml(0-2): 68, 841, 3086, 929, 76
https://tests.stockfishchess.org/tests/view/60c67e50457376eb8bcaae70
10k LTC result:
ELO: 2.71 +-2.4 (95%) LOS: 98.8%
Total: 10000 W: 659 L: 581 D: 8760
Ptnml(0-2): 22, 485, 3900, 579, 14
https://tests.stockfishchess.org/tests/view/60c69deb457376eb8bcaae98
Passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9648 W: 685 L: 545 D: 8418
Ptnml(0-2): 22, 448, 3740, 596, 18
https://tests.stockfishchess.org/tests/view/60c6d41c457376eb8bcaaecf
---
closes https://github.com/official-stockfish/Stockfish/pull/3550
Bench: 4877339
Compute optimal register count for feature transformer accumulation dynamically.
This also introduces a change where AVX512 would only use 8 registers instead of 16
(now possible due to a 2x increase in feature transformer size).
closes https://github.com/official-stockfish/Stockfish/pull/3543
No functional change
This patch restricts LMR extensions (of non-transposition table moves) from being
used when the transposition table move was extended by two plies via singular
extension. This may serve to limit search explosions in certain positions.
This makes a lot of sense because the precondition for the tt-move to have been
singular extended by two plies is that the result of the alternate search (with
excluded the tt-move) has been a hard fail low: it is natural to later search less
for non tt-moves in this situation.
The current state of depth/extensions/reductions management is getting quite tricky
in our search algo, see https://github.com/official-stockfish/Stockfish/pull/3546#issuecomment-860174549
for some discussion. Suggestions welcome!
Passed STC
https://tests.stockfishchess.org/tests/view/60c3f293457376eb8bcaac8d
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 117984 W: 9698 L: 9430 D: 98856
Ptnml(0-2): 315, 7708, 42703, 7926, 340
passed LTC
https://tests.stockfishchess.org/tests/view/60c46ea5457376eb8bcaacc7
LLR: 2.97 (-2.94,2.94) <0.50,3.50>
Total: 11280 W: 401 L: 302 D: 10577
Ptnml(0-2): 2, 271, 4998, 364, 5
closes https://github.com/official-stockfish/Stockfish/pull/3546
Bench: 4709974
Load feature transformer weights in bulk on little-endian machines.
This is in particular useful to test new nets with c-chess-cli,
see https://github.com/lucasart/c-chess-cli/issues/44
```
$ time ./stockfish.exe uci
Before : 0m0.914s
After : 0m0.483s
```
No functional change
Cleaner vector code structure in feature transformer. This patch just
regroups the parts of the inner loop for each SIMD instruction set.
Tested for non-regression:
LLR: 2.96 (-2.94,2.94) <-2.50,0.50>
Total: 115760 W: 9835 L: 9831 D: 96094
Ptnml(0-2): 326, 7776, 41715, 7694, 369
https://tests.stockfishchess.org/tests/view/60b96b39457376eb8bcaa26e
It would be nice if a future patch could use some of the macros at
the top of the file to unify the code between the distincts SIMD
instruction sets (of course, unifying the Relu will be the challenge).
closes https://github.com/official-stockfish/Stockfish/pull/3506
No functional change
This simplification patch implements two changes:
1. it simplifies away the so-called "lazy" path in the NNUE evaluation internals,
where we trusted the psqt head alone to avoid the costly "positional" head in
some cases;
2. it raises a little bit the NNUEThreshold1 in evaluate.cpp (from 682 to 800),
which increases the limit where we switched from NNUE eval to Classical eval.
Both effects increase the number of positional evaluations done by our new net
architecture, but the results of our tests below seem to indicate that the loss
of speed will be compensated by the gain of eval quality.
STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 26280 W: 2244 L: 2137 D: 21899
Ptnml(0-2): 72, 1755, 9405, 1810, 98
https://tests.stockfishchess.org/tests/view/60ae73f112066fd299795a51
LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 20592 W: 750 L: 677 D: 19165
Ptnml(0-2): 9, 614, 8980, 681, 12
https://tests.stockfishchess.org/tests/view/60ae88e812066fd299795a82
closes https://github.com/official-stockfish/Stockfish/pull/3503
Bench: 3817907
Definition of the lazy threshold moved to evaluate.cpp where all others are.
Lazy threshold only used for real searches, not used for the "eval" call.
This preserves the purity of NNUE evaluation, which is useful to verify
consistency between the engine and the NNUE trainer.
closes https://github.com/official-stockfish/Stockfish/pull/3499
No functional change
Our new nets output two values for the side to move in the last layer.
We can interpret the first value as a material evaluation of the
position, and the second one as the dynamic, positional value of the
location of pieces.
This patch changes the balance for the (materialist, positional) parts
of the score from (128, 128) to (121, 135) when the piece material is
equal between the two players, but keeps the standard (128, 128) balance
when one player is at least an exchange up.
Passed STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 15936 W: 1421 L: 1266 D: 13249
Ptnml(0-2): 37, 1037, 5694, 1134, 66
https://tests.stockfishchess.org/tests/view/60a82df9ce8ea25a3ef0408f
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 13904 W: 516 L: 410 D: 12978
Ptnml(0-2): 4, 374, 6088, 484, 2
https://tests.stockfishchess.org/tests/view/60a8bbf9ce8ea25a3ef04101
closes https://github.com/official-stockfish/Stockfish/pull/3492
Bench: 3856635
The Tempo variable was introduced 10 years ago in our search because the
classical evaluation function was antisymmetrical in White and Black by design
to gain speed:
Eval(White to play) = -Eval(Black to play)
Nowadays our neural networks know which side is to play in a position when
they evaluate a position and are trained on real games, so the neural network
encodes the advantage of moving as an output of search. This patch shows that
the Tempo variable is not necessary anymore.
STC:
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 33512 W: 2805 L: 2709 D: 27998
Ptnml(0-2): 80, 2209, 12095, 2279, 93
https://tests.stockfishchess.org/tests/view/60a44ceace8ea25a3ef03d30
LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 53920 W: 1807 L: 1760 D: 50353
Ptnml(0-2): 16, 1617, 23650, 1658, 19
https://tests.stockfishchess.org/tests/view/60a477f0ce8ea25a3ef03d49
We also tried a match (20000 games) at STC using purely classical, result was neutral:
https://tests.stockfishchess.org/tests/view/60a4eebcce8ea25a3ef03db5
Note: there are two locations left in search.cpp where we assume antisymmetry
of evaluation (in relation with a speed optimization for null moves in lines
770 and 1439), but as the values are just used for heuristic pruning this
approximation should not hurt too much because the order of magnitude is still
true most of the time.
closes https://github.com/official-stockfish/Stockfish/pull/3481
Bench: 4015864
This improves the speed of NNUE by a bit on old hardware that code path
is intended for, like a Pentium III 1.13 GHz:
10 repeats of "./stockfish bench 16 1 13 default depth NNUE":
Before:
54 642 504 897 cycles (± 0.12%)
62 301 937 829 instructions (± 0.03%)
After:
54 320 821 928 cycles (± 0.13%)
62 084 742 699 instructions (± 0.02%)
Speed of go depth 20 from startpos:
Before: 53103 nps
After: 53856 nps
closes https://github.com/official-stockfish/Stockfish/pull/3476
No functional change.
This patch increases lmrDepth threshold for continuation history based pruning in search.
This part of code for a long time was known to be really TC sensitive - decreasing
this threshold easily passed lower time controls but failed badly at LTC,
on the other hand it increase was part of a tuning that resulted
in being negative at STC but was +12 elo at 180+1.8.
After recent simplification of special conditions that sometimes
increase it from 4 to 5 it was logical to overall test at longer
time controls if 5 is better than 4 with deeper searches.
reduces strenght on STC
https://tests.stockfishchess.org/tests/view/60a3a8bbce8ea25a3ef03c74
ELO: -2.57 +-2.0 (95%) LOS: 0.6%
Total: 20000 W: 1820 L: 1968 D: 16212
Ptnml(0-2): 68, 1582, 6836, 1458, 56
Passed LTC with STC bounds
https://tests.stockfishchess.org/tests/view/60a027395085663412d090ce
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 175256 W: 6774 L: 6548 D: 161934
Ptnml(0-2): 91, 5808, 75604, 6034, 91
Passed VLTC with LTC bounds
https://tests.stockfishchess.org/tests/view/60a2bccce229097940a037a7
LLR: 2.96 (-2.94,2.94) <0.50,3.50>
Total: 65736 W: 1224 L: 1092 D: 63420
Ptnml(0-2): 5, 1012, 30706, 1136, 9
closes https://github.com/official-stockfish/Stockfish/pull/3473
bench 3689330
- Comment for Countemove pruning -> Continuation history
- Fix comment in input_slice.h
- Shorter lines in Makefile
- Comment for scale factor
- Fix comment for pinners in see_ge()
- Change Thread.id() signature to size_t
- Trailing space in reprosearch.sh
- Add Douglas Matos Gomes to the AUTHORS file
- Introduce comment for undo_null_move()
- Use Stockfish coding style for export_net()
- Change date in AUTHORS file
closes https://github.com/official-stockfish/Stockfish/pull/3416
No functional change
e2k (Elbrus 2000) - this is a VLIW/EPIC architecture,
the like Intel Itanium (IA-64) architecture.
The architecture has half native / half software support
for most Intel/AMD SIMD (e.g. MMX/SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2/AES/AVX/AVX2 & 3DNow!/SSE4a/XOP/FMA4) via intrinsics.
https://en.wikipedia.org/wiki/Elbrus_2000
closes https://github.com/official-stockfish/Stockfish/pull/3425
No functional change
This PR adds an ability to export any currently loaded network.
The export_net command now takes an optional filename parameter.
If the loaded net is not the embedded net the filename parameter is required.
Two changes were required to support this:
* the "architecture" string, which is really just a some kind of description in the net, is now saved into netDescription on load and correctly saved on export.
* the AffineTransform scrambles weights for some architectures and sparsifies them, such that retrieving the index is hard. This is solved by having a temporary scrambled<->unscrambled index lookup table when loading the network, and the actual index is saved for each individual weight that makes it to canSaturate16. This increases the size of the canSaturate16 entries by 6 bytes.
closes https://github.com/official-stockfish/Stockfish/pull/3456
No functional change
This patch broadens and simplifies definition of PvNode that is likely to fail low.
New definition can be described as following "If node was already researched
at depth >= current depth and failed low there" which is more logical than the
previous version and takes less space + allows to not recompute it every time during move loop.
Passed simplification STC
https://tests.stockfishchess.org/tests/view/609148bf95e7f1852abd2e82
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 20128 W: 1865 L: 1751 D: 16512
Ptnml(0-2): 63, 1334, 7165, 1430, 72
Passed simplification LTC
https://tests.stockfishchess.org/tests/view/6091691295e7f1852abd2e8b
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 95128 W: 3498 L: 3481 D: 88149
Ptnml(0-2): 41, 2956, 41549, 2981, 37
closes https://github.com/official-stockfish/Stockfish/pull/3455
Bench: 3933037
Introduce variable tempo for nnue depending on logarithm of estimated
strength, where strength is the product of time and number of threads.
The original idea here was that NNUE is best with a slightly different
tempo value to classical, since its style of play is slightly different.
It turns out that the best tempo for NNUE varies with strength of play,
so a formula is used which gives about 19 for STC and 24 for LTC under
current fishtest settings.
STC 10+0.1:
LLR: 2.94 (-2.94,2.94) {-0.20,1.10}
Total: 120816 W: 11155 L: 10861 D: 98800
Ptnml(0-2): 406, 8728, 41933, 8848, 493
https://tests.stockfishchess.org/tests/view/60735b3a8141753378960534
LTC 60+0.6:
LLR: 2.94 (-2.94,2.94) {0.20,0.90}
Total: 35688 W: 1392 L: 1234 D: 33062
Ptnml(0-2): 23, 1079, 15473, 1255, 14
https://tests.stockfishchess.org/tests/view/6073ffbc814175337896057f
Passed non-regression SMP test at LTC 20+0.2 (8 threads):
LLR: 2.95 (-2.94,2.94) {-0.70,0.20}
Total: 11008 W: 317 L: 267 D: 10424
Ptnml(0-2): 2, 245, 4962, 291, 4
https://tests.stockfishchess.org/tests/view/60749ea881417533789605a4
closes https://github.com/official-stockfish/Stockfish/pull/3426
Bench 4075325
A lot of optimizations happend since the NNUE was introduced
and since then some parts of the code were left unused. This
got to the point where asserts were have to be made just to
let people know that modifying something will not have any
effects or may even break everything due to the assumptions
being made. Removing these parts removes those inexisting
"false dependencies". Additionally:
* append_changed_indices now takes the king pos and stateinfo
explicitly, no more misleading pos parameter
* IndexList is removed in favor of a generic ValueList.
Feature transformer just instantiates the type it needs.
* The update cost and refresh requirement is deferred to the
feature set once again, but now doesn't go through the whole
FeatureSet machinery and just calls HalfKP directly.
* accumulator no longer has a singular dimension.
* The PS constants and the PieceSquareIndex array are made local
to the HalfKP feature set because they are specific to it and
DO differ for other feature sets.
* A few names are changed to more descriptive
Passed STC non-regression:
https://tests.stockfishchess.org/tests/view/608421dd95e7f1852abd2790
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 180008 W: 16186 L: 16258 D: 147564
Ptnml(0-2): 587, 12593, 63725, 12503, 596
closes https://github.com/official-stockfish/Stockfish/pull/3441
No functional change
NNUE evaluation is incapable of recognizing trivially drawn bishop endgames
(the wrong-colored rook pawn), which are in fact ubiquitous and stock standard
in chess analysis. Switching off NNUE evaluation in KBPs vs KPs endgames is
a measure that stops Stockfish from trading down to a drawn version of these
endings when we presumably have advantage. The patch is able to edge over master
in endgame positions.
Patch tested for Elo gain with the "endgame.epd" book, and verified for
non-regression with our usual book (see the pull request for details).
STC:
LLR: 2.93 (-2.94,2.94) {-0.20,1.10}
Total: 33232 W: 6655 L: 6497 D: 20080
Ptnml(0-2): 4, 2342, 11769, 2494, 7
https://tests.stockfishchess.org/tests/view/6074a52981417533789605b8
LTC:
LLR: 2.93 (-2.94,2.94) {0.20,0.90}
Total: 159056 W: 29799 L: 29378 D: 99879
Ptnml(0-2): 7, 9004, 61085, 9425, 7
https://tests.stockfishchess.org/tests/view/6074c39a81417533789605ca
Closes https://github.com/official-stockfish/Stockfish/pull/3427
Bench: 4503918
blah
This patch changes the pop_lsb() signature from Square pop_lsb(Bitboard*) to
Square pop_lsb(Bitboard&). This is more idomatic for C++ style signatures.
Passed a non-regression STC test:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 21280 W: 1928 L: 1847 D: 17505
Ptnml(0-2): 71, 1427, 7558, 1518, 66
https://tests.stockfishchess.org/tests/view/6053a1e22433018de7a38e2f
We have verified that the generated binary is identical on gcc-10.
Closes https://github.com/official-stockfish/Stockfish/pull/3404
No functional change.
our net currently is not trained on FRC games, and so doesn't know about the important pattern of a bishop that is cornered in FRC.
This patch introduces a term we have in the classical evaluation for this case, and adds it to the NNUE eval.
Since fishtest doesn't support FRC right now, the patch was tested locally at STC conditions,
starting from the book of FRC starting positions.
Score of master vs patch: 993 - 2226 - 6781 [0.438] 10000
Which corresponds to approximately 40 Elo
The patch passes non-regression testing for traditional chess (where it adds one branch).
passed STC:
https://tests.stockfishchess.org/tests/view/604fa2532433018de7a38b67
LLR: 2.95 (-2.94,2.94) {-1.25,0.25}
Total: 30560 W: 2701 L: 2636 D: 25223
Ptnml(0-2): 88, 2056, 10921, 2133, 82
passed STC also in an earlier version:
https://tests.stockfishchess.org/tests/view/604f61282433018de7a38b4d
closes https://github.com/official-stockfish/Stockfish/pull/3398
No functional change
We remark that in current master, most of our use cases for between_bb() can be
optimized if the second parameter of the function is added to the segment. So we
change the definition of between_bb(s1, s2) such that it excludes s1 but includes s2.
We also use a precomputed array for between_bb() for another small speed gain
(see https://tests.stockfishchess.org/tests/view/604d09f72433018de7a389fb).
Passed STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 18736 W: 1746 L: 1607 D: 15383
Ptnml(0-2): 61, 1226, 6644, 1387, 50
https://tests.stockfishchess.org/tests/view/60428c84ddcba5f0627bb6e4
Yellow LTC:
LTC:
LLR: -3.00 (-2.94,2.94) {0.25,1.25}
Total: 39144 W: 1431 L: 1413 D: 36300
Ptnml(0-2): 13, 1176, 17184, 1178, 21
https://tests.stockfishchess.org/tests/view/605128702433018de7a38ca1
Closes https://github.com/official-stockfish/Stockfish/pull/3397
---------
Verified for correctness by running perft on the following position:
./stockfish
position fen 4rrk1/1p1nq3/p7/2p1P1pp/3P2bp/3Q1Bn1/PPPB4/1K2R1NR w - - 40 21
go perft 6
Nodes searched: 6136386434
--------
No functional change
The codebase contains multiple functions returning by const-value.
This patch is a small cleanup making those function returns
by value instead, removing the const specifier.
closes https://github.com/official-stockfish/Stockfish/pull/3328
No functional change
We introduce a metric for each internal node in search, called DistanceFromPV.
This distance indicated how far the current node is from the principal variation.
We then use this distance to search the nodes which are close to the PV a little
deeper (up to 4 plies deeper than the PV): this improves the quality of the search
at these nodes and bring better updates for the PV during search.
STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 54936 W: 5047 L: 4850 D: 45039
Ptnml(0-2): 183, 3907, 19075, 4136, 167
https://tests.stockfishchess.org/tests/view/6037b88e7f517a561bc4a392
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 49608 W: 1880 L: 1703 D: 46025
Ptnml(0-2): 22, 1514, 21555, 1691, 22
https://tests.stockfishchess.org/tests/view/6038271b7f517a561bc4a3cb
Closes https://github.com/official-stockfish/Stockfish/pull/3369
Bench: 5037279
The idea of this patch can be described as follows: if we are in check
and the transposition table move is a capture that returns a value
far above beta, we can assume that the opponent just blundered a piece
by giving check, and we return the transposition table value. This is
similar to the usual probCut logic for quiet moves, but with a different
threshold.
Passed STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 33440 W: 3056 L: 2891 D: 27493
Ptnml(0-2): 110, 2338, 11672, 2477, 123
https://tests.stockfishchess.org/tests/view/602cd1087f517a561bc49bda
Passed LTC
LLR: 2.98 (-2.94,2.94) {0.25,1.25}
Total: 10072 W: 401 L: 309 D: 9362
Ptnml(0-2): 2, 288, 4365, 378, 3
https://tests.stockfishchess.org/tests/view/602ceea57f517a561bc49bf0
The committed version has an additional fix to never return unproven wins
in the tablebase range or the mate range. This fix passed tests for non-
regression at STC and LTC:
STC:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 26240 W: 2354 L: 2280 D: 21606
Ptnml(0-2): 85, 1763, 9372, 1793, 107
https://tests.stockfishchess.org/tests/view/602d86a87f517a561bc49c7a
LTC:
LLR: 2.95 (-2.94,2.94) {-0.75,0.25}
Total: 35304 W: 1299 L: 1256 D: 32749
Ptnml(0-2): 14, 1095, 15395, 1130, 18
https://tests.stockfishchess.org/tests/view/602d98d17f517a561bc49c83
Closes https://github.com/official-stockfish/Stockfish/pull/3362
Bench: 3830215
Official release version of Stockfish 13
Bench: 3766422
-----
It is our pleasure to release Stockfish 13 to chess fans worldwide.
As usual, downloads are freely available at
https://stockfishchess.org
The Stockfish project builds on a thriving community of enthusiasts
who contribute their expertise, time, and resources to build a free
and open-source chess engine that is robust, widely available, and
very strong. We would like to thank them all!
The good news first: from now on, our users can expect more frequent
high-quality releases of Stockfish! Sadly, this decision has been
triggered by the start of sales of the Fat Fritz 2 engine by ChessBase,
which is a copy of a very recent development version of Stockfish
with minor modifications. We refer to our statement on Fat Fritz 2[1]
and a community blog[2] for further information.
This version of Stockfish is significantly stronger than any of its
predecessors. Stockfish 13 outperforms Stockfish 12 by at least
35 Elo[3]. When playing against a one-year-old Stockfish, it wins 60
times more game pairs than it loses[4]. This release features an NNUE
network retrained on billions of positions, much faster network
evaluation code, and significantly improved search heuristics, as
well as additional evaluation tweaks. In the course of its development,
this version has won the superfinals of the TCEC Season 19 and
TCEC Season 20.
Going forward, the Leela Chess Zero and Stockfish teams will join
forces to demonstrate our commitment to open source chess engines and
training tools, and open data. We are convinced that our free and
open-source chess engines serve the chess community very well.
Stay safe and enjoy chess!
The Stockfish team
[1] https://blog.stockfishchess.org/post/643239805544792064/statement-on-fat-fritz-2
[2] https://lichess.org/blog/YCvy7xMAACIA8007/fat-fritz-2-is-a-rip-off
[3] https://tests.stockfishchess.org/tests/view/602bcccf7f517a561bc49b11
[4] https://tests.stockfishchess.org/tests/view/600fbb9c735dd7f0f0352d59
• reorder some sections of the README file
• add reference to the AUTHORS file
• rename Syzygybases to Syzygy tablebases
• add pointer to the Discord channel
• more precise info about the GPLv3 licence
No functional change
Do not decrease reduction at pv-nodes which are likely to fail low.
The idea of this patch can be described as following: during the search, if a node
on the principal variation was re-searched in non-pv search and this re-search got
a value which was much lower than alpha, then we can assume that this pv-node is
likely to fail low again, and we can reduce more aggressively at this node.
Passed STC
https://tests.stockfishchess.org/tests/view/6023a5fa7f517a561bc49638
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 70288 W: 6443 L: 6223 D: 57622
Ptnml(0-2): 239, 5022, 24436, 5174, 273
Passed LTC
https://tests.stockfishchess.org/tests/view/6023f2617f517a561bc49661
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 105656 W: 4048 L: 3748 D: 97860
Ptnml(0-2): 67, 3312, 45761, 3630, 58
Closes https://github.com/official-stockfish/Stockfish/pull/3349
Bench: 3766422
This patch removes some magic numbers in TT bit management and introduce proper
constants in the code, to improve documentation and ease further modifications.
No function change
It's about 1% speedup for Stockfish.
Result of 100 runs
==================
base (...fish_clang12) = 1946851 +/- 3717
test (./stockfish ) = 1967276 +/- 3408
diff = +20425 +/- 2438
speedup = +0.0105
P(speedup > 0) = 1.0000
Thanks to David Major for making me aware of this part
of LLVM development.
closes https://github.com/official-stockfish/Stockfish/pull/3346
No functional change
This patch give a small bonus to incite the attacking side to keep more
pawns on the board.
A consequence of this bonus is that Stockfish will tend to play positions
slightly more closed on average than master, especially when it believes
that it has an advantage.
To lower the risk of blockades where Stockfish start shuffling without
progress, we also implement a progressive decrease of the evaluation
value with the 50 moves counter (along with the necessary aging of the
transposition table to reduce the impact of the Graph History Interaction
problem): since the evaluation decreases during shuffling phases, the
engine will tend to examine the consequences of pawn breaks faster during
the search.
Passed STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 26184 W: 2406 L: 2252 D: 21526
Ptnml(0-2): 85, 1784, 9223, 1892, 108
https://tests.stockfishchess.org/tests/view/600cc08b735dd7f0f0352c06
Passed LCT:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 199768 W: 7695 L: 7191 D: 184882
Ptnml(0-2): 85, 6478, 86269, 6952, 100
https://tests.stockfishchess.org/tests/view/600ccd28735dd7f0f0352c10
Closes https://github.com/official-stockfish/Stockfish/pull/3321
Bench: 3988915
Give an additional penalty of S(20, 10) for any doubled pawn if none of
the opponent's pawns is facing any of our
- pawns or
- pawn attacks;
that means, each of their pawns can push at least one square without
being captured.
This ignores their non-pawns pieces and attacks.
One possible justification: Their pawns' ability to push freely provides
options to react to our threats by changing their pawn structure. Our
doubled pawns however will likely lead to an exploitable weakness, even
if the pawn structure is not yet fixed.
Note that the notion of "their pawns not being fixed" is symmetric for
both players: If all of their pawns can push freely so can ours. All
pawns being freely pushable might just be an early-game-indicator.
However, it can trigger during endgame pawns races, where doubled pawns
are especially hindering, too.
LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 134976 W: 17964 L: 17415 D: 99597
Ptnml(0-2): 998, 12702, 39619, 13091, 1078
https://tests.stockfishchess.org/tests/view/5ffdd5316019e097de3ef281
STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 35640 W: 7219 L: 6904 D: 21517
Ptnml(0-2): 645, 4096, 8084, 4289, 706
https://tests.stockfishchess.org/tests/view/5ffda4a16019e097de3ef265
closes https://github.com/official-stockfish/Stockfish/pull/3302
Bench: 4363873
This change simplifies control flow in the generate_moves function which ensures the compiler doesn't duplicate work due to possibly not resolving pureness of the function calls. Also the biggest change is the removal of the unnecessary condition checking for empty b in a convoluted way. The rationale for removal of this condition is that computing attacks_bb with occupancy is not much more costly than computing pseudo attacks and overall the condition (also being likely unpredictable) is a pessimisation.
This is inspired by previous changes by @BM123499.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 88040 W: 8172 L: 7931 D: 71937
Ptnml(0-2): 285, 6128, 30957, 6361, 289
https://tests.stockfishchess.org/tests/view/5ffc28386019e097de3ef1c7
closes https://github.com/official-stockfish/Stockfish/pull/3300
No functional change.
Changed name from Bad Outpost to Uncontested Outpost
Scale Uncontested Outpost with number of pawns + Decrease Bishop PSQT values and general tuning
Credits for the decrease of the Bishop PSQT values: Fauzi
Credits for scaling Uncontested Outpost with number of pawns: Lolligerhans
Credits for the tunings: Fauzi
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 32040 W: 6593 L: 6281 D: 19166
Ptnml(0-2): 596, 3713, 7095, 4015, 601
https://tests.stockfishchess.org/tests/view/5ffa43026019e097de3ef0f2
Passed LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 84376 W: 11395 L: 10950 D: 62031
Ptnml(0-2): 652, 7930, 24623, 8287, 696
https://tests.stockfishchess.org/tests/view/5ffa6e7b6019e097de3ef0fd
closes https://github.com/official-stockfish/Stockfish/pull/3296
Bench: 4287509
- "discovered check" (instead of "discovery check")
- "en passant" (instead of "en-passant")
- "pseudo-legal" before a noun (instead of "pseudo legal")
- "3-fold" (instead of "3fold")
closes https://github.com/official-stockfish/Stockfish/pull/3294
No functional change.
In a recent patch we added comparing capture history to a number for LMR of captures.
Calling it via thisThread-> is not needed since capture history was already declared by this time -
so removing makes code slightly shorter and easier to follow.
closes https://github.com/official-stockfish/Stockfish/pull/3297
No functional change.
Size of the weights in the last layer is less than 512 bits. It leads to wrong data access for AVX512. There is no error because in current implementation it is guaranteed that there is an array of zeros after weights so zero multiplied by something is returned and sum is correct. It is a mistake that can lead to unexpected bugs in the future. Used AVX2 instructions for smaller input size.
No measurable slowdown on avx512.
closes https://github.com/official-stockfish/Stockfish/pull/3298
No functional change.
Reordered weights in such a way that accumulated sum fits to output.
Weights are grouped in blocks of four elements because four
int8 (weight type) corresponds to one int32 (output type).
No horizontal additions.
Grouped AVX512, AVX2 and SSSE3 implementations.
Repeated code was removed.
An earlier version passed STC:
LLR: 2.97 (-2.94,2.94) {-0.25,1.25}
Total: 15336 W: 1495 L: 1355 D: 12486
Ptnml(0-2): 44, 1054, 5350, 1158, 62
https://tests.stockfishchess.org/tests/view/5ff60e106019e097de3eefd5
Speedup depends on the architecture, up to 4% measured on a NNUE only bench.
closes https://github.com/official-stockfish/Stockfish/pull/3287
No functional change
The idea of this patch is pretty simple - we already do more reductions
for non-PV and root nodes in case of stable best move for depth > 10.
This patch makes us do so if root depth if > 10 instead, which
is logical since best move changes (thus instability of it) is
counted at root, so it makes a lot of sense to use depth of the root.
passed STC
https://tests.stockfishchess.org/tests/view/5fd643271ac16912018885c5
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 13232 W: 1308 L: 1169 D: 10755
Ptnml(0-2): 39, 935, 4535, 1062, 45
passed LTC
https://tests.stockfishchess.org/tests/view/5fd68db11ac16912018885f0
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 14024 W: 565 L: 463 D: 12996
Ptnml(0-2): 3, 423, 6062, 517, 7
closes https://github.com/official-stockfish/Stockfish/pull/3263
Bench: 4050630
Improves throughput by summing 2 intermediate dot products using 16 bit addition before upconverting to 32 bit.
Potential saturation is detected and the code-path is avoided in this case.
The saturation can't happen with the current nets,
but nets can be constructed that trigger this check.
STC https://tests.stockfishchess.org/tests/view/5fd40a861ac1691201888479
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 25544 W: 2451 L: 2296 D: 20797
Ptnml(0-2): 92, 1761, 8925, 1888, 106
about 5% speedup
closes https://github.com/official-stockfish/Stockfish/pull/3261
No functional change
Imbalance tables tweaked to contain MiddleGame and Endgame values, instead of a single value.
The idea started from Fisherman, which requested my help to tune the values back in June/July,
so I tuned the values back then, and we were able to accomplish good results,
but not enough to pass both STC and LTC tests.
So after the recent changes, I decided to give it another shot, and I am glad that it was a successful attempt.
A special thanks goes also to mstembera, which notified me a simple way to let the patch perform a little better.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 115976 W: 23124 L: 22695 D: 70157
Ptnml(0-2): 2074, 13652, 26285, 13725, 2252
https://tests.stockfishchess.org/tests/view/5fc92d2d42a050a89f02ccc8
Passed LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 156304 W: 20617 L: 20024 D: 115663
Ptnml(0-2): 1138, 14647, 46084, 15050, 1233
https://tests.stockfishchess.org/tests/view/5fc9fee142a050a89f02cd3e
closes https://github.com/official-stockfish/Stockfish/pull/3255
Bench: 4278746
This appears to be slightly faster than using a comparison against zero
to compute the high bits, on both old (like Pentium III) and new (like
Zen 2) hardware.
closes https://github.com/official-stockfish/Stockfish/pull/3254
No functional change.
The idea of this patch can be described as following: we update static
history stats based on comparison of the static evaluations of the
position before and after the move. If the move increases static evaluation
it's assigned positive bonus, if it decreases static evaluation
it's assigned negative bonus. These stats are used in movepicker
to sort quiet moves.
passed STC
https://tests.stockfishchess.org/tests/view/5fca4c0842a050a89f02cd66
LLR: 3.00 (-2.94,2.94) {-0.25,1.25}
Total: 78152 W: 7409 L: 7171 D: 63572
Ptnml(0-2): 303, 5695, 26873, 5871, 334
passed LTC
https://tests.stockfishchess.org/tests/view/5fca6be442a050a89f02cd75
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 40240 W: 1602 L: 1441 D: 37197
Ptnml(0-2): 19, 1306, 17305, 1475, 15
closes https://github.com/official-stockfish/Stockfish/pull/3253
bench 3845156
Include scaling change as suggested by Dietrich Kappe,
the one who trained net for Komodo. According to him,
some nets may require different scaling in order to utilize its full strength.
STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 99856 W: 9669 L: 9401 D: 80786
Ptnml(0-2): 374, 7468, 34037, 7614, 435
https://tests.stockfishchess.org/tests/view/5fc2697642a050a89f02c8ec
LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 29840 W: 1220 L: 1081 D: 27539
Ptnml(0-2): 10, 969, 12827, 1100, 14
https://tests.stockfishchess.org/tests/view/5fc2ea5142a050a89f02c957
Bench: 3561701
This patch removes the incrementally updated piece lists from the Position object.
This has been tried before but always failed. My reasons for trying again are:
* 32-bit systems (including phones) are now much less important than they were some years ago (and are absent from fishtest);
* NNUE may have made SF less finely tuned to the order in which moves were generated.
STC:
LLR: 2.94 (-2.94,2.94) {-1.25,0.25}
Total: 55272 W: 5260 L: 5216 D: 44796
Ptnml(0-2): 208, 4147, 18898, 4159, 224
https://tests.stockfishchess.org/tests/view/5fc2986a42a050a89f02c926
LTC:
LLR: 2.96 (-2.94,2.94) {-0.75,0.25}
Total: 16600 W: 673 L: 608 D: 15319
Ptnml(0-2): 14, 533, 7138, 604, 11
https://tests.stockfishchess.org/tests/view/5fc2f98342a050a89f02c95c
closes https://github.com/official-stockfish/Stockfish/pull/3247
Bench: 3940967
+-----------------+
| . . . . . . . . | All files are closed. Some files are
| . . . . . o o . | more valuable for rooks, because
| . . . . o . . o | they might open in the future.
| . . . o x . . x |
| o . o x . x x . |
| x o x . . . . . | x our pawns
| . x . . . . . . | o their pawns
| . . . . . . . . | ^ rooks are scored higher on these files
+-----------------+
^ ^
Files containing none of our own pawns are open or half-open (otherwise
they are closed). Rooks on (half-)open files recieve a bonus for the
future potential to act along all ranks.
This commit refines the (relative) penalty of rooks on closed files.
Files that contain one of our blocked pawns are considered less likely
to open in the future; rooks on these files are now penalized stronger.
This bonus does not generally correlate with mobility. If the condition
is sufficiently refined in the future, it may be beneficial to adjust or
override mobility scores in some cases.
LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 494384 W: 71565 L: 70231 D: 352588
Ptnml(0-2): 3907, 48050, 142118, 49036, 4081
https://tests.stockfishchess.org/tests/view/5fb9312e67cbf42301d6afb9
LTC (non-regression w/ book noob_3moves.epd)
LLR: 2.95 (-2.94,2.94) {-0.75,0.25}
Total: 208520 W: 27044 L: 26937 D: 154539
Ptnml(0-2): 1557, 19850, 61391, 19853, 1609
https://tests.stockfishchess.org/tests/view/5fc01ced67cbf42301d6b3df
STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 98392 W: 20269 L: 19868 D: 58255
Ptnml(0-2): 1804, 11297, 22589, 11706, 1800
https://tests.stockfishchess.org/tests/view/5fb7f88a67cbf42301d6af10
closes https://github.com/official-stockfish/Stockfish/pull/3242
Bench: 3682630
in affine transform for AVX512/AVX2/SSSE3
The idea is to initialize sum with the first element instead of zero.
Reduce one add_epi32 and one set_zero SIMD instructions for each output dimension.
sum = 0; for i = 1 to n sum += a[i] ->
sum = a[1]; for i = 2 to n sum += a[i]
STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 69048 W: 7024 L: 6799 D: 55225
Ptnml(0-2): 260, 5175, 23458, 5342, 289
https://tests.stockfishchess.org/tests/view/5faf2cf467cbf42301d6aa06
closes https://github.com/official-stockfish/Stockfish/pull/3227
No functional change.
For the feature transformer the code is analogical to AVX2 since there was room for easy adaptation of wider simd registers.
For the smaller affine transforms that have 32 byte stride we keep 2 columns in one zmm register. We also unroll more aggressively so that in the end we have to do 16 parallel horizontal additions on ymm slices each consisting of 4 32-bit integers. The slices are embedded in 8 zmm registers.
These changes provide about 1.5% speedup for AVX-512 builds.
Closes https://github.com/official-stockfish/Stockfish/pull/3218
No functional change.
Using no searching time in case of a single legal move is not beneficial from
a strength point of view, and this special case can be easily removed:
STC:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 22472 W: 2458 L: 2357 D: 17657
Ptnml(0-2): 106, 1733, 7453, 1842, 102
https://tests.stockfishchess.org/tests/view/5f926cbc81eda81bd78cb6df
LTC:
LLR: 2.94 (-2.94,2.94) {-0.75,0.25}
Total: 37880 W: 1736 L: 1682 D: 34462
Ptnml(0-2): 22, 1392, 16057, 1448, 21
https://tests.stockfishchess.org/tests/view/5f92a26081eda81bd78cb6fe
The advantage of using the normal time management for a single legal move is that scores
reported for that move are reasonable, not searching leads to artifacts during games
(see e.g. https://tcec-chess.com/#div=sf&game=96&season=19)
The disadvantage of using normal time management of a single legal move is that thinking
times can be unnaturally long, making it 'painful to watch' in online tournaments.
This patch uses normal time management, but caps the used time to 500ms.
This should lead to reasonable scores, and be hardly perceptible.
closes https://github.com/official-stockfish/Stockfish/pull/3195
closes https://github.com/official-stockfish/Stockfish/pull/3183
variant of a patch suggested by SFisGOD
No functional change.
Only do countermove based pruning in qsearch if we already have a move with a better score than a TB loss.
This patch fixes a bug (started as 843a961) that incorrectly prunes moves if in check,
and adds an assert to make sure no wrong mate scores are given in the future.
It replaces a no-op moveCount check with a check for bestValue.
Initially discussed in #3171 and later in #3199, #3198 and #3210.
This PR effectively closes#3171
It also likely fixes#3196 where this causes user visible incorrect TB scores,
which probably result from these incorrect mate scores.
Passed STC and LTC non-regression tests.
https://tests.stockfishchess.org/tests/view/5f9ef8dabca9bf35bae7f648
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 21672 W: 2339 L: 2230 D: 17103
Ptnml(0-2): 126, 1689, 7083, 1826, 112
https://tests.stockfishchess.org/tests/view/5f9f0caebca9bf35bae7f666
LLR: 2.97 (-2.94,2.94) {-0.75,0.25}
Total: 33152 W: 1551 L: 1485 D: 30116
Ptnml(0-2): 27, 1308, 13832, 1390, 19
closes https://github.com/official-stockfish/Stockfish/pull/3214
Bench: 3625915
A non-functional speedup. Unroll the loops going over
the output dimensions in the affine transform layers by
a factor of 4 and perform 4 horizontal additions at a time.
Instead of doing naive horizontal additions on each vector
separately use hadd and shuffling between vectors to reduce
the number of instructions by using all lanes for all stages
of the horizontal adds.
passed STC of the initial version:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 17808 W: 1914 L: 1756 D: 14138
Ptnml(0-2): 76, 1330, 5948, 1460, 90
https://tests.stockfishchess.org/tests/view/5f9d516f6a2c112b60691da3
passed STC of the final version after cleanup:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 16296 W: 1750 L: 1595 D: 12951
Ptnml(0-2): 72, 1192, 5479, 1319, 86
https://tests.stockfishchess.org/tests/view/5f9df5776a2c112b60691de3
closes https://github.com/official-stockfish/Stockfish/pull/3203
No functional change
This patch was inspired by c065abd which updates the accumulator,
if possible, based on the accumulator of two plies back if
the accumulator of the preceding ply is not available.
With this patch we look back even further in the position history
in an attempt to reduce the number of complete recomputations.
When we find a usable accumulator for the position N plies back,
we also update the accumulator of the position N-1 plies back
because that accumulator is most likely to be helpful later
when evaluating positions in sibling branches.
By not updating all intermediate accumulators immediately,
we avoid doing too much work that is not certain to be useful.
Overall, roughly 2-3% speedup.
This patch makes the code more specific to the net architecture,
changing input features of the net will require additional changes
to the incremental update code as discussed in the PR #3193 and #3191.
Passed STC:
https://tests.stockfishchess.org/tests/view/5f9056712c92c7fe3a8c60d0
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 10040 W: 1116 L: 968 D: 7956
Ptnml(0-2): 42, 722, 3365, 828, 63
closes https://github.com/official-stockfish/Stockfish/pull/3193
No functional change.
Idea of this patch can be described as following - in case we have consecutive fail highs and we reach late enough moves at root node probability of remaining quiet moves being able to produce even bigger value than moves that produced previous cutoff (so ones that should be high in move ordering but now they fail to produce beta cutoff because we actually reached high move count) should be quiet small so we can reduce them more.
passed STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 53392 W: 5681 L: 5474 D: 42237
Ptnml(0-2): 214, 4104, 17894, 4229, 255
https://tests.stockfishchess.org/tests/view/5f88501adcdad978fe8c527e
passed LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 59136 W: 2773 L: 2564 D: 53799
Ptnml(0-2): 30, 2117, 25078, 2300, 43
https://tests.stockfishchess.org/tests/view/5f884dbfdcdad978fe8c527a
closes https://github.com/official-stockfish/Stockfish/pull/3184
Bench: 4066972
We now include the total pawn count in the scaling factor for the output
of the NNUE evaluation network. This should have the effect of trying to
keep more pawns when SF has the advantage, but exchange them when she
is defending.
Thanks to Alexander Pagel (Lolligerhans) for the idea of using the
value of pawns to ease the comparison with the rest of the material
estimation.
Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 15072 W: 1700 L: 1539 D: 11833
Ptnml(0-2): 65, 1202, 4845, 1355, 69
https://tests.stockfishchess.org/tests/view/5f7235a63b22d6afa50699b3
Passed LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.25}
Total: 25880 W: 1270 L: 1124 D: 23486
Ptnml(0-2): 23, 980, 10788, 1126, 23
https://tests.stockfishchess.org/tests/view/5f723b483b22d6afa5069a99
closes https://github.com/official-stockfish/Stockfish/pull/3164
Bench: 3776081
Idea is that division by fraction of 2 is slightly faster than by other numbers so parameters are adjusted in a way that division in null move pruning depth reduction features dividing by 256 instead of dividing by 213.
Other than this patch is almost non-functional - difference starts to exist by depth 133.
passed STC
https://tests.stockfishchess.org/tests/view/5f70dd943b22d6afa50693c5
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 57048 W: 6616 L: 6392 D: 44040
Ptnml(0-2): 304, 4583, 18531, 4797, 309
passed LTC
https://tests.stockfishchess.org/tests/view/5f7180db3b22d6afa506941f
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 45960 W: 2419 L: 2229 D: 41312
Ptnml(0-2): 43, 1779, 19137, 1987, 34
closes https://github.com/official-stockfish/Stockfish/pull/3159
bench 3789924
Current master uses a constant scale factor of 5/4 = 1.25 for the output
of the NNUE network, for compatibility with search and classical evaluation.
We modify this scale factor to make it dependent on the phase of the game,
going from about 1.5 in the opening to 1.0 for pure pawn endgames.
This helps Stockfish to avoid exchanges of pieces (heavy pieces in particular)
when she has the advantage, keeping more material on the board when attacking.
Passed STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 14744 W: 1771 L: 1599 D: 11374
Ptnml(0-2): 87, 1184, 4664, 1344, 93
https://tests.stockfishchess.org/tests/view/5f6fb0a63b22d6afa506904f
Passed LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 8912 W: 512 L: 393 D: 8007
Ptnml(0-2): 7, 344, 3637, 459, 9
https://tests.stockfishchess.org/tests/view/5f6fcf533b22d6afa5069066
closes https://github.com/official-stockfish/Stockfish/pull/3154
Bench: 3943952
- Clean signature of functions in namespace NNUE
- Add comment for countermove based pruning
- Remove bestMoveCount variable
- Add const qualifier to kpp_board_index array
- Fix spaces in get_best_thread()
- Fix indention in capture LMR code in search.cpp
- Rename TtmemDeleter to LargePageDeleter
Closes https://github.com/official-stockfish/Stockfish/pull/3063
No functional change
Use TT memory functions to allocate memory for the NNUE weights. This
should provide a small speed-up on systems where large pages are not
automatically used, including Windows and some Linux distributions.
Further, since we now have a wrapper for std::aligned_alloc(), we can
simplify the TT memory management a bit:
- We no longer need to store separate pointers to the hash table and
its underlying memory allocation.
- We also get to merge the Linux-specific and default implementations
of aligned_ttmem_alloc().
Finally, we'll enable the VirtualAlloc code path with large page
support also for Win32.
STC: https://tests.stockfishchess.org/tests/view/5f66595823a84a47b9036fba
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 14896 W: 1854 L: 1686 D: 11356
Ptnml(0-2): 65, 1224, 4742, 1312, 105
closes https://github.com/official-stockfish/Stockfish/pull/3081
No functional change.
This PR sets the "comp" variable simply to "clang",
which seems to be more consistent and allows a small simplification.
The PR also moves the section that sets "profile_make" and "profile_use" to after the NDK section,
which ensures that these variables are now set correctly for NDK/clang.
closes https://github.com/official-stockfish/Stockfish/pull/3121
No functional change
NNUE appears to provide a more stable eval than the classic eval,
so the time use dependencies on bestMoveChanges, fallingEval,
etc may need to change to make the best use of available time.
This change doubles the effect of totBestMoveChanges when giving
more time because the choice of best move is unstable.
STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 101928 W: 11995 L: 11698 D: 78235 Elo +0.78
Ptnml(0-2): 592, 8707, 32103, 8936, 626
https://tests.stockfishchess.org/tests/view/5f538a462d02727c56b36cec
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 186392 W: 10383 L: 9877 D: 166132 Elo +0.81
Ptnml(0-2): 207, 8370, 75539, 8870, 210
https://tests.stockfishchess.org/tests/view/5f54a9712d02727c56b36d5a
closes https://github.com/official-stockfish/Stockfish/pull/3119
Bench 4222126
No need to initialize StatScore at rootNode. Current Logic is redundant because at subsequent levels the grandchildren statScore is initialized to zero.
closes https://github.com/official-stockfish/Stockfish/pull/3122
Non functional change.
Restore the default NNUE setting (enabled) after a bench command.
This also makes the resulting program settings independent of the
number of FENs that are being benched.
Fixes issue #3112.
closes https://github.com/official-stockfish/Stockfish/pull/3113
No functional change.
This fixes#3108 and removes some NNUE code that is currently not used.
At the moment, do_null_move() copies the accumulator from the previous
state into the new state, which is correct. It then clears the "computed_score"
flag because the side to move has changed, and with the other side to move
NNUE will return a completely different evaluation (normally with changed
sign but also with different NNUE-internal tempo bonus).
The problem is that do_null_move() clears the wrong flag. It clears the
computed_score flag of the old state, not of the new state. It turns out
that this almost never affects the search. For example, fixing it does not
change the current bench (but it does change the previous bench). This is
because the search code usually avoids calling evaluate() after a null move.
This PR corrects do_null_move() by removing the computed_score flag altogether.
The flag is not needed because nnue_evaluate() is never called twice on a position.
This PR also removes some unnecessary {}s and inserts a few blank lines
in the modified NNUE files in line with SF coding style.
Resulf ot STC non-regression test:
LLR: 2.95 (-2.94,2.94) {-1.25,0.25}
Total: 26328 W: 3118 L: 3012 D: 20198
Ptnml(0-2): 126, 2208, 8397, 2300, 133
https://tests.stockfishchess.org/tests/view/5f553ccc2d02727c56b36db1
closes https://github.com/official-stockfish/Stockfish/pull/3109
bench: 4109324
Official release version of Stockfish 12
Bench: 3624569
-----------------------
It is our pleasure to release Stockfish 12 to users world-wide
Downloads will be freely available at
https://stockfishchess.org/download/
This version 12 of Stockfish plays significantly stronger than
any of its predecessors. In a match against Stockfish 11,
Stockfish 12 will typically win at least ten times more game pairs
than it loses.
This jump in strength, visible in regular progression tests during
development[1], results from the introduction of an efficiently
updatable neural network (NNUE) for the evaluation in Stockfish[2],
and associated tuning of the engine as a whole. The concept of the
NNUE evaluation was first introduced in shogi, and ported to
Stockfish afterward. Stockfish remains a CPU-only engine, since the
NNUE networks can be very efficiently evaluated on CPUs. The
recommended parameters of the NNUE network are embedded in
distributed binaries, and Stockfish will use NNUE by default.
Both the NNUE and the classical evaluations are available, and
can be used to assign values to positions that are later used in
alpha-beta (PVS) search to find the best move. The classical
evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest.
The NNUE evaluation computes this value with a neural network based
on basic inputs. The network is optimized and trained on the
evaluations of millions of positions.
The Stockfish project builds on a thriving community of enthusiasts
that contribute their expertise, time, and resources to build a free
and open source chess engine that is robust, widely available, and
very strong. We invite chess fans to join the fishtest testing
framework and programmers to contribute on github[3].
Stay safe and enjoy chess!
The Stockfish team
[1] https://github.com/glinscott/fishtest/wiki/Regression-Tests
[2] https://github.com/official-stockfish/Stockfish/commit/84f3e867903f62480c33243dd0ecbffd342796fc
[3] https://stockfishchess.org/get-involved/
std::sort() is not stable so different implementations can produce different results:
use the stable version instead.
Observed for '8/6k1/5r2/8/8/8/1K6/Q7 w - - 0 1' yielding different bench results for gcc and MSVC
and 3-4-5 syzygy TB prior to this patch.
closes https://github.com/official-stockfish/Stockfish/pull/3083
No functional change.
covers the most important cases from the user perspective:
It embeds the default net in the binary, so a download of that binary will result
in a working engine with the default net. The engine will be functional in the default mode
without any additional user action.
It allows non-default nets to be used, which will be looked for in up to
three directories (working directory, location of the binary, and optionally a specific default directory).
This mechanism is also kept for those developers that use MSVC,
the one compiler that doesn't have an easy mechanism for embedding data.
It is possible to disable embedding, and instead specify a specific directory, e.g. linux distros might want to use
CXXFLAGS="-DNNUE_EMBEDDING_OFF -DDEFAULT_NNUE_DIRECTORY=/usr/share/games/stockfish/" make -j ARCH=x86-64 profile-build
passed STC non-regression:
https://tests.stockfishchess.org/tests/view/5f4a581c150f0aef5f8ae03a
LLR: 2.95 (-2.94,2.94) {-1.25,-0.25}
Total: 66928 W: 7202 L: 7147 D: 52579
Ptnml(0-2): 291, 5309, 22211, 5360, 293
closes https://github.com/official-stockfish/Stockfish/pull/3070
fixes https://github.com/official-stockfish/Stockfish/issues/3030
No functional change.
This patch removes the EvalList structure from the Position object and generally simplifies the interface between do_move() and the NNUE code.
The NNUE evaluation function first calculates the "accumulator". The accumulator consists of two halves: one for white's perspective, one for black's perspective.
If the "friendly king" has moved or the accumulator for the parent position is not available, the accumulator for this half has to be calculated from scratch. To do this, the NNUE node needs to know the positions and types of all non-king pieces and the position of the friendly king. This information can easily be obtained from the Position object.
If the "friendly king" has not moved, its half of the accumulator can be calculated by incrementally updating the accumulator for the previous position. For this, the NNUE code needs to know which pieces have been added to which squares and which pieces have been removed from which squares. In principle this information can be derived from the Position object and StateInfo struct (in the same way as undo_move() does this). However, it is probably a bit faster to prepare this information in do_move(), so I have kept the DirtyPiece struct. Since the DirtyPiece struct now stores the squares rather than "PieceSquare" indices, there are now at most three "dirty pieces" (previously two). A promotion move that captures a piece removes the capturing pawn and the captured piece from the board (to SQ_NONE) and moves the promoted piece to the promotion square (from SQ_NONE).
An STC test has confirmed a small speedup:
https://tests.stockfishchess.org/tests/view/5f43f06b5089a564a10d850a
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 87704 W: 9763 L: 9500 D: 68441
Ptnml(0-2): 426, 6950, 28845, 7197, 434
closes https://github.com/official-stockfish/Stockfish/pull/3068
No functional change
to prevent user errors or generating untested code,
check explicitly that the ARCH variable is equivalent to a supported architecture
as listed in `make help`.
To nevertheless compile for an untested target the user can override the internal
variable, passing the undocumented `SUPPORTED_ARCH=true` to make.
closes https://github.com/official-stockfish/Stockfish/pull/3062
No functional change.
Fix the issue where a TT entry with key16==0 would always be reported
as a miss. Instead, we'll use depth8 to detect whether the TT entry is
occupied. In order to do that, we'll change DEPTH_OFFSET to -7
(depth8==0) to distinguish between an unoccupied entry and the
otherwise lowest possible depth, i.e., DEPTH_NONE (depth8==1).
To prevent a performance regression, we'll reorder the TT entry fields
by the access order of TranspositionTable::probe(). Memory in general
works fastest when accessed in sequential order. We'll also match the
store order in TTEntry::save() with the entry field order, and
re-order the 'if-or' expressions in TTEntry::save() from the cheapest
to the most expensive.
Finally, as we now have a proper TT entry occupancy test, we'll fix a
minor corner case with hashfull reporting. To reproduce:
- Use a big hash
- Either:
a. Start 31 very quick searches (this wraparounds generation to 0); or
b. Force generation of the first search to 0.
- go depth infinite
Before the fix, hashfull would incorrectly report nearly full hash
immediately after the search start, since
TranspositionTable::hashfull() used to consider only the entry
generation and not whether the entry was actually occupied.
STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 36848 W: 4091 L: 3898 D: 28859
Ptnml(0-2): 158, 2996, 11972, 3091, 207
https://tests.stockfishchess.org/tests/view/5f3f98d5dc02a01a0c2881f7
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 32280 W: 1828 L: 1653 D: 28799
Ptnml(0-2): 34, 1428, 13051, 1583, 44
https://tests.stockfishchess.org/tests/view/5f3fe77a87a5c3c63d8f5332
closes https://github.com/official-stockfish/Stockfish/pull/3048
Bench: 3760677
due to downclocking on current chips (tested up to cascade lake)
supporting avx512 and vnni512, it is better to use avx2 or vnni256
in multithreaded (in particular hyperthreaded) engine use.
In single threaded use, the picture is different.
gcc compilation for vnni256 requires a toolchain for gcc >= 9.
closes https://github.com/official-stockfish/Stockfish/pull/3038
No functional change
In recent Macs, it is possible to use the Clang compiler provided by Apple
to compile Stockfish out of the box, and this is the method used by default
in our Makefile (the Makefile sets the macosx-version-min=10.14 flag to select
the right libc++ library for the Clang compiler with recent c++17 support).
But it is quite possible to compile and run Stockfish on older Macs! Below
we describe a method to install a recent GNU compiler on these Macs, to get
the c++17 support. We have tested the following procedure to install gcc10 on
machines running Mac OS 10.7, Mac OS 10.9 and Mac OS 10.13:
1) install XCode for your machine.
2) install Apple command-line developer tools for XCode, by typing the following
command in a Terminal:
```
sudo xcode-select --install
```
3) go to the Stockfish "src" directory, then try a default build and run Stockfish:
```
make clean
make build
make net
./stockfish
```
4) if step 3 worked, congrats! You have a compiler recent enough on your Mac
to compile Stockfish. If not, continue with step 5 to install GNU gcc10 :-)
5) install the MacPorts package manager (https://www.macports.org/install.php),
for instance using the fast method in the "macOS Package (.pkg) Installer"
section of the page.
6) use the "port" command to install the gcc10 package of MacPorts by typing the
following command:
```
sudo port install gcc10
```
With this step, MacPorts will install the gcc10 compiler under the name "g++-mp-10"
in the /opt/local/bin directory:
```
which g++-mp-10
/opt/local/bin/g++-mp-10 <--- answer
```
7) You can now go back to the "src" directory of Stockfish, and try to build
Stockfish by pointing at the right compiler:
```
make clean
make build COMP=gcc COMPCXX=/opt/local/bin/g++-mp-10
make net
./stockfish
```
8) Enjoy Stockfish on Macintosh!
See this pull request for further discussion:
https://github.com/official-stockfish/Stockfish/pull/3049
No functional change
Changes to deal with compilation (particularly profile-build) on macOS.
(1) The default toolchain has gcc masquerading as clang,
the previous Makefile was not picking up the required changes
to the different profiling tools.
(2) The previous Makefile test for gccisclang occurred before
a potential overwrite of CXX by COMPCXX
(3) llvm-profdata no longer runs as a command on macOS and
instead is invoked by ``xcrun llvm-profdata``
(4) Needs to support use of true gcc using e.g.
``make build ... COMPCXX=g++-10``
(5) enable profile-build in travis for macOS
closes https://github.com/official-stockfish/Stockfish/pull/3043
No functional change
some GUIs do not show the error message when the engine terminates in the no-net case, as it is send to cerr.
Instead send it as an info string, which the GUI will more likely display.
closes https://github.com/official-stockfish/Stockfish/pull/3031
No functional change.
add new ARCH targets
x86-32-sse41-popcnt > x86 32-bit with sse41 and popcnt support
x86-32-sse2 > x86 32-bit with sse2 support
x86-32 > x86 32-bit generic (with mmx and sse support)
retire x86-32-old (use general-32)
closes https://github.com/official-stockfish/Stockfish/pull/3022
No functional change.
The easiest way to use the NDK in conjunction with this Makefile (tested on linux-x86_64):
1. Download the latest NDK (r21d) from Google from https://developer.android.com/ndk/downloads
2. Place and unzip the NDK in $HOME/ndk folder
3. Export the path variable e.g., `export PATH=$PATH:$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin`
4. cd to your Stockfish/src dir
5. Issue `make -j ARCH=armv8 COMP=ndk build` (use `ARCH=armv7` or `ARCH=armv7-neon` for older CPUs)
6. Optionally `make -j ARCH=armv8 COMP=ndk strip`
7. That's all. Enjoy!
Improves support from Raspberry Pi (incomplete?) and compiling on arm in general
closes https://github.com/official-stockfish/Stockfish/pull/3015
fixes https://github.com/official-stockfish/Stockfish/issues/2860
fixes https://github.com/official-stockfish/Stockfish/issues/2641
Support is still fragile as we're missing CI on these targets. Nevertheless tested with:
```bash
# build crosses from ubuntu 20.04 on x86 to various arch/OS combos
# tested with suitable packages installed
# (build-essentials, mingw-w64, g++-arm-linux-gnueabihf, NDK (r21d) from google)
# cross to Android
export PATH=$HOME/ndk/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/bin:$PATH
make clean && make -j build ARCH=armv7 COMP=ndk && make -j build ARCH=armv7 COMP=ndk strip
make clean && make -j build ARCH=armv7-neon COMP=ndk && make -j build ARCH=armv7-neon COMP=ndk strip
make clean && make -j build ARCH=armv8 COMP=ndk && make -j build ARCH=armv8 COMP=ndk strip
# cross to Raspberry Pi
make clean && make -j build ARCH=armv7 COMP=gcc COMPILER=arm-linux-gnueabihf-g++
make clean && make -j build ARCH=armv7-neon COMP=gcc COMPILER=arm-linux-gnueabihf-g++
# cross to Windows
make clean && make -j build ARCH=x86-64-modern COMP=mingw
```
No functional change
This patch fixes the byte order when reading 16- and 32-bit values from the network file on a big-endian machine.
Bytes are ordered in read_le() using unsigned arithmetic, which doesn't need tricks to determine the endianness of the machine. Unfortunately the compiler doesn't seem to be able to optimise the ordering operation, but reading in the weights is not a time-critical operation and the extra time it takes should not be noticeable.
Big endian systems are still untested with NNUE.
fixes#3007
closes https://github.com/official-stockfish/Stockfish/pull/3009
No functional change.
Increases the use of NNUE evaluation in positions without captures/pawn moves,
by increasing the NNUEThreshold threshold with rule50_count.
This patch will force Stockfish to use NNUE eval more and more in materially
unbalanced positions, when it seems that the classical eval is struggling to
win and only manages to shuffle. This will ask the (slower) NNUE eval to
double-check the potential fortress branches of the search tree, but only
when necessary.
passed STC:
https://tests.stockfishchess.org/tests/view/5f36f1bf11a9b1a1dbf192d8
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 51824 W: 5836 L: 5653 D: 40335
Ptnml(0-2): 264, 4356, 16512, 4493, 287
passed LTC:
https://tests.stockfishchess.org/tests/view/5f37836111a9b1a1dbf1936d
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 29768 W: 1747 L: 1590 D: 26431
Ptnml(0-2): 33, 1347, 11977, 1484, 43
closes https://github.com/official-stockfish/Stockfish/pull/3011
Bench: 4173967
Do not show the details of the default architecture for a simple "make help"
invocation, as the details are most likely to confuse beginners. Instead we
make it clear which architecture is the default and put an example at the end
of the Makefile as an incentative to use "make help ARCH=blah" to discover
the flags used by the different architectures.
```
make help
make help ARCH=x86-64-ssse3
```
Also clean-up and modernize a bit the Makefile examples while at it.
closes https://github.com/official-stockfish/Stockfish/pull/2996
No functional change
Adds support for Vector Neural Network Instructions (avx512), as available on Intel Cascade Lake
The _mm512_dpbusd_epi32() intrinsic (vpdpbusd instruction) is taylor made for NNUE.
on a cascade lake CPU (AWS C5.24x.large, gcc 10) NNUE eval is at roughly 78% nps of classical
(single core test)
bench 1024 1 24 default depth:
target classical NNUE ratio
vnni 2207232 1725987 78.20
avx512 2216789 1671734 75.41
avx2 2194006 1611263 73.44
modern 2185001 1352469 61.90
closes https://github.com/official-stockfish/Stockfish/pull/2987
No functional change
fails to build on that target, because of missing Intel Intrinsics.
macOS has posix_memalign() since ~2014 so we can simplify the code and just use that for all Apple platforms.
closes https://github.com/official-stockfish/Stockfish/pull/2985
No functional change.
this workaround is possibly rather a windows & gcc specific problem. See e.g.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c25
on Linux with gcc 8 this patch brings roughly a 8% speedup.
However, probably needs some testing in the wild.
includes a workaround for an old msys make (3.81) installation (fixes#2984)
No functional change
Change condition from three friendly pieces to two. This now means that we only extend castling on the king side if there are no other friendly pieces aside from king and rook. For the queen side, we only extend if there is only a rook and another friendly piece or if there is only a single rook and no other friendly piece but this is very rare.
STC:
LLR: 3.20 (-2.94,2.94) {-0.50,1.50}
Total: 31144 W: 4086 L: 3903 D: 23155
Ptnml(0-2): 227, 2843, 9278, 2968, 256
https://tests.stockfishchess.org/tests/view/5f31487f9081672066537516
LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 57816 W: 3786 L: 3538 D: 50492
Ptnml(0-2): 92, 2991, 22488, 3251, 86
https://tests.stockfishchess.org/tests/view/5f3167c3908167206653753d
closes https://github.com/official-stockfish/Stockfish/pull/2980
Bench: 4244812
Joint work gvreuls / vondele
* Download the default NNUE net in AppVeyor
* Download net in travis CI `make net`
* Adjust tests to cover more archs, speedup instrumented testing
* Introduce 'mixed' bench as default, with further options:
classical, NNUE, mixed.
mixed (default) and NNUE require the default net to be present,
which can be obtained with
```
make net
```
Further examples (first is equivalent to `./stockfish bench`):
```
./stockfish bench 16 1 13 default depth mixed
./stockfish bench 16 1 13 default depth classical
./stockfish bench 16 1 13 default depth NNUE
```
The net is now downloaded automatically if needed for `profile-build`
(usual `build` works fine without net present)
PGO gives a nice speedup on fishtest:
passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 3360 W: 469 L: 343 D: 2548
Ptnml(0-2): 20, 246, 1030, 356, 28
https://tests.stockfishchess.org/tests/view/5f31b5499081672066537569
passed LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 8824 W: 609 L: 502 D: 7713
Ptnml(0-2): 8, 430, 3438, 519, 17
https://tests.stockfishchess.org/tests/view/5f31c87b908167206653757c
closes https://github.com/official-stockfish/Stockfish/pull/2931
fixes https://github.com/official-stockfish/Stockfish/issues/2907
requires fishtest updates before commit
Bench: 4290577
This patch allows old x86 CPUs, from AMD K8 (which the x86-64 baseline
targets) all the way down to the Pentium MMX, to benefit from NNUE with
comparable performance hit versus hand-written eval as on more modern
processors.
NPS of the bench with NNUE enabled on a Pentium III 1.13 GHz (using the
MMX code):
master: 38951
this patch: 80586
NPS of the bench with NNUE enabled using baseline x86-64 arch, which is
how linux distros are likely to package stockfish, on a modern CPU
(using the SSE2 code):
master: 882584
this patch: 1203945
closes https://github.com/official-stockfish/Stockfish/pull/2956
No functional change.
Makefile targets x86-64-sse42, x86-sse3 are removed; x86-64-sse41
is renamed to x86-64-sse41-popcnt (it did enable popcnt).
Makefile variables sse3, sse42, their associated compilation flags
and code in misc.cpp are removed.
closes https://github.com/official-stockfish/Stockfish/pull/2922
No functional change
despite usage of alignas, the generated (avx2/avx512) code with older compilers needs to use
unaligned loads with older gcc (e.g. confirmed crash with gcc 7.3/mingw on abrok).
Better performance thus requires gcc >= 9 on hardware supporting avx2/avx512
closes https://github.com/official-stockfish/Stockfish/pull/2969
No functional change
The idea of this patch is that positions are usually more complex and hard to evaluate even if there are more pawns.
This patch adjusts NNUE threshold usage depending on number of pawns in position, if pawn count is <3 we use the
classical evaluation more often, for pawn count = 3 patch the is non-functional,
with pawn count > 3 NNUE evaluation is used more often.
passed STC
https://tests.stockfishchess.org/tests/view/5f2f02d09081672066536b1f
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 36520 W: 5011 L: 4823 D: 26686
Ptnml(0-2): 299, 3482, 10548, 3594, 337
passed LTC
https://tests.stockfishchess.org/tests/view/5f2f4c329081672066536b5c
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 39272 W: 2630 L: 2433 D: 34209
Ptnml(0-2): 53, 2066, 15218, 2229, 70
closes https://github.com/official-stockfish/Stockfish/pull/2960
bench 4084753
This patch tries to run multiple LTO threads in parallel, speeding up
the build process of optimized builds if the -j make parameter is used.
This mitigates the longer linking times of optimized builds since the
integration of the NNUE code. Roughly 2x build speedup.
I've tried a similar patch some two years ago but it ran into trouble
with old compiler versions then. Since we're on the C++17 standard now
these old compilers should be obsolete.
closes https://github.com/official-stockfish/Stockfish/pull/2943
No functional change.
This patch increases LMRdepth threshold for futility pruning at parent nodes so it can apply more often.
With radical change to evaluation approach it seems that search is really far from optimal state, especially it parts that use static evaluation of position.
passed STC
https://tests.stockfishchess.org/tests/view/5f2da75661e3b6af64881fd0
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 8744 W: 1305 L: 1156 D: 6283
Ptnml(0-2): 75, 789, 2500, 928, 80
passed LTC
https://tests.stockfishchess.org/tests/view/5f2dcb2a61e3b6af64881ff3
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 17728 W: 1256 L: 1117 D: 15355
Ptnml(0-2): 22, 961, 6774, 1070, 37
Bench: 4067325
Allow any pawn in front of a minor piece to replace the pawn protection
requirement for outposts.
+-------+ +-------+
| . . o | | o . . | o Their pawns
| . o x | | o . . | x Our pawns
| o N . | | x o B | N,B New (reachable) outpost
| . . . | | . _ . | _ Reachable square behind a pawn
+-------+ +-------+
N outpost B reaches
outpost
We want outposts to be secured by pawns against major pieces. If
a minor is shielded by any pawn from above, it is rarely at the same
time protected by our pawn attacks from below. However, the pawn shield
in itself offers some degree of protection.
A pawn shield will now suffice to replace the pawn protection for the
outpost (and reachable outpost) bonus.
This effect stacks with the existing "minor behind pawn" bonus.
STC
https://tests.stockfishchess.org/tests/view/5f2bcd14b3ebe5cbfee85b2c
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 27248 W: 5353 L: 5119 D: 16776
Ptnml(0-2): 462, 3174, 6185, 3274, 529
LTC
https://tests.stockfishchess.org/tests/view/5f2bfef5b3ebe5cbfee85b5a
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 99432 W: 12580 L: 12130 D: 74722
Ptnml(0-2): 696, 8903, 30049, 9391, 677
Closes#2935
Bench: 4143673
This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.
Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.
The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.
This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.
This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.
The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.
The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).
The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:
60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885
40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017
At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:
sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520
sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68
The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.
Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests
Integration has been discussed in various issues:
https://github.com/official-stockfish/Stockfish/issues/2823https://github.com/official-stockfish/Stockfish/issues/2728
The integration branch will be closed after the merge:
https://github.com/official-stockfish/Stockfish/pull/2825https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip
closes https://github.com/official-stockfish/Stockfish/pull/2912
This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.
Bench: 4746616
In some French games, Stockfish likes to bring the Knight to a bad outpost spot. This is evident in TCEC S18 Superfinal Game 63, where there is a Knight outpost on the queenside but is actually useless. Stockfish is effectively playing a piece down while holding ground against Leela's break on the kingside.
This patch turns the +56 mg bonus for a Knight outpost into a -7 mg penalty if it satisfies the following conditions:
* The outpost square is not on the CenterFiles (i.e. not on files C,D,E and F)
* The knight is not attacking non pawn enemies.
* The side where the outpost is located contains only few enemies, with a particular conditional_more_than_two() implementation
Thank you to apospa...@gmail.com for bringing this to our attention and for providing insights.
See https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dEXNzSIBgZU
Reference game: https://tcec-chess.com/#div=sf&game=63&season=18
Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 6960 W: 1454 L: 1247 D: 4259
Ptnml(0-2): 115, 739, 1610, 856, 160
https://tests.stockfishchess.org/tests/view/5f08221059f6f0353289477e
Passed LTC:
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 21440 W: 2767 L: 2543 D: 16130
Ptnml(0-2): 122, 1904, 6462, 2092, 140
https://tests.stockfishchess.org/tests/view/5f0838ed59f6f035328947a2
various related tests show strong test results, but so far no generalizations or simplifications of conditional_more_than_two() are found. See PR for details.
closes https://github.com/official-stockfish/Stockfish/pull/2803
Bench: 4366686
Probcut is a heuristic that wasn't changed a lot in past years,
all attempts to change it using information / writing info to transposition table failed.
This patch has a number of differences that can be summarized as follows:
* For TT write/read we use depth - 3. Because probcut search is depth - 4 but we actually do the move prior to it so effectively we do depth - 3 search;
* In any case of depth of eval from transposition table being >= depth - 3 we either produce cutoff or refuse to even do probcut search, this is allowing us to write info of probcut to transposition table because we know that we wouldn't be overwriting some deeper data with our depth - 3 search - this is an important aspect of this patch;
* For some not really known reason this patch completely ignores tte->bound() - which was the case for previous patch that made probcut interact with TT, maybe 2) is the reason, although it's unproven.
A first version of this patch passed STC and LTC
passed STC
https://tests.stockfishchess.org/tests/view/5f05908a59f6f03532894613
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 95776 W: 18300 L: 17973 D: 59503
Ptnml(0-2): 1646, 10944, 22377, 11279, 1642
passed LTC
https://tests.stockfishchess.org/tests/view/5f06b54059f6f035328946bb
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 57128 W: 7266 L: 6938 D: 42924
Ptnml(0-2): 372, 5163, 17217, 5389, 423
However, an additional bugfix was needed to avoid checking a condition on ttMove if was not available. This passed non-regression bounds on top of the first version:
at STC
https://tests.stockfishchess.org/tests/view/5f080e5059f6f03532894766
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 14096 W: 2800 L: 2628 D: 8668
Ptnml(0-2): 225, 1620, 3238, 1688, 277
at LTC
https://tests.stockfishchess.org/tests/view/5f0836a559f6f0353289479c
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 25352 W: 3228 L: 3139 D: 18985
Ptnml(0-2): 175, 2350, 7549, 2415, 187
closes https://github.com/official-stockfish/Stockfish/pull/2804
Bench 4540940
A number of engines, GUIs and tournaments start to report WDL estimates
along or instead of scores. This patch enables reporting of those stats
in a more or less standard way (http://www.talkchess.com/forum3/viewtopic.php?t=72140)
The model this reporting uses is based on data derived from a few million fishtest LTC games,
given a score and a game ply, a win rate is provided that matches rather closely,
especially in the intermediate range [0.05, 0.95] that data. Some data is shown at
https://github.com/glinscott/fishtest/wiki/UsefulData#win-loss-draw-statistics-of-ltc-games-on-fishtest
Making the conversion game ply dependent is important for a good fit, and is in line
with experience that a +1 score in the early midgame is more likely a win than in the late endgame.
Even when enabled, the printing of the info causes no significant overhead.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 197112 W: 37226 L: 37347 D: 122539
Ptnml(0-2): 2591, 21025, 51464, 20866, 2610
https://tests.stockfishchess.org/tests/view/5ef79ef4f993893290cc146b
closes https://github.com/official-stockfish/Stockfish/pull/2778
No functional change
We lower the endgame value of the evaluation when we detect that there
is only one queen left on the board (more precisely, we use a scale
factor of 37/64, or about 0.58, for the endgame part of the evaluation).
Hopefully this helps a little bit for the assessment of positions with
queen imbalance, which are one of the well-known Stockfish weaknesses.
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 21600 W: 4176 L: 3955 D: 13469
Ptnml(0-2): 351, 2457, 5003, 2598, 391
https://tests.stockfishchess.org/tests/view/5ef871b6020eec13834a94e8
LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 248328 W: 30596 L: 29720 D: 188012
Ptnml(0-2): 1544, 22345, 75665, 22911, 1699
https://tests.stockfishchess.org/tests/view/5ef87aec020eec13834a94fe
Closes https://github.com/official-stockfish/Stockfish/pull/2781
Bench: 4441323
* Supports popcnt (thanks @daylen)
* bits = 64 is now the default
Tested with g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 on ThunderX CN8890,
yields about 9% speedup.
Also tested with clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final).
closes https://github.com/official-stockfish/Stockfish/pull/2770
No functional change.
the option was, since at least 2014, not correctly implemented,
ignoring all dynamic adjustments to optimum time in search.
Instead of fixing it, remove it, no need to expose an option that
will influence time management negatively.
closes https://github.com/official-stockfish/Stockfish/pull/2765
No functional change.
The idea of this patch is that if capture can be described as
"less valuable piece takes more valuable piece" it's not really correct
to add only piece value of captured piece to static evaluation
since there can be more threats in other places and opponent can't really
do much but recapture our capturing piece which leaves us space for
more captures thus winning more material and increasing static eval.
passed STC
https://tests.stockfishchess.org/tests/view/5ef0167b122d6514328d760f
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 24736 W: 4838 L: 4607 D: 15291
Ptnml(0-2): 438, 2812, 5648, 3021, 449
passed LTC
https://tests.stockfishchess.org/tests/view/5ef073bc122d6514328d7693
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 46152 W: 5865 L: 5567 D: 34720
Ptnml(0-2): 312, 4160, 13886, 4354, 364
closes https://github.com/official-stockfish/Stockfish/pull/2761
bench 4789930
We increase a little bit the midgame value of pawns on a4, h4, a6 and h6.
Original idea by Malcolm Campbell, who tried the version restricted to the
pawns on the H column a couple of weeks ago and got a patch which almost
passed LTC. The current pull request just adds the same idea for pawns on
the A column.
Possible follow-ups: maybe tweak the a5/h5 pawn values, and/or add a malus
for very low king mobility in midgame?
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 33416 W: 6516 L: 6275 D: 20625
Ptnml(0-2): 575, 3847, 7659, 4016, 611
https://tests.stockfishchess.org/tests/view/5ee6c4e687586124bc2c10d4
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 134368 W: 16869 L: 16319 D: 101180
Ptnml(0-2): 908, 12083, 40708, 12521, 964
https://tests.stockfishchess.org/tests/view/5ee74e60aae8aec816ab756a
closes https://github.com/official-stockfish/Stockfish/pull/2747
Bench: 5299456
Tuned search constants after many search patches since the last
successful tune.
1st LTC @ 60+0.6 th 1 :
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 57656 W: 7369 L: 7036 D: 43251
Ptnml(0-2): 393, 5214, 17336, 5437, 448
https://tests.stockfishchess.org/tests/view/5ee1e074f29b40b0fc95af19
SMP LTC @ 20+0.2 th 8 :
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 83576 W: 9731 L: 9341 D: 64504
Ptnml(0-2): 464, 7062, 26369, 7406, 487
https://tests.stockfishchess.org/tests/view/5ee35a21f29b40b0fc95b008
The changes were rebased on top of a successful patch by Viz (see #2734)
and two different ways of doing this were tested. The successful test
modified the constants in the patch by Viz in a similar manner to the
tuning run:
LTC (rebased) @ 60+0.6 th 1 :
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 193384 W: 24241 L: 23521 D: 145622
Ptnml(0-2): 1309, 17497, 58472, 17993, 1421
https://tests.stockfishchess.org/tests/view/5ee43319ca6c451633a995f9
Further work: the recent patch to quantize eval #2733 affects search quit
quite a bit, so doing another tune in, say, three months time might be a
good idea.
closes https://github.com/official-stockfish/Stockfish/pull/2735
Bench 4246971
We replace the current decrease of the complexity term in initiative
when shuffling by a direct damping of the evaluation. This scheme may
have two benefits over the initiative approach:
a) the damping effect is more brutal for fortresses with heavy pieces
on the board, because the initiative term is almost an endgame term;
b) the initiative implementation had a funny side effect, almost a bug,
in the rare positions where mg > 0, eg < 0 and the tampered eval
returned a positive value (ie with heavy pieces still on the board):
sending eg to zero via shuffling would **increase** the tampered
eval instead of decreasing it, which is somewhat illogical. This
patch avoids this phenomenon.
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 43072 W: 8373 L: 8121 D: 26578
Ptnml(0-2): 729, 4954, 9940, 5162, 751
https://tests.stockfishchess.org/tests/view/5ee008ebf29b40b0fc95ade2
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 37376 W: 4816 L: 4543 D: 28017
Ptnml(0-2): 259, 3329, 11286, 3508, 306
https://tests.stockfishchess.org/tests/view/5ee03b06f29b40b0fc95ae0c
Closes https://github.com/official-stockfish/Stockfish/pull/2727
Bench: 4757174
Conceptually group hash clusters into super clusters of 256 clusters.
This scheme allows us to use hash sizes up to 32 TB
(= 2^32 super clusters = 2^40 clusters).
Use 48 bits of the Zobrist key to choose the cluster index. We use 8
extra bits to mitigate the quantization error for very large hashes when
scaling the hash key to cluster index.
The hash index computation is organized to be compatible with the existing
scheme for power-of-two hash sizes up to 128 GB.
Fixes https://github.com/official-stockfish/Stockfish/issues/1349
closes https://github.com/official-stockfish/Stockfish/pull/2722
Passed non-regression STC:
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 37976 W: 7336 L: 7211 D: 23429
Ptnml(0-2): 578, 4295, 9149, 4356, 610
https://tests.stockfishchess.org/tests/view/5edcbaaef29b40b0fc95abc5
No functional change.
This is a non-functional code style change which provides a safe access handler for LineBB.
Also includes an assert in debug mode to verify square correctness.
closes https://github.com/official-stockfish/Stockfish/pull/2719
No functional change
Merging this code into one function `winnable()`.
Should allow common concepts used to adjust the eg value,
either by addition or scaling, to be combined more effectively.
Improve trace function.
closes https://github.com/official-stockfish/Stockfish/pull/2710
No functional change.
The idea of this patch is that if rooks are not directly attacking the opponent king,
they can support king attacks staying behind pawns or minor pieces and be really
deadly if position slightly opens up at enemy king ring ranks. Loosely based on
some stockfish games where it underestimated attacks on it king when enemy has one
or two rooks supporting pawn pushes towards it king.
passed STC
https://tests.stockfishchess.org/tests/view/5ecb093680f2c838b96550f9
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 53672 W: 10535 L: 10265 D: 32872
Ptnml(0-2): 952, 6210, 12258, 6448, 968
passed LTC
https://tests.stockfishchess.org/tests/view/5ecb639f80f2c838b9655117
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 62424 W: 8094 L: 7748 D: 46582
Ptnml(0-2): 426, 5734, 18565, 6042, 445
closes https://github.com/official-stockfish/Stockfish/pull/2700
Bench: 4663220
This patch is a combinaison of two recent parameters tweaks which had
failed narrowly (yellow) at long time control:
• improvement in move ordering during search by softening the distinction
between bad captures and good captures during move generation, leading
to improved awareness of Stockfish of potential piece sacrifices (idea
by Rahul Dsilva)
• increase in the weight of pawns in the "initiative" part of the evaluation
function. With this change Stockfish should have more incentive to exchange
pawns when losing, and to keep pawns when winning.
STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 10704 W: 2178 L: 1974 D: 6552
Ptnml(0-2): 168, 1185, 2464, 1345, 190
https://tests.stockfishchess.org/tests/view/5ec5553b377121ac09e1023d
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 60592 W: 7835 L: 7494 D: 45263
Ptnml(0-2): 430, 5514, 18086, 5817, 449
https://tests.stockfishchess.org/tests/view/5ec55ca2377121ac09e10249
Closes https://github.com/official-stockfish/Stockfish/pull/2691
Bench: 4519117
Do not send the following info string on the first call to
aligned_ttmem_alloc() on Windows:
info string Hash table allocation: Windows large pages [not] used.
The first call occurs before the 'uci' command has been received. This
confuses some GUIs, which expect the first engine-sent command to be
'id' as the response to the 'uci' command. (see https://github.com/official-stockfish/Stockfish/issues/2681)
closes https://github.com/official-stockfish/Stockfish/pull/2689
No functional change.
This is a functional simplification of the time management system.
With this patch, there is a simple equation for each of two distinct
time controls: basetime + increment, and x moves in y seconds (+increment).
These equations are easy to plot and understand making future modifications
or adding additional time controls much easier.
SlowMover is reset to 100 so that is has no effect unless a user changes it.
There are two scaling variables:
* Opt_scale is a scale factor (or percentage) of time to use for this current move.
* Max_scale is a scale factor to apply to the resulting optimumTime.
There seems to be some elo gain in most scenarios.
Better performance is attributable to one of two things:
* minThinkingTime was not allowing reasonable time calculations for very short games like 10+0 or 10+0.01. This is because adding almost no increment and substracting move overhead for 50 moves quickly results in almost 0 time very early in the game. Master depended on minThinkingTime to handle these short games instead of good time management. This patch addresses this issue by lowering minThinkingTime to 0 and adjusting moverOverhead if there are very low increments.
* Notice that the time distribution curves tail downward for the first 10 moves or so. This causes less time to attribute for very early moves leaving more time available for middle moves where more important decisions happen.
Here is a summary of tests for this version at different time controls:
SMP 5+0.05
LLR: 2.97 (-2.94,2.94) {-1.50,0.50}
Total: 46544 W: 7175 L: 7089 D: 32280
Ptnml(0-2): 508, 4826, 12517, 4914, 507
https://tests.stockfishchess.org/tests/user/protonspring
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 20480 W: 3872 L: 3718 D: 12890
Ptnml(0-2): 295, 2364, 4824, 2406, 351
https://tests.stockfishchess.org/tests/view/5ebc343e7dd5693aad4e6873
STC, sudden death
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 7024 W: 1706 L: 1489 D: 3829
Ptnml(0-2): 149, 813, 1417, 938, 195
https://tests.stockfishchess.org/tests/view/5ebc346f7dd5693aad4e6875
STC, TCEC style
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 4192 W: 1014 L: 811 D: 2367
Ptnml(0-2): 66, 446, 912, 563, 109
https://tests.stockfishchess.org/tests/view/5ebc34857dd5693aad4e6877
40/10
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 54032 W: 10592 L: 10480 D: 32960
Ptnml(0-2): 967, 6148, 12677, 6254, 970
https://tests.stockfishchess.org/tests/view/5ebc50597dd5693aad4e688d
LTC, sudden death
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 9152 W: 1391 L: 1263 D: 6498
Ptnml(0-2): 75, 888, 2526, 1008, 79
https://tests.stockfishchess.org/tests/view/5ebc6f5c7dd5693aad4e689b
LTC
LLR: 2.98 (-2.94,2.94) {-1.50,0.50}
Total: 12344 W: 1563 L: 1459 D: 9322
Ptnml(0-2): 70, 1103, 3740, 1171, 88
https://tests.stockfishchess.org/tests/view/5ebc6f4c7dd5693aad4e6899
closes https://github.com/official-stockfish/Stockfish/pull/2678
Bench: 4395562
simplify the usage of the 50 moves counter,
moving it frome the scale factor to initiative.
This patch was inspired by recent games where a blocked or semi-blocked position
was 'blundered', by moving a pawn, into a lost endgame. This patch improves this situation,
finding a more robust move more often.
for example (1s searches with many threads):
```
FEN 8/p3kp2/Pp2p3/1n2PpP1/5P2/1Kp5/8/R7 b - - 68 143
master:
6 bestmove b5c7
6 bestmove e7e8
12 bestmove e7d8
176 bestmove e7d7
patch:
3 bestmove b5c7
5 bestmove e7d8
192 bestmove e7d7
```
fixes https://github.com/official-stockfish/Stockfish/issues/2620
the patch also tests well
passed STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 50168 W: 9508 L: 9392 D: 31268
Ptnml(0-2): 818, 5873, 11616, 5929, 848
https://tests.stockfishchess.org/tests/view/5ebb07287dd5693aad4e680b
passed LTC
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 7520 W: 981 L: 870 D: 5669
Ptnml(0-2): 49, 647, 2256, 760, 48
https://tests.stockfishchess.org/tests/view/5ebbff747dd5693aad4e6858
closes https://github.com/official-stockfish/Stockfish/pull/2666
Bench: 4395562
for users that set the needed privilige "Lock Pages in Memory"
large pages will be automatically enabled (see Readme.md).
This expert setting might improve speed, 5% - 30%, depending
on the hardware, the number of threads and hash size. More for
large hashes, large number of threads and NUMA. If the operating
system can not allocate large pages (easier after a reboot), default
allocation is used automatically. The engine log provides details.
closes https://github.com/official-stockfish/Stockfish/pull/2656
fixes https://github.com/official-stockfish/Stockfish/issues/2619
No functional change
fixes https://github.com/official-stockfish/Stockfish/issues/2660
The problem was caused by .depend being created with a rule for tbprobe.o not for syzygy/tbprobe.o.
This patch keeps an explicit list of sources (SRCS), generates OBJS,
and compiles all object files to the src/ directory, consistent with .depend.
VPATH is used to search the syzygy directory as needed.
joint work with @gvreuls
closes https://github.com/official-stockfish/Stockfish/pull/2664
No functional change
The purpose of the code is to allow developers to easily and flexibly
setup SF for a tuning session. Mainly you have just to remove 'const'
qualifiers from the variables you want to tune and flag them for
tuning, so if you have:
int myKing = 10;
Score myBonus = S(5, 15);
Value myValue[][2] = { { V(100), V(20) }, { V(7), V(78) } };
and at the end of the update you may want to call
a post update function:
void my_post_update();
If instead of default Option's min-max values,
you prefer your custom ones, returned by:
std::pair<int, int> my_range(int value)
Or you jus want to set the range directly, you can
simply add below:
TUNE(SetRange(my_range), myKing, SetRange(-200, 200), myBonus, myValue, my_post_update);
And all the magic happens :-)
At startup all the parameters are printed in a
format suitable to be copy-pasted in fishtest.
In case the post update function is slow and you have many
parameters to tune, you can add:
UPDATE_ON_LAST();
And the values update, including post update function call, will
be done only once, after the engine receives the last UCI option.
The last option is the one defined and created as the last one, so
this assumes that the GUI sends the options in the same order in
which have been defined.
closes https://github.com/official-stockfish/Stockfish/pull/2654
No functional change.
This patch increases number of nodes where we produce multicut cutoffs.
The idea is that if our ttMove failed to produce a singular extension
but ttValue is greater than beta we can afford to do one more reduced search
near beta excluding ttMove to see if it will produce a fail high -
and if it does so produce muticut by analogy to existing logic.
passed STC
https://tests.stockfishchess.org/tests/view/5e9a162b5b664cdba0ce6e28
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 58238 W: 11192 L: 10917 D: 36129
Ptnml(0-2): 1007, 6704, 13442, 6939, 1027
passed LTC
https://tests.stockfishchess.org/tests/view/5e9a1e845b664cdba0ce7411
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 137852 W: 17460 L: 16899 D: 103493
Ptnml(0-2): 916, 12610, 41383, 13031, 986
closes https://github.com/official-stockfish/Stockfish/pull/2640
bench 4881443
This is a functional simplification which fixes an awkward numerical cliff.
With master king_safety, no pawns is scored higher than pawn(s) that is/are far from the king. This may motivate SF to throw away pawns to increase king safety. With this patch, there is a consistent value for minPawnDistance where losing a pawn never increases king safety.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 45548 W: 8624 L: 8525 D: 28399
Ptnml(0-2): 592, 4937, 11587, 5096, 562
https://tests.stockfishchess.org/tests/view/5e98ced630be947a14e9ddc5
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 42084 W: 5292 L: 5242 D: 31550
Ptnml(0-2): 193, 3703, 13252, 3649, 245
https://tests.stockfishchess.org/tests/view/5e98e22e30be947a14e9de07
closes https://github.com/official-stockfish/Stockfish/pull/2639
bench 4600292
This idea is loosely based on xoroshiro idea about raisedBeta and ttmoves.
If our ttmove have low enough ttvalue and is deep enough (deeper than our probcut depth) it makes little sense to try probcut moves, since the ttMove already more or less failed to produce one according to transposition table.
passed STC
https://tests.stockfishchess.org/tests/view/5e9673ddc2718dee3c822920
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 72148 W: 14038 L: 13741 D: 44369
Ptnml(0-2): 1274, 8326, 16615, 8547, 1312
passed LTC
https://tests.stockfishchess.org/tests/view/5e96b378c2718dee3c8229bf
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 89054 W: 11418 L: 10996 D: 66640
Ptnml(0-2): 623, 8113, 26643, 8515, 633
closes https://github.com/official-stockfish/Stockfish/pull/2632
bench 4952731
This patch refines the recently introduced interaction between
the space bonus and the number of blocked pawns in a position.
* pawns count as blocked also if their push square is attacked by 2 enemy pawns;
* overall dependence is stronger as well as offset;
* bonus increase is capped at 9 blocked pawns in position;
passed STC
https://tests.stockfishchess.org/tests/view/5e94560663d105aebbab243d
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 29500 W: 5842 L: 5603 D: 18055
Ptnml(0-2): 504, 3443, 6677, 3562, 564
passed LTC
https://tests.stockfishchess.org/tests/view/5e95b383c2aaa99f75d1a14d
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 63504 W: 8329 L: 7974 D: 47201
Ptnml(0-2): 492, 5848, 18720, 6197, 495
closes https://github.com/official-stockfish/Stockfish/pull/2631
bench 4956028
+-------+
| o . . | o their pawns
| x . . | x our pawns
| . x . | <- Can sacrifice to create passer?
+-------+
yes
1 2 3 4 5
+-------+ +-------+ +-------+ +-------+ +-------+
| o . . | | o r . | | o r . | | o . b | | o . b | lowercase: theirs
| x b . | | x . . | | x . R | | x . R | | x . . | uppercase: ours
| . x . | | . x . | | . x . | | . x . | | . x B |
+-------+ +-------+ +-------+ +-------+ +-------+
no no yes no yes
The value of our top pawn depends on our ability to advance our bottom
pawn, levering their blocker. Previously, this pawn configuration was
always scored as passer (although a blocked one).
Add requirements for the square s above our (possibly) sacrificed pawn:
- s must not be occupied by them (1).
- If they attack s (2), we must attack s (3).
- If they attack s with a minor (4), we must attack s with a minor (5).
The attack from their blocker is ignored because it is inherent in the
structure; we are ok with sacrificing our bottom pawn.
LTC
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 37030 W: 4962 L: 4682 D: 27386
Ptnml(0-2): 266, 3445, 10863, 3625, 316
https://tests.stockfishchess.org/tests/view/5e92a2b4be6ede5b954bf239
STC
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 40874 W: 8066 L: 7813 D: 24995
Ptnml(0-2): 706, 4753, 9324, 4890, 764
https://tests.stockfishchess.org/tests/view/5e922199af0a0143109dc90e
closes https://github.com/official-stockfish/Stockfish/pull/2624
Bench: 4828294
In master, if the received ttMove meets the prescribed conditions in the various MovePicker constructors, it is returned as the first move, otherwise we set it to MOVE_NONE. If set to MOVE_NONE, we no longer track what the ttMove was, and it will might be returned later in a list of generated moves. This may be a waste. With this patch, if the ttMove fails to meet the prescribed conditions, we simply skip the TT stages, but still store the move and make sure it's never returned.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 66424 W: 12903 L: 12806 D: 40715
Ptnml(0-2): 1195, 7730, 15230, 7897, 1160
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 45682 W: 5989 L: 5926 D: 33767
Ptnml(0-2): 329, 4361, 13443, 4334, 374
closes https://github.com/official-stockfish/Stockfish/pull/2616
Bench 4928928
Before this commit, some pawns were considered "candidate" passed pawns and given half bonus. After this commit, all of these pawns are scored as passed pawns, and they do not receive less bonus.
STC:
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 21806 W: 4320 L: 4158 D: 13328
Ptnml(0-2): 367, 2526, 5001, 2596, 413
https://tests.stockfishchess.org/tests/view/5e86b4724411759d9d098639
LTC:
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 12590 W: 1734 L: 1617 D: 9239
Ptnml(0-2): 96, 1187, 3645, 1238, 129
https://tests.stockfishchess.org/tests/view/5e86d2874411759d9d098640
This PR and commit are dedicated to our colleague Stefan Geschwentner (@locutus2), one of the most respected and accomplished members of the Stockfish developer community. Stockfish is a volunteer project and has always thrived because of Stefan's talent, insight, generosity, and dedication. Welcome back, Stefan!
closes https://github.com/official-stockfish/Stockfish/pull/2613
Bench: 4831963
In more than 100k local KNPK games, there is no discernible difference between master and master with this endgame removed: master:42971, patch:42973, draws: 3969. Removal does not seem to regress in normal games.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 46390 W: 8998 L: 8884 D: 28508
Ptnml(0-2): 707, 5274, 11163, 5300, 751
https://tests.stockfishchess.org/tests/view/5e83b18ee42a5c3b3ca2ef02
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 44768 W: 5863 L: 5814 D: 33091
Ptnml(0-2): 251, 3918, 14028, 3905, 282
https://tests.stockfishchess.org/tests/view/5e84a82a4411759d9d0984f4
In tests with a book of endgames that can convert into KNPK, no significant difference can be seen either
```
TC 1.0+0.01
Score of patch vs master: 6131 - 6188 - 27681 [0.499] 40000
Elo difference: -0.5 +/- 1.9, LOS: 30.4 %, DrawRatio: 69.2 %
TC 2.0+0.02
Score of patch vs master: 5740 - 5741 - 28519 [0.500] 40000
Elo difference: -0.0 +/- 1.8, LOS: 49.6 %, DrawRatio: 71.3 %
``
closes https://github.com/official-stockfish/Stockfish/pull/2611
Bench 4512059
The idea behind this patch is that if static eval is really bad so capturing of current piece on spot will still produce a position with an eval much lower than alpha then our best chance is to create some kind of king attack. So captures without check are mostly worse than captures with check and can be reduced more.
passed STC
https://tests.stockfishchess.org/tests/view/5e8514b44411759d9d098543
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 46196 W: 9039 L: 8781 D: 28376
Ptnml(0-2): 750, 5412, 10628, 5446, 862
passed LTC
https://tests.stockfishchess.org/tests/view/5e8530134411759d9d09854c
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 23462 W: 3228 L: 2988 D: 17246
Ptnml(0-2): 186, 2125, 6849, 2405, 166
close https://github.com/official-stockfish/Stockfish/pull/2612
bench 4742598
There is an ambiguity between global and std clamp implementations when compiling in c++17,
and on certain toolchains that are not strictly conforming to c++11.
This is solved by putting our clamp implementation in a namespace.
closes https://github.com/official-stockfish/Stockfish/pull/2572
No functional change.
In November 2019, as a result of the simplification of rank-based outposts by 37698b0,
separate bonuses were introduced for outposts that are currently occupied and outposts
that are reachable on the next move. However, the values of these two bonuses are
quite similar, and they have remained that way for three months of development.
It appears that we can safely retire the separate ReachableOutpost parameter and
use the same Outpost bonus in both cases, restoring the basic principles of Stockfish
outpost evaluation to their pre-November state, while also reducing
the size of the parameter space.
STC:
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 47680 W: 9213 L: 9092 D: 29375
Ptnml(0-2): 776, 5573, 11071, 5594, 826
https://tests.stockfishchess.org/tests/view/5e51e33190a0a02810d09802
LTC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 14690 W: 1960 L: 1854 D: 10876
Ptnml(0-2): 93, 1381, 4317, 1435, 119
https://tests.stockfishchess.org/tests/view/5e52197990a0a02810d0980f
closes https://github.com/official-stockfish/Stockfish/pull/2559
Bench: 4697493
Current move histories are known to work well near the leaves, whilst at
higher depths they aren't very helpful. To address this problem this
patch introduces a table dedicated for what's happening at plies 0-3.
It's structured like mainHistory with ply index instead of color.
It get cleared with each new search and is filled during iterative
deepening at higher depths when recording successful quiet moves near
the root or traversing nodes which were in the principal variation
(ttPv).
Medium TC (20+0.2):
https://tests.stockfishchess.org/tests/view/5e4d358790a0a02810d096dc
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 100910 W: 16682 L: 16376 D: 67852
Ptnml(0-2): 1177, 10983, 25883, 11181, 1231
LTC:
https://tests.stockfishchess.org/tests/view/5e4e2cb790a0a02810d09714
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 80444 W: 10495 L: 10095 D: 59854
Ptnml(0-2): 551, 7479, 23803, 7797, 592
closes https://github.com/official-stockfish/Stockfish/pull/2557
Bench: 4705960
This is a patch that significantly improves playing KNNKP endgames:
```
Score of 2553 vs master: 132 - 38 - 830 [0.547] 1000
Elo difference: 32.8 +/- 8.7, LOS: 100.0 %, DrawRatio: 83.0 %
```
At the same time it reduces the evaluation of this mostly draw engame
from ~7.5 to ~1.5
This patch does not regress against master in normal games:
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 96616 W: 18459 L: 18424 D: 59733
Ptnml(0-2): 1409, 10812, 23802, 10905, 1380
http://tests.stockfishchess.org/tests/view/5e49dfe6f8d1d52b40cd31bc
LTC
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 49726 W: 6340 L: 6304 D: 37082
Ptnml(0-2): 239, 4227, 15906, 4241, 250
http://tests.stockfishchess.org/tests/view/5e4ab9ee16fb3df8c4cc01d0
Theory: KNNK is a dead draw, however the presence of the additional weakSide pawn opens up some mate opportunities. The idea is to block the pawn (preferably behind the Troitsky line) with one of the knights and press the weakSide king into a corner. If we can stalemate the king, we release the pawn with the knight (to avoid actual stalemate), and use the knight to complete the mate before the pawn promotes. This is also why there is an additional penalty for advancement of the pawn.
closes https://github.com/official-stockfish/Stockfish/pull/2553
Bench: 4981770
Fixes#2533, fixes#2543, fixes#2423.
the code that prevents false mate announcements depending on the TT
state (GHI), incorrectly used VALUE_MATE_IN_MAX_PLY. The latter
constant, however, also includes, counterintuitively, the TB win range.
This patch fixes that, by restoring the behavior for TB win scores,
while retaining the false mate correctness, and improving the mate
finding ability. In particular
no alse mates are announced with the poisened hash testcase
```
position fen 8/8/8/3k4/8/8/6K1/7R w - - 0 1
go depth 40
position fen 8/8/8/3k4/8/8/6K1/7R w - - 76 1
go depth 20
ucinewgame
```
mates are found with the testcases reported in #2543
```
position fen 4k3/3pp3/8/8/8/8/2PPP3/4K3 w - - 0 1
setoption name Hash value 1024
go depth 55
ucinewgame
```
and
```
position fen 4k3/4p3/8/8/8/8/3PP3/4K3 w - - 0 1
setoption name Hash value 1024
go depth 45
ucinewgame
```
furthermore, on the mate finding benchmark (ChestUCI_23102018.epd),
performance improves over master, roughly reaching performance with the
false mate protection reverted
```
Analyzing 6566 mate positions for best and found mates:
----------------best ---------------found
nodes master revert fixed master revert fixed
16000000 4233 4236 4235 5200 5201 5199
32000000 4583 4585 4585 5417 5424 5418
64000000 4852 4853 4855 5575 5584 5579
128000000 5071 5068 5066 5710 5720 5716
256000000 5280 5282 5279 5819 5827 5826
512000000 5471 5468 5468 5919 5935 5932
```
On a testcase with TB enabled, progress is made consistently, contrary
to master
```
setoption name SyzygyPath value ../../../syzygy/3-4-5/
setoption name Hash value 2048
position fen 1R6/3k4/8/K2p4/4n3/2P5/8/8 w - - 0 1
go depth 58
ucinewgame
```
The PR (prior to a rewrite for clarity)
passed STC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 65405 W: 12454 L: 12384 D: 40567
Ptnml(0-2): 920, 7256, 16285, 7286, 944
http://tests.stockfishchess.org/tests/view/5e441a3be70d848499f63d15
passed LTC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 27096 W: 3477 L: 3413 D: 20206
Ptnml(0-2): 128, 2215, 8776, 2292, 122
http://tests.stockfishchess.org/tests/view/5e44e277e70d848499f63d63
The incorrectly named VALUE_MATE_IN_MAX_PLY and VALUE_MATED_IN_MAX_PLY
were renamed into VALUE_TB_WIN_IN_MAX_PLY and VALUE_TB_LOSS_IN_MAX_PLY,
and correclty defined VALUE_MATE_IN_MAX_PLY and VALUE_MATED_IN_MAX_PLY
were introduced.
One further (corner case) mistake using these constants was fixed (go
mate X), which could lead to a premature return if X > MAX_PLY / 2,
but TB were present.
Thanks to @svivanov72 for one of the reports and help fixing the issue.
closes https://github.com/official-stockfish/Stockfish/pull/2552
Bench: 4932981
Align the TT allocation by 2M to make it huge page friendly and advise the
kernel to use huge pages.
Benchmarks on my i7-8700K (6C/12T) box: (3 runs per bench per config)
vanilla (nps) hugepages (nps) avg
==================================================================================
bench | 3012490 3024364 3036331 3071052 3067544 3071052 +1.5%
bench 16 12 20 | 19237932 19050166 19085315 19266346 19207025 19548758 +1.1%
bench 16384 12 20 | 18182313 18371581 18336838 19381275 19738012 19620225 +7.0%
On my box, huge pages have a significant perf impact when using a big
hash size. They also speed up TT initialization big time:
vanilla (s) huge pages (s) speed-up
=======================================================================
time stockfish bench 16384 1 1 | 5.37 1.48 3.6x
In practice, huge pages with auto-defrag may always be enabled in the
system, in which case this patch has no effect. This
depends on the values in /sys/kernel/mm/transparent_hugepage/enabled
and /sys/kernel/mm/transparent_hugepage/defrag.
closes https://github.com/official-stockfish/Stockfish/pull/2463
No functional change
This patch greatly increases the endgame penalty for having a trapped rook.
Idea was a result of witnessing Stockfish losing some games at CCCC exchanging
pieces in the position with a trapped rook which directly lead to a lost endgame.
This patch should partially fix such behavior making this penalty high even in
deep endgames.
Passed STC
http://tests.stockfishchess.org/tests/view/5e279d7cc3b97aa0d75bc1c4
LLR: 2.94 (-2.94,2.94) {-1.00,3.00}
Total: 8528 W: 1706 L: 1588 D: 5234
Ptnml(0-2): 133, 957, 1985, 1024, 159
Passed LTC
http://tests.stockfishchess.org/tests/view/5e27aee4c3b97aa0d75bc1e1
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 88713 W: 11520 L: 11130 D: 66063
Ptnml(0-2): 646, 8170, 26342, 8492, 676
Closes https://github.com/official-stockfish/Stockfish/pull/2510
Bench: 4964462
----------------------
Comment by Malcolm Campbell:
Congrats! I think this might be a common pattern - scores that seem to mainly apply
to the midgame are often better with a similar (or at least fairly big) endgame value
as well. Maybe there are others eval parameters we can tweak like this...
Currently on a normal bench run in ~0,7% of cases 'improving' is set to
true although the static eval isn't improving at all, just keeping
equal. It looks like the strict gt-operator is more appropriate here,
since it returns to 'improving' its literal meaning without sideffects.
STC {-1.00,3.00} failed yellow:
https://tests.stockfishchess.org/tests/view/5e1ec38c8fd5f550e4ae1c28
LLR: -2.93 (-2.94,2.94) {-1.00,3.00}
Total: 53155 W: 10170 L: 10109 D: 32876
Ptnml(0-2): 863, 6282, 12251, 6283, 892
non-regression LTC passed:
https://tests.stockfishchess.org/tests/view/5e1f1c0d8fd5f550e4ae1c41
LLR: 2.98 (-2.94,2.94) {-1.50,0.50}
Total: 23961 W: 3114 L: 3018 D: 17829
Ptnml(0-2): 163, 2220, 7114, 2298, 170
CLoses https://github.com/official-stockfish/Stockfish/pull/2496
bench: 4561386
This is a non-functional simplification. Looks like std::bitset works good
for the KPKBitbase. Thanks for Jorg Oster for helping get the speed up
(the [] accessor is faster than test()).
Speed testing: 10k calls to probe:
master 9.8 sec
patch 9.8 sec.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 100154 W: 19025 L: 18992 D: 62137
Ptnml(0-2): 1397, 11376, 24572, 11254, 1473
http://tests.stockfishchess.org/tests/view/5e21e601346e35ac603b7d2b
Closes https://github.com/official-stockfish/Stockfish/pull/2502
No functional change
This is a non-functional speed-up: master has to access SquareBB twice while this patch
determines opposite_colors just using the values of the squares. It doesn't seem to change
the overall speed of bench, but calling opposite_colors(...) 10 Million times:
master: 39.4 seconds
patch: 11.4 seconds.
The only data point I have (other than my own tests), is a quite old failed STC test:
LLR: -2.93 (-2.94,2.94) [-1.50,4.50]
Total: 24308 W: 5331 L: 5330 D: 13647
Ptnml(0-2): 315, 2577, 6326, 2623, 289
http://tests.stockfishchess.org/tests/view/5e010256c13ac2425c4a9a67
Closes https://github.com/official-stockfish/Stockfish/pull/2498
No functional change
This is a non-functional simplification. If we use the "side to move" of the entry
instead of the template, one of the classify methods goes away. Furthermore, I've
resolved the colors in some of the statements (we're already assuming direction
using NORTH), and used stm (side to move) instead of "us," since this is much clearer
to me.
This is not tested because it is non-functional, only applies building the bitbase
and there are no changes to the binary (on my machine).
Closes https://github.com/official-stockfish/Stockfish/pull/2485
No functional change
This is a non-functional simplification. Instead of passing the piece type
for remove_piece, we can rely on the board. The only exception is en-passant
which must be explicitly set because the destination square for the capture
is not the same as the piece to remove.
Verified also in the Chess960 castling case by running a couple of perft, see
the pull request discussion: https://github.com/official-stockfish/Stockfish/pull/2460
STC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 18624 W: 4147 L: 4070 D: 10407
Ptnml(0-2): 223, 1933, 4945, 1938, 260
http://tests.stockfishchess.org/tests/view/5dfeaa93e70446e17e451163
No functional change
Official release version of Stockfish 11.
Bench: 5156767
-----------------------
It is our pleasure to release Stockfish 11 to our fans and supporters.
Downloads are freely available at http://stockfishchess.org/download/
This version 11 of Stockfish is 50 Elo stronger than the last version, and
150 Elo stronger than the version which famously lost a match to AlphaZero
two years ago. This makes Stockfish the strongest chess engine running on
your smartphone or normal desktop PC, and we estimate that on a modern four
cores CPU, Stockfish 11 could give 1:1000 time odds to the human chess champion
having classical time control, and be on par with him. More specific data,
including nice cumulative curves for the progression of Stockfish strength
over the last seven years, can be found on [our progression page][1], at
[Stefan Pohl site][2] or at [NextChessMove][3].
In October 2019 Stockfish has regained its crown in the TCEC competition,
beating in the superfinal of season 16 an evolution of the neural-network
engine Leela that had won the previous season. This clash of style between an
alpha-beta and an neural-network engine produced spectacular chess as always,
with Stockfish [emerging victorious this time][0].
Compared to Stockfish 10, we have made hundreds of improvements to the
[codebase][4], from the evaluation function (improvements in king attacks,
middlegame/endgame transitions, and many more) to the search algorithm (some
innovative coordination methods for the searching threads, better pruning of
unsound tactical lines, etc), and fixed a couple of bugs en passant.
Our testing framework [Fishtest][5] has also seen its share of improvements
to continue propelling Stockfish forward. Along with a lot of small enhancements,
Fishtest has switched to new SPRT bounds to increase the chance of catching Elo
gainers, along with a new testing book and the use of pentanomial statistics to
be more resource-efficient.
Overall the Stockfish project is an example of open-source at its best, as
its buzzing community of programmers sharing ideas and daily reviewing their
colleagues' patches proves to be an ideal form to develop innovative ideas for
chess programming, while the mathematical accuracy of the testing framework
allows us an unparalleled level of quality control for each patch we put in
the engine. If you wish, you too can help our ongoing efforts to keep improving
it, just [get involved][6] :-)
Stockfish is also special in that every chess fan, even if not a programmer,
[can easily help][7] the team to improve the engine by connecting their PC to
Fishtest and let it play some games in the background to test new patches.
Individual contributions vary from 1 to 32 cores, but this year Bojun Guo
made it a little bit special by plugging a whole data center during the whole
year: it was a vertiginous experience to see Fishtest spikes with 17466 cores
connected playing [25600 games/minute][8]. Thanks Guo!
The Stockfish team
[0]: <http://mytcecexperience.blogspot.com/2019/10/season-16-superfinal-games-91-100.html>
[1]: <https://github.com/glinscott/fishtest/wiki/Regression-Tests>
[2]: <https://www.sp-cc.de/index.htm>
[3]: <https://nextchessmove.com/dev-builds>
[4]: <https://github.com/official-stockfish/Stockfish>
[5]: <https://tests.stockfishchess.org/tests>
[6]: <https://stockfishchess.org/get-involved/>
[7]: <https://github.com/glinscott/fishtest/wiki>
[8]: <https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/lebEmG5vgng%5B1-25%5D>
Based on recent improvement of futility pruning by @locutus2 : we lower
the futility margin to apply it for more nodes but as a compensation
we also lower the history threshold to apply it to less nodes. Further
work in tweaking constants can always be done - numbers are guessed
"by hand" and are not results of some tuning, maybe there is some more
Elo to squeeze from this part of code.
Passed STC
LLR: 2.98 (-2.94,2.94) {-1.00,3.00}
Total: 15300 W: 3081 L: 2936 D: 9283
Ptnml(0-2): 260, 1816, 3382, 1900, 290
http://tests.stockfishchess.org/tests/view/5e18da3b27dab692fcf9a158
Passed LTC
LLR: 2.94 (-2.94,2.94) {0.00,2.00}
Total: 108670 W: 14509 L: 14070 D: 80091
Ptnml(0-2): 813, 10259, 31736, 10665, 831
http://tests.stockfishchess.org/tests/view/5e18fc9627dab692fcf9a180
Bench: 4643972
This patch makes Stockfish search same depth again if > 60% of optimum time is
already used, instead of trying the next iteration. The idea is that the next
iteration will generally take about the same amount of time as has already been
used in total. When we are likely to begin the last iteration, as judged by total
time taken so far > 0.6 * optimum time, searching the last depth again instead of
increasing the depth still helps the other threads in lazy SMP and prepares better
move ordering for the next moves.
STC :
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 13436 W: 2695 L: 2558 D: 8183
Ptnml(0-2): 222, 1538, 3087, 1611, 253
https://tests.stockfishchess.org/tests/view/5e1618a761fe5f83a67dd964
LTC :
LLR: 2.94 (-2.94,2.94) {0.00,2.00}
Total: 32160 W: 4261 L: 4047 D: 23852
Ptnml(0-2): 211, 2988, 9448, 3135, 247
https://tests.stockfishchess.org/tests/view/5e162ca061fe5f83a67dd96d
The code was revised as suggested by @vondele for multithreading:
STC (8 threads):
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 16640 W: 2049 L: 1885 D: 12706
Ptnml(0-2): 119, 1369, 5158, 1557, 108
https://tests.stockfishchess.org/tests/view/5e19826a2cc590e03c3c2f52
LTC (8 threads):
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 16536 W: 2758 L: 2629 D: 11149
Ptnml(0-2): 182, 1758, 4296, 1802, 224
https://tests.stockfishchess.org/tests/view/5e18b91a27dab692fcf9a140
Thanks to those discussing Stockfish lazy SMP on fishcooking which made me
try this, and to @vondele for suggestions and doing related tests.
See full discussion in the pull request thread:
https://github.com/official-stockfish/Stockfish/pull/2482
Bench: 4586187
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 120120 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
This updates estimates from 1.5 year ago, and adds missing terms. All estimates
from tests run on fishtest at 10+0.1 (STC), 20000 games, error bars +- 3 Elo,
see the original message in the pull request for the full list of tests.
Noteworthy changes are step 7 (futility pruning) going from ~30 to ~50 Elo
and step 13 (pruning at shallow depth) going from ~170 to ~200 Elo.
Full list of tests: https://github.com/official-stockfish/Stockfish/pull/2401
@Rocky640 made the suggestion to look at time control dependence of these terms.
I picked two large terms (early futility pruning and singular extension), so with
small relative error. It turns out it is actually quite interesting (see figure 1).
Contrary to my expectation, the Elo gain for early futility pruning is pretty time
control sensitive, while singular extension gain is not.
Figure 1: TC dependence of two search terms

Going back to the old measurement of futility pruning (30 Elo vs today 50 Elo),
the code is actually identical but the margins have changed. It seems like a nice
example of how connected terms in search really are, i.e. the value of early futility
pruning increased significantly due to changes elsewhere in search.
No functional change.
User "adentong" reported recently of a game where Stockfish blundered a game
in a tournament because during a search there was an hash-table issue for
positions inside the tree very close to the 50-moves draw rule. This is part
of a problem which is commonly referred to as the Graph History Interaction (GHI),
and is difficult to solve in computer chess because storing the 50-moves counter
in the hash-table loses Elo in general.
Links:
Issue 2451 : https://github.com/official-stockfish/Stockfish/issues/2451
About the GHI : https://www.chessprogramming.org/Graph_History_Interaction
This patch tries to address the issue in this particular game and similar
reported games: it prevents that values from the transposition table are
getting used when the 50-move counter is close to reaching 100 (). The idea
is that in such cases values from previous searches, with a much lower 50-move
count, become less and less reliable.
More precisely, the heuristic we use in this patch is that we don't take the
transposition table cutoff when we have reached a 45-moves limit, but let the
search continue doing its job. There is a possible slowdown involved, but it will
also help to find either a draw when it thought to be losing, or a way to avoid
the draw by 50-move rule. This heuristics probably will not fix all possible cases,
but seems to be working reasonably well in practice while not losing too much Elo.
Passed non-regression tests:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 274452 W: 59700 L: 60075 D: 154677
http://tests.stockfishchess.org/tests/view/5df546116932658fe9b451bf
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 95235 W: 15297 L: 15292 D: 64646
http://tests.stockfishchess.org/tests/view/5df69c926932658fe9b4520e
Closes https://github.com/official-stockfish/Stockfish/pull/2453
Bench: 4586187
SEE (Static Exchange Evaluation) is a critical component, so we might
indulge some tricks to make it faster. Another pull request #2469 showed
some speedup by removing templates, this version uses Ronald de Man
(@syzygy1) SEE implementation which also unrolls the for loop by
suppressing the min_attacker() helper function and exits as soon as
the last swap is conclusive.
See Ronald de Man version there:
https://github.com/syzygy1/Cfish/blob/master/src/position.c
Patch testes against pull request #2469:
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 19365 W: 3771 L: 3634 D: 11960
Ptnml(0-2): 241, 1984, 5099, 2092, 255
http://tests.stockfishchess.org/tests/view/5e10eb135e5436dd91b27ba3
And since we are using new SPRT statistics, and that both pull requests
finished with less than 20000 games I also tested against master as
a speed-up:
LLR: 2.99 (-2.94,2.94) {-1.00,3.00}
Total: 18878 W: 3674 L: 3539 D: 11665
Ptnml(0-2): 193, 1999, 4966, 2019, 250
http://tests.stockfishchess.org/tests/view/5e10febf12ef906c8b388745
Non functional change
This patch excludes blockers for king from mobility area. It was tried a couple
of times by now but now it passed. Performance is not enormously good but this
patch makes a lot of sence - blockers for king can't really move until king moves
(in most cases) so logic behind it is the same as behind excluding king square
from mobility area.
STC
http://tests.stockfishchess.org/tests/view/5dec388651219d7befdc76be
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6155 W: 1428 L: 1300 D: 3427
LTC
http://tests.stockfishchess.org/tests/view/5dec4a3151219d7befdc76d3
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 120800 W: 19636 L: 19134 D: 82030
Bench: 5173081
This patch simplifies latest @MJZ1977 elo gainer. Seems like PvNode check in
condition of last capture extension is not needed. Note - even if this is a
simplification it actually causes this extension to be applied more often, thus
strengthening effect of @MJZ1977's patch.
passed STC
http://tests.stockfishchess.org/tests/view/5deb9a3eb7bdefd50db28d0e
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 80244 W: 17421 L: 17414 D: 45409
passed LTC
http://tests.stockfishchess.org/tests/view/5deba860b7bdefd50db28d11
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 21506 W: 3565 L: 3446 D: 14495
Bench: 5097036
As reported on the forum it is possible, on very rare occasions, that we are
trying to print a PV line with an invalid previousScore, although this line
has a valid actual score. This patch fixes output of PV lines with invalid
scores in a MultiPV search. This is a follow-up patch to 8b15961 and makes
the fix finally complete.
The reason is the i <= pvIdx condition which probably is a leftover from the
times there was a special root search function. This check is no longer needed
today and prevents PV lines past the current one (current pvIdx) to be flagged
as updated even though they do have a valid score.
https://github.com/official-stockfish/Stockfish/commit/8b15961349e18a9ba113973c53f53913d0cd0fadhttps://groups.google.com/forum/?fromgroups=#!topic/fishcooking/PrnoDLvMvro
No functional change.
the removed line is not needed, since with the conditions on SE, eval
equals ttValue (except inCheck), which must be larger than beta if the second condition
is true.
The removed line is also incorrect as eval might be VALUE_NONE at this
location if inCheck. This removal addresses part of https://github.com/official-stockfish/Stockfish/pull/2406#issuecomment-552642608
No functional change.
this patch extends bench to print static evaluations.
./stockfish bench 16 1 1 filename eval
will now print the evaluations for all fens in the file.
This complements the various 'go' flavors for bench and might be useful for debugging and/or tuning.
No functional change.
In a recent commit, "Introduce king flank defenders," a term was introduced
by Michael Chaly (@Vizvezdenec) to reduce king danger based on king defenders,
i.e., friendly attacks on our King Flank and Camp. This is a powerful idea
and broadly applicable to all of our pieces.
An earlier, but narrower, version of a similar idea was already coded into
king danger, with a term reducing king danger simply if we had a bishop and
king attacking the same square -- there is also a similar term for knights,
but roughly three times larger. I had attempted to tweak this term's coefficient
fairly recently, in a series of tests in early September which increased this
coefficient. All failed STC with significantly negative scores.
Now that the king flank defenders term has been introduced, it appears that
the bishop-defense term can be simplified away without compensation or
significant Elo loss.
Where do we go from here? This PR is a natural follow-up to "Introduce king
flank defenders," which proposed simplification with existing and overlapping
terms, such as this one. That PR also mentioned that the coefficient it
introduced appeared arbitrary, so perhaps this PR can facilitate a tweak to
increase king flank defenders' coefficient.
Additionally, this pull request is extremely similar to https://github.com/official-stockfish/Stockfish/pull/1821,
which was (coincidentally) merged a year ago, to the day (November 23, 2018).
That patch also simplified away a linear king danger tropism term, which was
soon after replaced with a quadratic term by @Vizvezdenec (which would not have
passed without the simplification). @Vizvezdenec, again by coincidence, has
recently been trying to implement a quadratic term, this time for defenders
rather than attackers. This history of this evaluation code suggests that
this simplification might be enough to help a patch for quadratic king-flank
defenders pass.
Bench: 4959670
STC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 22209 W: 4920 L: 4800 D: 12489
https://tests.stockfishchess.org/tests/view/5dd444d914339111b9b6bed7
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 152107 W: 24658 L: 24743 D: 102706
https://tests.stockfishchess.org/tests/view/5dd4be31f531e81cf278ea9d
Interesting discussion on Github about this pull request:
https://github.com/official-stockfish/Stockfish/pull/2424
---
This pull request was opened less than one week before the holiday of
Thanksgiving here in the United States. In keeping with the holiday
tradition of expressing gratitude, I would like to thank our generous
CPU donors, talented forum contributors, innovative developers, speedy
fishtest approvers, and especially our hardworking server maintainers
(@ppigazzini and @tomtor). Thank you all for a year of great Stockfish
progress!
This patch measures how frequently search is exploring new configurations.
This is done be computing a running average of ttHit. The ttHitAverage rate
is somewhat low (e.g. 30% for startpos) in the normal case, while it can be
very high if no progress is made (e.g. 90% for the fortress I used for testing).
This information can be used to influence search. In this patch, by adjusting
reductions if the rate > 50%. A first version (using a low ttHitAverageResolution
and this 50% threshold) passed testing:
STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26425 W: 5837 L: 5650 D: 14938
http://tests.stockfishchess.org/tests/view/5dcede8b0ebc5902563258fa
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 32313 W: 5392 L: 5128 D: 21793
http://tests.stockfishchess.org/tests/view/5dcefb1f0ebc590256325c0e
However, as discussed in pull request 2414, using a larger ttHitAverageResolution
gives a better approximation of the underlying distributions. This needs a slight
adjustment for the threshold as the new distributions are shifted a bit compared
to the older ones, and this threshold seemingly is sensitive (we used 0.53125 here).
https://github.com/official-stockfish/Stockfish/pull/2414
This final version also passed testing, and is used for the patch:
STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 16025 W: 3555 L: 3399 D: 9071
http://tests.stockfishchess.org/tests/view/5dd070b90ebc5902579e20c2
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 37576 W: 6277 L: 5998 D: 25301
http://tests.stockfishchess.org/tests/view/5dd0f58e6f544e798086f224
Closes https://github.com/official-stockfish/Stockfish/pull/2414
Bench: 4989584
This patch implements what we have been trying for quite some time -
dependance of kingdanger on balance of attackers and defenders of king
flank, to avoid overestimate attacking power if the opponent has enough
defenders of king position. We already have some form of it in bishop
and knight defenders - this is further work in this direction.
What to do based on this?
1) constant 4 is arbitrary, maybe it is not optimal
2) maybe we can use quadratic formula as in kingflankattack
3) simplification into alrealy existing terms is always a possibility :)
4) overall kingdanger tuning always can be done.
passed STC:
http://tests.stockfishchess.org/tests/view/5dcf40560ebc590256325f30
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26298 W: 5819 L: 5632 D: 14847
passed LTC:
http://tests.stockfishchess.org/tests/view/5dcfa5760ebc590256326464
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 30600 W: 5042 L: 4784 D: 20774
Closes https://github.com/official-stockfish/Stockfish/pull/2415
Bench: 4496847
Introduce OutpostRank[RANK_NB] which contains a bonus according to
the rank of the outpost. We use it for the primary Outpost bonus.
The values are based on the trends of the SPSA tuning run with some
manual tweaks.
Passed STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 27454 W: 6059 L: 5869 D: 15526
http://tests.stockfishchess.org/tests/view/5dcadba20ebc590256922f09
Passed LTC:
LLR: 2.94 (-2.94,2.94) [0.00,3.50]
Total: 57950 W: 9443 L: 9112 D: 39395
http://tests.stockfishchess.org/tests/view/5dcaea880ebc5902569230bc
Bench: 4778405
----------------------------
The inspiration for this patch came from Stefan Geschwentner's attempt
of modifying BishopPawns into a rank-based penalty. Michael Stembera
suggested that maybe the S(0, 0) ranks (3rd, 7th and also maybe 8th)
can still be tuned. This would expand our definition of Outpost and
OutpostRanks would be removed altogether. Special thanks to Mark Tenzer
for all the help and excellent suggestions.
The removed lines approximately duplicate equivalent logic in the movePicker.
Adjust the futility_move_count to componsate for some difference
(the movePicker prunes one iteration of the move loop later).
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8114 W: 1810 L: 1663 D: 4641
http://tests.stockfishchess.org/tests/view/5dc6afe60ebc5902562bd318
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 89956 W: 14473 L: 14460 D: 61023
http://tests.stockfishchess.org/tests/view/5dc6bdcf0ebc5902562bd3c0
Closes https://github.com/official-stockfish/Stockfish/pull/2407
Bench: 4256440
---------------------
How to continue from there?
It would be interesting to see if we can extract some Elo gain
from the new futility_move_count formula, for instance by somehow
incorporating the final -1 in the 5 constant, or adding a linear
term to the quadratics...
```
futility_move_count = (5 + depth * depth) * (1 + improving) / 2 - 1
```
Current master 648c7ec25d will generate an
incorrect mate score for:
```
setoption name Hash value 8
setoption name Threads value 1
position fen 8/1p2KP2/1p4q1/1Pp5/2P5/N1Pp1k2/3P4/1N6 b - - 76 40
go depth 49
```
even though the position is a draw. Generally, SF tries to display only
proven mate scores, so this is a bug.
This was posted http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72166
by Uri Blass, with the correct analysis that this must be related to the
50 moves draw rule being ignored somewhere.
Indeed, this is possible as positions and there eval are stored in the TT,
without reference to the 50mr counter. Depending on the search path followed
a position can thus be mate or draw in the TT (GHI or Graph history interaction).
Therefore, to prove mate lines, the TT content has to be used with care. Rather
than ignoring TT content in general or for mate scores (which impact search or
mate finding), it is possible to be more selective. In particular, @WOnder93
suggested to only ignore the TT if the 50mr draw ply is closer than the mate
ply. This patch implements this idea, by clamping the eval in the TT to
+-VALUE_MATED_IN_MAX_PLY. This retains the TTmove, but causes a research of
these lines (with the current 50mr counter) as needed.
This patch hardly ever affects search (as indicated by the unchanged
bench), but fixes the testcase. As the conditions are very specific,
also mate finding will almost never be less efficient (testing welcome).
It was also shown to pass STC and LTC non-regression testing, in a form
using if/then/else instead of ternary operators:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 93605 W: 15346 L: 15340 D: 62919
http://tests.stockfishchess.org/tests/view/5db45bb00ebc5908127538d4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33873 W: 7359 L: 7261 D: 19253
http://tests.stockfishchess.org/tests/view/5db4c8940ebc5902d6b146fc
closes https://github.com/official-stockfish/Stockfish/issues/2370
Bench: 4362323
This reverts the previous commit. The PSQT changes in this previous
commit originated from tests against quite an old version of master
which did not include the other PSQT changes of 474d133 for the other
pieces, and there might be some unknown interactions between the PSQT
tables. So we made a non-regression test of the last commit against the
last-but-one commit. This test failed, leading to the revert decision.
Failed non-regression test:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 95536 W: 15047 L: 15347 D: 65142
http://tests.stockfishchess.org/tests/view/5dc0ba1d0ebc5904493b0112
Closes https://github.com/official-stockfish/Stockfish/pull/2395
Bench: 4362323
As Stockfish developers, we aim to make our code as legible and as close
to simple English as possible. However, one of the more notable exceptions
to this rule concerns operations between Squares and Bitboards.
Prior to this pull request, AND, OR, and XOR were only defined when the
Bitboard was the first operand, and the Square the second. For example,
for a Bitboard b and Square s, "b & s" would be valid but "s & b" would not.
This conflicts with natural reasoning about logical operators, both
mathematically and intuitively, which says that logical operators should
commute.
More dangerously, however, both Square and Bitboard are defined as integers
"under the hood." As a result, code like "s & b" would still compile and give
reasonable bench values. This trap occasionally ensnares even experienced
Stockfish developers, but it is especially dangerous for new developers not
aware of this peculiarity. Because there is no compilation or runtime error,
and a reasonable bench, only a close review by approvers can spot this error
when a test has been submitted--and many times, these bugs have slipped past
review. This is by far the most common logical error on Fishtest, and has
wasted uncountable STC games over the years.
However, it can be fixed by adding three non-functional lines of code. In this
patch, we define the operators when the operands are provided in the opposite
order, i.e., we make AND, OR, and XOR commutative for Bitboards and Squares.
Because these are inline methods and implemented identically, the executable
does not change at all.
This patch has the small side-effect of requiring Squares to be explicitly
cast to integers before AND, OR, or XOR with integers. This is only performed
twice in Stockfish's source code, and again does not change the executable at
all (since Square is an enum defined as an integer anyway).
For demonstration purposes, this pull request also inverts the order of one AND
and one OR, to show that neither the bench nor the executable change. (This
change can be removed before merging, if preferred.)
I hope that this pull request significantly lowers the barrier-of-entry for new
developer to join the Stockfish project. I also hope that this change will improve
our efficiency in using our generous CPU donors' machines, since it will remove
one of the most common causes of buggy tests.
Following helpful review and comments by Michael Stembera (@mstembera), we add
a further clean-up by implementing OR for two Squares, to anticipate additional
traps developers may encounter and handle them cleanly.
Closes https://github.com/official-stockfish/Stockfish/pull/2387
No functional change.
Simplify the king ring initialization and make it more regular, by just
moving the king square off the edges and using PseudoAttacks by king from
this new square.
There is a small functional difference from the previous master, as the
old master excludes the original ksq square while this patch always includes
the nine squares block (after moving the king from the edges). Additionally,
master does not adjust the kingRing down if we are on relative rank 8,
while this patch treats all of the edges the same.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13263 W: 2968 L: 2830 D: 7465
http://tests.stockfishchess.org/tests/view/5db872830ebc5902d1f388aa
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 72996 W: 11819 L: 11780 D: 49397
http://tests.stockfishchess.org/tests/view/5db899c20ebc5902d1f38b5e
Closes https://github.com/official-stockfish/Stockfish/pull/2384
Bench: 4959244
Make dynamic contempt weight factor dependent on static contempt so that higher
static contempt implies less dynamic contempt and vice versa. For default contempt
24 this is a non-functional change. But tests with contempt 0 shows an elo gain.
Also today is my birthday so i have already give to myself a gift with this patch :-)!
Further proceedings:
in the past we checked for default contempt that it doesn't regress against
contempt 0. Now that the later is stronger and the former is the same strength
this should be rechecked. Perhaps the default contempt have to be lowered.
It would be interesting to get some idea of the impact of this patch outside
of the 0-24 contempt range.
STC: (both with contempt=0)
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 21912 W: 3898 L: 3740 D: 14274
http://tests.stockfishchess.org/tests/view/5db74b6f0ebc5902d1f37405
LTC: (both with contempt=0)
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 27172 W: 3350 L: 3126 D: 20696
http://tests.stockfishchess.org/tests/view/5db760020ebc5902d1f375d0
Closes https://github.com/official-stockfish/Stockfish/pull/2382
No functional change (for current default contempt 24).
This PR refactors update_quiet_stats, update_capture_stats and search to more clearly reflect what is actually done.
Effectively, all stat updates that need to be done after search is finished and a bestmove is found,
are collected in a new function ```final_stats_update()```. This shortens our main search routine, and simplifies ```update_quiet_stats```.
The latter function is now more easily reusable with fewer arguments, as the handling of ```quietsSearched``` is only needed in ```final_stats_update```.
```update_capture_stats```, which was only called once is now integrated in ```final_stats_update```, which allows for removing a branch and reusing some ```stat_bonus``` calls. The need for refactoring was also suggested by the fact that the comments of ```update_quiet_stats``` and ```update_capture_stats``` were incorrect (e.g. ```update_capture_stats``` was called, correctly, also when the bestmove was a quiet and not a capture).
passed non-regression STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 75196 W: 16364 L: 16347 D: 42485
http://tests.stockfishchess.org/tests/view/5db004ec0ebc5902c06db9e1
The diff is most easily readable as ```git diff master --patience```
No functional change
- Cleanups by Alain
- Group king attacks and king defenses
- Signature of futility_move_count()
- Use is_discovery_check_on_king()
- Simplify backward definition
- Use static asserts in move generator
- Factor a statement in move generator
No functional change
Stockfish crashes immediately if users enter a wrong file name (or even an existing
folder name) for debug log file. It may be hard for users to find out since it prints
nothing. If they enter the string via a chess GUI, the chess GUI may remember and
auto-send to Stockfish next time, makes Stockfish crashes all the time. Bug report by
Nguyen Hong Pham in this issue: https://github.com/official-stockfish/Stockfish/issues/2365
This patch avoids the crash and instead prefers to exit gracefully with a error
message on std:cerr, like we do with the fenFile for instance.
Closes https://github.com/official-stockfish/Stockfish/pull/2366
No functional change.
With the current questions and issues around threading, I had a look at
https://github.com/official-stockfish/Stockfish/issues/2299.
It seems there was a problem with data races when requesting eval via UCI while
a search was already running. To fix this an extra thread uithread was created,
presumably to avoid an overlap with Threads.main() that was causing problems.
Making this eval request seems to be outside the scope of UCI, and @vondele also
reports that the data race is not even fixed reliably by this change. I suggest
we simplify the threading here by removing this uithread and adding a comment
signaling that user should not request eval when a search is already running.
Closes https://github.com/official-stockfish/Stockfish/pull/2310
No functional change.
The current bench is missing a position with high 50 moves rule counter,
making most 'shuffle' tests based on 50mr > N seem non-functional.
This patch adds one FEN with high 50mr counter to address this issue
(taken from a recent tcec game).
Four new FENs:
- position with high 50mr counter
- tactical position with many captures, checks, extensions, fails high/low
- two losses by Stockfish in the S16 bonus games against Houdini
See the pull request for nice comments by @Alayan-stk-2 about each position
in bench: https://github.com/official-stockfish/Stockfish/pull/2338
Bench: 4590210
Previously, we used various control statements and ternary operators to divide
Outpost into four bonuses, based on whether the outpost was for a knight or
bishop, and whether it was currently an Outpost or merely a potential ("reachable")
one in the future. Bishop outposts, however, have traditionally been worth far
less Elo in testing. An attempt to remove them altogether passed STC, but failed LTC.
Here we include a narrower simplification, removing the reachable Outpost bonus
for bishops. This bonus was always suspect, given that its current implementation
conflicts directly with BishopPawns. BishopPawns penalizes our bishops based on the
number of friendly pawns on the same color of square, but by definition, Outposts
must be pawn-protected! This PR helps to alleviate this conceptual contradiction
without loss of Elo and with slightly simpler code.
On a code level, this allows us to simplify a ternary operator into the previous
"if" block and distribute a multiplication into an existing constant Score. On a
conceptual level, we retire one of the four traditional Outpost bonuses.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 22277 W: 4882 L: 4762 D: 12633
http://tests.stockfishchess.org/tests/view/5d9aeed60ebc5902b6cf9751
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51206 W: 8353 L: 8280 D: 34573
http://tests.stockfishchess.org/tests/view/5d9af1940ebc5902b6cf9cd5
Closes https://github.com/official-stockfish/Stockfish/pull/2352
Bench: 3941591
This patch changes the base aspiration window size depending on the absolute
value of the previous iteration score, increasing it away from zero. This
stems from the observation that the further away from zero, the more likely
the evaluation is to change significantly with more depth. Conversely, a
tighter aspiration window is more efficient when close to zero.
A beneficial side-effect is that analysis of won positions without a quick
mate is less prone to waste nodes in repeated fail-high that change the eval
by tiny steps.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 60102 W: 13327 L: 12868 D: 33907
http://tests.stockfishchess.org/tests/view/5d9a70d40ebc5902b6cf39ba
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 155553 W: 25745 L: 25141 D: 104667
http://tests.stockfishchess.org/tests/view/5d9a7ca30ebc5902b6cf4028
Future work : the values used in this patch were only a reasonable guess.
Further testing should unveil more optimal values. However, the aspiration
window is rather tight with a minimum of 21 internal units, so discrete
integers put a practical limitation to such tweaking.
More exotic experiments around the aspiration window parameters could also
be tried, but efficient conditions to adjust the base aspiration window size
or allow it to not be centered on the current evaluation are not obvious.
The aspiration window increases after a fail-high or a fail-low is another
avenue to explore for potential enhancements.
Bench: 4043748
Run as a simplification
a) insures that pawn attacks are always included in the pawn span
(this "fixes" the case where some outpost or reachable outpost
bonus were awarded on squares controlled by enemy pawns).
b) compute the full span only if not "backward" or not "blocked".
By looking at "blocked" instead of "opposed", we get a nice simpli-
fication and the "new" outpost detection is almost identical, except
a few borderline cases on rank 4.
passed STC
http://tests.stockfishchess.org/tests/view/5d9950730ebc5902b6cefb90
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 79113 W: 17168 L: 17159 D: 44786
passed LTC
http://tests.stockfishchess.org/tests/view/5d99d14e0ebc5902b6cf0692
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41286 W: 6819 L: 6731 D: 27736
See https://github.com/official-stockfish/Stockfish/pull/2348
bench: 3812891
It is always used as a bool, so let's make it a bool straight away.
We can always redefine it as a Piece in a later patch if we want
to use the piece type or the piece color.
No functional change.
Simplification that eliminates ONE_PLY, based on a suggestion in the forum that
support for fractional plies has never been used, and @mcostalba's openness to
the idea of eliminating it. We lose a little bit of type safety by making Depth
an integer, but in return we simplify the code in search.cpp quite significantly.
No functional change
------------------------------------------
The argument favoring eliminating ONE_PLY:
* The term “ONE_PLY” comes up in a lot of forum posts (474 to date)
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:relevance
* There is occasionally a commit that breaks invariance of the code
with respect to ONE_PLY
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:date/fishcooking/ZIPdYj6k0fk/KdNGcPWeBgAJ
* To prevent such commits, there is a Travis CI hack that doubles ONE_PLY
and rechecks bench
* Sustaining ONE_PLY has, alas, not resulted in any improvements to the
engine, despite many individuals testing many experiments over 5 years.
The strongest argument in favor of preserving ONE_PLY comes from @locutus:
“If we use par example ONE_PLY=256 the parameter space is increases by the
factor 256. So it seems very unlikely that the optimal setting is in the
subspace of ONE_PLY=1.”
There is a strong theoretical impediment to fractional depth systems: the
transposition table uses depth to determine when a stored result is good
enough to supply an answer for a current search. If you have fractional
depths, then different pathways to the position can be at fractionally
different depths.
In the end, there are three separate times when a proposal to remove ONE_PLY
was defeated by the suggestion to “give it a few more months.” So… it seems
like time to remove this distraction from the community.
See the pull request here:
https://github.com/official-stockfish/Stockfish/pull/2289
Remove temporary array of shelters and avoid iterating over it each time to find
if the shelter values after castling are better than the current value.
Work done on top of https://github.com/official-stockfish/Stockfish/pull/2277
Speed benchmark did not measure any difference.
No functional change
In lazySMP it makes sense to prune a little more, as multiple threads
search wider. We thus increase the prefactor of the reductions slowly
as a function of the threads. The prefactor of the log(threads) term
is a parameter, this pull request uses 1/2 after testing.
passed STC @ 8threads:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 118125 W: 23151 L: 22462 D: 72512
http://tests.stockfishchess.org/tests/view/5d8bbf4d0ebc59509180f217
passed LTC @ 8threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 67546 W: 10630 L: 10279 D: 46637
http://tests.stockfishchess.org/tests/view/5d8c463b0ebc5950918167e8
passed ~LTC @ 14threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 74271 W: 12421 L: 12040 D: 49810
http://tests.stockfishchess.org/tests/view/5d8db1f50ebc590f3beb24ef
Note:
A larger prefactor (1) passed similar tests at STC and LTC (8 threads),
while a very large one (2) passed STC quickly but failed LTC (8 threads).
For the single-threaded case there is no functional change.
Closes https://github.com/official-stockfish/Stockfish/pull/2337
Bench: 4088701
Fixup: remove redundant code.
A curious feature of Stockfish's current extension code is its repeated
use of "else if." In most cases, this makes no functional difference,
because no more than one extension is applied; once one extension has
been applied, the remaining ones can be safely ignored.
However, if most singular extension search conditions are true, except
"value < singularBeta", no non-singular extensions (e.g., castling) can
be performed!
Three tests were submitted, for three of Stockfish's four non-singular
extensions. I excluded the shuffle extension, because historically there
have been concerns about the fragility of its conditions, and I did not
want to risk causing any serious search problems.
- Modifying the passed pawn extension appeared roughly neutral at STC. At
best, it appeared to be an improvement of less than 1 Elo.
- Modifying check extension performed very poorly at STC
- Modifying castling extension (this patch) produced a long "yellow" run
at STC (insufficient to pass, but positive score) and a strong LTC.
In simple terms, prior to this patch castling extension was occasionally
not applied during search--on castling moves. The effect of this patch is
to perform castling extension on more castling moves. It does so without
adding any code complexity, simply by replacing an "else if" with "if" and
reordering some existing code.
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 108114 W: 23877 L: 23615 D: 60622
http://tests.stockfishchess.org/tests/view/5d8d86bd0ebc590f3beb0c88
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 20862 W: 3517 L: 3298 D: 14047
http://tests.stockfishchess.org/tests/view/5d8d99cd0ebc590f3beb1899
Bench: 3728191
--------
Where do we go from here?
- It seems strange to me that check extension performed so poorly -- clearly
some of the singular extension conditions are also very important for check
extension. I am not an expert in search, and I do not have any intuition
about which of the eight conditions is/are the culprit. I will try a
succession of eight STC tests to identify the relevant conditions, then try
to replicate this PR for check extension.
- Recent tests interacting with the castle extension may deserve retesting.
I will shortly resubmit a few of my recent castling extension tweaks, rebased
on this PR/commit.
My deepest thanks to @noobpwnftw for the extraordinary CPU donation, and to
all our other fishtest volunteers, who made it possible for a speculative LTC
to pass in 70 minutes!
Closes https://github.com/official-stockfish/Stockfish/pull/2331
Remove the RookOnPawn logic (for rook on rank 5 and above aligning with pawns
on same row or file) which was overlapping with a few other parameters.
Inspired by @31m059 interesting result hinting that a direct attack on pawns
instead of PseudoAttacks might work.
http://tests.stockfishchess.org/tests/view/5d89a7c70ebc595091801b8d
After a few attempts by me and @31m059, and some long STC greens but red LTC,
as a proof of concept I first tried a local SPSA at VSTC trying to tune related
rook psqt rows, and mainly some rook related stuff in evaluate.cpp.
Result was STC green, but still red LTC,
Finally a 100M fishtest SPSA at LTC proved successful both at STC and LTC.
All this was possible with the awesome fishtest contributors.
At some point, I had 850 workers on the last test !
Run as a simplification
STC
http://tests.stockfishchess.org/tests/view/5d8d68f40ebc590f3beaf171
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7399 W: 1693 L: 1543 D: 4163
LTC
http://tests.stockfishchess.org/tests/view/5d8d70270ebc590f3beaf63c
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41617 W: 6981 L: 6894 D: 27742
Closes https://github.com/official-stockfish/Stockfish/pull/2329
bench: 4037914
Revert the previous patch now that the binary for the super-final
of TCEC season 16 has been sent.
Maybe the feature of showing the name of compiler will be added to the
master branch in the future. But we may use a cleaner way to code it, see
some ideas using the Makefile approach at the end of pull request #2327 :
https://github.com/official-stockfish/Stockfish/pull/2327
Bench: 3618154
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 240919 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
This patch replaces the obscure expressions mapping files ABCDEFGH to ABCDDCBA
by explicite calls to an auxiliary function:
old: f = min(f, ~f)
new: f = map_to_queenside(f)
We used the Golbolt web site (https://godbolt.org) to check that the current
code for the auxiliary function is optimal.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30292 W: 6756 L: 6651 D: 16885
http://tests.stockfishchess.org/tests/view/5d8676720ebc5971531d6aa1
Achieved with a bit of help from Sopel97, snicolet and vondele, thanks everyone!
Closes https://github.com/official-stockfish/Stockfish/pull/2325
No functional change
This change to the Rook psqt encourages rook lifts to the third rank
on the two center files.
STC 10+0.1 th 1 :
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 40654 W: 9028 L: 8704 D: 22922
http://tests.stockfishchess.org/tests/view/5d885da60ebc5906dd3e9fcd
LTC 60+0.6 th 1 :
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 56963 W: 9530 L: 9196 D: 38237
http://tests.stockfishchess.org/tests/view/5d88618c0ebc5906dd3ea45f
Thanks to @snicolet for mentioning that Komodo does this a lot and
Stockfish doesn't, which gave me the idea for this patch, and to
@noobpwnftw for providing cores to fishtest which allowed very quick
testing.
Future work: perhaps this can be refined somehow to encourage this
on other files, my attempts have failed.
Closes https://github.com/official-stockfish/Stockfish/pull/2322
Bench: 3950249
Author: @nickpelling
We replace in the code the obscure expressions mapping files ABCDEFGH to ABCDDCBA
by an explicite call to an auxiliary function :
old: f = min(f, ~f)
new: f = map_to_queenside(f)
We used the Golbolt web site (https://godbolt.org) to find the optimal code
for the auxiliary function.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30292 W: 6756 L: 6651 D: 16885
http://tests.stockfishchess.org/tests/view/5d8676720ebc5971531d6aa1
No functional change
When scoring the connected pawns, replace the intricate ternary expressions
choosing the coefficient by a simpler addition of boolean conditions:
` value = Connected * (2 + phalanx - opposed) `
This is the map showing the old coefficients and the new ones:
```
phalanx and unopposed: 3x -> 3x
phalanx and opposed: 1.5x -> 2x
not phalanx and unopposed: 2x -> 2x
not phalanx and opposed: 1x -> 1x
```
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11354 W: 2579 L: 2437 D: 6338
http://tests.stockfishchess.org/tests/view/5d8151f00ebc5971531d244f
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41221 W: 7001 L: 6913 D: 27307
http://tests.stockfishchess.org/tests/view/5d818f930ebc5971531d26d6
Bench: 3959889
blah
It seems there is no other way to specify stack size on std::thread than linker
flags and the effective flags are named differently in many toolchains. On
toolchains where pthread is always available, this patch changes the stack
size change in our C++ code via pthread to ensure a minimum stack size of 8MB,
instead of relying on linker defaults which may be platform-specific.
Also raises default stack size on OSX to current Linux default (8MB) just to
be safe.
Closes https://github.com/official-stockfish/Stockfish/pull/2303
No functional change
This patch decreases the endgame scale factor using the 50 moves counter.
Looking at some games with this patch, it seems to have two effects on
the playing style:
1) when no progress can be made in late endgames (for instance in fortresses
or opposite bishops endgames) the evaluation will be largely tamed down
towards a draw value.
2) more interestingly, there is also a small effect in the midgame play because
Stockfish will panic a little bit if there are more than four consecutive
shuffling moves with an advantage: the engine will try to move a pawn or to
exchange a piece to keep the advantage, so the follow-ups of the position
will be discovered earlier by the alpha-beta search.
passed STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 23017 W: 5080 L: 4805 D: 13132
http://tests.stockfishchess.org/tests/view/5d7e4aef0ebc59069c36fc74
passed LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 30746 W: 5171 L: 4911 D: 20664
http://tests.stockfishchess.org/tests/view/5d7e513d0ebc59069c36ff26
Pull request: https://github.com/official-stockfish/Stockfish/pull/2304
Bench: 4272173
This patch finally introduces something that was tried for years: midgame score
dependance on complexity of position. More precisely, if the position is very
simplified and the complexity measure calculated in the initiative() function
is inferior to -50 by an amount d, then we add this value d to the midgame score.
One example of play of this patch will be (again!) 4 vs 3 etc same flank endgames
where sides have a lot of non-pawn material: 4 vs 3 draw mostly remains the same
draw even if we add a lot of equal material to both sides.
STC run was stopped after 200k games (and not converging):
LLR: -1.75 (-2.94,2.94) [0.50,4.50]
Total: 200319 W: 44197 L: 43310 D: 112812
http://tests.stockfishchess.org/tests/view/5d7cfdb10ebc5902d386572c
passed LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 41051 W: 6858 L: 6570 D: 27623
http://tests.stockfishchess.org/tests/view/5d7d14680ebc5902d3866196
This is the first and not really precise version, a lot of other stuff can be
tried on top of it (separate complexity for middlegame, some more terms, even
simple retuning of values).
Bench: 4248476
Follow-up to previous commit. Update the documentation for the user when using `make`,
to show the preferred bmi2 compile in the advanced examples section.
Note: I made a mistake in the previous commit comment, the documentation is shown when
using `make` or `make help`, not `make --help`.
No functional change
The only change done to the Makefile to get a somewhat faster binary as
discussed in #2291 is to add -msse4 to the compile options of the bmi2 build.
Since all processors supporting bmi2 also support sse4 this can be done easily.
It is a useful step to avoid sending around custom and poorly tested builds.
The speedup isn't enough to pass [0,4] but it is roughly 1.15Elo and a LOS of 90%:
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 93009 W: 20519 L: 20316 D: 52174
Also rewrite the documentation for the user when using `make --help`, so that
the order of architectures for x86-64 has the more performant build one on top.
Closes https://github.com/official-stockfish/Stockfish/pull/2300
No functional change
This patch greatly scales down complexity of endgames when the
following conditions are all true together:
- pawns are all on one flank
- stronger side king is not outflanking weaker side
- no passed pawns are present
This should improve stockfish evaluation of obvious draws 4 vs 3, 3 vs 2
and 2 vs 1 pawns in rook/queen/knight/bishop single flank endgames where
strong side can not make progress.
passed STC
LLR: 2.94 (-2.94,2.94) [0.50,4.50]
Total: 15843 W: 3601 L: 3359 D: 8883
passed LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 121275 W: 20107 L: 19597 D: 81571
Closes https://github.com/official-stockfish/Stockfish/pull/2298
Bench: 3954190
==========================
How to continue from there?
a) This could be a powerful idea for refining some parts of the evaluation
function, a bit like when we try quadratics or other equations to emphasize
certain situations (xoto10).
b) Some other combinaison values for this bonus can be done further, or
overall retuning of weight and offset while keeping the formula simple.
This is a simplification that integrated WeakLever into doubled pawns.
Since we already check for !support for Doubled pawns, it is trivial
to check for weak lever by just checking more_than_one(lever).
We also introduce the Score * bool operation overload to remove some
casts in the code.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26757 W: 5842 L: 5731 D: 15184
http://tests.stockfishchess.org/tests/view/5d77ee220ebc5902d384e5a4
Closes https://github.com/official-stockfish/Stockfish/pull/2295
No functional change
Maintain best move counter at the root and allow there only moves which has a counter
of zero for Late Move Reduction. For compensation only the first three moves are excluded
from Late Move Reduction per default instead the first four moves.
What we can further do:
- here we use a simple counting scheme but perhaps some aging to fade out early iterations
could be helpful
- use the best move counter also at inner nodes for LMR and/or pruning
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 17414 W: 3984 L: 3733 D: 9697
http://tests.stockfishchess.org/tests/view/5d6234bb0ebc5939d09f2aa2
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 38058 W: 6448 L: 6166 D: 25444
http://tests.stockfishchess.org/tests/view/5d62681a0ebc5939d09f2f27
Closes https://github.com/official-stockfish/Stockfish/pull/2282
Bench: 3568210
Remove one parameter in function evaluate_shelter(), making all
comparisons for castled/uncastled shelter locally in do_king_safety().
Also introduce BlockedStorm penalty.
Passed non-regression test at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65864 W: 14630 L: 14596 D: 36638
http://tests.stockfishchess.org/tests/view/5d5fc80c0ebc5939d09f0acc
No functional change
@Vizvezdenec array suggested that alternate values may be better than current
master (see pull request #2270 ). I tuned some linear equations to more closely
represent his values and it passed. These futility values seem quite sensitive,
so perhaps additional Elo improvements can be found here.
STC
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 12257 W: 2820 L: 2595 D: 6842
http://tests.stockfishchess.org/tests/view/5d5b2f360ebc5925cf1111ac
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 20273 W: 3497 L: 3264 D: 13512
http://tests.stockfishchess.org/tests/view/5d5c0d250ebc5925cf111ac3
Closes https://github.com/official-stockfish/Stockfish/pull/2272
------------------------------------------
How to continue from there ?
a) we can try a simpler version for the futility margin, this would
be a simplification :
margin = 188 * (depth - improving)
b) on the other direction, we can try a complexification by trying
again to gain Elo with an complete array of futility values.
------------------------------------------
Bench: 4330402
Replace calls to count(key) + operator[key] with a single call to find(key).
Replace the std::map with std::unordered_map which provide O(1) access,
although the map has a really small number of objects.
Test with [0..4] failed yellow:
TC 10+0.1
SPRT elo0: 0.00 alpha: 0.05 elo1: 4.00 beta: 0.05
LLR -2.96 [-2.94,2.94] (rejected)
Elo 1.01 [-0.87,3.08] (95%)
LOS 85.3%
Games 71860 [w:22.3%, l:22.2%, d:55.5%]
http://tests.stockfishchess.org/tests/view/5d5432210ebc5925cf109d61
Closes https://github.com/official-stockfish/Stockfish/pull/2269
No functional change
Non functional simplification when we find the passed pawns in pawn.cpp
and some code clean up. It also better follows the pattern "flag the pawn"
and "score the pawn".
-------------------------
The idea behind the third condition for candidate passed pawn is a little
bit difficult to visualize. Just for the record, the idea is the following:
Consider White e5 d4 against black e6. d4 can (in some endgames) push
to d5 and lever e6. Thanks to this sacrifice, or after d5xe6, we consider
e5 as "passed".
However:
- if White e5/d4 against black e6/c6: d4 cannot safely push to d5 since d5 is double attacked;
- if White e5/d4 against black e6/d5: d4 cannot safely push to d5 since it is occupied.
This is exactly what the following expression does:
```
&& (shift<Up>(support) & ~(theirPawns | dblAttackThem)))
```
--------------------------
http://tests.stockfishchess.org/tests/view/5d3325bb0ebc5925cf0e6e91
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 124666 W: 27586 L: 27669 D: 69411
Closes https://github.com/official-stockfish/Stockfish/pull/2255
No functional change
This exploits the recent fractional Skill Level, and is a result from some discussion in #2221 and the older #758.
Basically, if UCI_LimitStrength is set, it will internally convert UCI_Elo to a matching fractional Skill Level.
The Elo estimate is based on games at TC 60+0.6, Hash 64Mb, 8moves_v3.pgn, rated with Ordo, anchored to goldfish1.13 (CCRL 40/4 ~2000).
Note that this is mostly about internal consistency, the anchoring to CCRL is a bit weak, e.g. within this tournament,
goldfish and sungorus only have a 200Elo difference, their rating difference on CCRL is 300Elo.
I propose that we continue to expose 'Skill Level' as an UCI option, for backwards compatibility.
The result of a tournament under those conditions are given by the following table, where the player name reflects the UCI_Elo.
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(%)
1 Elo2837 : 2792.2 50.8 536.5 711 75 100
2 Elo2745 : 2739.0 49.0 487.5 711 69 100
3 Elo2654 : 2666.4 49.2 418.0 711 59 100
4 Elo2562 : 2604.5 38.5 894.5 1383 65 100
5 Elo2471 : 2515.2 38.1 651.5 924 71 100
6 Elo2380 : 2365.9 35.4 478.5 924 52 100
7 Elo2289 : 2290.0 28.0 864.0 1596 54 100
8 sungorus1.4 : 2204.9 27.8 680.5 1596 43 60
9 Elo2197 : 2201.1 30.1 523.5 924 57 100
10 Elo2106 : 2103.8 24.5 730.5 1428 51 100
11 Elo2014 : 2030.5 30.3 377.5 756 50 98
12 goldfish1.13 : 2000.0 ---- 511.0 1428 36 100
13 Elo1923 : 1928.5 30.9 641.5 1260 51 100
14 Elo1831 : 1829.0 42.1 370.5 756 49 100
15 Elo1740 : 1738.3 42.9 277.5 756 37 100
16 Elo1649 : 1625.0 42.1 525.5 1260 42 100
17 Elo1558 : 1521.5 49.9 298.0 756 39 100
18 Elo1467 : 1471.3 51.3 246.5 756 33 100
19 Elo1375 : 1407.1 51.9 183.0 756 24 ---
It can be observed that all set Elos correspond within the error bars with the observed Ordo rating.
No functional change
This is a functional simplification that removes the std::pow from reduction. The resulting reduction values are within 1% of master.
This is a simplification because i believe an fp addition and multiplication is much faster than a call to std::pow() which is historically slow and performance varies widely on different architectures.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23471 W: 5245 L: 5127 D: 13099
http://tests.stockfishchess.org/tests/view/5d27ac1b0ebc5925cf0d476b
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51533 W: 8736 L: 8665 D: 34132
http://tests.stockfishchess.org/tests/view/5d27b74e0ebc5925cf0d493c
Bench 3765158
This is another functional simplification to Stockfish passed pawn evaluation.
Stockfish evaluates some pawns which are not yet passed as "candidate" passed pawns, which are given half the bonus of fully passed ones. Prior to this commit, Stockfish considered a passed pawn to be a "candidate" if (a) it would not be a passed pawn if moved one square forward (the blocking square), or (b) there were other pawns (of either color) in front of it on the file. This latter condition used a fairly complicated method, forward_file_bb; here, rather than inspect the entire forward file, we simply re-use the blocking square. As a result, some pawns previously considered "candidates", but which are able to push forward, no longer have their bonus halved.
Simplification tests passed quickly at both STC and LTC. The results from both tests imply that this simplification is, most likely, additionally a small Elo gain, with a LTC likelihood of superiority of 87 percent.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12908 W: 2909 L: 2770 D: 7229
http://tests.stockfishchess.org/tests/view/5d2a1c880ebc5925cf0d9006
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20723 W: 3591 L: 3470 D: 13662
http://tests.stockfishchess.org/tests/view/5d2a21fd0ebc5925cf0d9118
Bench: 3377831
Current master code made sence when we had 2 types of bonuses for protected path to queen. But it was simplified so we have only one bonus now and code was never cleaned.
This non-functional simplification removes useless defendedsquares bitboard and removes one bitboard assignment (defendedSquares &= attackedBy[Us][ALL_PIECES] + defendedSquares & blockSq becomes just attackedBy[Us][ALL_PIECES] & blockSq also we never assign defendedSquares = squaresToQueen because we don't need it).
So should be small non-functional speedup.
Passed simplification SPRT.
http://tests.stockfishchess.org/tests/view/5d2966ef0ebc5925cf0d7659
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23319 W: 5152 L: 5034 D: 13133
bench 3361902
In Stockfish, both the middlegame and endgame bonus for a passed pawn are calculated as a product of two factors. The first is k, chosen based on the presence of defended and unsafe squares. The second is w, a quadratic function of the pawn's rank. Both are only applied if the pawn's relative rank is at least RANK_4.
It does not appear that the complexity of a quadratic function is necessary for w. Here, we replace it with a simpler linear one, which performs equally at both STC and LTC.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46814 W: 10386 L: 10314 D: 26114
http://tests.stockfishchess.org/tests/view/5d29686e0ebc5925cf0d76a1
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82372 W: 13845 L: 13823 D: 54704
http://tests.stockfishchess.org/tests/view/5d2980650ebc5925cf0d7bfd
Bench: 3328507
We recently added a bonus for double pawn attacks on unsupported enemy pawns,
on June 27. However, it is possible that the unsupported pawn may become a passer
by simply pushing forward out of the double attack. By rewarding double attacks,
we may inadvertently reward the creation of enemy passers, by encouraging both of
our would-be stoppers to attack the enemy pawn even if there is no opposing
friendly pawn on the same file.
Here, we revise this term to exclude passed pawns. In order to simplify the code
with this change included, we non-functionally rewrite Attacked2Unsupported to
be a penalty for enemy attacks on friendly pawns, rather than a bonus for our
attacks on enemy pawns. This allows us to exclude passed pawns with a simple
& ~e->passedPawns[Us], while passedPawns[Them] is not yet defined in this part
of the code.
This dramatically reduces the proportion of positions in which Attacked2Unsupported
is applied, to about a third of the original. To compensate, maintaining the same
average effect across our bench positions, we nearly triple Attacked2Unsupported
from S(0, 20) to S(0, 56). Although this pawn formation is rare, it is worth more
than half a pawn in the endgame!
STC: (stopped automatically by fishtest after 250,000 games)
LLR: -0.87 (-2.94,2.94) [0.50,4.50]
Total: 250000 W: 56585 L: 55383 D: 138032
http://tests.stockfishchess.org/tests/view/5d25795e0ebc5925cf0cfb51
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 81038 W: 13965 L: 13558 D: 53515
http://tests.stockfishchess.org/tests/view/5d25f3920ebc5925cf0d10dd
Closes https://github.com/official-stockfish/Stockfish/pull/2233
Bench: 3765158
PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.
Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.
On my POWER8 VM, this leads to a ~4.27% speedup.
Fir prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).
This leads to a ~5% speedup on my POWER8 VM.
No functional change
The current skill levels (1-20) allow for adjusting playing strengths, but
do so in big steps (e.g. level 10 vs level 11 is a ~143 Elo jump at STC).
Since the 'Skill Level' input can already be a floating point number, this
patch uses the fractional part of the input to provide the user with
fine control, allowing for varying the playing strength essentially
continuously.
The implementation internally still uses integer skill levels (needed since they pick Depths),
but non-deterministically rounds up or down the used skill level such that the average integer
skill corresponds to the input floating point one. As expected, intermediate
(fractional) skill levels yield intermediate playing strenghts.
Tested at STC, playing level 10 against levels between 10 and 11 for 10000 games
level 10.25 ELO: 24.26 +-6.2
level 10.5 ELO: 67.51 +-6.3
level 10.75 ELO: 98.52 +-6.4
level 11 ELO: 143.65 +-6.7
http://tests.stockfishchess.org/tests/view/5cd9c6b40ebc5925cf056791http://tests.stockfishchess.org/tests/view/5cd9d22b0ebc5925cf056989http://tests.stockfishchess.org/tests/view/5cd9cf610ebc5925cf056906http://tests.stockfishchess.org/tests/view/5cd9d2490ebc5925cf05698e
No functional change.
Initialization of larger hash sizes can take some time.
Don't include this time in the bench by resetting the timer after Search::clear().
Also move 'ucinewgame' command down in the list, so that it is processed
after the configuration of Threads and Hash size.
No functional change.
This is a functional change that rewards double attacks on an unsupported pawns.
STC (non-functional difference)
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 83276 W: 18981 L: 18398 D: 45897
http://tests.stockfishchess.org/tests/view/5d0970500ebc5925cf0a77d4
LTC (incomplete looping version)
LLR: 0.50 (-2.94,2.94) [0.00,3.50]
Total: 82999 W: 14244 L: 13978 D: 54777
http://tests.stockfishchess.org/tests/view/5d0a8d480ebc5925cf0a8d58
LTC (completed non-looping version).
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 223381 W: 38323 L: 37512 D: 147546
http://tests.stockfishchess.org/tests/view/5d0e80510ebc5925cf0ad320
Closes https://github.com/official-stockfish/Stockfish/pull/2205
Bench 3633546
----------------------------------
Comments by Alain SAVARD:
interesting result ! I would have expected that search would resolve such positions
correctly on the very next move. This is not a very common pattern, and when it happens,
it will quickly disappear. So I'm quite surprised that it passed LTC.
I would be even more surprised if this would resist a simplification.
Anyway, let's try to imagine a few cases.
a) If you have White d5 f5 against Black e6, and White to move
last move by Black was probably a capture on e6 and White is about to recapture on e6
b) If you have White d5 f5 against e6, and Black to move
last move by White was possibly a capture on d5 or f5
or the pawn on e6 was pinned or could not move for some reason.
and white wants to blast open the position and just pushed d4-d5 or f4-f5
Some possible follow-ups
a) Motif is so rare that the popcount() can be safely replaced with a bool()
But this would not pass a SPRT[0,4],
So try a simplification with bool() and also without the & ~theirAttacks
b) If it works, we probably can simply have this in the loop
if (lever) score += S(0, 20);
c) remove all this and tweak something in search for pawn captures (priority, SEE, extension,..)
-removes wideUnsafeSquares bitboard
-removes a couple of bitboard operations
-removes one if operator
-updates comments so they actually represent what this part of code is doing now.
passed non-regression STC
http://tests.stockfishchess.org/tests/view/5d0c1ae50ebc5925cf0aa8db
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16892 W: 3865 L: 3733 D: 9294
No functional change
Use comparison of eval with beta to predict potential cutNodes. This
allows multi-cut pruning to also prune possibly mislabeled Pv and NonPv
nodes.
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 54305 W: 12302 L: 11867 D: 30136
http://tests.stockfishchess.org/tests/view/5d048ba50ebc5925cf0a15e8
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 189512 W: 32620 L: 31904 D: 124988
http://tests.stockfishchess.org/tests/view/5d04bf740ebc5925cf0a17f0
Normally I would think such changes are risky, specially for PvNodes,
but after trying a few other versions, it seems this version is more
sound than I initially thought.
Aside from this, a small funtional change is made to return
singularBeta instead of beta to be more consistent with the fail-soft
logic used in other parts of search.
=============================
How to continue from there ?
We could try to audit other parts of the search where the "cutNode"
variable is used, and try to use dynamic info based on heuristic
eval rather than on this variable, to check if the idea behind this
patch could also be applied successfuly.
Bench: 3503788
Fixes issues #2126 and #2189 where no progress in rootDepth is made for particular fens:
8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46
the cause are the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, and progess can not be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.
The patch was tested as a bug fix and passed:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56601 W: 12633 L: 12581 D: 31387
http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52042 W: 8907 L: 8837 D: 34298
http://tests.stockfishchess.org/tests/view/5d0319420ebc5925cf09fe57
Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:
VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a
Bench: 3961247
Since root_probe() and root_probe_wdl() do not reset all tbRank values if they fail,
it is necessary to do this in rank_root_move(). This fixes issue #2196.
Alternatively, the loop could be moved into both root_probe() and root_probe_wdl().
No functional change
This is a non-functional simplification.
backmost_sq and frontmost_sq are redundant. It seems quite clear to always use frontmost_sq and use the correct color.
Non functional change.
Increase size of the pawns table by the factor 8. This decreases the number of recalculations of pawn structure information significantly (at least at LTC).
I have done measurements for different depths and pawn cache sizes.
First are given the number of pawn entry calculations are done (in parentheses is the frequency that a call to probe triggers a pawn entry calculation). The delta% are the percentage of less done pawn entry calculations in comparison to master
VSTC: bench 1 1 12
STC: bench 8 1 16
LTC: bench 64 1 20
VLTC: bench 512 1 24
VSTC STC LTC VLTC
master 82218(6%) 548935(6%) 2415422(7%) 9548071(7%)
pawncache*2 79859(6%) 492943(5%) 2084794(6%) 8275206(6%)
pawncache*4 78551(6%) 458758(5%) 1827770(5%) 7112531(5%)
pawncache*8 77963(6%) 439421(4%) 1649169(5%) 6128652(4%)
delta%(p2-m) -2.9% -10.2% -13.7% -13.3%
delta%(p4-m) -4.5% -16.4% -24.3% -25.5%
delta%(p8-m) -5.2% -20.0% -31.7% -35.8%
STC: (non-regression test because at STC the effect is smaller than at LTC)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22767 W: 5160 L: 5040 D: 12567
http://tests.stockfishchess.org/tests/view/5d00f6040ebc5925cf09c3e2
LTC:
LLR: 2.94 (-2.94,2.94) [0.00,4.00]
Total: 26340 W: 4524 L: 4286 D: 17530
http://tests.stockfishchess.org/tests/view/5d00a3810ebc5925cf09ba16
No functional change.
This is a functional simplification. This is NOT the exact version that was tested. Beyond the testing, an assignment was removed and a piece changes for consistency.
Instead of rewarding ANY square past an opponent pawn as an "outpost," only use squares that are protected by our pawn. I believe this is more consistent with what the chess world calls an "outpost."
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23540 W: 5387 L: 5269 D: 12884
http://tests.stockfishchess.org/tests/view/5cf51e6d0ebc5925cf08b823
LTC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 53085 W: 9271 L: 9204 D: 34610
http://tests.stockfishchess.org/tests/view/5cf5279e0ebc5925cf08b992
bench 3424592
Stockfish evaluates passed pawns in part based on a variable k, which shapes the passed pawn bonus based on the number of squares between the current square and promotion square that are attacked by enemy pieces, and the number defended by friendly ones. Prior to this commit, we gave a large bonus when all squares between the pawn and the promotion square were defended, and if they were not, a somewhat smaller bonus if at least the pawn's next square was. However, this distinction does not appear to provide any Elo at STC or LTC.
Where do we go from here? Many promising Elo-gaining patches were attempted in the past few months to refine passed pawn calculation, by altering the definitions of unsafe and defended squares. Stockfish uses these definitions to choose the value of k, so those tests interact with this PR. Therefore, it may be worthwhile to retest previously promising but not-quite-passing tests in the vicinity of this patch.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42344 W: 9455 L: 9374 D: 23515
http://tests.stockfishchess.org/tests/view/5cf83ede0ebc5925cf0904fb
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 69548 W: 11855 L: 11813 D: 45880
http://tests.stockfishchess.org/tests/view/5cf8698f0ebc5925cf0908c8
Bench: 3854907
This is a non-functional simplification. Since our file_bb handles either Files or Squares, using Square here removes some code. Not likely any performance difference despite the test.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6081 W: 1444 L: 1291 D: 3346
http://tests.stockfishchess.org/tests/view/5ceb3e2e0ebc5925cf07ab03
Non functional change.
We evaluate defended and unsafe squares for a passed pawn push based on friendly and enemy rooks and queens on the passed pawn's file. Prior to this patch, we further required that these rooks and queens be able to directly attack the passed pawn. However, this restriction appears unnecessary and worth almost exactly 0 Elo at LTC.
The simplified code allows rooks and queens to attack/defend the passed pawn through other pieces of either color.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 29019 W: 6488 L: 6381 D: 16150
http://tests.stockfishchess.org/tests/view/5cdcf7270ebc5925cf05d30c
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54224 W: 9200 L: 9133 D: 35891
http://tests.stockfishchess.org/tests/view/5cddc6210ebc5925cf05eca3
Bench: 3415326
This is a non-functional simplification that combines vote counting and thread selecting in the same loop.
It is possible that the best thread would be updated more frequently than master, but I'm not sure it matters here. Perhaps "mostVotes" is a better name than "bestVote?"
STC (stopped early).
LLR: 0.70 (-2.94,2.94) [-3.00,1.00]
Total: 10714 W: 2329 L: 2311 D: 6074
http://tests.stockfishchess.org/tests/view/5ccc71470ebc5925cf03d244
No functional change.
Treat all threads the same as main thread and increment
failedHighCnt on fail highs. This makes the search try
again at lower depth.
@vondele suggested also changing the reset of failedHighCnt
when there is a fail low. Tests including this passed so the
branch has been updated to include both changes. failedHighCnt
is now handled exactly the same in helper threads and the main
thread. Thanks vondele :-)
STC @ 5+0.05 th 4 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 7769 W: 1704 L: 1557 D: 4508
http://tests.stockfishchess.org/tests/view/5c9f19520ebc5925cfffd2a1
LTC @ 20+0.2 th 8 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 37888 W: 5983 L: 5889 D: 26016
http://tests.stockfishchess.org/tests/view/5c9f57d10ebc5925cfffd696
Bench 3824325
Similar to PSQT we only need one instance of the Endgames resource. The current per thread copies are identical and read only(after initialization) so from a design point of view it doesn't make sense to have them.
Tested for no slowdown.
http://tests.stockfishchess.org/tests/view/5c94377a0ebc5925cfff43ca
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17320 W: 3487 L: 3359 D: 10474
No functional change.
Store repetition info in StateInfo instead of recomputing it in
three different places. This saves some work in has_game_cycle()
where this info is needed for positions before the root.
Tested for non-regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34104 W: 7586 L: 7489 D: 19029
http://tests.stockfishchess.org/tests/view/5cd0676e0ebc5925cf044b56
No functional change.
High rootDepths, selDepths and generally searches are increasingly
common with long time control games, analysis, and improving hardware.
In this case, depths of MAX_DEPTH/MAX_PLY (128) can be reached,
and the search tree is truncated.
In principle MAX_PLY can be easily increased, except for a technicality
of storing depths in a signed 8 bit int in the TT. This patch increases
MAX_PLY by storing the depth in an unsigned 8 bit, after shifting by the
most negative depth stored in TT (DEPTH_NONE).
No regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42235 W: 9565 L: 9484 D: 23186
http://tests.stockfishchess.org/tests/view/5cdb35360ebc5925cf0595e1
Verified to reach high depths on
k1b5/1p1p4/pP1Pp3/K2pPp2/1P1p1P2/3P1P2/5P2/8 w - -
info depth 142 seldepth 154 multipv 1 score cp 537 nodes 26740713110 ...
No bench change.
This functional simplification removes the PvNode reduction and adjusts
the ttPv lmr condition accordingly. Their definitions only differ by the
inclusions of ttPv. Aside from this, shallow move pruning definition
will be the only other functional difference, but this does not seem to
matter too much.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58908 W: 12980 L: 12932 D: 32996
http://tests.stockfishchess.org/tests/view/5cd1aaaa0ebc5925cf046c6a
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20351 W: 3521 L: 3399 D: 13431
http://tests.stockfishchess.org/tests/view/5cd23fa70ebc5925cf047cd2
Bench: 3687854
In master search() may incorrectly return a draw score in the following
corner case: there was a 2-fold repetition during the game, and the
current position can be reached by a move from a repeated one. This case
is treated as an upcoming 3-fold repetition, which it is not.
Here is a testcase demonstrating the issue (note that the moves
after FEN are required). The input:
position fen 8/8/8/8/8/8/p7/2k4K b - - 0 1 moves c1b1 h1g1 b1c1 g1h1 c1b1 h1g1 b1a1 g1h1
go movetime 1000
produces the output:
[...]
info depth 127 seldepth 2 multipv 1 score cp 0 [...]
bestmove a1b1
saying that the game will be drawn by repetion. However the other possible
move for black, Kb2, avoids repetitions and wins. The patch fixes this behavior.
In particular it finds mate in 10 in the above position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 10604 W: 2390 L: 2247 D: 5967
http://tests.stockfishchess.org/tests/view/5cb373e00ebc5925cf0167bf
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19620 W: 3308 L: 3185 D: 13127
http://tests.stockfishchess.org/tests/view/5cb3822f0ebc5925cf016b2d
Bench is not changed since it does not test positions with history of moves.
Bench: 3184182
only the extension part of the shuffle patch is sufficient to
pass [0,3.5] bounds at VLTC as shown by two more tests.
http://tests.stockfishchess.org/tests/view/5cc168bc0ebc5925cf02bda8
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 120684 W: 17875 L: 17400 D: 85409
http://tests.stockfishchess.org/tests/view/5cc14d510ebc5925cf02bcb5
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 68415 W: 10250 L: 9905 D: 48260
this patch proposes to simplify back to this basic and easier to
understand version. In case there is a need to run a [-3, 1] VLTC on
this one, it can be done, but it is resource intensive, and not needed
IMO.
Bench: 3388643
Same idea as fisherman's knight protection.
passed STC
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 17133 W: 3952 L: 3701 D: 9480
http://tests.stockfishchess.org/tests/view/5cc3550b0ebc5925cf02dada
passed LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 37316 W: 6470 L: 6188 D: 24658
http://tests.stockfishchess.org/tests/view/5cc3721d0ebc5925cf02dc90
Looking at this 2 ideas being recent clean elo gainers I have a feeling that we can add also rook and queen protection bonuses or overall move this stuff in pieces loop in the same way as we do pieces attacking bonuses on their kingring... :) Thx fisherman for original idea.
Bench 3429173
Only called with Squares as argument, so remove unused variants.
As this is just syntax changes, only verified bench at high depth.
No functional change.
The "move" class variable is Movepick is removed (removes some abstraction) which saves a few assignment operations, and the effects of "filter" is limited to the current move (movePtr). The resulting code is a bit more verbose, but it is also more clear what is going on. This version is NOT tested, but is substantially similar to:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 29191 W: 6474 L: 6367 D: 16350
http://tests.stockfishchess.org/tests/view/5ca7aab50ebc5925cf006e50
This is a non-functional simplification.
We can remove the values in Pawns if we just use the piece arrays in Position. This reduces the size of a pawn entry. This simplification passed individually, and in concert with ps_passedcount100 (removes passedCount storage in pawns.).
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19957 W: 4529 L: 4404 D: 11024
http://tests.stockfishchess.org/tests/view/5cb3c2d00ebc5925cf016f0d
Combo STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17368 W: 3925 L: 3795 D: 9648
http://tests.stockfishchess.org/tests/view/5cb3d3510ebc5925cf01709a
This is a non-functional simplification.
It causes a serious regression hanging a simple fixed
depth search. Reproducible with:
position fen q1B5/1P1q4/8/8/8/6R1/8/1K1k4 w - - 0 1
go depth 13
The reason is a search tree explosion due to:
if (... && depth < 3 * ONE_PLY)
extension = ONE_PLY;
This is very dangerous code by itself because triggers **at the leafs**
and in the above position keeps extending endlessly. In normal games
time deadline makes the search to stop sooner or later, but in fixed
seacrch we just hang possibly for a very long time. This is not acceptable
because 'go depth 13' shall not be a surprise for any position.
This patch reverts commit 76f1807baa.
and fixes the issue https://github.com/official-stockfish/Stockfish/issues/2091
Bench: 3243738
Introduce a new search extension when pushing an advanced passed pawn is
also suggested by the first killer move. There have been previous tests
which have similar ideas, mostly about pawn pushes, but it seems to be
overkill to extend too many moves. My idea is to limit the extension to
when a move happens to be noteworthy in some other way as well, such as
in this case, when it is also a killer move.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 19027 W: 4326 L: 4067 D: 10634
http://tests.stockfishchess.org/tests/view/5cac2cde0ebc5925cf00c36d
LTC:
LLR: 2.94 (-2.94,2.94) [0.00,3.50]
Total: 93390 W: 15995 L: 15555 D: 61840
http://tests.stockfishchess.org/tests/view/5cac42270ebc5925cf00c4b9
For future tests, it looks like this will interact heavily with passed
pawn evaluation. It may be good to try more variants of some of the more
promising evaluations tests/tweaks.
Bench: 3666092
Instead of looping through kfrom,kto, rfrom, rto, we can use BetweenBB. This is less lines of code and it is more clear what castlingPath actually is. Personal benchmarks are all over the place. However, this code is only executed when loading a position, so performance doesn't seem that relevant.
No functional change.
The kingDanger term is intended to give a penalty which increases rapidly in the middlegame but less so in the endgame. To this end, the middlegame component is quadratic, and the endgame component is linear. However, this produces unintended consequences for relatively small values of kingDanger: the endgame penalty will exceed the middlegame penalty. This remains true up to kingDanger = 256 (a S(16, 16) penalty), so some of these inaccurate penalties are actually rather large.
In this patch, we increase the threshold for applying the kingDanger penalty to eliminate some of this unintended behavior. This was very nearly, but not quite, sufficient to pass on its own. The patch was finally successful by integrating a second kingDanger tweak by @Vizvezdenec, increasing the kingDanger constant term slightly and improving both STC and LTC performance.
Where do we go from here? I propose that in the future, any attempts to tune kingDanger coefficients should also consider tuning the kingDanger threshold. The evidence shows clearly that it should not be automatically taken to be zero.
Special thanks to @Vizvezdenec for the kingDanger constant tweak. Thanks also to all the approvers and CPU donors who made this possible!
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 141225 W: 31239 L: 30846 D: 79140
http://tests.stockfishchess.org/tests/view/5cabbdb20ebc5925cf00b86c
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 30708 W: 5296 L: 5043 D: 20369
http://tests.stockfishchess.org/tests/view/5cabff760ebc5925cf00c22d
Bench: 3445945
The sed command is a bit different in Mac OS X (why not!).
The ‘-i’ option required a parameter to tell what extension to add for the
backup file. To fix it, just add extension for backup file, for example ‘.bak’
Fix broken Trevis CI test
No functional change.
The current update only by main thread depends on the luck of
whether main thread sees any/many changes to the best move or not.
It then makes large, lumpy changes to the time to be
used (1x, 2x, 3x, etc) depending on that sample of 1.
Use the average across all threads to get a more reliable
number with a smoother distribution.
STC @ 5+0.05 th 4 :
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 51899 W: 11446 L: 11029 D: 29424
http://tests.stockfishchess.org/tests/view/5ca32ff20ebc5925cf0016fb
STC @ 5+0.05 th 8 :
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 13851 W: 2843 L: 2620 D: 8388
http://tests.stockfishchess.org/tests/view/5ca35ae00ebc5925cf001adb
LTC @ 20+0.2 th 8 :
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 48527 W: 7941 L: 7635 D: 32951
http://tests.stockfishchess.org/tests/view/5ca37cb70ebc5925cf001cec
Further work:
Similar changes might be possible for the fallingEval and timeReduction calculations (and elsewhere?), using either the min, average or max values across all threads.
Bench 3506898
Simplification which removes the pawns connected array.
Instead of storing the values in an array, the values are
calculated real-time. This is about 1.6% faster on my machines.
Performance:
master ave nps: 159,248,672
patch ave nps: 161,905,592
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20363 W: 4579 L: 4455 D: 11329
http://tests.stockfishchess.org/tests/view/5c9925ba0ebc5925cfff79a6
Non functional change.
Shuffle detection procedure :
Shuffling positions are detected if
the last 36 moves are reversible (rule50_count() > 36),
the position have been already in the TT,
there is a still a pawn on the board (to avoid special endings like KBN vs K).
The position is then judged as a draw.
An extension is realized if we already made 14 successive reversible moves in PV to accelerate the detection of the eventual draw.
To go further : we can still improve the idea. The length of the tests need a lot of ressources.
the limit of 36 is logic but must be checked again for special zugzwang positions,
this limit can be decreased in special positions,
the limit of 14 moves for extension has not been tuned.
STC
LLR: -2.94 (-2.94,2.94) [0.50,4.50]
Total: 32595 W: 7273 L: 7275 D: 18047 Elo +0.43
http://tests.stockfishchess.org/tests/view/5c90aa330ebc5925cfff1768
LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 51249 W: 8807 L: 8486 D: 33956 Elo +1.85
http://tests.stockfishchess.org/tests/view/5c90b2450ebc5925cfff1800
VLTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 137974 W: 20503 L: 19983 D: 97488 Elo +1.05
http://tests.stockfishchess.org/tests/view/5c9243a90ebc5925cfff2a93
Bench: 3548313
Adding a clamp function makes some of these range limitations a bit prettier and removes some #include's.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28117 W: 6300 L: 6191 D: 15626
http://tests.stockfishchess.org/tests/view/5c9aa1df0ebc5925cfff8fcc
Non functional change.
always use the implementation of gives_check in position, no need to
hand-inline part of the implementation in search.
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 57895 W: 12632 L: 12582 D: 32681
http://tests.stockfishchess.org/tests/view/5c9eaa4b0ebc5925cfffc9e3
No functional change.
This is a non-functional code style change.
If we add an accessor function for SquareBB we can consolidate all of the asserts. This is also a bit cleaner because all SquareBB accesses go through this method making future changes easier to manage.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 63406 W: 14084 L: 14045 D: 35277
http://tests.stockfishchess.org/tests/view/5c9ea6100ebc5925cfffc9af
No functional change.
This is covered by the line just before. If we would like to protect
against the piece value of e.g. a N == B, this could be done by an
assert, no need to do this at runtime.
No functional change.
This is a non-functional simplification/speedup.
The truth-table for popcount(support) >= popcount(lever) - 1 is:
------------------lever
------------------0-------1---------2
support--0------X-------X---------0
-----------1------X-------X---------X
-----------2------X-------X---------X
Thus, it is functionally equivalent to just do: support || !more_than_one(lever) which removes the expensive popcounts and the -1.
Result of 20 runs:
base (...h_master.exe) = 1451680 +/- 8202
test (./stockfish ) = 1454781 +/- 8604
diff = +3101 +/- 931
STC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 35424 W: 7768 L: 7674 D: 19982
Http://tests.stockfishchess.org/tests/view/5c970f170ebc5925cfff5e28
No functional change.
While looking at pruning using see_ge() (which is very valuable)
it became apparent that the !extension test is not adding any
value - simplify it away.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 56843 W: 12621 L: 12569 D: 31653
http://tests.stockfishchess.org/tests/view/5c8588cb0ebc5925cffe77f4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 78622 W: 13223 L: 13195 D: 52204
http://tests.stockfishchess.org/tests/view/5c8611cc0ebc5925cffe7f86
Further work could be to optimize the remaining see_ge() test. The idea of less pruning at higher depths is valuable, but perhaps the test (-PawnValueEg * depth) can be improved.
Bench: 3188688
On OS X threads other than the main thread are created with a reduced stack
size of 512KB by default, this is dangerously low for deep searches, so
adjust it to TH_STACK_SIZE. The implementation calls pthread_create() with
proper stack size parameter.
Verified for no regression at STC enabling the patch on all platforms where
pthread is supported.
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 50873 W: 9768 L: 9700 D: 31405
No functional change.
This is a non-functional simplification / code-style change.
This popcount16 method does nothing but initialize the PopCnt16 arrays.
This can be done in a single bitset line, which is less lines and more clear. Performance for this code is moot.
No functional change.
This is a functional simplification that removes the FutilityMoveCounts array with a simple equation using only ints.
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14175 W: 3123 L: 2987 D: 8065
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9900 W: 1735 L: 1597 D: 6568
Bench: 3380343
This is a non-functional patch which shrinks the reductions array.
This saves about 8Kb of memory.
The only slow part of master's reductions array is the calculation
of the log values, so using a separate array for those values and
calculating the rest real-time appears to be just as fast as master.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 63245 W: 13906 L: 13866 D: 35473
http://tests.stockfishchess.org/tests/view/5c7b571f0ebc5925cffdc104
No funcional change.
This removes the skipQuiets variable, as was done in an earlier round by
@protonspring, but fixes an oversight which led to wrong mate
announcements. Quiets can only be pruned when there is no mate score, so
set moveCountPruning at the right spot.
tested as a fix at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66321 W: 14690 L: 14657 D: 36974
http://tests.stockfishchess.org/tests/view/5c74f3170ebc5925cffd4b3c
and as the full patch at LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25903 W: 4341 L: 4228 D: 17334
http://tests.stockfishchess.org/tests/view/5c7540030ebc5925cffd506f
Bench: 3292342
This is a somewhat different patch. It fixes blindspots for
two knights vs pawn endgame.
With local testing starting from random KNNvKP positions where the
pawn has not advanced beyond the 4th rank (thanks @protonspring !)
at 15+0.15 (4 cores), this went +105=868-27 against master. All except
two losses were won in reverse.
The heuristic is simple but effective - the strategy in these endgames
is to push the opposing king to the corner, then move the knight that's
blocking the pawn in for the checkmate while the pawn is free to move
and prevents stalemate. This patch gives SF the little boost it needs
to search the relevant king-cornering mating lines.
See the discussion in pull request 1939 for some more good results for
this test in independant tests:
https://github.com/official-stockfish/Stockfish/pull/1939
Bench: 3310239
This is a functional simplification of the Outposts array
moving it to a single value. This is a duplicate PR because
I couldn't figure out how to fix the original one.
The idea is from @31m059 with formatting recommendations by @snicolet.
See #1940 for additional information.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23933 W: 5279 L: 5162 D: 13492
http://tests.stockfishchess.org/tests/view/5c3575800ebc596a450c5ecb
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41718 W: 6919 L: 6831 D: 27968
http://tests.stockfishchess.org/tests/view/5c358c440ebc596a450c6117
bench 3783543
This is non-functional. These 5 arrays are simple enough to calculate real-time and maintaining an array for them does not help. Decreases the memory footprint.
This seems a tiny bit slower on my machine, but passed STC well enough. Could someone verify speed?
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 44745 W: 9780 L: 9704 D: 25261
http://tests.stockfishchess.org/tests/view/5c47aa2d0ebc5902bca13fc4
The slowdown is minimal even in 32 bit case (thanks to @mstembera for testing):
Compiled using make build ARCH=x86-32 CXX=i686-w64-mingw32-c++ and benched
This patch only:
```
Results for 40 tests for each version:
Base Test Diff
Mean 1455204 1450033 5171
StDev 49452 34533 59621
p-value: 0.465
speedup: -0.004
```
No functional change.
A simple idea, but it makes sense: in current master the search is extended
for checks that are considered somewhat safe, and for for this we use the
static exchange evaluation which only considers the `to_sq` of a move.
This is not reliable for discovered checks, where another piece is giving
the check and is arguably a more dangerous type of check. Thus, if the check
is a discovered check, the result of SEE is not relevant and can be ignored.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 29370 W: 6583 L: 6274 D: 16513
http://tests.stockfishchess.org/tests/view/5c5062950ebc593af5d4d9b5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 227341 W: 37972 L: 37165 D: 152204
http://tests.stockfishchess.org/tests/view/5c5094fb0ebc593af5d4dc2c
Bench: 3611854
There was a simplification attempt last week for the tropism
term in king danger, which passed STC but failed LTC. This
was an indirect sign that maybe the tropism factor was sightly
untuned in current master, so we tried to change it from 1/4
to 5/16.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 28098 W: 6264 L: 5990 D: 15844
http://tests.stockfishchess.org/tests/view/5c518db60ebc593af5d4e306
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 103709 W: 17387 L: 16923 D: 69399
http://tests.stockfishchess.org/tests/view/5c52a5510ebc592fc7baea8b
Bench: 4016000
Remove overlapping safe checks from kingdanger:
- rook and queen checks from the same square: rook check is preferred
- bishop and queen checks form the same square: queen check is preferred
Increase bishop and rook check values as a compensation.
STC
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 27480 W: 6111 L: 5813 D: 15556
http://tests.stockfishchess.org/tests/view/5c521d050ebc593af5d4e66a
LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 78500 W: 13145 L: 12752 D: 52603
http://tests.stockfishchess.org/tests/view/5c52b9460ebc592fc7baecc5
Closes https://github.com/official-stockfish/Stockfish/pull/1983
------------------------------------------
I have quite a few ideas of how to improve this patch.
- actually rethinking it now it will maybe be useful to discount
queen/bishop checks if there is only one square that they can
give check from and it's "occupied" by more valuable check. Right
now count of this squares does not really matter.
- maybe some small extra bonus can be given for overlapping checks.
- some ideas about using popcount() on safechecks can be retried.
- tune this safecheck values since they were more or less randomly handcrafted in this patch.
Bench: 3216489
This patch removes line 875 of search.cpp, which was updating pvHit after IID.
Bench testing at depth 22 shows that line 875 of search.cpp never changes the
value of pvHit at NonPV nodes, while at PV nodes it often changes the value
from true to false (and never the reverse). This is because the definition of
pvHit at line 642 is :
```
pvHit = (ttHit && tte->pv_hit()) || (PvNode && depth > 4 * ONE_PLY);
```
while the assignment after IID omits the ` (PvNode && depth > 4 * ONE_PLY) `
condition. As such, unlike the other two post-IID tte reads, this line of code
does not make SF's state more consistent, but rather introduces an inconsistency
in the definition of pvHit. Indeed, changing line 875 read
```
pvHit = (ttHit && tte->pv_hit()) || (PvNode && depth > 4 * ONE_PLY);
```
to match line 642 is functionally equivalent to removing the line entirely, as
this patch does.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 62756 W: 13787 L: 13746 D: 35223
http://tests.stockfishchess.org/tests/view/5c446c850ebc5902bb5d4b75
LTC
LLR: 3.19 (-2.94,2.94) [-3.00,1.00]
Total: 61900 W: 10179 L: 10111 D: 41610
http://tests.stockfishchess.org/tests/view/5c45bf610ebc5902bb5d5d62
Bench: 3796134
This changes 2 parts with regards to static exchange evaluation.
Currently, we do not allow pinned pieces to recapture if *all* opponent
pinners are still in their starting squares. This changes that to having
a less strict requirement, checking if *any* pinners are still in their
starting square. This makes our SEE give more respect to the pinning
side with regards to exchanges, which makes sense because it helps our
search explore more tactical options.
Furthermore, we change the logic for saving pinners into our state
variable when computing slider_blockers. We will include double pinners,
where two sliders may be looking at the same blocker, a similar concept
to our mobility calculation for sliders in our evaluation section.
Interestingly, I think SEE is the only place where the pinners bitboard
is actually used, so as far as I know there are no other side effects
to this change.
An example and some insights:
White Bf2, Kg1
Black Qe3, Bc5
The move Qg3 will be given the correct value of 0. (Previously < 0)
The move Qd4 will be incorrectly given a value of 0. (Previously < 0)
It seems the tradeoff in search is worth it. Qd4 will likely be pruned
soon by something like probcut anyway, while Qg3 could help us spot
tactics at an earlier depth.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 62162 W: 13879 L: 13408 D: 34875
http://tests.stockfishchess.org/tests/view/5c4ba1a70ebc593af5d49c55
LTC: (Thanks to @alayant)
LLR: 3.40 (-2.94,2.94) [0.00,3.50]
Total: 140285 W: 23416 L: 22825 D: 94044
http://tests.stockfishchess.org/tests/view/5c4bcfba0ebc593af5d49ea8
Bench: 3937213
This patch saves (4-1) * 64 * 64 = 12KiB of cache.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 176120 W: 38944 L: 38087 D: 99089
http://tests.stockfishchess.org/tests/view/5c4c9f840ebc593af5d4a7ce
LTC
As a pure speed up, I've been informed it should not require LTC.
No functional change
stopOnPonderhit is used to stop search quickly on a ponderhit. It is set by mainThread as part of its time management. However, master employs it as a signal between mainThread and the UCI thread. This is not necessary, it is sufficient for the UCI thread to signal that pondering finished, and mainThread should do its usual time-keeping job, and in this case stop immediately.
This patch implements this, removing stopOnPonderHit as an atomic variable from the ThreadPool,
and moving it as a normal variable to mainThread, reducing its scope. In MainThread::check_time() the search is stopped immediately if ponder switches to false, and the variable stopOnPonderHit is set.
Furthermore, ponder has been moved to mainThread, as the variable is only used to exchange signals between the UCI thread and mainThread.
The version has been tested locally (as fishtest doesn't support ponder):
Score of ponderSimp vs master: 2616 - 2528 - 8630 [0.503] 13774
Elo difference: 2.22 +/- 3.54
which indicates no regression.
No functional change.
Small changes in initiative(). For Pawn PSQT, endgame values for d6-e6 and d7-e7 are now symmetric. The MG value of d2 is now smaller than e2 (d2=13, e2=21 now compared to d2=19, e2=16 before). The MG values of h5-h6-h7 also increased so this might encourage stockfish for more h-pawn pushes.
STC
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 81141 W: 17933 L: 17777 D: 45431
http://tests.stockfishchess.org/tests/view/5c4017350ebc5902bb5cf237
LTC
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 83078 W: 13883 L: 13466 D: 55729
http://tests.stockfishchess.org/tests/view/5c40763f0ebc5902bb5cff09
Bench: 3266398
This is a non-functional simplification that removes the AdjacentFiles array.
This array is simple enough to calculate that the pre-calculated array provides
no benefit. Reduces the memory footprint.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74839 W: 16390 L: 16373 D: 42076
http://tests.stockfishchess.org/tests/view/5c3d75920ebc596a450cfb67
No functionnal change
If we define dcCandidates with & pawnsNotOn7,
we don't have to & it both times.
This seems more clear to me as well.
Tested for no regression.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 44042 W: 9663 L: 9585 D: 24794
http://tests.stockfishchess.org/tests/view/5c21d9120ebc5902ba12e84d
No functional change.
The new form is likely to trigger a bit more at LTC. Given that LTC
appears to be an improvement, I think that is fine.
The change is not very invasive: it does the same as before, use
potentially less time for moves that are very stable. Most of the
time, the full bonus was given if the bonus was given, so the gradual
part {3, 4, 5} didn't matter much. Whereas previously 'stable' was
expressed as the last 80% of iterations are the same, now I use a
fixed depth (10 iterations). For TCEC style TC, it will presumably
imply some more moves that are played quicker (and thus more time
on the clock when it potentially matters). Note that 10 iterations
of stability means we've been proposing that move for 99.9% of search
time.
passed STC
http://tests.stockfishchess.org/tests/view/5c30d2290ebc596a450c055b
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70921 W: 15403 L: 15378 D: 40140
passed LTC
http://tests.stockfishchess.org/tests/view/5c31ae240ebc596a450c1881
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17422 W: 2968 L: 2842 D: 11612
No functional change.
Introducing new concept, saving principal lines into the transposition table
to generate a "critical search tree" which we can reuse later for intelligent
pruning/extension decisions.
For instance in this patch we just reduce reduction for these lines. But a lot
of other ideas are possible.
To go further : tune some parameters, how to add or remove lines from the
critical search tree, how to use these lines in search choices, etc.
STC :
LLR: 2.94 (-2.94,2.94) [0.50,4.50]
Total: 59761 W: 13321 L: 12863 D: 33577 +2.23 ELO
http://tests.stockfishchess.org/tests/view/5c34da5d0ebc596a450c53d3
LTC :
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 26826 W: 4439 L: 4191 D: 18196 +2.9 ELO
http://tests.stockfishchess.org/tests/view/5c35ceb00ebc596a450c65b2
Special thanks to Miguel Lahoz for his help in transposition table in/out.
Bench: 3399866
This was inspired after reading about
[Multi-Cut](https://www.chessprogramming.org/Multi-Cut).
We now do non-singular cut node pruning. The idea is to prune when we
have a "backup plan" in case our expected fail high node does not fail
high on the ttMove.
For singular extensions, we do a search on all other moves but the
ttMove. If this fails high on our original beta, this means that both
the ttMove, as well as at least one other move was proven to fail high
on a lower depth search. We then assume that one of these moves will
work on a higher depth and prune.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 72952 W: 16104 L: 15583 D: 41265
http://tests.stockfishchess.org/tests/view/5c3119640ebc596a450c0be5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 27103 W: 4564 L: 4314 D: 18225
http://tests.stockfishchess.org/tests/view/5c3184c00ebc596a450c1662
Bench: 3145487
This addresses partially issue #1911 in that it documents in our
Readme the command that users can use to verifying the md5sum of
their downloaded tablebase files.
Additionally, a quick check of the file size (the size of each
tablebase file modulo 64 is 16 as pointed out by @syzygy1) has been
implemented at launch time in Stockfish.
Closes https://github.com/official-stockfish/Stockfish/pull/1927
and https://github.com/official-stockfish/Stockfish/issues/1911
No functional change.
Delay legality check of castling moves at search time,
just before making the move, as is the standard with all
the other move types.
This should avoid an useless and not trivial legality check
when the castling is then not tried later. For instance due
to a previous cut-off.
The patch is also a big simplification and allows to entirely
remove generate_castling()
Bench changes due to a different move sequence out of MovePicker.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 45073 W: 9918 L: 9843 D: 25312
http://tests.stockfishchess.org/tests/view/5c2f176f0ebc596a450bdfb3
LTC:
LLR: 3.15 (-2.94,2.94) [-3.00,1.00]
Total: 10156 W: 1707 L: 1560 D: 6889
http://tests.stockfishchess.org/tests/view/5c2e7dfd0ebc596a450bcdf4
Verified with perft both in standard and Chess960 cases.
Closes https://github.com/official-stockfish/Stockfish/pull/1929
Bench: 3559104
A single popcount in evaluate.cpp replaces all openFiles stuff in pawns. It doesn't seem to affect performance at all.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28103 W: 6134 L: 6025 D: 15944
http://tests.stockfishchess.org/tests/view/5b7d70a20ebc5902bdbb1999
No functional change.
This custom predicate filter creates an unnecessary abstraction layer, but doesn't make the code any more readable. The code is clear enough without it.
No functional change.
The extra condition is used as a shortcut to skip the following 3 assignments:
```C++
Bitboard b1 = shift<UpRight>(pawnsOn7) & enemies;
Bitboard b2 = shift<UpLeft >(pawnsOn7) & enemies;
Bitboard b3 = shift<Up >(pawnsOn7) & emptySquares;
```
In case of EVASION with no target on 8th rank (the common case), we end up performing the 3 statements for nothing because b1 = b2 = b3 = 0.
But this is just a small micro-optimization and the condition is quite confusing, so just remove it and prefer a readable code instead.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 78020 W: 16978 L: 16967 D: 44075
http://tests.stockfishchess.org/tests/view/5c27b4fe0ebc5902ba135bb0
No functional change.
I tried to improve the Readme because many people in my local
chess club do not understand some of the UCO options properly.
Starting point of this was Cfish's Readme by Ronald de Man,
some internet resources and the Stockfish code itself.
Closes https://github.com/official-stockfish/Stockfish/pull/1898
Initial commit by user @erbsenzaehler, with help from users
Adrian Petrescu, @alayan-stk-2 and Elvin Liu.
No functional change
Co-Authored-By: Alayan-stk-2 <alayan-stk-2@users.noreply.github.com>
Co-Authored-By: Adrian Petrescu <apetresc@gmail.com>
Co-Authored-By: Elvin Liu <solarlight2@users.noreply.github.com>
Recent tests by @xoto10, @Vizvezdenec, and myself seemed to hint that Elo could
be gained by expanding the number of cases where king safety is applied. Several
users (@Spliffjiffer, @Vizvezdenec) have anticipated benefits specifically in
evaluation of tactics. It appears that we actually do not need to restrict the
cases in which we initialize and evaluate king safety at all: initializing and
evaluating it in every position appears roughly Elo-neutral at STC and possibly
a substantial Elo gain at LTC.
Any explanation for this scaling is, at this point, conjecture. Assuming it is
not due to chance, my hypothesis is that initialization of king safety in all
positions is a mild slowdown, offset by an Elo gain of evaluating king safety
in all positions. At STC this produces Elo gains and losses that offset each
other, while at longer time control the slowdown is much less important, leaving
only the Elo gain. It probably helps SF to explore king attacks much earlier in
search with high numbers of enemy pieces concentrating but not essentially attacking
king ring.
Thanks to @xoto10 and @Vizvezdenec for helping run my LTC!
Closes https://github.com/official-stockfish/Stockfish/pull/1906
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35432 W: 7815 L: 7721 D: 19896
http://tests.stockfishchess.org/tests/view/5c24779d0ebc5902ba131b26
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12887 W: 2217 L: 2084 D: 8586
http://tests.stockfishchess.org/tests/view/5c25049a0ebc5902ba132586
Bench: 3163951
------------------
How to continue from there?
* Next step will be to tune all the king danger terms once more after that :-)
When iterating through 'SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX' structure, do not use structure member beyond known size.
API is guaranteed to provide us at lease one element upon successful, and no element in the structure can have a zero size.
No functional change.
This pull request fixes a rare crashing bug on Windows inside our NUMA code, first
reported by Dann Corbit in the following forum thread (thanks!):
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/gA6aoMEuOwg
The fix is to not use structure member beyond known size when iterating through
'SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX' structure. We note that the Microsoft
API is guaranteed to provide us at least one element upon successful, and no
element in the structure can have a zero size.
No functional change.
The `&& (ss-1)->killers[0] ` conditions are there seemingly to protect
accessing ss-5.
This is unneeded and not so intuitive (as the killer is checked for equality
with currentMove, and that one is non-zero once we're high enough in the stack,
this protects access to ss-5). We can just extend the stack from ss-4 to ss-5,
so we can call update_continuation_histories(ss-1, ..) always in search.
This goes a bit further than #1881 and addresses a comment in #1878.
passed STC:
http://tests.stockfishchess.org/tests/view/5c1aa8d50ebc5902ba127ad0
LLR: 3.12 (-2.94,2.94) [-3.00,1.00]
Total: 53515 W: 11734 L: 11666 D: 30115
passed LTC:
http://tests.stockfishchess.org/tests/view/5c1b272c0ebc5902ba12858d
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 140176 W: 23123 L: 23192 D: 93861
Bench: 3451321
Even when playing without endgame table bases, this particular endgame should
be a win 100% of the time when Stockfish is given a KRBK position, assuming
there are enough moves remaining in the FEN to finish the game without hitting
the 50 move rule.
PROBLEM: The issue with master here is that the PushClose difference per square
is 20, however, the difference in squares for the PushToCorners array is usually
less. Thus, the engine prefers to move the kings closer together rather than pushing
the weak king to the correct corner.
What happens is if the weak king is in a safe corner, SF still prefers pushing the
kings together. Occasionally, the strong king traps the weak king in the safe corner.
It takes a while for SF to figure it out, but often draws the game by the 50 move rule
(on shorter time controls).
This patch increases the PushToCorners values to correct this problem. We also added
an assert to catch any overflow problem if anybody would want to increase the array
values again in the future.
It was tested in a couple of matches starting with random KRBK positions and showed
increased winning rates, see https://github.com/official-stockfish/Stockfish/pull/1877
No functional change
* Update our continuous integration machinery
Ubuntu 16.04 can now be used with travis. Updating all the other stuff
when there.
Invoking the lld linker seems to save 5 minutes with clang on linux.
No functional change.
* fix
* Use a bit less code to calculate hashfull(). Change post increment to preincrement as is more standard
in the rest of stockfish. Scale result so we report 100% instead of 99.9% when completely full.
No functional change.
Although this is a compile-time constant, we stick the castlingSide into a CastlingRight, then pull it out again. This seems unecessarily complex.
No functional change.
We do not need to change the winnerKSq variable, so we can simplify
a little bit the logic of the code by changing only the loserKSq
variable when it is necessary. Also consolidate and clarify comments.
See the pull request thread for a proof that the code is correct:
https://github.com/official-stockfish/Stockfish/pull/1854
No functional change
~stronglyProtected is quite similar to ~attackedBy[Them][PAWN] & ~attackedBy2[Them],
the only difference appears to be that the former includes squares attacked twice
by both sides. The resulting logic is simpler, and the change appears to be at least
Elo-neutral at both STC and LTC.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35924 W: 7978 L: 7885 D: 20061
http://tests.stockfishchess.org/tests/view/5c14a5c00ebc5902ba11ed72
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 37078 W: 6125 L: 6030 D: 24923
http://tests.stockfishchess.org/tests/view/5c14ae880ebc5902ba11eed8
Bench: 3646542
this patch fixes a rare but reproducible segfault observed playing a
multi-threaded match, it is discussed somewhat in fishcooking.
From the core file, it could be observed that the issue was in qsearch, namely:
````
ss->pv[0] = MOVE_NONE;
````
and the backtrace shows the it arrives there via razoring, called from the rootNode:
````
(gdb) bt
alpha=-19, beta=682, depth=DEPTH_ZERO) at search.cpp:1247
````
Indeed, ss->pv can indeed by a nullptr at the rootNode. However, why is the
segfault so rare ?
The reason is that the condition that guards razoring:
````
(depth < 2 * ONE_PLY && eval <= alpha - RazorMargin)
````
is almost never true, since at the root alpha for depth < 5 is -VALUE_INFINITE.
Nevertheless with the new failHigh scheme, this is not guaranteed, and rootDepth > 5,
can still result in a depth < 2 search at the rootNode. If now another thread,
via the hash, writes a new low eval to the rootPos qsearch can be entered.
Rare but not unseen... I assume that some of the crashes in fishtest recently
might be due to this.
Closes https://github.com/official-stockfish/Stockfish/pull/1860
No functional change
As noticed in the forum, a crash in extract_ponder_from_tt could result
if hash size is set before the ponder move is printed. While it is arguably
a GUI issue (but it got me on the cli), it is easy to avoid this issue.
Closes https://github.com/official-stockfish/Stockfish/pull/1856
No functional change.
On November 30th, @xoto10 experimented with removing this threshold,
but the simplification barely failed LTC. I was inspired to try various
[0, 4] tweaks to increase its value, which would narrow the effects of
this threshold without removing it entirely. Various values repeatedly
led to Elo gains at both STC and LTC, most of which were insufficient
to pass.
After a couple of weeks, I tried again to find an Elo-gaining tweak
but noticed that I could raise the threshold higher and higher without
regression. I decided to try removing it entirely--forgetting that
@xoto10 had already attempted this. However, this now performs much
better at both STC and LTC, producing a STC Elo gain and also potentially
a smaller LTC one.
The reason appears to be a recent change in master (e8ffca3) near
this code, which interacts with this patch. This simplification
governs the conditions under which that patch's effects are applied.
Something non-obvious about that change has significantly improved
the performance of this simplification.
I recognize and thank @xoto10, who originally had this idea. Since
I ran several LTCs recently (to determine whether to open this PR,
or one for a related [0, 4]), I would also like to acknowledge the
other developers and CPU donors for their patience. Thank you all!
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13445 W: 3000 L: 2862 D: 7583
http://tests.stockfishchess.org/tests/view/5c11f01b0ebc5902ba11a6b8
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33868 W: 5663 L: 5563 D: 22642
http://tests.stockfishchess.org/tests/view/5c11ffe90ebc5902ba11a8a9
Closes https://github.com/official-stockfish/Stockfish/pull/1870
Bench: 3343286
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13323 W: 3015 L: 2818 D: 7490
http://tests.stockfishchess.org/tests/view/5c00a2520ebc5902bcedd41b
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 52294 W: 9093 L: 8756 D:34445
http://tests.stockfishchess.org/tests/view/5c00b2c40ebc5902bcedd596
Some obvious followups to this are to further tune this PSQT, or
try 8x8 for other pieces. As of now I don't plan on trying this
for other pieces as I think the majority of the ELO it brings is
for pawns and kings.
Looking at the new values, the differences between kingside and
queenside are quite significant. I am very hopeful that this a
llows SF to understand and plan pawn structures even better than
it already does. Cheers!
Closes https://github.com/official-stockfish/Stockfish/pull/1839
Bench: 3569243
I've gone through the RENAME/REFORMATTING thread and changed everything I could find, plus a few more. With this, let's close the previous issue and open another.
No functional change.
This reverts commit 33d9548218 ,
which crashed in DEBUG mode because of the following assert in position.h
````
Assertion failed: (is_ok(m)), function capture, file ./position.h, line 369.
````
No functional change
Remove the F[] array which I find unhelpful and rename `improvingFactor` to
`fallingEval` since larger values indicate a falling eval and more time use.
I realise a test was not strictly necessary, but I ran STC [-3,1] just to
check there are no foolish errors before creating the pull request:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 35804 W: 7753 L: 7659 D: 20392
http://tests.stockfishchess.org/tests/view/5bef3a0c0ebc595e0ae39c19
It was then suggested to clean the constants around `fallingEval`
to make it more clear this is a factor around ~1 that adjusts time
up or downwards depending on some conditions. We then ran a double
test with this simplification suggestion:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 68435 W: 14936 L: 14906 D: 38593
http://tests.stockfishchess.org/tests/view/5c02c56b0ebc5902bcee0184
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37258 W: 6324 L: 6230 D: 24704
http://tests.stockfishchess.org/tests/view/5c030a520ebc5902bcee0a32
No functional change
Exclude doubly protected by pawns squares when calculating attackers on
king ring. Idea of this patch is not to count attackers if they attack
only squares that are protected by two pawns.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 70040 W: 15476 L: 15002 D: 39562
http://tests.stockfishchess.org/tests/view/5c0354860ebc5902bcee1106
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 16530 W: 2795 L: 2607 D: 11128
http://tests.stockfishchess.org/tests/view/5c0385080ebc5902bcee14b5
This is third king safety patch in recent times so we probably need
retuning of king safety parameters.
Bench: 3057978
Official release version of Stockfish 10.
This is also the 10th anniversary version of the Stockfish project, which
started exactly ten years ago! I wish to extend a huge thank you to
all contributors and authors in our amazing community :-)
Bench: 3939338
The patch was tested for correctness by running bench with and
without the change against current master, and the tablebase hit
numbers were found to be identical in both cases. See the pull
request comments for details:
https://github.com/official-stockfish/Stockfish/pull/1826
No functional change.
On November 16th, before the removal of the depth condition, I tried
revising castling extensions to only handle castling moves, rather than
moves that change castling rights generally. It appeared to be a slight
Elo gain at STC but insufficient to pass [0, 4] (+0.5 Elo), but what I
overlooked was that it made pos.can_castle(us) irrelevant and should
have been a simplification. Recent discussion with @Chess13234 and
Michael Chaly (@Vizvezdenec) inspired me to take a second look, and
the simplification continues to pass when rebased on the current master.
This replaces two conditions with one, because type_of(move) == CASTLING
implies pos.can_castle(Us), allowing us to remove the latter condition.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 110948 W: 24209 L: 24263 D: 62476
http://tests.stockfishchess.org/tests/view/5bf8f65c0ebc5902bced3a63
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 88283 W: 14681 L: 14668 D: 58934
http://tests.stockfishchess.org/tests/view/5bf994a60ebc5902bced4349
Bench: 3939338
When running on a cloud VM (n1-highcpu-96) with several NVMe SSDs and
some non-SSDs for tablebases, I noticed that the average SSD request size was
more than 256 kB. This doesn't make a lot of sense for Syzygy tablebases,
which have a block size of 32 bytes and very low locality.
Seemingly, the tablebase access patterns during probing make the OS,
at least Linux, think that readahead is advantageous; normally, it
gives up doing readahead if there are too many misses, but it doesn't,
perhaps due to the fairly high overall hit rates. (It seems the kernel cannot
distinguish between reading a block that was paged in because the userspace
wanted it explicitly, and one that was read as part of readahead.)
Setting MADV_RANDOM effectively turns off readahead, which causes
the request size to drop to 4 kB. In the aforemented cloud VM test,
this roughly tripled the amount of I/O requests that were able to go
through, while reducing the total traffic from 2.8 GB/sec to 56 MB/sec
(moving the bottleneck to the non-SSDs; it seems the SSDs could have
sustained many more requests).
Closes https://github.com/official-stockfish/Stockfish/pull/1829
No functional change.
Don't do an extra TT update in case of a fail-high,
but simply break off the moves loop and let the TT update
at the end of qsearch do this job.
Same workflow/logic as in our main search function now.
Tested for no regression to be on the safe side.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30237 W: 6665 L: 6560 D: 17012
http://tests.stockfishchess.org/tests/view/5bf928e80ebc5902bced3f3a
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51067 W: 8625 L: 8553 D: 33889
http://tests.stockfishchess.org/tests/view/5bf937180ebc5902bced3fdc
No functional change.
Tropism in kingdanger was simplified away in this pull request #1821.
This patch reintroduces tropism in kingdanger with using quadratic scaling.
Passed STC http://tests.stockfishchess.org/tests/view/5bf7c1b10ebc5902bced1f8f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 52803 W: 11835 L: 11442 D: 29526
Passed LTC http://tests.stockfishchess.org/tests/view/5bf816e90ebc5902bced24f1
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17204 W: 2988 L: 2795 D: 11421
How do we continue from there?
I've recently tried to introduce tropism difference term in kingdanger which
passed STC 6 times but failed LTC all the time. Maybe using quadratic scaling
for it will also be helpful.
Bench 4041387
A recent LTC tuning session by @candirufish showed this term decreasing significantly. It appears that it can be removed altogether without significant Elo loss.
I also thank @GuardianRM, whose attempt to remove tropism from king danger inspired this one.
After this PR is merged, my next step will be to attempt to tune the coefficients of this new, simplified kingDanger calculation.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12518 W: 2795 L: 2656 D: 7067
http://tests.stockfishchess.org/tests/view/5befadda0ebc595e0ae3a289
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 164771 W: 26463 L: 26566 D: 111742
http://tests.stockfishchess.org/tests/view/5befcca70ebc595e0ae3a343
LTC 2, rebased on Stockfish 10 beta:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 75226 W: 12563 L: 12529 D: 50134
http://tests.stockfishchess.org/tests/view/5bf2e8910ebc5902bcecb919
Bench: 3412071
Because of aggressive time management and optimistic assumptions
about move overhead, it's still very easy to get Stockfish to forfeit
on time when we hit an endgame and have Syzygy EGTB on a spinning
drive. The latency from serving a few thousand EGTB probes (~10ms each),
of which there can currently be up to 4000 outstanding before a time
check, will easily overwhelm the default Move Overhead of 30ms.
This problem was first raised by Gian-Carlo Pascutto and some solutions
and improvements were discussed in the following pull requests:
https://github.com/official-stockfish/Stockfish/pull/1471https://github.com/official-stockfish/Stockfish/pull/1623https://github.com/official-stockfish/Stockfish/pull/1783
This patch is a minimal change proposed by Marco Costalba to lower
the impact of the bug. We now force a check of the clock right after
each tablebase read.
No functional change.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 51883 W: 11297 L: 10915 D: 29671
http://tests.stockfishchess.org/tests/view/5bf1e2ee0ebc595e0ae3cacd
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 15859 W: 2752 L: 2565 D: 10542
http://tests.stockfishchess.org/tests/view/5bf337980ebc5902bcecbf62
Notes:
(1) The bonus value has not been carefully tested, so it may be possible
to find slightly better values.
(2) Plan is to now try adding similar restriction for pawns. I wanted to
include that as part of this pull request, but I was advised to do it as
two separate pull requests. STC is currently running here, but may not add
enough value to pass green.
Bench: 3679086
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 84697 W: 18173 L: 18009 D: 48515
http://tests.stockfishchess.org/tests/view/5bea366f0ebc595e0ae34793
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 157625 W: 25533 L: 24893 D: 107199
http://tests.stockfishchess.org/tests/view/5be8b69e0ebc595e0ae33024
Personally, I feel like SF has been tuned to death recently and that we
need to step away from existing-parameter tunes for a bit and focus more
on new ideas. I don't really think there's much more ELO in these tunes
(for now). For me at least, this was the last existing-parameter tune I'll
be running for quite a while. Cheers!
Bench: 3572567
It does not appear to be not necessary or advantageous to
conditionally initialize kingRing[Us] or kingAttackersCount[Them],
so the 'else' can be removed.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22873 W: 4923 L: 4804 D: 13146
http://tests.stockfishchess.org/tests/view/5be9a8270ebc595e0ae33c7e
No functional change
To top the rating lists and get more interesting middle play, it
is a good habit to set the default contempt to the highest value
that does not regress against contempt=0. We recently decreased
PawnValueEg it is logical that to raise a little bit the default
higher contempt because of the following internal dependency in
line 334 of search.cpp :
````
int ct = int(Options["Contempt"]) * PawnValueEg / 100; // From centipawns
````
STC: contempt=24 passed non-regression vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6d7f80ebc595e0ae21e14
LTC: contempt=24 passed non-regression LTC vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6e0980ebc595e0ae21f07
On 2018-11-01, we also tested the effects of contempt=21 and contempt=24
against Stockfish 9, and the net result was neutral:
Contempt 21
ELO: 51.68 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 9487 L: 3581 D: 26932
http://tests.stockfishchess.org/tests/view/5bdb1a140ebc595e0ae2620a
Contempt 24
ELO: 52.21 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 9759 L: 3793 D: 26448
http://tests.stockfishchess.org/tests/view/5bdb1b680ebc595e0ae2620d
Bench: 3459874
This patch will make possible to free mapped TB files with "ucinewgame" command.
We wrote this patch specifically to address a problem that arose while
running Stockfish with 7-piece tablebases as a kibitzer at TCEC for
extended periods of time across multiple games. It was noted that after
some time, the NPS of the kibitzing Stockfish (which is usually 3x faster
than the Stockfish actually competing) would drop precipitously, eventually
falling to preposterously low numbers until restarted.
Their eval bot basically inputs FEN, go infinite, stop and loop, it probably
didn't do ucinewgame either. As time goes it gradually slowed down and OS
starts to use swap, this is not reasonable since the engine only uses 16GB
hash and the machine has 1TB physical RAM and does nothing else.
Author : noobpwnftw
Closes https://github.com/official-stockfish/Stockfish/pull/1790
No functional change.
We don't need to pass the king square as an explicit parameter to the functions
king_safety() and do_king_safety() since we already pass in the position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 69686 W: 14894 L: 14866 D: 39926
http://tests.stockfishchess.org/tests/view/5be84ac20ebc595e0ae3283c
No functional change.
We calculate tropism as a sum of two factors. The first is the number of squares in our kingFlank and Camp that are attacked by the enemy; the second is number of these squares that are attacked twice. Prior to this commit, we excluded squares we defended with pawns from this second value, but this appears unnecessary. (Doubly-attacked squares near our king are still dangerous.) The removal of this exclusion is a possible small Elo gain at STC (estimated +1.59) and almost exactly neutral at LTC (estimated +0.04).
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20942 W: 4550 L: 4427 D: 11965
http://tests.stockfishchess.org/tests/view/5be4e0ae0ebc595e0ae308a0
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 56941 W: 9172 L: 9108 D: 38661
http://tests.stockfishchess.org/tests/view/5be4ec340ebc595e0ae30938
Bench: 3813986
The recently committed Fail-High patch (081af90805)
had a number of changes beyond adjusting the depth of search on fail high, with
some undesirable side effects.
1) Decreasing depth on PV output, confusing GUIs and players alike as described in
issue #1787. The depth printed is anyway a convention, let's consider adjustedDepth
an implementation detail, and continue to print rootDepth. Depth, nodes, time and
move quality all increase as we compute more. (fixing this output has no effect on
play).
2) Fixes go depth output (now based on rootDepth again, no effect on play), also
reported in issue #1787
3) The depth lastBestDepth is used to compute how long a move is stable, a new move
found during fail-high is incorrectly considered stable if based on adjustedDepth
instead of rootDepth (this changes time management). Reverting this passed STC
and LTC:
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 82982 W: 17810 L: 17808 D: 47364
http://tests.stockfishchess.org/tests/view/5bd391a80ebc595e0ae1e993
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 109083 W: 17602 L: 17619 D: 73862
http://tests.stockfishchess.org/tests/view/5bd40c820ebc595e0ae1f1fb
4) In the thread voting scheme, the rank of the fail-high thread is now artificially
low, incorrectly since the quality of the move is much better than what adjustedDepth
suggests (e.g. if it takes 10 iterations to find VALUE_KNOWN_WIN, it has very low
depth). Further evidence comes from a test that showed that the move of highest
depth is not better than that of the last PV (which is potentially of much lower
adjustedDepth).
I.e. this test http://tests.stockfishchess.org/tests/view/5bd37a120ebc595e0ae1e7c3
failed SPRT[0, 5]:
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10609 W: 2266 L: 2345 D: 5998
In a running 5+0.05 th 8 test (more than 10000 games) a positive Elo estimate is
shown (strong enough for a [-3,1], possibly not [0,4]):
http://tests.stockfishchess.org/tests/view/5bd421be0ebc595e0ae1f315
LLR: -0.13 (-2.94,2.94) [0.00,4.00]
Total: 13644 W: 2573 L: 2532 D: 8539
Elo 1.04 [-2.52,4.61] / LOS 71%
Thus, restore old behavior as a bugfix, keeping the core of the fail-high patch
idea as resolving scheme. This is non-functional for bench, but changes searches
via time management and in the threaded case.
Bench: 3556672
This helps resolving consecutive FH's during aspiration more efficiently
STC:
http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4992 W: 1134 L: 980 D: 2878 Elo +10.72
LTC:
http://tests.stockfishchess.org/tests/view/5bc868050ebc592439f857ef
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8123 W: 1363 L: 1210 D: 5550 Elo +6.54
No-Regression test with 8 threads, tc=15+0.15:
http://tests.stockfishchess.org/tests/view/5bc874ca0ebc592439f85938
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 24740 W: 3977 L: 3863 D: 16900 Elo +1.60
This was a cooperation between me and Michael Stembera:
-me recognizing SF having problems with resolving FH's efficiently at
high depths, thus starting some tests based on consecutive FH's.
-mstembera picking up the idea with first success at STC & LTC (so full
credits to him!)
-me suggesting how to resolve the issues pinpointed by S.G on PR #1768
and finally restricting the logic to the main thread so that it don't
regresses at multi-thread.
bench: 3314347
Enable numa machinery only for STRICTLY MORE than 8 threads. Reason for this
change is that nowadays SMP tests are always done with 8 threads. That is a
problem for multi-socket Windows machines running on fishtest.
No functional change
The patch adds a small random component (+-1) to VALUE_DRAW for the evaluation
of draw positions (mostly 3folds). This random component is not static, but
potentially different for each visit of the node (hence derived from the node
counter). The effect is that in positions with many 3fold draw lines, different
lines are followed at each iteration. This keeps the search much more dynamic,
as opposed to being locked to one particular 3fold.
An example of a position where master suffers from 3fold-blindness and this patch
solves quickly is the famous TCEC game 53:
FEN: 3r2k1/pr6/1p3q1p/5R2/3P3p/8/5RP1/3Q2K1 b - - 0 51
master doesn't see that this is a lost position (draw eval up to depth 50) as
Qf6-e6 d4-d5 (found by patch at depth 23) leads to a loss.
The 3fold-blindness is more important at longer TC, the patch was yellow STC and
LTC, but passed VLTC:
STC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 46328 W: 10048 L: 9953 D: 26327
http://tests.stockfishchess.org/tests/view/5b9c0ca20ebc592cf275f7c7
LTC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 54663 W: 8938 L: 8846 D: 36879
http://tests.stockfishchess.org/tests/view/5b9ca1610ebc592cf27601d3
VLTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 31789 W: 4512 L: 4284 D: 22993
http://tests.stockfishchess.org/tests/view/5b9d1a670ebc592cf276076d
Credit to @crossbr for pointing to this problem repeatedly, and giving the hint
that many draw lines are typical in those situations.
Bench: 4756639
Currently we update (track up) the pv even in the fail high case.
However most times in such cases the pv in the ply below remains unset
because there we have value == alpha and so finally we see truncated
pv's (=just one move) in fail high cases.
Of course tracking down these pv's (+sending them to the gui) comes at a
certian cost, but no-regression tests passed:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16300 W: 3556 L: 3424 D: 9320
http://tests.stockfishchess.org/tests/view/5b9b73500ebc592cf275ea92
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 202411 W: 32734 L: 32897 D: 136780
http://tests.stockfishchess.org/tests/view/5b9baed10ebc592cf275ef6d
N.B.: Digging also into qsearch was tried in another version but seemed
not to pass the tests. This means that we don't always will get a pv
until the very tips.
No functional change
This PR is a combination of two unrelated [0, 4] patches that appeared promising
but not quite strong enough to pass on their own. The combination initially failed
STC with a positive score after a long run, and the subsequent speculative LTC test
passed.
* tweak_threatOnQueen4 :
Increase the middlegame components of ThreatByMinor[QUEEN]
and ThreatByRook[QUEEN] by 15 each. Bryan's (@crossbr) analysis of CCC Bonus Game 10
inspired several tests on penalizing a queen with limited safe mobility. While
attempting to implement this idea, I noticed that when I did not include the queen's
current square in the calculations, the Elo gains seemed to vanish--and only then did
I have the idea to revisit ThreatByMinor[QUEEN] and ThreatByRook[QUEEN], adding a
corresponding value to each. Without Bryan's work, this test would never have been
submitted. I would also like to recognize the efforts and contributions of @SFisGOD,
who also vigorously worked on this idea.
* Use pure static eval for null move pruning :
This idea was directly re-purposed from a promising test by Jerry Donald Watson
(@jerrydonaldwatson) in August. It was also independently developed and tested by
Stefan Geschwentner (@locutus2) previously.
Thank you all!
STC (failed yellow):
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 83913 W: 17986 L: 17825 D: 48102
http://tests.stockfishchess.org/tests/view/5bbc59300ebc592439f76aa5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 137198 W: 22351 L: 21772 D: 93075
http://tests.stockfishchess.org/tests/view/5bbce35f0ebc592439f77639
Bench: 4312846
Note by snicolet: I use this non-functional change patch
as a pretext to correct the wrong bench number I introduced
in the message of the previous commit.
Bench: 4059356
Storing unconditionally the current generation and bound is equivalent to master.
Part of the condition was added as a speed optimization in #429.
Here the branch is fully eliminated.
passed STC single-threaded:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 73515 W: 16378 L: 16359 D: 40778
http://tests.stockfishchess.org/tests/view/5b2fc38c0ebc5902b2e57fd5
passed STC multi-threaded:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 63725 W: 12916 L: 12874 D: 37935
http://tests.stockfishchess.org/tests/view/5b307b8f0ebc5902b2e5895f
The multithreaded test was run after a plausible suggestion by @mstembera that the effect of this could be larger with many cores. The result seems to indicate this doesn't really matter on the 8core architecture abundantly available on fishtest.
No functional change
a) Reduce PSQT values along the long diagonals on non-central squares
and increase the LongDiagonal bonus accordingly. The effect is to penalise
bishops on the long diagonal which can not "see" the 2 central squares.
The "good" bishops still have more or less the same bonus as current master.
b) For a bishop on a central square, because of the "| s" term in the code,
the LongDiagonalBonus was always given. So while being there, remove the "| s"
and compensate the central Bishop PSQT accordingly.
Passed STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 44498 W: 9658 L: 9323 D: 25517
http://tests.stockfishchess.org/tests/view/5b8992770ebc592cf2748942
Passed LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 63092 W: 10324 L: 9975 D: 42793
http://tests.stockfishchess.org/tests/view/5b89a17a0ebc592cf2748b59
Closes https://github.com/official-stockfish/Stockfish/pull/1760
bench: 4693901
It looks like PawnsOnBothFlanks can be removed from initiative().
A barrage of tests seem to confirm that the adjustment to -110
does not gain elo to offset any potential loss by removing
PawnsOnBothFlanks.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22014 W: 4760 L: 4639 D: 12615
http://tests.stockfishchess.org/tests/view/5b7f50cc0ebc5902bdbb3a3e
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40561 W: 6667 L: 6577 D: 27317
http://tests.stockfishchess.org/tests/view/5b801f9f0ebc5902bdbb4467
The barrage of 0,4 tests on the -136 value are in my ps_tunetests branch.
http://tests.stockfishchess.org/tests/user/protonspring
Closes https://github.com/official-stockfish/Stockfish/pull/1751
Bench: 4413173
-------------
How to continue from there?
The fact that endgames with all the pawns on only one flank are
drawish is a well-known chess idea, so it seems quite strange that
this can be removed so easily without losing Elo.
In the past there had been attempts to improve on PawnsOnBothFlanks
with similar concepts (for instance using the pawn span value), but
the tests were at best neutral. Maybe Stockfish is now mature enough
that these refined ideas would work to replace PawnsOnBothFlanks?
There is no need to make this as large as 65536 just for the sake of the
single 7-man tablebase that happens to have the key 0xf9247fff. Idea for the
fix by Ronald de Man, who suggested simply to allow more buckets past the end.
We also implement Robin Hood hashing for the hash table, which takes the worst
-case search for full 7-man tablebases down from 68 to 11 probes (Also takes
the average probe length from 2.06 to 2.05). For a table with 8K entries, the
corresponding numbers would be worst-case from 9 to 4, with average from 1.30
to 1.29.
https://github.com/official-stockfish/Stockfish/pull/1747
No functional change
This commit tries to make the new pure static eval code more readable by
splitting up the nested assignments into separate lines and making a few
more cosmetic tweaks.
No functional change.
This is a non-functional change. By pre-incrementing minKingPawnDistance
instead of post-incrementing, we can remove this -1.
This also makes DistanceRing more consistent with the rest of stockfish
since it now holds an actual "distance" instead of a less natural distance-1.
In current master, PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][0]
With this patch, it will be PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][1]
ie squares at distance 1 from the king. This is more natural use of distance.
The current array size DistanceRingBB[SQUARE_NB][8] is still OK with the new
definition, because maximum distance between two squares on a chess board is
seven (for example Kh1 and a8).
No functional change.
Follow-up for the previous patch: we use an affine formula to mix stats
and evaluation in search. The idea is to give a bonus if the previous
move of the opponent was historically bad, and a malus if the previous
move of the opponent was historically good.
More precisely, if x is the stat score of the previous move by the opponent,
we implement the following formulas to tweak the evaluation at an internal
node of the tree for our pruning decisions at this node:
if x = 0, use v' = eval(P)
if x > 0, use v' = eval(P) - 5 - x/1024
if x < 0, use v' = eval(P) + 5 - x/1024
For reference, the previous master had this simpler rule:
if x > 0, use v' = eval(P) - 10
if x <= 0, use v' = eval(P)
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29322 W: 6359 L: 6088 D: 16875
http://tests.stockfishchess.org/tests/view/5b76a5980ebc5902bdba957f
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 30893 W: 5154 L: 4910 D: 20829
http://tests.stockfishchess.org/tests/view/5b76ca6d0ebc5902bdba9914
Closes https://github.com/official-stockfish/Stockfish/pull/1740
Bench: 4592766
Mix search stats with evaluation: if the opponent's move has a good historyStat,
then decrease the evaluation of the internal node a bit for the pruning decisions
during search.
STC;
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 72083 W: 15683 L: 15203 D: 41197
http://tests.stockfishchess.org/tests/view/5b74c3ea0ebc5902bdba7d41
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29104 W: 4867 L: 4630 D: 19607
http://tests.stockfishchess.org/tests/view/5b7565000ebc5902bdba851b
Closes https://github.com/official-stockfish/Stockfish/pull/1738
Bench: 4514101
-----------
How to continue from there?
• the use of the previous stat score can probably be simplified in lines 587 and 716
• we could try to use a continuous bonus based on the previous stat score, instead
of just a fixed offset of -10 when the opponent previous move was good.
----------
Comments by Stefan Geschwentner:
Interesting idea. Because only the eval in search is tweak this should only
influence the eval and static eval used at inner nodes, and not on the return
search value (which comes in the end from quiescence search), except through
saving in TT followed by a TT cutoff.
So essentialy this effects diverse pruning/reduction parts -- eval and static
eval are lowered for good opponent moves:
• tt cutoff (ttValue)
• improving (static eval)
• more razoring (eval)
• less futility pruning (eval)
• less null move pruning (eval + static eval) (but with little more depth)
• more probcut (static eval)
• more move futility pruning (static eval)
We double in this patch the weight of the capture history table in the
local scoring of captures for move ordering.
The capture history table is indexed by the triplet (capturing piece,
capture square, captured piece) and gets information like "it seems to
have been historically good in that part of the search tree to capture
a pawn with a rook on g3, even if it seems to lose material", and affect
the normaly pure « Most Valuable Victim » ordering of captures.
Finished yellow at STC after 228842 games (posting a +1.36 Elo gain):
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 228842 W: 50894 L: 50152 D: 127796
http://tests.stockfishchess.org/tests/view/5b714bb00ebc5902bdba332d
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 43251 W: 7425 L: 7131 D: 28695
http://tests.stockfishchess.org/tests/view/5b71c7d40ebc5902bdba3e51
Thanks to user Vizvezdenec for running the LTC test.
Closes https://github.com/official-stockfish/Stockfish/pull/1736
Bench: 4272361
This patch introduces a non-linear bonus for pawns, along with some
(linear) corrections for the other pieces types.
The original values were obtained by a massive non-linear tuning of both
pawns and other pieces by GuardianRM, while Alain Savard and Chris Cain
later simplified the patch by observing that, apart from the pawn case, the
tuned corrections were in fact almost affine and could be incorporated in
our current code base via the piece values in types.h (offset) and the diagonal
of the quadratic matrix (slope). See discussion in PR#1725 :
https://github.com/official-stockfish/Stockfish/pull/1725
STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 42948 W: 9662 L: 9317 D: 23969
http://tests.stockfishchess.org/tests/view/5b6ff6e60ebc5902bdba1d87
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 19683 W: 3409 L: 3206 D: 13068
http://tests.stockfishchess.org/tests/view/5b702dbd0ebc5902bdba216b
How to continue from there?
- Maybe the non-linearity for the pawn value could be somewhat tempered
again and a simpler linear correction for pawns would work?
Closes https://github.com/official-stockfish/Stockfish/pull/1734
Bench: 4681496
We simplify the razoring logic by applying it to all nodes at depth 1 only.
An added advantage is that only one razor margin is needed now, and we treat
PV and Non-PV nodes in the same manner.
How to continue?
- There may be some conditions in which depth 2 razoring is beneficial.
- We can see whether the razor margin can be tuned, perhaps even with a
different value for PV nodes.
- Perhaps we can unify the treatment of PV and Non-PV nodes in other parts
of the search as well.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 5474 W: 1281 L: 1127 D: 3066
http://tests.stockfishchess.org/tests/view/5b6de3b20ebc5902bdba0d1e
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 62670 W: 10749 L: 10697 D: 41224
http://tests.stockfishchess.org/tests/view/5b6dee340ebc5902bdba0eb0
In addition, we ran a fixed LTC test against a similar patch which also
passed SPRT [-3, 1]:
ELO: 0.23 +-2.1 (95%) LOS: 58.6%
Total: 36412 W: 6168 L: 6144 D: 24100
http://tests.stockfishchess.org/tests/view/5b6e83940ebc5902bdba1485
We are opting for this patch as the more logical and simple of the two,
and it appears to be no less strong. Thanks in particular to @DU-jdto
for input into this patch.
Bench: 4476945
Currently, we do not consider pawns passed if there is another pawn of
the same color in front of them. It appears that this condition is not
necessary. The idea is that the doubled pawns are likely to be weak and
one of them will be likely captured anyway. On the other hand, if we do
somehow manage to promote a pawn, then the pawn behind it becomes passed
as well. In any case, the end result is we end up with an extra
potentially passed pawn. The current evaluation for passed pawns already
handles this case by also scaling down this effect.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28291 W: 6287 L: 6178 D: 15826
http://tests.stockfishchess.org/tests/view/5b6c4b960ebc5902bdb9f256
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30717 W: 5256 L: 5151 D: 20310
http://tests.stockfishchess.org/tests/view/5b6c82980ebc5902bdb9f863
Bench: 4938285
Unify the "quiet" and "non-quiet" reduction rules for use at any kind of moves.
The idea behind it was that both rules reduce at similiar cases in master:
one directly for late previous moves and the other indirectly by using a
bad stat score which is used for most move sorting and so approximates the
late move condition.
For captures/promotions the old rule was triggered in 25% but the new
rule only for 3% of all cases (so now more reductions are done, whereas
for quiet moves reductions keep the same level).
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 162327 W: 35976 L: 36134 D: 90217
http://tests.stockfishchess.org/tests/view/5b6a9a430ebc5902bdb9d5c1
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 29570 W: 5083 L: 4976 D: 19511
http://tests.stockfishchess.org/tests/view/5b6bc5d00ebc5902bdb9e9d6
Bench: 4526980
Currently, we first calculate some bitboards at the top of Evaluation::space()
and then check whether we actually need them. Invert the ordering. Of course this
does not make a difference in current master because the constexpr bitboard
calculations are in fact done at compile time by any decent compiler, but I find
my version a bit healthier since it will always meet or exceed current implementation
even if we eventually change the spaceMask to something not contsexpr.
No functional change.
After a session of tuning for King Psqt I got some new values, which was later
tweaked manually by me Fauzi, to result in an Elo-gain patch which seems to scale
pretty well:
STC: LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 100653 W: 22550 L: 22314 D: 55789 [Yellow patch]
LTC: LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 147079 W: 25584 L: 24947 D: 96548 [Green Patch]
Bench: 4669050
Introduce voting system for best move selction in multi-threads mode.
Joint work with Stefan Geschwentner, based on ideas introduced by
Michael Stembera.
Moves are upvoted by every thread using the margin to the minimum score
across threads and the completed depth.
First thread voting for the winner move is selected as best thread.
Passed STC, LTC. A further LTC test with only 4 threads failed with positive
score. A LTC with 31 threads was stopped with LLR 0.77 after 25k games to
avoid use of excessive resources (equivalent to 1.5M STC games).
Similar ideas were proposed by Michael Stembera 2 years ago #507, #508.
This implementation seems simpler and more understandable, the results
slightly more promising.
Further possible work:
1) Tweak of the formula using for assigning votes.
2) Use a different baseline for the score dependent part: maximum score
or winning probability could make more sense.
3) Assign votes in `Thread::Search` as iterations are completed and use
voting results to stop search.
4) Select best thread as the threads voting for best move with the highest
completed depth or, alternatively, vote on PV moves.
Link to SPRT tests
[stopped LTC, 31 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b61dc090ebc5902bdb95192)
LLR: 0.77 (-2.94,2.94) [0.00,5.00]
Total: 25602 W: 3977 L: 3850 D: 17775
Elo: 1.70 [-0.68,4.07] (95%)
[passed LTC, 8 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b5df5180ebc5902bdb9162d)
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 44478 W: 7602 L: 7300 D: 29576
Elo: 1.92 [-0.29,3.94] (95%)
[failed LTC, 4 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b5f39ef0ebc5902bdb92792)
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 29922 W: 5286 L: 5285 D: 19351
Elo: 0.48 [-1.98,3.10] (95%)
[passed STC, 4 threads 5+0.05](http://tests.stockfishchess.org/tests/view/5b5dbf0f0ebc5902bdb9131c)
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9108 W: 2033 L: 1858 D: 5217
Elo: 6.11 [1.26,10.89] (95%)
No functional change (in simple threat mode)
Use operator const T&() instead of operator T() to avoid possible
costly hidden copies of non-scalar nested types.
Currently StatsEntry has a single member T, so assuming
sizeof(StatsEntry) == sizeof(T) it happens to work, but it's
better to use the size of the proper entry type in std::fill.
Note that current code works because std::array items are ensured
to be allocated in contiguous memory and there is no padding among
nested arrays. The latter condition does not seem to be strictly
enforced by the standard, so be careful here.
Finally use address-of operator instead of get() to fully hide the
wrapper class StatsEntry at calling sites. For completness add
the arrow operator too and simplify the C++ code a bit more.
Same binary code as previous master under the Clang compiler.
No functional change.
As a note, current 2 LMR conditions on stat score
could be simplified in a single line:
r -= ((ss->statScore >= 0) - ((ss-1)->statScore >= 0)) * ONE_PLY;
We keep them splitted in 2 "if" statements because are easier
to (immediately) read.
No functional change.
This is the first patch teaching Stockfish how to use the 7-pieces
Syzygy tablebase currently calculated by Bujun Guo (@noobpwnftw) and
Ronald de Man (@syzygy1). The 7-pieces database are so big that they
required a change in the internal format of the files (technically,
some DTZ values are 16 bits long, so this had to be stored as wide
integers in the Huffman tree).
Here are the estimated file size for the 7-pieces Syzygy files,
compared to the 151G of the 6-pieces Syzygy:
```
7.1T ./7men_testing/4v3_pawnful (ongoing, 120 of 325 sets remaining)
2.4T ./7men_testing/4v3_pawnless
2.3T ./7men_testing/5v2_pawnful
660G ./7men_testing/5v2_pawnless
117G ./7men_testing/6v1_pawnful
87G ./7men_testing/6v1_pawnless
```
Some pointers to download or recalculate the tables:
Location of original files, by Bujun Guo:
ftp://ftp.chessdb.cn/pub/syzygy/
Mirrors:
http://tablebase.sesse.net/ (partial)
http://tablebase.lichess.ovh/tables/standard/7/
Generator code:
https://github.com/syzygy1/tb/
Closes https://github.com/official-stockfish/Stockfish/pull/1707
Bench: 5591925 (No functional change if SyzygyTB is not used)
----------------------
Comment by Leonardo Ljubičić (@DragonMist)
This is an amazing achievement, generating and being able to use 7 men syzygy
on the fly. Thank you for your efforts @noobpwnftw !! Looking forward how this
will work in real life, and expecting some trade off between gaining perfect
play and slow disc Access, but once the disc speed and space is not a problem,
I expect 7 men to yield something like 30 elo at least.
-----------------------
Comment by Michael Byrne (@MichaelB7)
This definitely has a bright future. I turned off the 50 move rule (ala ICCF
new rules) for the following position: `[d]8/8/1b6/8/4N2r/1k6/7B/R1K5 w - - 0 1`
This position is a 451 ply win for white (sans the 50 move rule, this position
was identified by the generator as the longest cursed win for white in KRBN v KRB).
Now Stockfish finds it instantly (as it should), nice work 👊👍 .
```
dep score nodes time
7 +132.79 4339 0:00.00 Rb1+ Kc4 Nd6+ Kc5 Bg1+ Kxd6 Rxb6+ Kc7 Be3 Rh2 Bd4
6 +132.79 1652 0:00.00 Rb1+ Kc4 Nd2+ Kd5 Rxb6 Rxh2 Nf3 Rf2
5 +132.79 589 0:00.00 Rb1+ Kc4 Rxb6 Rxh2 Nf6 Rh1+ Kb2
4 +132.79 308 0:00.00 Rb1+ Kc4 Nd6+ Kc3 Rxb6 Rxh2
3 +132.79 88 0:00.00 Rb1+ Ka4 Nc3+ Ka5 Ra1+ Kb4 Ra4+ Kxc3 Rxh4
2 +132.79 54 0:00.00 Rb1+ Ka4 Nc3+ Ka5 Ra1+ Kb4
1 +132.7
```
This patch adds the tropism measure as a new term in the king danger variable.
Since we then trasform this variable as a Score via a quadratic formula, the
main effect of the patch is the positive correlation of the tropism measure
with some checks and pins information already present in the king danger code.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 6805 W: 1597 L: 1431 D: 3777
http://tests.stockfishchess.org/tests/view/5b5df8d10ebc5902bdb91699
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 32872 W: 5782 L: 5523 D: 21567
http://tests.stockfishchess.org/tests/view/5b5e08d80ebc5902bdb917ee
How to continue from there?
• it may be possible to use CloseEnemies=S(7,0)
• we may want to try incorporating other strategic features in the quadratic
king danger.
Closes https://github.com/official-stockfish/Stockfish/pull/1717
Bench: 5591925
The previous commit wouldn't compile on the Microsoft Virtual Studio C++ compiler. So use a more compatible style for the same idea (which we already use in numerous places of evaluate.cpp, for instance in line 563).
Under the Clang compiler, both versions generate exactly the same machine code (same md5 signatures for the two binaries).
No functional change.
Remove a popcount for HinderPassedPawn, and compensate by doubling
the bonus from S(4,0) to to S(8,0).
Maybe it was pure luck, but we got the idea of this Elo gaining patch by
seing the simplification attempt by Mike Whiteley in pull request #1703.
This suggests that whenever we have a passed evaluation simplification,
we should consider the possibility that the master bonus has become
slightly out of tune with time, and we should try a few Elo gaining [0..4]
tests by hand-tuning the master bonus.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 19136 W: 4388 L: 4147 D: 10601
http://tests.stockfishchess.org/tests/view/5b59be6f0ebc5902bdb8ac06
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 99382 W: 17324 L: 16843 D: 65215
http://tests.stockfishchess.org/tests/view/5b59d2410ebc5902bdb8afa8
Closes https://github.com/official-stockfish/Stockfish/pull/1710
Bench: 4688817
This tweak excludes files D and E from the KingFlank bitboard when our
king is on the A or H files respectively. As far as I can tell, this
affects two things: the calculation for CloseEnemies and PawnlessFlank.
Aside from filtering out slightly less relevant attacks in the flank,
I suspect this helps with king prophylaxis, avoiding attacks and moving
towards the center when the pawns start to come off.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 56755 W: 12881 L: 12489 D: 31385
http://tests.stockfishchess.org/tests/view/5b58a94c0ebc5902bdb88c72
LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 130205 W: 22536 L: 21957 D: 85712
http://tests.stockfishchess.org/tests/view/5b58b7580ebc5902bdb89029
How to continue: Tweaking the two bonuses mentioned might give some
gain, although as far as I can tell, CloseEnemies is very sensitive to
even small changes.
Closes https://github.com/official-stockfish/Stockfish/pull/1705
Bench: 5026009
When evaluating threat by safe pawn and pawn push the same expression is used.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19444 W: 4540 L: 4309 D: 10595
http://tests.stockfishchess.org/tests/view/5b5a6e150ebc5902bdb8c5c0
Closes https://github.com/official-stockfish/Stockfish/pull/1709
No functional change.
--------------------
Comments by Stéphane Nicolet:
I don't measure any speed-up on my system, with two parallel benches at depth 22:
Total time (ms) : 74989
Nodes searched : 144830258
Nodes/second : 1931353
master
Total time (ms) : 75341
Nodes searched : 144830258
Nodes/second : 1922329
testedpatch
And anyway, like Stefan Geschwentner, I don't think that a 0.3% speed-up would
be enough to pass a [0..5] LTC test -- as a first approximation, we have this
rule of thumb that 1% speed-up gives about 1 Elo point.
However, considering the facts that the reformatting by itself is interesting,
that this is your first green test and that you played by the rules by running
the SPRT[0..5] test before opening the pull request, I will commit the change.
I will only take the liberty to change the occurrences of safe in lines 590 and
591 to b, to make the code more similar to lines 584 and 585.
So approved, and congrats :-)
This patch implements some idea by Alain Savard and Mike Whiteley taken from the perpertual renaming/reformatting thread.
This is a pure code cleaning patch (so no change in functionality), but I use it as a pretext to correct the bogus bench number that I introduced in the previous commit.
Bench: 4413383
This patch reverts the recent commit called "Tweak reductions formula, etc."
The decisions for the revert decision were as follows:
1) The original commit called "Tweak reductions formula: 0.88 * depth + 0.12"
showed bad scaling at in a Very Long Time Control (VLTC) test:
VLTC (180+1.8):
LLR: -1.59 (-2.94,2.94) [0.00,5.00]
Total: 14968 W: 2247 L: 2257 D: 10464
http://tests.stockfishchess.org/tests/view/5b559ffa0ebc5902bdb84f36
2) So there was a suspicion that the original fast passing LTC test which lead
us to accept the patch may have been a statistical accident, so we organized
a match against the previous master at LTC to get an Elo estimate for the
patch:
LTC match:
ELO: -1.83 +-2.1 (95%) LOS: 4.3%
Total: 36018 W: 6018 L: 6208 D: 23792
http://tests.stockfishchess.org/tests/view/5b55f8110ebc5902bdb8526f
3) Based on these results, we ran a simplification test with [-3..1] bounds
for the revert at LTC:
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41501 W: 7107 L: 7020 D: 27374
http://tests.stockfishchess.org/tests/view/5b5738670ebc5902bdb86932
4) So we revert.
Bench: 4491691
There seems to be some strange interaction between Overload and Connectivity.
Overload encourages us to not have too many defended and attacked pieces,
as this may expose us to various tactics. This feels somewhat like it is in
conflict with Connectivity, where pieces are defended preemptively.
Here I take the "pick one or the other" approach and just remove connectivity,
while strengthening the effect of Overload to compensate. The reasoning is that
if we defend our pieces preemptively, then it does get attacked, we want to do
something about it so we don't get penalized by Overload. On the other
hand, if it doesn't get attacked, then there's no need to defend it.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 27734 W: 6174 L: 6064 D: 15496
http://tests.stockfishchess.org/tests/view/5b5073bd0ebc5902bdb7ba5c
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51606 W: 8897 L: 8827 D: 33882
http://tests.stockfishchess.org/tests/view/5b50aa900ebc5902bdb7bf29
Bench: 4658006
After some recent big tuning session, the values for King Protector were
simplified to only be used on minor pieces. This patch tries to further
simplify by just using a single value, since current S(6,5) and S(5,6)
are close to each other. The value S(6,6) ended up passing, although
S(5,5) was also tried and failed STC.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 14261 W: 3288 L: 3151 D: 7822
http://tests.stockfishchess.org/tests/view/5b4ccdf50ebc5902bdb77f65
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19606 W: 3396 L: 3273 D: 12937
http://tests.stockfishchess.org/tests/view/5b4ce4280ebc5902bdb7803b
Bench: 5448998
Extend the bonus for Overload to cases where our side
has more than one attacker to a non pawn piece.
Based on an idea by Bryan in the forum. For instance,
now black gets the overload bonus in this position:
8/5R1k/6pb/p6p/P1N4P/1Pp5/2K3P1/2N4r b - - 6 46
because two black pieces are attacking the knight on c1
that is defended only by the king.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 57446 W: 12762 L: 12711 D: 31973
http://tests.stockfishchess.org/tests/view/5b4ca9970ebc5902bdb77a88
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42113 W: 7295 L: 7209 D: 27609
http://tests.stockfishchess.org/tests/view/5b4ccea00ebc5902bdb77f69
Bench: 4667263
For the rationale to allow this, see commit
a66c73deef
This was broken when cuckoo hashing was added, and
subtly broke (for example) lichess' Android application,
thus illustrating the original judgement was sound.
No functional change.
Various king and pawn eval values tuned after 2 million games. Rounding
slightly adjusted.
LTC: http://tests.stockfishchess.org/tests/view/5b477a260ebc5978f4be3ed4
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 32783 W: 5852 L: 5588 D: 21343
STC: http://tests.stockfishchess.org/tests/view/5b472d420ebc5978f4be3e4d
LLR: 3.23 (-2.94,2.94) [0.00,4.00]
Total: 44380 W: 10201 L: 9841 D: 24338
I think I reached the limit of the fishtest framework. It frequently
crashed at 2 million games already. The small values also moved a lot
throughout the entire tuning session though with smaller margin. The
passed danger and close enemies values seems the most sensitive (changing
close enemies alone to 6 failed before but now it passes), whether or not
they are close to optimal I don't know, but it seems some parameters are
also correlated to others.
Closes https://github.com/official-stockfish/Stockfish/pull/1670
bench: 5103722
In the current master, ThreatByKing is an array of two Scores, one for
when we have a single attack and one for when we have many. The latter
case is very rarely called during bench and was recently given a strange
negative value during a tuning run, as pointed out by @candirufish on
commit efd4ca2. Here, we simplify away this second case entirely, and
increase the remaining ThreatByKing to compensate.
Although I derived the parameter tweak independently, with the goal of
preserving the same average bonus, I later noticed that a very similar
Score had already been derived by an ongoing SPSA tuning session.
I therefore recognize @candirufish for first discovering these values.
I would also like to thank @Rocky640 for valuable feedback that pointed
me in the direction of ThreatByKing.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7677 W: 1772 L: 1623 D: 4282
http://tests.stockfishchess.org/tests/view/5b3db0320ebc5902b9ffe97a
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 108031 W: 18329 L: 18350 D: 71352
http://tests.stockfishchess.org/tests/view/5b3dbf4b0ebc5902b9ffe9db
Closes https://github.com/official-stockfish/Stockfish/pull/1666
Bench: 4678861
In current master, the function make_bitboard() does nothing apart from
helping initialize the SquareBB[] array. This seems like an unnecessary
abstraction layer.
The advantage of make_bitboard() is we can define a bitboard, in a simple
and general way, not only from a single square but also from a list of
squares. It is more elegant, faster and readable than combining multiple
SquareBB explicitly, but the last complex use case in evaluation was
simplified away a few months ago.
If make_bitboard() becomes useful again to define complicated bitboards,
it will be easy enough to reintroduce it using this pull request as
an implementation reference.
No functional change.
Make sure each piece is not scored more than once as a passed pawn "hinderer",
by scoring only the blockers along the passed pawn path. Inspired by TCEC Game 29.
Passed STC as a simplification
http://tests.stockfishchess.org/tests/view/5b3016d00ebc5902b2e58552
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 75388 W: 16656 L: 16641 D: 42091
Passed LTC as a simplification
http://tests.stockfishchess.org/tests/view/5b302ed90ebc5902b2e587fc
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49157 W: 8460 L: 8386 D: 32311
Current master was also counting the number of attacks along a passed pawn path,
which might be misleading:
a) a defender might be counted many times for the same pawn path. For example a
White rook on a1 attacking a black pawn on a7 would score the bonus * 6 but
would be probably better placed on a8
b) a defender might be counted on different pawn paths and might be overloaded. For
example a Ke4 or Qe4 against pawns on d6 and f6 would score the bonus * 6.
Counting each blocker or attacker only once is more complicated, and does not help
either: http://tests.stockfishchess.org/tests/view/5b2ff1cb0ebc5902b2e582b2
After this small simplification, there might be ways to increase the HinderPassedPawn
penalty.
Closes https://github.com/official-stockfish/Stockfish/pull/1661
Bench: 4520519
Silences the following warnings when compiling with GCC 8.
The fix is to use an intermediate pointer to anonymous function:
```
misc.cpp: In function 'int WinProcGroup::get_group(size_t)':
misc.cpp:241:77: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun1_t' {aka 'bool (*)(_LOGICAL_PROCESSOR_RELATIONSHIP, _SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*, long unsigned int*)'} [-Wcast-function-type]
auto fun1 = (fun1_t)GetProcAddress(k32, "GetLogicalProcessorInformationEx");
^
misc.cpp: In function 'void WinProcGroup::bindThisThread(size_t)':
misc.cpp:309:71: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun2_t' {aka 'bool (*)(short unsigned int, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun2 = (fun2_t)GetProcAddress(k32, "GetNumaNodeProcessorMaskEx");
^
misc.cpp:310:67: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun3_t' {aka 'bool (*)(void*, const _GROUP_AFFINITY*, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun3 = (fun3_t)GetProcAddress(k32, "SetThreadGroupAffinity");
^
```
No functional change.
Compiling the current master with MSVC gives the following error:
```
search.cpp(956): error C2660: 'operator *': function does not take 1 arguments
types.h(303): note: see declaration of 'operator *'
```
This was introduced in commit:
https://github.com/official-stockfish/Stockfish/commit/88de112b84a5285c2afb3e075a05c2ab8ad3fd33
We use a suggestion by @vondele to fix the error, thanks!
No functional change.
[STC](http://tests.stockfishchess.org/tests/view/5b2614000ebc5902b8d17193)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17733 W: 3996 L: 3866 D: 9871
[LTC](http://tests.stockfishchess.org/tests/view/5b264d0f0ebc5902b8d17206)
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55524 W: 9535 L: 9471 D: 36518
Use pawn count scaling also for opposite bishops endings with additional material, with a slope of 2 instead of 7. This simplifies slightly the code.
This PR is a functionally equivalent refactoring of the version which was submitted.
Four versions tried, 2 passed both STC and LTC. I picked the one which seemed more promising at LTC.
Slope 4 passed STC (-0.54 Elo), LTC not attempted
Slope 3 passed STC (+2.51 Elo), LTC (-0.44 Elo)
Slope 2 passed STC (+2.09 Elo), LTC (+0.04 Elo)
Slope 1 passed STC (+0.90 Elo), failed LTC (-3.40 Elo)
Bench: 4761613
I believe using foward_file_bb() here is fewer instructions.
a) Fewer instructions and probably more clear (debatable).
b) Possible that a lookup is slower than a few local operations, but the
forward_file_bb table is probably used often enough that it is always
cached.
Passed
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21004 W: 4263 L: 4141 D: 12600
http://tests.stockfishchess.org/tests/view/5b1cad830ebc5902ab9c6239
Closes https://github.com/official-stockfish/Stockfish/pull/1644
No functional change.
After a helpful suggestion from AppVeyor support staff, moving the Stockfish
execution from ps to cmd seems to work. Alternative to PR #1624 tested in PR #1637.
No functional change.
The '- 1' subtrahend was introduced for guarding against null move
search at root, which would be nonsense. But this is actually already
guaranteed by the !PvNode condition. This followed from the discussion
in pull request 1609: https://github.com/official-stockfish/Stockfish/pull/1609
No functional change
The function Position::has_repeated() is used by Tablebases::root_probe()
to determine whether we can rank all winning moves with the same value, or
if we need to strictly rank by dtz in case the position has already been
repeated once, and we are risking to run into the 50-move rule and thus
losing the win (especially critical in some very complicated endgames).
To check whether the current position or one of the previous positions
after the last zeroing move has already been occured once, we start looking
for a repetition of the current position, and if that is not the case, we
step one position back and repeat the check for that position, and so on.
If you now look at how this was done before the new root ranking patch was
merged two months ago, it seems quite obvious that it is a simple oversight:
https://github.com/official-stockfish/Stockfish/commit/108f0da4d7f993732aa2e854b8f3fa8ca6d3b46c
More specifically, after we stepped one position back with
```
stc = stc->previous;
```
we now have to start checking for a repetition with
```
StateInfo* stp = stc->previous->previous;
```
and not with
```
StateInfo* stp = st->previous->previous;
```
Closes https://github.com/official-stockfish/Stockfish/pull/1625
No functional change
Fix an error when compiling current master with MSVC due to the
ambiguity of which operator* overload was intended (reported by
Jarrod Torriero).
No functional change.
Makes sure the potential benefit of first touch does not depend on
the order of the UCI commands Threads and Hash, by reallocating the
hash if a Threads is issued. The cost is zeroing the TT once more
than needed. In case the prefered order (first Threads than Hash)
is employed, this amounts to zeroing the default sized TT (16Mb),
which is essentially instantaneous.
Follow up for https://github.com/official-stockfish/Stockfish/pull/1601
where additional data and discussion is available.
Closes https://github.com/official-stockfish/Stockfish/pull/1620
No functional change.
Stockfish currently takes a while to clear the TT when using larger hash sizes.
On one machine with 128 GB hash it takes about 50 seconds with a single thread,
allowing it to use all allocated cores brought that time down to 4 seconds on
some Linux systems. The patch was further tested on Windows and refined with
NUMA binding of the hash initializing threads (we refer to pull request #1601
for the complete discussion and the speed measurements).
Closes https://github.com/official-stockfish/Stockfish/pull/1601
No functional change
I was able to get this to pass which reduces BlockedByPawn to one dimension
with NO distance from edge offset.
GOOD) It's more simple and may provide additional clarity for further
simplifications. Facilitates migrating unblocked to one dimension as well.
BAD) If there is indeed a distance component to BlockedStorm (may or may
not be the case), this obfuscates this component into ShelterStrength and
UnblockedStorm. This may be more convoluted. Also, it may be more convenient
to have each of the three arrays (ShelterStrength, BlockedStorm, and UnBlocked)
be the same size.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 96173 W: 19326 L: 19343 D: 57504
http://tests.stockfishchess.org/tests/view/5b04544d0ebc5914abc12965
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49818 W: 7441 L: 7363 D: 35014
http://tests.stockfishchess.org/tests/view/5b0487d50ebc5914abc12990
Closes https://github.com/official-stockfish/Stockfish/pull/1611
Bench: 5133208
define Color us and use this instead of pos.side_to_move() and nmp_odd. The latter allows to clarify the nmp verification criterion.
Tested for no regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76713 W: 15303 L: 15284 D: 46126
http://tests.stockfishchess.org/tests/view/5b046a0d0ebc5914abc12971
No functional change.
Simplifying away all the progressKey stuff gives exactly the same bench,
without any speed impact. Tested for speed against master with two benches
at depth 22 ran in parallel:
**testedpatch**
Total time (ms) : 92350
Nodes searched : 178962949
Nodes/second : 1937877
**master**
Total time (ms) : 92358
Nodes searched : 178962949
Nodes/second : 1937709
We also tested the patch at STC for no-regression with [-3, 1] bounds:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 57299 W: 11529 L: 11474 D: 34296
http://tests.stockfishchess.org/tests/view/5b015a1c0ebc5914abc126e5
Closes https://github.com/official-stockfish/Stockfish/pull/1603
No functional change.
Default template parameters values and recursive functions do not play well
together. Fix for below errors that showed up after updating to latest MSVC.
````
tbprobe.cpp(1156): error C2672:
'search': no matching overloaded function found
tbprobe.cpp(1198): error C2783:
'Tablebases::WDLScore `anonymous-namespace'::search(Position &,Tablebases::ProbeState *)':
could not deduce template argument for 'CheckZeroingMoves'
````
Closes https://github.com/official-stockfish/Stockfish/pull/1594
No functional change.
A position which has a move which draws by repetition, or which could have
been reached from an earlier position in the game tree, is considered to be
at least a draw for the side to move.
Cycle detection algorithm by Marcel van Kervink:
https://marcelk.net/2013-04-06/paper/upcoming-rep-v2.pdf
----------------------------
How does the algorithm work in practice? The algorithm is an efficient
method to detect if the side to move has a drawing move, without doing any
move generation, thus possibly giving a cheap cutoffThe most interesting
conditions are both on line 1195:
```
if ( originalKey == (progressKey ^ stp->key)
|| progressKey == Zobrist::side)
```
This uses the position keys as a sort-of Bloom filter, to avoid the expensive
checks which follow. For "upcoming repetition" consider the opening Nf3 Nf6 Ng1.
The XOR of this position's key with the starting position gives their difference,
which can be used to look up black's repeating move (Ng8). But that look-up is
expensive, so line 1195 checks that the white pieces are on their original squares.
This is the subtlest part of the algorithm, but the basic idea in the above game
is there are 4 positions (starting position and the one after each move). An XOR
of the first pair (startpos and after Nf3) gives a key matching Nf3. An XOR of
the second pair (after Nf6 and after Ng1) gives a key matching the move Ng1. But
since the difference in each pair is the location of the white knight those keys
are "identical" (not quite because while there are 4 keys the the side to move
changed 3 times, so the keys differ by Zobrist::side). The loop containing line
1195 does this pair-wise XOR-ing.
Continuing the example, after line 1195 determines that the white pieces are
back where they started we still need to make sure the changes in the black
pieces represents a legal move. This is done by looking up the "moveKey" to
see if it corresponds to possible move, and that there are no pieces blocking
its way. There is the additional complication that, to match the behavior of
is_draw(), if the repetition is not inside the search tree then there must be
an additional repetition in the game history. Since a position can have more
than one upcoming repetition a simple count does not suffice. So there is a
search loop ending on line 1215.
On the other hand, the "no-progress' is the same thing but offset by 1 ply.
I like the concept but think it currently has minimal or negative benefit,
and I'd be happy to remove it if that would get the patch accepted. This
will not, however, save many lines of code.
-----------------------------
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 36430 W: 7446 L: 7150 D: 21834
http://tests.stockfishchess.org/tests/view/5afc123f0ebc591fdf408dfc
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 12998 W: 2045 L: 1876 D: 9077
http://tests.stockfishchess.org/tests/view/5afc2c630ebc591fdf408e0c
How could we continue after the patch:
• The code in search() that checks for cycles has numerous possible variants.
Perhaps the check need could be done in qsearch() too.
• The biggest improvement would be to get "no progress" to be of actual benefit,
and it would be helpful understand why it (probably) isn't. Perhaps there is an
interaction with the transposition table or the (fantastically complex) tree
search. Perhaps this would be hard to fix, but there may be a simple oversight.
Closes https://github.com/official-stockfish/Stockfish/pull/1575
Bench: 4550412
At PvNodes allow bonus for prior counter move that caused a fail low
for depth 1 and 2. Note : I did a speculative LTC on yellow STC patch
since history stats tend to be highly TC sensitive
STC (Yellow):
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 64295 W: 13042 L: 12873 D: 38380
http://tests.stockfishchess.org/tests/view/5af507c80ebc5968e6524153
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22407 W: 3413 L: 3211 D: 15783
http://tests.stockfishchess.org/tests/view/5af85dd40ebc591fdf408b87
Also use local variable excludedMove in NMP (marotear)
Bench: 5294316
Use the whole kingRing for pawn attackers instead of only the squares directly
around the king. This tends to give quite a lot more kingAttackersCount, so to
compensate and to avoid raising the king danger too fast we lower the values
in the KingAttackWeights array a little bit.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51892 W: 10723 L: 10369 D: 30800
http://tests.stockfishchess.org/tests/view/5af6d4dd0ebc5968e652428e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 24536 W: 3737 L: 3515 D: 17284
http://tests.stockfishchess.org/tests/view/5af709890ebc5968e65242ac
Credits to user @xoroshiro for the idea of using the kingRing for pawn attackers.
How to continue? It seems that the KingAttackWeights[] array stores values
which are quite Elo-sensitive, yet they have not been tuned with SPSA recently.
There might be easy Elo points to get there.
Closes https://github.com/official-stockfish/Stockfish/pull/1597
Bench: 5282815
Simplification: in king danger, include all blockers and not only pinned
pieces, since blockers enemy pieces can result in discovered checks which
are also bad.
STC http://tests.stockfishchess.org/tests/view/5af35f9f0ebc5968e6523fe9
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 145781 W: 29368 L: 29478 D: 86935
LTC http://tests.stockfishchess.org/tests/view/5af3cb430ebc5968e652401f
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76398 W: 11272 L: 11232 D: 53894
I also incorrectly scheduled STC with [0,5] which it failed.
http://tests.stockfishchess.org/tests/view/5af283c00ebc5968e6523f33
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 12338 W: 2451 L: 2522 D: 7365
Closes https://github.com/official-stockfish/Stockfish/pull/1593
bench: 4698290
----------------------------------------
Thanks to @vondele and @Rocky640 for a cleaner version of the patch,
and the following comments!
> Most of the pinned, (or for this pull request, blocking) squares were
> already computed in the unsafeChecks, the only missing squares being:
>
> a) squares attacked by a Queen which are occupied by friendly piece
> or "unsafe". Note that adding such squares never passed SPRT[0,5].
>
> b) squares not in mobilityArea[Us].
>
> There is a strong relationship between the blockers and the unsafeChecks,
> but the bitboard unsafeChecks is still useful when the checker is not
> aligned with the king, and the checking square is occupied by friendly
> piece or is "unsafe". This is always the case for the Knight.
This patch simplifies the control flow in search(), removing an if
and a goto. A side effect of the patch is that Stockfish is now a
little bit more selective at low depths, because we allow razoring,
futility pruning and probcut pruning after a null move.
passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32035 W: 6523 L: 6422 D: 19090
http://tests.stockfishchess.org/tests/view/5af142ca0ebc597fb3d39bb6
passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41431 W: 6187 L: 6097 D: 29147
http://tests.stockfishchess.org/tests/view/5af148770ebc597fb3d39bc1
Ideas for further work:
• Use the nodes credit opened by the patch (the increased selectivity)
to try somewhat higher razoring, futility or probcut margins at [0..4].
Bench: 4855031
We can view the patch version as adding some "undermining bonus" for
level pawns, when the defending side can not easily avoid the exchange
by advancing her pawn.
• Case 1) White b2,c3, Black a3,b3:
Black is breaking through, b2 deserves a penalty
• Case 2) White b2,c3, Black a3,c4:
if b2xa3 then White ends up with a weak pawn on a3
and probably a weak pawn on c3 too.
In either case, White can still not safely play b2-b3 and make a
phalanx with c3, which is the essence of a backward pawn definition.
Passed STC in SPRT[0, 4]:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 131169 W: 26523 L: 26199 D: 78447
http://tests.stockfishchess.org/tests/view/5aefa4d50ebc5902a409a151
ELO 1.19 [-0.38,2.88] (95%)
Passed LTC in SPRT[-3, 1]:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 24824 W: 3732 L: 3617 D: 17475
http://tests.stockfishchess.org/tests/view/5af04d3f0ebc5902a88b2e55
ELO 1.27 [-1.21,3.70] (95%)
Closes https://github.com/official-stockfish/Stockfish/pull/1584
How to continue from there?
There were some promising tests a couple of months ago about adding
a lever condition for king danger in evaluate.cpp, maybe it would
be time to re-try this after all the recent changes in pawns.cpp
Bench: 4773882
The two lines of code in the patch seem to be just as good as master.
1. We now only look at the current square to see if it is currently backward,
whereas master looks there AND further ahead in the current file (master would
declare a pawn "backward" even though it could still safely advance a little).
This simplification allows us to avoid the use of the difficult logic with
`backmost_sq(Us, neighbours | stoppers)`.
2. The condition `relative_rank(Us,s) < RANK_5` is simplified away.
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 68132 W: 14025 L: 13992 D: 40115
http://tests.stockfishchess.org/tests/view/5aedc97a0ebc5902a4099fd6
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23789 W: 3643 L: 3527 D: 16619
http://tests.stockfishchess.org/tests/view/5aee4f970ebc5902a409a03a
Ideas for further work:
• The new code flags some pawns on the 5th rank as backward, which was not the
case in the old master. So maybe we should test a version with that included?
• Further tweaks of the backward condition with [0..5] bounds?
Closes https://github.com/official-stockfish/Stockfish/pull/1583
Bench: 5122789
When we are using the "Bitboard + Square" overloaded operators,
the compiler uses the interpediate SquareBB[s] to transform the
square into a Bitboard, and then calculate the result.
For instance, the following code:
```
b = pos.pieces(Us, PAWN) & s
```
generates in fact the code:
```
b = pos.pieces(Us, PAWN) & SquareBB[s]`
```
The bug introduced by Stéphane in the previous patch was the
use of `b = pos.pieces(Us, PAWN) & (s + Up)` which can result
in out-of-bounds errors for the SquareBB[] array if s in the
last rank of the board.
We coorect the bug, and also add some asserts in bitboard.h to
make the code more robust for this particular bug in the future.
Bug report by Joost VandeVondele. Thanks!
Bench: 5512000
Simplification: remove BlockedByKing from storm array and use a special rule.
The BlockedByKing section in the storm array is substantially similar to the
Unopposed section except for two extreme values V(-290), V(-274). Turns out
removing BlockedByKing and using a special rule for these two values shows
no Elo loss. All the other values in the BlockedByKing section are apparently
irrelevant. BlockedByKing now falls under unopposed which (to me) is a bit
more logical since there is no defending pawn on this file. Also, retuning
the Unopposed section may be another improvement.
GOOD) This is a simplification because the entire BlockedByKing section of
the storm array goes away reducing a few lines of code (and less values to
tune). This also brings clarity because the special rule is self documenting.
BAD) It takes execution time to apply the special rule. This should be negli-
gible because it is based on a template parameter and is boiled down to two
bitwise AND's.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33470 W: 6820 L: 6721 D: 19929
http://tests.stockfishchess.org/tests/view/5ae7b6e60ebc5926dba90e13
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47627 W: 7045 L: 6963 D: 33619
http://tests.stockfishchess.org/tests/view/5ae859ff0ebc5926dba90e85
Closes https://github.com/official-stockfish/Stockfish/pull/1574
Bench: 5512000
-----------
How to continue after this patch?
This patch may open the possibility to move the special rule to evaluate.cpp
in the evaluate::king() function, where we could refine the rule using king
danger information. For instance, with a king in H2 blocking an opponent pawn
in H3, it may be critical to know that the opponent has no safe check in G2
before giving the bonus :-)
This is a further step in the long quest for a simple way of determining
scale factors for the endgame.
Here we remove the artificial restriction in evaluate_scale_factor()
based on endgame score. Also SCALE_FACTOR_ONEPAWN can be simplified
away. The latter is a small non functional simplification with respect
to the version that was testedin the framework, verified on bench with
depth 22 for good measure.
Passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49438 W: 9999 L: 9930 D: 29509
http://tests.stockfishchess.org/tests/view/5ae20c8b0ebc5963175205c8
Passed LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 101445 W: 15113 L: 15110 D: 71222
http://tests.stockfishchess.org/tests/view/5ae2a0560ebc5902a1998986
How to continue from there?
Maybe the general case could be scaled with pawns from both colors
without losing Elo. If that is the case, then this could be merged
somehow with the scaling in evaluate_initiative(), which also uses
a additive malus down when the number of pawns in the position goes
down.
Closes https://github.com/official-stockfish/Stockfish/pull/1570
Bench: 5254862
Currently the make strip target is broken on mingw as the exe name is wrong (stockfish instead of stockfish.exe).
Needs some testing by mingw users (both profile-build and strip, native and cross).
No functional change.
Queen was recently excluded from the mobility area of friendly minor
pieces. Exclude queen also from the mobility area of friendly majors too.
Run as a simplification:
STC
http://tests.stockfishchess.org/tests/view/5ade396f0ebc59602d053742
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46972 W: 9511 L: 9437 D: 28024
LTC
http://tests.stockfishchess.org/tests/view/5ade64b50ebc5949f20a24d3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66855 W: 10157 L: 10105 D: 46593
How to continue from there?
The mobilityArea is used in various places of the evaluation as a
soft proxy for "not attacked by the opponent pawns". Now that the
mobility area is getting smaller and smaller, it may be worth to
hunt for Elo gains by trying the more direct ~attackedBy[Them][PAWN]
instead of mobilityArea[Us] in these places.
Bench: 4650572
Remove the distinction between the king file and the two neighbours
files in the ShelterStrength[] array. Instead we initialize the safety
variable in the evaluate_shelter() function with a -10 penalty if our
king is on a semi-open file (ie. if our king is on a file without a pawn
protection).
Also rename shelter_storm() to evaluate_shelter() while there.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 23153 W: 4795 L: 4677 D: 13681
http://tests.stockfishchess.org/tests/view/5adcb83d0ebc595ec7ff8aa7
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25728 W: 3934 L: 3821 D: 17973
http://tests.stockfishchess.org/tests/view/5adcdcb60ebc595ec7ff8adb
See the commit history in PR#1559 for the proof that the committed
version is equivalent to the version in the tests above:
https://github.com/official-stockfish/Stockfish/pull/1559
Full credit to @protonspring for the renormalized values of the
ShelterStrength[] array used for the simplification. Thanks!
Bench: 4703935
Change the operators of the Option type in uci.h to accept floating
point numbers in double precision on input as the numerical type for
the "spin" values of the UCI protocol.
The output of Stockfish after the "uci" command is unaffected.
This change is compatible with all the existing GUI (as they will
continue sending integers that we can interpret as doubles in SF),
and allows us to pass double parameters to Stockfish in the console
via the "setoption" command. This will be useful if we implement
another tuner as an alternative for SPSA.
Closes https://github.com/official-stockfish/Stockfish/pull/1556
No functional change.
---------------------
A example of the new functionality in action in the branch `tune_float2'`:
https://github.com/snicolet/Stockfish/commit/876c322d0f20ee232da977b4d3489c4cc929765e
I have added the following lines in ucioptions.cpp:
```C++
void on_pi(const Option& o)
{
double x = Options["PI"]; // or double x = o;
std::cerr << "received value is x = " << x << std::endl;
}
...
o["PI"] << Option(3.1415926, -10000000, 10000000, on_pi);
```
Then I can change the value of Pi in Stockfish via the command line, and
check that Stockfish understands a floating point:
````
> ./stockfish
> setoption name PI value 2.7182818284
received value is x = 2.71828
````
On output, the default value of Pi is truncated to 3 (to remain compatible
with the UCI protocol and GUIs):
````
> uci
[...]
option name SyzygyProbeLimit type spin default 6 min 0 max 6
option name PI type spin default 3 min -10000000 max 10000000
uciok
````
This patch is non-functional. Current master does four operations to determine
whether an enemy pawn on this file is blocked by the king or not
```
f == file_of(ksq) && rkThem == relative_rank(Us, ksq) + 1 )
```
By adding a direction (based on the template color), this is reduced to two
operations. This works because b is limited to enemy pawns that are ahead of
the king and on the current file.
```
shift<Down>(b) & ksq
```
I've added a line of code, but the number of executing instructions is reduced
(I think). I'm not sure if this counts as a simplification, but it should
theoretically be a little faster (barely). The code line length is also reduced
making it a little easier to read.
Closes https://github.com/official-stockfish/Stockfish/pull/1552
No functional change.
This patch corrects both MultiPV behaviour and "go searchmoves" behaviour
for tablebases.
We change the logic of table base probing at root positions from filtering
to ranking. The ranking code is much more straightforward than the current
filtering code (this is a simplification), and also more versatile.
If the root is a TB position, each root move is probed and assigned a TB score
and a TB rank. The TB score is the Value to be displayed to the user for that
move (unless the search finds a mate score), while the TB rank determines which
moves should appear higher in a multi-pv search. In game play, the engine will
always pick a move with the highest rank.
Ranks run from -1000 to +1000:
901 to 1000 : TB win
900 : normally a TB win, in rare cases this could be a draw
1 to 899 : cursed TB wins
0 : draw
-1 to -899 : blessed TB losses
-900 : normally a TB loss, in rare cases this could be a draw
-901 to -1000 : TB loss
Normally all winning moves get rank 1000 (to let the search pick the best
among them). The exception is if there has been a first repetition. In that
case, moves are ranked strictly by DTZ so that the engine will play a move
that lowers DTZ (and therefore cannot repeat the position a second time).
Losing moves get rank -1000 unless they have relatively high DTZ, meaning
they have some drawing chances. Those get ranks towards -901 (when they
cross -900 the draw is certain).
Closes https://github.com/official-stockfish/Stockfish/pull/1467
No functional change (without tablebases).
This patch introduces an Analysis Contempt UCI combo box to control
the behaviour of contempt during analysis. The possible values are
Both, Off, White, Black. Technically, the engine is supposed to be in
analysis mode if UCI_AnalyseMode is set by the graphical user interface
or if the user has chosen infinite analysis mode ("go infinite").
Credits: the idea for the combo box is due to Michel Van den Bergh.
No functional change (outside analysis mode).
-----------------------------------------------------
The so-called "contempt" is an optimism value that the engine adds
to one color to avoid simplifications and keep tension in the position
during its search. It was introduced in Stockfish 9 and seemed to give
good results during the TCEC 11 tournament (Stockfish seemed to play a
little bit more actively than in previous seasons).
The patch does not change the play during match or blitz play, but gives
more options for correspondance players to decide for which color(s) they
would like to use contempt in analysis mode (infinite time). Here is a
description of the various options:
* Both : in analysis mode, use the contempt for both players (alternating)
* Off : in analysis mode, use the contempt for none of the players
* White : in analysis mode, White will play actively, Black will play passively
* Black : in analysis mode, Black will play actively, White will play passively
This patch adds some documentation and code cleanup to tablebase code.
It took me some time to understand the relation among the differrent
structs, although I have rewrote them fully in the past. So I wrote
some detailed documentation to avoid the same efforts for future readers.
Also noteworthy is the use a standard hash table implementation with a
more efficient 1D array instead of a 2D array. This reduces the average
lookup steps of 90% (from 343 to 38 in a bench 128 1 16 run) and reduces
also the table from 5K to 4K
entries.
I have tested on 5-men and no functional and no slowdown reported. It
should be verified on 6-men that the new hash does not overflow. It is
enough to run ./stockfish with 6-men available: if it does not assert at
startup it means everything is ok with 6-men too.
EDIT: verified for 6-men tablebase by Jörg Oster. Thanks!
No functional change.
We remove an unnecessary condition in the definition of safe squares
in the space evaluation. Only the squares which are occupied by our
pawns or attacked by our opponent's pawns are now excluded.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21096 W: 4321 L: 4199 D: 12576
http://tests.stockfishchess.org/tests/view/5acbf7510ebc59547e537d4e
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 23437 W: 3577 L: 3460 D: 16400
http://tests.stockfishchess.org/tests/view/5acc0f750ebc59547e537d6a
It may be possible to further refine the definition of such safe squares.
Bench: 5351765
This patch applies a S(10, 5) bonus for every square that is:
- Occupied by an enemy piece which is not a pawn
- Attacked exactly once by our pieces
- Defended exactly once by enemy pieces
The idea is that these pieces must be defended. Their defenders have
dramatically limited mobility, and they are vulnerable to our future
attack.
As with connectivity, there are probably many more tests to be run in
this area. In particular:
- I believe @snicolet's queen overload tests have demonstrated a potential
need for a queen overload bonus above and beyond this one; however, the
conditions for "overload" in this patch are different (excluding pieces
we attack twice). My next test after this is (hopefully) merged will be
to intersect the Bitboard I define here with the enemy's queen attacks and
attempt to give additional bonus.
- Perhaps we should exclude pieces attacked by pawns--can pawns really be
overloaded? Should they have the same weight, or less? This didn't work
with a previous version, but it could work with this one.
- More generally, different pieces may need more or less bonus. We could
change bonuses based on what type of enemy piece is being overloaded, what
type of friendly piece is attacking, and/or what type of piece is being
defended by the overloaded piece and attacked by us, or any intersection
of these three. For example, here attacked/defended pawns are excluded,
but they're not totally worthless targets, and could be added again with
a smaller bonus.
- This list is by no means exhaustive.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17439 W: 3599 L: 3390 D: 10450
http://tests.stockfishchess.org/tests/view/5ac78a2e0ebc59435923735e
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 43304 W: 6533 L: 6256 D: 30515
http://tests.stockfishchess.org/tests/view/5ac7a1d80ebc59435923736f
Closes https://github.com/official-stockfish/Stockfish/pull/1533
Bench: 5248871
----------------
This is my first time opening a PR, so I apologize if there are errors.
There are too many people to thank since I submitted my first test just
over a month ago. Thank you all for the warm welcome and here is to more
green patches!
In particular, I would like to thank:
- @crossbr, whose comment in a FishCooking thread first inspired me to
consider the overloading of pieces other than queens,
- @snicolet, whose queen overload tests inspired this one and served as
the base of my first overload attempts,
- @protonspring, whose connectivity tests inspired this one and who provided
much of the feedback needed to take this from red to green,
- @vondele, who kindly corrected me when I submitted a bad LTC test,
- @Rocky640, who has helped me over and over again in the past month.
Thank you all!
In master, we already remove the King from the mobility area of minor pieces
because the King simply stands in the way of other pieces, and since opponent
cannot capture the King, any piece which "protects" the King cannot recapture.
Similarly, this patch introduces the idea that it is rarely a need for a Queen
to be "protected" by a minor (unless it is attacked only by a Queen, in fact).
We used to have a LoosePiece bonus, and in a similar vein the Queen was excluded
from that penalty.
Idea came when reviewing an old game of Kholmov. He was a very good midgame
player, but in the opening his misplace his Queen (and won in the end :-) :
http://www.chessgames.com/perl/chessgame?gid=1134645
Both white queen moves 10.Qd3 and 13.Qb3 are in the way of some minor piece.
I would prefer to not give a bishop mobility bonus at move 10 for the square d3,
or later a knight mobility bonus at move 13 for the square b3. And the textbook
move is 19.Qe3! which prepares 20.Nb3. This short game sample shows how much a
queen can be "in the way" of minor pieces.
STC
http://tests.stockfishchess.org/tests/view/5ac2c15f0ebc591746423fa3
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22066 W: 4561 L: 4330 D: 13175
LTC
http://tests.stockfishchess.org/tests/view/5ac2d6500ebc591746423faf
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 25871 W: 3953 L: 3738 D: 18180
Closes https://github.com/official-stockfish/Stockfish/pull/1532
Ideas for future work in this area:
• tweak some more mobility areas for other piece type.
• construct a notion of global mobility for the whole piece set.
• bad bishops.
Bench: 4989125
Simplify ThreatBySafePawn evaluation by removing the 'if (weak)' speed
optimization check from threats evaluation. This is a non functional
change as it removes just a speed optimization conditional which was
probably useful before but does no longer provide benefits. This section
section had a few more lines not long ago, with ThreatByHangingPawn and
a loop through the threatened pieces, but now there is not much left.
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47775 W: 9696 L: 9624 D: 28455
http://tests.stockfishchess.org/tests/view/5ac298910ebc591746423f8b
Closes https://github.com/official-stockfish/Stockfish/pull/1531
Non functional change.
1) Use make_bitboard() in Bitboards::init()
2) Fix MSVC warning: search.h(85): warning C4244: '=': conversion from
'TimePoint' to 'int', possible loss of data.
Closes https://github.com/official-stockfish/Stockfish/pull/1524
No functional change.
When we reach a position with only two opposite colored bishops and
one pawn on the board, current master would give it a scale factor
of 9/64=0.14 in about one position out of 7200, and a scale factor
of 0.0 in the 7199 others. The patch gives a scale factor of 0.0 in
100% of the cases.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c
We also have exhaustive coverage analysis of this patch effect by
Alain Savard, comparing the perfect evaluation given by the Syzygy
tablebase with the heuristic play after this patch for the set of
all legal positions of the KBPKP endgame with opposite bishops, in
the comments thread for this pull request:
https://github.com/official-stockfish/Stockfish/pull/1520
Alain's conclusion:
> According to this definition and the data, I consider this PR is
> identical to master to "solve for draw" and slightly better than
> master to solve earlier for "wins".
Note: this patch is a side effect of an ongoing effort to improve
the evaluation of positions involving a pair of opposite bishops.
See the GitHub diff of this LTC test which almost passed at sprt[0..5]
for a discussion:
http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93
No functional change (at small bench depths)
Include some not fully supported levers in the (candidate) passed pawns
bitboard, if otherwise unblocked. Maybe levers are usually very short
lived, and some inaccuracy in the lever balance for the definition of
candidate passed pawns just triggers a deeper search.
Here is a example of a case where the patch has an effect on the definition
of candidate passers: White c5/e5 pawns, against Black d6 pawn. Let's say
we want to test if e5 is a candidate passer. The previous master looks
only at files d, e and f (which is already very good) and reject e5 as
a candidate. However, the lever d6 is challenged by 2 pawns, so it should
not fully count. Indirectly, this patch will view such case (and a few more)
to be scored as candidates.
STC
http://tests.stockfishchess.org/tests/view/5abcd55d0ebc5902926cf1e1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 16492 W: 3419 L: 3198 D: 9875
LTC
http://tests.stockfishchess.org/tests/view/5abce1360ebc5902926cf1e6
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21156 W: 3201 L: 2990 D: 14965
This was inspired by this test of Jerry Donald Watson, except the case of
zero supporting pawns against two levers is excluded, and it seems that
not excluding that case is bad, while excluding is it beneficial. See the
following tests on fishtest:
https://github.com/official-stockfish/Stockfish/pull/1519http://tests.stockfishchess.org/tests/view/5abccd850ebc5902926cf1ddhttp://tests.stockfishchess.org/tests/view/5abcdd490ebc5902926cf1e4
Closes https://github.com/official-stockfish/Stockfish/pull/1521
Bench: 5568461
----
Comments by Jerry Donald Watson:
> My thinking as to why this works:
>
> The evaluation is either called in an interior node or in the qsearch.
> The calls at the end of the qsearch are the more important as they
> ultimately determine the scoring of each move, whereas the internal
> values are mainly used for pruning decisions with a margin. Some strong
> engines don't even call the eval at all nodes. Now the whole point of
> the qsearch is to find quiet positions where captures do not change the
> evaluation of the position with regards to the search bounds - i.e. if
> there were good captures they would be tried.* So when a candidate lever
> appears in the evaluation at the end of the qsearch, the qsearch has
> guaranteed that it cannot just be captured, or if it can, this does not
> take the score past the search bounds. Practically this may mean that
> the side with the candidate lever has the turn, or perhaps the stopping
> lever pawn is pinned, or that side is forced for other reasons to make
> some other move (e.g. d6 can only take one of the pawns in the example
> above).
>
> Hence granting the full score for only one lever defender makes some
> sense, at least, to me.
>
> IMO this is also why huge bonuses for possible captures in the evaluation
> (e.g. threat on queen and our turn), etc. don't tend to work. Such things
> are best left to the search to figure out.
We now use per-thread dynamic contempt. This patch has the following
effects:
* for Threads=1: **non-functional**
* for Threads>1:
* with MultiPV=1: **no regression, little to no ELO gain**
* with MultiPV>1: **clear improvement over master**
First, I tried testing at standard MultiPV=1 play with [0,5] bounds.
This yielded 2 yellow and 1 red test:
5+0.05, Threads=5:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 82689 W: 16439 L: 16190 D: 50060
http://tests.stockfishchess.org/tests/view/5aa93a5a0ebc5902952892e6
5+0.05, Threads=8:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27164 W: 4974 L: 4983 D: 17207
http://tests.stockfishchess.org/tests/view/5ab2639b0ebc5902a6fbefd5
5+0.5, Threads=16:
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 41396 W: 7127 L: 7082 D: 27187
http://tests.stockfishchess.org/tests/view/5ab124220ebc59029516cb62
Then, I tested with Skill Level=17 (implicitly MutliPV=4), showing
a clear improvement:
5+0.05, Threads=5:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 3498 W: 1316 L: 1135 D: 1047
http://tests.stockfishchess.org/tests/view/5ab4b6580ebc5902932aeca2
Next, I tested the patch with MultiPV=1 again, this time checking for
non-regression ([-3, 1]):
5+0.5, Threads=5:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65575 W: 12786 L: 12745 D: 40044
http://tests.stockfishchess.org/tests/view/5ab4e8500ebc5902932aecb3
Finally, I ran some tests with fixed number of games, checking if
reverting dynamic contempt gains more elo with Skill Level=17 (i.e.
MultiPV) than applying the "prevScore" fix and this patch. These tests
showed, that this patch gains 15 ELO when playing with Skill Level=17:
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITHOUT this patch":
ELO: -11.43 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 7085 L: 7743 D: 5172
http://tests.stockfishchess.org/tests/view/5ab636450ebc590295d88536
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITH this patch":
ELO: -26.42 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 6661 L: 8179 D: 5160
http://tests.stockfishchess.org/tests/view/5ab62e680ebc590295d88524
---
***FAQ***
**Why should this be commited?**
I believe that the gain for multi-thread MultiPV search is a sufficient
justification for this otherwise neutral change. I also believe this
implementation of dynamic contempt is more logical, although this may
be just my opinion.
**Why is per-thread contempt better at MultiPV?**
A likely explanation for the gain in MultiPV mode is that during
search each thread independently switches between rootMoves and via
the shared contempt score skews each other's evaluation.
**Why were the tests done with Skill Level=17?**
This was originally suggested by @Hanamuke and the idea is that with
Skill Level Stockfish sometimes plays also moves it thinks are slightly
sub-optimal and thus the quality of all moves offered by the MultiPV
search is checked by the test.
**Why are the ELO differences so huge?**
This is most likely because of the nature of Skill Level mode --
since it slower and weaker than normal mode, bugs in evaluation have
much greater effect.
---
Closes https://github.com/official-stockfish/Stockfish/pull/1515.
No functional change -- in single thread mode.
Extends valgrind/sanitizer testing to cover syzygy code.
The script downloads 4 man syzygy as needed. The time needed for the
additional testing is small (in fact hard to see a difference compared
to the large fluctuations in testing time in travis).
Possible follow-ups:
* include more TB sensitive positions in bench.
* include the test script of recent commit "Refactor tbprobe.cpp".
* verify unchanged bench with TB (with a long run).
* make the TB part of the continuation integration tests optional.
Closes https://github.com/official-stockfish/Stockfish/pull/1518
and https://github.com/official-stockfish/Stockfish/pull/1490
No functional change.
Adjust criterion for applying extra reduction if not improving.
We now add an extra ply of reduction if r > 1.0, instead of the
previous condition Reductions[NonPV][imp][d][mc] >= 2.
Why does this work? Previously, reductions when not improving had
a discontinuity as the depth and/or move count increases due to the
Reductions[NonPV][imp][d][mc] >= 2 condition. Hence, values of r
such that 0.5 < r < 1.5 would be mapped to a reduction of 1, while
1.5 < r < 2.5 would be mapped to a reduction of 3. This patch allows
values of r satisfying 1.0 < r < 1.5 to be mapped to a reduction of 2,
making the reduction formula more continuous.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35908 W: 7382 L: 7087 D: 21439
http://tests.stockfishchess.org/tests/view/5aba723a0ebc5902a4743e8f
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 23087 W: 3584 L: 3378 D: 16125
http://tests.stockfishchess.org/tests/view/5aba89070ebc5902a4743ea9
Ideas for future work:
- We could look at retuning the LMR formula.
- We could look at adjusting the reductions in PV nodes if not improving.
Bench: 5326261
Use rootMoves[PVIdx].previousScore instead of bestValue for
dynamic contempt. This is equivalent for MultiPV=1 (bench remained the
same, even for higher depths), but more correct for MultiPV.
STC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2657 W: 1079 L: 898 D: 680
http://tests.stockfishchess.org/tests/view/5aaa47cb0ebc590297330403
LTC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2390 W: 874 L: 706 D: 810
http://tests.stockfishchess.org/tests/view/5aaa593a0ebc59029733040b
VLTC 240+2.4 (MultiPV=3):
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 2399 W: 861 L: 694 D: 844
http://tests.stockfishchess.org/tests/view/5aaf983e0ebc5902a182131f
LTC (MultiPV=4, Skill Level=17):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 747 W: 333 L: 175 D: 239
http://tests.stockfishchess.org/tests/view/5aabccee0ebc5902997ff006
Note: although the ELO differences seem huge, they are inflated by the
nature of Skill Level / MultiPV search, so I don't think they can be
reasonably compared with classic ELO strength.
See https://github.com/official-stockfish/Stockfish/pull/1491 for some
verifications searches with MultiPV = 10 at depths 12 and 24 from the
starting position and the position after 1.e4, comparing the outputs
of the full PV by the old master and by this patch.
No functional change for MultiPV=1
This involves:
* replacing the union hacks with simply reusing the EntryPiece arrays
for the no-pawns case
* merging the PairsData structure with the EntryPiece/-Pawn structs
(with credit to Marco: @mcostalba)
* simplifying some HashTable functions
* thanks to previous changes, removing the ugly memsets
* simplifying the template logic for WDL/DTZ distinction
(now we distinguish based on an enum type, not the entry classes)
* removing the unneeded Atomic wrapper
-----------------------------
For reference, here is a manual way to check that patches concerning
table bases code are non-functional changes:
0) Download the Syzygy table bases (up to 6 men).
1) Make sure you have branches master and the pull request pointing to
the right commits.
2) Download the bench calculation scripts from the following URL:
https://gist.github.com/WOnder93/b5fcf9c989b4a1715684d5c82367cdbe
and copy into src inside your Stockfish repo.
3) Make the scripts executable (chmod +x *.sh).
4) Run the following command to use TBs located at <path>:
export SYZYGY_PATH='<path>'
5) After that, run this (it will take a long time, this is a deep bench):
BENCH_ARGS='128 1 22' ./check_benches.sh master tbprobe_cleanup 2>/dev/null`
==> You should see two equal numbers printed.
(Of course, now we have to trust that the script itself is correct :)
-----------------------------
Closes https://github.com/official-stockfish/Stockfish/pull/1477
No functional change.
This is a patch to fix issue #1498, switching the time management variables
to 64 bits to avoid overflow of time variables after 25 days.
There was a bug in Stockfish 9 causing the output to be wrong after
2^31 milliseconds search. Here is a long run from the starting position:
info depth 64 seldepth 87 multipv 1 score cp 23 nodes 13928920239402
nps 0 tbhits 0 time -504995523 pv g1f3 d7d5 d2d4 g8f6 c2c4 d5c4 e2e3 e7e6 f1c4
c7c5 e1g1 b8c6 d4c5 d8d1 f1d1 f8c5 c4e2 e8g8 a2a3 c5e7 b2b4 f8d8 b1d2 b7b6 c1b2
c8b7 a1c1 a8c8 c1c2 c6e5 d1c1 c8c2 c1c2 e5f3 d2f3 a7a5 b4b5 e7c5 f3d4 d8c8 d4b3
c5d6 c2c8 b7c8 b3d2 c8b7 d2c4 d6c5 e2f3 b7d5 f3d5 e6d5 c4e5 a5a4 e5d3 f6e4 d3c5
e4c5 b2d4 c5e4 d4b6 e4d6 g2g4 d6b5 b6c5 b5c7 g1g2 c7e6 c5d6 g7g6
We check at compile time that the TimePoint type is exactly 64 bits long for
the compiler (TimePoint is our alias in Stockfish for std::chrono::milliseconds
-- it is a signed integer type of at least 45 bits according to the C++ standard,
but will most probably be implemented as a 64 bits signed integer on modern
compilers), and we use this TimePoint type consistently across the code.
Bug report by user "fischerandom" on the TCEC chat (thanks), and the
patch includes code and suggestions by user "WOnder93" and Ronald de Man.
Fixes issue: https://github.com/official-stockfish/Stockfish/issues/1498
Closes pull request: https://github.com/official-stockfish/Stockfish/pull/1510
No functional change.
Make kingRing always eight squares, extending the bitboard to the
F file if the king is on the H file, and to the C file if the king
is on the A file. This may deal with cases where Stockfish (like
many other engines) would shift the king around on the back rank
like g1h1, not because there is some imminent threat, but because
it makes king safety look a little better just because the king ring
had a smaller area.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 34000 W: 7167 L: 6877 D: 19956
http://tests.stockfishchess.org/tests/view/5ab8216d0ebc5902932cbe64
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22574 W: 3576 L: 3370 D: 15628
http://tests.stockfishchess.org/tests/view/5ab84e6a0ebc5902932cbe72
How to continue from there?
This patch probably makes it easier to tune the king safety evaluation,
because the new regularity of the king ring size will make the king
safety function more continuous.
Closes https://github.com/official-stockfish/Stockfish/pull/1512
Bench: 5934103
Rewrite the MovePicker class using lambda expressions for move filtering.
Includes code style changes by @mcostalba.
Verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04
This rewrite of MovePicker.cpp seems to trigger less random crashes on Ryzen
machines than the version in previous master (reported by Bojun Guo).
Closes https://github.com/official-stockfish/Stockfish/pull/1454
No functional change.
To more clearly distinguish them from "const" local variables, this patch
defines compile-time local constants as constexpr. This is consistent with
the definition of PvNode as constexpr in search() and qsearch(). It also
makes the code more robust, since the compiler will now check that those
constants are indeed compile-time constants.
We can go even one step further and define all the evaluation and search
compile-time constants as constexpr.
In generate_castling() I replaced "K" with "step", since K was incorrectly
capitalised (in the Chess960 case).
In timeman.cpp I had to make the non-local constants MaxRatio and StealRatio
constepxr, since otherwise gcc would complain when calculating TMaxRatio and
TStealRatio. (Strangely, I did not have to make Is64Bit constexpr even though
it is used in ucioption.cpp in the calculation of constexpr MaxHashMB.)
I have renamed PieceCount to pieceCount in material.h, since the values of
the array are not compile-time constants.
Some compile-time constants in tbprobe.cpp were overlooked. Sides and MaxFile
are not compile-time constants, so were renamed to sides and maxFile.
Non-functional change.
Implements renaming suggestions by Marco Costalba, Günther Demetz,
Gontran Lemaire, Ronald de Man, Stéphane Nicolet, Alain Savard,
Joost VandeVondele, Jerry Donald Watson, Mike Whiteley, xoto10,
and I hope that I haven't forgotten anybody.
Perpetual renaming thread for suggestions:
https://github.com/official-stockfish/Stockfish/issues/1426
No functional change.
If search depth is less than ONE_PLY call qsearch(), no need to check the
depth condition at various call sites of search().
Passed STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14568 W: 3011 L: 2877 D: 8680
http://tests.stockfishchess.org/tests/view/5aa846190ebc59029781015b
Also helps gcc to find some optimizations (smaller binary, some speedup).
Thanks to Aram and Stefan for identifying an oversight in an early version.
Closes https://github.com/official-stockfish/Stockfish/pull/1487
No functional change.
The NO_BSF does not cover any real life use-case today. The only compilers that
can compile SF today, with the current Makefile and no source code changes, are
either GCC compatible (define __GNUC__) or MSVC compatible (define _MSC_VER). So
they all support LSB/MSB intrinsics.
This patch simplifies away the software fall-backs of LSB/MSB that were still
in Stockfish code, but unused in any of the officially supported compilers.
Note the (legacy) MSVC/WIN32 case, where we use a 32-bit BSF/BSR solution, as
64-bit intrinsics aren't available there.
Discussed in: https://github.com/official-stockfish/Stockfish/pull/1447
and: https://github.com/official-stockfish/Stockfish/pull/1479
No functional change.
Adjust ProbCut rBeta by whether the score is improving, and also
set improving to false when in check. More precisely, this patch
has two parts:
1) the increased beta threshold for ProbCut is now adjusted based
on whether the score is improving
2) when in check, improving is always set to false.
Co-authored by Joost VandeVondele (@vondele) and Bill Henry (@VoyagerOne).
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13480 W: 2840 L: 2648 D: 7992
http://tests.stockfishchess.org/tests/view/5aa693fe0ebc59029781004c
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 25895 W: 4099 L: 3880 D: 17916
http://tests.stockfishchess.org/tests/view/5aa6ac940ebc59029781006e
In terms of opportunities for future work opened up by this patch,
the ProbCut rBeta formula could probably be tuned to gain more Elo.
Closes https://github.com/official-stockfish/Stockfish/pull/1485
Bench: 5328254
King and pawn endgames are typically decisive, and a small
advantage is often sufficient to win. Therefore we now take
this into account when computing the initiative adjustment.
This idea came from a series of patches by Gian-Carlo Pascutto.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48770 W: 10203 L: 9845 D: 28722
http://tests.stockfishchess.org/tests/view/5aa58cce0ebc59029780ff8d
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22252 W: 3572 L: 3366 D: 15314
http://tests.stockfishchess.org/tests/view/5aa5b27c0ebc59029780ffad
Ideas for future developement:
- There have been a number of changes to the initiative
calculation lately. Perhaps the coefficients could be
tuned again.
- It may be possible to add special knowledge for other
endgames in the initiative calculation.
Closes https://github.com/official-stockfish/Stockfish/pull/1481
Bench: 5750110
"Loose pieces drop, in blitz keep everything protected"
Adding a small S(2,2) bonus for knights, bishops, rooks, and
queens that are "connected" to each other (in the sense that
they are under attack by our own pieces) apparently is a good
thing. It probably helps the pieces work together a bit better.
STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 12317 W: 2655 L: 2467 D: 7195
http://tests.stockfishchess.org/tests/view/5aa2d86b0ebc590297cb6474
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35725 W: 5516 L: 5263 D: 24946
http://tests.stockfishchess.org/tests/view/5aa2fc6f0ebc590297cb64a8
How to continue from there (by Stefan Geschwentner)?
• First we should identify all other eval terms which have an overlap
with new connectivity bonus (like the outpost bonus). A simple way
would be subtract the connectivity bonus from them and look if this
better, or use a SPSA session for these terms.
• Tuning Connectivity himself with SPSA seems not so promising because
of the small range which is useful. Here manual testing changes of
Connectivity like +-1 seems better.
• The eg value is more important because in endgame the position gets
more open and so attacks on pieces are easier. Another important point
is that when defending/fortress-like positions each defending piece
needs a protection, otherwise attacks on them can break defense.
Closes https://github.com/official-stockfish/Stockfish/pull/1474
Bench: 5318575
Avoid duplicated code after recent commit "Use evaluation trend
to adjust futility margin". We initialize the improving variable
to true in the check case, which allows to avoid redundant code
in the general case.
Tested for speed by snicolet, patch seems about 0.4% faster.
No functional change.
Note: initializing the improving variable to false in the check
case was tested as a functional change, ending yellow in both STC
and LTC. This change is not included in the commit, but it is an
interesting result that could become part of a future patch about
improving or LMR. Reference of the LTC yellow test:
http://tests.stockfishchess.org/tests/view/5aa131560ebc590297cb636e
Allow a potential slider threat from a square currently occupied
by a harmless attacker, just as the recent "knight on queen" patch.
Also from not completely safe squares, use the mobilityArea instead
of excluding all pawns for both SlidersOnQueen and KnightOnQueen
We now compute the potential sliders threat on queen only if opponent
has one queen.
Run as SPRT [0,4] since it is some kind of simplification but maybe
not clearly one.
STC:
http://tests.stockfishchess.org/tests/view/5aa1ddf10ebc590297cb63d8
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 22997 W: 4817 L: 4570 D: 13610
LTC:
http://tests.stockfishchess.org/tests/view/5aa1fe6b0ebc590297cb63e5
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11926 W: 1891 L: 1705 D: 8330
After this patch is committed, we may try to:
• re-introduce some "threat by queen" bonus to make Stockfish's queen
more aggressive (attacking aspect)
• introduce a concept of "queen overload" to force the opponent queen
into passivity and protecting duties (defensive aspect)
• more generally, re-tune the queen mobility array since patches in the
last three months have affected a lot the location/activity of queens.
Closes https://github.com/official-stockfish/Stockfish/pull/1473
bench: 5788691
We give a S(21,11) bonus for knight threats on the next moves
against enemy queen. The threats are from squares which are
"not strongly protected" and which may be empty, contain enemy
pieces or even one of our piece at the moment (N,B,Q,R) -- hence
be two-steps threats in the later case because we will have to
move our piece and *then* attack the enemy queen with the knight.
STC: http://tests.stockfishchess.org/tests/view/5a9e442e0ebc590297cb6162
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35129 W: 7346 L: 7052 D: 20731
LTC: http://tests.stockfishchess.org/tests/view/5a9e6e620ebc590297cb617f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 42442 W: 6695 L: 6414 D: 29333
How to continue from there?
• Trying to refine the threat condition ("not strongly protected")
• Trying the two-steps idea for bishops or rooks threats against queen
Bench: 6051247
Similar removal of superposition code trick as in the
"Simplify tropism computation" patch. This simplification
of the space() function will allow us to specify space
masks which can reach into enemy territory.
passed STC:
LLR: 3.38 (-2.94,2.94) [-3.00,1.00]
Total: 184630 W: 40581 L: 40758 D: 103291
http://tests.stockfishchess.org/tests/view/5a8433360ebc590297cc80c5
passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 231799 W: 37647 L: 37858 D: 156294
http://tests.stockfishchess.org/tests/view/5a96a34a0ebc590297cc8cfd
No functional change.
Add a logarithmic term in the optimism computation, increase
the maximal optimism and lower the contempt offset.
This increases the dynamics of the optimism aspects, giving
a boost for balanced positions without skewing too much on
unbalanced positions (but this version will enter panic mode
faster than previous master when behind, trying to draw faster
when slightly behind). This helps, since optimism is in general
a good thing, for instance at LTC, but too high optimism
rapidly contaminates play.
passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 159343 W: 34489 L: 33588 D: 91266
http://tests.stockfishchess.org/tests/view/5a8db9340ebc590297cc85b6
passed LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 47491 W: 7825 L: 7517 D: 32149
http://tests.stockfishchess.org/tests/view/5a9456a80ebc590297cc8a89
It must be mentioned that a version of the PR with contempt 0
did not pass STC [0,5]. The version in the patch, which uses
default contempt 12, was found to be as strong as current master
on different matches against SF7 and SF8, both at STC and LTC.
One drawback maybe is that it raises the draw rate in self-play
from 56% to 59%, giving a little bit less sensitivity for SF
developpers to find evaluation improvements by selfplay tests
in fishtest.
Possible further work:
• tune the values accurately, while keeping in mind the drawrate issue
• check whether it is possible to remove linear and offset term
• try to simplify the S-shape curve
Bench: 5934644
Use a recursive std::array with variadic template
parameters to get rid of the last redundacy.
The first template T parameter is the base type of
the array, the W parameter is the weight applied to
the bonuses when we update values with the << operator,
the D parameter limits the range of updates (range is
[-W * D, W * D]), and the last parameters (Size and
Sizes) encode the dimensions of the array.
This allows greater flexibility because we can now tweak
the range [-W * D, W * D] for each table.
Patch removes more lines than what adds and streamlines
the Stats soup in movepick.h
Closes PR#1422 and PR#1421
No functional change.
The first depth 2 margin triggers the verification quiescence search.
This qsearch() result has to be better then the second lower margin,
so we only skip the razoring when the qsearch gives a significant
improvement.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 32133 W: 7395 L: 7101 D: 17637
http://tests.stockfishchess.org/tests/view/5a93198b0ebc590297cc8942
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17382 W: 3002 L: 2809 D: 11571
http://tests.stockfishchess.org/tests/view/5a93b18c0ebc590297cc89c2
This Elo-gaining version was further simplified following a suggestion
of Marco Costalba:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15553 W: 3505 L: 3371 D: 8677
http://tests.stockfishchess.org/tests/view/5a964be90ebc590297cc8cc4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13253 W: 2270 L: 2137 D: 8846
http://tests.stockfishchess.org/tests/view/5a9658880ebc590297cc8cca
How to continue after this patch?
Reformating the razoring code (step 7 in search()) to unify the
depth 1 and depth 2 treatements seems quite possible, this could
possibly lead to more simplifications.
Bench: 5765806
In pawn structures like white pawns f6,h6 against black pawns f7,g6,h7
the attack on the king is blocked by the own pawns. So decrease the
penalty for king safety.
See diagram and discussion in
https://github.com/official-stockfish/Stockfish/pull/1434
A sample position that this patch wants to avoid is the following
1rr2bk1/3q1p1p/2n1bPpP/pp1pP3/2pP4/P1P1B3/1PBQN1P1/1K3R1R w - - 0 1
White pawn storm on the king side was a disaster, it locked the king
side completely. Therefore, all the king tropism bonus that white have
on the king side are useless, and kingadjacent attacks too. Master
gives White a static +4.5 advantage, but White cannot win that game.
The patch is lowering this evaluation artefact.
STC:
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 16467 W: 3750 L: 3537 D: 9180
http://tests.stockfishchess.org/tests/view/5a92102d0ebc590297cc87d0
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 64242 W: 11130 L: 10745 D: 42367
http://tests.stockfishchess.org/tests/view/5a923dc80ebc590297cc8806
This version includes reformatting and speed optimization by Alain Savard.
Bench: 5643527
Using a SPSA tuning session to optimize the time management
parameters.
With SPSA tuning it is not always possible to say where improvements
came from. Maybe some variables changed randomly or because result
was not sensitive enough to them. So my explanation of changes will
not be necessarily correct, but here it is.
• When decrease of thinking time was added by Joost a few months ago
if best move has not changed for several plies, one more competing
indicator was introduced for the same purpose along with increase
in score and absence of fail low at root. It seems that tuning put
relatively more importance on that new indicator what allowed to save
time.
• Some of this saved time is distributed proportionally between all
moves and some more time were given to moves when score dropped a lot
or best move changed.
• It looks also that SPSA redistributed more time from the beginning to
later stages of game via other changes in variables - maybe because
contempt made game to last longer or for whatever reason.
All of this is just small tweaks here and there (a few percentages changes).
STC (10+0.1):
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 18970 W: 4268 L: 4029 D: 10673
http://tests.stockfishchess.org/tests/view/5a9291a40ebc590297cc8881
LTC (60+0.6):
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 72027 W: 12263 L: 11878 D: 47886
http://tests.stockfishchess.org/tests/view/5a92d7510ebc590297cc88ef
Additional non-regression tests at other time controls
Sudden death 60s:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14444 W: 2715 L: 2608 D: 9121
http://tests.stockfishchess.org/tests/view/5a9445850ebc590297cc8a65
40 moves repeating at LTC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 10309 W: 1880 L: 1759 D: 6670
http://tests.stockfishchess.org/tests/view/5a9566ec0ebc590297cc8be1
This is a functional patch only for time management, but the bench
does not reflect this because it uses fixed depth search, so the number
of nodes does not change during bench.
No functional change.
This is the sequel of the previous patch, we now let the parent node initialize
stat score to zero once for all grandchildren.
Initialize statScore to zero for the grandchildren of the current position.
So statScore is shared between all grandchildren and only the first grandchild
starts with statScore = 0. Later grandchildren start with the last calculated
statScore of the previous grandchild. This influences the reduction rules in
LMR which are based on the statScore of parent position.
Tests results against the previous patch:
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 23676 W: 5417 L: 5157 D: 13102
http://tests.stockfishchess.org/tests/view/5a9423a90ebc590297cc8a46
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 35485 W: 6168 L: 5898 D: 23419
http://tests.stockfishchess.org/tests/view/5a9435550ebc590297cc8a54
Bench: 5643520
Let the parent node initialize stat score to zero once for all siblings.
Initialize statScore to zero for the children of the current position.
So statScore is shared between sibling positions and only the first sibling
starts with statScore = 0. Later siblings start with the last calculated
statScore of the previous sibling. This influences the reduction rules in
in LMR which are based on the statScore of parent position.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 22683 W: 5202 L: 4946 D: 12535
http://tests.stockfishchess.org/tests/view/5a93315f0ebc590297cc894f
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 48548 W: 8346 L: 8035 D: 32167
http://tests.stockfishchess.org/tests/view/5a933ba90ebc590297cc8962
Bench: 5833683
The current master applies Eval::Tempo even to leaves evaluated
as draw by some of the static evaluation functions of endgame.cpp
(for instance KNN vs K or stalemates in KP vs K). This results in
some lines being reported as +0.07 or -0.07 when the terminal
position has reached such endgames (0.07 being about the value
of a tempo for Stockfish).
This patch does not apply Eval::tempo to these positions. This leads
to more nodes being evaluated as VALUE_DRAW during search, giving more
opportunities for cut-offs in alpha-beta.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 52602 W: 11776 L: 11403 D: 29423
http://tests.stockfishchess.org/tests/view/5a8cb8f60ebc590297cc8546
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,4.00]
Total: 156613 W: 26820 L: 26158 D: 103635
http://tests.stockfishchess.org/tests/view/5a8f452d0ebc590297cc865a
Bench: 4924749
To compute dicovered check or pinned pieces we use some bitwise
operators that are not really needed because already accounted for
at the caller site.
For instance in evaluation we compute:
pos.pinned_pieces(Us) & s
Where pinned_pieces() is:
st->blockersForKing[c] & pieces(c)
So in this case the & operator with pieces(c) is useless,
given the outer '& s'.
There are many places where we can use the naked blockersForKing[]
instead of the full pinned_pieces() or discovered_check_candidates().
This path is simpler than original and gives around 1% speed up for me.
Also tested for speed by mstembera and snicolet (neutral in both cases).
No functional change.
Apply -mdynamic-no-pic in a single place in the Makefile instead of 5 places.
Verified on three different Macs:
- a MacBook from 2013
- a MacBook running MacOS 10.9.5
- an iMac running MacOS 10.13.3
No functional change.
The previous asymmetry measure of the pawn structure only used to
consider the number of pawns on semi-opened files in the position.
With this patch we also increase the measure by the number of passed
pawns for both players.
Many thanks to the community for the nice feedback on the previous
version, with special mentions to Alain Savard and Marco Costalba
for clarity and speed suggestions.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13146 W: 3038 L: 2840 D: 7268
http://tests.stockfishchess.org/tests/view/5a91dd0c0ebc590297cc877e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 27776 W: 4771 L: 4536 D: 18469
http://tests.stockfishchess.org/tests/view/5a91fdd50ebc590297cc879b
How to continue after this patch?
Stockfish will now evaluate more positions with passed pawns, so
tuning the passed pawns values may bring Elo. The patch has also
consequences on the initiative term, where we might want to give
different weights to passed pawns and semi-openfiles (idea by
Stefano Cardanobile).
Bench: 5302866
Move the first killer move out of the capture stage, combining treatment
of first and second killer move.
passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55777 W: 12367 L: 12313 D: 31097
http://tests.stockfishchess.org/tests/view/5a88617e0ebc590297cc8351
Similar to an earlier proposition of Günther Demetz, see pull request #1075.
I think it is more robust and readable than master, why hand-unroll the loop
over the killer array, and duplicate code ?
This version includes review comments from Marco Costalba.
Bench: 5227124
The previous asymmetry measure of the pawn structure only used to
consider the number of pawns on semi-opened files in the postions.
With this patch we also increase the measure by the number of passed
pawns for both players.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13146 W: 3038 L: 2840 D: 7268
http://tests.stockfishchess.org/tests/view/5a91dd0c0ebc590297cc877e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 27776 W: 4771 L: 4536 D: 18469
http://tests.stockfishchess.org/tests/view/5a91fdd50ebc590297cc879b
How to continue from there: Stockfish will now evaluate more positions
with passed pawns, so tuning the passed pawns values may bring Elo.
The patch also has consequences on the initiative term.
Bench: 5302866
When iid (Internal iterative deepening) is invoked, the prior value of ttValue is
not guaranteed to be VALUE_NONE. As such, it is currently possible to enter a state
in which ttValue has a specific value which is inconsistent with tte->bound() and
tte->depth(). Currently, ttValue is only used within the search in a context that
prevents this situation from making a difference (and so this change is non-functional,
but this is not guaranteed to remain the case in the future.
For instance, just changing the tt depth condition in singular extension node to be
tte->depth() >= depth - 4 * ONE_PLY
instead of
tte->depth() >= depth - 3 * ONE_PLY
interacts badly with the absence of ttMove in iid. For the ttMove to become a singular
extension candidate, singularExtensionNode needs to be true. With the current master,
this requires that tte->depth() >= depth - 3 * ONE_PLY. This is not currently possible
if tte comes from IID, since the depth 'd' used for the IID search is always less than
depth - 4 * ONE_PLY for depth >= 8 * ONE_PLY (below depth 8 singularExtensionNode can
never be true anyway). However, with DU-jdto/Stockfish@251281a , this condition can be
met, and it is possible for singularExtensionNode to become true after IID. There are
then two mechanisms by which this patch can affect the search:
• If ttValue was VALUE_NONE prior to IID, the fact that this patch sets ttValue allows
the 'ttValue != VALUE_NONE' condition of singularExtensionNode to be met.
• If ttValue wasn't VALUE_NONE prior to IID, the fact that this patch modifies ttValue's
value causes a different 'rBeta' to be calculated if the singular extension search is
performed.
Tested at STC for non-regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76981 W: 17060 L: 17048 D: 42873
http://tests.stockfishchess.org/tests/view/5a7738b70ebc5902971a9868
No functional change
The razoring heuristic is quite a drastic pruning technique,
using a depth 0 search at internal nodes of the search tree
to estimate the true value of depth n nodes. This patch limits
this razoring to the case of internal nodes of depth 1.
Author: Jarrod Torriero (DU-jdto)
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8043 W: 1865 L: 1716 D: 4462
http://tests.stockfishchess.org/tests/view/5a90a9290ebc590297cc86c1
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32890 W: 5577 L: 5476 D: 21837
http://tests.stockfishchess.org/tests/view/5a90c8510ebc590297cc86d5
Opportunities opened by this patch: it would be interesting to
know if it brings Elo to re-introduce razoring or soft razoring
at depth >= 2, maybe using a larger margin to compensate for the
increased pruning effect.
Bench: 5227124
This is one of the most difficult to understand but also
most important and speed critical functions of SF.
This patch rewrites some part of it to hopefully
make it clearer and drop some redundant variables
in the process.
Same speed than master (or even a bit more).
Thanks to Chris Cain for useful feedback.
No functional change.
Where variable names are explicitly incorrect, I feel morally obligated to at least
suggest an alternative. There are many, but these two are especially egregious.
No functional change.
As far as can tell, semiopenFiles are set if there is a pawn anywhere on
the file. The removed condition would be true even if the pawns were very
advanced, which doesn't make sense if we're looking for a trapped rook.
Seems the engine fairs better with this removed. My guess s that the
condition that mobility is 3 or less does this well enough.
Begs the question whether this is a mobility issue alone... not sure.
Should I do LTC test?
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13377 W: 3009 L: 2871 D: 7497
http://tests.stockfishchess.org/tests/view/5a855be40ebc590297cc8166
Passed LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16288 W: 2813 L: 2685 D: 10790
http://tests.stockfishchess.org/tests/view/5a8575a80ebc590297cc817e
Bench: 5006365
Make contempt dependent on the current score of the root position.
The idea is that we now use a linear formula like the following to decide
on the contempt to use during a search :
contempt = x + y * eval
where x is the base contempt set by the user in the "Contempt" UCI option,
and y * eval is the dynamic part which adapts itself to the estimation of
the evaluation of the root position returned by the search. In this patch,
we use x = 18 centipawns by default, and the y * eval correction can go
from -20 centipawns if the root eval is less than -2.0 pawns, up to +20
centipawns when the root eval is more than 2.0 pawns.
To summarize, the new contempt goes from -0.02 to 0.38 pawns, depending if
Stockfish is losing or winning, with an average value of 0.18 pawns by default.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 110052 W: 24614 L: 23938 D: 61500
http://tests.stockfishchess.org/tests/view/5a72e6020ebc590f2c86ea20
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 16470 W: 2896 L: 2705 D: 10869
http://tests.stockfishchess.org/tests/view/5a76c5b90ebc5902971a9830
A second match at LTC was organised against the current master:
ELO: 1.45 +-2.9 (95%) LOS: 84.0%
Total: 19369 W: 3350 L: 3269 D: 12750
http://tests.stockfishchess.org/tests/view/5a7acf980ebc5902971a9a2e
Finally, we checked that there is no apparent problem with multithreading,
despite the fact that some threads might have a slightly different contempt
level that the main thread.
Match of this version against master, both using 5 threads, time control 30+0.3:
ELO: 2.18 +-3.2 (95%) LOS: 90.8%
Total: 14840 W: 2502 L: 2409 D: 9929
http://tests.stockfishchess.org/tests/view/5a7bf3e80ebc5902971a9aa2
Include suggestions from Marco Costalba, Aram Tumanian, Ronald de Man, etc.
Bench: 5207156
This patch simplifies the time management code, removing the extra
thinking time for moves with draw PV and increasing thinking time
for all moves proportionally by around 4%.
Last time when the time management was carefully tuned was 1.5-2 years
ago. As new patches were getting added, time management was drifting out
of optimum. This happens because when search becomes more precise pv and
score are becoming more stable, there are less fail lows, best move is
picked earlier and there are less best move changes. All this factors are
entering in time management, and average time per move is decreasing with
more and more good patches. For individual patches such effect is small
(except some) and may be up or down, but when there are many of them,
effect is more substantial. The same way benchmark with more and more
patches is slowly drifting down on average.
So my understanding that back in October adding more think time for draw
PV showed positive Elo because time management was not well tuned, there
was more time available, and think_hard patch applied this additional time
to moves with draw PV, while just retuning back to optimum would recover Elo
anyway. It is possible that absence of contempt also helped, as SF9 is showing
less 0.0 scores than the October version.
Anyway, to me it seems that proper place to deal with draw PV is search, and
contempt sounds as much better solution. In time management there is little
additional elo, and if some code is not helping like removed here, it is better
to discard it. It is simpler to find genuine improvement if code is clean.
• Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20487 W: 4558 L: 4434 D: 11495
http://tests.stockfishchess.org/tests/view/5a7706ec0ebc5902971a9854
• Passed LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41960 W: 7145 L: 7058 D: 27757
http://tests.stockfishchess.org/tests/view/5a778c830ebc5902971a9895
• Passed an additional non-regression [-5..0] test at the time control
of 60sec for the game (sudden death) with disabled draw adjudication:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 8438 W: 1675 L: 1586 D: 5177
http://tests.stockfishchess.org/tests/view/5a7c3d8d0ebc5902971a9ac0
• Passed an additional non-regression [-5..0] test at the time control
of 1sec+1sec per move with disabled draw adjudication:
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 27664 W: 5575 L: 5574 D: 16515
http://tests.stockfishchess.org/tests/view/5a7c3e820ebc5902971a9ac3
This is a functional change for the time management code.
Bench: 4983414
The 'eval' debugging command in Terminal did not initialize the Eval::Contempt
variable, leading to random output during debugging sessions (normal search
was unaffected by the bug).
Example of session where the two 'eval' commands should give the same output,
but did not:
./stockfish
position startpos
d
eval
go depth 20
d
eval
The bug is fixed by initializing Eval::Contempt to SCORE_ZERO in Eval::trace
No functional change.
The current logic in master is to continue return quiet moves if their
history score is above 0. It appears as though this check can be
removed, which is also more logically consistent with the “skipQuiets”
semantics used in search.cpp.
This patch may open new opportunitiesto get Elo by changing or
tuning the definition of 'moveCountPruning' in line 830 of search.cpp,
because obeying skipQuiets without checking the history scores makes
the search more sensitive to 'moveCountPruning'.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34780 W: 7680 L: 7584 D: 19516
http://tests.stockfishchess.org/tests/view/5a79f8d80ebc5902971a99db
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 38757 W: 6732 L: 6641 D: 25384
http://tests.stockfishchess.org/tests/view/5a7afebe0ebc5902971a9a46
Bench 4954595
Enable link-time optimization in the Makefile when compiling with clang.
Also update travis.yml to use clang++-5.0 and llvm-5.0-dev.
No functional change.
For these recaptures, we’re are only considering those captures
that recapture the recapture square (small portion of all the
captures). Therefore, scoring all of the captures and pick_besting
out of the whole group is not necessary.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 85583 W: 18978 L: 18983 D: 47622
http://tests.stockfishchess.org/tests/view/5a717faa0ebc590f2c86e9a7
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20231 W: 3533 L: 3411 D: 13287
http://tests.stockfishchess.org/tests/view/5a73ad330ebc5902971a96ba
Bench: 5023593
The difference between QCAPTURES_1 and QCAPTURES_2 quiescence search stages
boils down to a simple check of depth. The way it's being done now is
unnecessarily complex.
This patch is simpler, clearer, and easier to understand.
Passed SPRT[-3..1] test at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 99755 W: 22158 L: 22192 D: 55405
http://tests.stockfishchess.org/tests/view/5a71f41c0ebc590f2c86e9cb
No functional change.
It seems to be a waste of time to loop through all remaining root moves
after finishing each PV line. This patch skips this until we have reached
the last PV line (this is the way it was done in Glaurung and very early
versions of Stockfish).
No functional change in Single PV mode.
MultiPV=3 STC and LTC tests
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 3113 W: 1248 L: 1064 D: 801
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2260 W: 848 L: 679 D: 733
Bench: 5023629
It does the following:
- If a TB win or loss value allows an alpha or beta cutoff, the cutoff is taken.
- Otherwise, the search of the current subtree continues. In PV nodes, the final value returned is adjusted to reflect that the position is a TB win (or loss).
The patch also fixes a potential problem caused by root_probe() and root_probe_wdl() dirtying the root-move scores.
This patch removes the limitation of current master that a mate is never found if the root position is not yet in the TBs, but the path to mate at some point enters the TBs. The patch is intended to preserve the efficiency and effectiveness of the current TB probing approach.
No functional change (withouth TB)
Set the default contempt value of Stockfish to 20 centipawns.
The contempt feature of Stockfish tries to prevent the engine from
simplifying the position too quickly when it feels that it is very
slightly behind, instead keeping the tension a little bit longer.
Various tests in November 2017 have proved that our current imple-
mentation works well against SF7 (which is about 130 Elo weaker than
current master) and than the Elo gain is an increasing function of
contempt, going (against SF7) from +0 Elo when contempt is set at
zero centipawns, to +30 Elo when contempt is 40 centipawns.
See pull request 1325 for details:
https://github.com/official-stockfish/Stockfish/pull/1325
This november discussion left open the decision of which "default"
value for contempt we should use for Stockfish, taking into account
the various uses ofStockfish (opening preparation for humans, computer
online tournaments,analysis tool for web pages, human/computer play,
etc).
This pull request proposes to set the default contempt value of SF
to twenty centipawns, which turns out to be the highest value which
is not a regression against current master, as this seemed to be a
good compromise between risk and safety. A couple of SPRT[-3..1]
tests were done to bisect this value:
Contempt 10: http://tests.stockfishchess.org/tests/view/5a5d42d20ebc5902977e2901 (PASSED)
Contempt 15: http://tests.stockfishchess.org/tests/view/5a5d41740ebc5902977e28fa (PASSED)
Contempt 20: http://tests.stockfishchess.org/tests/view/5a5d42060ebc5902977e28fc (PASSED)
Contempt 25: http://tests.stockfishchess.org/tests/view/5a5d433f0ebc5902977e2904 (FAILED)
Surprisingly, a test at "very long time control" hinted that using
contempt 20 is not only be non-regressive against contempt 0, but
may actually exhibit some small Elo gain, giving a likehood of superio-
rity of 88.7% after 8500 games:
VLTC:
ELO: 2.28 +-3.7 (95%) LOS: 88.7%
Total: 8521 W: 1096 L: 1040 D: 6385
http://tests.stockfishchess.org/tests/view/5a60b2820ebc590297b9b7e0
Finally, there was some concerns that a contempt value of 20 would
be worse than a value of 7, but a test with 20000 games at STC was
neutral:
STC:
ELO: 0.45 +-3.1 (95%) LOS: 61.2%
Total: 20000 W: 4222 L: 4196 D: 11582
http://tests.stockfishchess.org/tests/view/5a64d2fd0ebc590297903868
See the comments in pull request 1361 for the long, nice discussion
(180 entries :-)) leading to the decision to propose contempt 20 as
the default value:
https://github.com/official-stockfish/Stockfish/pull/1361
Whether Stockfish should strictly adhere to the Komodo and Houdini
semantics and add the UCI commands to force the contempt to be White
in the so-called "analysis mode" is still under discussion, and may
be or may not be the object of a future commit.
Bench: 5783344
1. avoid recursive call of verification.
For the interested side to move recursion makes no sense.
For the other side it could make sense in case of mutual zugzwang,
but I was not able to figure out any concrete problematic position.
Allows the removal of 2 local variables.
2. avoid further reduction by removing R += ONE_PLY;
Benchmark with zugzwang-suite (see #1338), max 45 secs per position:
Patch solves 33 out of 37
Master solves 31 out of 37
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e
bench: 5340015
as discussed in issue #1349, the way pages are allocated with calloc might imply some overhead on first write.
This overhead can be large and slow down the first search after a TT resize significantly, especially for large TT.
Using an explicit clear of the TT on resize fixes this problem.
Not implemented, but possibly useful for large TT, is to do this zero-ing using all search threads. Not only would this be faster, it could also lead to a more favorable memory allocation on numa systems with a first touch policy.
No functional change.
The heuristic to avoid thread binding if less than 8 threads are requested resulted in the first 7 threads not being bound.
The branch was verified to yield a roughly 13% speedup by @CoffeeOne on the appropriate hardware and OS, and an earlier version of this patch tested well on his machine:
http://tests.stockfishchess.org/tests/view/5a3693480ebc590ccbb8be5a
ELO: 9.24 +-4.6 (95%) LOS: 100.0%
Total: 5000 W: 634 L: 501 D: 3865
To make sure all threads (including mainThread) are bound as soon as the total number exceeds 7, recreate all threads on a change of thread number.
To do this, unify Threads::init, Threads::exit and Threads::set are unified in a single Threads::set function that goes through the needed steps.
The code includes several suggestions from @joergoster.
Fixes issue #1312
No functional change
For efficiency reasons current master only allows for transposition table sizes that are N = 2^k in size, the index computation can be done efficiently as (hash % N) can be written instead as (hash & 2^k - 1). On a typical computer (with 4, 8... etc Gb of RAM), this implies roughly half the RAM is left unused in analysis.
This issue was mentioned on fishcooking by Mindbreaker:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
Recently a neat trick was proposed to map a hash into the range [0,N[ more efficiently than (hash % N) for general N, nearly as efficiently as (hash % 2^k):
https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
namely computing (hash * N / 2^32) for 32 bit hashes. This patch implements this trick and now allows for general hash sizes. Note that for N = 2^k this just amounts to using a different subset of bits from the hash. Master will use the lower k bits, this trick will use the upper k bits (of the 32 bit hash).
There is no slowdown as measured with [-3, 1] test:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 128498 W: 23332 L: 23395 D: 81771
There are two (smaller) caveats:
1) the patch is implemented for a 32 bit hash (so that a 64 bit multiply can be used), this effectively limits the number of clusters that can be used to 2^32 or to 128Gb of transpostion table. That's a change in the maximum allowed TT size, which could bother those using 256Gb or more regularly.
2) Already in master, an excluded move is hashed into the position key in rather simple way, essentially only affecting the lower 16 bits of the key. This is OK in master, since bits 0-15 end up in the index, but not in the new scheme, which picks the higher bits. This is 'fixed' by shifting the excluded move a few bits up. Eventually a better hashing scheme seems wise.
Despite these two caveats, I think this is a nice improvement in usability.
Bench: 5346341
Current master can yield different staticEvals depending on the path
used to reach the position. The reason for this is that the evaluation after a
null move is always computed subtracting 2 * Eval::Tempo, while this is not
the case for lazy or specialized evals. This patch always adds tempo to evals,
which doesn't affect playing strength:
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 59911 W: 7616 L: 7545 D: 44750
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 104947 W: 18897 L: 18919 D: 67131
Fixes issue #1335
Bench: 5208264
Simplify the other check penalty computation. Compared to current master,
a) it uses a 143 kingDanger penalty instead of S(10, 10) for the "otherCheck"
(credits to ElbertoOne for finding a suitable kingDanger range to replace the score
and to Guardian for showing this could also be a neutral change at LTC).
This makes our king safety model more consistent and simpler.
b) it might also score more than one "otherCheck" penalty for a given piece type instead of just one
c) it might score many pinned penalties instead of just one.
d) It also remove 3 conditionals and uses simpler expressions.
So it was tested as a SPRT[-3, 1]
Passed STC
http://tests.stockfishchess.org/tests/view/5a2b560b0ebc590ccbb8ba6b
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11705 W: 2217 L: 2080 D: 7408
And LTC
http://tests.stockfishchess.org/tests/view/5a2bfd0d0ebc590ccbb8bab0
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 26812 W: 3575 L: 3463 D: 19774
Trying to improve on b) another attempt was made to score also the
"otherchecks" for piece types which had some safe checks, but this
failed STC http://tests.stockfishchess.org/tests/view/5a2c79e60ebc590ccbb8badd
bench: 5149133
* A better contempt implementation for Stockfish
The round 2 of TCEC season 10 demonstrated the benefit of having a nice contempt implementation: it gives the strongest programs in the tournament the ability to slow down the game when they feel the position is slightly worse, prefering to stay in a complicated (even if slightly risky) middle game rather than simplifying by force into a drawn endgame.
The current contempt implementation of Stockfish is inadequate, and this patch is an attempt to provide a better one.
Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370
This contempt implementation is showing promising results in certains situations. For instance, it obtained a nice +30 Elo gain when playing with contempt=40 against Stockfish 7, compared to current master:
• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo
This was the result of real cooperative work from the Stockfish team, with key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo) while most of the community helped with feedback and computer time.
In this commit the bench is unchanged by default, but you can test at home with the new contempt in the UCI options. The style of play will change a lot when using contempt different of zero (I repeat: not done in this version by default, however)!
The Stockfish team is still deliberating over the best default contempt value in self-play and the best contempt modeling strategy, to help users choosing a contempt value when playing against much weaker programs. These informations will be given in future commits when available :-)
Bench: 5051254
* Remove the prefetch
No functional change.
Currently the NORTH/WEST/SOUTH/EAST values are of type Square, but conceptually they are not squares but directions. This patch separates these values into a Direction enum and overloads addition and subtraction to allow adding a Square to a Direction (to get a new Square).
I have also slightly trimmed the possible overloadings to improve type safety. For example, it would normally not make sense to add a Color to a Color or a Piece to a Piece, or to multiply or divide them by an integer. It would also normally not make sense to add a Square to a Square.
This is a non-functional change.
Add the -fno-exceptions flag to the Makefile to avoid the unecessary exceptions support in the executable (we do not use any exception in Stockfish at the moment).
This change gives a 9.2% reduction in size for the executable binary.
Before : executable size = 376956 bytes
After: executable size = 347652 bytes
No functional change.
Four very minor edits. Note that tte->save() uses posKey and
not pos.key() in other places.
Originally I also added a futility_move_counts() function to
make things more consistent with the futility_margin() and
reduction() functions. But then razor_margin[] should probably
also be turned into a function, etc. Maybe a good idea, maybe not.
So I did not include it.
Non functional change.
The new "weak" expression helps simplify the safe check calculations for rooks or minors, (but the end result for all the safe checks is the exactly the same as in current master)
The only functional change is for the "outer king ring" (for example, squares f3 g3 h3 when white king is on g1). In current master, there was a 191 penalty if any of these was not defended at all.
With this pr, there is this 191 penalty if any of these is not defended at all or is only defended by a white queen.
Tested as a simplification
STC
http://tests.stockfishchess.org/tests/view/59fb03d80ebc590ccbb89fee
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 66167 W: 12015 L: 11971 D: 42181
(against master (Update Copyright year inMakefile))
LTC
http://tests.stockfishchess.org/tests/view/5a0106ae0ebc590ccbb8a55f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15790 W: 2095 L: 1968 D: 11727
(against master (Handle BxN trade as good capture when history scor))
same as #1296 but rebased on latest master
bench: 5109559
In terms of technical changes this patch eliminates the return
statements from the main loop of pos.see_ge() and replaces two conditional
computations with a single bitwise negation.
No functional change
Stockfish currently relies on the "filter_root_moves" function also
having the side effect of clamping Cardinality against MaxCardinality
(the actual piece count in the tablebases). So if we skip this function,
we will end up probing in the search even without tablebases installed.
We cannot bail out of this function before this check is done, so move
the MultiPV hack a few lines below.
If the PV leads to a draw (3-fold / 50-moves) position
and we're ahead of time, think a little longer, possibly
finding a better way.
As this is most likely effective at higher draw rates,
tried speculative LTC after a yellow STC:
STC:
http://tests.stockfishchess.org/tests/view/59eb173a0ebc590ccbb8975d
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 56095 W: 10013 L: 9902 D: 36180
elo = 0.688 +- 1.711 LOS: 78.425%
LTC:
http://tests.stockfishchess.org/tests/view/59eba1670ebc590ccbb897b4
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 59579 W: 7577 L: 7273 D: 44729
elo = 1.773 +- 1.391 LOS: 99.381%
bench: 5234652
In x/y time controls there was a theoretical possibility
to use all available time few moves before the clock will
be updated with new time. This patch fixes that issue.
Tested at 60/15 time control:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 113963 W: 20008 L: 20042 D: 73913
The test was done without adjudication rules!
Bench 5234652
A band-aid patch to workaround current TB code
limitations with multi PV.
Hopefully this will be removed after committing the
big update of TB impementation, now under discussion.
No functional change.
If the search is quit before skill.pick_best is called,
skill.best_move might be MOVE_NONE.
Ensure skill.best is always assigned anyhow.
Also retire the tricky best_move() and let the underlying
semantic to be clear and explicit.
No functional change.
The first change (ss->statScore >= 0) does nothing.
The second change ((ss-1)->statScore >= 0 ) has a massive change.
(ss-1)->statScore is not set until (ss-1) begins to apply LMR to moves.
So we now increase the reduction for bad quiets when our opponent is
running through the first captures and the hash move.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57762 W: 10533 L: 10181 D: 37048
LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19973 W: 2662 L: 2480 D: 14831
Bench: 5037819
This patch lets ss->ply be equal to 0 at the root of the search.
Currently, the root has ss->ply == 1, which is less intuitive:
- Setting the rootNode bool has to check (ss-1)->ply == 0.
- All mate values are off by one: the code seems to assume that mated-in-0
is -VALUE_MATE, mate-1-in-ply is VALUE_MATE-1, mated-in-2-ply is VALUE_MATE+2, etc.
But the mate_in() and mated_in() functions are called with ss->ply, which is 1 in
at the root.
- The is_draw() function currently needs to explain why it has "ply - 1 > i" instead
of simply "ply > i".
- The ss->ply >= MAX_PLY tests in search() and qsearch() already assume that
ss->ply == 0 at the root. If we start at ss->ply == 1, it would make more sense to
go up to and including ss->ply == MAX_PLY, so stop at ss->ply > MAX_PLY. See also
the asserts testing for 0 <= ss->ply && ss->ply < MAX_PLY.
The reason for ss->ply == 1 at the root is the line "ss->ply = (ss-1)->ply + 1" at
the start for search() and qsearch(). By replacing this with "(ss+1)->ply = ss->ply + 1"
we keep ss->ply == 0 at the root. Note that search() already clears killers in (ss+2),
so there is no danger in accessing ss+1.
I have NOT changed pv[MAX_PLY + 1] to pv[MAX_PLY + 2] in search() and qsearch().
It seems to me that MAX_PLY + 1 is exactly right:
- MAX_PLY entries for ss->ply running from 0 to MAX_PLY-1, and 1 entry for the
final MOVE_NONE.
I have verified that mate scores are reported correctly. (They were already reported
correctly due to the extra ply being rounded down when converting to moves.)
The value of seldepth output to the user should probably not change, so I add 1 to it.
(Humans count from 1, computers from 0.)
A small optimisation I did not include: instead of setting ss->ply in every invocation
of search() and qsearch(), it could be set once for all plies at the start of
Thread::search(). This saves a couple of instructions per node.
No functional change (unless the search searches a branch MAX_PLY deep), so bench
does not change.
This shoudl reduce time losses experienced by
users after new time management code.
Verified for no regression in very short TC (4sec + 0.1)
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35262 W: 7426 L: 7331 D: 20505
Bench 5322108
Use different penalties for weaknesses in the pawn shelter
depending on whether it is on the king's file or not.
STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 71617 W: 13471 L: 13034 D: 45112
LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48708 W: 6463 L: 6187 D: 36058
Bench: 5322108
Two simplifications:
- Remove the initialisation to 0 of occupied, which is now unnecessary.
- Remove the initial check for nextVictim == KING
If nextVictim == KING, then PieceValue[MG][nextVictim] will be 0, so that
balance >= threshold is true. So see_ge() returns true anyway.
No functional change.
Compile with -Werror flag. To make debugging easier
also show compile ourput.
This flag is enabled only in Travis CI, not in the shipped
Makefile becuase we can't test on every possible platform.
In light of issue #1232, a test was performed about the value of '-fno-exceptions' and a second one of the combination '-fno-exceptions -fno-rtti'. It turns out these options are can be removed without introducing slowdown.
STC for removing '-fno-exceptions'
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 13678 W: 2572 L: 2439 D: 8667
STC for removing '-fno-exceptions -fno-rtti' (current patch)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32557 W: 6074 L: 5973 D: 20510
No functional change.
During TB initialisation, Stockfish checks for the presence of WDL
tables but not for the presence of DTZ tables. When attempting to probe
a DTZ table, it is therefore possible that the table is not present.
In that case, Stockfish should neither exit nor report an error.
To verify the bug:
$ ./stockfish
setoption name SyzygyTable value <path_to_WDL_dir>
position fen 8/8/4r3/4k3/8/1K2P3/3P4/6R1 w - -
go infinite
Could not mmap() /opt/tb/regular/KRPPvKR.rtbz
$
(On my system, the WDL tables are in one directory and the DTZ tables
in another. If they are in the same directory, it will be difficult
to trigger the bug.)
The fix is trivial: check the file descriptor/handle after opening
the file.
No functional change.
After increasing the number of threads, the histories were not cleared,
resulting in uninitialized memory usage.
This patch fixes this by clearing threads histories in Thread c'tor as
is the idomatic way.
This fixes issue 1227
No functional change.
credit goes to @mstembera for suggesting this approach.
SEE now deals with castling, promotion and en passant in a similar way.
passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32902 W: 6079 L: 5979 D: 20844
passed LTC
LLR: 3.92 (-2.94,2.94) [-3.00,1.00]
Total: 110698 W: 14198 L: 14145 D: 82355
Bench: 5713905
And set x86 and x64 platforms for real.
Currently this is broken and the same binary is compiled for all platforms.
This is becuase we use a custom build step. OTH the default
build step seems not compatible with cmake generated *sln file.
No functional change.
If any thread found a 'mate in x' stop the search. Previously only
mainThread would do so. Requires the bestThread selection to be
adjusted to always prefer mate scores, even if the search depth is less.
I've tried to collect some data for this patch. On 30 cores, mate finding
seems 5-30% faster on average. It is not so easy to get numbers for this,
as the time to find a mate fluctuates significantly with multi-threaded runs,
so it is an average over 100 searches for the same position. Furthermore,
hash size and position make a difference as well.
Bench: 5965302
Avoid constructing, passing as a parameter and binding a useless empty tuple of pointers in the qsearch move picker constructor.
Also reformat the scoring function in movepicker.cpp and do some cleaning in evaluate.cpp while there.
No functional change.
What this patch does is:
* increase safety margin from 40ms to 60ms. It's worth noting that the previous
code not only used 60ms incompressible safety margin, but also an additional
buffer of 30ms for each "move to go".
* remove a whart, integrating the extra 10ms in Move Overhead value instead.
Additionally, this ensures that optimumtime doesn't become bigger than maximum
time after maximum time has been artificially discounted by 10ms. So it keeps
the code more logical.
Tested at 3 different time controls:
Standard 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58008 W: 10674 L: 10617 D: 36717
Sudden death 16+0
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59664 W: 10945 L: 10891 D: 37828
Tournament 40/10
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16371 W: 3092 L: 2963 D: 10316
bench: 5479946
Add tests for:
- Positions with move list
- Chess960 positions
Now bench covers almost all cases, only few endgames
are still out of reach (verified with lcov)
It is a non functionality patch, but bench
changed because we added new test positions.
bench: 5479946
Rewrite perft to be placed naturally inside new
bench code. In particular we don't have special
custom code to run perft anymore but perft is
just a new parameter of 'go' command.
So user API is now changed, old style command:
$perft 5
becomes
$go perft 4
No functional change.
First step in improving bench to handle
arbitrary UCI commands so to test many
more code paths.
This first patch just set the new code
structure.
No functional change.
In particular clarify that 'sd'
parameter is used only in !movesToGo
case.
Verified with Ivan's check tool it is
equivalent to original code.
No functional change.
The only call site of Time.maximum() corrected by 10.
Do this directly in remaining().
Ponder increased Time.optimum by 25% in init(). Idem.
Delete unused includes.
No functional change.
Avoid a couple of redundant rebuilds and compile
with 2 threads since travis gives 2vCPUs.
Also enable -O1 optimization for valgrind and
sanitizers, it should be safe withouth false
positives and it gives a very sensible speed
up, especially with valgrind.
The spee dup allow us to increase testing to
depth 10, useful for thread sanitizer.
No functional change.
Current update formula ensures that the
possible value range is [-32 * D, 32 * D].
So we never overflow if abs(32 * D) < INT16_MAX
Thanks to Joost and mstembera to clarify this.
No functional change.
The trick is to create an ambiguity for the
compiler in case an unwanted conversion to
Move is attempted like in:
ExtMove m1{Move(17),4}, m2{Move(4),17};
std::cout << (m1 < m2) << std::endl; // 1
std::cout << (m1 > m2) << std::endl; // 1(!)
This fixes issue #1204
No functional change.
Reduces memory footprint by ~1.2MB (per thread).
Strong pressure: small but mesurable gain
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510
Low pressure: no regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458
Strong pressure + LTC: elo gain confirmed
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662
Tested for crashing on overflow and after 70K
games at STC we have only 4 time losses,
possible candidate for an overflow.
No functional change.
We use Position::set() to set root position across
threads. But there are some StateInfo fields (previous,
pliesFromNull, capturedPiece) that cannot be deduced
from a fen string, so set() clears them and to not lose
the info we need to backup and later restore setupStates->back().
Note that setupStates is shared by threads but is accessed
in read-only mode.
This fixes regression introduced by df6cb446ea
Tested with 3 threads at STC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14436 W: 2304 L: 2196 D: 9936
Bench: 5608839
Some warnings after a run of:
$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11
I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.
I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.
Thanks to erbsenzaehler for the original idea.
No functional change.
Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.
Also some renaming and better comment while
there.
No functional change.
The case of three or more bishops against a long king must look at all of the
bishops and not just the first two in the piece lists. This patch makes sure
that the position is treated as a win when there are bishops on opposite
colors. This functional change is very small because bench remains the same.
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 24249 W: 4349 L: 4275 D: 15625
http://tests.stockfishchess.org/tests/view/598186530ebc5916ff64a218
Bench: 5608839
In this rare case (e.g. go infinite on a stalemate),
just spin till ponderhit/stop comes.
The Thread::wait() is a renmant of the old YBWC
code, today with lazy SMP, threads don't need to
wait when outside of their idle loop.
No functional change.
But this time correctly set Threads.ponder
We avoid using 'limits' for passing pondering
flag because we don't want to have 2 ponder
variables in search scope: Search::Limits.ponder
and Threads.ponder. This would be confusing also
because limits.ponder is set at the beginning of
the search and never changes, instead Threads.ponder
can change value asynchronously during search.
No functional change.
This reverts commit 5410424e3d.
After the commit pondering is broken, so revert for now. I will
resubmit with a proper fix.
The issue is mine, Joost original code is correct.
No functional change.
Limits::ponder was used as a signal between uci and search threads,
but is not an atomic variable, leading to the following race as
flagged by a sanitized binary.
Expect input:
```
spawn ./stockfish
send "uci\n"
expect "uciok"
send "setoption name Ponder value true\n"
send "go wtime 4000 btime 4000\n"
expect "bestmove"
send "position startpos e2e4 d7d5\n"
send "go wtime 4000 btime 4000 ponder\n"
sleep 0.01
send "ponderhit\n"
expect "bestmove"
send "quit\n"
expect eof
```
Race:
```
WARNING: ThreadSanitizer: data race (pid=7191)
Read of size 4 at 0x0000005c2260 by thread T1:
Previous write of size 4 at 0x0000005c2260 by main thread:
Location is global 'Search::Limits' of size 88 at 0x0000005c2220 (stockfish+0x0000005c2260)
```
The reason of teh race is that ponder is not just set in UCI go()
assignment but also is signaled by an async ponderhit in uci.cpp:
else if (token == "ponderhit")
Search::Limits.ponder = 0; // Switch to normal search
The fix is to add an atomic bool to the threads structure to
signal the ponder status, letting Search::Limits to reflect just
what was passed to 'go'.
No functional change.
Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.
Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.
We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.
No functional change.
as a lower level routine, movepicker should not depend on the
search stack or the thread class, removing a circular dependency.
Instead of copying the search stack into the movepicker object,
as well as accessing the thread class for one of the histories,
pass the required fields explicitly to the constructor (removing
the need for thread.h and implicitly search.h in movepick.cpp).
The signature is thus longer, but more explicit:
Also some renaming of histories structures while there.
passed STC [-3,1], suggesting a small elo impact:
LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
elo = -0.438 +- 0.660 LOS: 9.7%
No functional change.
A pawn (according to all the searched positions of a bench run) is not supported 85% of the time,
(in current master it is either isolated, backward or "unsupported").
So it made sense to try moving the S(17, 8) "unsupported" penalty value into the base pawn value hoping for a more representative pawn value, and accordingly
a) adjust backward and isolated so that they stay more or less the same as master
b) increase the mg connected bonus in the supported case by S(17, 0) and let the Connected formula find a suitable eg value according to rank.
Tested as a simplification SPRT(-3, 1)
Passed STC
http://tests.stockfishchess.org/tests/view/5970dbd30ebc5916ff649dd6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19613 W: 3663 L: 3540 D: 12410
Passed LTC
http://tests.stockfishchess.org/tests/view/597137780ebc5916ff649de3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24721 W: 3306 L: 3191 D: 18224
Bench: 5581946
Closes#1179
in the last month a couple of timeouts have been seen in travis valgrind testing, leading to undesired false positives. The precise cause of this is unclear: a normal valgrind instrumented run is about 6min, the timeout is 10min. Either there are rare hangs (not reproduced locally), or maybe the actual runtime fluctuates on the travis infrastructure (which uses VMs on AWS as far as I know). This patch leads to roughly a 2x speedup of the instrumented testing by reducing the depth from 10 to 9. If timeouts persist, it needs further analysis.
No functional change.
Closes#1171
For some reason, although game phase is used
only in material, it is computed in Position.
Move computation to material, where it belongs,
and remove the useless call chain.
No functional change.
Instead of having Signals in the search namespace,
make the stop variables part of the Threads structure.
This moves more of the shared (atomic) variables towards
the thread-related structures, making their role more clear.
No functional change
Closes#1149
Optimization options for official stockfish should be
consistent, easy, future proof and simple.
We don't want to optimize for any specific version of gcc
No functional change
Closes#1165
Addition of correction values in case of Imbalance of queens,
depending on the number of light pieces on the side without a queen.
Passed patch:
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29036 W: 5379 L: 5130 D: 18527
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13680 W: 1836 L: 1674 D: 10170
Bench: 6258930
Closes#1155
It is not needed becuase the only case is a real special
one (bench on depth with many threads) and can be easily
rewritten to avoid sharing rootDepth.
Verified with ThreadSanitizer.
No functional change.
Closes#1159
Only one remains (also in tbprobe.cpp), but is bougus.
As a side note, tbprobe.cpp is almost clean, only the last 3
functions probe_wdl(), root_probe() and root_probe_wdl()
are still the original ones and are quite hacky.
Somewhere in the future we will reformat also the last 3
ones. The reason why has not been done before it is because
these functions are really wrong by design and should be
rewritten entirely, not only reformatted.
No functional change.
Closes#1160
Make magic_index() a member of Magic since it uses all it's members
and keep us from having to pass the function pointer around to
init_magics().
No functional change
Closes#1146
The main change of the patch is that now time check
is done only by main thread. In the past, before lazy
SMP, we needed all the threds to check for available
time because main thread could have been blocked on
a split point, now this is no more the case and main
thread can do the job alone, greatly simplifying the logic.
Verified for regression testing on STC with 7 threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11895 W: 1741 L: 1608 D: 8546
No functional change.
Closes#1152
Now we don't need anymore the tricky pointer to
show the failed test. Added some few tests too.
Also small rename in see_ge() while there.
No functional change
Closes#1151
The idea is that chances are the tt-move is best and will be difficult to raise alpha when playing a quiet move.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7582 W: 1415 L: 1259 D: 4908
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 59553 W: 7885 L: 7573 D: 44095
Bench: 5725676
Closes#1147
This patch puts the evaluation helper functions inside EvalInfo struct, which simplifies a bit their signature and (most importantly, IMHO) makes their C++ code much cleaner and simpler to read (by removing the "ei." qualifiers all around in evaluate.cpp).
Also rename the EvalInfo struct into Evaluation class to get a natural invocation v = Evaluation(p).value() to evaluation position p.
The downside is an increase of 20 lines in evaluate.cpp (for the prototypes of the helper functions). The upsides are better readability and a speed-up of 0.6% (by generating all the helpers for the NO_TRACE case together, which helps the instruction cache).
No functional change
Closes#1135
the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.
The patch passed an STC check for no regression:
http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593
Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.
No functional change.
Closes#1130Closes#1129
The change passed an STC regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59350 W: 10793 L: 10738 D: 37819
I verified that there was no change in performance on my machine, but of course YMMV:
Results for 40 tests for each version:
Base Test Diff
Mean 2014338 2016121 -1783
StDev 62655 63441 3860
p-value: 0.678
speedup: 0.001
No functional change.
Closes#1137
TT.new_search() was being called by mainThread in Thread::search(). However, mainThread is the last to start searching, and helper threads could reach a measured rootDepth 10 (on 64 cores) before mainThread increments the TT generation. Fixed by moving the call to MaintThread::search() before helper threads start searching.
No functional change.
Closes#1134
Gather all magic relevant data into a struct.
This changes memory layout putting everything necessary for processing a single square
in the same memory location thus speeding up access.
Original patch by @snicolet
No functional change.
Closes#1127Closes#1128
In the current version a search stops when the current position is the same as
any position earlier in the search stack,
including the root position but excluding positions before the root.
The new version makes an exception for repeating the root position.
This gives correct scores for moves in the MultiPV > 1 mode.
Fixes#948 (see it for the detailed description of the bug).
LTC: http://tests.stockfishchess.org/tests/view/587910bc0ebc5915193f754b
ELO: 0.38 +-1.7 (95%) LOS: 66.8%
Total: 40000 W: 5166 L: 5122 D: 29712
STC: http://tests.stockfishchess.org/tests/view/5922e6230ebc59035df34a50
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 94622 W: 17059 L: 17064 D: 60499
LTC: http://tests.stockfishchess.org/tests/view/59273a000ebc59035df34c03
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61259 W: 7965 L: 7897 D: 45397
Bench: 6599721
Closes#1126
Rearrange and rename all history heuristic code. Naming
is now based on chessprogramming.wikispaces.com conventions
and the relations among the various heuristics are now more
clear and consistent.
No functional change.
The previous patch has added a fraction of the king danger score to the
endgame score of the tapered eval, so it seems natural to perform the
king danger computation later in the endgame.
With this patch we extend the limit of such check analysis down to the
material of Rook+Knight, when we have at least two pieces attacking the
opponent king zone.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7446 W: 1409 L: 1253 D: 4784
http://tests.stockfishchess.org/tests/view/591c097c0ebc59035df3477c
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 14234 W: 1946 L: 1781 D: 10507
http://tests.stockfishchess.org/tests/view/591c24f10ebc59035df3478c
Bench: 5975183
Closes#1121
Fixes a bug in Search::clear, where the filling of CounterMoveStats&, overwrote (currently presumably unused) memory because sizeof(cm) returns the size in bytes, whereas elements was needed.
No functional change
Closes#1119
In current master the size of the king ring varies abruptly from eight
squares when the king is in g8, to 12 squares when it is in g7. Because
the king ring is used for estimating attack strength, this may lead to
an overestimation of king danger in some positions. This patch limits
the king ring to eight squares in all cases.
Inspired by the following forum thread:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/xrUCQ7b0ObE
Passed STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9244 W: 1777 L: 1611 D: 5856
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 87121 W: 11765 L: 11358 D: 63998
Bench: 6121121
Closes#1115
execute an implied ucinewgame upon entering the UCI::loop,
to make sure that searches starting with and without an (optional) ucinewgame
command yield the same search.
This is needed now that seach::clear() initializes tables to non-zero default values.
No functional change
Closes#1101Closes#1104
Replacing the old Protector table with a simple linear formula which takes into account a different slope for each different piece type.
The idea of this simplification of Protector is originated by Alain (Rocky)
STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70382 W: 12859 L: 12823 D: 44700
LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 61554 W: 8098 L: 8031 D: 45425
Bench: 6107863
Closes#1099
In general, this patch handles the cases where we don't have a valid score for each PV line in a multiPV search. This can happen if the search has been stopped in an unfortunate moment while still in the aspiration loop. The patch consists of two parts.
Part 1: The new PVIdx was already part of the k-best pv's in the last iteration, and we therefore have a valid pv and score to output from the last iteration. This is taken care of with:
bool updated = (i <= PVIdx && rootMoves[i].score != -VALUE_INFINITE);
Case 2: The new PVIdx was NOT part of the k-best pv's in the last iteration, and we have no valid pv and score to output. Not from the current nor from the previous iteration. To avoid this, we are now also considering the previous score when sorting, so that the PV lines with no actual but with a valid previous score are pushed up again, and the previous score can be displayed.
bool operator<(const RootMove& m) const {
return m.score != score ? m.score < score : m.previousScore < previousScore; } // Descending sort
I also added an assertion in UCI::value() to possibly catch similar issues earlier.
No functional change.
Closes#502Closes#1074
Testing the release candidate revealed only one minor issue, namely a new warning -Wimplicit-fallthrough (part of -Wextra) triggers in the movepicker. This can be silenced by adding a comment, and once we move to c++17 by adding a standard annotation [[fallthrough]];.
No functional change.
Closes#1090
StepAttacks[] is misdesigned, the color dependance is specific
to pawns, and trying to generalise to king and knights, proves
neither useful nor convinient in practice.
So this patch reformats the code with the following changes:
- Use PieceType instead of Piece in attacks_() functions
- Use PseudoAttacks for KING and KNIGHT
- Rename StepAttacks[] into PawnAttacks[]
Original patch and idea from Alain Savard.
No functional change.
Closes#1086
ss->killers can change while the movepicker is active.
The reason ss->killers changes is related to the singular
extension search in the moves loop that calls search<>
recursively with ss instead of ss+1,
effectively using the same stack entry for caller and callee.
By making a copy of the killers,
the movepicker does the right thing nevertheless.
Tested as a bug fix
STC:
http://tests.stockfishchess.org/tests/view/58ff130f0ebc59035df33f37
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70845 W: 12752 L: 12716 D: 45377
LTC:
http://tests.stockfishchess.org/tests/view/58ff48000ebc59035df33f3d
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28368 W: 3730 L: 3619 D: 21019
Bench: 6465887
Closes#1085
history related scores are not related to evaluation based scores.
For example, can easily exceed the range -VALUE_INFINITE,VALUE_INFINITE.
As such the current type is confusing, and a plain int is a better match.
tested for no regression:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43693 W: 7909 L: 7827 D: 27957
No functional change.
Closes#1070
the order of elements returned by std::partition is implementation defined (since not stable) and could depend on the version of libstdc++ linked.
As std::stable_partition was tested to be too slow (http://tests.stockfishchess.org/tests/view/585cdfd00ebc5903140c6082).
Instead combine partition with our custom implementation of insert_sort, which fixes this issue.
Implementation based on a patch by mstembera (http://tests.stockfishchess.org/tests/view/58d4d3460ebc59035df3315c), which suggests some benefit by itself.
Higher depth moves are all sorted (INT_MIN version), as in current master.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33116 W: 6161 L: 6061 D: 20894
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 88703 W: 11572 L: 11540 D: 65591
Bench: 6256522
Closes#1058Closes#1065
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 58462 W: 10615 L: 10558 D: 37289
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65061 W: 8539 L: 8477 D: 48045
It is worth noting that an attempt to only increase the bonus passed STC but failed LTC, and
an attempt to remove the cap without increasing the bonus is still running at STC, but will probably fail after more than 100k.
Bench: 6188591
Closes#1063
By adding pos.non_pawn_material(pos.side_to_move()) as a precondition in step 13,
which is already in use in Futility Pruning (child node) and Null Move Pruning for similar reasons.
Pawn endgames, especially those with only 1 or 2 pawns, are simply heavily influenced by zugzwang situations.
Since we are using a bitbase for KPK endgames, I see no reason to accept buggy evals as shown in #760
Patch looks neutral at STC
LLR: 2.32 (-2.94,2.94) [-3.00,1.00]
Total: 79580 W: 10789 L: 10780 D: 58011
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27071 W: 3502 L: 3390 D: 20179
Bench: 6259071
Closes#1051Closes#760
a single Xeon Phi can present itself as a single numa node with up to 288 threads (4 threads per hardware core).
Tested to work as expected with a Xeon Phi CPU 7230 up to 256 threads.
No functional change
Closes#1045
If we can moveCountPrune and next quiet move has negative stats,
then go directly to the next move stage (Bad_Captures).
Reduction formula is tweaked to compensate for the decrease in move count that is used in LMR.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 6847 W: 1276 L: 1123 D: 4448
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48687 W: 6503 L: 6226 D: 35958
Bench: 5919519
Closes#1036
Instead of having a continuous increasing bonus for our number of pawns when calculating imbalance, use a separate lookup array with tuned values.
Idea by GuardianRM.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 16155 W: 2980 L: 2787 D: 10388
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 100478 W: 13055 L: 12615 D: 74808
Bench: 6128779
Closes#1030
Simplifies away all associated checks, leading to a ~0.5% speedup.
The code now explicitly checks if moves are OK, rather than using nullptr checks.
Verified for no regression:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32218 W: 5762 L: 5660 D: 20796
No functional change
Closes#1021
This patch removes the empty rows at the beginning and at the end of
MobilityBonus[] and Protector[] arrays:
• reducing the size of MobilityBonus from 768 bytes to 512 bytes
• reducing the size of Protector from 1024 to 512 bytes
Also adds some comments and cleaner code for the arrays in pawns.cpp
No speed penalty (measured speed-up of 0.4%).
No functional change.
Closes#1018
Replaces the HalfDensity array with an equivalent, compact implementation.
Includes suggestions by mcostalba & snicolet.
No functional change
Closes#1004
By defining "strongly protected" as "protected by a pawn, or protected
by two pieces and not attacked by two enemy pieces".
Passed STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 17050 W: 3128 L: 2931 D: 10991
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 120995 W: 15852 L: 15343 D: 89800
Bench : 6269229
Closes#1016
This eliminates alignment padding and reduces size from 48 to 40 bytes.
This makes the material HashTable smaller and more cache friendly.
No functional change
Closes#1013
Initial protective idea by Snicolet for knight, for other pieces too
Patch add penalties and bonuses for pieces, depending on the distance from the own king
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 21192 W: 3919 L: 3704 D: 13569
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26177 W: 3642 L: 3435 D: 19100
Bench : 6687377
Closes#1012
Positions with pawns on only one flank tend to be more drawish. We add
a term to the initiative bonus to help the attacking player keep pawns
on both flanks.
STC: yellowish run stopped after 257137 games
LLR: -0.92 (-2.94,2.94) [0.00,5.00]
Total: 257137 W: 46560 L: 45511 D: 165066
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 15602 W: 2125 L: 1956 D: 11521
Bench : 6976310
Closes#1009
A tuning patch which cover the following changes:
increase the importance of queen and rook mobility in endgame and
decrease it in mg, since if we use the heavy pieces too early in the game
we will just make opponent develop their pieces by threatening ours.
King Psqt:
1)King will be encouraged more to stay in the first ranks in the MG
2)and will be encouraged more to go to the middle of the board/last ranks in the EG
Bishop scale better in EG
Logical changes on various psqt tables
1/6 of the changes of the last tuning session on mobility tables
STC: LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 227879 W: 41240 L: 40313 D: 146326
LTC : LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 167047 W: 21871 L: 21291 D: 123885
Bench: 5695960
Closes#1008
Fixes failing build for
make ARCH=x86-32 clean && make ARCH=x86-32 optimize=no build
by passing -m32 also to the link step.
Extend travis testing accordingly.
No functional change.
Closes#999
Minor non-functional simplifications in computing the scale factor.
In my opinion, the code is now slightly more readable:
- remove one condition which can never be satisfied.
- immediately return instead of assigning the sf variable.
Tested for non-regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 62162 W: 11166 L: 11115 D: 39881
No functional change
Closes#992
Also the penalty/bonus function is misleading, we
should simply change it to stat_bonus(depth) for
bonus and -stat_bonus(depth+ ONE_PLY) for extra
penalty.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 11656 W: 2183 L: 2008 D: 7465
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 11152 W: 1531 L: 1377 D: 8244
Bench: 6101931
Results for 20 tests for each version (pgo-builds):
Base Test Diff
Mean 2110519 2118116 -7597
StDev 8727 4906 10112
p-value: 0,774
speedup: 0,004
Further verified for no regression:
http://tests.stockfishchess.org/tests/view/5885abd10ebc5915193f79e6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 21786 W: 3959 L: 3840 D: 13987
No functional change
Move more code into eval_init, removing some
clutter in the main routine.
Write eval_init only from "our" point of view
(do not init the attackedBy[Them] bitboards).
Add mobilityArea to the evalinfo
A few edits while being there
tested for non-regression at STC
http://tests.stockfishchess.org/tests/view/587fab230ebc5915193f77d9
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39585 W: 7183 L: 7094 D: 25308
Non functional change.
After we have taken into account all cheap evaluation
terms, we check whether the score exceeds a given threshold.
If this is the case, we return a scaled down evaluation.
STC:
LLR: 3.35 (-2.94,2.94) [0.00,5.00]
Total: 12575 W: 2316 L: 2122 D: 8137
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 67480 W: 9016 L: 8677 D: 49787
Current version is the one rewritten by ceebo
further edited by me.
Bench: 5367704
After the history simplifications, we are only using Value Stats for CounterMoveHistory table. Therefore the parameter CM is not necessary.
No functional change.
Add asserts to check for overflow in Score * int multiplication.
There is no overflow in current master, but it would be easy to
create one as the scale of the current eval does not leave many
spare bits. For instance, adding the following unused variables
in master at the end of evaluate() (line 882 of evaluate.cpp)
overflows:
Score s1 = score * 4; // no overflow
Score s2 = score * 5; // overflow
Assertion failed: (eg_value(result) == (i * eg_value(s))),
function operator*, file ./types.h, line 336.
Same md5 checksum as current master for non debug compiles.
No functional change.
Order the enum and the array the same way they appear around line 250.
Makes it much easier to follow.
Add comments in the array definition and critical rows.
Use same terminology as elsewhere in pawns.cpp
No functional change.
Previously, we had duplicated History:
- one with (piece,to) called History
- one with (from,to) called FromTo
Now that we have only one, rename it to History, which is the generally accepted
name in the chess programming litterature for this technique.
Also correct some comments that had not been updated since the introduction of CMH.
No functional change.
Perform more complex verification and validation.
- signature.sh : extract and optionally compare Bench/Signature/Node count.
- perft.sh : verify perft counts for a number of positions.
- instrumented.sh : run a few commands or uci sequences through valgrind/sanitizer instrumented binaries.
- reprosearch.sh : verify reproducibility of search.
These script can be used from directly from the command line in the src directory.
Update travis script to use these shell scripts.
No functional change.
Now taht we correctly value-initialize Thread objects,
we don't need c'tors anymore because tables will be
zero-initialized by the compier when Thread object
is instanced.
Verified that we have no errors with Valgrind.
No functional change.
If not explicitly initialized in a class constructor,
then all data members are default-initialized when
the corresponing struct/class is instanced.
For array and built-in types (int, char, etc..)
default-initialization is a no-op and we need to
explicitly zero them.
No functional change.
Tested using syzygy bench method:
- 2016 random positions ranging between 3 and 10 pieces
- each searched using bench at depth=10
Same node count (and no speed regression).
No functional change.
EasyMove is cleared after every iteration of the
search if the 3rd move in the PV of the main thread
changes from the previous iteration. Therefore,
clearing EasyMove during a search iteration may be
excessive. The tests show that this is indeed unnecessary.
In the new version the EasyMove variable is used only in
the Thread::search function.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47719 W: 8438 L: 8362 D: 30919
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 122841 W: 15448 L: 15457 D: 91936
bench: 5468995
Implement a threefold repetition detection. Below are the examples of
problems fixed by this change.
Loosing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested a loosing move "bestmove a8a7", the new code suggests "bestmove a8b7" leading to a draw.
Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0", the new code evaluation is around "cp -50" which is adequate.
Brings 0.5-1 ELO gain. Passes [-3.00,1.00].
STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746
LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397
Patch has been rewritten into current form for simplification and
logic slightly changed so that return a draw score if the position
repeats once earlier but after or at the root, or repeats twice
strictly before the root. In its original form, repetition at root
was not returned as an immediate draw.
After retestimng testing both version with SPRT[-3, 1], both passed
succesfully, but this version was chosen becuase more natural. There is
an argument about MultiPV in which an extended draw at root may be sensible.
See discussion here:
https://github.com/official-stockfish/Stockfish/pull/925
For documentation, current version passed both at STC and LTC:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51562 W: 9314 L: 9245 D: 33003
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 115663 W: 14904 L: 14906 D: 85853
bench: 5468995
Non-functional changes
a) splitting the threat array to avoid using an enum
b) reorder the scores according to functions where they are used.
c) declarations in evaluate_pieces after the const(s) like elsewhere
d) more compact definitions of KingFlank,
now that we need it also for the PanwLessFlank penalty.
e) reuse CenterFiles in evaluate_space
f) move one line inside next popcount
No functional change.
It was a bit of a hack, without intrinsic value, but rather compensating for the
fact that checks were mistuned.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 88308 W: 15553 L: 15545 D: 57210
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53115 W: 6741 L: 6662 D: 39712
bench 5468995
Fixes the only exception, in razoring.
The code already does assert(PvNode || (alpha == beta - 1)), and it can be verified by studying the program flow that this is indeed the case, also for the modified line.
No functional change.
make skipEarlyPruning a search argument instead of managing this by hand.
Verified for no regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 96754 W: 17089 L: 17095 D: 62570
No functional change.
* Refactor bonus and penalty calculation
Compute common terms in a helper function.
No functional change.
* Further refactoring
Remove some parenthesis that are now useless.
Define prevSq once, use repeatedly.
No functional change.
bench: 5884767 (bench of previous patch is wrong)
This patch tweaks some pawn values to favor flank attacks.
The first part of the patch increases the midgame psqt values of external pawns to launch more attacks (credits to user GuardianRM for this idea), while the second part increases the endgame connection values for pawns on upper ranks.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 34997 W: 6328 L: 6055 D: 22614
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 13844 W: 1832 L: 1650 D: 10362
Bench: 5884767
GCC compiles builtin_clzll to “63 ^ BSR”. BSR is processor instruction "Bit Scan Reverse".
So old msb() function is basically 63 - 63 ^ BSR.
Unfortunately, GCC fails to simplify this expression.
Old function compiles to
bsrq %rdi, %rdi
movl $63, %eax
xorq $63, %rdi
subl %edi, %eax
ret
New function compiles to
bsrq %rdi, %rax
ret
BTW, Clang compiles both function to the same (optimal) code.
No functional change.
The needed Windows API for processor groups could be missed from old Windows
versions, so instead of calling them directly (forcing the linker to resolve
the calls at compile time), try to load them at runtime. To do this we need
first to define the corresponding function pointers.
Also don't interfere with running fishtest on numa hardware with Windows.
Avoid all stockfish one-threaded processes will run on the same node
No functional change.
This patch fixed bugs #859 and #882.
At initialization we generate a new random key (Zobrist::noPawns).
It's added to the pawn key of all positions, so that the pawn key
of a pawnless position is no longer 0.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 21307 W: 3738 L: 3618 D: 13951
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 45270 W: 5737 L: 5648 D: 33885
No functional change.
Under Windows it is not possible for a process to run on more than one
logical processor group. This usually means to be limited to use max 64
cores. To overcome this, some special platform specific API should be
called to set group affinity for each thread. Original code from Texel by
Peter sterlund.
Tested by Jean-Paul Vael on a Xeon E7-8890 v4 with 88 threads and confimed
speed up between 44 and 88 threads is about 30%, as expected.
No functional change.
This refines the profile-build target to avoid 'touch'ing the sources,
keeping meaningful modification dates and avoiding editor warnings like vi's:
WARNING: The file has been changed since reading it!!!
Do you really want to write to it (y/n)?
Instead of touching sources, the (instrumented) object files are removed,
which has the same effect of rebuilding them in the next step.
As a side effect, this simplifies the Makefile a bit.
No functional change.
A position can never repeat the one on the previous move.
Thus we can start searching for a repetition from the 4th
ply behind. In the case:
std::min(st->rule50, st->pliesFromNull) < 4
We don't need to do any more calculations. This case happens
very often - in more than a half of all calls of the function.
No functional change.
Anonymous struct inside anonymous unions are a GCC extension.
This patch uses named structs to stick to the C+11 standard.
Avoids a string of warnings on the Clang compiler.
Non functional change (same bench and same MD5 signature,
so compiled code is exactly the same as in current master)
Makes the actual number of nodes searched match closely
the number of nodes requested, by increasing the frequency
of checking the number of nodes searched at low node count.
All other searches retain the default checking frequency of
once per 4096 nodes, and are thus unaffected.
Passed STC as non-regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26643 W: 4766 L: 4655 D: 17222
No functional change.
In 10 of 12 calls total to Position::do_move()the givesCheck argument is
simply gives_check(m). So it's reasonable to make an overload without
this parameter, which wraps the existing version.
No functional change.
Currently, we only check if there is a pawn in place
to make the en-passant capture. Now also check that
there is a pawn that could just have advanced two
squares. Also update the corresponding comment.
This makes the parsing of FENs a bit more robust, and
now correctly handles positions like the one reported by Dann Corbit.
position fen rnbqkb1r/ppp3pp/3p1n2/3P4/8/2P5/PP3PPP/RNBQKB1R w KQkq e6
d
+---+---+---+---+---+---+---+---+
| r | n | b | q | k | b | | r |
+---+---+---+---+---+---+---+---+
| p | p | p | | | | p | p |
+---+---+---+---+---+---+---+---+
| | | | p | | n | | |
+---+---+---+---+---+---+---+---+
| | | | P | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | P | | | | | |
+---+---+---+---+---+---+---+---+
| P | P | | | | P | P | P |
+---+---+---+---+---+---+---+---+
| R | N | B | Q | K | B | | R |
+---+---+---+---+---+---+---+---+
Fen: rnbqkb1r/ppp3pp/3p1n2/3P4/8/2P5/PP3PPP/RNBQKB1R w KQkq - 0 1
No functional change.
Non functional change, tests under sanitizers OK.
Rationales for change
- Offset in code is in range -4 ... 2
- There was an error by (pathological) corner case MAX_PLY=0
No functional change.
Casting a pointer to a different type with stricter alignment
requirements yields to implementation dependent behaviour.
Practicaly everything is fine for common platforms because the
CPU/OS/compiler will generate correct code, but anyhow it is
better to be safe than sorry.
Testing with dbg_hit_on() shows that the unalignment accesses are
very rare (below 0.1%) so it makes sense to split the code in a
fast path for the common case and a slower path as a fallback.
No functional change (verified with TB enabled).
Small fixes for compilation with sanitize=yes optimize=no,
by always adding -fsanitize=undefined to the LDFLAGS as required.
Updates config-sanity to check&report the status of the flag.
No functional change.
Rewrite the code in SF style, simplify and
document it.
Code is now much clear and bug free (no mem-leaks and
other small issues) and is also smaller (more than
600 lines of code removed).
All the code has been rewritten but root_probe() and
root_probe_wdl() that are completely misplaced and should
be retired altogheter. For now just leave them in the
original version.
Code is fully and deeply tested for equivalency both in
functionality and in speed with hundreds of games and
test positions and is guaranteed to be 100% equivalent
to the original.
Tested with tb_dbg branch for functional equivalency on
more than 12M positions.
stockfish.exe bench 128 1 16 syzygy.epd
Position: 2016/2016
Total 12121156 Hits 0 hit rate (%) 0
Total time (ms) : 4417851
Nodes searched : 1100151204
Nodes/second : 249024
Tested with 5,000 games match against master, 1 Thread,
128 MB Hash each, tc 40+0.4, which is almost equivalent
to LTC in Fishtest on this machine. 3-, 4- and 5-men syzygy
bases on SSD, 12-moves opening book to emphasize mid- and endgame.
Score of SF-SyzygyC++ vs SF-Master: 633 - 617 - 3750 [0.502] 5000
ELO difference: 1
No functional change.
Avoid shifting negative signed integers and use typed
enum to avoids decrementing a variable beyond its defined
range, like:
for (Rank r = RANK_8; r >= RANK_1; --r)
Changes were tested individually and passed SPRT[-3, 1].
With this patch gcc --sanitize builds cleanly.
No functional change.
Instead of outputting "info nodes ... time ..." when the last
iteration is interrupted, simply call UCI::pv() to output the PV.
I thought about calling UCI:pv() with bounds -VALUE_INFINITE, VALUE_INFINITE
to avoid "lowerbound" or "upperbound" appearing in it, but I'm not sure that
would be any better.
This patch fixes rare inconsistencies between the first move of
the last PV output and the bestmove played. It also makes sure
that all the latest statistics are sent to the GUI (not only nodes
and time but also nps, tbhits, hashfull).
No functional change.
Restore original behaviour to reset
the counter before a new move search.
Also fixed some warnings and added const
qualifier to a couple of functions, as
suggested by m_stembera.
Thanks to Werner Bergmans for reporting
the regression.
No functional change.
Use a per-thread counter to reduce contention
with many cores and endgame positions.
Measured around 1% speed-up on a 12 core and 8%
on 28 cores with 6-men, searching on:
7R/1p3k2/2p2P2/3nR1P1/8/3b1P2/7K/r7 b - - 3 38
Also retire the unused set_nodes_searched() and fix
a couple of return types and naming conventions.
No functional change.
For a default bench, this fixes the last valgrind
error (jump on uninitialised value).
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 187869 W: 33303 L: 33463 D: 121103
No functional change.
We reference (ss-1)->currentMove, i.e. we peek
current move of the parent node, so currentMove
should be valid in the main move loop, when we
search() the subtree, but outside of main loop
it is useless.
No functional change.
Also a speedup since we don't need to recalculate SEE
for extensions...as it already determined to be positive.
Results for 12 tests for each version:
Base Test Diff
Mean 2132395 2191002 -58607
StDev 128058 85917 134239
p-value: 0.669
speedup: 0.027
Non functional change.
The target:
Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:
float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.
-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.
m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.
These modifications should introduce change only for non Android armv7 build.
No functional change.
The target:
Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:
float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.
-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.
m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.
These modifications should introduce change only for non Android armv7 build.
No functional change.
Stephane's patch removes the only usage of Position::see, where the
returned value isn't immediately compared with a value. So I replaced
this function by its optimised and more specific version see_ge. This
function also supersedes the function Position::see_sign.
bool Position::see_ge(Move m, Value v) const;
This function tests if the SEE of a move is greater or equal than a
given value. We use forward iteration on captures instread of backward
one, therefore we don't need the swapList array. Also we stop as soon
as we have enough information to obtain the result, avoiding unnecessary
calls to the min_attacker function.
Speed tests (Windows 7), 20 runs for each engine:
Test engine: mean 866648, st. dev. 5964
Base engine: mean 846751, st. dev. 22846
Speedup: 1.023
Speed test by Stephane Nicolet
Fishtest STC test:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26040 W: 4675 L: 4442 D: 16923
http://tests.stockfishchess.org/tests/view/57f648990ebc59038170fa03
No functional change.
This is a bit tricky because we don't want
to prune the only legal evasions, even if
with negative SEE. So add an assert to avoid
this subtle bug to slip in later.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 14140 W: 2625 L: 2421 D: 9094
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11558 W: 1555 L: 1379 D: 8624
bench: 5256717
Rename shift_bb() to shift(), and DELTA_S to SOUTH, etc.
to improve code readability, especially in evaluate.cpp
when they are used together:
old b = shift_bb<DELTA_S>(pos.pieces(PAWN))
new b = shift<SOUTH>(pos.pieces(PAWN))
While there fix some small code style issues.
No functional change.
Drop useless condition
abs(ttValue) < VALUE_KNOWN_WIN
And extend singular extension search to cases when ttValue
stores a mate score. This improves mate finding and does
not introduce any regression.
Yery tested this patch against current master on the 6500+
Chest mate suite with 200K fixed nodes:
shortest mates found: master: 1206 patch:1205
any mate found: master: 1903 patch: 2003
with 1 sec time:
shortest mates found: master: 2667 patch: 2628
any mate found: master: 3585 patch: 3646
Verified for no regression:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25655 W: 4578 L: 4465 D: 16612
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66247 W: 8618 L: 8557 D: 49072
bench: 6335042
Both Tablebases::filter_root_moves() and
extract_ponder_from_tt(9 were unable to handle
a mate/stalemate position.
Spotted and reported by Dann Corbit.
Added some mate/stalemate positions to bench so
to early catch this regression in the future.
No functional change.
Use the following transformations:
- to check that A is included in B, testing "(A & ~B) == 0" is faster
than "(A & B) == A"
- to remove the intersection of A and B from A, doing "A &= ~B;" is as
fast as "if (A & B) A &= ~B;" but is simpler.
Overall, the simpler patch version is 0.3% than current master.
No functional change.
Correct pinners calculation and fix bug with pinned
pieces giving check. With this patch 'pinners' only
returns sliders with exactly one defensive piece between
the slider and the attacked square (in other words, pinners
returns exact pinners).
This was a co-operation between Marco Costalba,
Stphane Nicolet and me.
Special thanks to Ronald de Man for reporting the bug with
pinned pieces giving check, discussed here:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/S_4E_Xs5HaE
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 132118 W: 23578 L: 23645 D: 84895
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 36424 W: 4770 L: 4670 D: 26984
bench: 6272231
Instrumentation shows that in make_score(mg, eg) calls, the mg value is
zero in 25,9% of the calls while the eg value is zero in 36,8% of the
calls.
Swapping the internal fields of mg and eg in the internal
representation of Score allows the compiler to optimize away the shift
in (eg << 16) + mg in more cases, thus resulting in a 0.3% speed-up
overall.
No functional change
Rescales the king danger variables in evaluate_king() to
suppress the KingDanger[] array. This avoids the cost of
the memory accesses to the array and simplifies the non-linear
transformation used.
Full credits to "hxim" for the seminal idea and implementation,
see pull request #786.
https://github.com/official-stockfish/Stockfish/pull/786
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9649 W: 1829 L: 1689 D: 6131
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53494 W: 7254 L: 7178 D: 39062
Bench: 6116200
Drops a scalability bottleneck due to memory contention
of a single shared table across threads. The effect starts
to be sensible with a high number of threads. Specifically
we have a small regression with 7 threads both at 60 and
180 seconds TC:
10000 @ 60+0.6 th 7
ELO: -2.46 +-3.2 (95%) LOS: 6.5%
Total: 9896 W: 1037 L: 1107 D: 7752
5000 @ 180+0.6 th 7
ELO: -1.95 +-4.1 (95%) LOS: 17.7%
Total: 5000 W: 444 L: 472 D: 4084
We have a regression because counterMoveHistory table is
quite big and it takes time for a single thread to fill it.
Sharing the table yields to a higher fill rate and better
quality of moves and up to 7 threads the benefits of sharing
more then compensate the loss in speed due to contention.
Interestingly even with a 3X longer TC, so with more time
for the single thread to catch up, the improvment is quite
limited and below noise level. It seems we really need much
longer TC to saturate the table.
When we move to high threads number it's another story:
5000 @ 60+0.6 th 22
ELO: 3.49 +-4.3 (95%) LOS: 94.6%
Total: 4880 W: 490 L: 441 D: 3949
2000 @ 60+0.6 th 32
ELO: 8.34 +-6.9 (95%) LOS: 99.1%
Total: 2000 W: 229 L: 181 D: 1590
As expected the speed-up more than compensates the filling
rate, and we expect that with tournament TC, where single
thread is able to saturate the table, the difference will
be even stronger. For instance for TCEC 9 super-final time
control will be 180 minutes + 15 seconds and this scalability
improvement seems definitely the way to go.
So, summarizing:
GOOD:
Measured big improvement in high core scenario
Suitable for TCEC 9 superfinal (big hardware, very long TC)
Consistent and natural patch that extends to counterMoveHistory
what we already do for remaining history tables, that are all per-thread
Non functional change for the common case of a single core
Very simple (just 6 lines modified, no added ones)
BAD:
Small regression (within 2-3 ELO) with few threads and short TC
bench: 5341477
Measured bench speed up goes from 0,7% to 2%,
given the unreliable measure a reverse simmplification
test was done on fishtest:
master vs patch
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 15499 W: 2685 L: 2867 D: 9947
Test result is positive, master is weaker.
No functional change.
This is the most compact and neatest version
is was able to produce.
On normal builds I have a small slowdown:
normal builds base vs. simplification (gcc 4.8.1 Win7-64 i7-3770 @ 3.4GHz x86-64-modern)
Results for 20 tests for each version:
Base Test Diff
Mean 1974744 1969333 5411
StDev 11825 10281 5874
p-value: 0,178
speedup: -0,003
On pgo-builds however I measure a nice 1.1% speedup
pgo-builds base vs. simplification
Results for 20 tests for each version:
Base Test Diff
Mean 1974119 1995444 -21325
StDev 8703 5717 4623
p-value: 1
speedup: 0,011
No functional change.
Don't allow pinned pieces to attack the exchange-square as long all
pinners (this includes also potential ones) are on their original
square.
As soon a pinner moves to the exchange-square or get captured on it, we
fall back to standard SEE behaviour.
This correctly handles the majority of cases with absolute pins.
bench: 6883133
In evaluate, we start by initializing the pos.psq_score
and adding the material imbalance. After that, we check
whether a specialized eval exists and if yes we return
that value and discard whatever we have computed until now.
It sounds more logical to first probe material entry and
return if we have a specialized eval, and only if it is
not the case initialize eval with some values. There is
no measurable speed-difference on my computer.
Non functional change.
In case we have installed a not complete set of 6-men tables and
there is 6 piece position on board, but no corresponding
tablebase engine is not using any syzygy at all.
Reported by Jouni Uski, fix by Peter Österlund,
confirmed as a bug by Ronald de Man.
bench: 7591630
If the opponent has a cramped position, opening a file often
helps him/her to exchange pieces, so it makes sense to reduce
the space bonus if there are open files.
Credits: Leonardo Ljubičić for the strategic idea, Alain Savard for the
implementation of the open files calculation, "CrunchyNYC" for the
compensation of the numerator.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 49112 W: 9239 L: 8900 D: 30973
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 89415 W: 12014 L: 11601 D: 65800
Bench: 7591630
This greately simplifies usage because hides to the
search the implementation specific CheckInfo.
This is based on the work done by Marco in pull request #716,
implementing on top of it the ideas in the discussion: caching
the calls to slider_blockers() in the CheckInfo structure,
and simplifying the slider_blockers() function by removing its
first parameter.
Compared to master, bench is identical but the number of calls
to slider_blockers() during bench goes down from 22461515 to 18853422,
hopefully being a little bit faster overall.
archlinux, gcc-6
make profile-build ARCH=x86-64-bmi2
50 runs each
bench:
base = 2356320 +/- 981
test = 2403811 +/- 981
diff = 47490 +/- 1828
speedup = 0.0202
P(speedup > 0) = 1.0000
perft 6:
base = 175498484 +/- 429925
test = 183997959 +/- 429925
diff = 8499474 +/- 469401
speedup = 0.0484
P(speedup > 0) = 1.0000
perft 7 (but only 10 runs):
base = 185403228 +/- 468705
test = 188777591 +/- 468705
diff = 3374363 +/- 476687
speedup = 0.0182
P(speedup > 0) = 1.0000
$ ./pyshbench ../Stockfish/master ../Stockfish/test 20
run base test diff
...
base = 2501728 +/- 182034
test = 2532997 +/- 182034
diff = 31268 +/- 5116
speedup = 0.0125
P(speedup > 0) = 1.0000
No functional change.
This non-functional change patch is a deep work to allow SF to be independent
from the actual value of ONE_PLY (currently set to 1). I have verified SF is
now independent for ONE_PLY values 1, 2, 4, 8, 16, 32 and 256.
This patch gives consistency to search code and enables future work, opening
the door to safely tweaking the ONE_PLY value for any reason.
Verified for no speed regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 95643 W: 17728 L: 17737 D: 60178
No functional change.
Rewritten in a way to have explicit in the search
the bonus/penalty we apply: hopefully this will lead
to further simplification/fix of current rather messy
stats update code.
No functional change.
STC: (Yellow)
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 69115 W: 12880 L: 12797 D: 43438
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 124163 W: 16923 L: 16442 D: 90798
Note: Note based off past experiments / patches... history pruning
is quite TC sensitive. I believe the reason for this TC dependency
is that the CMH/FMH is a very large table that takes time to fill
up with. In addition having more time for will increase the accuracy
of the stats' value.
Bench: 7351698
GCC uses SSE instructions to move data but in 32-bit gcc version used
by abrok the stack is not 16-byte aligned due to a bug.
This patch workaround teh bug by not using the stack
to store KingFlank[]
Fixes issue #721.
No functional change.
Checking for legality of a possible ponder move
must be done before we undo the first pv move,
of course. (spotted by mohammed li.)
This obviously only has any effect when playing in ponder mode.
No functional change.
Take advantage that VALUE_NONE = 32002 to remove
the condition.
Commented out and not removed becuase it is tricky
to rely on the hidden value of VALUE_NONE and code
can break in case we change VALUE_NONE in the future.
No functional change.
This code was added before the accurate pv patch, when
we retrieved PV directly from TT.
It's not required for correct (and long) PVs any more and
should be safe to remove it.
Also, allowing helper threads to repeatedly over-write
TT doesn't seem to make sense(that was probably an un-intended
side-effect of lazy smp). Before Lazy SMP only Main Thread used
to run ID loop and insert PV into TT.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74346 W: 13946 L: 13918 D: 46482
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47265 W: 6531 L: 6447 D: 34287
bench: 8819179
Allow to specifiy the log file name, this comes
handy in case of self-matches so that each SF
instance writes into a different log file.
No functional change.
Currently root moves are copied to all teh threads
but are DTZ filtered only in main thread at the
beginning of teh search.
This patch moves the TB filtering before the
copy of root moves fixing issue #679https://github.com/official-stockfish/Stockfish/issues/679
No bench change.
There are two concepts with this patch:
Limit check extensions by using move count.
The idea is to limit search explosion.
Always extend check if the first move gives check.
The idea is to save expensive SEE calls, since the vast
majority of first move will have SEE value >= 0, also
first move may still be strong even if the SEE is negative.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 16503 W: 3068 L: 2873 D: 10562
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 37202 W: 5261 L: 5014 D: 26927
bench: 8543366
Moving a few lines from evaluate_threats to evaluate_pieces allows to
a) Remove a condition pos.count<QUEEN>(Them) == 1
b) use precalculated s instead of pos.square(Them)
c) do not check the condition at all in queenless endings
Passed STC
http://tests.stockfishchess.org/tests/view/5752e0410ebc59029919b1f4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 67175 W: 12194 L: 12152 D: 42829
and LTC
http://tests.stockfishchess.org/tests/view/57587b140ebc59029919b2f4
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20276 W: 2774 L: 2653 D: 14849
bench: 7907962
In this position: 3K4/8/3k4/8/4p3/4B3/5P2/8 w - - 0 5
Current DTZ probe returns 1 instead of 15
What happens is that the double push f4 is erroneously detected as a win move.
After the push we have:
[D]3K4/8/3k4/8/4pP2/4B3/8/8 b - f3 0 5
And here the code misses the possible ep capture exf3.
The bug is in probe_dtz_no_ep() where is used probe_ab() that is
blind to ep captures so it returns v == 2 (win) for position
3K4/8/3k4/8/4pP2/4B3/8/8 b - f3 0 5
Note that at the caller site the original position did not have any
possible ep capture, so probe_dtz() returns immediately after calling
probe_dtz_no_ep().
The fix is to call the ep-aware probe_wdl() instead of probe_ab()
I have verified that DTZ is correct now and also there are no more
mistmatches compared to the new 'syzygy' branch. Tested on a set of
more than 600 endgame positions, included some tricky ones.
For people interested to redo the test or doing additional tests
please pull branch tb_dbg from https://github.com/mcostalba/Stockfish repo.
bench: 8450534 (bench unaffected because syzygy is not exercized during bench)
A middle ground patch of two successful tuning patches,
one at STC, the other at LTC, which now passed both.
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 67893 W: 12777 L: 12384 D: 42732
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 30165 W: 4189 L: 3960 D: 22016
bench: 9209507
Introducing a new multi-purpose penalty related to King safety, which
includes all kind of potential checks (from unsafe or unavailable
squares currently occupied by some other piece)
This will indirectly detect and reward some pins, discovered checks, and
motifs such as square vacation, or rook behind its pawn and aligned with
King (example Black Rg8, g7 against Kg1),
and penalize some pawn blockers (if they move, it allows a discovered
check by the pawn).
And since it looks also at protected squares, it detects some potential
defense overloading.
Finally, the rook contact checks had been removed some time ago. This
test will give a small bonus for them, as well as for bishop contact
checks.
Passed STC
http://tests.stockfishchess.org/tests/view/5729ec740ebc59301a354b36
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 13306 W: 2477 L: 2296 D: 8533
and LTC
http://tests.stockfishchess.org/tests/view/572a5be00ebc59301a354b65
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 20369 W: 2750 L: 2565 D: 15054
bench: 9298175
Just use _mm_popcnt_u64() that is available
both for MSVC abd Intel compiler.
Verified on MSVC that the produced assembly
has the hardware 'popcnt' instruction.
No functional change.
Currently, helper threads will only search up to the
specified depth limit. Now let them search until the
main thread has finished the specified depth.
On the other hand, we don't want to pick a thread with
a higher search depth.
This may be considered cheating. ;-)
No functional change.
It seems that icc used our fallback version of popcount.
Now use intrinsics.
icc version 16.0.2 (gcc version 5.3.0 compatibility)
bmi2 compile
uname -r 4.5.1-1-ARCH
20xbench gives a nice speedup
./stockfish-icc-master 2161515 +- 34462
./stockfish-icc-sse42 2260857 +- 50349
I could not find anything documented that is necessary that prepending -mbmi to -mbmi2 gives some benefit.
Instead at
https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions
The following built-in functions are available when -mbmi is used. All of them generate the machine instruction that is part of the name.
unsigned int __builtin_ia32_bextr_u32(unsigned int, unsigned int);
unsigned long long __builtin_ia32_bextr_u64 (unsigned long long, unsigned long long);
The following built-in functions are available when -mbmi2 is used. All of them generate the machine instruction that is part of the name.
unsigned int _bzhi_u32 (unsigned int, unsigned int)
unsigned int _pdep_u32 (unsigned int, unsigned int)
unsigned int _pext_u32 (unsigned int, unsigned int)
unsigned long long _bzhi_u64 (unsigned long long, unsigned long long)
unsigned long long _pdep_u64 (unsigned long long, unsigned long long)
unsigned long long _pext_u64 (unsigned long long, unsigned long long)
and at
https://gcc.gnu.org/ml/gcc/2014-02/msg00204.html
( "... The real optimization comes from being able to use pext
(parallel bit extract), which can implement several bextr expressions in
parallel.")
Apart from that we don't use all -msse -msse2 -msse3 -msse4.2 etc. but just -msse3 (or -msse4.2) only.
As regards to the speedup within noise level - this pull request is actually reversal of mcostalba#198 wherein prepending -mbmi to -mbmi2 was claimed to be 0.3% faster and here (removing -mbmi) gives 0.4% speed gain.
Seems to give around 1% speed-up for CPUs with popcnt support.
Seems to give a very minor speed-up for CPUs without popcnt.
No functional change
Resolves#646
In this position we should have draw for repetition:
position fen rnbqkbnr/2pppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 moves g1f3 g8f6 f3g1
go infinite
But latest patch broke it.
Actually we had two(!) very subtle bugs, the first is that Position::set()
clears the passed state and in particular 'previous' member, so
that on passing setupStates, 'previous' pointer was reset.
Second bug is even more subtle: SetupStates was based on std::vector
as container, but when vector grows, std::vector copies all its contents
to a new location invalidating all references to its entries. Because
all StateInfo records are linked by 'previous' pointer, this made pointers
go stale upon adding more element to setupStates. So revert to use a
std::deque that ensures references are preserved when pushing back new
elements.
No functional change.
And passed in do_move(), this ensures maximum efficiency and
speed and at the same time unlimited move numbers.
The draw back is that to handle Position init we need to
reserve a StateInfo inside Position itself and use at
init time and when copying from another Position.
After lazy SMP we don't need anymore this gimmick and we can
get rid of this special case and always pass an external
StateInfo to Position object.
Also rewritten and simplified Position constructors.
Verified it does not regress with a 3 threads SMP test:
ELO: -0.00 +-12.7 (95%) LOS: 50.0%
Total: 1000 W: 173 L: 173 D: 654
No functional change.
When starting search in a mate or stalemate position, Stockfish does not
even care to reinitialize and start worker threads. However after search
all threads are checked for the best move.
This can lead to bestmove and info beeing carried over from the last
search.
Example session:
setoption name threads value 7
go movetime 4000
position startpos moves f2f3 e7e5 g2g4 d8h4
go movetime 4000
Actual output is like (almost always):
[...]
bestmove e2e4
info depth 0 score mate 0
info depth 20 seldepth 29 multipv 1 score cp 28 [...] pv e2e4
bestmove e2e4
Expected output / output after fix:
[...]
bestmove e2e4 ponder e7e6
info depth 0 score mate 0
bestmove (none)
Resolves#623
Broken after "32-bit/64-bit Makefile fix" commit.
Ubuntu "Precise" 12.04.5 supports multilib only until
g++ 4.6 that is not enough to compile Stockfish.
So move to Ubuntu 14.04.4 LTS (Trusty Tahr)
No functional change.
There was already a penalty for squares only defended by King (undefended)
This test records a penalty for completely undefended squares in the so called extended king-ring
(so if we exclude squares defended by a Kg8 for example, we only look at h6 g6 and f6)
We also exclude squares occupied by opponent pieces in this computation,
based on the following results
Was yellow at STC
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 112499 W: 20649 L: 20293 D: 71557
and passed LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 36805 W: 5100 L: 4857 D: 26848
Bench: 8430233
Resolves: #619
On top of the usual conditions
a) some opponent in front (but no lever)
b) some neighbours (in front) (but no neighbour behind or same rank)
c) < rank_5
to find out if a pawn is backward we look at the squares in front of this pawn to reach the same rank as the next neighbour.
In current master, a pawn is backward if any of those squares is controlled by an enemy pawn on an adjacent file
In this version, a pawn is ALSO backward if any of those squares is occupied by an enemy pawn.
STC:
http://tests.stockfishchess.org/tests/view/56fe7efd0ebc59301a3541f1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19051 W: 3557 L: 3433 D: 12061
LTC:
http://tests.stockfishchess.org/tests/view/56febc2d0ebc59301a354209
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40810 W: 5619 L: 5526 D: 29665
Bench: 7525245
Resolves#614
Also a speedup(about 1%) on 64-bit w/o hardware popcnt
Retire Max15 and Full template parameters
(Contributed by Marco Costalba)
Now that we have just SW and HW versions, use
template default parameter to get rid of explicit
template parameters.
Retire bitcount.h and move the only defined
function to bitboard.h
No functional change
Resolves#620
Counter intuitively, make build ARCH=x86-32 does NOT produce a 32-bit compile
when running a 64-bit OS. Nor would ARCH=x86-64 produce a 64-bit compile when
running a 32-bit OS (assuming it compiled w/o errors).
No functional change
Resolves#621
lsb(b) and msb(b) are undefined when b == 0. This can lead to subtle bugs, where
the resulting code behaves differently on different configurations:
- It can be the home grown software LSB/MSB
- It can be the compiler generated software LSB/MSB (when using compiler
intrinsics without the right compiler flags to allow compiler to use hardware
LSB/MSB). Which of course depends on the compiler.
- It can be hardware LSB/MSB generated by the compiler.
- Not to mention that hardware LSB/MSB can return different value on different
hardware when b == 0.
No functional change
Resolves#610
Use compiler intrinsics when possible to
avoid writing platform specific asm code.
Tested on Windows 7 with MSVC 2013 and mingw 4.8.3 (32 and 64 bit)
and on Linux Mint with g++ 4.8.4 and clang 3.4 (32 and 64 bit).
No functional change
Resolves#609
Give more weight to the pawns number and
the vertical king distance in evaluate_initiative()
Passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26729 W: 5067 L: 4825 D: 16837
and LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 60480 W: 8338 L: 8016 D: 44126
Bench: 8295162
Resolves#594
PredictedDepth can be negative, causing the futility_margin to be negative.
It will be very difficult to tweak moveCount pruning and reduction formula, as they are tuned to prevent this behavior.
No functional change
Resolves#587
There is no reason to compile 3 different copies of search(). PV nodes are on
the cold path, and PvNode is a template parameter, so there is no cost in
computing:
const bool RootNode = PvNode && (ss-1)->ply == 0;
And this simplifies code a tiny bit as well.
Speed impact is negligible on my machine (i7-3770k, linux 4.2, gcc 5.2):
nps +/-
test 2378605 3118
master 2383128 2793
diff -4523 2746
Bench: 7751425
No functional change.
Resolves#568
In case of a very high material score, we can
overflow VALUE_INFINITE.
This patch fixes an assert with:
position fen 7k/QQQQR3/2B5/4KN1Q/3QQ3/8/8/4R3 b - - 0 1
go depth 1
No functional change.
Resolves#546
Make it explicit that those variables are not globals, but
are used only by main thread. I think it is a sensible
clarification because easy move is already tricky enough
and current patch makes the involved actors explicit.
No functional change.
Resolves#537
Tuned the global mobility factor for each piece, as well as some +- delta,
The master mobility factor was {266,334} and tuning gave
{267, 362} +S(-2,-2) for the Knight
{249, 328} +S( 0,-2) for the Bishop
{298, 353} +S(1,1) for the Rook
{265, 358} +S(2,-1) for the Queen
Passed STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 49402 W: 9367 L: 9037 D: 30998
and LTC
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 26831 W: 3871 L: 3658 D: 19302
Bench: 8355485
Resolves#536
Fix a bug where we could stop the search after only 10% of time used due to a matching easy move but later switch to a different move that was never pre-screened as easy due to SMP thread select.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27227 W: 4910 L: 4800 D: 17517
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40368 W: 5826 L: 5733 D: 28809
Resolves#521
Simplify time management code by removing hard stops for unchanging first root moves.
Search is now stopped earlier at the end iteration if it did not have fail-lows at root.
This simplification also fixes pondering bug. Ponder flag was true by default
and cutechess-cli doesn't change it to false even though no pondering is possible.
Fix the issue by setting the default value of 'Ponder' flag to false.
10+0.1:
ELO: 3.51 +-3.0 (95%) LOS: 99.0%
Total: 20000 W: 3898 L: 3696 D: 12406
40+0.4:
ELO: 1.39 +-2.7 (95%) LOS: 84.7%
Total: 20000 W: 3104 L: 3024 D: 13872
60+0.06:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37231 W: 5333 L: 5236 D: 26662
Stopped run at 100+1:
LLR: 1.09 (-2.94,2.94) [-3.00,1.00]
Total: 37253 W: 4862 L: 4856 D: 27535
Resolves#523Fixes#510
Also inline defintions of SpaceMask and CenterBindMask.
Verified from assembly that compiler computes the values
at compile time, so it is also theoretical faster.
While there factor out scale factor evaluation.
No functional change.
This is used by std::stable_sort() to sort moves from highest score to lowest score.
1) The comment is incorrect since highest to lowest means descending.
2) It's more natural to implement a less operator using another less operator rather than a greater operator.
No functional change.
Resolves#504
Comment is based on a misunderstanding of what unaligned memory access is. Here
is an article that explains it very clearly:
https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt
No matter how we define TTEntry or TTCluster, there will never be any unaligned
memory access. This is because the complier knows the alignment rules, and does
the necessary adjustments to make sure unaligned memory access does not occur.
The issue being adressed here has nothing to do with unaligned memory access. It
is about cache performance. In order to achieve best cache performance:
- we prefetch the cacheline as soon as possible.
- we ensure that TT clusters do not spread across two cachelines. If they did,
we would need to prefetch 2 cachelines, which could hurt cache performance.
Therefore the true conditions to achieve this are:
1/ start adress of TT is cache line aligned. void TranspositionTable::resize()
enforces this.
2/ TT cluster size should *divide* the cache line size. Currently, we pack 2
clusters per cache lines. It used to be 1 before "TT sardines". Does not matter
what the ratio is, all we want is to fit an integer number of clusters per cache
line.
No functional change.
Resolves#506
Instead of creating a running std::thread and
returning, wait in Thread c'tor that the native
thread of execution goes to sleep in idle_loop().
In this way we can simplify how search is started,
because when main thread is idle we are sure also
all other threads will be idle, in any case, even
at thread creation and startup.
After lazy smp went in, we can simpify and rewrite
a lot of logic that is now no more needed. This is
hopefully the final big cleanup.
Tested for no regression at 5+0.1 with 3 threads:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 17411 W: 3248 L: 3198 D: 10965
No functional change.
Now that we don't have anymore TimerThread, there is
no need of this long class hierarchy.
Also assorted reformatting while there.
To verify no regression, passed at STC with 7 threads:
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 30990 W: 4945 L: 4942 D: 21103
No functional change.
When we reach the maximum depth, we can finish the
search without a raise of Signals.stop. However, if
we are pondering or in an infinite search, the UCI
protocol states that we shouldn't print the best move
before the GUI sends a "stop" or "ponderhit" command.
It was broken by lazy smp. Fix it by moving the stopping
of the threads after waiting for GUI.
No functional change.
Indeed, if we use a depth >= DEPTH_MAX, we start having negative depth in the
TT (due to int8_t cast).
No functional change in single thread mode
Resolves#490
Unfortunately std::condition_variable::wait_for()
is not accurate in general case and the timer thread
can wake up also after tens or even hundreds of
millisecs after time has elapsded. CPU load, process
priorities, number of concurrent threads, even from
other processes, will have effect upon it.
Even official documentation says: "This function may
block for longer than timeout_duration due to scheduling
or resource contention delays."
So retire timer and use a polling scheme based on a
local thread counter that counts search() calls and
a small trick to keep polling frequency constant,
independently from the number of threads.
Tested for no regression at very fast TC 2+0.05 th 7:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32969 W: 6720 L: 6620 D: 19629
TC 2+0.05 th 1:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7765 W: 1917 L: 1765 D: 4083
And at STC TC, both single thread
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15587 W: 3036 L: 2905 D: 9646
And with 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8149 W: 1367 L: 1227 D: 5555
bench: 8639247
The only interesting change is the moving of
stack[MAX_PLY+4] back to its original position
in id_loop (now renamed Thread::search).
No functional change.
Reduce the variation in Root Depth between different threads. This
prevents threads from searching at a depth much higher than Main Thread.
Performed well at STC 24 Threads:
ELO: 3.44 +-3.8 (95%) LOS: 96.1%
Total: 10000 W: 1627 L: 1528 D: 6845
And LTC 24 Threads
LLR: 1.43 (-2.94,2.94) [0.00,4.00]
Total: 3804 W: 500 L: 420 D: 2884
ELO : +7.31
p-value: 73.16%
Passed no regression at STC 3 Threads:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40457 W: 7148 L: 7060 D: 26249
And LTC 3 Threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17704 W: 2489 L: 2364 D: 12851
Raising a pull request early as 24 Thread tests are very expensive and
this is clearly a positive gain at high thread counts and high time
controls. The change is a small parameter tweak with no additional
logic.
No functional change for single thread mode.
Resolves#481
Rely on well defined behaviour for message passing, instead of volatile. Three
versions have been tested, to make sure this wouldn't cause a slowdown on any
platform.
v1: Sequentially consistent atomics
No mesurable regression, despite the extra memory barriers on x86. Even with 15
threads and extreme time pressure, both acting as a magnifying glass:
threads=15, tc=2+0.02
ELO: 2.59 +-3.4 (95%) LOS: 93.3%
Total: 18132 W: 4113 L: 3978 D: 10041
threads=7, tc=2+0.02
ELO: -1.64 +-3.6 (95%) LOS: 18.8%
Total: 16914 W: 4053 L: 4133 D: 8728
v2: Acquire/Release semantics
This version generates no extra barriers for x86 (on the hot path). As expected,
no regression either, under the same conditions:
threads=15, tc=2+0.02
ELO: 2.85 +-3.3 (95%) LOS: 95.4%
Total: 19661 W: 4640 L: 4479 D: 10542
threads=7, tc=2+0.02
ELO: 0.23 +-3.5 (95%) LOS: 55.1%
Total: 18108 W: 4326 L: 4314 D: 9468
As suggested by Joona, another test at LTC:
threads=15, tc=20+0.05
ELO: 0.64 +-2.6 (95%) LOS: 68.3%
Total: 20000 W: 3053 L: 3016 D: 13931
v3: Final version: SeqCst/Relaxed
threads=15, tc=10+0.1
ELO: 0.87 +-3.9 (95%) LOS: 67.1%
Total: 9541 W: 1478 L: 1454 D: 6609
Resolves#474
Using less parameters and code to compute Threats
Includes also a few spacing edits.
Run as a simplification.
Passed STC 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18879 W: 3725 L: 3600 D: 11554
Passed LTC 60+0.4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74116 W: 11001 L: 10958 D: 52157
bench: 8004751
Fishtest is a key factor of SF success.
Thanks to Fishtest we have not only greately
improved ELO but, even more important, we
have enabled a kind of joint development and
testing that it is the herat of on open
source project like SF.
Open sourcing is not just about open code, it is
about commuity development. In case of a chess engine
this has never been possible before due to missing
a strong and strict testing environment that allows
many people to contribute in a safe and coordinate way.
Fishtest is a new way of developing chess engines,
something that has never exsisted before.
No functional change.
Collect and give a second try to some almost passed tuning attempts and
one-line tweaks from the last month.
Passed STC
LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781
Bench: 8855226
Resolves#464
Start all threads searching on root position and
use only the shared TT table as synching scheme.
It seems this scheme scales better than YBWC for
high number of threads.
Verified for nor regression at STC 3 threads
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40232 W: 6908 L: 7130 D: 26194
Verified for nor regression at LTC 3 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28186 W: 3908 L: 3798 D: 20480
Verified for nor regression at STC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 3607 W: 674 L: 526 D: 2407
Verified for nor regression at LTC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 4235 W: 671 L: 528 D: 3036
Tested with fixed games at LTC with 20 threads
ELO: 44.75 +-7.6 (95%) LOS: 100.0%
Total: 2069 W: 407 L: 142 D: 1520
Tested with fixed games at XLTC (120secs) with 20 threads
ELO: 28.01 +-6.7 (95%) LOS: 100.0%
Total: 2275 W: 349 L: 166 D: 1760
Original patch of mbootsector, with additional work
from Ivan Ivec (log formula), Joerg Oster (id loop
simplification) and Marco Costalba (assorted formatting
and rework).
Bench: 8116244
Apply bonus for the prior CMH that caused a fail low.
Balance Stats: CMH and History bonuses are updated differently.
This eliminates the "fudge" factor weight when scoring moves. Also
eliminated discontinuity in the gravity history stat formula. (i.e. stat
scores will no longer inverse when depth exceeds 22)
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234
Bench: 7677367
Add Travis CI support to GitHub repo.
After every push to master, Travis will build
the sources directly from GitHub repo according
to .travis.yml and verify everything is ok.
No functional change.
Fix issues after a run of PVS-STUDIO analyzer.
Mainly false positives but warnings are anyhow
useful to point out not very readable code.
Noteworthy is the memset() one, where PVS prefers ss-2
instead of stack. This is because memeset() could
be optimized away by the compiler when using 'stack',
due to stack being a local variable no more used after
memset. This should normally not happen, but when
it happens it leads to very sublte and difficult
to find bug, so better to be safe than sorry.
No functional change.
When changing 'search' and 'splitPointsSize' we have to
use thread locks, not split point ones, because can_join()
is called under the formers.
Verified succesfully with 24 hours toruture tests with 20
cores machine by Louis Zulli: it does not hangs.
Verifyed for no regressions with STC, 7 threads:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 52804 W: 8159 L: 8087 D: 36558
No functional change.
Only refresh TT entry when it's really necessary.
This should give a small speed boost for some machines.
And it's a risk-free change.
No functional change.
Resolves#429
Louis Zulli reported that Stockfish suffers from very occasional hangs with his 20 cores machine.
Careful SMP debugging revealed that this was caused by "a ghost split point slave", where thread
was marked as a split point slave, but wasn't actually working on it.
The only logical explanation for this was double booking, where due to SMP race, the same thread
is booked for two different split points simultaneously.
Due to very intermittent nature of the problem, we can't say exactly how this happens.
The current handling of Thread specific variables is risky though. Volatile variables are in some
cases changed without spinlock being hold. In this case standard doesn't give us any kind of
guarantees about how the updated values are propagated to other threads.
We resolve the situation by enforcing very strict locking rules:
- Values for key thread variables (splitPointsSize, activeSplitPoint, searching)
can only be changed when the thread specific spinlock is held.
- Structural changes for splitPoints[] are only allowed when the thread specific spinlock is held.
- Thread booking decisions (per split point) can only be done when the thread specific spinlock is held.
With these changes hangs didn't occur anymore during 2 days torture testing on Zulli's machine.
We probably have a slight performance penalty in SMP mode due to more locking.
STC (7 threads):
ELO: -1.00 +-2.2 (95%) LOS: 18.4%
Total: 30000 W: 4538 L: 4624 D: 20838
However stability is worth more than 1-2 ELO points in this case.
No functional change
Resolves#422
v = value without ep capture being considered
v1 = value of the ep capture
The correct logic is:
if without e.p. capture we are losing, and the value of e.p is either draw, or win or "loss, but 50 move rule saves us", then we should use the value of ep capture.
Credit and thanks to syzygy1 and lantonov !
No functional change (except with syzygy bases)
Resolves#415Resolves#394
Instead of using hard coded Min and Max values for history,
always adjust the old value slightly downwards before adding a new value.
The adjustment acts like gravity that prevents the value escaping too
far from zero.
Bench: 8020484
Resolves#407
Apart from usual renaiming, take advantage of
C++11 function template default parmeter to
get rid of Eval trampoline functions.
Some triviality fixes while there.
No functional change.
Align the behaviour with reductions. Initially castling moves had to be
treated differently, because the SEE did not handle them correctly. But now it
does.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 83750 W: 15722 L: 15711 D: 52317
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 97183 W: 15120 L: 15115 D: 66948
bench 7759837
Resolves#403
PawnSafePush, with the value S(5,5) proved not "necessary"
possibly due to recent changes to MobilityArea and other changes to Connected bonus.
STC:
LLR: 3.22 (-2.94,2.94) [-3.00,1.00]
Total: 98528 W: 18757 L: 18759 D: 61012
LTC:
LLR: 5.30 (-2.94,2.94) [-3.00,1.00]
Total: 204194 W: 31698 L: 31734 D: 140762
Bench: 7620871
Resolves#396
Use Position::square and Position::squares instead.
This allow us to remove king_square(), simplify
endgames and to have more naming uniformity.
Moreover, this is a prerequisite step in case
in the future we decide to retire piece lists
altoghter and use pop_lsb() to loop across
pieces and serialize the moves. In this way
we just need to change definition of Position::square
to something like:
template<PieceType Pt> inline
Square Position::square(Color c) const {
return lsb(byColorBB[c]);
}
No functional change.
Based off of Pull request #383:
Include squares occupied by some pawns in the MobilityArea
a) not blocked
b) on rank 4 and above
c) or captures
Passed STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8157 W: 1644 L: 1516 D: 4997
And LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 26086 W: 4274 L: 4051 D: 17761
-----------
Then, a simplification test failed, trying to remove b and c)
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6048 W: 1117 L: 1288 D: 3643
Another simplification test, was run to remove just (c)
Passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28073 W: 5364 L: 5255 D: 17454
And LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34652 W: 5448 L: 5348 D: 23856
A parameter tweak test showed that changing b) for "on rank 3 and above"
does not work
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 5233 W: 937 L: 1077 D: 3219
Finally, a small rewrite, and we have this version
Include squares occupied by some pawns in the MobilityArea which are
a) not blocked
b) on rank 4 and above
Bench: 8977899
Resolves#385
Under Windows with MSVC we use the IDE to compile,
in this case we infer some compiler flags usually
set by Makefile.
The condition to check this was wrong, namely when compiling
with mingw under Windows 64 bit we always set IS_64BIT and
USE_BSFQ even if compiled with ARCH=x86-32 (this is how I
found it).
Small code style touches while there.
No functional change.
Error is:
a value of type "int" cannot be assigned to an entity of type "Value"
Also retire the now unused squares_of_color() function.
No functional change.
In Chess960 we can have legal positions with
opponent rook in A or H file and with castling
available, for instance:
4k3/pppppppp/8/8/8/8/PPPPPPPP/rR2K3 w Q - 0 1
In those cases we pick up the wrong rook when
setting castling.
Fix it by checking the color of the rook.
Bug reported by Matthew Lai.
No functional change.
Instead of crafting a clever formula to calculate the array offset, simply use a
3 dimensional array. Remove the comment while at it, because now the code is
self-documenting.
No functional change.
Resolves#344
Introduce helper function Search::reset() which clears all kind of search
memory, in order to restore a deterministic search state.
Generalize TT.clear() into Search::reset() for the following use cases:
- bench: needed to guarantee deterministic bench (ie. if you call bench from
interactive command line twice in a row you get the same value).
- Clear Hash: restore clean search state, which is the purpose of this button.
- ucinewgame: ditto.
No functional change.
Resolves#346
Based on an idea and patch by VoyagerOne.
Small simplification, but was tedted for an ELO gain anyway.
STC:
LLR: 2.95 (-2.94,2.94) [-1.00,4.00]
Total: 5375 W: 1119 L: 977 D: 3279
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 17893 W: 2984 L: 2792 D: 12117
bench 8322847
Use symmetry along vertical middle axis of the board
to reduce the number of parameters.
For instance psqt value of SQ_A5 == SQ_A4 and value of
SQ_F8 == SQ_F1.
This is always true, at least until now nobody came in
with an asymmetric psqt table that worked.
Original patch by Lucas.
No functional change.
Easier for tuning psq tables:
TUNE(myParameters, PSQT::init);
Also move PSQT code in a new *.cpp file, and retire the
old and hacky psqtab.h that required to be included only
once to work correctly, this is not idiomatic for a header
file.
Give wide visibility to psq tables (previously visible only
in position.cpp), this will easy the use of psq tables outside
Position, for instance in move ordering.
Finally trivial code style fixes of the latest patches.
Original patch of Lucas Braesch.
No functional change.
In ei.attackedBy, Queen does not x-ray through Rook, but the Rook does
X-ray through the Queen.
So most of the rook contact checks supported by queen are, in fact,
Queen Contact Checks and they are already scored separately.
Bench: 7762189
Resolves#338
Currently Zobrist::castling[] are not properly zeroed
and rely on the compiler to do this at startup, but this
makes Position::init() to set different values every time
it is called!
This is a bit odd, and although not impacting normal usage,
can yield to subtle misbehaviour, very difficult to track
down, in case we happen to call it more than once for some
reason. I found this while developing tuning support and
it took me a while to track it down.
So properly init Zobrist::castling[]
No functional change.
Resolves#329
When running more games in parallel, or simply when running a game
with a background process, due to how OS scheduling works, there is no
guarantee that the CPU resources allocated evenly between the two
players. This introduces noise in the result that leads to unreliable
result and in the worst cases can even invalidate the result. For
instance in SF test framework we avoid running from clouds virtual
machines because are a known source of very unstable CPU speed.
To overcome this issue, without requiring changes to the GUI, the idea
is to use searched nodes instead of time, and to convert time to
available nodes upfront, at the beginning of the game.
When nodestime UCI option is set at a given nodes per milliseconds
(npmsec), at the beginning of the game (and only once), the engine
reads the available time to think, sent by the GUI with 'go wtime x'
UCI command. Then it translates time in available nodes (nodes =
npmsec * x), then feeds available nodes instead of time to the time
management logic and starts the search. During the search the engine
checks the searched nodes against the available ones in such a way
that all the time management logic still fully applies, and the game
mimics a real one played on real time. When the search finishes,
before returning best move, the total available nodes are updated,
subtracting the real searched nodes. After the first move, the time
information sent by the GUI is ignored, and the engine fully relies on
the updated total available nodes to feed time management.
To avoid time losses, the speed of the engine (npms) must be set to a
value lower than real speed so that if the real TC is for instance 30
secs, and npms is half of the real speed, the game will last on
average 15 secs, so much less than the TC limit, providing for a
safety 'time buffer'.
There are 2 main limitations with this mode.
1. Engine speed should be the same for both players, and this limits
the approach to mainly parameter tuning patches.
2. Because npms is fixed while, in real engines, the speed increases
toward endgame, this introduces an artifact that is equivalent to an
altered time management. Namely it is like the time management gives
less available time than what should be in standard case.
May be the second limitation could be mitigated in a future with a
smarter 'dynamic npms' approach.
Tests shows that the standard deviation of the results with 'nodestime'
is lower than in standard TC, as is expected because now all the introduced
noise due the random speed variability of the engines during the game is
fully removed.
Original NIT idea by Michael Hoffman that shows how to play in NIT mode
without requiring changes to the GUI. This implementation goes a bit
further, the key difference is that we read TC from GUI only once upfront
instead of re-reading after every move as in Michael's implementation.
No functional change.
And reformat a bit time manager code.
Note that now we set starting search time in think() and
no more in ThreadPool::start_thinking(), the added delay
is less than 1 msec, so below timer resolution (5msec) and
should not affect time lossses ratio.
No functional change.
Code like this is more a case of showing off one's C++ knowledge, rather than
using it adequately, IMHO.
**First loop (std::generate)**
Iterators are inadequate here, because they lose the key information which is
idx. As a result, we need to carry a redundant idx variable, and increment it
along the way. Very clumsy.
Usage of std::generate and a lambda function only obfuscate the code, which is
merely a simple and stupid loop over the elements of a vector.
**Second loop (std::accumulate)**
This code is thoroughlly incomprehensible. Restore the original, which was much
simpler to understand.
**Third loop (range based loop)**
Again, a range based loop is inadequate, because we lose idx! To resolve this
artificially created problem, the data model was made redundant (idx is a data
member of db[] elements!?), which is ugly and unjustified. A simple and stupid
for loop with idx does the job much better.
No functional change.
Resolves#313
Small speed-up in pawn.cpp
Results for 10 tests for each version:
Base Test Diff
Mean 1435636 1445238 -9602
StDev 22576 23189 1848
p-value: 1
speedup: 0.007
No functional change
Resolves#295
If the sum of CounterMoveHistory heuristic and History heuristic is below zero,
then reduce an extra ply in cut nodes
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6479 W: 1099 L: 967 D: 4413
Bench: 7773299
Resolves#315
Currently if we call it more than once, we crash.
This is not a real problem, because this function is
indeed called just once. Nevertheless with this small fix,
that gets rid of a hidden 'static' variable, we cleanly
resolve the issue.
While there, fix also ThreadPool::exit to return in a
consistent state. Now all the init() functions but
UCI::init() are reentrant and can be called multiple
times.
No functional change.
Profiling shows that resetting attacks table after
a failed candidate magic attempt is the biggest
time consumer, so rewrite the logic avoiding the
memset()
Magics init for rook+bishop goes from 200msecs to
under 100msec.
No functional change.
To sync UI with main thread it is enough a single
condition variable because here we have a single
producer / single consumer design pattern.
Two condition variables are strictly needed just for
many producers / many consumers case.
Note that this is possible because now we don't send to
sleep idle threads anymore while searching, so that now
only UI can wake up the main thread and we can use the
same ConditionVariable for both threads.
The natural consequence is to retire wait_for_think_finished()
and move all the logic under MainThread class, yielding the
rename of teh function to join()
No functional change.
Now that we use spinlocks everywhere and don't put
threads to sleep while idle, we can use the slower
(but no more in hot path) std::condition_variable_any
instead of our homwgrown ConditionVariable struct.
Verified fo rno regression at STC with 7 threads:
ELO: -0.66 +-2.7 (95%) LOS: 31.8%
Total: 20000 W: 3210 L: 3248 D: 13542
No functional change
There's no reason to define it as a Bitboard, so for consistency, use bool.
This is even a speedup on my machine: i7-3770k, using gcc 4.9.1 (linux):
stat test master diff
mean 2,341,338 2,327,998 13,134
stdev 15,765 14,717 5,405
speedup 0.56%
P(speedup>0) 100.0%
No functional change.
Resolves#298
Avoid redundant 'while' conditions. It is enough to
check them in the outer loop.
Quick tested for no regression 10K games at 4 threads
ELO: -1.32 +-3.9 (95%) LOS: 25.6%
Total: 10000 W: 1653 L: 1691 D: 6656
No functional change.
Spinlock must be used instead.
Tested for no regression at 15+0.05 th 4:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25928 W: 4303 L: 4190 D: 17435
No functional change.
Resolves#297
Unify the quites moves loop for both cases,
the compiler optimizes away the
if (is_ok((ss-1)->currentMove))
inside loop, so that the result is same
speed as original.
No functional change.
During the search, do not block on condition variable, but instead use std::this_thread::yield().
Clear gain with 16 threads. Again results vary highly depending on hardware, but on average it's a clear gain.
ELO: 12.17 +-4.3 (95%) LOS: 100.0%
Total: 7998 W: 1407 L: 1127 D: 5464
There is no functional change in single thread mode
Resolves#294
Both are the result of a SPSA tuning session with a custom book, 50k iterations each.
After an additional tuning session of the mobility values, tuning the delta values, with following result.
40k games at 9+0.05:
ELO: 4.13 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 8581 L: 8106 D: 23313
and LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 36518 W: 6049 L: 5782 D: 24687
Bench: 8567402
Resolves#284
Fixes reported startup error about missing libwinpthread-1.dll
when the dll is not in the path.
The current -static-xxxx flags, introduced with:
https://github.com/official-stockfish/Stockfish/commit/373503f4a9a990054b5
Only take in account standard libraries, but not thread
library.
No functional change.
Resolves#289
Idea and original implementation by Stephane Nicolet
7 threads 15+0.05
ELO: 3.54 +-2.9 (95%) LOS: 99.2%
Total: 17971 W: 2976 L: 2793 D: 12202
There is no functional change in single thread mode
Spend much less time in positions where one move is much better than all other alternatives.
We carry forward pv stability information from the previous search to identify such positions.
It's based on my old InstaMove idea but with two significant improvements.
1) Much better instability detection inside the search itself.
2) When it's time to make a FastMove we no longer make it instantly but still spend at least 10% of normal time verifying it.
Credit to Gull for the inspiration.
BIG thanks to Gary because this would not work without accurate PV!
20K
ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067
STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112
LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448
Resolves#283
Introduce a counter move history table which additionally is indexed by the last move's piece and target square.
For quiet move ordering use now the sum of standard and counter move history table.
STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
Because of reported low NPS on multi core test
STC (7 threads):
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829
Bench: 7725341
Resolves#282
Use Mutex instead.
This is in preparaation for merging with master branch,
where we stilll don't have spinlocks.
Eventually spinlocks will be readded in some future
patch, once c++11 has been merged.
No functional change.
minKingPawnDistance is used only as local variable in one place so we don't need it to be part of "Pawns::Entry" structure.
No functional change.
Resolves#277
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.
To use the default (fastest) settings, compile with:
make build ARCH=x86-64-modern
To test the clang settings, compile with
make build ARCH=x86-64-modern COMP=clang
Beware that the clang settings may provide a slightly slower (6%)
executable.
Backported from master.
No functional change
Resolves#275
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.
To use the default (fastest) settings, compile with:
make build ARCH=x86-64-modern
To test the clang settings, compile with
make build ARCH=x86-64-modern COMP=clang
Beware that the clang settings may provide a slightly slower (6%)
executable.
No functional change
Resolves#275
Now that c++11 branch has been merged in master,
disable unconditionally the spinlocks and use mutex
instead. This will allow to run fishtest even on HT
machines withouth changes.
In the future we will reintorduce spinlocks, once
we will have took care of fishtest.
No functional change.
And use mutex instead. You may never want to do this.
It is a workaround to run c++11 on fishtest where many
machiens have HTenabled and this can be a problem when
number of cores set is higher than number of physical cores.
To disable spinlocks, just compile with -DNO_SPINLOCK flag
No functional change.
Calling lock.test_and_set() in a tight loop creates expensive
memory synchronizations among processors and penalize other
running threads. So syncronize only only once at the beginning
with fetch_sub() and then loop on a simple load() that puts much
less pressure on the system.
Reported about 2-3% speed up on various systems.
Patch by Ronald de Man.
No functional change.
It is reported to be defenitly faster with increasing
number of threads, we go from a +3.5% with 4 threads
to a +15% with 16 threads.
The only drawback is that now when testing with more
threads than physical available cores, the speed slows
down to a crawl. This is expected and was similar at what
we had setting the old sleepingThreads to false.
No functional change.
Initialization is more complex than what I'd like due
to MSVC compatibility that for some reason does not like:
std::atomic_flag lock = ATOMIC_FLAG_INIT;
No functional change.
It should be used slavesMask.count() instead.
Verified 100% equivalent when sp->allSlavesSearching:
dbg_hit_on(sp->allSlavesSearching, sp->slavesCount != sp->slavesMask.count());
No functional change.
Document and clarify that we cannot rejoin on ourselves
and that we never late join if we are master and all
slaves have finished, inded in this case we exit idle_loop.
No functional change.
In case of Threads.size() == 2 we have that sp->allSlavesSearching
is always false (because we have finished our search), bestSp is
always NULL and we never late join, so there is no need to special
case here.
Tested with dbg_hit_on(sp && sp->allSlavesSearching) and
verified it never fires.
No functional change.
And retire a redundant field. This is important also
from a concept point of view becuase we want to keep
SMP structures as simple as possible with the only
strictly necessary data.
Verified with
dbg_hit_on(sp->spLevel != level)
that the values are 100% the same out of more 50K samples.
No functional change.
Balance threads between split points.
There are huge differences between different machines and autopurging makes it very difficult to measure the improvement in fishtest, but the following was recorded for 16 threads at 15+0.05:
For Bravone (1000 games): 0 ELO
For Glinscott (1000 games): +20 ELO
For bKingUs (1000 games): +50 ELO
For fastGM (1500 games): +50 ELO
The change was regression for no one, and a big improvement for some, so it should be fine to commit it.
Also for 8 threads at 15+0.05 we measured a statistically significant improvement:
ELO: 6.19 +-3.9 (95%) LOS: 99.9%
Total: 10325 W: 1824 L: 1640 D: 6861
Finally it was verified that there was no (significant) regression for
4 threads:
ELO: 0.09 +-2.8 (95%) LOS: 52.4%
Total: 19908 W: 3422 L: 3417 D: 13069
2 threads:
ELO: 0.38 +-3.0 (95%) LOS: 60.0%
Total: 19044 W: 3480 L: 3459 D: 12105
1 thread:
ELO: -1.27 +-2.1 (95%) LOS: 12.3%
Total: 40000 W: 7829 L: 7975 D: 24196
Resolves#258
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):
stat test master diff
mean 2,403,118 2,390,904 12,214
stdev 12,043 10,620 3,677
speedup 0.51%
P(speedup>0) 100.0%
No functional change.
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):
stat test master diff
mean 2,403,118 2,390,904 12,214
stdev 12,043 10,620 3,677
speedup 0.51%
P(speedup>0) 100.0%
No functional change.
Use integer arithmetic instead of floating point arithmetic.
Floating point arithmetic was causing different results for some 32-bit compiles
No functional change
Resolves#249Resolves#250
Also remove useless StateCopySize64 optimization:
compiler uses SSE movups instruction anyhow and
does not need this trick (verified with fishbench).
No functional change.
Funny enough, gcc __builtin_prefetch() expects
already a void*, instead Windows's _mm_prefetch()
requires a char*.
The patch allows to remove ugly casts from caller
sites.
No functional change.
Results for 10 tests for each version (gcc 4.8.3 on mingw):
Base Test Diff
Mean 1502447 1507917 -5470
StDev 3119 1364 4153
p-value: 0,906
speedup: 0,004
Results for 10 tests for each version (MSVC 2013):
Base Test Diff
Mean 1400899 1403713 -2814
StDev 1273 2804 2700
p-value: 0,851
speedup: 0,002
No functional change.
I went through all the individual compile options that differ between
-fprofile-generate/-fprofile-use and -fprofile-arcs/-fbranch-probabilities
and distilled the speed difference down to only turning off
-fno-peel-loops and -fno-tracer. Using this we still get the full speedup
(maybe a bit more because other optimizations stay on) and it's also much cleaner
because we can get rid of the "@rm -f ucioption.gc*" hack for all versions of gcc.
No functional change.
Resolves#237
Follow the usual approach to delay computation
as far as possible, in case an earlier killer
cut-offs we avoid to do useless work.
This also greatly simplifies the code.
No functional change.
Verified with perft there is no speed regression,
and code is simpler. It is also conceptually correct
becuase an extended move is just a move that happens
to have also a score.
No functional change.
This is an old patch from Jean-Francois Romang to send
UCI hashfull info to the GUI:
https://github.com/mcostalba/Stockfish/pull/60/files
It was wrongly judged as a slowdown, but it takes much
less than 1 ms to run, indeed on my core i5 2.6Ghz it
takes about 2 microsecs to run!
Regression test is good:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7352 W: 1548 L: 1401 D: 4403
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61432 W: 10307 L: 10251 D: 40874
I have set the name of the author to the original
one.
No functional change.
This patch has two positive effects:
- Retire a hackish formula and leave
just a natural, simple and plain one.
- Reduce strenght at very low level, but
don't impact medium/high levels.
Actually even at level 0, SF is still too
strong for many beginners (this was reported
many times for instance on Droidfish user
comments on Google Play).
Test on fishtest shows that ELO drop is around
170 ELO at level 0 (good!), 130 ELO at level 1
and smoothly reduces (as expected) until level
10 where the drop is just of 8 ELO.
No functional change.
- Fix a skill level problem: Don't allow move pruning at root node
- Revert "Fix profile build for gcc on Mac OSX". Results for a faster binary in x86-64.
- Fix a MSVC warning
Bench: 8918745
Seems to be a performance regression for standard build.
For SF6 people compiling on Mac OSX using profile-build option
just need to make necessary adjustments manually...
No functional change
Resolves#223
Instead of swapping sub-optimal move in Skill
d'tor, make it explicitly at the end of the search.
Also streamline and clarify relation with multiPV
and pass it directly instead of relying on the hacky
'candidates' member.
No functional change.
Warning is C4512 (assignment operator could not be generated)
Now, apart the foreign syzygy code, everything compiles
without warnings at warning level 4.
Backported from C++11 branch.
No functional change.
- Fix a compilation issue related to BMI2 PEXT instruction
- Retrieve a ponder move from TT if PV is only one move long
Bench: 8080602
No functional change
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is of not using _pext_u64 directly.
This patch fixes a compile error with some mingw64
package when compiling with x86-64.
No functional change.
Resolves#222
In case we stop the search during a fail-high
it is possible we return to GUI without a ponder
move. This patch try harder to find a ponder move
retrieving it from TT. This is important in games
played with 'ponder on'.
bench: 8080602
Resolves#221
When not using Makefile, e.g. with MSVC, if hardware
supports BMI2 instructions, then USE_PEXT should be
added in project configuration to enable pext support.
No functional change.
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is of not using _pext_u64 directly.
This patch fixes a compile error with some mingw64
package when compiling with x86-64.
No functional change.
Warning is C4512 (assignment operator could not be generated)
Now, apart the foreign syzygy code, everything compiles
without warnings at warning level 4.
No functional change.
Coverity scan warns about uninitialized 'sf' argument when
calling probe(). Actually it is a false positive because
argument is passed by reference and assigned inside
probe(). Nevertheless it is a hint that fucntion signature
is a bit tricky, so rewrite it in a more conventional way,
assigning 'sf' from probe() return value.
No functional change.
On platforms where size_t is 32 bit, we
can have an overflow in this expression:
(mbSize * 1024 * 1024)
Fix it setting max hash size of 2GB on platforms
where size_t is 32 bit.
A small rename while there: now struct Cluster
is definied inside class TranspositionTable so
we should drop the redundant TT prefix.
No functional change.
On Android-ARM current TB code crashes at
random times even in single thread mode.
Reported, debugged, fixed and verified
by Peter Osterlund.
No functional change.
Resolves#201
This optimization is aimed at old hardware only (withouth popcount), and even on
non popcount compile (ARCH=x86-64), it provides no mesurable speedup:
stat test master diff
mean 2,341,779 2,354,699 -12,920
stdev 12,910 14,770 18,150
speedup -0.55%
P(speedup>0) 23.8%
No functional change.
Resolves#187
- Change UCI::value() signature
This function should only return the value,
lowerbound and upperbound info is up to the
caller because it requires external knowledge,
out of the scope of this little helper.
- Retire 'key' command
It is not an UCI command and is absolutely
useless: never used.
- Comments fixing and other trivia
No functional change.
And reshuffle a bit the functions to place
them in a consistent order.
To be on the safe side, patch has been
validated for no regression/crashes with
a small 8K games test with 3 threads:
ELO: 3.98 +-4.4 (95%) LOS: 96.3%
Total: 8388 W: 1500 L: 1404 D: 5484
No functional change.
It is up to material (and pawn) table look up
code to know where the per-thread tables are,
so change API to reflect this.
Also some comment fixing while there
No functional change.
Move all in evaluation.
Simplify the code and concentrate in a single place
all the logic behind space evaluation, making it much
more clear.
Verified also at STC it does not regress due to a possible
slow down:
LLR: 3.91 (-2.94,2.94) [-3.00,1.00]
Total: 65744 W: 13285 L: 13194 D: 39265
No functional change.
Commenst are obsolete now, an updated description
would be quite obscure, so better let the code
to talk and remove them all together.
No functional change.
Use the same template of other pawns moves generation,
make the code more uniform, simplify generate_promotions
that has now been renamed.
No functional change (verified also with perft).
In some UNIX systems "rm" prompts user for confirmation.
However "rm -f" is always a guaranteed forced deletion.
Also move gcc profiling hack under the correct target
No Functional change
Resolves#168
ShelterWeakness and Stormdanger array are now indexed additionally by
file pair (a/h,b/g,c/f,d/e). The special case of king blocking a pawn
is incorporated in the StormDanger array. Finally the 93 parameters
are tuned by SPSA on LTC.
STC
ELO: 3.46 +-2.2 (95%) LOS: 99.9%
Total: 40000 W: 8275 L: 7877 D: 23848
LTC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 10311 W: 1876 L: 1721 D: 6714
Bench: 9498821
Resolves#163
The evaluation is already done by the specialized
function, don't need to add something elese later.
With this patch following positions are evaluated
correctly as draws:
8/6p1/1Pkp1p1p/2nNn2P/2P1K1P1/8/8/3B4 w - - 7
8/1k4p1/1P1p1p1p/3NnK1P/2P3P1/1n6/4B3/8 w - -
Verified it not regress with an STC test:
LLR: 3.15 (-2.94,2.94) [-3.00,1.00]
Total: 49812 W: 10095 L: 10016 D: 29701
Reported by Arjun Temurnikar.
bench: 8289983
In particular seems more natural to return
bool and TTEntry on the same line, actually
we should pass and return them as a pair,
but due to limitations of C++ and not wanting
to use std::pair this can be an acceptable
compromise.
No functional change.
Resolves#157
In some cases we want to go direcly to the moves loop
without checking for early return. The patch make this
logic more clear and consistent.
Tested for no regression, passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25282 W: 5136 L: 5022 D: 15124
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 72007 W: 12133 L: 12095 D: 47779
bench: 9316798
This patch replaces RKISS by a simpler and faster PRNG, xorshift64* proposed
by S. Vigna (2014). It is extremely simple, has a large enough period for
Stockfish's needs (2^64), requires no warming-up (allowing such code to be
removed), and offers slightly better randomness than MT19937.
Paper: http://xorshift.di.unimi.it/
Reference source code (public domain):
http://xorshift.di.unimi.it/xorshift64star.c
The patch also simplifies how init_magics() searches for magics:
- Old logic: seed the PRNG always with the same seed,
then use optimized bit rotations to tailor the RNG sequence per rank.
- New logic: seed the PRNG with an optimized seed per rank.
This has two advantages:
1. Less code and less computation to perform during magics search (not ROTL).
2. More choices for random sequence tuning. The old logic only let us choose
from 4096 bit rotation pairs. With the new one, we can look for the best seeds
among 2^64 values. Indeed, the set of seeds[][] provided in the patch reduces
the effort needed to find the magics:
64-bit SF:
Old logic -> 5,783,789 rand64() calls needed to find the magics
New logic -> 4,420,086 calls
32-bit SF:
Old logic -> 2,175,518 calls
New logic -> 1,895,955 calls
In the 64-bit case, init_magics() take 25 ms less to complete (Intel Core i5).
Finally, when playing with strength handicap, non-determinism is achieved
by setting the seed of the static RNG only once. Afterwards, there is no need
to skip output values.
The bench only changes because the Zobrist keys are now different (since they
are random numbers straight out of the PRNG).
The RNG seed has been carefully chosen so that the
resulting Zobrist keys are particularly well-behaved:
1. All triplets of XORed keys are unique, implying that it
would take at least 7 keys to find a 64-bit collision
(test suggested by ceebo)
2. All pairs of XORed keys are unique modulo 2^32
3. The cardinality of { (key1 ^ key2) >> 48 } is as close
as possible to the maximum (65536)
Point 2 aims at ensuring a good distribution among the bits
that determine an TT entry's cluster, likewise point 3
among the bits that form the TT entry's key16 inside a
cluster.
Details:
Bitset card(key1^key2)
------ ---------------
RKISS
key16 64894 = 99.020% of theoretical maximum
low18 180117 = 99.293%
low32 305362 = 99.997%
Xorshift64*, old seed
key16 64918 = 99.057%
low18 179994 = 99.225%
low32 305350 = 99.993%
Xorshift64*, new seed
key16 65027 = 99.223%
low18 181118 = 99.845%
low32 305371 = 100.000%
Bench: 9324905
Resolves#148
Currently Search::RootMoves is accessed and even
modified by TB probing functions in a hidden
and sneaky way.
This is bad practice and makes the code tricky.
Instead explicily pass the vector as function
argument so to clarify that the vector is modified
inside the functions.
No functional change.
Simplified also some logic while there.
TBLargest needs renaming too, but itis for
a future patch because touches also syzygy
directory stuff.
No functional change.
We really don't need to uglify in this way
our nice count() API with this ad-hoc hack.
So remove the hack and use the already
existing infrastructure.
No functional change.
Resolves#134
Updating the makefile so that the clean and gcc-profile-clean targets also
remove the profiling data files in the syzygy directory.
No functional change.
Resolves#136
Just like in Physics, the ratio of 2 things in the same unit, should be
without unit.
Example use case:
- Ratio of a Depth by a Depth (eg. ONE_PLY) should be an int.
- Ratio of a Value by a Value (eg. PawnValueEg) should be an int.
Remove a bunch of useless const while there.
No functional change.
Resolves#128
Adds support for Syzygy tablebases to Stockfish. See
the Readme for information on using the tablebases.
Tablebase support can be enabled/disabled at the Makefile
level as well, by setting syzygy=yes or syzygy=no.
Big/little endian are both supported.
No functional change (if Tablebases are not used).
Resolves#6
It is possible that we won't have a ponder move if our PV
is too short. In that case, just don't print a ponder move.
No functional change
Resolves#130
When evaluating double pawns we use always
lsb() to extract the frontmost square.
This breaks evaluation color symmetry as is
possible to verify with an instrumented evaluate()
Value evaluate(const Position& pos) {
Value v = do_evaluate<false>(pos);
Position p = pos;
p.flip();
assert(v == do_evaluate<false>(p));
return v;
}
Passed no regression test:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21035 W: 4244 L: 4122 D: 12669
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39839 W: 6662 L: 6572 D: 26605
bench: 8255966
It is a non functional change, but because
we now skip copying of pv[] in SpNode, patch
has been tested for regression with 3 threads:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54668 W: 9873 L: 9809 D: 34986
No functional change.
This is a regression from 428962a
We have to cast to char here, otherwise the compiler
interprets it as an integer, and writes a number.
No functional change
Resolves#122
This is the first of a patch series to
rearrange and simplify accurate PV.
In this patch there is simple coding
style and reformatting stuff.
Verified with fishtest it does not crash
with MAX_PLY = 8
No functional change.
Previously token would keep its value from the previous line when an empty
line was input, leading to unexpected behaviour.
No functional change
Resolves#119
When comparing to a Depth it is more
consistent to use DEPTH_MAX instead
of a int.
This is a subtle difference because we use
ply and depth almost interchangably in SF,
but they are different. FOr counting plies
makes ense to continue using ints, while
for Depth we have our specific enum.
This cleanly fixes a new Clang 3.5 warning:
No functional change.
This gives SF accurate PVs, such that the evaluation of the leaf node in
the PV matches the score backed up to the root (99% of the time.
q-search will use the value stored in the hash table instead of the eval
value sometimes).
One drawback is that fail-high/low only get a minimal 2 move PV.
It doesn't add any additional overhead to the non-PV codepath except an
extra eight bytes to the SearchStack structure in multi-threaded
searches.
A core part of this is not pruning based on TT score in PV nodes. This
was measured as not being a regression at multiple TCs, except for one
exception, fast TC with huge hash, which is not realistic for longer
searches.
STC - 1 thread, 128 mb hash
ELO: 1.42 +-3.1 (95%) LOS: 81.9%
Total: 20000 W: 4078 L: 3996 D: 11926
STC - 3 thread, 128 mb hash
ELO: -3.60 +-2.9 (95%) LOS: 0.8%
Total: 20000 W: 3575 L: 3782 D: 12643
STC - 3 thread, 8 mb hash
ELO: 0.12 +-2.9 (95%) LOS: 53.3%
Total: 20000 W: 3654 L: 3647 D: 12699
LTC - 3 thread, 32mb hash
ELO: 2.29 +-2.0 (95%) LOS: 98.8%
Total: 35740 W: 5618 L: 5382 D: 24740
Bench: 6984058
Resolves#102
Daniel Jose reported that it was an elo gain in his engine:
http://www.talkchess.com/forum/viewtopic.php?t=54290
STC: Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33067 W: 6670 L: 6571 D: 19826
LTC: Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41181 W: 7008 L: 6920 D: 27253
And another one to verify no regression with hash pressure:
STC: Hash=4
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 25085 W: 5059 L: 4991 D: 15035
Verified that qsearch does not explode after this patch (recapture threshold).
Bench 7962287
Resolves#112
Real men jump/branch by hand...but
we prefer the humble way.
Moved also some uci info code where it
belongs, while there.
No functional change.
Resolves#110
16MB for 1" searches is more comensurate with the average use case.
And 16 is the default used by stockfish bench, so it makes sense to be
consistent, if only to have the same minimum memory requirement for using
SF and compiling it with PGO.
No functional change.
Resolves#109
Now we can directly replace it with
the definition resulting in simpler
and possibly faster code because
PvNode is evaluated at compile time.
No functional change.
* remove some erroneous comments, that were based on the ONE_PLY == 2.
* rename hd to d, because there's no more half-depth in SF.
* rescope variables into the for loops.
* reindent the for loops correctly.
* add a comment to explain the eval improving part (not so obvious to read
this code as array has 4 dimensions).
No functional change.
This should reduce search inconsistencies, and doesn't seem to have a measurable ELO Impact:
STC with Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49264 W: 10076 L: 10007 D: 29181
LTC with Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82149 W: 14044 L: 14023 D: 54082
Plus an extra test, to make sure it doesn't regress with strong hash pressure:
STC with Hash=4
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 21498 W: 4327 L: 4246 D: 12925
Bench: 7302735
Resolves#100
Idea is to apply king safety later in the endgame. Previously, we didn't
apply KS in a RR vs. Q ending for example, which causes poor play.
Now we calculate king attacks when the attacking side has a queen or more.
STC with 8moves_v3
LLR: 3.06 (-2.94,2.94) [0.00,4.00]
Total: 38481 W: 6228 L: 5952 D: 26301
LTC with 2moves_v1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51053 W: 8670 L: 8353 D: 34030
Bench: 7514010
Resolves#98
UCI specification is not clear on the size of
integers that are exchanged in the protocol, so
instead of a simple int, assume 'nodes' is a
int64_t because we need a bigger size to store
this value in many real cases, especialy with
very long searches.
No functional change.
Resolves#75
Clang 3.5 issues warning on constructs like: abs(f1 - f2). The thing is that
f1 and f2 are enum types, and the range given (all positive) allows the
compiler to choose an unsigned type (efficiency being one reason to prefer
unsigned arithmetic). If f1 < f2 are unsigned, then f1 - f2 wraps around zero
and the abs() becomes a no-op. It's the reinterpretation of the unsigned
result (large value) as a signed int that happens to give the correct result,
thanks to 2's complement. This is all tricky and dangerous!
In the spirit of the standard, we assume nothing on the signedness of enums,
and simply calculate the rank and file distances as:
- rank_dist(r1, r2) = r1 < r2 ? r2 - r1 : r1 - r2
- file_dist(f1, f2) = f1 < f2 ? f2 - f1 : f1 - f2
this logic can in fact be applied to any enum we may use, so for better
generality and to avoid code duplication, we use a template function diff()
here.
No functional change.
Resolves#95
This area has become obscure and tricky over the course of incremental
changes that did not respect the original's consistency and clarity. Now,
it's not clear why we use MAX_PLY = 120, or why we use MAX_PLY+6, among
other things.
This patch does the following:
* ID loop: depth ranges from 1 to MAX_PLY-1, and due to TT constraint (depth
must fit into an int8_t), MAX_PLY should be 128.
* stack[]: plies now range from 0 to MAX_PLY-1, hence stack[MAX_PLY+4],
because of the extra 2+2 padding elements (for ss-2 and ss+2). Document this
better, while we're at it.
* Enforce 0 <= ply < MAX_PLY:
- stop condition is now ss->ply >= MAX_PLY and not ss->ply > MAX_PLY.
- assert(ss->ply < MAX_PLY), before using ss+1 and ss+2.
- as a result, we don't need the artificial MAX_PLY+6 range. Instead we
can use MAX_PLY+4 and it's clear why (for ss-2 and ss+2).
* fix: extract_pv_from_tt() and insert_pv_in_tt() had no reason to use
MAX_PLY_PLUS_6, because the array is indexed by plies, so the range of
available plies applies (0..MAX_PLY before, and now 0..MAX_PLY-1).
Tested with debug compile, using MAX_PLY=16, and running bench at depth 17,
using 1 and 7 threads. No assert() fired. Feel free to submit to more severe
crash-tests, if you can think of any.
No functional change.
SSE4.2 has nothing to do with POPCNT. We must dispell this myth, because
Stockfish is a reference and many will copy this mistake if they see it in Stockfish:
* SSE is an SIMD instruction set, relative to vectorization (using special 128-bit registers).
* POPCNT/LZCNT work on normal registers (eg. AL, AX, EAX, RAX).
The confusion comes from the fact that, in the Intel product line, it just
so happens that SSE4.2 and POPCNT/LZCNT came out at the same time. But this
is not true for AMD. For example, all AMD Pheniom II have SSE3 but no
POPCNT/LZCNT, and that is why the modern compile uses -msse3 -popcnt and not -msse4.2.
No functional change.
Resolves#86
Objects that are only accessible at file-scope should be put in the anonymous namespace.
This is what the C++ standard recommends, rather than using static, which is really C-style and results in static linkage.
Stockfish already does this throughout the code. So let's weed out the few exceptions,
because... they have no reason to be exceptional.
No functional change.
Resolves#84
It is more idiomatic, we didn't used it
in the past because Position::pretty(Move)
had a calling argument, but now we can.
As an added benefit, we avoid a lot of string
copies in the process because now we avoid
std::ostringstream ss.
No functional change.
This commit fixes two issues:
1) Don't print PVs after the search has been interrupted
This solves the "mate 0 upperbound" scores that sometimes
creep up when a multi-PV analysis gets interrupted with
the `stop` command.
2) Print multipv before score
Shredder Classic fails to identify the main PV
(the one with multipv 1) if `score` comes first.
This leads to an eval graph that doesn't reflect
the scores actually reported by Stockfish when
doing a multiPV analysis.
No functional change
Closes#76
Helper function should just know how to find the
biggest piece type in a bitboard. All the threat
logic and data shoud be in evaluate_threats().
This nicely separates the scope of the two functions
in a more consistent way and simplifies the code.
No functional change.
Use the max_threat() helper function to estimate more precisely the
best hanging piece threat. Also retunes the Threat array using SPSA.
STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7598 W: 1596 L: 1468 D: 4534
LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 7896 W: 1495 L: 1350 D: 5051
Bench: 6816504
Resolves#73
Instead of hard-code the weights in a big table,
we prefer to calculate them out of few parameters
at startup. This allows to keep low the number of
independent parameters and hence is good for tuning
and for a better insight in the meaning of the numbers.
No functional change.
Make even more clear what are the terms that
contribute to evaluate connected pawns, and
completely separate them from the weights
that are now fully looked up in a table.
For future tuning makes sense to init the table with
a formula instead of hard-code it. This allows to
reduce problem space cardinality and makes tuning
easier.
And fix a MSVC warning while there:
warning C4804: '>>' : unsafe use of type 'bool' in operation
No functional change.
These two notions are very correlated. Since connected has the most
generality, it makes sense to generalize it to encompass what is
covered by candidate.
STC:
LLR: 4.03 (-2.94,2.94) [-3.00,1.00]
Total: 11970 W: 2577 L: 2379 D: 7014
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13194 W: 2389 L: 2255 D: 8550
bench 7328585
Now that half plies have been removed from the engine, we can encode
TT depth into an int8_t.
Range is -128 to +127, so it goes still further than the previous
limit of 121 plies (with ONE_PLY == 2 where depth - DEPTH_NONE was
encoded as an uint8_t).
No functional change.
Resolved#60
Now that half-plies are no more used we can simplify
the code assuming that ONE_PLY is 1 and no more 2.
Verified with a SMP test:
LLR: 2.95 (-2.94,2.94) [-4.50,0.00]
Total: 8926 W: 1712 L: 1607 D: 5607
No functional change.
On top of previous patch, rename time variables to
reflect the simplification of UCI parameters.
It is more correct to use as varibales directly the
corresponding UCI option, without intorducing redundant
intermediate variables.
This allows also to simplify the code.
No functional change.
In case of a succesful late join we set again
'searching' flag, so we can restart search
immediately without an useless lock/unlock
cycle.
No functional change.
Now "Write Search Log" will pring moves in UCI format, consistent with all the rest. This functionality is
not aimed at end-users anyway. It's hardly useful at all, in fact. Also, pretty-printing SAN moves is
something that better belongs in the GUI than in the engine.
No functional change.
Where they better belong.
Also, this removes '#include <string>' from types.h, which reduces the amount of code to compile (every
translation unit includes types.h).
No functional change.
Unify various perft functions and move all the code
to search.cpp.
Avoid perft implementation to be splitted between
benchmark.cpp (where it has no reason to be) and
search.cpp
No functional and no speed change (tested).
So that one can redirect cout to /dev/null and only print print cerr in the terminal (for more accurate speed
tests).
Suggested by Marco.
No functional change.
The compiler tries to cast Options["Hash"] into a string, using:
Option::operator std::string() const {
assert(type == "string");
return currentValue;
}
And, as expected, the assert() fails.
std::to_string() would be the right solution, but it's C++11. And using a stringstream is too much code to
achieve so little. Let's keep it the way it was: hardcoded (ie. default hash defined in two places).
No functional change.
The eval already returns zero in KK, KBK, KNK (see material.cpp). The difference is:
- we lose the "TB pruning" benefit of the draw rule (ie. search goes on even if eval is zero)
- we gain some speed by removing a useless test from the hot path
STC:
LLR: 0.05 (-2.94,2.94) [-3.00,1.00]
Total: 128000 W: 21357 L: 21560 D: 85083
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33023 W: 4613 L: 4509 D: 23901
bench 7461881
First, remove some dead code (function never called with a Move argument).
Then, remove printing of legal moves, which does not belong here. Let's keep commands orthogonal and minimal:
- the "d" command should display the board, nothing more, or less.
- "perft 1" will display the list of legal moves.
No functional change.
Stockfish allocates the default hash (32MB) in main(), before entering UCI::loop(). If there is not enough
memory, the program will crash even before UCI::loop() is entered and the GUI is given a change to specify a
lower Hash value.
This defective design could be resolved by doing a lazy allocation upon "isready" command, as the UCI protocol
guarantees that "isready" will be sent at least once before any search. But it's a bit cumbersome when using
Stockfish "manually" to have to remember to type "isready" everytime.
So leave the current design, but reduce the default hash to 16MB instread of 32MB. In order to perform such
quick searches (depth=13), there is no reason to use so much Hash anyway. Another benefit is to introduce a
bit of hash pressure in bench, which increases chances to detect rare bugs related to TT replacement, for
example.
This is not a functional change, although it obviously changes the bench.
bench 7461879
Instead of defining it both in ucioption.cpp and benchmark.cpp. Obviously changing the default Hash will
change the bench as a result.
No functional change.
The main purpose of perft is to help debugging. But without the breakdown in sum of perft(N-1), it is a
completely useless debugging tool.
So perft now displays the breakdown, and divide is therefore removed.
No functional change.
After commit 94b1bbb68b, in case available root moves are less than multiPV, we
could never reach condition:
PVIdx + 1 == multiPV
and as a consequence UCI output is not printed.
Fixed suggested by Joerg Oster.
No functional change.
Despite being neutral at STC, it turned out to be regressive at LTC:
40k games at LTC with Hash=8
ELO: -2.06 +-1.9 (95%) LOS: 1.4%
Total: 39720 W: 5740 L: 5976 D: 28004
40k games at LTC with Hash=128
ELO: -2.69 +-1.9 (95%) LOS: 0.2%
Total: 39149 W: 5702 L: 6005 D: 27442
bench 7477963
Also raise the admissible bounds to (-100,100), as there is no reason to prevent users from using high
values if they want to.
Does not regress in self play:
ELO: 0.10 +-2.0 (95%) LOS: 53.7%
Total: 40000 W: 7084 L: 7073 D: 25843
master vs SF 3
ELO: 182.86 +-2.7 (95%) LOS: 100.0%
Total: 40000 W: 21843 L: 2541 D: 15616
Contempt = 20 vs SF 3
ELO: 189.25 +-2.8 (95%) LOS: 100.0%
Total: 40000 W: 22721 L: 2859 D: 14420
Diff is therefore 6.4 +/- 3.9 elo against a 180-190 elo weaker engine, which is significantly positive,
as expected. This elo difference is likely understated, because of FishTest aggressive draw adjudication
though.
We could push Contempt further, but after 20cp, it would get in the way of FishTest draw adjudication
rule, and is likely to reduce the testing throughput as a result.
bench 8198667
Steamline a bit the implementation of
skill levels. As a side effect we can
retire MultiPV global and use a local
variable instead.
No functional change.
- Currently broken
- Never been really useful
- Does not work well with new splitting model
Verified for no regression at STC with 3 threads:
LLR: 2.96 (-2.94,2.94) [-6.00,0.00]
Total: 81905 W: 12122 L: 12381 D: 57402
No functional change
Bitboard init code is already noteasy to follow,
so don't make it even harder using 'smart' code.
Also reindent a while loop in standard way.
No functional change.
With Eelco's patch "Don't special case for abs(beta) >= VALUE_MATE_IN_MAX_PLY" condition "abs(ttValue) < VALUE_KNOWN_WIN" has been removed from singular extension search, and condition "abs(beta) < VALUE_KNOWN_WIN" was added to the SingularExtensionNode definition.
This might lead to problems, especially in positions, where a mate is due.
For example, this position 5rk1/4K1pp/8/5PPP/8/8/8/1R6 w - - 12 1 triggers an assert.
stockfish: search.cpp:434: Value {anonymous}::search(Position&, Search::Stack*, Value, Value, Depth, bool) [with {anonymous}::NodeType NT = (<unnamed>::NodeType)2u; bool SpNode = false]: Assertion `-VALUE_INFINITE <= alpha && alpha < beta && beta <= VALUE_INFINITE' failed.
So let's re-insert the removed condition.
First spotted by Uri Blass, fix by me.
Bench: 8759675
* /boot/common was removed from Haiku
* The equivalent path now that package management has been implemented is /boot/system/non-packaged
No functional change
Bench: 8759681
The assert:
assert(ttValue != VALUE_NONE);
Could fire for multiple reasons (although is very rare),
for instance after an IID we can have ttMove != MOVE_NONE
while ttValue is still set at VALUE_NONE.
But not only this, actually SMP is a source of corrupted
ttValue and anyhow we can detect the condition:
ttMove != MOVE_NONE && ttValue == VALUE_NONE
even north of IID.
Reported by Ronald de Man.
It is so rare that bench didn't change.
bench: 7710548
Remove from the search this special case and apply
null search and razoring also in mate positions.
Tested in no-regression mode and passed both
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65431 W: 10860 L: 10810 D: 43761
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34928 W: 4814 L: 4713 D: 25401
This patch kicks in only in mate positions and in
these cases it seems beneficial in finding mates
faster as Yery Spark measured on the Chest mate suite:
Total number of positions 6425
Fixed nodes 200K per position
master: 1049
new: 1154
And also the 5446 'hard' positions again with 2000K nodes
(those not found by both engines in 200K nodes):
master: 1069
new: 1395
bench: 7710548
It seems this flag is only for gcc and
yields a warning under OSX Mavericks:
clang: warning: argument unused during compilation: '-ansi'
No functional change.
Here MSVC is worried that
StepAttacksBB[PAWN][psq]
could overflow, so change psq initialization
to clarify psq is always less than 64.
No functional change.
Don't take the split lock if we don't have
available slaves (about 30-40% of times).
This new condition allows to retire the now
redundant one on number of threads.
No functional change.
Split previous patch in 2 steps: first remove
the MOVE_NULL hack, then retire nullChild.
The first step is a prerequisite
for second one and affects bench.
The second step (next patch) just removes nullChild
without affecting bench.
bench: 8205159
Are broken for big-endian case and
I have verified with MSVC 2013 Premium
bench is correct and there is no
miscompilation, so the main reason
to change the original code drops.
No functional change.
Another attempt at retiring current asymmetric
king evaluation and use a much simpler symmetric
one. As a good side effect we can avoid recalculating
eval after a null move.
Tested in no-regression mode and passed
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21580 W: 3752 L: 3632 D: 14196
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18253 W: 2593 L: 2469 D: 13191
And a LTC regression test against SF DD to
verify we don't have regression against
weaker engines due to some kind of 'contempt'
effect:
ELO: 54.69 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 11072 L: 4827 D: 24101
bench: 8205159
Before it was working by accident in case of
see_sign() and failing with see() due to how
castle moves are coded (king captures the rook).
Better to explicitly filter out castling moves
and use see() without any surprise/trick.
No functional case.
Book handling belongs to GUI, we kept this code
for historical reasons, but nowdays there is
really no need of this old, (mostly) unused
and especially incorrect designed functionality.
It is up to the GUI to choose the book (far easier for
the user) and to select the book parameters. In no
place, including fishtest, TCEC, rating lists, etc.
the "own book" is used, moreover currently SF is
released without any book and even if in the future we
bundle a book in the release package, it will be the GUI
that will take care of it.
This corrects a wrong design decision that Galurung
and later Stockfish inherited from what was common
practice many yeas ago.
No functional change.
There is really little that user can achieve (apart
from a weakened engine) tweaking these parameters
that are already tuned and have no immediate or visible
effect.
So better do not expose them to the user and avoid the
typical "What is the best setup for my machine?" kind of
question (by far the most common, by far the most useless).
No functional change.
To show perft numbers for each move. Just
use 'divide' instead of 'perft', for instance:
position startpos moves e2e4 e7e5
divide 4
Inspired by Ronald de Man.
No functional change.
Retire current asymmetric king evaluation
and use a much simpler symmetric one.
As a side effect retire the infamous
'Aggressiveness' and 'Cowardice' UCI
options.
Tested in no-regression mode,
Passed both STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33855 W: 5863 L: 5764 D: 22228
And LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40571 W: 5852 L: 5760 D: 28959
bench: 8321835
At root we start counting plies from 1,
instead pv[] array starts from 0. So
the variable 'ply' we use in extract_pv_from_tt
to index pv[] is misnamed, indeed it is
not the real ply, but ply-1.
The fix is to leave ply name in extract_pv_from_tt
but assign it the correct start value and
consequentely change all the references to pv[].
Instead in insert_pv_in_tt it's simpler to rename
the misnamed 'ply' in 'idx'.
The off-by-one bug was unhidden when trying to use
'ply' for what it should have been, for instance in
this position:
position fen 8/6R1/8/3k4/8/8/8/2K5 w - - 0 1
at depth 24 mate line is erroneusly truncated due
to value_from_tt() using the wrong ply.
Spotted by Ronald de Man.
bench: 8732553
If razoring conditions are satisfied and
depth is low, then directly drop in qsearch.
Passed both STC
LLR: 2.98 (-2.94,2.94) [-1.50,4.50]
Total: 12914 W: 2345 L: 2208 D: 8361
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 50600 W: 7548 L: 7230 D: 35822
bench: 8739659
When there aren't legal moves after
a search, instead of returning imediately,
save bestValue in TT as in the usual case.
There is really no reason to special case
this one.
With this patch is fully fixed (again) follwing
position:
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Also in SMP case.
bench: 8802105
After last Joona's patch there is no measurable
difference between the option set or unset.
Tested by Andreas Strangmüller with 16 threads
on his Dual Opteron 6376.
After 5000 games at 15+0.05 the result is:
1 Stockfish_14050822_T16_on : 3003 5000 (+849,=3396,-755), 50.9 %
2 Stockfish_14050822_T16_off : 2997 5000 (+755,=3396,-849), 49.1 %
bench: 880215
Instead of waiting to be allocated, actively search
for another split point to join when finishes its
search. Also modify split conditions.
This patch has been tested with 7 threads SMP and
passed both STC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2885 W: 519 L: 410 D: 1956
And a reduced-LTC at 25+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 4401 W: 684 L: 566 D: 3151
Was then retested against regression in 3 thread case
at standard LTC of 60+0.05:
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 40809 W: 5446 L: 5406 D: 29957
bench: 8802105
On a final fixed game number test it failed
to prove better than standard version.
STC 15+0.05
ELO: -0.86 +-1.7 (95%) LOS: 15.8%
Total: 57578 W: 10070 L: 10213 D: 37295
bench: 8802105
Unfortunatly we have a slow down that causes
a regression in STC with no-regression mode:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22454 W: 3836 L: 4029 D: 14589
bench: 8678654
The bug was found to be elsewhere. This version
is correct and also is able to detect as draw
positions like:
8/8/5b2/8/8/4k1p1/6P1/5K2 b - - 6 133
bench: 8678654
Position is win also if strong side has a bishop
and a knight (plus other material, otherwise
KBNK would be triggered instead of KXK).
This fixes a subtle bug where a search on position
k7/8/8/8/8/P7/PB6/K7 b - - 6 1
Instead of returning a draw score, suddendly returns
a big score. This happens because at one point in
search we reach this position:
8/Pk6/8/8/8/4B3/P7/K7 w - - 3 8
Where white can promote. In case of rook promotion (and also in case of
queen promotion) the resutling position gets a huge static eval that is
above VALUE_KNOWN_WIN (from the point of view of white). So for rook
promotion it is
&& futilityBase > -VALUE_KNOWN_WIN
that prevents futility pruning in qsearch. (Removing this condition indeed
lets the problem occur). Raising the static eval for K+B+N+X v K to a value
higher than VALUE_KNOWN_WIN fixes this particular problem without having to
introduce an extra futility pruning condition in qsearch.
I just checked and it seems K+R v K, K+2B v K and even K+B+N v K already get
a huge static eval. Why not K+B+N+P v K?
I think this fix corrects an oversight. There is special code for KBNK, but
KBNXK is handled by KXK, so the test for sufficient material should also test
for B+N.
bench: 8678654
In the (rare) cases when the two conditions
are true, then fully check again with a slow
but correct MoveList<LEGAL>(pos).size().
This is able to detect false positives like
this one:
8/8/8/Q7/5k1p/5P2/4KP2/8 b - - 0 17
When we have a possible simple pawn push that
is not stored in attacks[] array. Because the
third condition triggers very rarely, even if
it is slow, it does not alters in a measurable
way the average speed of the engine.
bench: 8678654
Currently a stealmate position is misevaluated
in a negative/positive score, this leads qsearch(),
that does not detects stealmates too, to return the
wrong score and this yields to some kind of endgames
to be completely misevaluated.
With this patch is fully fixed follwing position
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Also in SMP case.
Correct root cause analysys by Ronald de Man.
bench: 8678654
After reverting to the original Tord's
endgame, a search on position
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Reports, correctly, a draw score instead of
an advantage for white.
Issue reported by Uri Blass.
bench: 8678654
Remove the optimization for Intel, is not
standard and can break at any time, moreover
our release build is not done with Intel C++
anymore so we don't need to sqeeze the extra
speed out from this compiler.
No functional change.
If we return from split with a stale value
due to a stop or a cutoff upstream occurred,
then we exit moves loop and save a stale value
in TT before returning search().
This patch, from Joona, fixes this.
bench: 8678654
We can never have bestValue == -VALUE_INFINITE at
the end of move loop because if no legal move exists
we detect it with previous condition on !moveCount,
if a legal move exists we never prune it due to
futility pruning condition:
bestValue > VALUE_MATED_IN_MAX_PLY
So this code never executes, as I have also verified
directly.
Issue reported by Joona.
No functional change.
Intel compiler is very picky:
"error: this operation on an enumerated type requires an
applicable user-defined operator function"
Reported by Tony Gaor.
No functional change.
Put the division at the end to reduce
rounding errors. This alters the bench
due to different rounding errors, but
should not alter ELO in any way.
bench: 7615217
This apparentely silly tweak allows
to speed up the bench by almost 3%.
Not clear why, repeating with perft,
the speed up vanishes.
Suggested by Jonathan Calovski.
No functional change.
Believed to be a speed optimization as benched
on Windows with bench realtime affinity 0x1 deleting
highest and lowest runs:
Base Test
1549259 1608202
1538115 1583934
1543168 1556938
1536365 1554179
1533026 1582010
Signature remains unchanged and gives anywhere from 1-2% nps
boost in analysis depending on number of cores used.
No functional change.
This is a very discussed patch with many
argumentations pro and against. The fact is
it passed both STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16305 W: 3001 L: 2855 D: 10449
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 34273 W: 5180 L: 4931 D: 24162
Although it is true that a correct test should
include foreign engines, we commit it anyhow so
people can test it out in the wild, under broader
conditions.
bench: 7384368
Idea from Lyudmil Tsvetkov.
The value seems to be raised a bit abruptly, but as
Gary said, a blocked pawn on the sixth rank has been
instrumental in limiting king mobility in multiple
losses that I've seen from SF. A blocked pawn on fifth
rank is much less serious on the king safety impact.
Passed both STC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14551 W: 2750 L: 2607 D: 9194
and LTC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 43595 W: 6917 L: 6618 D: 30060
And even a retest at 60" fixed games 40K
ELO: 1.79 +-1.9 (95%) LOS: 97.0%
Total: 39889 W: 6018 L: 5813 D: 28058
bench: 7154916
Adding BMI1 allows the compiler to use _blsr_u64
automatically (the advertised 0.3% speed gain).
I verified that the compiler does not use this
instruction with the -mbmi2 flag only. Also, all
processors supporting BMI2 is also supporting BMI1.
No functional change
Prefer
file_of(s) < file_of(ksq)
to the inidrect
file_of(ksq) < FILE_E
To evaluate if semiopen side to check is the left side.
Also other small touches while there.
No functional change.
Right now the Makefile is cluttered with OS X equivalents
of all the x86 targets. We can get rid of all of them and
just check UNAME against "Darwin" for the few OS X-specific
things we need to do.
We also disable Clang LTO when using BMI2 instructions. For
some reason, LLVM cannot find the PEXT instruction when using
LTO. I don't know why, but disabling LTO for BMI2 fixes it.
No functional change.
Intel Haswell and newer CPUs can calculate sliders
attacks using special PEXT asm instructions instead
of magic bitboards. This gives a +3% speed up.
To enable it just compile with ARCH=x86-64-bmi2
No functional change.
Retire software pext and introduce hardware
call when USE_PEXT is defined during compilation.
This is a full complete implementation of sliding
attacks using PEXT.
No functional change.
Reshuffle functions to define them in reverse
calling order (C style).
This allow us to define templates before they are
used. Currently it is not like this, for instance
evaluate_pieces is defined after do_evaluate that
calls it. This happens to work for some strange
reason (two phase lookup?) but we want to avoid
code that works 'by magic'.
As a nice side-effect we can now remove the function
prototypes.
No functional change.
This is more consistent with what other engines are doing.
Often people thinks that SF's scores are overblown. In the
end, it just boils down to the arbitrary way of rescaling them.
No functional change.
I have noticed that increasing the bench depth produces
progressively smaller and slightly faster executables at
the cost of longer compile times. Also using bench "time"
instead of "depth" seems to produce slightly smaller/faster
executables given comparable compile times.
I have made a new Makefile that generates smaller and
about 1% to 2% faster profile executables at only a
little extra compile time. On my mobile 2GHz i7 a
full profile build time goes from 3'48" to 4'13" and
the exe goes down by 5% from 416,310 bytes to 395,567
bytes.
No functional change.
Small simplification.
Passed SPRT(-3,1) both at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17051 W: 3132 L: 3005 D: 10914
and LTC:
LLR: 4.55 (-2.94,2.94) [-3.00,1.00]
Total: 24890 W: 3842 L: 3646 D: 17402
The rationale behind this is that I've never managed to add a
Queen on 7th rank bonus in DiscoCheck, because it never showed
to be positive (evne slightly) in testing. The only thing that
worked is Rook on 7th rank.
In terms of SF code, it seemed natural to group it with QueenOnPawn
as well as those are done together. I know you're against groupping
in general, but when it comes to non regression test, you are being
more conservative by groupping. If the group passes SPRT(-3,1) it's
safer to commit, than test every component in SPRT(-3,1) and end up
with the risk of commiting several -1 elo regression instead of just
one -1 elo regression.
In chess terms, perhaps it's just easier to manouver a Queen (which
can more also diagonaly) than a Rook. Therefore you can let the search
do its job without needing eval ad-hoc terms to guide it. For the Rook
which takes more moves to manouver such eval terms can be (marginally)
useful.
bench: 7473314
We chose this instead of negamax sign convention
(ie. from the point of view of the side to move)
because it is more in line to how the eval
table is presented.
Also some tweak to formatting while there.
No functional change.
In some legal positions like this one:
R6R/3Q4/1Q4Q1/4Q3/2Q4Q/Q4Q2/Np1Q4/kB1N1KB1 b -- 0 1
We can have a very high score, in this case 30177 and 29267
for midgame and endgame respectively, and because
VALUE_INFINITE = 30001 we have an assert in interpolate()
Midgame and endgame scores are stored in 16 bit signed integers
so we can rise VALUE_INFINITE a little bit. This does not fix
the possibility of overflow in general case, just makes the
condition more difficult to trigger and anyhow better uses all
the score width.
Raising VALUE_INFINITE to 32000 seems to fix the problem for this
particular case.
No functional change.
Tested directly at LTC because previous long
test series on this topic shows it is TC dependant.
Tested with no-regression mode because gets rid of
an ugly and ad-hoc rule.
Test at LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 67918 W: 10590 L: 10541 D: 46787
bench: 7926803
Here the new idea is to link pinned pieces
with king safety.
Passed both STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 10047 W: 1867 L: 1737 D: 6443
And LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 10419 W: 1692 L: 1543 D: 7184
bench: 8325087
We add a penalty for each pawn which is not protected by another pawn
of the same color.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 12107 W: 2411 L: 2272 D: 7424
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 9204 W: 1605 L: 1458 D: 6141
bench: 7682173
We want all the UCI options are printed in the order in which are
assigned, so we use an index that, depending on Options.size(),
increases after each option is added to the map. The problem is
that, for instance, in the first assignment:
o["Write Debug Log"] = Option(false, on_logger);
Options.size() can value 0 or 1 according if the l-value (that
increments the size) has been evaluated after or before the
r-value (that uses the size value).
The culprit is that assignment operator in C++ is not a
sequence point:
http://en.wikipedia.org/wiki/Sequence_point
(Note: to be nitpick here we actually use std::map::operator=()
that being a function can evaluate its arguments in any order)
So there is no guarantee on what term is evaluated first and
behavior is undefined by standard in this case. The net result
is that in case r-value is evaluated after l-value the last
idx is not size() - 1, but size() and in the printing loop
we miss the last option!
Bug was there since ages but only recently has been exposed by
the removal of UCI_Analyze option so that the last one becomes
UCI_Chess960 and when it is missing engine cannot play anymore
Chess960.
The fix is trivial (although a bit hacky): just increase the
last loop index.
Reported by Eric Mullins that found it on an ARM and MIPS
platforms with gcc 4.7
No functional change.
Thanks to std::bitset we can easily increase
the limit of active threads above 64.
Thanks to Lucas Braesch for pointing at the
correct solution of using std::bitset.
No functional change.
Using memset on a std::vector is undefined behavior,
so manually init all the data memebers of LimitsType.
Bug intorduced in 41641e3b1e
No functional change.
Because we test for available slaves before
entering split(), we almost always allocate a
slave, only in the rare case of a race (less
then 2% of cases) this is not true, but to
special case this occurrence is not worth
the added complexity.
bench: 7451319
Split delta value in aspiration window so that when
search depth is less than 24 a smaller delta value
is used. The idea is that the search is likely to
be more accurate at lower depths and so we can exclude
more possibilities, 25% to be exact.
Passed STC
LLR: 2.96 (-2.94, 2.94) [-1.50, 4.50]
Total: 20430 W: 3775 L: 3618 D: 13037
And LTC
LLR: 2.96 (-2.94, 2.94) [0.00, 6.00]
Total: 5032 W: 839 L: 715 D: 3478
Bench: 7451319
During endgame initialization we get the material
hash key of each endgame forging and ad-hoc position
that in same cases is illegal (leaves teh king under
capture). This is not a problem for the material key,
but rises an assert when SF is run in debug mode with
'testKingCapture' set in pos_is_ok().
So rewrite the code to always produce legal positions.
No functional change.
Big simplification of pawn move check.
Code has been tested with a brute force approach: for
every position reached during a bench search, the function
has been called for each combinations of Move(from, to)
and verified the result is the same of old code.
Actually this function is very critical becuase is the
one that ensures corrupted TT moves are discarded, so
to properly test it a simple bench is not enough.
Verified also speed is not changed.
No functional chnage.
It has been obsoleted out already some time ago
and currently there is no point in changing eval
score according to if we are in game or analyzing.
So retire the option.
No functional change.
Try to avoid repetition draws at early midgame,
this should give an edge against weaker opponents
and reduce draw rate.
Tested for regressions with SPRT[-3, 1] and
passed both short TC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 68498 W: 12928 L: 12891 D: 42679
And long TC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40212 W: 6386 L: 6295 D: 27531
bench: 7990513
When running the following position:
8/kPp5/2P3p1/p1P1p1P1/2PpPp2/3p1p2/3P1P2/5K2 w - - 0 1
An assert is raised at depth 92:
assert(-VALUE_INFINITE <= alpha && alpha < beta && beta <= VALUE_INFINITE);
This is because it happens that beta = 29832,
so rbeta = 30032 that is > VALUE_INFINITE
Bug spotted and analyzed by Uri, fix suggested by Joerg.
Other fixes where possible but this one is pointed
exactly at the source of the bug, so it is the best
from a code documentation point of view.
bench: 8430785
Under some very rare case 100 plies of search
could be not enough. Increasing more could lead
to crashes due to reached stack size limit on
some platforms.
Strongly requested by Uri.
bench: 8430785
Currently king has no material key associated because
it can never happen to find a legal position without
both kings, so there is no need to keep track of it.
The consequence is that a position with only the two
kings has material key set at zero and if the material
hash table is empty any entry will match and this is
wrong.
Normally bug is hidden becuase the checking for a draw
with pos.is_draw() is done earlier than evaluate() call,
so that we never check in gameplay the material key of a
position with two kings.
Nevertheless the bug is there and can be reproduced setting
at startup a position with only two kings and typing
'eval' from prompt.
The fix is very simple: add a random key also for the king.
Also fixed the condition in material.cpp to avoid asserting
when a 'just 2 kings' postion is evaluated.
No functional change.
Actually MultiCut is too different from current scheme.
Note that neither ProbCut is exactly what we do because
we try just a handful of captures instead of all moves,
nevertheless it seems more in line with what we do.
Suggested by Joona.
No functional change.
Makes more sense than returning a draw score. Tested
with reduced MAX_PLY = 30 and passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17434 W: 3345 L: 3194 D: 10895
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 2610 W: 488 L: 373 D: 1749
With current limit of MAX_PLY = 100 the patch should not
introduce any measurable change, nevertheless is the correct
approach.
Idea of returning eval is from Michel Van den Bergh.
bench: 8430785
Fix small overflow error while converting
magic boosters from right rotate to left rotate,
in particular booster 38 was converted to 4122
instead of the corrcet value 26.
Formula used was:
s1 = original & 63, s2 = (original >> 6) & 63;
new = (64 - s1) | ((64 - s2) << 6);
Instead of:
s1 = original & 63, s2 = (original >> 6) & 63;
new = ((64 - s1) & 63) | (((64 - s2) & 63) << 6);
This has no impact in number of cycles needed, but
just in the resultig number that yields to a rotate
amount bigger than 63.
Spotted by Ehsan Rashid.
No functional change.
Latest master triggers a compiler warning due
to comparing int64_t to uint64_t.
notation.cpp: In Funktion »std::string pretty_pv(Position&, int, Value, int64_t, Move*)«:
notation.cpp:230:30: Warnung: Vergleich zwischen vorzeichenbehafteten und vorzeichenlosen Ganzzahlausdrücken [-Wsign-compare]
This patch should fix it.
No functional change.
When initializing the magic numbers used to
compute sliding attacks, we endless generate a
random and test it as a possible magic.
In the general case this takes a lot of iterations,
but here, insteaad of picking a casual random, we
rotate it a couple of times and generate a number that
we know has a good probability to be a magic candidate.
This is becuase the quantities by which we rotate the
number are known in advance to produce quickly a good
canidate.
The patch, inspired by DON, just moves the shuffle to RKISS
changing the boosters to take in account a left rotation
instead of a right rotation as in the original.
No functional change.
Although does not change ELO level, it seems
verification is useful in many zugzwang positions
as reported by many sources.
So revert this simplification.
bench: 8430785
Another SEE speed up that passed the SPRT short TC test!
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 81337 W: 15060 L: 14745 D: 51532
No functional change.
Actually race conditions do exist in an engine, just
think for a moment to TT concurrent access. Racy code
is not a problem per se, if the consequences are well
known and correctly handled.
In case of TT access we ensure that the TT move is validated
before to be tried, here we just retry the same move in less
that 1 case out of a million: this is totally harmless considering
that very probably the second time the move is tried we get
immediately a TT hit and search quickly returns.
So we simplify the code for no harm.
No fuctional change (in single thread case)
Tested with SPRT in simplification mode [-4.00,0.00],
this ensures that the patch is (very probably) not
a regression.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 27543 W: 4278 L: 4209 D: 19056
And long TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 39483 W: 7325 L: 7305 D: 24853
bench: 8347121
Hopefully this patch makes the code more:
* Self-documenting: Null search is always a zero window search,
because it is testing for a fail high. It should never be done
on a full window! The current code only works because we don't
do it at PV nodes, and therefore (alpha, beta) = (beta-1, beta):
that's the kind of "clever" trick we should avoid.
* Idiot-proof: If we want to enable null search at PV nodes, all we
need to do now is comment out the !PvNode condition. It's that simple!
In theory, null search should not be done at PV nodes, because PV nodes
should never fail high. But in practice, they DO fail high, because of
aspiration windows, and search inconsistencies, for example. So it makes
sense to keep that flexibility in the code.
No functional change.
Use ralpha instead of rbeta
* rbeta is confusing people. It took THREE attempts to code razoring
at PV nodes correctly in a recent test, because of the rbeta trick.
Unnecessary tricks should be avoided.
* The more correct and self-documenting way of doing this, is to say
that we use a zero window around alpha-margin, not beta-margin.
The fact that, because we only do it at PV nodes, alpha happens to be
beta-1 and that the current stuff with rbeta works, may be correct,
but is confusing.
Remove the misleading and partially erroneous comment about returning
v + margin:
* comments should explain what the code does, not what it could have done.
* this comment is partially wrong in saying that v+margin is "logical",
and that it is "surprising" that is doesn't work.
From a theoretical perspective, at least 3 ways of doing this are equally
defendable:
1/ fail hard: return alpha: The most conservative. We bet that the search
will fail low, but we don't know by how much and don't want to take risks.
2/ aggressive fail soft: return v (what the current code does). This
corresponds to normal fail soft, with the added assumption that we don't
care about the reduction effect (see below point 3/)
3/ conservative fail soft: return v + margin. If the reduced search (qsearch)
gives us a score <= v, we bet that the non reduced search will give us a
score <= v + margin.
* Saying that 2/ is "logical" implies that 1/ and 3/ are not, which is
arguably wrong. Besides, experimental results tell us that 2/ beats 3/,
and that's not something we can argue against: experimental results are
the only trusted metric.
* Also, with the benefit of hindsight, I don't think the fact that 2/ is
better than 3/ is surprising at all. The point is that it is YOUR turn to
move, and you are assuming that by NOT playing (and letting the opponent
capture your hanging pieces in QS) you cannot generally GAIN razor_margin(depth).
No functional change.
With some positions like
8/8/8/2p2K2/1pp5/br1p1b2/2p2r2/qqkqq3 w - -
The eval score is higher than VALUE_INFINITE because
is the sum of VALUE_KNOWN_WIN plus a big material
advantage. This leads to an assert. Here are the
steps to reproduce:
Compile SF with debug=yes then do
./stockfish
position fen 8/8/8/2p2K2/1pp5/br1p1b2/2p2r2/qqkqq3 w - -
go depth 1
This patch fixes the issue in this case, but do exsist
other positions for which the patch is not enough and
we will need to limit the eval score to be sure not
overflow the limit.
Note that is not possible to increase the value of
VALUE_INFINITE because should remain within int16_t
type to be stored in a TT entry.
bench: 7356053
Depth is already dependent on the actual value
of ONE_PLY, in particular can be expressed like:
Depth = n * ONE_PLY
And because formula is used to calculate R that is
also dependent on the value of ONE_PLY and can be
expressed like:
R = x * ONE_PLY
We don't want to divide depth by a 'ply' value but
directly by an integer number.
Spotted by sf-x
No functional change.
Instead of a fixed reduction of ONE_PLY, now
Null move dynamic reduction based on value can
grow larger in case we are above beta of a value
much higher then PawnValueMg.
Note that now an eval returning VALUE_KNOWN_WIN
makes null search to drop in qsearch.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
And long TC:
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
bench: 7356053
Verified there are no hidden bugs and is
actually a speed optimization:
Fixed games at 15+0.05 TC
ELO: 1.72 +-2.9 (95%) LOS: 87.5%
Total: 20000 W: 3741 L: 3642 D: 12617
No functional change
When time remaining is less than Emergency Move Time,
we won't even complete one iteration and engine reports
a stale +M0 score.
To reproduce run "go wtime 10"
info depth 1 seldepth 1 score mate 0 upperbound nodes 2 nps 500 time 4 multipv 1 pv a2a3
info nodes 2 time 4
bestmove a2a3 ponder (none)
This patch fixes the issue.
Tested by Binky at very short TC: 0.05+0.05
ELO: 5.96 +-12.9 (95%) LOS: 81.7%
Total: 1458 W: 394 L: 369 D: 695
And at a bit longer TC:
ELO: 1.56 +-3.7 (95%) LOS: 79.8%
Total: 16511 W: 3983 L: 3909 D: 8619
bench: 7804908
TCEC season 3, which is due to start in a few weeks, just
had its server upgraded to 64GB RAM and will therefore allow
16GB hash to be used per engine.
This is almost the upper limit without changing the
type of size and hashMask. After this we need to
move to uint64_t instead of uint32_t.
No functional change.
Retire KmmKm evaluation function. Instead give a very drawish
scale factor when the material advantage is small and not much
material remains.
Retire NoPawnsSF array. Pawnless endgames without a bishop will
now be scored higher. Pawnless endgames with a bishop pair will
be scored lower. The effect of this is hopefully small.
Consistent results both at short TC (fixed games):
ELO: -0.00 +-2.1 (95%) LOS: 50.0%
Total: 40000 W: 7405 L: 7405 D: 25190
And long TC (fixed games):
ELO: 0.77 +-1.9 (95%) LOS: 78.7%
Total: 39690 W: 6179 L: 6091 D: 27420
bench: 7213723
When we have a fail-high of a quiet move, store it in
a Followupmoves table indexed by the previous move of
the same color (instead of immediate previous move as
is in countermoves case).
Then use this table for quiet moves ordering in the same
way we are already doing with countermoves.
These followup moves will be tried just after countermoves
and before remaining quiet moves.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10350 W: 1998 L: 1866 D: 6486
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 14066 W: 2303 L: 2137 D: 9626
bench: 7205153
This hacky rule allows to get an about right eval out of this position:
r2qk2r/ppp2p2/2npbn2/2b1p3/2P1P1P1/2NB1PPp/PPNP3K/R1BQ1R2 b kq - 0 13
And, more importantly, passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6239 W: 1249 L: 1127 D: 3863
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 38371 W: 6165 L: 5888 D: 26318
bench: 8183238
As pointed out by Joona, Lucas and otehr people in
the forum, this endgame is not a known, there are many
positions where it takes more than 50 moves to claim the
win and becasue exact rules is not possible better to
retire and allow the search to workout the endgame for us.
bench: 8502826
While editing original Uri's messy patch
I have incorrectly simplified the logic
condition. Here is the correct original
version, as it was tested.
bench: 8502826
Remove the easy move code and add the condition to
play instantly if only one legal move is available.
Verified there is no regression at 60+0.05
ELO: 0.17 +-1.9 (95%) LOS: 57.0%
Total: 40000 W: 6397 L: 6377 D: 27226
bench: 8502826
If we are still at first move, without a fail-low and
current iteration is taking too long to complete then
stop the search.
Passed short TC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 26030 W: 4959 L: 4785 D: 16286
Long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18019 W: 2936 L: 2752 D: 12331
And performed well at 40/30
ELO: 4.33 +-2.8 (95%) LOS: 99.9%
Total: 20000 W: 3480 L: 3231 D: 13289
bench: 8502826
First tested with 50K games at very short TC of 5+0.05
ELO: 3.11 +-2.0 (95%) LOS: 99.9%
Total: 49665 W: 10941 L: 10497 D: 28227
Then retested with usual SPRT at short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16875 W: 3198 L: 3049 D: 10628
And at long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5890 W: 985 L: 857 D: 4048
bench: 7800379
In case ply is very high, function will round
to zero (although mathematically it is always
bigger than zero). On my system this happens at
movenumber 6661.
Although 6661 moves in a game is, of course,
probably impossible, for safety and to be locally
consistent makes sense to ensure returned value
is positive.
Non functional change.
Function move_importance() is already always
positive, so we don't need to add a constant
term to ensure it.
Becuase move_importance() is used to calculate
ratios of a linear combination (as explained in
previous patch), result is not affected. I have
also verified it directly.
No functional change.
Drop a useless parameter. This works because ratio1 and ratio2
are ratios of linear combinations of thisMoveImportance and
otherMovesImportance and so the yscale cancels out.
Therefore the values of ratio1 and ratio2 are independent
of yscale and yscale can be retired.
The same applies to yshift, but here we want to ensure
move_importance() > 0, so directly hard-code this safety
guard in function definition.
Actually there are some small differences due to rounding errors
and usually are at most few millisecond, that's means below 1% of
returned time, apart from very short intervals in which a difference
of just 1 msec can raise to 2-3% of total available time.
No functional change.
The flag raises also in case of a pawn duo, i.e.
when we have two adjacent pawns on the same rank,
and not only in case of a chain, i.e. when the two
pawns are on a diagonal line.
See this for a reference:
http://en.wikipedia.org/wiki/Connected_pawns
Renaming suggested by Ralph.
No functional change.
Use a skew-logistic function to replace the
MoveImportance[] array.
Verified it does not regress at fixed number
of games both at short TC:
LLR: -2.91 (-2.94,2.94) [-1.50,4.50]
Total: 39457 W: 7539 L: 7538 D: 24380
And long TC:
ELO: -0.49 +-1.9 (95%) LOS: 31.0%
Total: 39358 W: 6135 L: 6190 D: 27033
bench: 7335588
Passed short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 18331 W: 3608 L: 3453 D: 11270
And scored above 50% on a very long test in long TC
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 51533 W: 8181 L: 8047 D: 35305
bench: 7335588
In endgame it's better to have pawns on both wings.
So give a bonus according to file distance between left
and right outermost pawns.
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 39073 W: 7749 L: 7536 D: 23788
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6149 W: 1040 L: 910 D: 4199
bench: 7665034
Reduce eval discontinuity becuase now we kick in
king safety evaluation in many more cases.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8708 W: 1742 L: 1613 D: 5353
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 6743 W: 1122 L: 990 D: 4631
bench: 6835416
Tighter lower bound for pawn attacks so to
activate king safety in some cases like here:
6k1/2B3p1/2Pp1p2/2nPp3/2Q1P2K/P2n1qP1/R6P/1R6 w
Original patch by Chris, further simplified by
Jörg Oster.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 30171 W: 5887 L: 5700 D: 18584
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 20706 W: 3402 L: 3204 D: 14100
bench: 7607562
Add a bonus according if the attacking
pieces are minor or major.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13142 W: 2625 L: 2483 D: 8034
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18059 W: 3031 L: 2844 D: 12184
bench: 7425809
A great simplification that shows no regression
and it seems even a bit scalable.
Tested with fixed number of games:
Short TC
ELO: 0.60 +-2.1 (95%) LOS: 71.1%
Total: 39554 W: 7477 L: 7409 D: 24668
Long TC
ELO: 2.97 +-2.0 (95%) LOS: 99.8%
Total: 36424 W: 5894 L: 5583 D: 24947
bench: 8184352
Change updating rule after a TT hit to match
the same one at the end of the search.
Small change in functionality, but we want to
have uniform rules in the code.
bench: 7767864
We already update killers so it is natural to extend to
history and counter move too.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 52690 W: 9955 L: 9712 D: 33023
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 5555 W: 935 L: 808 D: 3812
bench: 7876473
After a fail high in LMR, if reduction is very high do
a research at lower depth before teh full depth one.
Chances are that the re-search will fail low and the
full depth one is skipped.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 11363 W: 2204 L: 2069 D: 7090
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 7292 W: 1195 L: 1061 D: 5036
bench: 7869223
Since all ENPASSANT moves are now considered dangerous, this
change of order should give a slight speedup.
Also simplify futilityValue formula.
No functional change.
We avoid to use an ad-hoc table at the cost of a
relative_rank() call in advanced_pawn_push().
On my 32 bit system it is even slightly faster (on 64bit
may be different). This is the speed in nps alternating
old and new bench runs:
new
368890
368825
369972
old
367798
367635
368026
No functional change.
Instead of a passed pawn now we just require the pawn to
be in the opponent camp to be considered a dangerous
move. Added some renaming to reflect the change.
Passed both short TC test
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10358 W: 2033 L: 1900 D: 6425
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 21459 W: 3486 L: 3286 D: 14687
bench: 8322172
To align to same named Position function and
avoid using std::cout directly.
Also remove some stale <iostream> include while
there.
No functional change.
Add a Mac SSE4.2 target. Also change the Mac OS X minimum version to
10.6. Rationale: 97% of Macs run at least 10.6, version 10.9 is now
free, and using 10.6 as the minimum version gives a small 5% boost in
benchmark speed over versions using 10.0 as the minimum version.
Finally, enable Clang’s Link Time Optimization when compiling for the
Mac.
No functional change.
An old idea retested at SPRT(0, 3) with 60+0.05 TC:
LLR: 2.95 (-2.94,2.94) [0.00,3.00]
Total: 98872 W: 15549 L: 15123 D: 68200
This is a very small elo increase patch so it really
stresses the limits of fishtest.
bench: 8596156
It seems to intorduce a regression when tested
with 3 threads at 15+0.05:
ELO: -2.26 +-2.2 (95%) LOS: 2.4%
Total: 30000 W: 4813 L: 5008 D: 20179
bench: 8331357
Tested setting FakeSplit to true and running
./stockfish bench 128 2
There is a different signature with and without
the patch so it affects functionality but
only in SMP case.
bench: 8331357
SMP case is very tricky and raises an assert in stage_moves():
assert(stage == KILLERS_S1 || stage == QUIETS_1_S1 || stage == QUIETS_2_S1)
So rewrite the code to just return moves[] when we are sure
we are in quiet moves stages.
Also rename stage_moves to quiet_moves to reflect that.
No functional change (but needs testing in SMP case)
Use MovePicker moves[] to access already tried
quiet moves. A bit of care shall be taken
to avoid calling stage_moves() when we are still
at ttMove stage, because moves are yet to be
generated. Actually our staging move generation
makes this code a bit more tricky than what I'd
like, but removing an ausiliary redundant
array like quietsSearched[] is a good thing.
Idea by DiscoCheck
bench: 9355734
Use the newly introduced LineBB[] to simplify this
super hot-path function.
Verified with perft we don't have any speed regression, although
the number of squares removed is less than before in case of
contact check.
Insipred by DiscoCheck implementation.
Perft numbers are the same, but we have an harmless functional
change due to reorder of moves, because now some illegal moves
are no more detected at generation time, but in the search.
bench: 8331357
This seems a die hard idea :-)
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17485 W: 3307 L: 3156 D: 11022
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 38181 W: 6002 L: 5729 D: 26450
bench: 8659830
Actually, it is not used, as both arrays have the
same values. Some local tests in either direction
showed no improvement.
Also some minor corrections in the comments.
No functional change.
Previously some squares could be "incorrectly" awarded
to a pinned piece.
e.g. in 3k4/1q6/3b4/3Q4/8/5K2/B7/8 b - - 0 1 the black
bishop get 4 squares too many and the white queen gets 6.
Passed both short TC.
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4871 W: 934 L: 817 D: 3120
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 38968 W: 6113 L: 5837 D: 27018
bench: 9282549
But compensate by reducing rook and queen
value by 53 = (160 / 3)
Material imbalances are affected as follows:
Red. Major Rook Queen Total
QRR +160 -2*53 -53 +1
QR +160 -53 -53 +54
RR +160 -2*53 0 +54
R 0 -53 0 -53
Q 0 0 -53 -53
so that the imbalance changes by at most 54 + 53 = 107 units.
This corresponds to appromximately 3.5cp in the final evaluation.
Verified with fixed number 40000 games at both short
and long TC it does not regress.
Short TC 15+0.05
ELO: 1.93 +-2.1 (95%) LOS: 96.6%
Total: 40000 W: 7520 L: 7298 D: 25182
Long TC 60+0.05
ELO: -0.33 +-1.9 (95%) LOS: 36.5%
Total: 39663 W: 6067 L: 6105 D: 27491
bench: 6703846
As, Gary (that analyzed the bug) says:
SF does not print a PV when the original best move fails low,
we hit our time allowance, and stop the search. The output from
the SF search is below. It was failing low on Ne1 at depth 34.
Then, we get bestmove Qd3, but no PV change.
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info nodes 18235667001 time 969824
bestmove e2d3 ponder c8d7
Looking at the code, if we hit Signals.stop, we return from id_loop
before printing any PV. It is possible for us to have resorted the
RootMove list though, which will change the move that is actually
played.
No functional change.
1/ eval margin and gains removed:
16bit are now free on TT entries, due to the removal of eval margin. may be useful
in the future :) gains removed: use instead by Value(128). search() and qsearch()
are now consistent in this regard.
2/ futility_margin()
linear formula instead of complex (log(depth), movecount) formula.
3/ unify pre & post futility pruning
pre futility pruning used depth < 7 plies, while post futility pruning used
depth < 4 plies. Now it's always depth < 7.
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench 7243575
1/ eval margin and gains removed:
- gains removed by Value(128): search() and qsearch() now behave consistently!
2/ futility_margin()
- testing showed that there is no added value in this weird (log(depth), movecount)
formula, and a much simpler linear formula is just as good. In fact, it is most
likely better, as it is not yet optimally tuned.
- the new simplified formula also means we get rid of FutilityMargins[], its
initialization code, and more importantly ss->futilityMoveCount, and the hacky
code that updates it throughout the search().
- the current formula gives negative futility margins, and there is a hidden interaction
between the move coutn pruning formula and the futility margin one: what happens is
that MCP is supposed to be triggered before we use the non-sensical negative futility
margins.
3/ unify pre & post futility pruning
- pre futility pruning (what SF calls value based pruning) used depth < 7 plies,
while post futility pruning (what SF calls static null move pruning) used depth < 4 plies.
- also the condition depth < 7 in pre futility pruning was not obvious, and it seemd
to be depth < 16 (futility_margin() returns an infinite value when depth >= 7).
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 10206576
RedundantRook and RedundantQueen replaced by simple
variable RedundantMajor. Also the SameColor coefficient
for Queen<->Queen has been set by definition to 0.
The remaining 5 parameters:
LinearCoefficients[ROOK]
LinearCoefficients[QUEEN]
QuadraticCoefficientsSameColor[ROOK][ROOK]
QuadraticCoefficientsSameColor[QUEEN][ROOK]
RedundantMajor
are sufficient to equate the material imbalances for the
5 common material configurations of R, RR, Q, QR and QRR
to any desired values simultaneously.
With the chosen parameters there should be no functional
change unless one side has more than 2 rooks or more
than 1 queen. For example bench from the start position
using the commands:
./stockfish
go depth 16
produces identical output except for one extra node
in the last iteration.
bench: 8198094
Coefficients for Bishop<->BishopPair and Bishop<->Bishop
are also pretty much redundant. By altering the values
in LinearCoefficients[] these coefficients can be zeroed
without changing the imbalance calculations in any position
with less than 3 bishops for one side.
bench: 7995098
First coefficient in the SameColor array does an
equivalent job when folded into the LinearCoefficients
array.
All of the diagonal terms in the OppositeColor array
are redundant due to cancellation.
No functional change.
In case we find a very good move after a
troubled start, we don't return immediately
anymore.
Tested directly at long TC where it passed:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 13910 W: 2397 L: 2228 D: 9285
bench: 7995098
And remove a complex (and broken) formula.
Indeed previous code was broken in case of TC with big
time increments where available_time() was too similar
to total time yielding to many time losses, so for instance:
go wtime 2600 winc 2600
info nodes 4432770 time 2601 <-- time forfeit!
maximum search time = 2530 ms
available_time = 2300 ms
For a reference and further details see:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dCPAvQDcm2E
Speed tested with bench disabling timer alltogheter vs timer set at
max resolution, showed we have no speed regressions both in single
core and when using all physical cores.
No functional change.
A combo of two patches that failed SPRT with score
higher than 50% but togheter they succeed:
SPRT at 60+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 7312 W: 1276 L: 1139 D: 4897
bench: 8029334
If the game got late enough that move_importance(currentPly) * slowMover / 100
rounds to 0, then we ended up dividing 0 by 0 when only looking 1 move ahead.
This apparently caused the search to almost immediately abort and Stockfish
would blunder in long games. So convert thisMoveImportance to a double.
No functional change.
This seems more a material imbalance topic,
anyhow test is good and so patch is applied
as is.
Passed both short TC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 17391 W: 3548 L: 3393 D: 10450
And long TC:
LLR: 3.00 (-2.94,2.94) [0.00,6.00]
Total: 34660 W: 5972 L: 5700 D: 22988
bench: 8291883
Dumb down a bit the code and trade some possible
speed (but this is far from hot path anyhow) for
some added readability for the layman.
No functional change.
The normalising transformation is computed all at
once by the helper function get_flip_sq and then
applied immediately to the relevant squares as soon
as they are loaded from the position class.
bench: 8350690
New formula mathces the old formula until d = 45
Test code:
int main() {
for(int d=1; d<=45; d++)
{
int a = int(log(double(d * d) / 2) / log(2.0) + 1.001);
int b = int(2.9 * log(double(d)));
if (a != b) std::cout << d << std::endl;
}
return 0;
}
bench: 8455956
This is the first chain bonus version
from Ralph that also passed both
Short TC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23460 W: 4727 L: 4556 D: 14177
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 31858 W: 5497 L: 5240 D: 21121
And performed better against current
committed version, always at 60secs:
LLR: -2.94 (-2.94,2.94) [-3.00,3.00]
Total: 26301 W: 4477 L: 4580 D: 17244
This test was done by Leonid.
bench: 8455956
After 40K games at 60 secs, result is still
not clear, but not a regression against SF 4
After
ELO: 50.11 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 10547 L: 4817 D: 24636
Before
ELO: 49.51 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 10483 L: 4821 D: 24696
So re-apply the patch to avoid to
special-case this one.
bench: 7403882
It was never updated !
Currently it only affects evaluate_passed_pawns()
and in particularly the rule to increase the bonus
if we have more non-pawn pieces. We could simply use
popcount() instead and avoid the little slowdown
in put_piece() and remove_piece(), but this would
leave a very subtle and tricky hole where people
are forced to remember that pos.count<ALL_PIECES>()
does not work. This is not obvious and so dangerous.
Thanks to Ronald de Man for spotting this.
bench: 7931424
Due to a strange issue (bug?) the ternary
operator does not return a BitCountType for
icc, so revert to the expression.
The same patch was already applied in
9749f1f14c
Thanks to NssY Wanyonyi for pointing out
this.
No functional change.
This reverts commit 4bc2374450 for
two reasons.
First regression testing shows almost equal
score:
Before the patch:
ELO: 49.75 +-2.5 (95%) LOS: 100.0%
Total: 27205 W: 7113 L: 3244 D: 16848
After the patch:
ELO: 48.87 +-2.9 (95%) LOS: 100.0%
Total: 20860 W: 5478 L: 2563 D: 12819
Second, and more sensible to me, this patch
increases safe check bonuses to 4 times their
original value (!) and considering:
- Values were already well tuned
- Values are highly critical
- King safety is highly critical, very TC
dependent and very difficult to test
- Our testing coverage is partial (self-testing,
blitz times)
I think is better to be safe than sorry and so
I revert the patch.
bench: 8440524
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10466 W: 2087 L: 1953 D: 6426
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 26334 W: 4540 L: 4310 D: 17484
And also proved stronger than a slightly
different patch, also succesful against master:
https://github.com/mcostalba/Stockfish/commit/dc6830a3b4ed12
But losing against current one in a match
at 60secs with SPRT [-3, 3]:
LLR: -2.96 (-2.94,2.94) [-3.00,3.00]
Total: 44484 W: 7360 L: 7463 D: 29661
bench: 9160831
Also the drawing criteria has been slightly loosened.
It now detects a draw if the king is ahead of all the
pawns and on the same file or the adjacent file.
bench: 7700683
This lost position 8/8/3q4/8/5k2/2P1R3/2K2P2/8 w - - 0 1
was previously evaluated as a draw.
The king and rook need to be correctly placed with
respect to the _same_ pawn.
(Note also that the check for the pawn being on RANK_2
in the old version is redundant: it must be on RANK_2 if
it hopes to protect a rook on RANK_3)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Good both at short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5448 W: 1133 L: 1012 D: 3303
And at long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40509 W: 6836 L: 6541 D: 27132
bench: 7700683
Don't need a struct here. Speed test shows
result is teh same. Moreover RKISS is used
mainly at startup to compute magics, so
prefer to keep it simple...RKISS ;-)
Also some assorted triviality while there.
No functional change.
These two changes go in opposite directions and it
seems that the combination is stronger than original.
Here are the positive tests at various TC:
15+0.05
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 24561 W: 4946 L: 4772 D: 14843
60+0.05
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 15259 W: 2598 L: 2423 D: 10238
40/30
LLR: 2.96 (-2.94,2.94) [-3.00,3.00]
Total: 2570 W: 527 L: 422 D: 1621
Unfortunately there is also a bad result
with one sec time increment that needs
to be further investigated:
12+1
LLR: -2.97 (-2.94,2.94) [-3.00,3.00]
Total: 2694 W: 438 L: 543 D: 1713
bench: 8340585
Rationale:
- Speed of double and float is about the same (not on the hot path anyway)
- Double makes code prettier (no need to write 1.0f, just 1.0)
- Only practical advantage of float is to use less memory, but since we never
store large arrays of double, we don't care.
No functional change.
Increase bench default depth from 12 to 13 and
add 15 new endgame positions to have broader
coverage and also more reliable nps calulcation
used for fishtest framework.
Due to the new endgame positions, where nps is higher,
the total nps is increased of about 15%.
Thanks to Lucas and Jörg for the suggestions.
No functional change, but bench number is now:
bench: 8336338
For some users -stack_size,0x4000 does not work,
so revert for now.
osX 10.6.8
gcc version 4.7.3 (MacPorts gcc47 4.7.3_2)
g++: error: unrecognized command line option '-stack_size,0x4000'
make[2]: *** [stockfish] Error 1
make[1]: *** [gcc-profile-make] Error 2
make: *** [profile-build] Error 2
No functional change.
This reverts commit 800410eef1 and instead increases
stack size.
I went through the old emails with Daylen that reported the
crash issue on Mac OS X and was fixed by 0049d3f337.
It was reported default stack size for a thread in Mac OS X is 8
megabytes while the patch that we are reverting allows to reduce
stack size at max of about 217KB, so the reason for the crash was
only marginal in MAX_MOVES value. On those emails Daylen also
hinted how to increase stack size for Mac OS X to 16MB.
So prefer to increase stack size to 16MB instad of re-inventing
the wheel and do our home grown stack as we did with the patch
that we are now reverting (it will remain anyhow in git history
for documentation purposes).
No functional change.
Unify extensions between PV and not PV nodes
and remove all but check extensions.
This is a simplification so tested at fixed number
of games where proved to not regress.
About 45k games at 15+0.05
ELO: 1.23 +-2.0 (95%) LOS: 88.5%
Total: 45643 W: 9107 L: 8946 D: 27590
About 45k games at 60+0.05
ELO: 1.07 +-1.8 (95%) LOS: 87.8%
Total: 46786 W: 7728 L: 7584 D: 31474
bench: 3172206
If the uci option 'Best Book Move' is set to true the lookup still
returns a move at random instead of the move with the highest
weight.
No functional change.
This should be enough for any legal position, even
the handcrafted ones, like the one presented by Reuven:
1Q5R/4Q1K1/B1Q5/B4Q2/N2Q4/pQ4Q1/pn2Q3/krQ4R w - -
Where currently we crash. This reverts the patch
0049d3f337 of 8/4/2012 where stack
was shrinked due to crashes while in deep analysys.
No functional change.
This greately reduces stack usage and is a
prerequisite for next patch.
Verified with 40K games both in single and SMP
case that there are no regressions.
No functional change.
This is an even safer setup proposed and tested
by Alexandre Meirelles.
Regression testing of 40K games at 10+0.05 show
result is stable both against current master:
ELO: -0.29 +-2.2 (95%) LOS: 39.7%
Total: 40000 W: 8010 L: 8043 D: 23947
and again original master (the one with smallest
time parameters):
ELO: 1.71 +-2.2 (95%) LOS: 93.8%
Total: 40000 W: 8325 L: 8128 D: 23547
Alexandre verified with LittleBlitzer time losses are
greately reduced with this setup:
Games Completed = 2100 of 3000 (Avg game length = 35.745 sec)
Settings = RR/128MB/15000ms+50ms/M 1000cp for 12 moves, D 150 moves/
Time = 39200 sec elapsed, 16800 sec remaining
1. Stockfish 190913 1091.5/2100 803-720-577 (L: m=313 t=1 i=0 a=406) (D: r=278 i=91 f=136 s=8 a=64) (tpm=212.5 d=14.75 nps=925427)
2. Houdini 2.0 w32 1008.5/2100 720-803-577 (L: m=250 t=299 i=0 a=254) (D: r=278 i=91 f=136 s=8 a=64) (tpm=204.1 d=12.04 nps=1326351)
No functional change.
Goes in the direction of avoiding time losses and seems
equivalent after almost 40K games at super fast TC of 10+0.05
ELO: 2.61 +-2.2 (95%) LOS: 99.1%
Total: 39869 W: 8258 L: 7959 D: 23652
No functional change.
Goes in the direction of avoiding time losses and seems
equivalent after almost 40K games at super fast TC of 10+0.05
ELO: 2.41 +-2.3 (95%) LOS: 98.1%
Total: 37222 W: 7843 L: 7585 D: 21794
No functional change.
The ideal setting for super-blitz might be something like:
"Emergency Base Time" = 50
"Emergency Move Time" = 5
This would give a total emergency time buffer of:
50 + 40 * 5 = 250 ms
This setup replaces the previous half cooked hack
"Don't blunder under extreme time pressure".
Test results are very good at super blitz, but keep good even
at 60 secs.
At 5+0.05
ELO: 24.30 +-2.4 (95%) LOS: 100.0%
Total: 37802 W: 10060 L: 7420 D: 20322
At 15+0.05
ELO: 13.41 +-2.9 (95%) LOS: 100.0%
Total: 22271 W: 4853 L: 3994 D: 13424
At 60+0.05
ELO: 5.30 +-3.2 (95%) LOS: 99.9%
Total: 16000 W: 2897 L: 2653 D: 10450
No functional change.
Instead of current code, give a bonus according to the frontmost
square among candidate + passed pawns.
This is a big simplification that removes a lot of accurate code
substituting it with a statistically based one using the common
'bonus' scheme, leaving to the search to sort out the details.
Results are equivalent but code is much less and, as an added bonus,
we now store candidates bitboard in pawns hash and allow this
info to be used in evaluation. This paves the way to possible
candidate pawns evaluations together with all the other pieces,
as we do for passed.
Patch passed short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16927 W: 3462 L: 3308 D: 10157
Then failed (quite quickly) at long TC
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 8451 W: 1386 L: 1448 D: 5617
But when ran with a conclusive 40K fixed games at 60 secs it proved
almost equivalent to original one.
ELO: 1.08 +-2.0 (95%) LOS: 85.8%
Total: 40000 W: 6739 L: 6615 D: 26646
bench: 3884003
Now that we use pre-increment on enums, it
make sense, for code style uniformity, to
swith to pre-increment also for native types,
although there is no speed difference.
No functional change.
ENABLE_OPERATORS_ON has incorrect definitions of
post-increment and post-decrement operators.
In particularly the returned value is the variable
already incremented/decremented, while instead they
should return the variable _before_ inc/dec.
This has no real effect because are only used in loops
and where the returned value is never used, neverthless
it is wrong. The fix would be to copy the variable to a
dummy, then inc/dec the variable, then return the dummy.
So instead, rename to pre-increment that can be implemented
without the dummy, actually the current implementation
it is already the correct pre-increment, with the only change
to return a reference (an l-value) and not a copy, so
to properly mimic the pre-increment on native integers.
Spotted by Kojirion.
No functional change.
We always attempt to keep at least this emergencyBaseTime
at clock. But if available time is very low it means that
we will force ourself to play immediately to satisfy the
emergencyBaseTime constrain and so leading to blunders.
Patch is good at short and very short TC (15secs and 5secs respectively)
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26590 W: 5426 L: 5245 D: 15919
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 5767 W: 1397 L: 1268 D: 3102
Instead seems has no influence at longer TC (60 secs)
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 79862 W: 13623 L: 13339 D: 52900
So it is committed to have a broader testing but is
to be consider still EXPERIMENTAL and can be reverted
easily.
No functional change.
In case we have less then 10ms to think as soon as
we wake up the timer, it immediately fires and calls
check_time() where due to condition:
elapsed > TimeMgr.maximum_time() - 2 * TimerResolution
the stop flag is set and search returns immediately, without
actually search anything.
Here the somewhat hacky fix is to start the timer after
at least one iteration as been completed.
No functional change.
The possible maximum mobility cardinality (plus one in case of
zero squares available) is:
- Knights: max. 8 squares -> max. 9 entries
- Bishops: max. 13 squares -> max. 14 entries
- Rooks: max. 14 squares -> max. 15 entries
- Queen: max. 27 squares -> max. 28 entries
So remove the extra entries in the table.
Spotted by Dariusz Orzechowski.
No functional change.
Union of
- LMR >= 3 plies from Gary tests.stockfishchess.org/tests/view/522522960ebc595d328fcafd
- allows() tweak from Reuven tests.stockfishchess.org/tests/view/5225fa1c0ebc595d328fcb53
Both passed Step I and failed Step II.
Instead this union passed both short TC:
LLR: 2.95 (-2.94,2.94)
Total: 14525 W: 3063 L: 2874 D: 8588
And long TC
LLR: 2.94 (-2.94,2.94)
Total: 31075 W: 5566 L: 5308 D: 20201
bench: 4238160
Idea is sound but implementation is partial. Ryan and Joona noticed that
we leave an hole in material table. Also we got another report by an user
of an odd behaviour. Namely, if you start stockfish and from the prompt
give 'bench' you get 3453941, then if you run again bench you get 3453940.
The reason is that two different positions with the same number of pieces,
but one with a bishop pair and another without have the same material key.
But after Eelco patch also different material imbalance and this yields
to this issue.
Restesting at long TC shows the patch does not really contribute at
ELO improvement. Actually patch failed at long TC.
LLR: -2.97 (-2.94,2.94)
Total: 23109 W: 4104 L: 4092 D: 14913
So revert.
bench: 3453945
Prefer pos.bishop_pair() to pos.count<BISHOP>(WHITE) > 1
because the first checks that the two bishops are on
different color squares.
Although the change seems to kick in only in very rare cases,
quite surprisingly it was able to pass SPRT test at short TC.
LLR: 2.95 (-2.94,2.94)
Total: 39818 W: 8174 L: 7956 D: 23688
bench: 3453941
This was thought to be a draw but the bishops generally win. However,
it takes up to 66 moves. The position in the diagram was thought to be
a draw for over one hundred years, but tablebases show that White wins
in 45 moves. All of the long wins go through this type of semi-fortress
position. It takes several moves to force Black out of the temporary
fortress in the corner; then precise play with the bishops prevents Black
from forming the temporary fortress in another corner (Nunn 1995:265ff).
Before computer analysis, Speelman listed this position as unresolved,
but "probably a draw" (Speelman 1981:109).
bench: 3453945
STANDALONE-TOOLCHAIN.html in Android NDK says:
It is recommended to use the -mthumb compiler flag to force the generation
of 16-bit Thumb-1 instructions (the default being 32-bit ARM ones).
If you want to target the 'armeabi-v7a' ABI, you will need ensure that the
following two flags are being used:
CFLAGS='-march=armv7-a -mfloat-abi=softfp'
Note: The first flag enables Thumb-2 instructions, and the second one
enables H/W FPU instructions while ensuring that floating-point
parameters are passed in core registers, which is critical for
ABI compatibility. Do *not* use these flags separately!
Thanks to Peter Osterlund for pointout this doc and for showing me
an example Makefile to follow.
No functional change.
Becuase castle is coded as "king captures the rook"
the to_sq(move), A1/8 or H1/8 is empty after the move,
leading to assert assert(p != NO_PIECE) in color_of().
Teach allows() asserts about castle and fix the crash.
Bug reported by Ryan Takker and tracked down by Tom Vijlbrief.
No functional change.
If our rook is behind a passed pawn, all
squares are defended.
One of the longest tests to pass !
Passed both short TC
LLR: 2.97 (-2.94,2.94)
Total: 44560 W: 9518 L: 9281 D: 25761
And long TC
LLR: 2.96 (-2.94,2.94)
Total: 61348 W: 11618 L: 11192 D: 38538
bench: 3787694
With
position fen 7k/8/8/8/8/7P/6K1/7B w - - 0 1
go depth 25
The evaluation at depth 22 is not draw as it should be. The reason is that
when search reaches the position 8/6kP/8/8/8/3B4/6K1/8 w - - 0 1 if white plays
h8R or h8N then we get a position that is a "KNOWN_WIN" and is _not_ a check, so
futility pruning in qsearch kicks in and black may think that it is "futile"
to reply Kxh8 since, according to the logic of the code, it cannot raise the score
back towards a draw.
bench: 4728533
The case of two lone kings on the board is already considered
by the "No pawns" scaling factor rules in material.cpp as is
KBK and KNK.
Moreover we had a small leak in endgames map because for
KK endgame it happens white and black material keys are the
same (both equal to zero), so when adding the black endgame in
Endgames::add() we were overwriting the already exsisting
white one, leading to a memory leak found by Valgrind.
So remove the endgames althogheter and rely on scaling
to correctly set the endgames value to a draw.
No functional change.
Instead of classical flags, throw an
exception when we want to immediately halt
the search. Currently only one type
is used for both UCI stop and threads
cut off.
No functional change.
Idea originated from a post of Don Dailey
on talkchess and reported by Eelco.
This is the last succesful attempt of a long
series of trials (as usually happens, the
'idea' alone is not enough).
Passed both short 15secs TC
LLR: 2.97 (-2.94,2.94)
Total: 7629 W: 1645 L: 1515 D: 4469
And long 60secs TC
LLR: 2.96 (-2.94,2.94)
Total: 10218 W: 1932 L: 1775 D: 6511
bench: 4944581
With the new automatic setting of split depth
instead of a default, the user no longer needs
guidance on setting the split point.
Also threads now defaults to one.
No functional change.
Search::RootColor is a global parameter set
before to start a search, it is not something
trace() should change.
This patch allows to add trace() calls, for
debugging, inside search itself without altering
the bench, and also ensures that the values
returned by trace() and evaluate() are fully
equivalent.
No functional change.
The rounding formula is different between
positive and negative scores due to the
GrainSize/2 term that is asymmetric.
So use truncation instead of rounding. This
guarantees that evaluation is rounded to zero
in the same way for both positive and negative
scores.
Found with position's flip
bench: 4634244
Because VALUE_NONE is 30002, it happens that
after a check the next move is never an improving
one.
After this patch bench signature is independent from
VALUE_NONE actual value.
bench: 4303194
Apply to LMR the same Eelco's idea
applied to move count pruning.
This is the result of a series of
attempts started by Thomas Kolarik.
Passed both short TC
LLR: 2.95 (-2.94, 2.94)
Total: 5675 W: 1241 L: 1117 D: 3317
And long TC:
LLR: 2.95 (-2.94, 2.94)
Total: 8748 W: 1689 L: 1539 D: 5520
bench: 4356801
Not a real functional change, but bench changed due to different piecelist
reordering. To verify it a temporary my canonicalize_rooks function was
written as follows. It just ensures that the rook on the "smaller" square
is listed first.
void Position::canonicalize_rooks(Color c)
{
if (pieceCount[c][ROOK] == 2)
{
Square s0 = pieceList[c][ROOK][0];
Square s1 = pieceList[c][ROOK][1];
if (s0 > s1)
{
pieceList[c][ROOK][0] = s1;
pieceList[c][ROOK][1] = s0;
index[s0] = 1;
index[s1] = 0;
}
}
}
With this both bench and the test on Chess960 positions
./stockfish bench 128 1 8 Chess960.epd file > /dev/null
Gives same result.
bench: 4424151
Set threads number always to 1 at startup and let the
user explicitly to chose the number of threads.
Also preserve the useful behavior of automatically set
"Min Split Depth" according to the requested threads,
indeed this parameter is too technical for a casual user,
so, when left to zero, we set it on a sensible value.
No functional change
The new Position methods add_piece, move_piece, and remove_piece
now manage the member variables pieceList, pieceCount, and index,
and 9 blocks of code in Position that used to manipulate those
data structures by hand now call the new methods.
There is a slightly slowdown (< 1%) on Clang and on perft,
but the cleanup compensates the little speed loss.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce ThreadBase struct that is search
agnostic and just handles low level stuff,
and derive all the other specialized classes
form here.
In particular TimerThread does not hinerits
anymore all the search related stuff from Thread.
Also some renaming while there.
Suggested by Steven Edwards
No functional change.
At thread creation start_routine() is called
and from there the virtual function idle_loop()
because we do this inside Thread c'tor, where the
virtual mechanism is disabled, it could happen that
the base class idle_loop() is called instead.
The issue happens with TimerThread and MainThread
where, at launch, start_routine calls
Thread::idle_loop instead of the derived ones.
Normally this bug is hidden because c'tor finishes
before start_routine() is actually called in the
just created execution thread, but on some platforms
and in some cases this is not guaranteed and the
engine hangs.
Reported by Ted Wong on talkchess
No functional change.
The endgame king + minor vs king is erroneusly
detected as king + minor vs king + minor
Here the fix is to detect king + minor earlier,
in particular to add these trivial cases to
endgame evaluation functions.
Spotted by Reuven Peleg
bench: 4727133
And #ifdef instead of #if defined
This is more standard form (see for example iostream file).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A big simplification and removing of useless code.
Finished at 50% both at short TC (with SPRT) than
at long TC at fixed number of games:
ELO: -0.14 +-3.4 (95%) LOS: 46.8%
Total: 15206 W: 2836 L: 2842 D: 9528
bench: 5059948
This reverts commit 4b3a0fdab0.
As Gary says: " It failed when I tried it at long TC previously, and only
barely passed this time. Some anecdotal evidence is that it hurts vs other
engines as well (the Lightspeed rating list showed a 16 elo drop from previous
best version - still +- 5 error bars on both, but that's still significant)"
I also agree that if we have some doubts (like in this case) it is better to
be safe than sorry.
bench: 4615572
Reduces the influence of PSQT for entries such as
the extended center and the h-file.
Passed both short TC test:
LLR: 2.95 (-2.94,2.94)
Total: 23919 W: 5207 L: 5029 D: 13683
And long TC one:
LLR: 2.96 (-2.94,2.94)
Total: 5762 W: 1108 L: 974 D: 3680
Bench: 4617880
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This very speed critical code was full of clever (!)
tricks and subtle details.
So I have rewritten it in a more straithforward way
and, as very often happens, result is even faster
than original.
No functional change.
This reverts commit 3e95800814
For some reason it fails the short TC test:
LLR: -2.96 (-2.94,2.94)
Total: 20033 W: 4214 L: 4265 D: 11554
bench: 4769737
It is somewhat unbilievable but our SEE is broken !
If the first SEE move is a king capture and square is
defended then SEE continues instead of breaking.
The bug shows only on normal SEE, not see_sign() so
probing with a:
dbg_hit_on_c(slIndex==1, captured == KING);
reports just a tiny:
Total 3465656 Hits 6646 hit rate (%) 0
Bug was there since Retire seeValues[] and move PieceValue[] out of Position of 26/6/2011 (!)
although for some reason didn't show immediately, indeed the
bougous patch was a "No functional change" (!!)
bench: 4699504
It is somewhat unbilievable but our SEE is broken !
If the first SEE move is a king capture and square is
defended then SEE continues instead of breaking.
The bug shows only on normal SEE, not see_sign() so
probing with a:
dbg_hit_on_c(slIndex==1, captured == KING);
reports just a tiny:
Total 3465656 Hits 6646 hit rate (%) 0
Bug was there since 351ef5c85b of 26/6/2011 (!)
although for some reason didn't show immediately, indeed the
bougous patch was a "No functional change" (!!)
bench: 4793754
Without patch we have 333198 nps, with patch 334249.
A very small +0.3%, not a lot manily becuase this is a
side path that is taken very few times.
Anyhow idea is correct becuase first 'quick' condition
has an hit rate of about 95%.
No functional change.
But still keep the same original
margin for score.
Passed both short TC test
LR: 2.95 (-2.94,2.94)
Total: 3710 W: 845 L: 726 D: 2139
And long TC
LLR: 2.95 (-2.94,2.94)
Total: 57859 W: 10939 L: 10532 D: 36388
bench: 4769737
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Partially revert previous patch and use
unlikey() just as code annotation.
Actually it is better to rely on a profiler for branch prediction:
http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html
"In fact, even when only one in ten thousand values is nonzero,
we're still at only roughly the break-even point"
No functional change,
It is somewhat redundant and could make SF
name too long, so use just Version, in case
of a signature build Version will be set to
'sig-xxx' otherwise, if left empty, we fall
back on usual date stamp.
No functional change.
When compiling with:
make signature-build ARCH=xxx COMP=xxx
After binary has been roduced, it will be run to
get the signature 'stockfish bench' and this
number will be used as Version, so that it
will be easy to track the original sources
from a binary.
No functinal change.
Due to a strange issue (bug?) the ternary
operator does not return a BitCountType for
icc, so revert to the expression used
before bcbc9bfd1f
No functional change.
Here speed up is the name of the game.
Speed up is gained:
- Removing the useless enoughMaterial code
- Limiting trapped rook evaluation to where it counts
Tested at long TC:
LLR: 2.97 (-2.94,2.94)
Total: 10061 W: 1948 L: 1790 D: 6323
bench: 4558173
Thanks to Don, Miguel, Louis and the other people
of talkchess forum for the suggestion:
http://www.talkchess.com/forum/viewtopic.php?t=48612
Also sync polyglot.ini with current UCI options
No functional change.
Unfortunatly a reverse test at long TC failed:
master^ vs master
LLR: 1.37 (-2.94,2.94)
Total: 33682 W: 6294 L: 6071 D: 21317
So becuase short TC score is 50% there is a good
possibility patch is not scalable.
So revert it.
bench: 4507288
Do not use threat move to detect the condition. This
let us to retire the big allows() function.
Test at short TC was within 50% score:
LLR: -2.95 (-2.94,2.94)
Total: 38272 W: 7941 L: 7940 D: 22391
To be verified with reverse long TC
bench: 4191565
Here the main difference is that now we center
aspiration window on last returned score. This allows
to simplify handling of mate scores.
We have done a reversed SPRT tests, where we wanted to
verify if master is stronger than this patch.
Long TC: master vs this patch (reverse test)
LLR: -2.95 (-2.94,2.94)
Total: 37992 W: 7012 L: 6920 D: 24060
bench: 4507288
Temporary revert aspiration window patch
so to be visible to everybody: it will be
re-applied with next patch
No functional change (together with next one)
Here the main difference is that now we center
aspiration window on last returned score. This allows
to simplify handling of mate scores.
We have done a reversed SPRT tests, where we wanted to
verify if master is stronger than this patch.
Long TC: master vs this patch (reverse test)
LLR: -2.95 (-2.94,2.94)
Total: 37992 W: 7012 L: 6920 D: 24060
bench: 4507288
A simplification of the 'dangerous' definition.
Seems neutral at reverse test at long TC
master vs patch
LLR: -2.96 (-2.94,2.94)
Total: 16974 W: 3122 L: 3139 D: 10713
bench: 4689029
Function calloc() already initializes memory to
zero, so avoid calling clear() afterwards.
Also some renaming while there (inspired by DiscoCheck).
No functional change.
Here we skip the call to pos.attacks_from<ROOK>(s) in the 98%
of cases, testing the first 2 members first. Unfortunatly
code is a bit triky and not clear. So we give up to the
speed optimization in exchange of more code clarity.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So when we are doing a LMR search at the parent ALL node.
This patch didn't prove stronger at 60" TC
LLR: -2.97 (-2.94,2.94)
Total: 22398 W: 4070 L: 4060 D: 14268
But, first, it scores at 50%, second (and most important for me) the opposite,
i.e. normal reduction when parent node is not reduced, seems very bad:
LLR: -2.95 (-2.94,2.94)
Total: 7036 W: 1446 L: 1534 D: 4056
According to Don, this idea of increased reduction of CUT nodes
works because if parent node is reduced, missing a cut-off due to
reduced depth search (meaning position is somehow tricky) forces
a full depth research at parent node, giving due insight in this
set of sensible positions.
IOW if we expect a node to fail-high at depth n, then we assume it
should fail-high also at depth n-1, if this doesn't happen it means
position is tricky enough to deserve a research at depth n+1.
bench: 4687419
We got a good result from this tweak, in line with
what was already found by Don Dailey.
At short TC:
LLR: 2.95 (-2.94,2.94)
Total: 13097 W: 2742 L: 2598 D: 7757
At long TC:
LLR: 2.97 (-2.94,2.94)
Total: 7281 W: 1408 L: 1265 D: 4608
bench: 5108393
Follow Don Dailey definition of cut/all node:
"If the previous node was a cut node, we consider this an ALL node.
The only exception is for PV nodes which are a special case of ALL nodes.
In the PVS framework, the first zero width window searched from a PV
node is by our definition a CUT node and if you have to do a re-search
then it is suddenly promoted to a PV nodes (as per PVS search) and only
then can the cut and all nodes swap positions. In other words, these
internal search failures can force the status of every node in the subtree
to swap if it propagates back to the last PV nodes."
http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=519741&t=47577
With this definition we have an hit rate higher than 90% on:
if (!PvNode && depth > 4 * ONE_PLY)
dbg_hit_on_c(cutNode, (bestValue >= beta));
And an hit rate of just 28% on:
if (!PvNode && depth > 4 * ONE_PLY)
dbg_hit_on_c(!cutNode, (bestValue >= beta));
No functional change.
Fix was wrong becuase search starts from ss+1,
code is a bit tricky here, so rewrite in a way
to be more easy to read and understand.
Spotted by Eelco.
No functional change.
In case of we pick a sub-optimal move be
sure to print this, and not the best one
on seach log file.
Bug spotted by Guenther Demetz.
No functional change.
Search is started after setting a position and
issuing UCI 'go' command. Then if we stop the search
and call 'go' again without setting a new position it
is assumed that the previous setup is preserved, but
this is not the case because what happens is that
SetupStates is reset to NULL, leading to a crash as
soon as RootPos.is_draw() is called because st->previous
is now stale.
UCI protocol is not very clear about requiring that a
position is setup always before launching a search,
so here we easy the life of GUI developers assuming
that the current state is preserved after returning
from a 'stop' command.
Bug reported by Gregor Cramer.
No functional change.
A small number of tests with simulated
annealing at 15s indicated these values
may be better
And this is verified at long 60+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 40658 W: 7821 L: 7501 D: 25336
bench: 4931544
This patch is the sum of:
- Grainsize of 4 instead of 8
- Removing "depth < DEPTH_ZERO"
- Change DEPTH_QS_RECAPTURES = -5 to -7
All the patches individually failed to pass SPRT but scored
around 50%.
Together they pass easily short TC:
LLR: 2.96 (-2.94,2.94)
Total: 4429 W: 964 L: 844 D: 2621
And with some difficult long TC of 60+0.05:
LLR: 2.95 (-2.94,2.94)
Total: 64133 W: 11968 L: 11532 D: 40633
bench: 4821467
Add MOVE_NONE at the tail, this allows to loop
across MoveList checking for *it != MOVE_NONE,
and because *it is used imediately after compiler
is able to reuse it.
With this small patch perft speed increased of 3%
And it is also a semplification !
No functional change.
Most of the time we cut-off earlier, at captures, so this
results in useless work.
There is a small functionality change becuase 'ss' can change
from MovePicker c'tor to when killers are tried due, for
instance, to singular search.
bench: 4603795
Very good at long 60"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 5954 W: 1151 L: 1016 D: 3787
[edit: slightly changed form original patch to avoid useless loop
across killers when killer is MOVE_NONE]
bench: 4327405
Performed more or less well at short TC
LLR: 2.95 (-2.94,2.94)
Total: 50517 W: 9815 L: 9574 D: 31128
And a bit better at long TC
LLR: 2.96 (-2.94,2.94)
Total: 15564 W: 2805 L: 2624 D: 10135
bench: 4375253
It seems that do not limiting checking the
trapped rook only on rank 1 improves the
score.
At long TC
LLR: 2.97 (-2.94,2.94)
Total: 6581 W: 1346 L: 1204 D: 4031
bench: 4985012
Don't update refutation table in case of
previous move is MOVE_NULL or MOVE_NONE
and don't try refutation if is already
a killer move.
Pass both short TC
LLR: 2.96 (-2.94,2.94)
Total: 4310 W: 953 L: 869 D: 2488
And long one
LLR: 2.95 (-2.94,2.94)
Total: 6707 W: 1254 L: 1184 D: 4269
bench: 4785954
Very good result both at short TC 15+0.05
LLR: 2.95 (-2.94,2.94)
Total: 2803 W: 596 L: 483 D: 1724
And at long TC 60+0.05
LLR: 2.95 (-2.94,2.94)
Total: 2862 W: 548 L: 431 D: 1883
bench: 4329221
Unortunatly we have no guarantee that the call to
operator~(Color c) is resolved at compile time.
Perhaps the solution would be to use C++11 const_expr,
but for now simply use the good old-style ternary operator
that works as expected.
No functional change.
A rook is trapped if on rank 1 as is the king.
Currently the condition aloows for the rook
to be also in front of the pawns as long
as king is on first rank.
Verified with short TC test:
LLR: -1.71 (-2.94,2.94)
Total: 23234 W: 4317 L: 4317 D: 14600
Here what it counts is that after 23K games
result is equal.
bench: 4696542
When it is already defined(_WIN32).
According to Microsoft documentation:
http://msdn.microsoft.com/en-us/library/b0084kay.aspx
_WIN32 Defined for applications for Win32 and Win64. Always defined.
_WIN64 Defined for applications for Win64.
Patch suggested by Joona.
No functional change.
This info is normally printed together with
PV info in uci_pv() but when search is stopped,
for instance when max search time is reached,
uci_pv is not called and we miss this bits.
Suggested by gravy_train
No functional change.
But this time do not play with pointers, in
particular do not assume that size_t is an
unsigned type of the same width as pointers.
This code should be fully portable.
No functional change.
This fixes an assert while testing with debug on.
Assert was due to static null pruning returning value
eval - futility_margin(depth, (ss-1)->futilityMoveCount)
That was sometimes higher than VALUE_INFINITE triggering
an assert at the caller site.
Because eval con be equal to ttValue and anyhow is read from
TT that can be corrupted in SMP case, we need to sanity
check it before to use.
bench: 4176431
Let TT clusters (16*4=64 bytes) to hold on a singe cache line.
This avoids the need for the double prefetch.
Original patches by Lucas and Jean-Francois that has also tested
on his AMD FX:
BIG HASHTABLE
./stockfish bench 1024 1 18 > /dev/null
Before:
1437642 nps
1426519 nps
1438493 nps
After:
1474482 nps
1476375 nps
1475877 nps
SMALL HASHTABLE
./stockfish bench 128 1 18 > /dev/null
Before:
1435207 nps
1435586 nps
1433741 nps
After:
1479143 nps
1471042 nps
1472286 nps
No functional change.
Crash is due to uninitialized ss->futilityMoveCount that
when happens to be negative, yields to an out of range
access in futility_margin().
Bug is subtle because it shows itself only in SMP case.
Indeed in single thread mode we only use the
Stack ss[MAX_PLY_PLUS_2];
Allocated at the begin of id_loop() and due to pure
(bad) luck, it happens that for all the MAX_PLY_PLUS_2
elements, ss[i].futilityMoveCount >= 0
Note that the patch does not prevent futilityMoveCount
to be overwritten after, for instance singular search
or null verification, but to keep things readable and
because the effect is almost unmeasurable, we here
prefer a slightly incorrect but simpler patch.
bench: 4311634
Instead of a pointer. This should fix the issue of
remaining with a stale pointer when for instance calling
IID, but also null search verification, singular search
and razoring where we call search with the same ss
pointer. In this case ss->ei is overwritten in the
search() call and upon returning remains stale.
This patch could have a performance hit because Eval::Info
is big (176 bytes) and during splitting we copy 4 ss entries.
On the good side, this patch is a clean solution.
Proposed by Gary.
No functional change.
Allow to use EvalInfo struct, populated by
evaluation(), in search.
In particular we allocate Eval::Info on the stack
and pass a pointer to this to evaluate().
Also add to Search::Stack a pointer to Eval::Info,
this allows to reference eval info of previous/next
nodes.
WARNING: Eval::Info is NOT initialized and is populated
by evaluate(), only if the latter is called, and this
does not happen in all the code paths, so care should be
taken when accessing this struct.
No functional change.
The idea is to fail high more easily in static
null test if in the parent node we are already
very deep in the move list, so the propability
to fail high there is very low.
[edit: I have slightly changed the functionality
moving
ss->futilityMoveCount = moveCount;
At the end of the pruning code, this should not affect
ELO in anyway, but makes code more natural and logic]
Test with SPRT is good at 15+0.05
LLR: 2.96 (-2.94,2.94)
Total: 50653 W: 10024 L: 9780 D: 30849
And at 60+0.05
LLR: 2.97 (-2.94,2.94)
Total: 40799 W: 7227 L: 6921 D: 26651
bench: 4530093
When we use sysconf(_SC_NPROCESSORS_ONLN) to get number of
cores, we have to include sysconf library that is unistd.h
Sometimes it happens to work just becuase unistd.h indirectly
included by some other libraries, but not always.
Reported and fixed by Eyal BD
No functional change.
The idea is to penalize a bishop in case of
its pawns are on the same colored squares.
Good at short 15"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 4252 W: 925 L: 806 D: 2521
And at longer 60"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 15006 W: 2743 L: 2564 D: 9699
bench: 5274705
It seems stronger both at fast 15+0.05 TC with fixed game number test:
ELO: 2.74 +-2.7 (95%) LOS: 97.6%
Total: 24000 W: 4698 L: 4509 D: 14793
And also at long 60+0.05 TC with SPRT
LLR: 3.05 (-2.94,2.94)
Total: 38986 W: 6845 L: 6547 D: 25594
bench: 5157061
Use an optinal argument instead of a template
parameter. Interestingly, not only is simpler,
but also faster, perhaps due to less L1 instruction
cache pressure because we don't duplicate the very
used SEE code path.
No functional change.
Rationale:
- Current settings seem to make engine *significantly* weaker in analysis mode.
- In practice this setting only has effect when king safety scores are high.
- Even in analysis mode its far more important to know if one side is getting mated,
rather than get evaluation correct with 1cp accuracy.
No functional change
According to Jean-Paul this setup should be stronger
than default.
And SPRT test seems to confirm it:
At fast TC 15"+0.05
ELO: 3.33 +-2.7 (95%) LOS: 99.2%
Total: 25866 W: 5461 L: 5213 D: 15192
At longer TC 60"+0.05
ELO: 7.27 +-5.0 (95%) LOS: 99.8%
Total: 6544 W: 1212 L: 1075 D: 4257
bench: 5473339
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Increasing depth limit to 10 plies seems stronger
after 16K games at 15"+0.05 (ELO: +3.56) and also
repeating the test at 60"+0.05 TC:
ELO: 2.08 +-3.1 (95%) LOS: 90.9%
Total: 16000 W: 2641 L: 2545 D: 10814
Moreover setting the limit to 12 is proved stronger
then limit set to 10 by direct SPRT test at 15"+0.05:
ELO: 2.56 +-2.0 (95%) LOS: 99.5%
Total: 46568 W: 9240 L: 8897 D: 28431
So we directly set the limit to 12, the strongest setup.
bench: 4361224
Master IID formula is depth / 2
Previous patch is depth - 4 * ONE_PLY
This one is the middle way:
(dept/2 + depth-4*ONE_PLY)/2 -> depth-2*ONE_PLY-depth/4
After 16000 games at 60+0.05 th 1
ELO: 4.08 +-3.1 (95%) LOS: 99.5%
Total: 16000 W: 2742 L: 2554 D: 10704
bench: 4781239
Same idea of 5af8179647
in qsearch() but applied to search()
After 15500 games at 15+0.05
ELO: 4.48 +-3.4 (95%) LOS: 99.5%
Total: 15500 W: 3061 L: 2861 D: 9578
bench: 4985829
Always before pruning the move, it's important to check that:
bestValue > VALUE_MATED_IN_MAX_PLY
See example position:
8/2p1p3/P1NpP3/3k4/1P1BN3/2P1P3/2Q5/6K1 w - - 0 1
http://support.stockfishchess.org/discussions/problems/268-wrong-declaring-a-forced-mate-in-3-moves
This problem was present in 2.3.1, then it was fixed by my patch.
After 24000 games at 15+0.05
ELO: 2.40 +-4.4 (95%) LOS: 95.7%
Total: 24000 W: 4774 L: 4608 D: 14618
bench: 4465997
This reverts commit a24da071f0
Seems a regression when tested against 2.3.1
With this patch, have after 20000 games at 60+0.05, we have
ELO: 13.42 +-4.8 (95%) LOS: 100.0%
Total: 20000 W: 3746 L: 2974 D: 13280
Instead with the patch reverted:
ELO: 16.62 +-4.8 (95%) LOS: 100.0%
Total: 20000 W: 3816 L: 2860 D: 13324
Although we are within error bounds here we take the conservative
approach of not introducing changes that are not proved stronger
It doesn't mean that the change shall be weaker, simply that we
don't want to take any risk.
No functional change.
Here the rational seems to be that if after one try easy
move detection fails then the easy move is not so easy :-)
After 15563 games at 60+0.05
ELO: 3.04 +-5.5 (95%) LOS: 97.0%
Total: 15563 W: 2664 L: 2528 D: 10371
No functional change.
Increase MaxRatio to use more time when in trouble.
After 16000 games at 60+0.05
ELO: 4.89 +-5.4 (95%) LOS: 99.9%
Total: 16000 W: 2700 L: 2475 D: 10825
No functional change.
This seems good at short TC controls.
After 10000 games at 20+0.05
ELO: 9.56 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 1949 L: 1674 D: 6377
Testing at long TC and regression testing is still
ongoing. So this is a bit speculative commit and
could be reverted in the future.
Also re-testing at long TC the SEE pruning in PV nodes
seems less effective (perhaps even a regression, but
still ongoing) so disabled for now.
bench: 4968764
This reverts commit 0d68b523a3.
After easy move semplification this machinery is not
needed anymore (because of we don't need to know if a
root move is a recapture)
No functional change.
Detect a move as easy only if it is the only one ;-)
or if is much better than remaining ones after we
have spent 20% of search time.
Tests are ongoing, but it seems this semplification
stands. Anyhow it is experimental for now and could
be reverted/improved with further work Gary is
testing right now.
No functional change.
After previous patch if split point master is
waiting for job and "Use Sleeping Threads" is
false (our condition for official releases) then
it will lock/unlock splitPoint mutex in a super
tight loop badly affecting performance.
Rewrite the code to lock only when we are about
to finish. Note that race condition on slavesMask
is anyhow fixed.
No functional change.
There is no point searching a move that is forced.
It wastes time while allowing computer opponents to
fill hash with 100% accuracy.
[edit: Condition moved together with "easy move" ones]
Bench identical: 4922272
We detect an easy move as a recapture with an
high margin on the second best move.
Unfortunatly the recapture detection is broken
becuase we identify as a recapture any move that
follows an opponent's previous capture !
This patch fix the logic to correctly detect a
real re-capture.
No functional change.
Note that we read shared data without lock
protection, so code is theoretically prone to
torn reads. But, first splitPoint pointer
never changes, and alpha is of integer type so
it is read in a single DWORD access.
No functional change.
This patch is actually the sum of two contributions that
have been tested independently:
1) Pruning of negative SEE moves in PV
After 10000 games at 20+0.05
ELO: 5.18 +-7 (95%) LOS: 99.2%
Total: 10000 W: 1952 L: 1803 D: 6245
2) Remove of bestValue > VALUE_MATED_IN_MAX_PLY condition
After 23000 games at 20+0.05
ELO: 1.63 +-4 (95%) LOS: 88.1%
Total: 23000 W: 4232 L: 4124 D: 14644
The whole patch as been re-tested at long TC with positive results:
After 10000 games at 60+0.05
ELO: 4.31 +-7 (95%) LOS: 98.3%
Total: 10000 W: 1765 L: 1641 D: 6594
kf = (kf == FILE_A) ? kf++ : ....
is tricky becuase kf is updated twice and it happens
to do the right thing just by accident.
Rewrite in a better way.
Spotted by pdimov
No functional change.
Give a bonus if a bishop can pin a piece or can
give a discovered check through an x-ray attack.
Seems good after 24000 games at 15"+0.05 (single thread):
ELO: 12.30 +- 99%: 5.79 95%: 4.40 LOS: 100.00%
Total: 24000 W: 4931 L: 4082 D: 14987
bench: 4917064
Rename startPosPly to gamePly and increment/decrement
the variable after each do/undo move. This adds a little
overhead in do_move() but we will need to have the
game ply during the search for the next patches now
under test.
Currently we don't increment gamePly in do_null_move()
becuase it is not needed at the moment. Could change
in the future.
As a nice side effect we can now remove an hack in
startpos_ply_counter().
No functional change.
Another trick, along the same lines of previous
patch. This time we first check positions with
white side to move that, becuase we start with
pawn on rank 7, are easily classified as wins,
then black ones.
Number of cycles reduced to 15 !
Becuase now it is faster we can remove a lot of
code to detect theoretical draws. We will calculate
them anyhow, although a bit slower, but the speed
up trick more than compensates it.
Verified that generated bitbases match original ones.
No functional change.
Change the way the index is coded so that
now looping from 0 to IndexMax generates
the pawns from RANK_7 down to RANK2.
Becuase positions with pawns at RANK_7
are easily classified as wins/draws, this
small trick allows to reduce the number
of needed iterations from 30 down to 26!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Rename to insertion_sort so to avoid confusion
with std::sort, also move it to movepicker.cpp
and use the bit slower std::stable_sort in
search.cpp where it is used in not performance
critical paths.
No functional change.
We can encode the ClusterSize directly in the
hashMask, this allows to skip the left shift.
There is no real change, but bench number is now
different because instead of using the lowest order
bits of the key to index the start of the cluster,
now we don't use the last two lsb bits that are
always set to zero (cluster size is 4). So for
instance, if 10 bits are used to index the cluster,
instead of bits [9..0] now we use bits [11..2].
This changes the positions that end up in the same
cluster affecting TT hits and so bench is different.
Also some renaming while there.
bench: 5383795
Do a 32bit bitwise 'and' instead of a 64bit
subtract and bitwise 'and'.
This is possible because even in the biggest
hash table case (8GB) the number of entries
is 2^29 so storable in an unsigned int.
No functional change.
Save the current active position in each Thread
instead of keeping a centralized array in struct
SplitPoint.
This allow to skip a memset() call at each split.
No functional change.
Obsolete renmant of when position was directly
passed to the search instead of being copied
for the main thread as is now.
From Jundery.
No functional change.
The syntax splitPoints() should force the compiler to
value-initialize the array and because there is no
user defined c'tor it falls back on zero-initialization.
Unfortunatly this is broken in MSVC compilers, because
value initialization for non-POD types is not supported,
so left splitPoints un-initialized and add in split()
initialization of slavesPositions, that is the only
member not already set at split time.
This fixes an assert under MSVC when running with
more than one thread.
Spotted and reported by Jundery.
No functional change.
It is a draw if pawns are on G or B files, weaker pawn is
on rank 7 and bishop can't attack the pawn.
No functional change (because it is very rare and does not appear in bench)
To return a pointer to the available
thread instead of a bool. This allows
to simplify the core loop in split().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This function "returns" two values: bestValue and bestMove
Instead of returning one and passing as pointer the other
be consistent and pass as pointers both.
No functional change.
Currently a ttMove is reduced with ss->reduction = DEPTH_ZERO,
so it is actually not reduced (as it should be), but the
trick works just becuase it happens that ttMove is the first
to be tried and
reduction(depth, 1)
Always returns zero. So explicitly forbid reduction of ttMove
in the LMR condition. This is much clear and self-documented.
No functional change.
Surprisingly this rare case was not considered
when scoring a capture.
Also take in account that in the promotion case
we gain a new piece (typically a queen) but we
lose the promoting pawn.
These small issues were present since Glaurung times!
Found while browsing DiscoCheck sources
bench: 5400063
Handling of History and Gains is almost the same, with
the exception of the update logic, so unify both
classes under a single Stats struct.
No functional change.
And handle the castle directly in do/undo_move().
This allow to greatly simplify the code.
Here the beast is the nasty Chess960 that is
really tricky to get it right because could be
that 'from' and 'to' squares are the same or
that king's 'to' square is rook's 'from' square.
Anyhow should work: verified on all Chess960
starting positions.
No functional and no speed change also in Chess960.
Use a more traditional approach, along the same lines
of do_move().
It is true that we copy more in do_null_move(), but we
save the work in undo_null_move(). Speed test shows the
new code to be even a bit faster.
No functional change.
We have only one call place so inline its content.
BTW, function is already declared as FORCE_INLINE.
Also some small refactoring while there.
No functional change.
When a thread is allocated a bit is set in slavesMask.
This bit corresponds to the thread's index field that,
because it happens to be the position in the threads
array, eventually it is equal to the loop index 'i'.
But instead of relying on this 'coincidence', explicitly
use the 'idx' field so to clarify slavesMask usage.
Backported from c++11 branch.
No functional change.
Intel Compiler has 'invented' this pearl:
warning #1476: field uses tail padding of a base class
Just becuase we have subclassed MainThread and added
the field 'bool thinking'.
Pure nosense. Silence the warning.
No functional change.
Fix again TimerThread::idle_loop() to prevent a
theoretical race with 'exit' flag in ~Thread().
Indeed in Thread d'tor we raise 'exit' and then
call notify() that is lock protected, so we
have to check again for 'exit' before going to
sleep in idle_loop().
Also same change in Thread::idle_loop() where we
now check for 'exit' before to go to sleep.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Great code simplification: - instead do not futility
prune threat refutations. allows_move() is therefore removed.
4000 games at 50,000 nodes/move:
1085-989-1926 [51.2%] LOS=98.3%
4000 games in 10"+0.1"
756-751-2493 [50.1%] LOS=55.1%
EDIT: I have retested the patch of Lucas in a slightly different form
(without pruning in PvNode) and test mre or less confirms that
60 lines of code are totally unuseful:
After 6195 games at 15"+0.05"
1333 - 1325 - 3537 ELO 0
bench 5140990
Silly logic bug introduced in dda7de17e7
Timer thread, when msec = 0, instead of going
to sleep, calls check_time() in an endless loop.
Spotted and reported by snino64 due to abnormally
high CPU usage.
No functional change.
Subclass MainThread and TimerThread and declare
idle_loop() virtual. This allow us to cleanly
remove a good bunch of hacks, relying on C++
polymorphism to do the job.
No functional change.
Rename it is_finished and use it only in main
thread to signal search is finished. This allows
us to simplify the complex SMP logic.
Ultra tricky patch: deep test is required under
wide conditions like pondering on and option
"Use Sleeping Threads" set to false.
No functional change.
This reverts commit 869c924410
I misunderstood here. Actually it can happen that
thread is created but still not entered idle_loop
and at the same time start_searching() is called.
Becuase 'do_sleep' is set start_searching() will
set it to false and start the search, but when,
at last, the thread enters idle_loop(), resets
the flag and goes to sleep: not what we want.
Revert the hack waiting for a better solution
in the next patches.
No functional change.
Finally we can now merge the 'ponderhit' case with
'stop' and 'quit'.
The patches have been done step by step to help debugging
becuase this is really tricky code.
No functional change.
Reset Limits.ponder only if search continue, but if
we are going to stop the search there is no need
(and is also confusing) to clear the 'ponder' flag.
This mimics the behaviour upon rceiving 'stop' when
pondering.
No functional change.
In SAN notation when two pieces of the same type can
move to a given destination square, a disambiguation
additional info (like starting file) shall be added
to the SAN move.
If one of the two pieces is pinned, the corresponding
move _could_ be illegal and in this case disambiguation
is not needed. But to be pinned alone it is not enough
to deduce that the move is illegal, for instance in this
position:
R3rk2/2r6/8/8/8/8/8/K7 b - - 0 1
The move Rc8 is ambiguous although the rook in e8 is pinned
and the correct SAN notation should be Rcc8.
No functional change.
Don't wait for the search to finish after a 'stop'
command, but keep processing the GUI input if any.
Also explicitly wake up the main thread (that could be
sleeping) after a 'stop' or 'quit' command and do not
rely on wait_for_search_finished() doing it for us.
This patch cleans up the code and functions's definitions,
but it is risky and needs a good test under different
conditions to be sure it does not introduces hungs up.
No functional change.
Revert patch c581b7ea36
Seems a regression after testing from Gary:
ELO: 7.24 +- 99%: 17.03 95%: 12.93
LOS: 97.86%
Wins: 439 Losses: 381 Draws: 1962
And mine:
After 5410 games at 15"+0.05
Wins: 936 Losses: 1141 Draws: 3333 ELO -13
Moreover we know that there is a regression in the range
of patches which include the fromNull patch.
Probably this is not the only regression since 2.3.1 and
perhaps the idea under fromNull is good, but at the moment,
while in deep regression hunting, better to be on the safe
side and revert it entirely.
My guess on why this is a regression is that using the
negated evaluation of previous ply in case of null search
fails to take in account the king safety asymmetry between
the two colors. This is of course just a guess.
bench 5503830
They are not self-describing and create a lot of user
requests about them.
Given that the values are already well tuned there
is no need to expose them as UCI options.
No functional change.
Sometimes is faster, but not always and on very long mates
produces strange scores probably due to truncation of PV
artifacts.
So simply perform normal search also in case of UCI 'mate x'
command, with the only difference that when a mate in x is
found search returns immediately.
No functional change.
Now that we can call 'bench' command also from
interactive terminal it makes no more sense to
exit the application if the user types a wrong
file name.
No functional change.
Since &-ing with SpaceMask restricts the set to the home
half of the board, it is possible to use just one popcount
instead of 2 by shifting "safe" to the other half of the
board. This gives a small speedup especially on systems
where hardware popcount is not available.
Patch kindly sent by Richard Vida.
No functional change.
It is now possible to run SF on a 'mate in x' testsuite.
For instance in case of a file with fen strings of
positions with mate in 10 we can now 'bench' on it:
stockfish bench 128 1 10 mate_in_10.epd mate
No functional change.
Following a user request I added the handling of UCI:
go mate x
Currently we just return from a PV node if x moves have been
done. Probably not the best approach. I have looked at Fruit/Toga
sources and there is even simpler: engine falls back on a fixed
depth search.
No functional change.
And return on using TT as backing store for position
evaluations.
Tests (even on single thread) show eval cache was a regression.
In multi thread result should be even worst because eval cache
is a per-thread struct, while TT is shared.
After 4957 games at 15"+0.05 (single thread)
eval cache vs master 969 - 1093 - 2895 -9 ELO
So previous reported result of +18 ELO was probably due to an
issue in the testing framework (a bug in cutechess-cli) that
has been fixed in the meanwhile.
bench: 5386711
In case of null search at low depths returns a fail low
due to a threat then, rather than return beta-1 (to cause
a re-search at full depth in the parent node), we set a flag
threatExtension = true (false by default) that will cause
moves that prevent the threat to be extended of one ply
in the following search.
Idea and patch is by Lucas Braesch.
Lucas also did the tests:
1500 games in 5"+0.05":
SF_threatExtension vs SF_20121222: 366 - 331 - 803 [51.2%] LOS=90.8%
3000 games in 10"+0.1":
SF_threatExtension vs SF_20121222: 610 - 559 - 1831 [50.8%] LOS=93.2%
Tests confirmed by Gary after 10570 games,
ELO: 2.79 +- 99%: 8.72 95%: 6.63
LOS: 94.08%
Wins: 1523 Losses: 1438 Draws: 7607
And finally by me at 15"+0.05, single thread, 3824 games
threatExtension vs master 768 - 692 - 2364 +7 ELO
bench 4918443
This is difficult code becuase a bug here could lead
to very subtle crashes in case of SMP games where we
have TT move corruption due to concurrent access.
Anyhow I have fully verified te code throwing at it
random moves. It shoudl work.
No functional change.
Test by Joona prooves the new feature don't value 70 added lines of code.
Grand totals after 10040 games (crashes: 0) for tt_both
master_9edc7 - 6a93488_6a934: 1756 - 1688 - 6596 ELO +2 (+- 2.7)
Confirmed by test of Gary:
After 8680 games:
ELO: 0.80 +- 99%: 9.62 95%: 7.31
LOS: 65.38%
Wins: 1288 Losses: 1268 Draws: 6130
Thanks a lot to both for testing it !!!
bench 5149248
This is more complex than what I'd like but I
was unable to split in small chunks.
Here we add 2 slots to TTEntry (valueUpper and depthUpper)
so that sizeof(TTEntry) returns to the original 16 bytes
and we can pack exactly 4 entries in a 64 bytes cache line.
Now we save an upper bound score alongside a lower (exact)
score. The idea is to increase TT cut-offs rates becuase
there is now an higher probability for a node to use TT info.
This patch is highly experimental and probably needs further
steps as is hinted by an unrealistic bench number:
bench: 2022385
In almost all cases we already know in advance that
color_of() argument is different from NO_PIECE.
So avoid the check for NO_PIECE in color_of() and
test at caller site in the very few places where
this case could occur.
As a nice side effect, this patch fixes a (bogus)
warning under some versions of gcc.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use an eval cache instead of TT to store node
position evaluations.
It is already an improvment and, because it frees
two TT entry slots, paves the way to extend TT to
store both upper and lower bounds.
After 4855 games, single thread, 15"+0.05
Mod vs Orig 1165 -920 - 2770 ELO +18
bench: 5149248
And document why this is an hard limit. It
seems for some (lucky) people 32 threads
are not enough.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In search(), after we evalute the position, in case there
isn't any TT entry we create one with just the evaluation
score.
This patches removes that code. The reason becuase the patch
deserves a single commit it is becuase introduces a (very small)
functional change due to the fact that the total number of
TT stores is less now and this slightly alters the TT hits
of our benchmark.
bench: 4983262
Rely fully on eval cache. Note that we still save eval
info to TT, this is not needed at this moment and will be
removed in future patches. We keep it so to have a "non
functional change" patch.
No functional change.
With this patch series we want to introduce a per-thread
evaluation cache to store node evaluation and do not
rely anymore on the TT table for this.
This patch just introduces the infrastructure.
No functional change.
In case of a RootNode or a SpNode move has
been already checked for legality so we can
skip a redundant check.
Spotted by Frank Genot.
No functional change.
In qsearch we should update the bestValue as we do
in case of futilityValue < beta, also when pruning
moves with non-positive see.
Spotted by Lucas Braesch
Bench: 5695710
Send the PV lines to GUI only once at the end of the
PV search loop or just in case of long searches.
We need to sync also sending of "currmove" info to
avoid sending info on current move without first
informing the GUI on the PV line we are searching on.
No functional change.
At this point we have already verified (value > alpha)
and this implies, in case of a non-PV node, where search
window size is zero, that value >= beta.
This is not so self-evident, so document the code with
an assert condition.
No functional change.
Let the caller to decide where to redirect (cout or cerr) the
ASCII representation of the position. Rename the function to
reflect this.
Renamed also from_fen() and to_fen() to set() and fen() respectively.
No functional change.
In case a PvNode node has a static evaluation above alpha but
no available moves we want to flag the node as BOUND_EXACT,
not as BOUND_UPPER as is currently.
The behaviour was recently introduced with patch d471c49700
of 3/10/2012
Spotted by Hongzhi Cheng.
bench: 5558464
Both Lucas re-test and Jean-Francois confirrm it
is a regression.
Here Jean-Francois's results after 3600 games :
Score of 96d3b1c92b vs 3b87314: 690 - 729 - 2181 [0.495] 3600
ELO: -3.86 +- 99%: 14.94 95%: 11.35
LOS: 15.03%
Wins: 690 Losses: 729 Draws: 2181 Total: 3600
Bench: 5404066
Don't prune and eventually extend check moves of type
DISCO_CHECK (pun intended, Lucas will understand :-) ).
Patch from Lucas Braesch that has also tested it:
Result: 879-661-2137, score=52.96%, LOS=100.00% (time 10"+0.1")
I have started a verification test right now.
bench: 6004966
From Jean-Francois's
Final result after 5000 games :
Score of c581b7e vs a878312: 1163 - 970 - 2867 [0.519] 5000
ELO: 13.35 +- 99%: 12.71 95%: 9.65
LOS: 100.00%
Wins: 1163 Losses: 970 Draws: 2867 Total: 5000
From me
After 3266 games at 20"+0,05
Score of c581b7e vs a878312: 612 - 607 - 2047
So no regression at longer TC and perhaps a little gain at
fast TC.
In this case we try a rather drastic
approach: we simply don't futility prune
in qsearch when arriving from a null move.
So we save evaluating and also save to mess
with eval margins at all because margin is used
only in futility.
Also accuracy should not be affected, actually it
improves because we don't prune anything anymore.
bench: 5404066
Performs well at very short TC of 40/4+0.05 (courtesy of Jean-Francois):
Wins: 2503 Losses: 2146 Draws: 5581 Total: 10230 +12 ELO
But is poor at longer TC of 20"+0.05
Wins: 321 Losses: 373 Draws: 1141 Total: 1808 -10 ELO
The patch was clearly a tradoff between speed and accuracy and
the most interesting part of it are test results that can be
commented as follows:
- A short TC is very sensible to any speed increase
- A longer TC is more sensible to accuracy and less to speed
So a patch that does not change speed is suitable to be tested at
short TC, while a speed/accuracy compromise patch is IMO better to
be tested at longer TC to verify loss of accuracy can be tolerated.
In this case the revert is only temporary. We will come back again
once we will be able to preserve the evaluation margin.
bench: 5809010
Reuse the evaluation of the parent with inverted sign and
set margin to zero (this is an hack!).
This is done only in qsearch where almost 15% of calls are
from a null move. In normal search the number of nodes where
(ss-1)->currentMove == MOVE_NULL
is almost zero and so there is no need of using this trick.
The big advantage of this patch is a speed-up due to skipped
evaluate() calls, that are very costly.
Functionality is of course affected and we will need to proper
test it later. For now we just register a 3-4% speed up.
Suggested by Hongzhi Cheng.
bench: 5051328
When testing if a move blocks the threat path there is no
reason to require the threat to be a slider. Indeed threat
can be a double pawn push like in this example:
r1bq1rk1/ppp1np1p/4n1p1/3p4/3P2Q1/2P1B3/PPBN2PP/R4RK1 w - - 0 16
Where white's move Rf6 blocks the threat f5.
As a nice side effect we can retire the now useless helper
piece_is_slider().
This patch kicks in only very rare cases, indeed the bench is
still the same!
bench: 5809010
There is no guarantee that split() consumes all the node's
moves. Indeed split() can return without performing any job
for instance because MAX_SPLITPOINTS_PER_THREAD is reached
or becuase no available threads are found (this latter case
is much more common).
So search must continue in those cases and we cannot force
exiting from move's loop.
Bug introduced by 1ac417edb8 of 5/10/2012
Spotted by Frank Genot.
No functional change.
If a previous move attacks the king (with the piece
of the threat move removed) then must be a discovered
check, otherwise it means that first move gave check
and we were not able to do a null move.
Also renamed stuff to better document the function's
context.
No functional change.
When testing if a piece is moving through the squares
vacated by a previous move there is no reason to require
the piece to be a slider, indeed we can have a double
pawn push like in this example:
r1q2rk1/2p1bppp/2Pp4/pN5b/Q1P1p3/4B2P/PP1R1PP1/1K5R w - - 3 18
Where black's move f5 is connected to previous move Be7 that
frees the path.
Or we can have a castle move:
r1bqkb1r/pppp1ppp/2n1pn2/1B6/4P3/2N2N2/PPPP1PPP/R1BQK2R b KQkq - 5 1
Where a previous move Bb5 allows the white to castle king side.
This time patch is mine ;-)
new bench: 5809010
We send to GUI multi-pv info after each cycle,
not just once at the end of the PV loop. This is
because at high depths a single root search can
be very slow and we want to update the gui as
soon as we have a new PV score.
Idea is good but implementation is broken because
sort() takes as arguments a pointer to the first
element and one past the last element.
So fix the bug and rename sort arguments to better
reflect their meaning.
Another hit by Hongzhi Cheng. Impressive!
No functional change.
When checking if the moving piece p1 in a previous
move m1 defends the destination square of a move m2
we have to use the occupancy with the from square of
m2 removed so to take in account the case in which
f2 will block an x-ray attack from p1.
For instance in this position:
r2k3r/p1pp1pb1/qn3np1/1N2P3/1p3P2/2B5/PPP3QP/R3K2R b KQ - 1 9
The move eXf6 is connected to the previous move Bc3 that
defends the destination square f6.
With this patch we have about 10% more moves detected as
'connected'. Anyhow the absolute number is very low, about
4000 more moves out of 6M nodes searched.
Another issue spotted by Hongzhi "Hawk Eye" Cheng ;-)
new bench: 5757373
On Intel, perhaps due to 'lea' instruction this way of
zeroing the lsb of *b seems faster than a shift+negate.
On perft (where any speed difference is magnified) I
got a 6% speed up on my Intel i5 64bit.
Suggested by Hongzhi Cheng.
No functional change.
Compiler complies that 'cnt' is initialized but
unused (in !CheckThreeFold case). Moving the
definition of 'cnt'out of the loop seems to do
the trick.
No functional change.
Instead of use a variable so to resolve many conditions
already at compile time. In quiesce is also where we
have most of the InCheck nodes and is one of the most
performance critical code paths.
Speed up of 1.5% with Clang and 1% with gcc
Suggested by Hongzhi Cheng.
No functional change.
When checking if a move defends the threatened piece
we correctly remove from the occupancy bitboard the
moved piece. This patch removes from the occupancy also
the threatening piece so to consider the cases of moves
that defend the threatened piece x-raying through the
threat move.
As example in this position:
r3k2r/p1ppqp2/Bn4p1/3p1n2/4P1N1/5Q1P/PPP2P1P/R3K2R w KQkq - 1 10
The threat black move is dxe4. With this patch we include
(and so don't prune) white's Bb7 that would be pruned otherwise.
The number of affected position is very low, around 1% of cases,
so we don't expect ELO changes, neverthless this is the logical
and natural thing to do.
Patch suggested by Hongzhicheng.
new bench: 5323798
ReducedStateInfo is a redundant struct that is also
prone to errors, indeed must be updated any time is
updated StateInfo. It is a trick to partial copy a
StateInfo object in do_move().
This patch takes advantage of builtin macro offsetof()
to directly calculate the number of quad words to copy.
Note that we still use memcpy to do the actual job of
copying the (48 bytes) of data.
Idea by Richard Vida.
No functional and no performance change.
Only in not performance critical code like pretty_pv(),
otherwise continue to use the good old C-style arrays
like in extract/insert PV where I have done some code
refactoring anyhow.
No functional change.
Silly typo (introduced in e304db9d1e) completely
messed up move notation in case of promotions causing
"Illegal move" warning in cutechess-cli.
Reported by Jörg Oster.
No functional change.
In multi-threads runs with debug on we experience some
asserts due to the fact that TT access is intrinsecally
racy and its contents cannot be always trusted so must
be validated before to be used and this is what the
patch does.
No functional case.
And restore old behaviour of not returning from a RootNode
without updating RootMoves[].
Also renamed is_draw() template parameters to reflect a
'positive' logic (Check instead of Skip) that is easier
to follow.
New bench: 5312693
When signal 'stop' is raised we return bestValue
that could be still set at -VALUE_INFINITE and
this triggers an assert. Fix it by returning
a value we know for sure is not +-VALUE_INFINITE.
Reported by 平岡拓也 Hiraoka.
No functional change.
Note that the actual pickup is done in the class
d'tor so to be sure it is always triggered, even
in case of a sudden exit due to a 'stop' signal.
No functional change.
Function move_to_san() requires the Position to be
passed by referenced because a do/undo move is done
inside the function to detect a possible mate and to
add to the san string the corresponding '#' suffix.
Instead of passing a copy of current position pass
directly the original position object after const
casting it. This has the advantage to avoid a costly
Position copy, on the down side a bench test could
report different searched nodes if print(move) is
used, due to the additionals do_move() calls.
No functional change.
Existing Makefile is buggy for PowerPC, it has no
SSE, yet it is given it if Prefetch is enabled,
because it isn't ARMv7.
Patch from Matthew Brades.
No functional change.
We can have ttValue == VALUE_NONE when we use a TT
slot to just save a position static evaluation, but
in this case we also save DEPTH_NONE so to avoid
using the ttValue in search. This happens to work,
but due to a number of lucky and tricky cases that
we now documnet through a bunch of asserts and a
little change to value_from_tt() that has no real
effect but clarifing the code.
No functional change.
On some platforms 128 MB of RAM for TT is too much,
so run 'bench' with the default 32 MB size.
No functional change although of course now 'bench'
reports a different number: 5545018
Use popcount() instead in the only calling place.
It is used only at initialization so there is no
speed regression and anyhow even initialization
itself is not slowed down: magic bitboard setup
stays around 175 msec on my slow 32bit Core Duo.
No functional change.
Simplify the code and doing this introduce a couple
of (very small) functional changes:
- Always compare to depth even in "mate value" condition
- TT cut-off in qsearch also in case of PvNode, as in search
Verified against regression with 2500 games at 30"+0.05
on 2 threads: 451 - 444 - 1602
Functional changed: new bench is 5544977
As Joona says: "The problem is that when doing full
window search (-VALUE_INFINITE, VALUE_INFINITE), and
pruning all the moves will return fail low which is
mate score, because only clause touching alpha is
"mate distance pruning". So we are returning mate score
although we are just pruning all the moves. In reality
there probably is no mate in sight.
Bug spotted and fixed by Joona.
Implement lsb/msb using armv7 assembly instructions.
msb is the easiest one, using a gcc intrinsic that generates
code using the ARM's clz instruction. lsb is also using this
clz instruction, but with the help of ARM's 'rbit' (bit
reversing) instruction. This leads to a >2% speed gain.
I also renamed 'arm-32' to the more meaningfull 'armv7' in the Makefile
No functional change.
Port to qsearch() the same changes we recently
added to search().
Overall this search refactoring series shows
almost 2% speed up on gcc compile.
No functional change.
When using the Makefile (as for the mingw case),
IS_64BIT and USE_BSFQ are already set with
ARCH=x86-64 and do not need to be redefined.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
First disable Contempt Factor during analysis, then
calculate the modified draw score from the point of
view of the player, so from the point of view of
RootPosition color.
Thanks to Ryan Taker for suggesting the fixes.
No functional change.
These kind of arch specific code is really nasty
to make it right becuase you need to verify on
all the platforms.
Now should compile properly also on ARM
Reported by Jean-Francois.
No functional change.
In particular on ARM processors. Original patch by
Jean-Francois, sligtly modified by me to preserve
the meaning of NO_PREFETCH flag.
Verified with gcc, clang and icc that prefetch instruction
is correctly created.
No functional change.
A simplification and also a small speed-up of
about 1% mainly due to reducing calls to
thisThread->cutoff_occurred().
Worst case split point recovering time after a
cut-off occurred is limited to 3 msec on my slow
PC, and usually is below 1 msec, so it seems safe
to remove the cutoff_occurred() check.
No functional change.
This is very crude and very basic: simply in case
of a draw for repetition or 50 moves rule return
a negative score instead of zero according to the
contempt factor (in centipawns). If contempt is
positive engine will try to avoid draws (to use
with weaker opponents), if negative engine will
try to draw. If zero (default) there are no changes.
No functional change.
Use a value related to PawnValue instead.
This is a different patch from previous one because
could affect game play and skill levels, although
in a mostly unmeasurable way. Indeed thresold has
been raised so easy move is a bit harder to trigger
and skill level is a bit more prone to blunders.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Show the real value in the code, not hide it
behind a variable name, especially when there
is only once occurence.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Extend for an extra half-ply in case the node is (probably)
going to fail high. In this case the added overhead is limited.
A novelity is the way this patch has been tested: Always in
self-play but with a much longer TC to allow the singular
extension to fully kick in and also (my impression) to have
less noisy results.
Ater 1015 games on my QUAD at 60"+0.05
Mod vs Orig 173 - 150 - 692 ELO +8
Handle also the SMP case. This has been quite tricky, not
trivial to enforce the node limit in SMP case becuase
with "helpful master" concept we can have recursive split
points and we cannot lock them all at once so there is the
risk of counting the same nodes more than once.
Anyhow this patch should be race free and counted nodes are
correct.
No functional change.
As long as isPvMove (renamed to pvMove) is set after
legality check, we can postpone legality even in PV case.
Patch aligns the PV case with the common non-pv one.
No functional change.
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.