Official release version of Stockfish 17
Bench: 1484730
---
Stockfish 17
Today we have the pleasure to announce a new major release of Stockfish. As
always, you can freely download it at https://stockfishchess.org/download and
use it in the GUI of your choice.
Don’t forget to join our Discord server[1] to get in touch with the community
of developers and users of the project!
*Quality of chess play*
In tests against Stockfish 16, this release brings an Elo gain of up to 46
points[2] and wins up to 4.5 times more game pairs[3] than it loses. In
practice, high-quality moves are now found in less time, with a user upgrading
from Stockfish 14 being able to analyze games at least 6 times[4] faster with
Stockfish 17 while maintaining roughly the same quality.
During this development period, Stockfish won its 9th consecutive first place
in the main league of the Top Chess Engine Championship (TCEC)[5], and the 24th
consecutive first place in the main events (bullet, blitz, and rapid) of the
Computer Chess Championship (CCC)[6].
*Update highlights*
*Improved engine lines*
This release introduces principal variations (PVs) that are more informative
for mate and decisive table base (TB) scores. In both cases, the PV will
contain all moves up to checkmate. For mate scores, the PV shown is the best
variation known to the engine at that point, while for table base wins, it
follows, based on the TB, a sequence of moves that preserves the game outcome
to checkmate.
*NUMA performance optimization*
For high-end computers with multiple CPUs (typically a dual-socket architecture
with 100+ cores), this release automatically improves performance with a
`NumaPolicy` setting that optimizes non-uniform memory access (NUMA). Although
typical consumer hardware will not benefit, speedups of up to 2.8x[7] have been
measured.
*Shoutouts*
*ChessDB*
During the past 1.5 years, hundreds of cores have been continuously running
Stockfish to grow a database of analyzed positions. This chess cloud
database[8] now contains well over 45 billion positions, providing excellent
coverage of all openings and commonly played lines. This database is already
integrated into GUIs such as En Croissant[9] and Nibbler[10], which access it
through the public API.
*Leela Chess Zero*
Generally considered to be the strongest GPU engine, it continues to provide
open data which is essential for training our NNUE networks. They released
version 0.31.1[11] of their engine a few weeks ago, check it out!
*Website redesign*
Our website has undergone a redesign in recent months, most notably in our home
page[12], now featuring a darker color scheme and a more modern aesthetic,
while still maintaining its core identity. We hope you'll like it as much as we
do!
*Thank you*
The Stockfish project builds on a thriving community of enthusiasts (thanks
everybody!) who contribute their expertise, time, and resources to build a free
and open-source chess engine that is robust, widely available, and very strong.
We would like to express our gratitude for the 11k stars[13] that light up our
GitHub project! Thank you for your support and encouragement – your recognition
means a lot to us.
We invite our chess fans to join the Fishtest testing framework[14] to
contribute compute resources needed for development. Programmers can contribute
to the project either directly to Stockfish[15] (C++), to Fishtest[16] (HTML,
CSS, JavaScript, and Python), to our trainer nnue-pytorch[17] (C++ and Python),
or to our website[18] (HTML, CSS/SCSS, and JavaScript).
The Stockfish team
[1] https://discord.gg/GWDRS3kU6R
[2] https://tests.stockfishchess.org/tests/view/66d738ba9de3e7f9b33d159a
[3] https://tests.stockfishchess.org/tests/view/66d738f39de3e7f9b33d15a0
[4] https://github.com/official-stockfish/Stockfish/wiki/Useful-data#equivalent-time-odds-and-normalized-game-pair-elo
[5] https://en.wikipedia.org/wiki/Stockfish_(chess)#Top_Chess_Engine_Championship
[6] https://en.wikipedia.org/wiki/Stockfish_(chess)#Chess.com_Computer_Chess_Championship
[7] https://github.com/official-stockfish/Stockfish/pull/5285
[8] https://chessdb.cn/queryc_en/
[9] https://encroissant.org/
[10] https://github.com/rooklift/nibbler
[11] https://github.com/LeelaChessZero/lc0/releases/tag/v0.31.1
[12] https://stockfishchess.org/
[13] https://github.com/official-stockfish/Stockfish/stargazers
[14] https://github.com/official-stockfish/fishtest/wiki/Running-the-worker
[15] https://github.com/official-stockfish/Stockfish
[16] https://github.com/official-stockfish/fishtest
[17] https://github.com/official-stockfish/nnue-pytorch
[18] https://github.com/official-stockfish/stockfish-web
Created from 2 distinct spsa tunes of the latest main net (nn-31337bea577c.nnue)
and applying the params to the prior main net (nn-e8bac1c07a5a.nnue). This
effectively reverts the modifications to output weights and biases in
https://github.com/official-stockfish/Stockfish/pull/5509
SPSA:
A: 6000, alpha: 0.602, gamma: 0.101
1st - 437 feature transformer biases where values are < 25
54k / 120k games at 180+1.8
https://tests.stockfishchess.org/tests/view/66af98ac4ff211be9d4edad0
nn-808259761cca.nnue
2nd - 208 L2 weights where values are zero
112k / 120k games at 180+1.8
https://tests.stockfishchess.org/tests/view/66b0c3074ff211be9d4edbe5
nn-a56cb8c3d477.nnue
When creating the above 2 nets (nn-808259761cca.nnue, nn-a56cb8c3d477.nnue),
spsa params were unintentionally applied to nn-e8bac1c07a5a.nnue rather
than nn-31337bea577c.nnue due to an issue in a script that creates nets
by applying spsa results to base nets.
Since they both passed STC and were neutral or slightly positive at LTC,
they were combined to see if the elo from each set of params was additive.
The 2 nets can be merged on top of nn-e8bac1c07a5a.nnue with:
https://github.com/linrock/nnue-tools/blob/90942d3/spsa/combine_nnue.py
```
python3 combine_nnue.py \
nn-e8bac1c07a5a.nnue \
nn-808259761cca.nnue \
nn-a56cb8c3d477.nnue
```
Merging yields nn-87caa003fc6a.nnue which was renamed to nn-1111cefa1111.nnue
with an updated nnue-namer around 10x faster than before by:
- using a prefix trie for efficient prefix matches
- modifying 4 non-functional bytes near the end of the file instead of 2
https://github.com/linrock/nnue-namer
Thanks to @MinetaS for pointing out in #nnue-dev what the non-functional bytes are:
L3 is 32, 4 bytes for biases, 32 bytes for weights. (fc_2)
So -38 and -37 are technically -2 and -1 of fc_1 (type AffineTransform<30, 32>)
And since InputDimension is padded to 32 there are total 32 of 2 adjacent bytes padding.
So yes, it's non-functional whatever values are there.
It's possible to tweak bytes at -38 - 32 * N and -37 - 32 * N given N = 0 ... 31
The net renamed with the new method passed non-regression STC vs. the original net:
https://tests.stockfishchess.org/tests/view/66c0f0a821503a509c13b332
To print the spsa params with nnue-pytorch:
```
import features
from serialize import NNUEReader
feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-31337bea577c.nnue", "rb") as f:
model = NNUEReader(f, feature_set).model
c_end = 16
for i,ft_bias in enumerate(model.input.bias.data[:3072]):
value = int(ft_bias * 254)
if abs(value) < 25:
print(f"ftB[{i}],{value},-1024,1024,{c_end},0.0020")
c_end = 6
for i in range(8):
for j in range(32):
for k in range(30):
value = int(model.layer_stacks.l2.weight.data[32 * i + j, k] * 64)
if value == 0:
print(f"twoW[{i}][{j}][{k}],{value},-127,127,{c_end},0.0020")
```
New params found with the same method as:
https://github.com/official-stockfish/Stockfish/pull/5459
Passed STC:
https://tests.stockfishchess.org/tests/view/66b4d4464ff211be9d4edf6e
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 136416 W: 35753 L: 35283 D: 65380
Ptnml(0-2): 510, 16159, 34416, 16597, 526
Passed LTC:
https://tests.stockfishchess.org/tests/view/66b76e814ff211be9d4ee1cc
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 159336 W: 40753 L: 40178 D: 78405
Ptnml(0-2): 126, 17497, 43864, 18038, 143
closes https://github.com/official-stockfish/Stockfish/pull/5534
bench 1613043
This patch moves the DotProd code into the propagation function which
has sequential access optimization. To prove the speedup, the comparison
is done without the sparse layer. With the sparse layer the effect is
marginal (GCC 0.3%, LLVM/Clang 0.1%).
For both tests, binary is compiled with GCC 14.1. Each test had 50 runs.
Sparse layer included:
```
speedup = +0.0030
P(speedup > 0) = 1.0000
```
Sparse layer excluded:
```
speedup = +0.0561
P(speedup > 0) = 1.0000
```
closes https://github.com/official-stockfish/Stockfish/pull/5520
No functional change
Since simplification of quiet checks in qsearch this depth isn't used by
any function at all apart movepicker, which also doesn't use passed
qsearch depth in any way, so can be removed. No functional change.
closes https://github.com/official-stockfish/Stockfish/pull/5514
No functional change
Created by updating output weights (256) and biases (8)
of the previous main net with values found with spsa around
101k / 120k games at 140+1.4.
264 spsa params: output weights and biases in nn-e8bac1c07a5a.nnue
A: 6000, alpha: 0.602, gamma: 0.101
weights: [-127, 127], c_end = 6
biases: [-8192, 8192], c_end = 64
Among the 264 params, 189 weights and all 8 biases were changed.
Changes in the weights:
- mean: -0.111 +/- 3.57
- range: [-8, 8]
Found with the same method as:
https://github.com/official-stockfish/Stockfish/pull/5459
Due to the original name (nn-ea8c9128c325.nnue) being too similar
to the previous main net (nn-e8bac1c07a5a.nnue) and creating confusion,
it was renamed by making non-functional changes to the .nnue file
the same way as past nets with:
https://github.com/linrock/nnue-namer
To verify that bench is the same and view the modified non-functional bytes:
```
echo -e "setoption name EvalFile value nn-ea8c9128c325.nnue\nbench" | ./stockfish
echo -e "setoption name EvalFile value nn-31337bea577c.nnue\nbench" | ./stockfish
cmp -l nn-ea8c9128c325.nnue nn-31337bea577c.nnue
diff <(xxd nn-ea8c9128c325.nnue) <(xxd nn-31337bea577c.nnue)
```
Passed STC:
https://tests.stockfishchess.org/tests/view/669564154ff211be9d4ec080
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 57280 W: 15139 L: 14789 D: 27352
Ptnml(0-2): 209, 6685, 14522, 6995, 229
Passed LTC:
https://tests.stockfishchess.org/tests/view/669694204ff211be9d4ec1b4
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 63030 W: 16093 L: 15720 D: 31217
Ptnml(0-2): 47, 6766, 17516, 7139, 47
closes https://github.com/official-stockfish/Stockfish/pull/5509
bench 1371485
even if beta is below TB range, once we return probcutBeta with beta + 390 we
can return wrong TB value, and guard against ttData.value being `VALUE_NONE`
closes https://github.com/official-stockfish/Stockfish/pull/5499
bench: 1440277
These values represent the lowest Elo rating in the skill level calculation,
and the highest one, but it's not clear from the code where these values come
from other than the comment. This should improve code readability and
maintainability. It makes the purpose of the values clear and allows for easy
modification if the Elo range for skill level calculation changes in the
future. Moved the Skill struct definition from search.cpp to search.h header
file to define the Search::Skill struct, making it accessible from other files.
closes https://github.com/official-stockfish/Stockfish/pull/5508
No functional change
in the case of MultiPV, the first move of the Nth multiPV could actually turn a
winning position in a losing one, so don't attempt to correct it. Instead,
always perform the first move without correction.
Fixes#5505
Closes https://github.com/official-stockfish/Stockfish/pull/5506
No functional change
now checks correctness of PV lines with TB score.
uses 3-4-5 man table bases, downloaded from lichess,
which are cached with the appropriate action.
closes https://github.com/official-stockfish/Stockfish/pull/5500
No functional change
This patch removes lmrDepth limit for quiet moves history based pruning.
Previously removal of this type of depth limits was considered bad because it
was performing bad for matetrack - but with this pruning heuristic this
shouldn't be that bad because it's "naturally" depth limited by history
threshold and should be completely disabled at depth >= 15 or so. Also this
heuristic in previous years was known to scale non-linearly - bigger lmrDepth
thresholds were better at longer time controls and removing it completely
probably should scale pretty well.
Passed STC:
https://tests.stockfishchess.org/tests/view/6692b89b4ff211be9d4eab21
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 114464 W: 29675 L: 29545 D: 55244
Ptnml(0-2): 372, 12516, 31329, 12640, 375
Passed LTC:
https://tests.stockfishchess.org/tests/view/6692c4554ff211be9d4eab3d
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 67746 W: 17182 L: 17014 D: 33550
Ptnml(0-2): 28, 6993, 19652, 7183, 17
closes https://github.com/official-stockfish/Stockfish/pull/5485
Bench: 1250388
now uses the following format:
`info string Found 510 WDL and 510 DTZ tablebase files (up to 6-man).`
this clarifies exactly what has been found, as the difference matters,
e.g. for the PV extension of TB scores.
closes https://github.com/official-stockfish/Stockfish/pull/5471
No functional change
- Capitalize comments
- Reformat multi-lines comments to equalize the widths of the lines
- Try to keep the width of comments around 85 characters
- Remove periods at the end of single-line comments
closes https://github.com/official-stockfish/Stockfish/pull/5469
No functional change
In probcut move loop, everything is enclosed within a large if statement. I've
changed it to guard clauses to stay consistent with other move loops.
closes https://github.com/official-stockfish/Stockfish/pull/5463
No functional change
Created by modifying L2 weights from the previous main net (nn-74f1d263ae9a.nnue)
with params found by spsa around 9k / 120k games at 120+1.2.
370 spsa params - L2 weights in nn-74f1d263ae9a.nnue where |val| >= 50
A: 6000, alpha: 0.602, gamma: 0.101
weights: [-127, 127], c_end = 6
To print the spsa params with nnue-pytorch:
```
import features
from serialize import NNUEReader
feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-74f1d263ae9a.nnue", "rb") as f:
model = NNUEReader(f, feature_set).model
c_end = 6
for i in range(8):
for j in range(32):
for k in range(30):
value = int(model.layer_stacks.l2.weight[32 * i + j, k] * 64)
if abs(value) >= 50:
print(f"twoW[{i}][{j}][{k}],{value},-127,127,{c_end},0.0020")
```
Among the 370 params, 229 weights were changed.
avg change: 0.0961 ± 1.67
range: [-4, 3]
The number of weights changed, grouped by layer stack index,
shows more weights were modified in the lower piece count buckets:
[54, 52, 29, 23, 22, 18, 14, 17]
Found with the same method described in:
https://github.com/official-stockfish/Stockfish/pull/5459
Passed STC:
https://tests.stockfishchess.org/tests/view/668aec9a58083e5fd88239e7
LLR: 3.00 (-2.94,2.94) <0.00,2.00>
Total: 52384 W: 13569 L: 13226 D: 25589
Ptnml(0-2): 127, 6141, 13335, 6440, 149
Passed LTC:
https://tests.stockfishchess.org/tests/view/668af50658083e5fd8823a0b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 46974 W: 12006 L: 11668 D: 23300
Ptnml(0-2): 25, 4992, 13121, 5318, 31
closes https://github.com/official-stockfish/Stockfish/pull/5466
bench 1300471
Always use the posix function posix_memalign() as aligned memory
allocator on Apple computers. This should allow to compile Stockfish
out of the box on all versions of Mac OS X.
Patch tested on the following systems (apart from the CI) :
• Mac OS 10.9.6 (arch x86-64-sse41-popcnt) with gcc-10
• Mac OS 10.13.6 (arch x86-64-bmi2) with gcc-10, gcc-14 and clang-11
• Mac OS 14.1.1 (arch apple-silicon) with clang-15
closes https://github.com/official-stockfish/Stockfish/pull/5462
No functional change
Created by setting output weights (256) and biases (8) of the previous main net
nn-ddcfb9224cdb.nnue to values found around 12k / 120k spsa games at 120+1.2
This used modified fishtest dev workers to construct .nnue files from
spsa params, then load them with EvalFile when running tests:
https://github.com/linrock/fishtest/tree/spsa-file-modified-nnue/worker
Inspired by researching loading spsa params from files:
https://github.com/official-stockfish/fishtest/pull/1926
Scripts for modifying nnue files and preparing params:
https://github.com/linrock/nnue-pytorch/tree/no-gpu-modify-nnue
spsa params:
weights: [-127, 127], c_end = 6
biases: [-8192, 8192], c_end = 64
Example of reading output weights and biases from the previous main net using
nnue-pytorch and printing spsa params in a format compatible with fishtest:
```
import features
from serialize import NNUEReader
feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-ddcfb9224cdb.nnue", "rb") as f:
model = NNUEReader(f, feature_set).model
c_end_weights = 6
c_end_biases = 64
for i in range(8):
for j in range(32):
value = round(int(model.layer_stacks.output.weight[i, j] * 600 * 16) / 127)
print(f"oW[{i}][{j}],{value},-127,127,{c_end_weights},0.0020")
for i in range(8):
value = int(model.layer_stacks.output.bias[i] * 600 * 16)
print(f"oB[{i}],{value},-8192,8192,{c_end_biases},0.0020")
```
For more info on spsa tuning params in nets:
https://github.com/official-stockfish/Stockfish/pull/5149https://github.com/official-stockfish/Stockfish/pull/5254
Passed STC:
https://tests.stockfishchess.org/tests/view/66894d64e59d990b103f8a37
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 32000 W: 8443 L: 8137 D: 15420
Ptnml(0-2): 80, 3627, 8309, 3875, 109
Passed LTC:
https://tests.stockfishchess.org/tests/view/6689668ce59d990b103f8b8b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 172176 W: 43822 L: 43225 D: 85129
Ptnml(0-2): 97, 18821, 47633, 19462, 75
closes https://github.com/official-stockfish/Stockfish/pull/5459
bench 1120091
Currently (after #5407), SF has the property that any PV line with a decisive
TB score contains the corresponding TB position, with a score that correctly
identifies the depth at which TB are entered. The PV line that follows might
not preserve the game outcome, but can easily be verified and extended based on
TB information. This patch provides this functionality, simply extending the PV
lines on output, this doesn't affect search.
Indeed, if DTZ tables are available, search based PV lines that correspond to
decisive TB scores are verified to preserve game outcome, truncating the line
as needed. Subsequently, such PV lines are extended with a game outcome
preserving line until mate, as a possible continuation. These lines are not
optimal mating lines, but are similar to what a user could produce on a website
like https://syzygy-tables.info/ clicking always the top ranked move, i.e.
minimizing or maximizing DTZ (with a simple tie-breaker for moves that have
identical DTZ), and are thus an just an illustration of how to game can be won.
A similar approach is already in established in
https://github.com/joergoster/Stockfish/tree/matefish2
This also contributes to addressing #5175 where SF can give an incorrect TB
win/loss for positions in TB with a movecounter that doesn't reflect optimal
play. While the full solution requires either TB generated differently, or a
search when ranking rootmoves, current SF will eventually find a draw in these
cases, in practice quite quickly, e.g.
`1kq5/q2r4/5K2/8/8/8/8/7Q w - - 96 1`
`8/8/6k1/3B4/3K4/4N3/8/8 w - - 54 106`
Gives the same results as master on an extended set of test positions from
https://github.com/mcostalba/Stockfish/commit/9173d29c414ddb8f4bec74e4db3ccbe664c66bf9
with the exception of the above mentioned fen where this commit improves.
With https://github.com/vondele/matetrack using 6men TB, all generated PVs verify:
```
Using ../Stockfish/src/stockfish.syzygyExtend on matetrack.epd with --nodes 1000000 --syzygyPath /chess/syzygy/3-4-5-6/WDL:/chess/syzygy/3-4-5-6/DTZ
Engine ID: Stockfish dev-20240704-ff227954
Total FENs: 6555
Found mates: 3299
Best mates: 2582
Found TB wins: 568
```
As repeated DTZ probing could be slow a procedure (100ms+ on HDD, a few ms on
SSD), the extension is only done as long as the time taken is less than half
the `Move Overhead` parameter. For tournaments where these lines might be of
interest to the user, a suitable `Move Overhead` might be needed (e.g. TCEC has
1000ms already).
closes https://github.com/official-stockfish/Stockfish/pull/5414
No functional change
To avoid output that depends on timing, output currmove and similar only from depth > 30
onward. Current choice of 3s makes the output of the same search depending on
the system load, and doesn't always start at move 1. Depth 30 is nowadays
reached in a few seconds on most systems.
closes https://github.com/official-stockfish/Stockfish/pull/5436
No functional change
do not upload some unneeded intermediate directories,
disable running authenticated git commands with the checkout action.
Thanks to Yaron A for the report.
closes https://github.com/official-stockfish/Stockfish/pull/5435
No functional change
Follow up from #5404 ... current location leads to troubles with Aquarium GUI
Fixes#5430
Now prints the information on threads and available processors at the beginning
of search, where info about the networks is already printed (and is known to
work)
closes https://github.com/official-stockfish/Stockfish/pull/5433
No functional change.
This patch introduces history updates to probcut. Standard depth - 3 bonus and
maluses are given to the capture that caused fail high and previously searched
captures, respectively. Similar to #5243, a negative history fill is applied to
compensate for an increase in capture history average, thus improving the
scaling of this patch.
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 84832 W: 21941 L: 21556 D: 41335
Ptnml(0-2): 226, 9927, 21688, 10386, 189
https://tests.stockfishchess.org/tests/view/6682fab9389b9ee542b1d029
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 104298 W: 26469 L: 26011 D: 51818
Ptnml(0-2): 43, 11458, 28677, 11940, 31
https://tests.stockfishchess.org/tests/view/6682ff06389b9ee542b1d0a0
closes https://github.com/official-stockfish/Stockfish/pull/5428
bench 1281351
this action plays games under fast-chess with a `debug=yes` compiled binary.
It checks for triggered asserts in the code, or generally for engine disconnects.
closes https://github.com/official-stockfish/Stockfish/pull/5403
No functional change
try to avoid missing good moves for opponent or engine, by updating bestMove
also when value == bestValue (i.e. value == alpha) under certain conditions.
In particular require this is at higher depth in the tree, leaving the logic
near the root unchanged, and only apply randomly. Avoid doing this near mate
scores, leaving mate PVs intact.
Passed SMP STC 6+0.06 th7 :
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 42040 W: 10930 L: 10624 D: 20486
Ptnml(0-2): 28, 4682, 11289, 4998, 23
https://tests.stockfishchess.org/tests/view/66608b00c340c8eed7757d1d
Passed SMP LTC 24+0.24 th7 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 73692 W: 18978 L: 18600 D: 36114
Ptnml(0-2): 9, 7421, 21614, 7787, 15
https://tests.stockfishchess.org/tests/view/666095e8c340c8eed7757d49
closes https://github.com/official-stockfish/Stockfish/pull/5367
Bench 1205168
In case a stop is received during multithreaded searches, the PV of the best
thread might be printed without the correct upperbound/lowerbound indicators.
This was due to the pvIdx variable being incremented after receiving the stop.
passed STC:
https://tests.stockfishchess.org/tests/view/666985da602682471b064d08
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 196576 W: 51039 L: 50996 D: 94541
Ptnml(0-2): 760, 22545, 51603, 22652, 728
closes https://github.com/official-stockfish/Stockfish/pull/5391
Bench: 1160467
currently extensions can cause depth to exceed MAX_PLY.
This triggers the assert near line 542 in search when running a binary compiled with `debug=yes` on a testcase like:
```
position fen 7K/P1p1p1p1/2P1P1Pk/6pP/3p2P1/1P6/3P4/8 w - - 0 1
go nodes 1000000
```
passed STC
https://tests.stockfishchess.org/tests/view/6668a56a602682471b064c8d
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 143936 W: 37338 L: 37238 D: 69360
Ptnml(0-2): 514, 16335, 38149, 16477, 493
closes https://github.com/official-stockfish/Stockfish/pull/5383
Bench: 1160467
Move the engine options into the engine class, also avoid duplicated
initializations after startup. UCIEngine needs to register an add_listener to
listen to all option changes and print these. Also avoid a double
initialization of the TT, which was the case with the old state.
closes https://github.com/official-stockfish/Stockfish/pull/5356
No functional change
1. Fix GetProcessGroupAffinity still not getting properly aligned memory
sometimes.
2. Fix a very theoretically possible heap corruption if
GetActiveProcessorGroupCount changes between calls.
3. Fully determine affinity on Windows 11 and Windows Server 2022. It
should only ever be indeterminate in case of an error.
4. Separate isDeterminate for old and new API, as they are &'d together
we still can end up with a subset of processors even if one API is
indeterminate.
5. likely_used_old_api() that is based on actual affinity that's been
detected
6. IMPORTANT: Gather affinities at startup, so that we only later use
the affinites set at startup. Not only does this prevent us from our
own calls interfering with detection but it also means subsequent
setoption NumaPolicy calls should behave as expected.
7. Fix ERROR_INSUFFICIENT_BUFFER from GetThreadSelectedCpuSetMasks being
treated like an error.
Should resolve
https://github.com/vondele/Stockfish/commit/02ff76630b358e5f958793cc93df0009d2da65a5#commitcomment-142790025
closes https://github.com/official-stockfish/Stockfish/pull/5372
Bench: 1231853
Can be used in case a GUI (e.g. ChessBase 17 see #5307) sets affinity to a
single processor group, but the user would like to use the full capabilities of
the hardware. Improves affinity handling on Windows in case of multiple
available APIs and existing affinities.
closes https://github.com/official-stockfish/Stockfish/pull/5353
No functional change
Passed STC:
https://tests.stockfishchess.org/tests/view/665ce3f8fd45fb0f907c537f
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 282784 W: 73032 L: 73082 D: 136670
Ptnml(0-2): 680, 31845, 76364, 31851, 652
Recently when I overhauled these comments, Disservin asked why these
were so much lower: they're a relic from when we had a third QS stage at
-5. Now we don't, so fix these to the obvious place.
I was fairly sure it was nonfunctional but ran the nonreg to be double
sure.
closes https://github.com/official-stockfish/Stockfish/pull/5343
Bench: 1057383
Previously, we had two type aliases, LargePagePtr and AlignedPtr, which
required manually initializing the aligned memory for the pointer.
The new helpers:
- make_unique_aligned
- make_unique_large_page
are now available for allocating aligned memory (with large pages). They
behave similarly to std::make_unique, ensuring objects allocated with
these functions follow RAII.
The old approach had issues with initializing non-trivial types or
arrays of objects. The evaluation function of the network is now a
unique pointer to an array instead of an array of unique pointers.
Memory related functions have been moved into memory.h
Passed High Hash Pressure Test Non-Regression STC:
https://tests.stockfishchess.org/tests/view/665b2b36586058766677cfd2
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 476992 W: 122426 L: 122677 D: 231889
Ptnml(0-2): 1145, 51027, 134419, 50744, 1161
Failed Normal Non-Regression STC:
https://tests.stockfishchess.org/tests/view/665b2997586058766677cfc8
LLR: -2.94 (-2.94,2.94) <-1.75,0.25>
Total: 877312 W: 225233 L: 226395 D: 425684
Ptnml(0-2): 2110, 94642, 246239, 93630, 2035
Probably a fluke since there shouldn't be a real slowndown and it has also
passed the high hash pressure test.
closes https://github.com/official-stockfish/Stockfish/pull/5332
No functional change
This reverts commit 783dfc2eb2.
could lead to a division by zero for:
ttValue = (ttValue * tte->depth() + beta) / (tte->depth() + 1)
as other threads can overwrite the tte with a QS depth of -1.
closes https://github.com/official-stockfish/Stockfish/pull/5338
Bench: 1280020
This is reintroduction of the recently simplified logic - if positive tt cutoff
occurs return not a tt value but smth between it and beta. Difference is that
instead of static linear combination there we use basically the same formula as
we do in the main search - with the only difference being using tt depth
instead of depth, which makes a lot of sense.
Passed STC:
https://tests.stockfishchess.org/tests/view/665b3a34f4a1fd0c208ea870
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 54944 W: 14239 L: 13896 D: 26809
Ptnml(0-2): 151, 6407, 14008, 6760, 146
Passed LTC:
https://tests.stockfishchess.org/tests/view/665b520011645bd3d3fac341
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 90540 W: 23070 L: 22640 D: 44830
Ptnml(0-2): 39, 9903, 24965, 10315, 48
closes https://github.com/official-stockfish/Stockfish/pull/5336
bench 1381237
The idea is, that if we have the information that the singular search failed low and therefore produced an upperbound score, we can use the score from singularsearch as approximate upperbound as to what bestValue our non ttMoves will produce. If this value is well below alpha, we assume that all non-ttMoves will score below alpha and therfore can skip more moves.
This patch also sets up variables for future patches wanting to use teh singular search result outside of singular extensions, in singularBound and singularValue, meaning further patches using this search result to affect various pruning techniques can be tried.
Passed STC:
https://tests.stockfishchess.org/tests/view/6658d13e6b0e318cefa90120
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 85632 W: 22112 L: 21725 D: 41795
Ptnml(0-2): 243, 10010, 21947, 10349, 267
Passed LTC:
https://tests.stockfishchess.org/tests/view/6658dd356b0e318cefa9016a
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 243978 W: 62014 L: 61272 D: 120692
Ptnml(0-2): 128, 26598, 67791, 27348, 124
closes https://github.com/official-stockfish/Stockfish/pull/5325
bench 1397172
Specialize and privatize NumaConfig::get_process_affinity.
Only enable NUMA capability for 64-bit Windows.
Following #5307 and some more testing it was determined that the way affinity
was being determined on Windows was incorrect, based on incorrect assumptions
about GetNumaProcessorNodeEx.
This patch fixes the issue by attempting to retrieve the actual process'
processor affinity using Windows API. However one issue persists that is not
addressable due to limitations of Windows, and will have to be considered a
limitation. If affinities were set using SetThreadAffinityMask instead of
SetThreadSelectedCpuSetMasks and GetProcessGroupAffinity returns more than 1
group it is NOT POSSIBLE to determine the affinity programmatically on Windows.
In such case the implementation assumes no affinites are set and will consider
all processors available for execution.
closes https://github.com/official-stockfish/Stockfish/pull/5312
No functional change
This PR updates the internal WDL model, using data from 2.5M games played by SF-dev (3c62ad7).
Note that the normalizing constant has increased from 329 to 368.
Changes to the fitting procedure:
* the value for --materialMin was increased from 10 to 17: including data with less material leads to less accuracy for larger material count values
* the data was filtered to only include single thread LTC games at 60+0.6
* the data was filtered to only include games from master against patches that are (approximatively) within 5 nElo of master
For more information and plots of the model see PR#5309
closes https://github.com/official-stockfish/Stockfish/pull/5309
No functional change
As stockfish nets and search evolve, the existing time control appears
to give too little time at STC, roughly correct at LTC, and too little
at VLTC+.
This change adds an adjustment to the optExtra calculation. This
adjustment is easy to retune and refine, so it should be easier to keep
up-to-date than the more complex calculations used for optConstant and
optScale.
Passed STC 10+0.1:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 169568 W: 43803 L: 43295 D: 82470
Ptnml(0-2): 485, 19679, 44055, 19973, 592
https://tests.stockfishchess.org/tests/view/66531865a86388d5e27da9fa
Yellow LTC 60+0.6:
LLR: -2.94 (-2.94,2.94) <0.50,2.50>
Total: 209970 W: 53087 L: 52914 D: 103969
Ptnml(0-2): 91, 19652, 65314, 19849, 79
https://tests.stockfishchess.org/tests/view/6653e38ba86388d5e27daaa0
Passed VLTC 180+1.8 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 85618 W: 21735 L: 21342 D: 42541
Ptnml(0-2): 15, 8267, 25848, 8668, 11
https://tests.stockfishchess.org/tests/view/6655131da86388d5e27db95f
closes https://github.com/official-stockfish/Stockfish/pull/5297
Bench: 1212167
Allow for NUMA memory replication for NNUE weights. Bind threads to ensure execution on a specific NUMA node.
This patch introduces NUMA memory replication, currently only utilized for the NNUE weights. Along with it comes all machinery required to identify NUMA nodes and bind threads to specific processors/nodes. It also comes with small changes to Thread and ThreadPool to allow easier execution of custom functions on the designated thread. Old thread binding (WinProcGroup) machinery is removed because it's incompatible with this patch. Small changes to unrelated parts of the code were made to ensure correctness, like some classes being made unmovable, raw pointers replaced with unique_ptr. etc.
Windows 7 and Windows 10 is partially supported. Windows 11 is fully supported. Linux is fully supported, with explicit exclusion of Android. No additional dependencies.
-----------------
A new UCI option `NumaPolicy` is introduced. It can take the following values:
```
system - gathers NUMA node information from the system (lscpu or windows api), for each threads binds it to a single NUMA node
none - assumes there is 1 NUMA node, never binds threads
auto - this is the default value, depends on the number of set threads and NUMA nodes, will only enable binding on multinode systems and when the number of threads reaches a threshold (dependent on node size and count)
[[custom]] -
// ':'-separated numa nodes
// ','-separated cpu indices
// supports "first-last" range syntax for cpu indices,
for example '0-15,32-47:16-31,48-63'
```
Setting `NumaPolicy` forces recreation of the threads in the ThreadPool, which in turn forces the recreation of the TT.
The threads are distributed among NUMA nodes in a round-robin fashion based on fill percentage (i.e. it will strive to fill all NUMA nodes evenly). Threads are bound to NUMA nodes, not specific processors, because that's our only requirement and the OS can schedule them better.
Special care is made that maximum memory usage on systems that do not require memory replication stays as previously, that is, unnecessary copies are avoided.
On linux the process' processor affinity is respected. This means that if you for example use taskset to restrict Stockfish to a single NUMA node then the `system` and `auto` settings will only see a single NUMA node (more precisely, the processors included in the current affinity mask) and act accordingly.
-----------------
We can't ensure that a memory allocation takes place on a given NUMA node without using libnuma on linux, or using appropriate custom allocators on windows (https://learn.microsoft.com/en-us/windows/win32/memory/allocating-memory-from-a-numa-node), so to avoid complications the current implementation relies on first-touch policy. Due to this we also rely on the memory allocator to give us a new chunk of untouched memory from the system. This appears to work reliably on linux, but results may vary.
MacOS is not supported, because AFAIK it's not affected, and implementation would be problematic anyway.
Windows is supported since Windows 7 (https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity). Until Windows 11/Server 2022 NUMA nodes are split such that they cannot span processor groups. This is because before Windows 11/Server 2022 it's not possible to set thread affinity spanning processor groups. The splitting is done manually in some cases (required after Windows 10 Build 20348). Since Windows 11/Server 2022 we can set affinites spanning processor group so this splitting is not done, so the behaviour is pretty much like on linux.
Linux is supported, **without** libnuma requirement. `lscpu` is expected.
-----------------
Passed 60+1 @ 256t 16000MB hash: https://tests.stockfishchess.org/tests/view/6654e443a86388d5e27db0d8
```
LLR: 2.95 (-2.94,2.94) <0.00,10.00>
Total: 278 W: 110 L: 29 D: 139
Ptnml(0-2): 0, 1, 56, 82, 0
```
Passed SMP STC: https://tests.stockfishchess.org/tests/view/6654fc74a86388d5e27db1cd
```
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67152 W: 17354 L: 17177 D: 32621
Ptnml(0-2): 64, 7428, 18408, 7619, 57
```
Passed STC: https://tests.stockfishchess.org/tests/view/6654fb27a86388d5e27db15c
```
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 131648 W: 34155 L: 34045 D: 63448
Ptnml(0-2): 426, 13878, 37096, 14008, 416
```
fixes#5253
closes https://github.com/official-stockfish/Stockfish/pull/5285
No functional change
This speedup was first inspired by a comment by @AndyGrant on my recent
PR "If mullo_epi16 would preserve the signedness, then this could be
used to remove 50% of the max operations during the halfkp-pairwise
mat-mul relu deal."
That got me thinking, because although mullo_epi16 did not preserve the
signedness, mulhi_epi16 did, and so we could shift left and then use
mulhi_epi16, instead of shifting right after the mullo.
However, due to some issues with shifting into the sign bit, the FT
weights and biases had to be multiplied by 2 for the optimisation to
work.
Speedup on "Arch=x86-64-bmi2 COMP=clang", courtesy of @Torom
Result of 50 runs
base (...es/stockfish) = 962946 +/- 1202
test (...ise-max-less) = 979696 +/- 1084
diff = +16750 +/- 1794
speedup = +0.0174
P(speedup > 0) = 1.0000
CPU: 4 x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Hyperthreading: on
Also a speedup on "COMP=gcc", courtesy of Torom once again
Result of 50 runs
base (...tockfish_gcc) = 966033 +/- 1574
test (...max-less_gcc) = 983319 +/- 1513
diff = +17286 +/- 2515
speedup = +0.0179
P(speedup > 0) = 1.0000
CPU: 4 x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Hyperthreading: on
Passed STC:
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 67712 W: 17715 L: 17358 D: 32639
Ptnml(0-2): 225, 7472, 18140, 7759, 260
https://tests.stockfishchess.org/tests/view/664c1d75830eb9f886616906
closes https://github.com/official-stockfish/Stockfish/pull/5282
No functional change
Also "fix" movepicker to allow depths between CHECKS and NO_CHECKS,
which makes them easier to tweak (not that they get tweaked hardly ever)
(This was more beneficial when there was a third stage to DEPTH_QS, but
it's still an improvement now)
closes https://github.com/official-stockfish/Stockfish/pull/5205
No functional change
Removes some max calls
Some speedup stats, courtesy of @AndyGrant (albeit measured in an alternate implementation)
Dev 749240 nps
Base 748495 nps
Gain 0.100%
289936 games
STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 203040 W: 52213 L: 52179 D: 98648
Ptnml(0-2): 480, 20722, 59139, 20642, 537
https://tests.stockfishchess.org/tests/view/664805fe6dcff0d1d6b05f2ccloses#5261
No functional change
Created by first retraining the spsa-tuned main net `nn-ae6a388e4a1a.nnue` with:
- using v6-dd data without bestmove captures removed
- addition of T80 mar2024 data
- increasing loss by 20% when Q is too high
- torch.compile changes for marginal training speed gains
And then SPSA tuning weights of epoch 899 following methods described in:
https://github.com/official-stockfish/Stockfish/pull/5149
This net was reached at 92k out of 120k steps in this 70+0.7 th 7 SPSA tuning run:
https://tests.stockfishchess.org/tests/view/66413b7df9f4e8fc783c9bbb
Thanks to @Viren6 for suggesting usage of:
- c value 4 for the weights
- c value 128 for the biases
Scripts for automating applying fishtest spsa params to exporting tuned .nnue are in:
https://github.com/linrock/nnue-tools/tree/master/spsa
Before spsa tuning, epoch 899 was nn-f85738aefa84.nnue
https://tests.stockfishchess.org/tests/view/663e5c893a2f9702074bc167
After initially training with max-epoch 800, training was resumed with max-epoch 1000.
```
experiment-name: 3072--S11--more-data-v6-dd-t80-mar2024--see-ge0-20p-more-loss-high-q-sk28-l8
nnue-pytorch-branch: linrock/nnue-pytorch/3072-r21-skip-more-wdl-see-ge0-20p-more-loss-high-q-torch-compile-more
start-from-engine-test-net: False
start-from-model: /data/config/apr2024-3072/nn-ae6a388e4a1a.nnue
early-fen-skipping: 28
training-dataset:
/data/S11-mar2024/:
- leela96.v2.min.binpack
- test60-2021-11-12-novdec-12tb7p.v6-dd.min.binpack
- test78-2022-01-to-05-jantomay-16tb7p.v6-dd.min.binpack
- test80-2022-06-jun-16tb7p.v6-dd.min.binpack
- test80-2022-08-aug-16tb7p.v6-dd.min.binpack
- test80-2022-09-sep-16tb7p.v6-dd.min.binpack
- test80-2023-01-jan-16tb7p.v6-sk20.min.binpack
- test80-2023-02-feb-16tb7p.v6-sk20.min.binpack
- test80-2023-03-mar-2tb7p.v6-sk16.min.binpack
- test80-2023-04-apr-2tb7p.v6-sk16.min.binpack
- test80-2023-05-may-2tb7p.v6.min.binpack
# https://github.com/official-stockfish/Stockfish/pull/4782
- test80-2023-06-jun-2tb7p.binpack
- test80-2023-07-jul-2tb7p.binpack
# https://github.com/official-stockfish/Stockfish/pull/4972
- test80-2023-08-aug-2tb7p.v6.min.binpack
- test80-2023-09-sep-2tb7p.binpack
- test80-2023-10-oct-2tb7p.binpack
# S9 new data: https://github.com/official-stockfish/Stockfish/pull/5056
- test80-2023-11-nov-2tb7p.binpack
- test80-2023-12-dec-2tb7p.binpack
# S10 new data: https://github.com/official-stockfish/Stockfish/pull/5149
- test80-2024-01-jan-2tb7p.binpack
- test80-2024-02-feb-2tb7p.binpack
# S11 new data
- test80-2024-03-mar-2tb7p.binpack
/data/filt-v6-dd/:
- test77-dec2021-16tb7p-filter-v6-dd.binpack
- test78-juntosep2022-16tb7p-filter-v6-dd.binpack
- test79-apr2022-16tb7p-filter-v6-dd.binpack
- test79-may2022-16tb7p-filter-v6-dd.binpack
- test80-jul2022-16tb7p-filter-v6-dd.binpack
- test80-oct2022-16tb7p-filter-v6-dd.binpack
- test80-nov2022-16tb7p-filter-v6-dd.binpack
num-epochs: 1000
lr: 4.375e-4
gamma: 0.995
start-lambda: 0.8
end-lambda: 0.7
```
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch899.nnue : 4.6 +/- 1.4
Passed STC:
https://tests.stockfishchess.org/tests/view/6645454893ce6da3e93b31ae
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 95232 W: 24598 L: 24194 D: 46440
Ptnml(0-2): 294, 11215, 24180, 11647, 280
Passed LTC:
https://tests.stockfishchess.org/tests/view/6645522d93ce6da3e93b31df
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 320544 W: 81432 L: 80524 D: 158588
Ptnml(0-2): 164, 35659, 87696, 36611, 142
closes https://github.com/official-stockfish/Stockfish/pull/5254
bench 1995552
Basically the same idea as it is for continuation/main history, but it
has some tweaks.
1) it has * 2 multiplier for bonus instead of full/half bonus - for
whatever reason this seems to work better;
2) attempts with this type of big bonuses scaled somewhat poorly (or
were unlucky at longer time controls), but after measuring the fact
that average value of pawn history in LMR after adding this bonuses
increased by substantial number (for multiplier 1,5 it increased by
smth like 400~ from 8192 cap) attempts were made to make default pawn
history negative to compensate it - and version with multiplier 2 and
initial fill value -900 passed.
Passed STC:
https://tests.stockfishchess.org/tests/view/66424815f9f4e8fc783cba59
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 115008 W: 30001 L: 29564 D: 55443
Ptnml(0-2): 432, 13629, 28903, 14150, 390
Passed LTC:
https://tests.stockfishchess.org/tests/view/6642f5437134c82f3f7a3ffa
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 56448 W: 14432 L: 14067 D: 27949
Ptnml(0-2): 36, 6268, 15254, 6627, 39
Bench: 1857237
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.
Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.
STC 10+0.1 :
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 78496 W: 20516 L: 20135 D: 37845
Ptnml(0-2): 340, 8859, 20456, 9266, 327
https://tests.stockfishchess.org/tests/view/663d47bf507ebe1c0e9200ba
LTC 60+0.6 :
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 94872 W: 24179 L: 23751 D: 46942
Ptnml(0-2): 61, 9743, 27405, 10161, 66
https://tests.stockfishchess.org/tests/view/663e779cbb28828150dd9089
closes https://github.com/official-stockfish/Stockfish/pull/5235
Bench: 1876282
1. The current time management system utilizes limits.inc and
limits.time, which can represent either milliseconds or node count,
depending on whether the nodestime option is active. There have been
several modifications which brought Elo gain for typical uses (i.e.
real-time matches), however some of these changes overlooked such
distinction. This patch adjusts constants and multiplication/division to
more accurately simulate real TC conditions when nodestime is used.
2. The advance_nodes_time function has a bug that can extend the time
limit when availableNodes reaches exact zero. This patch fixes the bug
by initializing the variable to -1 and make sure it does not go below
zero.
3. elapsed_time function is newly introduced to print PV in the UCI
output based on real time. This makes PV output more consistent with the
behavior of trivial use cases.
closes https://github.com/official-stockfish/Stockfish/pull/5186
No functional changes
If there is an upper bound stored in the transposition table, but we still have a ttMove, the upperbound indicates that the last time the ttMove was tried, it failed low. This fail low indicates that the ttMove may not be good, so this patch introduces a depth reduction of one for cutnodes with such ttMoves.
Passed STC:
https://tests.stockfishchess.org/tests/view/663be4d1ca93dad645f7f45f
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 139424 W: 35900 L: 35433 D: 68091
Ptnml(0-2): 425, 16357, 35743, 16700, 487
Passed LTC:
https://tests.stockfishchess.org/tests/view/663bec95ca93dad645f7f5c8
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129690 W: 32902 L: 32390 D: 64398
Ptnml(0-2): 63, 14304, 35610, 14794, 74
closes https://github.com/official-stockfish/Stockfish/pull/5227
bench 2257437
Make it formula more in line with what we use in search - current formula is more or less the one we used years ago for search but since then it was remade, this patch remakes qsearch formula to almost exactly the same as we use in search - with sum of conthist 0, 1 and pawn structure history.
Passed STC:
https://tests.stockfishchess.org/tests/view/6639c8421343f0cb16716206
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 84992 W: 22414 L: 22019 D: 40559
Ptnml(0-2): 358, 9992, 21440, 10309, 397
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 119136 W: 30407 L: 29916 D: 58813
Ptnml(0-2): 46, 13192, 32622, 13641, 67
closes https://github.com/official-stockfish/Stockfish/pull/5224
Bench: 2138659
The idea came to me by checking for trends from the megafauzi tunes, since the values of the divisor for this specific formula were as follows:
stc: 15990
mtc: 16117
ltc: 14805
vltc: 12719
new vltc passed by Muzhen: 12076
This shows a clear trend related to time control, the higher it is, the lower the optimum value for the divisor seems to be.
So I tried a simple formula, using educated guesses based on some calculations, tests show it works pretty fine, and it can still be further tuned at VLTC in the future to scale even better.
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 431360 W: 110791 L: 109898 D: 210671
Ptnml(0-2): 1182, 50846, 110698, 51805, 1149
https://tests.stockfishchess.org/tests/view/663770409819650825aa269f
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 114114 W: 29109 L: 28625 D: 56380
Ptnml(0-2): 105, 12628, 31101, 13124, 99
https://tests.stockfishchess.org/tests/view/66378c099819650825aa73f6https://github.com/official-stockfish/Stockfish/pull/5223
bench: 2273551
This adds the functions `update_refutations` and `update_quiet_histories` to better distinguish the two. `update_quiet_stats` now just calls both of these functions.
The functional side of this patch is two-fold:
1. Stop refutations being updated when we carry out multicut
2. Update pawn history every time we update other quiet histories
Yellow STC:
LLR: -2.95 (-2.94,2.94) <0.00,2.00>
Total: 238976 W: 61506 L: 61415 D: 116055
Ptnml(0-2): 846, 28628, 60456, 28705, 853
https://tests.stockfishchess.org/tests/view/66321b5ed01fb9ac9bcdca83
However, it passed in <-1.75, 0.25> bounds:
$ python3 sprt.py --wins 61506 --losses 61415 --draws 116055 --elo0 -1.75 --elo1 0.25
ELO: 0.132 +- 0.998 [-0.865, 1.13]
LLR: 4.15 [-1.75, 0.25] (-2.94, 2.94)
H1 Accepted
Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 399126 W: 100730 L: 100896 D: 197500
Ptnml(0-2): 116, 44328, 110843, 44158, 118
https://tests.stockfishchess.org/tests/view/66357b0473559a8aa857ba6fcloses#5215
Bench 2370967
Saves a (currently) 800 KB allocation and deallocation when running
`eval`, not particularly significant and zero impact on play but not
necessary either.
closes https://github.com/official-stockfish/Stockfish/pull/5201
No functional change
Adds size in memory as well as layer sizes as in
info string NNUE evaluation using nn-ae6a388e4a1a.nnue (132MiB, (22528, 3072, 15, 32, 1))
info string NNUE evaluation using nn-baff1ede1f90.nnue (6MiB, (22528, 128, 15, 32, 1))
For example, the size in MiB is useful to keep the fishtest memory sizes up-to-date,
the L1-L3 sizes give a useful hint about the architecture used.
closes https://github.com/official-stockfish/Stockfish/pull/5193
No functional change
For each thread persist an accumulator cache for the network, where each
cache contains multiple entries for each of the possible king squares.
When the accumulator needs to be refreshed, the cached entry is used to more
efficiently update the accumulator, instead of rebuilding it from scratch.
This idea, was first described by Luecx (author of Koivisto) and
is commonly referred to as "Finny Tables".
When the accumulator needs to be refreshed, instead of filling it with
biases and adding every piece from scratch, we...
1. Take the `AccumulatorRefreshEntry` associated with the new king bucket
2. Calculate the features to activate and deactivate (from differences
between bitboards in the entry and bitboards of the actual position)
3. Apply the updates on the refresh entry
4. Copy the content of the refresh entry accumulator to the accumulator
we were refreshing
5. Copy the bitboards from the position to the refresh entry, to match
the newly updated accumulator
Results at STC:
https://tests.stockfishchess.org/tests/view/662301573fe04ce4cefc1386
(first version)
https://tests.stockfishchess.org/tests/view/6627fa063fe04ce4cefc6560
(final)
Non-Regression between first and final:
https://tests.stockfishchess.org/tests/view/662801e33fe04ce4cefc660a
STC SMP:
https://tests.stockfishchess.org/tests/view/662808133fe04ce4cefc667c
closes https://github.com/official-stockfish/Stockfish/pull/5183
No functional change
Parameters Tune, adding also another tunable parameter (npmDiv) to be
variable for different nets (bignet, smallnet, psqtOnly smallnet). P.s:
The changed values are only the parameters where there is agreement
among the different time controls, so in other words, the tunings are
telling us that changing these specific values to this specific
direction is good in all time controls, so there shouldn't be a high
risk of regressing at longer time controls.
Passed STC:
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 39552 W: 10329 L: 9999 D: 19224
Ptnml(0-2): 156, 4592, 9989, 4844, 195
https://tests.stockfishchess.org/tests/view/661be9a0bd68065432a088c0
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 56394 W: 14439 L: 14078 D: 27877
Ptnml(0-2): 30, 6152, 15480, 6497, 38
https://tests.stockfishchess.org/tests/view/661c746296961e72eb565406
closes https://github.com/official-stockfish/Stockfish/pull/5187
Bench: 1836777
Previously it was possible to also get the node counter after running a bench with perft, i.e.
`./stockfish bench 1 1 5 current perft`, caused by a small regression from the uci refactoring.
```
Nodes searched: 4865609
===========================
Total time (ms) : 18
Nodes searched : 4865609
Nodes/second : 270311611
````
closes https://github.com/official-stockfish/Stockfish/pull/5188
No functional change
We change the definition of "age" from "age of this position" to "age of this TT entry".
In this way, despite being on the same position, when we save into TT, we always prefer the new entry as compared to the old one.
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 152256 W: 39597 L: 39110 D: 73549
Ptnml(0-2): 556, 17562, 39398, 18063, 549
https://tests.stockfishchess.org/tests/view/6620faee3fe04ce4cefbf215
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 51564 W: 13242 L: 12895 D: 25427
Ptnml(0-2): 24, 5464, 14463, 5803, 28
https://tests.stockfishchess.org/tests/view/66231ab53fe04ce4cefc153ecloses#5184
Bench 1479416
It makes more sense to not (potentially) change the developers alsr entropy setting to make the test run through. This should be an active choice even if the test then might fail locally for them.
closes https://github.com/official-stockfish/Stockfish/pull/5182
No functional change
the recent refactoring has shown some limitations of our testing, hence we add a couple of more tests including:
* expected mate score
* expected mated score
* expected in TB win score
* expected in TB loss score
* expected info line output
* expected info line output (wdl)
closes https://github.com/official-stockfish/Stockfish/pull/5181
No functional change
Fix another case of 9032c6cbe7
* TB values can have a distance of 0, mainly when we are in a tb position but haven't found mate.
* Add a missing whitespace to UCIEngine::on_update_no_moves()
Closes https://github.com/official-stockfish/Stockfish/pull/5172
No functional change
Also fixes searchmoves.
Drop the need of a Position object in uci.cpp.
A side note, it is still required for the static functions,
but these should be moved to a different namespace/class
later on, since sf kinda relies on them.
closes https://github.com/official-stockfish/Stockfish/pull/5169
No functional change
The assignment (ss + 1)->excludedMove = Move::none() can be simplified away because when that line is reached, (ss + 1)->excludedMove is always already none. The only moment stack[x]->excludedMove is modified, is during singular search, but it is reset to none right after the singular search is finished.
closes https://github.com/official-stockfish/Stockfish/pull/5153
No functional change
The same functionality is available by using COMPCXX and having another variable which does the same is just confusing.
There was only one mention on Stockfish Wiki about this which has been changed to COMPCXX.
closes https://github.com/official-stockfish/Stockfish/pull/5154
No functional change
Part 2 of the Split UCI into UCIEngine and Engine refactor.
This creates function callbacks for search to use when an update should occur.
The benching in uci.cpp for example does this to extract the total nodes
searched.
No functional change
This is another refactor which aims to decouple uci from stockfish. A new engine
class manages all engine related logic and uci is a "small" wrapper around it.
In the future we should also try to remove the need for the Position object in
the uci and replace the options with an actual options struct instead of using a
map. Also convert the std::string's in the Info structs a string_view.
closes#5147
No functional change
In the last couple of months we sometimes saw duplicated prereleases uploaded to GitHub, possibly due to some racy behavior when concurrent jobs create a prerelease. This now creates an empty prerelease at the beginning of the CI and the binaries are later just attached to this one.
closes https://github.com/official-stockfish/Stockfish/pull/5144
No functional change
This PR proposes to change the parameter dependence of Stockfish's
internal WDL model from full move counter to material count. In addition
it ensures that an evaluation of 100 centipawns always corresponds to a
50% win probability at fishtest LTC, whereas for master this holds only
at move number 32. See also
https://github.com/official-stockfish/Stockfish/pull/4920 and the
discussion therein.
The new model was fitted based on about 340M positions extracted from
5.6M fishtest LTC games from the last three weeks, involving SF versions
from e67cc979fd (SF 16.1) to current
master.
The involved commands are for
[WDL_model](https://github.com/official-stockfish/WDL_model) are:
```
./updateWDL.sh --firstrev e67cc979fd
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
```
The anchor `58` for the material count value was chosen to be as close
as possible to the observed average material count of fishtest LTC games
at move 32 (`43`), while not changing the value of
`NormalizeToPawnValue` compared to the move-based WDL model by more than
1.
The patch only affects the displayed cp and wdl values.
closes https://github.com/official-stockfish/Stockfish/pull/5121
No functional change
Before, one always had to keep track of the bonus one assigns to a history to stop
the stats from overflowing. This is a quality of life improvement. Since this would often go unnoticed during benching.
Passed non-regression bounds:
https://tests.stockfishchess.org/tests/view/65ef2af40ec64f0526c44cbc
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 179232 W: 46513 L: 46450 D: 86269
Ptnml(0-2): 716, 20323, 47452, 20432, 693
closes https://github.com/official-stockfish/Stockfish/pull/5116
No functional change
Reported by @Torom over discord.
> dev build fails on Raspberry Pi 5 with clang
```
clang++ -o stockfish benchmark.o bitboard.o evaluate.o main.o misc.o movegen.o movepick.o position.o search.o thread.o timeman.o tt.o uci.o ucioption.o tune.o tbprobe.o nnue_misc.o half_ka_v2_hm.o network.o -fprofile-instr-generate -latomic -lpthread -Wall -Wcast-qual -fno-exceptions -std=c++17 -fprofile-instr-generate -pedantic -Wextra -Wshadow -Wmissing-prototypes -Wconditional-uninitialized -DUSE_PTHREADS -DNDEBUG -O3 -funroll-loops -DIS_64BIT -DUSE_POPCNT -DUSE_NEON=8 -march=armv8.2-a+dotprod -DUSE_NEON_DOTPROD -DGIT_SHA=627974c9 -DGIT_DATE=20240312 -DARCH=armv8-dotprod -flto=full
/tmp/lto-llvm-e9300e.o: in function `_GLOBAL__sub_I_network.cpp':
ld-temp.o:(.text.startup+0x704c): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `gEmbeddedNNUEBigEnd' defined in .rodata section in /tmp/lto-llvm-e9300e.o
/usr/bin/ld: ld-temp.o:(.text.startup+0x704c): warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
ld-temp.o:(.text.startup+0x7068): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `gEmbeddedNNUESmallEnd' defined in .rodata section in /tmp/lto-llvm-e9300e.o
/usr/bin/ld: ld-temp.o:(.text.startup+0x7068): warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [Makefile:1051: stockfish] Error 1
make[2]: Leaving directory '/home/torsten/chess/Stockfish_master/src'
make[1]: *** [Makefile:1058: clang-profile-make] Error 2
make[1]: Leaving directory '/home/torsten/chess/Stockfish_master/src'
make: *** [Makefile:886: profile-build] Error 2
```
closes https://github.com/official-stockfish/Stockfish/pull/5106
No functional change
- fix naming convention for `workingDirectory`
- use type alias for `EvalFiles` everywhere
- move `ponderMode` into `LimitsType`
- move limits parsing into standalone static function
closes https://github.com/official-stockfish/Stockfish/pull/5098
No functional change
Fixes two issues with master for go mate x:
- when running go mate x in losing positions, master always goes to the
maximal depth, arguably against what the UCI protocol demands
- when running go mate x in winning positions with multiple
threads, master may return non-mate scores from the search (this issue
is present in stockfish since at least sf16) The issues are fixed by
(a) also checking if score is mate -x and by (b) only letting
mainthread stop the search for go mate x commands, and by not looking
for a best thread but using mainthread as per the default. Related:
niklasf/python-chess#1070
More diagnostics can be found here peregrineshahin#6 (comment)
closes https://github.com/official-stockfish/Stockfish/pull/5094
No functional change
Co-Authored-By: Robert Nürnberg <28635489+robertnurnberg@users.noreply.github.com>
This reduces the futiltiy_margin if our opponents last move was bad by
around ~1/3 when not improving and ~1/2.7 when improving, the idea being
to retroactively futility prune moves that were played, but turned out
to be bad. A bad move is being defined as their staticEval before their
move being lower as our staticEval now is. If the depth is 2 and we are
improving the opponent worsening flag is not set, in order to not risk
having a too low futility_margin, due to the fact that when these
conditions are met the futility_margin already drops quite low.
Passed STC:
https://tests.stockfishchess.org/tests/live_elo/65e3977bf2ef6c733362aae3
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 122432 W: 31884 L: 31436 D: 59112
Ptnml(0-2): 467, 14404, 31035, 14834, 476
Passed LTC:
https://tests.stockfishchess.org/tests/live_elo/65e47f40f2ef6c733362b6d2
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 421692 W: 106572 L: 105452 D: 209668
Ptnml(0-2): 216, 47217, 114865, 48327, 221
closes https://github.com/official-stockfish/Stockfish/pull/5092
Bench: 1565939
Created by retraining the previous main net `nn-b1a57edbea57.nnue` with:
- some of the same options as before:
- ranger21, more WDL skipping, 15% more loss when Q is too high
- removal of the huge 514G pre-interleaved binpack
- removal of SF-generated dfrc data (dfrc99-16tb7p-filt-v2.min.binpack)
- interleaving many binpacks at training time
- training with some bestmove capture positions where SEE < 0
- increased usage of torch.compile to speed up training by up to 40%
```yaml
experiment-name: 2560--S10-dfrc0-to-dec2023-skip-more-wdl-15p-more-loss-high-q-see-ge0-sk28
nnue-pytorch-branch: linrock/nnue-pytorch/r21-more-wdl-skip-15p-more-loss-high-q-skip-see-ge0-torch-compile-more
start-from-engine-test-net: True
early-fen-skipping: 28
training-dataset:
# similar, not the exact same as:
# https://github.com/official-stockfish/Stockfish/pull/4635
- /data/S5-5af/leela96.v2.min.binpack
- /data/S5-5af/test60-2021-11-12-novdec-12tb7p.v6-dd.min.binpack
- /data/S5-5af/test77-2021-12-dec-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test78-2022-01-to-05-jantomay-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test78-2022-06-to-09-juntosep-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test79-2022-04-apr-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test79-2022-05-may-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2022-06-jun-16tb7p.v6-dd.min.unmin.binpack
- /data/S5-5af/test80-2022-07-jul-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2022-08-aug-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2022-09-sep-16tb7p.v6-dd.min.unmin.binpack
- /data/S5-5af/test80-2022-10-oct-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2022-11-nov-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2023-01-jan-16tb7p.v6-sk20.min.binpack
- /data/S5-5af/test80-2023-02-feb-16tb7p.v6-dd.min.binpack
- /data/S5-5af/test80-2023-03-mar-2tb7p.min.unmin.binpack
- /data/S5-5af/test80-2023-04-apr-2tb7p.binpack
- /data/S5-5af/test80-2023-05-may-2tb7p.min.dd.binpack
# https://github.com/official-stockfish/Stockfish/pull/4782
- /data/S6-1ee1aba5ed/test80-2023-06-jun-2tb7p.binpack
- /data/S6-1ee1aba5ed/test80-2023-07-jul-2tb7p.min.binpack
# https://github.com/official-stockfish/Stockfish/pull/4972
- /data/S8-baff1edbea57/test80-2023-08-aug-2tb7p.v6.min.binpack
- /data/S8-baff1edbea57/test80-2023-09-sep-2tb7p.binpack
- /data/S8-baff1edbea57/test80-2023-10-oct-2tb7p.binpack
# https://github.com/official-stockfish/Stockfish/pull/5056
- /data/S9-b1a57edbea57/test80-2023-11-nov-2tb7p.binpack
- /data/S9-b1a57edbea57/test80-2023-12-dec-2tb7p.binpack
num-epochs: 800
lr: 4.375e-4
gamma: 0.995
start-lambda: 1.0
end-lambda: 0.7
```
This particular net was reached at epoch 759. Use of more torch.compile decorators
in nnue-pytorch model.py than in the previous main net training run sped up training
by up to 40% on Tesla gpus when using recent pytorch compiled with cuda 12:
https://github.com/linrock/nnue-tools/blob/7fb9831/Dockerfile
Skipping positions with bestmove captures where static exchange evaluation is >= 0
is based on the implementation from Sopel's NNUE training & experimentation log:
https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY
Experiment 293 - only skip captures with see>=0
Positions with bestmove captures where score == 0 are always skipped for
compatibility with minimized binpacks, since the original minimizer sets
scores to 0 for slight improvements in compression.
The trainer branch used was:
https://github.com/linrock/nnue-pytorch/tree/r21-more-wdl-skip-15p-more-loss-high-q-skip-see-ge0-torch-compile-more
Binpacks were renamed to be sorted chronologically by default when sorted by name.
The binpack data are otherwise the same as binpacks with similar names in the prior
naming convention.
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Passed STC:
https://tests.stockfishchess.org/tests/view/65e3ddd1f2ef6c733362ae5c
LLR: 2.92 (-2.94,2.94) <0.00,2.00>
Total: 149792 W: 39153 L: 38661 D: 71978
Ptnml(0-2): 675, 17586, 37905, 18032, 698
Passed LTC:
https://tests.stockfishchess.org/tests/view/65e4d91c416ecd92c162a69b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 64416 W: 16517 L: 16135 D: 31764
Ptnml(0-2): 38, 7218, 17313, 7602, 37
closes https://github.com/official-stockfish/Stockfish/pull/5090
Bench: 1373183
Based on 130M positions from 2.1M games.
```
Look recursively in directory pgns for games from SPRT tests using books
matching "UHO_4060_v..epd|UHO_Lichess_4852_v1.epd" for SF revisions
between 8e75548f2a (from 2024-02-17
17:11:46 +0100) and HEAD (from 2024-02-17 17:13:07 +0100). Based on
127920843 positions from 2109240 games, NormalizeToPawnValue should
change from 345 to 356.
```
The patch only affects the UCI-reported cp and wdl values.
closes https://github.com/official-stockfish/Stockfish/pull/5070
No functional change
Created by retraining the previous main net `nn-baff1edbea57.nnue` with:
- some of the same options as before: ranger21, more WDL skipping
- the addition of T80 nov+dec 2023 data
- increasing loss by 15% when prediction is too high, up from 10%
- use of torch.compile to speed up training by over 25%
```yaml
experiment-name: 2560--S9-514G-T80-augtodec2023-more-wdl-skip-15p-more-loss-high-q-sk28
training-dataset:
# https://github.com/official-stockfish/Stockfish/pull/4782
- /data/S6-514G-1ee1aba5ed.binpack
- /data/test80-aug2023-2tb7p.v6.min.binpack
- /data/test80-sep2023-2tb7p.binpack
- /data/test80-oct2023-2tb7p.binpack
- /data/test80-nov2023-2tb7p.binpack
- /data/test80-dec2023-2tb7p.binpack
early-fen-skipping: 28
start-from-engine-test-net: True
nnue-pytorch-branch: linrock/nnue-pytorch/r21-more-wdl-skip-15p-more-loss-high-q-torch-compile
num-epochs: 1000
lr: 4.375e-4
gamma: 0.995
start-lambda: 1.0
end-lambda: 0.7
```
Epoch 819 trained with the above config led to this PR. Use of torch.compile
decorators in nnue-pytorch model.py was found to speed up training by at least
25% on Ampere gpus when using recent pytorch compiled with cuda 12:
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
See recent main net PRs for more info on
- ranger21 and more WDL skipping: https://github.com/official-stockfish/Stockfish/pull/4942
- increasing loss when Q is too high: https://github.com/official-stockfish/Stockfish/pull/4972
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Passed STC:
https://tests.stockfishchess.org/tests/view/65cd76151d8e83c78bfd2f52
LLR: 2.98 (-2.94,2.94) <0.00,2.00>
Total: 78336 W: 20504 L: 20115 D: 37717
Ptnml(0-2): 317, 9225, 19721, 9562, 343
Passed LTC:
https://tests.stockfishchess.org/tests/view/65ce5be61d8e83c78bfd43e9
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 41016 W: 10492 L: 10159 D: 20365
Ptnml(0-2): 22, 4533, 11071, 4854, 28
closes https://github.com/official-stockfish/Stockfish/pull/5056
Bench: 1351997
This introduces a form of node counting which can
be used to further tweak the usage of our search
time.
The current approach stops the search when almost
all nodes are searched on a single move.
The idea originally came from Koivisto, but the
implemention is a bit different, Koivisto scales
the optimal time by the nodes effort and then
determines if the search should be stopped.
We just scale down the `totalTime` and stop the
search if we exceed it and the effort is large
enough.
Passed STC:
https://tests.stockfishchess.org/tests/view/65c8e0661d8e83c78bfcd5ec
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 88672 W: 22907 L: 22512 D: 43253
Ptnml(0-2): 310, 10163, 23041, 10466, 356
Passed LTC:
https://tests.stockfishchess.org/tests/view/65ca632b1d8e83c78bfcf554
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 170856 W: 42910 L: 42320 D: 85626
Ptnml(0-2): 104, 18337, 47960, 18919, 108
closes https://github.com/official-stockfish/Stockfish/pull/5053
Bench: 1198939
This patch does similar thing to how it's done for
qsearch - in case of fail high adjust result to
lower value. Difference is that it is done only
for non-pv nodes and it's depth dependent - so
lower depth entries will have bigger adjustment
and higher depth entries will have smaller
adjustment.
Passed STC:
https://tests.stockfishchess.org/tests/view/65c3c0cbc865510db0283b21
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 112032 W: 29142 L: 28705 D: 54185
Ptnml(0-2): 479, 13152, 28326, 13571, 488
Passed LTC:
https://tests.stockfishchess.org/tests/view/65c52e62c865510db02855d5
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 132480 W: 33457 L: 32936 D: 66087
Ptnml(0-2): 67, 14697, 36222, 15156, 98
closes https://github.com/official-stockfish/Stockfish/pull/5047
Bench: 1168241
In both search and qsearch, there are instances
where we do unadjustedStaticEval = ss->staticEval
= eval/bestValue = tte->eval(), but immediately
after re-assign ss-static and eval/bestValue to
some new value, which makes the initial assignment
redundant.
closes https://github.com/official-stockfish/Stockfish/pull/5045
No functional change
- Update codeql to v3
- Switch from dev-drprasad to native github cli
- Update softprops/action-gh-release to node 20 commit
`thollander/actions-comment-pull-request` needs to
be bumped to node20 too, but the author hasnt done
so atm
closes https://github.com/official-stockfish/Stockfish/pull/5044
No functional change
Move divisor from capture scoring to good capture
check and sligthly increase it.
This has several effects:
- its a speedup because for quience and probcut
search the division now never happens. For main
search its delayed and can be avoided if a good
capture triggers a cutoff
- through the higher resolution of scores we have
a more granular sorting
STC: https://tests.stockfishchess.org/tests/view/65bf2a93c865510db027dc27
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 470016 W: 122150 L: 121173 D: 226693
Ptnml(0-2): 2133, 55705, 118374, 56644, 2152
LTC: https://tests.stockfishchess.org/tests/view/65c1d16dc865510db0281339
LLR: 2.96 (-2.94,2.94) <0.50,2.50>
Total: 98988 W: 25121 L: 24667 D: 49200
Ptnml(0-2): 77, 10998, 26884, 11464, 71
closes https://github.com/official-stockfish/Stockfish/pull/5036
Bench: 1233867
This refactors the CI workflows to group some
logic and makes sure that all (pre)release
binaries are actually tested.
The screenshot below shows the execution logic of
the reworked ci,
https://github.com/Disservin/Stockfish/actions/runs/7773581379.
You can also hover over the cards to see the
execution flow.
The `matrix.json` and `arm_matrix.json` define the
binaries which will be uploaded to GitHub.
Afterwards a matrix is created and each job
compiles a profile guided build for that arch and
uploads that as an artifact to GitHub. The
Binaries/ARM_Binaries workflow's are called when
the previous step has been completed, and uploads
all artifacts to the (pre)release.
This also fixes some indentations and renames the
workflows, see
https://github.com/official-stockfish/Stockfish/actions,
where every workflow is called `Stockfish` vs
https://github.com/Disservin/Stockfish/actions. It
also increases the parallel compilation used for
make from `-j2 to -j4`.
It now also prevents the prerelease action from
running on forks.
A test release can be viewed here
https://github.com/Disservin/Stockfish/releases.
closes https://github.com/official-stockfish/Stockfish/pull/5035
No functional change
This replaces the PvNode condition and tte Pv call previously with using
the precomputed ttPv, and also removes the multiplier of 2. This new
depth condition occurs with approximately equal frequency (47%) to the
old depth condition (measured when the other conditions in the if are
true), so non-linear scaling behaviour isn't expected.
Passed Non-Reg STC:
https://tests.stockfishchess.org/tests/view/65b0e132c865510db026da27
LLR: 2.97 (-2.94,2.94) <-1.75,0.25>
Total: 243232 W: 62432 L: 62437 D: 118363
Ptnml(0-2): 910, 28937, 61900, 28986, 883
Passed Non-Reg LTC:
https://tests.stockfishchess.org/tests/view/65b2053bc865510db026eea1
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 190596 W: 47666 L: 47618 D: 95312
Ptnml(0-2): 115, 21710, 51596, 21766, 111
closes https://github.com/official-stockfish/Stockfish/pull/5015
Bench: 1492957
This splits the logic of search and perft. Before, threads were started,
which then constructed a search object, which then started perft and
returned immediately. All of this is unnecessary, instead uci should
start perft right away.
closes https://github.com/official-stockfish/Stockfish/pull/5008
No functional change
Update the internal WDL model. After the dual net merge, the internal
evaluations have drifted upwards a bit. With this PR
`NormalizeToPawnValue` changes from `328` to `345`.
The new model was fitted based on about 200M positions extracted from
3.4M fishtest LTC games from the last two weeks, involving SF versions
from 6deb88728f to current master.
Apart from the WDL model parameter update, this PR implements the
following changes:
WDL Model:
- an incorrect 8-move shift in master's WDL model has been fixed
- the polynomials `p_a` and `p_b` are fitted over the move range [8, 120]
- the coefficients for `p_a` and `p_b` are optimized by maximizing the
probability of predicting the observed outcome (credits to @vondele)
SF code:
- for wdl values, move will be clamped to `max(8, min(120, move))`
- no longer clamp the internal eval to [-4000,4000]
- compute `NormalizeToPawnValue` with `round`, not `trunc`
The PR only affects displayed `cp` and `wdl` values.
closes https://github.com/official-stockfish/Stockfish/pull/5002
No functional change
The idea of this is to unroll the futility_margin calculation to allow
for the improving flag to have a greater effect on the futility margin.
The current factor is 1.5 instead of the previous 1 resulting in a
deduction of an extra margin/2 from futilit_margin if improving. The
chosen value was not tuned, meaning that there is room for tweaking it.
This patch is partially inspired by @Vizvezdenec, who, although quite
different in execution, tested another idea where the futility_margin is
lowered further when improving [1].
[1]: (first take) https://tests.stockfishchess.org/tests/view/65a56b1879aa8af82b97164b
Passed STC:
https://tests.stockfishchess.org/tests/live_elo/65a8bfc179aa8af82b974e3c
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 161152 W: 41321 L: 40816 D: 79015
Ptnml(0-2): 559, 19030, 40921, 19479, 587
Passed rebased LTC:
https://tests.stockfishchess.org/tests/live_elo/65a8b9ef79aa8af82b974dc0
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 96024 W: 24172 L: 23728 D: 48124
Ptnml(0-2): 56, 10598, 26275, 11012, 71
closes https://github.com/official-stockfish/Stockfish/pull/5000
Bench: 1281703
This addresses the issue where Stockfish may output non-proven checkmate
scores if the search is prematurely halted, either due to a time control
or node limit, before it explores other possibilities where the
checkmate score could have been delayed or refuted.
The fix also replaces staving off from proven mated scores in a
multithread environment making use of the threads instead of a negative
effect with multithreads (1t was better in proving mated in scores than
more threads).
Issue reported on mate tracker repo by and this PR is co-authored with
@robertnurnberg Special thanks to @AndyGrant for outlining that a fix is
eventually possible.
Passed Adj off SMP STC:
https://tests.stockfishchess.org/tests/view/65a125d779aa8af82b96c3eb
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 303256 W: 75823 L: 75892 D: 151541
Ptnml(0-2): 406, 35269, 80395, 35104, 454
Passed Adj off SMP LTC:
https://tests.stockfishchess.org/tests/view/65a37add79aa8af82b96f0f7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 56056 W: 13951 L: 13770 D: 28335
Ptnml(0-2): 11, 5910, 16002, 6097, 8
Passed all tests in matetrack without any better mate for opponent found in 1t and multithreads.
Fixed bugs in https://github.com/official-stockfish/Stockfish/pull/4976
closes https://github.com/official-stockfish/Stockfish/pull/4990
Bench: 1308279
Co-Authored-By: Robert Nürnberg <28635489+robertnurnberg@users.noreply.github.com>
Also remove dead code, `rootSimpleEval` is no longer used since the introduction of dual net.
`iterBestValue` is also no longer used in evaluate and can be reduced to a local variable.
closes https://github.com/official-stockfish/Stockfish/pull/4979
No functional change
This aims to remove some of the annoying global structure which Stockfish has.
Overall there is no major elo regression to be expected.
Non regression SMP STC (paused, early version):
https://tests.stockfishchess.org/tests/view/65983d7979aa8af82b9608f1
LLR: 0.23 (-2.94,2.94) <-1.75,0.25>
Total: 76232 W: 19035 L: 19096 D: 38101
Ptnml(0-2): 92, 8735, 20515, 8690, 84
Non regression STC (early version):
https://tests.stockfishchess.org/tests/view/6595b3a479aa8af82b95da7f
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 185344 W: 47027 L: 46972 D: 91345
Ptnml(0-2): 571, 21285, 48943, 21264, 609
Non regression SMP STC:
https://tests.stockfishchess.org/tests/view/65a0715c79aa8af82b96b7e4
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 142936 W: 35761 L: 35662 D: 71513
Ptnml(0-2): 209, 16400, 38135, 16531, 193
These global structures/variables add hidden dependencies and allow data
to be mutable from where it shouldn't it be (i.e. options). They also
prevent Stockfish from internal selfplay, which would be a nice thing to
be able to do, i.e. instantiate two Stockfish instances and let them
play against each other. It will also allow us to make Stockfish a
library, which can be easier used on other platforms.
For consistency with the old search code, `thisThread` has been kept,
even though it is not strictly necessary anymore. This the first major
refactor of this kind (in recent time), and future changes are required,
to achieve the previously described goals. This includes cleaning up the
dependencies, transforming the network to be self contained and coming
up with a plan to deal with proper tablebase memory management (see
comments for more information on this).
The removal of these global structures has been discussed in parts with
Vondele and Sopel.
closes https://github.com/official-stockfish/Stockfish/pull/4968
No functional change
Created by retraining the previous main net nn-b1e55edbea57.nnue with:
- some of the same options as before: ranger21 optimizer, more WDL
skipping
- adding T80 aug filter-v6, sep, and oct 2023 data to the previous best
dataset
- increasing training loss for positions where predicted win rates were
higher than estimated match results from training data position scores
```yaml
experiment-name: 2560--S8-r21-more-wdl-skip-10p-more-loss-high-q-sk28
training-dataset:
# https://github.com/official-stockfish/Stockfish/pull/4782
- /data/S6-1ee1aba5ed.binpack
- /data/test80-aug2023-2tb7p.v6.min.binpack
- /data/test80-sep2023-2tb7p.binpack
- /data/test80-oct2023-2tb7p.binpack
early-fen-skipping: 28
start-from-engine-test-net: True
nnue-pytorch-branch: linrock/nnue-pytorch/r21-more-wdl-skip-10p-more-loss-high-q
num-epochs: 1000
lr: 4.375e-4
gamma: 0.995
start-lambda: 1.0
end-lambda: 0.7
```
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Training loss was increased by 10% for positions where predicted win
rates were higher than suggested by the win rate model based on the
training data, by multiplying with: ((qf > pt) * 0.1 + 1). This was a
variant of experiments from Sopel's NNUE training & experimentation log:
https://docs.google.com/document/d/1gTlrr02qSNKiXNZ_SuO4-RjK4MXBiFlLE6jvNqqMkAY
Experiment 302 - increase loss when prediction too high, vondele’s idea
Experiment 309 - increase loss when prediction too high, normalize in a
batch
Passed STC:
https://tests.stockfishchess.org/tests/view/6597a21c79aa8af82b95fd5c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 148320 W: 37960 L: 37475 D: 72885
Ptnml(0-2): 542, 17565, 37383, 18206, 464
Passed LTC:
https://tests.stockfishchess.org/tests/view/659834a679aa8af82b960845
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 55188 W: 13955 L: 13592 D: 27641
Ptnml(0-2): 34, 6162, 14834, 6535, 29
closes https://github.com/official-stockfish/Stockfish/pull/4972
Bench: 1219824
Created by training an L1-128 net from scratch with a wider range of
evals in the training data and wld-fen-skipping disabled during
training. The differences in this training data compared to the first
dual nnue PR are:
- removal of all positions with 3 pieces
- when piece count >= 16, keep positions with simple eval above 750
- when piece count < 16, remove positions with simple eval above 3000
The asymmetric data filtering was meant to flatten the training data
piece count distribution, which was previously heavily skewed towards
positions with low piece counts.
Additionally, the simple eval range where the smallnet is used was
widened to cover more positions previously evaluated by the big net and
simple eval.
```yaml
experiment-name: 128--S1-hse-S7-v4-S3-v1-no-wld-skip
training-dataset:
- /data/hse/S3/leela96-filt-v2.min.high-simple-eval-1k.binpack
- /data/hse/S3/dfrc99-16tb7p-eval-filt-v2.min.high-simple-eval-1k.binpack
- /data/hse/S3/test80-apr2022-16tb7p.min.high-simple-eval-1k.binpack
- /data/hse/S7/test60-2020-2tb7p.v6-3072.high-simple-eval-v4.binpack
- /data/hse/S7/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test77-nov2021-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
- /data/hse/S7/test77-dec2021-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test77-jan2022-2tb7p.high-simple-eval-v4.binpack
- /data/hse/S7/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test79-apr2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test79-may2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-may2022-16tb7p.high-simple-eval-v4.binpack
- /data/hse/S7/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-jul2022-16tb7p.v6-dd.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-oct2022-16tb7p.v6-dd.high-simple-eval-v4.binpack
- /data/hse/S7/test80-nov2022-16tb7p-v6-dd.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
- /data/hse/S7/test80-mar2023-2tb7p.v6-sk16.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-apr2023-2tb7p-filter-v6-sk16.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-may2023-2tb7p.v6.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-jun2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-jul2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-aug2023-2tb7p.v6.min.high-simple-eval-v4.binpack
- /data/hse/S7/test80-sep2023-2tb7p.high-simple-eval-v4.binpack
- /data/hse/S7/test80-oct2023-2tb7p.high-simple-eval-v4.binpack
wld-fen-skipping: False
start-from-engine-test-net: False
nnue-pytorch-branch: linrock/nnue-pytorch/L1-128
engine-test-branch: linrock/Stockfish/L1-128-nolazy
engine-base-branch: linrock/Stockfish/L1-128
num-epochs: 500
start-lambda: 1.0
end-lambda: 1.0
```
Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py
Binpacks interleaved at training time with:
https://github.com/official-stockfish/nnue-pytorch/pull/259
FT weights permuted with 10k positions from fishpack32.binpack with:
https://github.com/official-stockfish/nnue-pytorch/pull/254
Data filtered for high simple eval positions (v4) with:
https://github.com/linrock/Stockfish/blob/b9c8440/src/tools/transform.cpp#L640-L675
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move of
L1-128 smallnet (nnue-only eval) vs. L1-128 trained on standard S1 data:
nn-epoch319.nnue : -241.7 +/- 3.2
Passed STC vs. 36db936:
https://tests.stockfishchess.org/tests/view/6576b3484d789acf40aabbfe
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 21920 W: 5680 L: 5381 D: 10859
Ptnml(0-2): 82, 2488, 5520, 2789, 81
Passed LTC vs. DualNNUE #4915:
https://tests.stockfishchess.org/tests/view/65775c034d789acf40aac7e3
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 147606 W: 36619 L: 36063 D: 74924
Ptnml(0-2): 98, 16591, 39891, 17103, 120
closes https://github.com/official-stockfish/Stockfish/pull/4919
Bench: 1438336
Add a `.git-blame-ignore-revs` file which can be used to skip specified
commits when blaming, this is useful to ignore formatting commits, like
clang-format #4790.
Github blame automatically supports this file format, as well as other
third party tools. Git itself needs to be told about the file name to
work, the following command will add it to the current git repo. `git
config blame.ignoreRevsFile .git-blame-ignore-revs`, alternatively one
has to specify it with every blame. `git blame --ignore-revs-file
.git-blame-ignore-revs search.cpp`
Supported since git 2.23.
closes https://github.com/official-stockfish/Stockfish/pull/4969
No functional change
Only Direction type is using two of the enable overload macros.
Aside from this, only two of the overloads are even being used.
Therefore, we can just define the needed overloads and remove the macros.
closes https://github.com/official-stockfish/Stockfish/pull/4966
No functional change.
Remove a redundant int cast in the calculation of fwdOut. The variable
OutputType is already defined as std::int32_t, which is an integer type, making
the cast unnecessary.
closes https://github.com/official-stockfish/Stockfish/pull/4961
No functional change
The primary rationale behind this lies in the fact that enums were not
originally designed to be employed in the manner we currently utilize them.
The Value enum was used like a type alias throughout the code and was often
misused. Furthermore, changing the underlying size of the enum to int16_t broke
everything, mostly because of the operator overloads for the Value enum, were
causing data to be truncated. Since Value is now a type alias, the operator
overloads are no longer required.
Passed Non-Regression STC:
https://tests.stockfishchess.org/tests/view/6593b8bb79aa8af82b95b401
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 235296 W: 59919 L: 59917 D: 115460
Ptnml(0-2): 743, 27085, 62054, 26959, 807
closes https://github.com/official-stockfish/Stockfish/pull/4960
No functional change
As some have noticed, a security alert has been complaining about a for loop in
our TB code for quite some now. Though it was never a real issue, so not of high
importance.
A few lines earlier the symlen vector is resized
`d->symlen.resize(number<uint16_t, LittleEndian>(data));` while this code seems
odd at first, it resizes the array to at most (2 << 16) - 1 elements, basically
making the infinite loop issue impossible to occur.
closes https://github.com/official-stockfish/Stockfish/pull/4953
No functional change
Idea from Caissa (https://github.com/Witek902/Caissa) chess engine.
With given pawn structure collect data with how often search result and by how
much it was better / worse than static evalution of position and use it to
adjust static evaluation of positions with given pawn structure. Details:
1. excludes positions with fail highs and moves producing it being a capture;
2. update value is function of not only difference between best value and static
evaluation but also is multiplied by linear function of depth;
3. maximum update value is maximum value of correction history divided by 2;
4. correction history itself is divided by 32 when applied so maximum value of
static evaluation adjustment is 32 internal units.
Passed STC:
https://tests.stockfishchess.org/tests/view/658fc7b679aa8af82b955cac
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 128672 W: 32757 L: 32299 D: 63616
Ptnml(0-2): 441, 15241, 32543, 15641, 470
Passed LTC:
https://tests.stockfishchess.org/tests/view/65903f6979aa8af82b9566f1
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 97422 W: 24626 L: 24178 D: 48618
Ptnml(0-2): 41, 10837, 26527, 11245, 61
closes https://github.com/official-stockfish/Stockfish/pull/4950
Bench: 1157852
For developing an Android GUI it can be helpful to use the Emulator on Windows.
Therefor an android_x86-64 library of Stockfish is needed. It would be nice to
compile it "out-of-the-box".
This change is originally suggested by Craftyawesome
closes https://github.com/official-stockfish/Stockfish/pull/4927
No functional change
This fixes futility pruning return values after recent tweaks, `eval` is
guaranteed to be less than the mate-in range but it can be as low value such
that the average between eval and beta can still fall in the mated-in range when
beta is as low in mated range. i.e. (eval + beta) / 2 being at mated-range which
can break mates.
Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/658f3eed79aa8af82b955139
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 117408 W: 29891 L: 29761 D: 57756
Ptnml(0-2): 386, 13355, 31120, 13429, 414
Passed non-regression LTC:
https://tests.stockfishchess.org/tests/view/658f8b7a79aa8af82b9557bd
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 60240 W: 14962 L: 14786 D: 30492
Ptnml(0-2): 22, 6257, 17390, 6425, 26
changes signature at higher depth e.g. `128 1 15`
closes https://github.com/official-stockfish/Stockfish/pull/4944
Bench: 1304666
Instead of returning strict fail soft fail high return value between value from
search and beta (somewhat by analogy to futility pruning and probcut).
This seems to be somewhat depth sensitive heuristic which performed much worse
at LTC while performing much better at STC if it is more aggressive, passed
version is the least aggressive one.
Passed STC:
https://tests.stockfishchess.org/tests/view/657b06414d789acf40ab1475
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 212352 W: 53900 L: 53315 D: 105137
Ptnml(0-2): 809, 25236, 53520, 25783, 828
Passed LTC:
https://tests.stockfishchess.org/tests/view/657ce36f393ac02e79120a7c
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 319362 W: 79541 L: 78630 D: 161191
Ptnml(0-2): 202, 35839, 86709, 36708, 223
closes https://github.com/official-stockfish/Stockfish/pull/4928
Bench: 974739
Sometimes if we count the reported PV length, it turns out to be longer than the
selective depth reported. This fixes this behavior by applying the selective
depth to qsearch since we do report PVs from it as well.
Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/656cf5b66980e15f69c7499d
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 223648 W: 56372 L: 56356 D: 110920
Ptnml(0-2): 710, 25580, 59231, 25590, 713
closes https://github.com/official-stockfish/Stockfish/pull/4903
No functional change
We use following line to clamp the search depth in some range:
Depth d = std::clamp(newDepth - r, 1, newDepth + 1);
Through negative extension its possible that the maximum value becomes smaller than the minimum value but then the behavior is undefined (see https://en.cppreference.com/w/cpp/algorithm/clamp). So replace this line with a safe implementation.
Remark:
We have in recent master already one line where up to 3 negative extensions are possible which could trigger this undefined behavior but this can only be happen for completed depth > 24 so its not discovered by our default bench. Recent negative extension tests by @fauzi shows then this undefined behavior with wrong bench numbers.
closes https://github.com/official-stockfish/Stockfish/pull/4877
No functional change
- updates the SDE action to v2.2
- removes the linux x86-32 builds, which were almost unused,
and the build process under SDE started failing recently,
possibly related to glibc update (The futex facility returned an unexpected error code.)
closes https://github.com/official-stockfish/Stockfish/pull/4875
No functional change
Huge credit goes also to candirufish,
as the idea was first tried by him, and then tuned by me at multiple phases.
Tweaking the futility pruning formula to be a bit more selective about when pruning is applied.
Adjust the value added to the static eval based on the bestValue relative to ss->staticEval. If bestValue is significantly lower, we add a larger value.
Passed STC:
LLR: 2.98 (-2.94,2.94) <0.00,2.00>
Total: 37120 W: 9590 L: 9266 D: 18264
Ptnml(0-2): 130, 4301, 9385, 4603, 141
https://tests.stockfishchess.org/tests/view/6544cf90136acbc573523247
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 49632 W: 12381 L: 12033 D: 25218
Ptnml(0-2): 30, 5429, 13549, 5779, 29
https://tests.stockfishchess.org/tests/view/65453bc1136acbc573523a3c
closes https://github.com/official-stockfish/Stockfish/pull/4861
bench: 1107118
Corrects some incorrect or outdated comments.
Credit is shared with yaneurao (see 38e830a#commitcomment-131131500) and locutus2
closes#4852
No functional change.
Original idea by Seer chess engine https://github.com/connormcmonigle/seer-nnue,
coding done by @Disservin, code refactoring done by @locutus2 to match the style
of other histories.
This patch introduces pawn structure based history, which assings moves values
based on last digits of pawn structure hash and piece type of moved piece and
landing square of the move. Idea is that good places for pieces are quite often
determined by pawn structure of position. Used in 3 different places
- sorting of quiet moves, sorting of quiet check evasions and in history based
pruning in search.
Passed STC:
https://tests.stockfishchess.org/tests/view/65391d08cc309ae83955dbaf
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 155488 W: 39408 L: 38913 D: 77167
Ptnml(0-2): 500, 18427, 39408, 18896, 513
Passed LTC:
https://tests.stockfishchess.org/tests/view/653a36a2cc309ae83955f181
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 70110 W: 17548 L: 17155 D: 35407
Ptnml(0-2): 33, 7859, 18889, 8230, 44
closes https://github.com/official-stockfish/Stockfish/pull/4849
Bench: 1257882
Co-Authored-By: Disservin <disservin.social@gmail.com>
Co-Authored-By: Stefan Geschwentner <locutus2@users.noreply.github.com>
- remove the blank line between the declaration of the function and it's
comment, leads to better IDE support when hovering over a function to see it's
description
- remove the unnecessary duplication of the function name in the functions
description
- slightly refactored code for lsb, msb in bitboard.h There are still a few
things we can be improved later on, move the description of a function where
it was declared (instead of implemented) and add descriptions to functions
which are behind macros ifdefs
closes https://github.com/official-stockfish/Stockfish/pull/4840
No functional change
Performance improvement for the shell commands in the Makefile.
By using expanded variables, the shell commands are only
evaluated once, instead of every time they are used.
closes https://github.com/official-stockfish/Stockfish/pull/4838
No functional change
This introduces clang-format to enforce a consistent code style for Stockfish.
Having a documented and consistent style across the code will make contributing easier
for new developers, and will make larger changes to the codebase easier to make.
To facilitate formatting, this PR includes a Makefile target (`make format`) to format the code,
this requires clang-format (version 17 currently) to be installed locally.
Installing clang-format is straightforward on most OS and distros
(e.g. with https://apt.llvm.org/, brew install clang-format, etc), as this is part of quite commonly
used suite of tools and compilers (llvm / clang).
Additionally, a CI action is present that will verify if the code requires formatting,
and comment on the PR as needed. Initially, correct formatting is not required, it will be
done by maintainers as part of the merge or in later commits, but obviously this is encouraged.
fixes https://github.com/official-stockfish/Stockfish/issues/3608
closes https://github.com/official-stockfish/Stockfish/pull/4790
Co-Authored-By: Joost VandeVondele <Joost.VandeVondele@gmail.com>
This patch is a simplification and a fix to dealing with null moves scores that returns proven mates or TB scores by preventing 'null move pruning' if the nullvalue is in that range.
Current solution downgrades nullValues on the non-PV node but the value can be used in a transposed PV-node to the same position afterwards (Triangulation), the later is prone to propagate a wrong score (96.05) to root that will not be refuted unless we search further.
Score of (96.05) can be obtained be two methods,
maxim static-eval returned on Pv update (mostly qSearch)
this downgrade (clamp) in NMP
and theoretically can happen with or without TBs but the second scenario is more dangerous than the first.
This fixes the reproducible case in very common scenarios with TBs as shown in the debugging at discord.
Fixes: #4699
Passed STC:
https://tests.stockfishchess.org/tests/view/64c1eca8dc56e1650abba6f9
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 670048 W: 171132 L: 171600 D: 327316
Ptnml(0-2): 2134, 75687, 179820, 75279, 2104
Passed LTC:
https://tests.stockfishchess.org/tests/view/64c5e130dc56e1650abc0438
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 92868 W: 23642 L: 23499 D: 45727
Ptnml(0-2): 52, 9509, 27171, 9648, 54
closes https://github.com/official-stockfish/Stockfish/pull/4715
Bench: 1327410
After removing classic evaluation VALUE_KNOWN_WIN is not anymore returned explicit evaluation. So remove and replace it with VALUE_TB_WIN_IN_MAX_PLY.
Measurement on my big bench (bench 16 1 16 pos1000.fen) verifies that at least with current net the calculated evaluation lies always in the open interval (-VALUE_KNOWN_WIN, VALUE_KNOWN_WIN).
So i consider this a non-functional change. But to be safe i tested this also at LTC as requested by Stephane Nicolet.
STC:
https://tests.stockfishchess.org/tests/view/64f9db40eaf01be8259a6ed5
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 455296 W: 115981 L: 116217 D: 223098
Ptnml(0-2): 1415, 50835, 123420, 50527, 1451
LTC:
https://tests.stockfishchess.org/tests/view/650bfd867ca0d3f7bbf25feb
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 35826 W: 9170 L: 8973 D: 17683
Ptnml(0-2): 12, 3523, 10645, 3722, 11
closes https://github.com/official-stockfish/Stockfish/pull/4792
Bench: 1603079
in the case of avx512 and vnni512 archs.
Up to 17% speedup, depending on the compiler, e.g.
```
AMD pro 7840u (zen4 phoenix apu 4nm)
bash bench_parallel.sh ./stockfish_avx512_gcc13 ./stockfish_avx512_pr_gcc13 20 10
sf_base = 1077737 +/- 8446 (95%)
sf_test = 1264268 +/- 8543 (95%)
diff = 186531 +/- 4280 (95%)
speedup = 17.308% +/- 0.397% (95%)
```
Prior to this patch, it appears gcc spills registers.
closes https://github.com/official-stockfish/Stockfish/pull/4796
No functional change
The commit adds a CI workflow that uses the included-what-you-use (IWYU)
tool to check for missing or superfluous includes in .cpp files and
their corresponding .h files. This means that some .h files (especially
in the nnue folder) are not checked yet.
The CI setup looks like this:
- We build IWYU from source to include some yet unreleased fixes.
This IWYU version targets LLVM 17. Thus, we get the latest release
candidate of LLVM 17 from LLVM's nightly packages.
- The Makefile now has an analyze target that just build the object
files (without linking)
- The CI uses the analyze target with the IWYU tool as compiler to
analyze the compiled .cpp file and its corresponding .h file.
- If IWYU suggests a change the build fails (-Xiwyu --error).
- To avoid false positives we use LLVM's libc++ as standard library
- We have a custom mappings file that adds some mappings that are
missing in IWYU's default mappings
We also had to add one IWYU pragma to prevent a false positive in
movegen.h.
https://github.com/official-stockfish/Stockfish/pull/4783
No functional change
deal with the general case
About a 8.6% speedup (for general arch)
Results for 200 tests for each version:
Base Test Diff
Mean 141741 153998 -12257
StDev 2990 3042 3742
p-value: 0.999
speedup: 0.086
closes https://github.com/official-stockfish/Stockfish/pull/4786
No functional change
The commit removes all uses of ICC's __INTEL_COMPILER macro and other
references to ICC. It also adds ICX info to the compiler command and
fixes two typos in Makefile's help output.
closes https://github.com/official-stockfish/Stockfish/pull/4769
No functional change
Created by retraining the master net on a dataset composed by:
- adding Leela data from T60 jul-dec 2020, T77 nov 2021, T80 jun-jul 2023
- deduplicating and unminimizing parts of the dataset before interleaving
Trained initially with max epoch 800, then increased near the end of training
twice. First to 960, then 1200. After training, post-processing involved:
- greedy permuting L1 weights with https://github.com/official-stockfish/Stockfish/pull/4620
- greedy 2- and 3- cycle permuting with https://github.com/official-stockfish/Stockfish/pull/4640
python3 easy_train.py \
--experiment-name 2048-retrain-S6-sk28 \
--training-dataset /data/S6.binpack \
--early-fen-skipping 28 \
--start-from-engine-test-net True \
--max_epoch 1200 \
--lr 4.375e-4 \
--gamma 0.995 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--tui False \
--seed $RANDOM \
--gpus 0
In the list of datasets below, periods in the filename represent the sequence of
steps applied to arrive at the particular binpack. For example:
test77-dec2021-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
1. test77 dec2021 data rescored with 16 TB of syzygy tablebases during data conversion
2. filtered with csv_filter_v6_dd.py - v6 filtering and deduplication in one step
3. minimized with the original mar2023 implementation of `minimize_binpack` in
the tools branch
4. unminimized by removing all positions with score == 32002 (`VALUE_NONE`)
Binpacks were:
- filtered with: https://github.com/linrock/nnue-data
- unminimized with: https://github.com/linrock/Stockfish/tree/tools-unminify
- deduplicated with: https://github.com/linrock/Stockfish/tree/tools-dd
DATASETS=(
leela96-filt-v2.min.unminimized.binpack
dfrc99-16tb7p-eval-filt-v2.min.unminimized.binpack
# most of the 0dd1cebea57 v6-dd dataset (without test80-jul2022)
# https://github.com/official-stockfish/Stockfish/pull/4606
test60-novdec2021-12tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test77-dec2021-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test78-jantomay2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test78-juntosep2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test79-apr2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test79-may2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test80-jun2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test80-aug2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test80-sep2022-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test80-oct2022-16tb7p.filter-v6-dd.min.binpack
test80-nov2022-16tb7p.filter-v6-dd.min.binpack
test80-jan2023-3of3-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
test80-feb2023-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
# older Leela data, recently converted
test60-octnovdec2020-2tb7p.min.unminimized.binpack
test60-julaugsep2020-2tb7p.min.binpack
test77-nov2021-2tb7p.min.dd.binpack
# newer Leela data
test80-mar2023-2tb7p.min.unminimized.binpack
test80-apr2023-2tb7p.filter-v6-sk16.min.unminimized.binpack
test80-may2023-2tb7p.min.dd.binpack
test80-jun2023-2tb7p.min.binpack
test80-jul2023-2tb7p.binpack
)
python3 interleave_binpacks.py ${DATASETS[@]} /data/S6.binpack
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch1059 : 2.7 +/- 1.6
Passed STC:
https://tests.stockfishchess.org/tests/view/64fc8d705dab775b5359db42
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 168352 W: 43216 L: 42704 D: 82432
Ptnml(0-2): 599, 19672, 43134, 20160, 611
Passed LTC:
https://tests.stockfishchess.org/tests/view/64fd44a75dab775b5359f065
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 154194 W: 39436 L: 38881 D: 75877
Ptnml(0-2): 78, 16577, 43238, 17120, 84
closes https://github.com/official-stockfish/Stockfish/pull/4782
Bench: 1603079
Add more static checks regarding the SIMD width match.
STC: https://tests.stockfishchess.org/tests/view/64f5c568a9bc5a78c669e70e
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 125216 W: 31756 L: 31636 D: 61824
Ptnml(0-2): 327, 13993, 33848, 14113, 327
Fixes a bug introduced in 2f2f45f, where with AVX-512 the weights and input to
the last layer were being read out of bounds. Now AVX-512 is only used for the
layers it can be used for. Additional static assertions have been added to
prevent more errors like this in the future.
closes https://github.com/official-stockfish/Stockfish/pull/4773
No functional change
This is a cleanup PR that prepares the automatic checking of missing or
superfluous #include directives via the include-what-you-use (IWYU) tool
on the CI. Unfortunately, IWYU proposes additional includes for
"namespace std" although we don't need them.
To avoid the problem, the commit removes all "using namespace std"
statements from the code and directly uses the std:: prefix instead.
Alternatively, we could add specific usings (e.g. "using std::string")
foreach used type. Also, a mix of both approaches would be possible.
I decided for the prefix approach because most of the files were already
using the std:: prefixes despite the "using namespace std".
closes https://github.com/official-stockfish/Stockfish/pull/4772
No functional change
This patch implements the pure materialistic evaluation called simple_eval()
to gain a speed-up during Stockfish search.
We use the so-called lazy evaluation trick: replace the accurate but slow
NNUE network evaluation by the super-fast simple_eval() if the position
seems to be already won (high material advantage). To guard against some
of the most obvious blunders introduced by this idea, this patch uses the
following features which will raise the lazy evaluation threshold in some
situations:
- avoid lazy evals on shuffling branches in the search tree
- avoid lazy evals if the position at root already has a material imbalance
- avoid lazy evals if the search value at root is already winning/losing.
Moreover, we add a small random noise to the simple_eval() term. This idea
(stochastic mobility in the minimax tree) was worth about 200 Elo in the pure
simple_eval() player on Lichess.
Overall, the current implementation in this patch evaluates about 2% of the
leaves in the search tree lazily.
--------------------------------------------
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 60352 W: 15585 L: 15234 D: 29533
Ptnml(0-2): 216, 6906, 15578, 7263, 213
https://tests.stockfishchess.org/tests/view/64f1d9bcbd9967ffae366209
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 35106 W: 8990 L: 8678 D: 17438
Ptnml(0-2): 14, 3668, 9887, 3960, 24
https://tests.stockfishchess.org/tests/view/64f25204f5b0c54e3f04c0e7
verification run at VLTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 74362 W: 19088 L: 18716 D: 36558
Ptnml(0-2): 6, 7226, 22348, 7592, 9
https://tests.stockfishchess.org/tests/view/64f2ecdbf5b0c54e3f04d3ae
All three tests above were run with adjudication off, we also verified that
there was no regression on matetracker (thanks Disservin!).
----------------------------------------------
closes https://github.com/official-stockfish/Stockfish/pull/4771
Bench: 1393714
To enhance code clarity and prevent potential confusion with the
'r' variable assigned to reduction later in the code, this pull
request renames it to 'reductionScale' when we use the same name
in the reduction() function.
Using distinct variable names for separate functions improves code
readability and maintainability.
closes https://github.com/official-stockfish/Stockfish/pull/4765
No functional change
The UCI protocol is rather technical and has little value in our README. Instead
it should be explained in our wiki. "Contributing" is moved above "Compiling
Stockfish" to make it more prominent.
Also move the CONTRIBUTING.md into the root directory and include it in the
distributed artifacts/releases.
closes https://github.com/official-stockfish/Stockfish/pull/4766
No functional change
This patch decays a little the evaluation (up to a few percent) for
positions which have a large complexity measure (material imbalance,
positional compensations, etc).
This may have nice consequences on the playing style, as it modifies
the search differently for attack and defense, both effects being
desirable:
- to see the effect on positions when Stockfish is defending, let us
suppose for instance that the side to move is Stockfish and the nnue
evaluation on the principal variation is -100 : this patch will decay
positions with an evaluation of -103 (say) to the same level, provided
they have huge material imbalance or huge positional compensation.
In other words, chaotic positions with an evaluation of -103 are now
comparable in our search tree to stable positions with an evaluation
of -100, and chaotic positions with an evaluation of -102 are now
preferred to stable positions with an evaluation of -100.
- the effect on positions when Stockfish is attacking is the opposite.
Let us suppose for instance that the side to move is Stockfish and the
nnue evaluation on the principal variation is +100 : this patch will
decay the evaluation to +97 if the positions on the principal variation
have huge material imbalance or huge positional compensation. In other
words, stable positions with an evaluation of +97 are now comparable
in our search tree to chaotic positions with an evaluation of +100,
and stable positions with an evaluation of +98 are now preferred to
chaotic positions with an evaluation of +100.
So the effect of this small change of evaluation on the playing style
is that Stockfish should now play a little bit more turbulent when
defending, and choose slightly simpler lines when attacking.
passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 268448 W: 68713 L: 68055 D: 131680
Ptnml(0-2): 856, 31514, 68943, 31938, 973
https://tests.stockfishchess.org/tests/view/64e252bb99700912526653ed
passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 141060 W: 36066 L: 35537 D: 69457
Ptnml(0-2): 71, 15179, 39522, 15666, 92
https://tests.stockfishchess.org/tests/view/64e4447a9009777747553725
closes https://github.com/official-stockfish/Stockfish/pull/4762
Bench: 1426295
Increase reduction on retrying a move we just retreated that falls in a repetition:
if current move can be the same move from previous previous turn then we retreated
that move on the previous turn, this patch increases reduction if retrying that move
results in a repetition.
How to continue from there? Maybe we some variants of this idea could bring Elo too
(only testing the destination square, or triangulations, etc.)
Passed STC:
https://tests.stockfishchess.org/tests/view/64e1aede883cbb7cbd9ad976
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 424000 W: 108675 L: 107809 D: 207516
Ptnml(0-2): 1296, 47350, 113896, 48108, 1350
Passed LTC:
https://tests.stockfishchess.org/tests/view/64e32d629970091252666872
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 89682 W: 22976 L: 22569 D: 44137
Ptnml(0-2): 39, 8843, 26675, 9240, 44
closes https://github.com/official-stockfish/Stockfish/pull/4757
bench: 1574347
Squared numbers are never negative, so barring any wraparound there
is no need to clamp to 0. From reading the code, there's no obvious
way to get wraparound, so the entire operation can be simplified
away. Updated original truncated code comments to be sensible.
Verified by running ./stockfish bench 128 1 24 and by the following test:
STC: https://tests.stockfishchess.org/tests/view/64da4db95b17f7c21c0eabe7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 60224 W: 15425 L: 15236 D: 29563
Ptnml(0-2): 195, 6576, 16382, 6763, 196
closes https://github.com/official-stockfish/Stockfish/pull/4751
No functional change
Current master fails to compile for ARMv8 on Raspi cause gcc (version 10.2.1)
does not like to cast between signed and unsigned vector types. This patch
fixes it by using unsigned vector pointer for ARM to avoid implicite cast.
closes https://github.com/official-stockfish/Stockfish/pull/4752
No functional change
a) Add further tests to CI to cover most features. This uncovered a potential race
in case setoption was sent between two searches. As the UCI protocol requires
this sent to be went the engine is not searching, setoption now ensures that
this is the case.
b) Remove some unused code
closes https://github.com/official-stockfish/Stockfish/pull/4730
No functional change
If an incorrect network file is present at the start of the compilation stage, the
Makefile script now correctly removes it before trying to download a clean version.
closes https://github.com/official-stockfish/Stockfish/pull/4726
No functional change
Based on vondele's deletepsqt branch:
https://github.com/vondele/Stockfish/commit/369f5b051
This huge simplification uses a weighted material differences instead of
the positional piece square tables (psqt) in the semi-classical complexity
calculation. Tuned weights using spsa at 45+0.45 with:
int pawnMult = 100;
int knightMult = 325;
int bishopMult = 350;
int rookMult = 500;
int queenMult = 900;
TUNE(SetRange(0, 200), pawnMult);
TUNE(SetRange(0, 650), knightMult);
TUNE(SetRange(0, 700), bishopMult);
TUNE(SetRange(200, 800), rookMult);
TUNE(SetRange(600, 1200), queenMult);
The values obtained via this tuning session were for a model where
the psqt replacement formula was always from the point of view of White,
even if the side to move was Black. We re-used the same values for an
implementation with a psqt replacement from the point of view of the side
to move, testing the result both on our standard book on positions with
a strong White bias, and an alternate book with positions with a strong
Black bias.
We note that with the patch the last use of the venerable "Score" type
disappears in Stockfish codebase (the Score type was used in classical
evaluation to get a tampered eval interpolating values smoothly from the
early midgame stage to the endgame stage). We leave it to another commit
to clean all occurrences of Score in the code and the comments.
-------
Passed non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 142542 W: 36264 L: 36168 D: 70110
Ptnml(0-2): 76, 15578, 39856, 15696, 65
https://tests.stockfishchess.org/tests/view/64c8cb495b17f7c21c0cf9f8
Passed non-regression LTC (with a book with Black bias):
https://tests.stockfishchess.org/tests/view/64c8f9295b17f7c21c0cfdaf
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 494814 W: 125565 L: 125827 D: 243422
Ptnml(0-2): 244, 53926, 139346, 53630, 261
------
closes https://github.com/official-stockfish/Stockfish/pull/4713
Bench: 1655985
Also make two get_weight_index() static methods constexpr, for
consistency with the other static get_hash_value() method right above.
Tested for speed by user Torom (thanks).
closes https://github.com/official-stockfish/Stockfish/pull/4708
No functional change
This patch changes the frequency with which the time is checked, changing
frequency from every 1024 counted nodes to every 512 counted nodes. The
master value was tuned for the old classical eval, the patch takes the
roughly 2x slowdown in nps with SFNNUEv7 into account. This could reduce
a bit the losses on time on fishtest, but they are probably unrelated.
passed STC:
https://tests.stockfishchess.org/tests/view/64bb8ae5dc56e1650abb1b11
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 76576 W: 19677 L: 19501 D: 37398
Ptnml(0-2): 274, 8592, 20396, 8736, 290
closes https://github.com/official-stockfish/Stockfish/pull/4704
No functional change
allows for building x86-64-avx2 and x86-64-bmi2 binaries for mac
install coreutils
show hardware capabilities as seen by the compilers
move some tests from sse41 to avx2 as platforms support it
closes https://github.com/official-stockfish/Stockfish/pull/4692
No functional change
Explicitly describe the architecture as deprecated,
it remains available as its current alias x86-64-sse41-popcnt
CPUs that support just this instruction set are now years old,
any few years old Intel or AMD CPU supports x86-64-avx2. However,
naming things 'modern' doesn't age well, so instead use explicit names.
Adjust CI accordingly. Wiki, fishtest, downloader done as well.
closes https://github.com/official-stockfish/Stockfish/pull/4691
No functional change.
use a fixed compiler on Linux and Windows (right now gcc 11).
build avxvnni on Windows (Linux needs updated core utils)
build x86-32 on Linux (Windows needs other mingw)
fix a Makefile issue where a failed PGOBENCH would not stop the build
reuse the WINE_PATH for SDE as we do for QEMU
use WINE_PATH variable also for the signature
verify the bench for each of the binaries
do not build x86-64-avx2 on macos
closes https://github.com/official-stockfish/Stockfish/pull/4682
No functional change
use intel's Software Development Emulator (SDE) in the actions that build the binaries.
This allows for building on Windows and Linux binaries for
- x86-64-avx512
- x86-64-vnni256
- x86-64-vnni512
(x86-64-avxvnni needs more recent gcc in the actions)
also build x86-64-avx2 on macos.
closes https://github.com/official-stockfish/Stockfish/pull/4679
No functional change
since the introduction of NNUE (first released with Stockfish 12), we
have maintained the classical evaluation as part of SF in frozen form.
The idea that this code could lead to further inputs to the NN or
search did not materialize. Now, after five releases, this PR removes
the classical evaluation from SF. Even though this evaluation is
probably the best of its class, it has become unimportant for the
engine's strength, and there is little need to maintain this
code (roughly 25% of SF) going forward, or to expend resources on
trying to improve its integration in the NNUE eval.
Indeed, it had still a very limited use in the current SF, namely
for the evaluation of positions that are nearly decided based on
material difference, where the speed of the classical evaluation
outweights its inaccuracies. This impact on strength is small,
roughly 2Elo, and probably decreasing in importance as the TC grows.
Potentially, removal of this code could lead to the development of
techniques to have faster, but less accurate NN evaluation,
for certain positions.
STC
https://tests.stockfishchess.org/tests/view/64a320173ee09aa549c52157
Elo: -2.35 ± 1.1 (95%) LOS: 0.0%
Total: 100000 W: 24916 L: 25592 D: 49492
Ptnml(0-2): 287, 12123, 25841, 11477, 272
nElo: -4.62 ± 2.2 (95%) PairsRatio: 0.95
LTC
https://tests.stockfishchess.org/tests/view/64a320293ee09aa549c5215b
Elo: -1.74 ± 1.0 (95%) LOS: 0.0%
Total: 100000 W: 25010 L: 25512 D: 49478
Ptnml(0-2): 44, 11069, 28270, 10579, 38
nElo: -3.72 ± 2.2 (95%) PairsRatio: 0.96
VLTC SMP
https://tests.stockfishchess.org/tests/view/64a3207c3ee09aa549c52168
Elo: -1.70 ± 0.9 (95%) LOS: 0.0%
Total: 100000 W: 25673 L: 26162 D: 48165
Ptnml(0-2): 8, 9455, 31569, 8954, 14
nElo: -3.95 ± 2.2 (95%) PairsRatio: 0.95
closes https://github.com/official-stockfish/Stockfish/pull/4674
Bench: 1444646
- loop through the commits starting from the latest one
- read the bench value from the last match, if any, of the template
in the commit body text
closes https://github.com/official-stockfish/Stockfish/pull/4627
No functional change
Current logic can apply Null move pruning
on a dead-lost position returning an unproven loss
(i.e. in TB loss score or mated in losing score) on nonPv nodes.
on a default bench, this can be observed by adding this debugging line:
```
if (nullValue >= beta)
{
// Do not return unproven mate or TB scores
nullValue = std::min(nullValue, VALUE_TB_WIN_IN_MAX_PLY-1);
dbg_hit_on(nullValue <= VALUE_TB_LOSS_IN_MAX_PLY); // Hit #0: Total 73983 Hits 1 Hit Rate (%) 0.00135166
if (thisThread->nmpMinPly || depth < 14)
return nullValue;
```
This fixes this very rare issue (happens at ~0.00135166% of the time) by
eliminating the need to try Null Move Pruning with dead-lost positions
and leaving it to be determined by a normal searching flow.
The previous try to fix was not as safe enough because it was capping
the returned value to (out of TB range) thus reviving the dead-lost position
based on an artificial clamp (i.e. the in TB score/mate score can be lost on that nonPv node):
https://tests.stockfishchess.org/tests/view/649756d5dc7002ce609cd794
Final fix:
Passed STC:
https://tests.stockfishchess.org/tests/view/649a5446dc7002ce609d1049
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 577280 W: 153613 L: 153965 D: 269702
Ptnml(0-2): 1320, 60594, 165190, 60190, 1346
Passed LTC:
https://tests.stockfishchess.org/tests/view/649cd048dc7002ce609d4801
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 246432 W: 66769 L: 66778 D: 112885
Ptnml(0-2): 83, 22105, 78847, 22100, 81
closes https://github.com/official-stockfish/Stockfish/pull/4649
Bench: 2425978
using github actions, create a prerelease for the latest commit to master.
As such a development version will be available on github, in addition to the latest release.
closes https://github.com/official-stockfish/Stockfish/pull/4622
No functional change
Implemented LEB128 (de)compression for the feature transformer.
Reduces embedded network size from 70 MiB to 39 Mib.
The new nn-78bacfcee510.nnue corresponds to the master net compressed.
closes https://github.com/official-stockfish/Stockfish/pull/4617
No functional change
Use block sparse input for the first fully connected layer on architectures with at least SSSE3.
Depending on the CPU architecture, this yields a speedup of up to 10%, e.g.
```
Result of 100 runs of 'bench 16 1 13 default depth NNUE'
base (...ockfish-base) = 959345 +/- 7477
test (...ckfish-patch) = 1054340 +/- 9640
diff = +94995 +/- 3999
speedup = +0.0990
P(speedup > 0) = 1.0000
CPU: 8 x AMD Ryzen 7 5700U with Radeon Graphics
Hyperthreading: on
```
Passed STC:
https://tests.stockfishchess.org/tests/view/6485aa0965ffe077ca12409c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 8864 W: 2479 L: 2223 D: 4162
Ptnml(0-2): 13, 829, 2504, 1061, 25
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
closes https://github.com/official-stockfish/Stockfish/pull/4612
No functional change
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
7. max-epoch 800, end-lambda 0.7, skip 27, nn-0dd1cebea573 dataset
ep619 became nn-ea57bea57e32.nnue
For the last retraining (step7):
python3 easy_train.py
--experiment-name L1-1536-Re6-masterShuffled-ep639-sk27-Re5-leela-dfrc-v2-T77toT80small-Re4-masterShuffled-ep659-Re3-sameAs-Re2-leela96-dfrc99-16t-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-Re1-LeelaFarseer-new-T77T79 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 27 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 800 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re5-leela-dfrc-v2-T77toT80small-epoch639.nnue \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
For preparing the step6 leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack dataset:
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
test77-dec2021-16tb7p.no-db.min-mar2023.binpack \
test78-janfeb2022-16tb7p.no-db.min-mar2023.binpack \
test79-apr2022-16tb7p-filter-v6-dd.binpack \
test80-apr2022-16tb7p.no-db.min-mar2023.binpack \
/data/leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
The unfiltered Leela data used for the step6 dataset can be found at:
https://robotmoon.com/nnue-training-data
Local elo at 25k nodes per move:
nn-epoch619.nnue : 2.3 +/- 1.9
Passed STC:
https://tests.stockfishchess.org/tests/view/6480d43c6e6ce8d9fc6d7cc8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 40992 W: 11017 L: 10706 D: 19269
Ptnml(0-2): 113, 4400, 11170, 4689, 124
Passed LTC:
https://tests.stockfishchess.org/tests/view/648119ac6e6ce8d9fc6d8208
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129174 W: 35059 L: 34579 D: 59536
Ptnml(0-2): 66, 12548, 38868, 13050, 55
closes https://github.com/official-stockfish/Stockfish/pull/4611
bench: 2370027
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
```bash
python3 easy_train.py \
--experiment-name L1-1536-Re4-leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr-shuffled-sk28 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 28 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 960 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re3-nn-epoch439.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
```
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
```bash
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
filt-v6-dd-min/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
test80-apr2023-2tb7p.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack
```
T80 apr2023 data was converted using lc0-rescorer with ~2tb of tablebases and can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch559.nnue : 25.7 +/- 1.6
Passed STC:
https://tests.stockfishchess.org/tests/view/647cd3b87cf638f0f53f9cbb
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 59200 W: 16000 L: 15660 D: 27540
Ptnml(0-2): 159, 6488, 15996, 6768, 189
Passed LTC:
https://tests.stockfishchess.org/tests/view/647d58de726f6b400e4085d8
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 58800 W: 16002 L: 15657 D: 27141
Ptnml(0-2): 44, 5607, 17748, 5962, 39
closes https://github.com/official-stockfish/Stockfish/pull/4606
bench 2141197
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
For training from scratch:
python3 easy_train.py \
--experiment-name new-L1-1536-T77T79-filter-v6dd \
--training-dataset /data/T77T79-filter-v6-dd.min.binpack \
--max_epoch 400 \
--lambda 1.0 \
--start-from-engine-test-net False \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/Stockfish/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
Retraining commands were similar to each other. For the 3rd retraining run:
python3 easy_train.py \
--experiment-name L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-Re3-sameData \
--training-dataset /data/leela96-dfrc99-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--early-fen-skipping 28 \
--max_epoch 800 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-T77T79-v6dd-Re1-LeelaFarseer-Re2-masterDataset-nn-epoch799.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--tui False \
--gpus "0," \
--seed $RANDOM
The T77+T79 data used is a subset of the master dataset available at:
https://robotmoon.com/nnue-training-data/
T60T70wIsRightFarseerT60T74T75T76.binpack is available at:
https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch759.nnue : 26.9 +/- 1.6
Failed STC
https://tests.stockfishchess.org/tests/view/64742485d29264e4cfa75f97
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 13728 W: 3588 L: 3829 D: 6311
Ptnml(0-2): 71, 1661, 3610, 1482, 40
Failing LTC
https://tests.stockfishchess.org/tests/view/64752d7c4a36543c4c9f3618
LLR: -1.91 (-2.94,2.94) <0.50,2.50>
Total: 35424 W: 9522 L: 9603 D: 16299
Ptnml(0-2): 24, 3579, 10585, 3502, 22
Passed VLTC 180+1.8
https://tests.stockfishchess.org/tests/view/64752df04a36543c4c9f3638
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 47616 W: 13174 L: 12863 D: 21579
Ptnml(0-2): 13, 4261, 14952, 4566, 16
Passed VLTC SMP 60+0.6 th 8
https://tests.stockfishchess.org/tests/view/647446ced29264e4cfa761e5
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 19942 W: 5694 L: 5451 D: 8797
Ptnml(0-2): 6, 1504, 6707, 1749, 5
closes https://github.com/official-stockfish/Stockfish/pull/4593
bench 2222567
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
LTC 60+0.6:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 217434 W: 58382 L: 58363 D: 100689
Ptnml(0-2): 66, 21075, 66419, 21088, 69
https://tests.stockfishchess.org/tests/live_elo/6465d077f3b1a4e86c318d6c
closes https://github.com/official-stockfish/Stockfish/pull/4576
bench: 3190961
Created by retraining nn-dabb1ed23026.nnue with a dataset composed of:
* The previous best dataset (nn-1ceb1a57d117.nnue dataset)
* Adding de-duplicated T80 data from feb2023 and the last 10 days of jan2023, filtered with v6-dd
Initially trained with the same options as the recent master net (nn-1ceb1a57d117.nnue).
Around epoch 890, training was manually stopped and max epoch increased to 1000.
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The same v6-dd filtering and binpack minimizer was used for preparing the recent nn-1ceb1a57d117.nnue dataset.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-jan2022-3of3-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovjanfebT79aprmayT78jantosepT77dec-v6dd.binpack
```
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch919.nnue : 2.6 +/- 2.8
Passed STC vs. nn-dabb1ed23026.nnue
https://tests.stockfishchess.org/tests/view/644420df94ff3db5625f2af5
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 125960 W: 33898 L: 33464 D: 58598
Ptnml(0-2): 351, 13920, 34021, 14320, 368
Passed LTC vs. nn-1ceb1a57d117.nnue
https://tests.stockfishchess.org/tests/view/64469f128d30316529b3dc46
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 24544 W: 6817 L: 6542 D: 11185
Ptnml(0-2): 8, 2252, 7488, 2505, 19
closes https://github.com/official-stockfish/Stockfish/pull/4546
bench 3714847
* Extending v6 filtering to data from T77 dec2021, T79 may2022, and T80 nov2022
* Reducing the number of duplicate positions, prioritizing position scores seen later in time
* Using a binpack minimizer to reduce the overall data size
Trained the same way as the previous master net, aside from the dataset changes:
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 900 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The new v6-dd filtering reduces duplicate positions by iterating over hourly data files within leela test runs, starting with the most recent, then keeping positions the first time they're seen and ignoring positions that are seen again. This ordering was done with the assumption that position scores seen later in time are generally more accurate than scores seen earlier in the test run. Positions are de-duplicated based on piece orientations, the first token in fen strings.
The binpack minimizer was run with default settings after first merging monthly data into single binpacks.
```
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-filt-v2.binpack \
T60-nov2021-12tb7p-eval-filt-v2.binpack \
T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.min-mar2023.binpack \
filt-v6-dd/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80augsep-v6-T80junjuloctnovT79aprmayT78jantosepT77dec-v6dd.binpack
```
The code for v6-dd filtering is available along with training data preparation scripts at:
https://github.com/linrock/nnue-data
Links for downloading the training data components:
https://robotmoon.com/nnue-training-data/
The binpack minimizer is from: #4447
Local elo at 25k nodes per move:
nn-epoch859.nnue : 1.2 +/- 2.6
Passed STC:
https://tests.stockfishchess.org/tests/view/643aad7db08900ff1bc5a832
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 565040 W: 150225 L: 149162 D: 265653
Ptnml(0-2): 1875, 62137, 153229, 63608, 1671
Passed LTC:
https://tests.stockfishchess.org/tests/view/643ecf2fa43cf30e719d2042
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 1014840 W: 274645 L: 272456 D: 467739
Ptnml(0-2): 515, 98565, 306970, 100956, 414
closes https://github.com/official-stockfish/Stockfish/pull/4545
bench 3476305
This idea is a result of my second condition combination tuning for reductions:
https://tests.stockfishchess.org/tests/view/643ed5573806eca398f06d61
There were used two parameters per combination: one for the 'sign' of the first and the second condition in a combination. Values >= 50 indicate using a condition directly and values <= -50 means use the negation of a condition.
Each condition pair (X,Y) had two occurances dependent of the order of the two conditions:
- if X < Y the parameters used for more reduction
- if X > Y the parameters used for less reduction
- if X = Y then only one condition is present and A[X][X][0]/A[X][X][1] stands for using more/less reduction for only this condition.
The parameter pair A[7][2][0] (value = -94.70) and A[7][2][1] (value = 93.60) was one of the strongest signals with values near 100/-100.
Here condition nr. 7 was '(ss+1)->cutoffCnt > 3' and condition nr. 2 'move == ttMove'. For condition nr. 7 the negation is used because A[7][2][0] is negative.
This translates finally to less reduction (because 7 > 2) for tt moves if child cutoffs <= 3.
STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 65728 W: 17704 L: 17358 D: 30666
Ptnml(0-2): 184, 7092, 18008, 7354, 226
https://tests.stockfishchess.org/tests/view/643ff767ef2529086a7ed042
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 139200 W: 37776 L: 37282 D: 64142
Ptnml(0-2): 58, 13241, 42509, 13733, 59
https://tests.stockfishchess.org/tests/view/6440bfa9ef2529086a7edbc7
closes https://github.com/official-stockfish/Stockfish/pull/4538
Bench: 3548023
This patch is a simplification of my recent elo gainer.
Logically the Elo gainer didn't make much sense and this patch simplifies it into smth more logical.
Instead of assigning negative bonuses to all non-first moves that enter PV nodes
we assign positive bonuses in full depth search after LMR only for moves that
will result in a fail high - thus not assigning positive bonuses
for moves that will go to pv search - so doing "almost" the same as we do in master now for them.
Logic differs for some other moves, though, but this removes some lines of code.
Passed STC:
https://tests.stockfishchess.org/tests/view/642cf5cf77ff3301150dc5ec
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 409320 W: 109124 L: 109308 D: 190888
Ptnml(0-2): 1149, 45385, 111751, 45251, 1124
Passed LTC:
https://tests.stockfishchess.org/tests/view/642fe75d20eb941419bde200
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 260336 W: 70280 L: 70303 D: 119753
Ptnml(0-2): 99, 25236, 79528, 25199, 106
closes https://github.com/official-stockfish/Stockfish/pull/4522
Bench: 4286815
Since bestValue becomes value and beta - alpha is always non-negative,
extraReduction is always false, hence it has no effect.
This patch includes small changes to improve readability.
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
The current implementation generates warnings on MSVC. However, we have
no real use cases for double-typed UCI option values now. Also parameter
tuning only accepts following three types:
int, Value, Score
closes https://github.com/official-stockfish/Stockfish/pull/4505
No functional change
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang, and yields good performance.
As before, currently only linux is supported.
closes https://github.com/official-stockfish/Stockfish/pull/4478
No functional change
Made advanced Windows API calls (those from Advapi32.dll) dynamically
linked to avoid link errors when compiling using
Intel icx compiler for Windows.
https://github.com/official-stockfish/Stockfish/pull/4467
No functional change
this makes it easier to compile under MSVC, even though we recommend gcc/clang for production compiles at the moment.
In Win32 API, by default, most null-terminated character strings arguments are of wchar_t (UTF16, formerly UCS16-LE) type, i.e. 2 bytes (at least) per character. So, src/misc.cpp should have proper type. Respectively, for src/syzygy/tbprobe.cpp, in Widows, file paths should be std::wstring rather than std::string. However, this requires a very big number of changes, since the config files are also keeping the 8-bit-per-character std::string strings. Therefore, just one change of using 8-byte-per-character CreateFileA make it compile under MSVC.
closes https://github.com/official-stockfish/Stockfish/pull/4438
No functional change
Created by retraining the master net with these modifications:
* New filtering methods for existing data from T80 sep+oct2022, T79 apr2022, T78 jun+jul+aug+sep2022, T77 dec2021
* Adding new filtered data from T80 aug2022 and T78 apr+may2022
* Increasing early-fen-skipping from 28 to 30
```
python3 easy_train.py \
--experiment-name leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3-sk30 \
--training-dataset /data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes \
--start-from-engine-test-net True \
--early-fen-skipping 30 \
--max_epoch 900 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--gpus "0," \
--seed $RANDOM
```
The v3 filtering used for data from T77dec 2021 differs from v2 filtering in that:
* To improve binpack compression, positions after ply 28 were skipped during training by setting position scores to VALUE_NONE (32002) instead of removing them entirely
* All early-game positions with ply <= 28 were removed to maximize binpack compression
* Only bestmove captures at d6pv2 search were skipped, not 2nd bestmove captures
* Binpack compression was repaired for the remaining positions by effectively replacing bestmoves with "played moves" to maintain contiguous sequences of positions in the training game data
After improving binpack compression, The T77 dec2021 data size was reduced from 95G to 19G.
The v6 filtering used for data from T80augsepoctT79aprT78aprtosep 2022 differs from v2 in that:
* All positions with only one legal move were removed
* Tighter score differences at d6pv2 search were used to remove more positions with only one good move than before
* d6pv2 search was not used to remove positions where the best 2 moves were captures
```
python3 interleave_binpacks.py \
nn-547-dataset/leela96-eval-filt-v2.binpack \
nn-547-dataset/dfrc99-eval-filt-v2.binpack \
nn-547-dataset/test80-nov2022-12tb7p-eval-filt-v2-d6.binpack \
nn-547-dataset/T79-may2022-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-nov2021-12tb7p-eval-filt-v2.binpack \
nn-547-dataset/T60-dec2021-12tb7p-eval-filt-v2.binpack \
filt-v6/test80-aug2022-16tb7p-filter-v6.binpack \
filt-v6/test80-sep2022-16tb7p-filter-v6.binpack \
filt-v6/test80-oct2022-16tb7p-filter-v6.binpack \
filt-v6/test79-apr2022-16tb7p-filter-v6.binpack \
filt-v6/test78-aprmay2022-16tb7p-filter-v6.binpack \
filt-v6/test78-junjulaug2022-16tb7p-filter-v6.binpack \
filt-v6/test78-sep2022-16tb7p-filter-v6.binpack \
filt-v3/test77-dec2021-16tb7p-filt-v3.binpack \
/data/leela96-dfrc99-T80novT79mayT60novdec-v2-T80augsepoctT79aprT78aprtosep-v6-T77dec-v3.binpack
```
The code for the new data filtering methods is available at:
https://github.com/linrock/Stockfish/tree/nnue-data-v3/nnue-data
The code for giving hexword names to .nnue files is at:
https://github.com/linrock/nnue-namer
Links for downloading the training data components can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch779.nnue : 0.6 +/- 3.1
Passed STC:
https://tests.stockfishchess.org/tests/view/64212412db43ab2ba6f8efb0
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 82256 W: 22185 L: 21809 D: 38262
Ptnml(0-2): 286, 9065, 22067, 9407, 303
Passed LTC:
https://tests.stockfishchess.org/tests/view/64223726db43ab2ba6f91d6c
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 30840 W: 8437 L: 8149 D: 14254
Ptnml(0-2): 14, 2891, 9323, 3177, 15
closes https://github.com/official-stockfish/Stockfish/pull/4465
bench 5101970
This patch simplifies initialization of statScore to "always set it up to 0" instead of setting it up to 0 two plies deeper.
Reason for why it was done in previous way partially was because of LMR usage of previous statScore which was simplified long time ago so it makes sense to make in more simple there.
Passed STC:
https://tests.stockfishchess.org/tests/view/641a86d1db43ab2ba6f7b31d
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 115648 W: 30895 L: 30764 D: 53989
Ptnml(0-2): 368, 12741, 31473, 12876, 366
Passed LTC:
https://tests.stockfishchess.org/tests/view/641b1c31db43ab2ba6f7d17a
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 175576 W: 47122 L: 47062 D: 81392
Ptnml(0-2): 91, 17077, 53390, 17141, 89
closes https://github.com/official-stockfish/Stockfish/pull/4460
bench 5081969
Patch analyzes field after SEE exchanges concluded with a recapture by
the opponent:
if opponent Queen/Rook/King results under attack after the exchanges, we
consider the move sharp and don't prune it.
Important note:
By accident I forgot to adjust 'occupied' when the king takes part in
the exchanges.
As result of this a move is considered sharp too, when opponent king
apparently can evade check by recapturing.
Surprisingly this seems contribute to patch's strength.
STC:
https://tests.stockfishchess.org/tests/view/640b16132644b62c33947397
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 116400 W: 31239 L: 30817 D: 54344
Ptnml(0-2): 350, 12742, 31618, 13116, 374
LTC:
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 177600 W: 47988 L: 47421 D: 82191
Ptnml(0-2): 62, 16905, 54317, 17436, 80
closes https://github.com/official-stockfish/Stockfish/pull/4453
bench: 5012145
Since st is a member of position we don't need to pass it separately as
parameter.
While being there also remove some line in pos_is_ok, where
a copy of StateInfo was made by using default copy constructor and
then verified it's correctedness by doing a memcmp.
There is no point in doing that.
Passed non-regression test
https://tests.stockfishchess.org/tests/view/64098d562644b62c33942b35
LLR: 3.24 (-2.94,2.94) <-1.75,0.25>
Total: 548960 W: 145834 L: 146134 D: 256992
Ptnml(0-2): 1617, 57652, 156261, 57314, 1636
closes https://github.com/official-stockfish/Stockfish/pull/4444
No functional change
Keep incbin.h with the same mode as the other source files.
A mode diff might show up when working with patch files or sending the source code between devices.
This patch should fix such behaviour.
closes https://github.com/official-stockfish/Stockfish/pull/4442
No functional change
in a some of cases movepicker returned some moves more than once which lead
to them being searched more than once. This bug was possible because of how
we use queen promotions - they are generated as a captures but are not
included in position function which checks if move is a capture. Thus if
any refutation (killer or countermove) was a queen promotion it was
searched twice - once as a capture and one as a refutation.
This patch affects various things, namely stats assignments for queen promotions
and other moves if best move is queen promotion,
also some heuristics in search and qsearch.
With this patch every queen promotion is now considered a capture.
After this patch number of found duplicated moves is 0 during normal 13 depth bench run.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f77e01e74a12625bcd87d7
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 80920 W: 21455 L: 21289 D: 38176
Ptnml(0-2): 198, 8839, 22241, 8963, 219
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f7e020e74a12625bcd9a76
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 89712 W: 23674 L: 23533 D: 42505
Ptnml(0-2): 24, 8737, 27202, 8860, 33
closes https://github.com/official-stockfish/Stockfish/pull/4405
bench 4681731
Call the recently added hint function for NNUE accumulator update after a failed probcut search.
In this case we already searched at least some captures and tt move which, however, is not sufficient for a cutoff.
So it seems we have a greater chance that the full search will also have no cutoff and hence all moves have to be searched.
STC: https://tests.stockfishchess.org/tests/view/63fa74a4e74a12625bce1823
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 70096 W: 18770 L: 18423 D: 32903
Ptnml(0-2): 191, 7342, 19654, 7651, 210
To be sure that we have no heavy interaction retest on top of #4410.
Rebased STC: https://tests.stockfishchess.org/tests/view/63fb2f62e74a12625bce3b03
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 137688 W: 36790 L: 36349 D: 64549
Ptnml(0-2): 397, 14373, 38919, 14702, 453
closes https://github.com/official-stockfish/Stockfish/pull/4411
No functional change
Credits to Stefan Geschwentner (locutus2) showing that the hint
is useful on PvNodes. In contrast to his test,
this version avoids to use the hint when in check.
I believe checking positions aren't good candidates for the hint
because:
- evasion moves are rather few, so a checking pos. has much less childs
than a normal position
- if the king has to move the NNUE eval can't use incremental updates,
so the child nodes have to do a full refresh anyway.
Passed STC:
https://tests.stockfishchess.org/tests/view/63f9c5b1e74a12625bcdf585
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 124472 W: 33268 L: 32846 D: 58358
Ptnml(0-2): 350, 12986, 35170, 13352, 378
closes https://github.com/official-stockfish/Stockfish/pull/4410
no functional change
Params found with the nevergrad TBPSA optimizer via nevergrad4sf modified to:
* use SPRT LLR with fishtest STC elo gainer bounds [0, 2] as the objective function
* increase the game batch size after each new optimal point is found
The params were the optimal point after TBPSA iteration 7 and 160 nevergrad evaluations with:
* initial batch size of 96 games per evaluation
* batch size increase of 64 games after each iteration
* a budget of 512 evaluations
* TC: fixed 1.5 million nodes per move, no time limit
nevergrad4sf enables optimizing stockfish params with TBPSA:
https://github.com/vondele/nevergrad4sf
Using pentanomial game results with smaller game batch sizes was inspired by:
Use of SPRT LLR calculated from pentanomial game results as the objective function was an experiment at maximizing the information from game batches to reduce the computational cost for TBPSA to converge on good parameters.
For the exact code used to find the params:
https://github.com/linrock/tuning-fork
Passed STC:
https://tests.stockfishchess.org/tests/view/63f4ef5ee74a12625bcd114a
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 66552 W: 17736 L: 17390 D: 31426
Ptnml(0-2): 164, 7229, 18166, 7531, 186
Passed LTC:
https://tests.stockfishchess.org/tests/view/63f56028e74a12625bcd2550
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 71264 W: 19150 L: 18787 D: 33327
Ptnml(0-2): 23, 6728, 21771, 7083, 27
closes https://github.com/official-stockfish/Stockfish/pull/4401
bench 3687580
This patch introduces `hint_common_parent_position()` to signal that potentially several child nodes will require an NNUE eval. By populating explicitly the accumulator, these subsequent evaluations can be performed more efficiently.
This was based on the observation that calculating the evaluation in an excluded move position yielded a significant Elo gain, even though the evaluation itself was already available (work by pb00067).
Sopel wrote the code to perform just the accumulator update. This PR is based on cleaned up code that
passed STC:
https://tests.stockfishchess.org/tests/view/63f62f9be74a12625bcd4aa0
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
and in an the earlier (equivalent) version
passed STC:
https://tests.stockfishchess.org/tests/view/63f3c3fee74a12625bcce2a6
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 47552 W: 12786 L: 12467 D: 22299
Ptnml(0-2): 120, 5107, 12997, 5438, 114
passed LTC:
https://tests.stockfishchess.org/tests/view/63f45cc2e74a12625bccfa63
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 110368 W: 29607 L: 29167 D: 51594
Ptnml(0-2): 41, 10551, 33572, 10967, 53
closes https://github.com/official-stockfish/Stockfish/pull/4402
Bench: 3726250
The sdot instruction computes (and accumulates) a signed dot product,
which is quite handy for Stockfish's NNUE code. The instruction is
optional for Armv8.2 and Armv8.3, and mandatory for Armv8.4 and above.
The commit adds a new 'arm-dotprod' architecture with enabled dot
product support. It also enables dot product support for the existing
'apple-silicon' architecture, which is at least Armv8.5.
The following local speed test was performed on an Apple M1 with
ARCH=apple-silicon. I had to remove CPU pinning from the benchmark
script. However, the results were still consistent: Checking both
binaries against themselves reported a speedup of +0.0000 and +0.0005,
respectively.
```
Result of 100 runs
==================
base (...ish.037ef3e1) = 1917997 +/- 7152
test (...fish.dotprod) = 2159682 +/- 9066
diff = +241684 +/- 2923
speedup = +0.1260
P(speedup > 0) = 1.0000
CPU: 10 x arm
Hyperthreading: off
```
Fixes#4193
closes https://github.com/official-stockfish/Stockfish/pull/4400
No functional change
Created by retraining the master net on a dataset composed of:
* Most of the previous best dataset filtered to remove positions likely having only one good move
* Adding training data from Leela T77 dec2021 rescored with 16tb of 7-piece tablebases
Trained with end lambda 0.7 and max epoch 900. Positions with ply <= 28 were removed from most of the previous best dataset before training began. A new nnue-pytorch trainer param for skipping early plies was used to skip plies <= 24 in the unfiltered and additional Leela T77 parts of the dataset.
```
python easy_train.py \
--experiment-name leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb-lambda7-sk24 \
--training-dataset /data/leela96-dfrc99-T80octnovT79aprmayT60novdec-eval-filt-v2-T78augsep-12tb-T77dec-16tb.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/easy-train-early-fen-skipping \
--early-fen-skipping 24 \
--gpus "0," \
--start-from-engine-test-net True \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 900
```
The depth6 multipv2 search filtering method is the same as the one used for filtering recent best datasets, with a lower eval difference threshold to remove slightly more positions than before. These parts of the dataset were filtered:
* 96% of T60T70wIsRightFarseerT60T74T75T76.binpack
* 99% of dfrc_n5000.binpack
* T80 oct + nov 2022 data, no positions with castling flags, rescored with ~600gb 7p tablebases
* T79 apr + may 2022 data, rescored with 12tb 7p tablebases
* T60 nov + dec 2021 data, rescored with 12tb 7p tablebases
These parts of the dataset were not filtered. Positions with ply <= 24 were skipped during training:
* T78 aug + sep 2022 data, rescored with 12tb 7p tablebases
* 84% of T77 dec 2021 data, rescored with 16tb 7p tablebases
The code and exact evaluation thresholds used for data filtering can be found at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff-t2/src/filter
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch859.nnue : 3.5 +/ 1.2
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
https://tests.stockfishchess.org/tests/view/63dfeefc73223e7f52ad769f
Total: 219744 W: 58572 L: 58002 D: 103170
Ptnml(0-2): 609, 24446, 59284, 24832, 701
Passed LTC:
https://tests.stockfishchess.org/tests/view/63e268fc73223e7f52ade7b6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 91256 W: 24528 L: 24121 D: 42607
Ptnml(0-2): 48, 8863, 27390, 9288, 39
closes https://github.com/official-stockfish/Stockfish/pull/4387
bench 3841998
This patch is a simplification / code normalisation in qsearch.
Adds steps in comments the same way we have in search;
Makes a separate "pruning" stage instead of heuristics randomly being spread over qsearch code;
Reorders pruning heuristics from least taxing ones to more taxing ones;
Removes repeated check for best value not being mated, instead uses 1 check - thus removes some lines of code.
Moves prefetch and move setup after pruning - makes no sense to do them if move will actually get pruned.
Passed non-regression test:
https://tests.stockfishchess.org/tests/view/63dd2c5ff9a50a69252c1413
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 113504 W: 29898 L: 29770 D: 53836
Ptnml(0-2): 287, 11861, 32327, 11991, 286
https://github.com/official-stockfish/Stockfish/pull/4382
Non-functional change.
PR consists of 2 improvements on nodes with excludeMove:
1. Remove xoring the posKey with make_key(excludedMove)
Since we never call tte->save anymore with excludedMove,
the unique left purpose of the xoring was to avoid a TT hit.
Nevertheless on a normal bench run this produced ~25 false positives
(key collisions)
To avoid that we now forbid early TT cutoff's with excludeMove
Maybe these accesses to TT with xored key caused useless misses
in the CPU caches (L1, L2 ...)
Now doing the probe with the same key as the enclosing search does,
should hit the CPU cache.
2. Don't probe Tablebases with excludedMove.
This can't be tested on fishtest, but it's obvious that
tablebases don't deliver any information about suboptimal moves.
Side note:
Very surprisingly it looks like we cannot use static eval's from
TT since they slightly differ over time due to changing optimism.
Attempts to use static eval's from TT did loose about 13 ELO.
This is something about to investigate.
LTC: https://tests.stockfishchess.org/tests/view/63dc0f8de9d4cdfbe672d0c6
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 44736 W: 12046 L: 11733 D: 20957
Ptnml(0-2): 12, 4212, 13617, 4505, 22
An analogue of this passed STC & LTC
see PR #4374 (thanks Dubslow for reviewing!)
closes https://github.com/official-stockfish/Stockfish/pull/4380
Bench: 4758694
This patch adds more debugging slots up to 32 per type and provide tools
to calculate standard deviation and Pearson's correlation coefficient.
However, due to slot being 0 at default, dbg_hit_on(c, b) has to be removed.
Initial idea from snicolet/Stockfish@d8ab604
closes https://github.com/official-stockfish/Stockfish/pull/4354
No functional change
Current master prunes all moves with negative SEE values in qsearch.
This patch sets constant negative threshold thus allowing some moves with negative SEE values to be searched.
Value of threshold is completely arbitrary and can be tweaked - also it as function of depth can be tried.
Original idea by author of Alexandria engine.
Passed STC
https://tests.stockfishchess.org/tests/view/63d79a59a67dd929a5564976
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 34864 W: 9392 L: 9086 D: 16386
Ptnml(0-2): 113, 3742, 9429, 4022, 126
Passed LTC
https://tests.stockfishchess.org/tests/view/63d8074aa67dd929a5565bc2
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 91616 W: 24532 L: 24126 D: 42958
Ptnml(0-2): 32, 8840, 27662, 9238, 36
closes https://github.com/official-stockfish/Stockfish/pull/4376
Bench: 4010877
update the WLD model with about 400M positions extracted from recent LTC games after the net updates.
This ensures that the 50% win rate is again at 1.0 eval.
closes https://github.com/official-stockfish/Stockfish/pull/4373
No functional change.
Beyond the simplification, this could be considered a bugfix from a certain point of view.
However, the effect is very subtle and essentially impossible for users to notice.
5372f81cc8 added about 2 Elo at LTC, but only for second and later `go` commands; now, with
this patch, the first `go` command will also benefit from that gain. Games under time
controls are unaffected (as per the tests).
STC: https://tests.stockfishchess.org/tests/view/63c3d291330c0d3d051d48a8
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 473792 W: 124858 L: 125104 D: 223830
Ptnml(0-2): 1338, 49653, 135063, 49601, 1241
LTC: https://tests.stockfishchess.org/tests/view/63c8cd56a83c702aac083bc9
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 290728 W: 76926 L: 76978 D: 136824
Ptnml(0-2): 106, 27987, 89221, 27953, 97
closes https://github.com/official-stockfish/Stockfish/pull/4361
bench 4208265
This patch results in search values for a TB win/loss to be reported in a way that does not change with normalization, i.e. will be consistent over time.
A value of 200.00 pawns is now reported upon entering a TB won position. Values smaller than 200.00 relate to the distance in plies from the root to the probed position position,
with 1 cp being 1 ply distance.
closes https://github.com/official-stockfish/Stockfish/pull/4353
No functional change
Created by retraining the master net with Leela T78 data from Aug+Sep 2022 added to the previous best dataset. Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 750, training was manually paused and max epoch increased to 950 before resuming. The additional Leela training data from T78 was prepared in the same way as the previous best dataset.
The exact training data used can be found at:
https://robotmoon.com/nnue-training-data/
While the local elo ratings during this experiment were much lower than in recent master nets, several later epochs had a consistent elo above zero, and this was hypothesized to represent potential strength at slower time controls.
Local elo at 25k nodes per move
leela95-dfrc96-filt-only-T80octnov-T60novdecT78augsepT79aprmay-12tb7p-sk28-lambda7
nn-epoch819.nnue : 0.4 +/- 1.1 (nn-bc24c101ada0.nnue)
nn-epoch799.nnue : 0.3 +/- 1.2
nn-epoch759.nnue : 0.3 +/- 1.1
nn-epoch839.nnue : 0.2 +/- 1.4
Passed STC
https://tests.stockfishchess.org/tests/view/63cabf6f0eefe8694a0c6013
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 41608 W: 11161 L: 10848 D: 19599
Ptnml(0-2): 116, 4496, 11281, 4781, 130
Passed LTC
https://tests.stockfishchess.org/tests/view/63cb1856344bb01c191af263
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 76760 W: 20517 L: 20137 D: 36106
Ptnml(0-2): 34, 7435, 23070, 7799, 42
closes https://github.com/official-stockfish/Stockfish/pull/4351
bench 3941848
Bit-shifting is a single instruction, and should be faster than an array lookup
on supported architectures. Besides (ever so slightly) speeding up the
conversion of a square into a bitboard, we may see minor general performance
improvements due to preserving more of the CPU's existing cache.
passed STC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 47280 W: 12469 L: 12271 D: 22540
Ptnml(0-2): 128, 4893, 13402, 5087, 130
https://tests.stockfishchess.org/tests/view/63c5cfe618c20f4929c5fe46
Small speedup locally:
```
Result of 20 runs
==================
base (./stockfish.master ) = 1752135 +/- 10943
test (./stockfish.patch ) = 1763939 +/- 10818
diff = +11804 +/- 4731
speedup = +0.0067
P(speedup > 0) = 1.0000
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
```
Closes https://github.com/official-stockfish/Stockfish/pull/4343
Bench: 4106793
The accumulator should be an earlyclobber because it is written before
all input operands are read. Otherwise, the asm code computes a wrong
result if the accumulator shares a register with one of the other input
operands (which happens if we pass in the same expression for the
accumulator and the operand).
Closes https://github.com/official-stockfish/Stockfish/pull/4339
No functional change
Created by retraining the master net on a dataset composed of:
* The Leela-dfrc_n5000.binpack dataset filtered with depth6 multipv2 search to remove positions with only one good move, in addition to removing positions where either of the two best moves are captures
* The same Leela T80 oct+nov 2022 training data used in recent best datasets
* Additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022
Trained with end lambda 0.7 and started with max epoch 800. All positions with ply <= 28 were skipped:
```
python easy_train.py \
--experiment-name leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7 \
--training-dataset /data/leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--start-from-engine-test-net True \
--gpus "0," \
--start-lambda 1.0 \
--end-lambda 0.7 \
--gamma 0.995 \
--lr 4.375e-4 \
--tui False \
--seed $RANDOM \
--max_epoch 800
```
Around epoch 780, training was manually paused and max epoch increased to 920 before resuming.
During depth6 multipv2 data filtering, positions were considered to have only one good move if the score of the best move was significantly better than the 2nd best move in a way that changes the outcome of the game:
* the best move leads to a significant advantage while the 2nd best move equalizes or loses
* the best move is about equal while the 2nd best move loses
The modified stockfish branch and exact score thresholds used for filtering are at:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-eval-diff/src/filter
About 95% of the Leela portion and 96% of the DFRC portion of the Leela-dfrc_n5000.binpack dataset was filtered. Unfiltered parts of the dataset were left out.
The additional Leela training data from T60 nov+dec 2021 and T79 apr+may 2022 was WDL-rescored with about 12TB of syzygy 7-piece tablebases where the material difference is less than around 6 pawns. Best moves were exported to .plain data files during data conversion with the lc0 rescorer.
The exact training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move
experiment_leela95-dfrc96-mpv-eval-fonly-T80octnov-T79aprmayT60novdec-12tb7p-sk28-lambda7
run_0/nn-epoch899.nnue : 3.8 +/- 1.6
Passed STC
https://tests.stockfishchess.org/tests/view/63bed1f540aa064159b9c89b
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 103344 W: 27392 L: 26991 D: 48961
Ptnml(0-2): 333, 11223, 28099, 11744, 273
Passed LTC
https://tests.stockfishchess.org/tests/view/63c010415705810de2deb3ec
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 21712 W: 5891 L: 5619 D: 10202
Ptnml(0-2): 12, 2022, 6511, 2304, 7
closes https://github.com/official-stockfish/Stockfish/pull/4338
bench 4106793
Removed sprintf() which generated a warning, because of security reasons.
Replace NULL with nullptr
Replace typedef with using
Do not inherit from std::vector. Use composition instead.
optimize mutex-unlocking
closes https://github.com/official-stockfish/Stockfish/pull/4327
No functional change
If a global function has no previous declaration, either the declaration
is missing in the corresponding header file or the function should be
declared static. Static functions are local to the translation unit,
which allows the compiler to apply some optimizations earlier (when
compiling the translation unit rather than during link-time
optimization).
The commit enables the warning for gcc, clang, and mingw. It also fixes
the reported warnings by declaring the functions static or by adding a
header file (benchmark.h).
closes https://github.com/official-stockfish/Stockfish/pull/4325
No functional change
This is a later epoch (epoch 859) from the same experiment run that trained yesterday's master net nn-60fa44e376d9.nnue (epoch 779). The experiment was manually paused around epoch 790 and unpaused with max epoch increased to 900 mainly to get more local elo data without letting the GPU idle.
nn-60fa44e376d9.nnue is from #4314
nn-335a9b2d8a80.nnue is from #4295
Local elo vs. nn-335a9b2d8a80.nnue at 25k nodes per move:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
run_0/nn-epoch779.nnue (nn-60fa44e376d9.nnue) : 5.0 +/- 1.2
run_0/nn-epoch859.nnue (nn-a3dc078bafc7.nnue) : 5.6 +/- 1.6
Passed STC vs. nn-335a9b2d8a80.nnue
https://tests.stockfishchess.org/tests/view/63ae10495bd1e5f27f13d94f
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 37536 W: 10088 L: 9781 D: 17667
Ptnml(0-2): 110, 4006, 10223, 4325, 104
An LTC test vs. nn-335a9b2d8a80.nnue was paused due to nn-60fa44e376d9.nnue passing LTC first:
https://tests.stockfishchess.org/tests/view/63ae5d34331d5fca5113703b
Passed LTC vs. nn-60fa44e376d9.nnue
https://tests.stockfishchess.org/tests/view/63af1e41465d2b022dbce4e7
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 148704 W: 39672 L: 39155 D: 69877
Ptnml(0-2): 59, 14443, 44843, 14936, 71
closes https://github.com/official-stockfish/Stockfish/pull/4319
bench 3984365
Created by retraining the master net on the previous best dataset with additional filtering. No new data was added.
More of the Leela-dfrc_n5000.binpack part of the dataset was pre-filtered with depth6 multipv2 search to remove bestmove captures. About 93% of the previous Leela/SF data and 99% of the SF dfrc data was filtered. Unfiltered parts of the dataset were left out. The new Leela T80 oct+nov data is the same as before. All early game positions with ply count <= 28 were skipped during training by modifying the training data loader in nnue-pytorch.
Trained in a similar way as recent master nets, with a different nnue-pytorch branch for early ply skipping:
python3 easy_train.py \
--experiment-name=leela93-dfrc99-filt-only-T80-oct-nov-skip28 \
--training-dataset=/data/leela93-dfrc99-filt-only-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--nnue-pytorch-branch=linrock/nnue-pytorch/misc-fixes-skip-ply-lteq-28 \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--network-testing-threads 20 \
--num-workers 6
For the exact training data used: https://robotmoon.com/nnue-training-data/
Details about the previous best dataset: #4295
Local testing at a fixed 25k nodes:
experiment_leela93-dfrc99-filt-only-T80-oct-nov-skip28
Local Elo: run_0/nn-epoch779.nnue : 5.1 +/- 1.5
Passed STC
https://tests.stockfishchess.org/tests/view/63adb3acae97a464904fd4e8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 36504 W: 9847 L: 9538 D: 17119
Ptnml(0-2): 108, 3981, 9784, 4252, 127
Passed LTC
https://tests.stockfishchess.org/tests/view/63ae0ae25bd1e5f27f13d884
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 36592 W: 10017 L: 9717 D: 16858
Ptnml(0-2): 17, 3461, 11037, 3767, 14
closes https://github.com/official-stockfish/Stockfish/pull/4314
bench 4015511
In both modified methods, the variable 'result' is checked to detect
whether the probe operation failed. However, the variable is not
initialized on all paths, so the check might test an uninitialized
value.
A test position (with TB) is given by:
position fen 3K1k2/R7/8/8/8/8/8/R6Q w - - 0 1 moves a1b1 f8g8 b1a1 g8f8 a1b1 f8g8 b1a1
This is now fixed by always initializing the variable.
closes https://github.com/official-stockfish/Stockfish/pull/4309
No functional change
Created by retraining the master net with a combination of:
the previous best dataset (Leela-dfrc_n5000.binpack), with about half the dataset filtered using depth6 multipv2 search to throw away positions where either of the 2 best moves are captures
Leela T80 Oct and Nov training data rescored with best moves, adding ~9.5 billion positions
Trained effectively the same way as the previous master net:
python3 easy_train.py \
--experiment-name=leela-dfrc-filtered-T80-oct-nov \
--training-dataset=/data/leela-dfrc-filtered-T80-oct-nov.binpack \
--start-from-engine-test-net True \
--gpus="0," \
--start-lambda=1.0 \
--end-lambda=0.75 \
--gamma=0.995 \
--lr=4.375e-4 \
--tui=False \
--seed=$RANDOM \
--max_epoch=800 \
--auto-exit-timeout-on-training-finished=900 \
--network-testing-threads 20 \
--num-workers 6
Local testing at a fixed 25k nodes:
experiments/experiment_leela-dfrc-filtered-T80-oct-nov/training/run_0/nn-epoch779.nnue
localElo: run_0/nn-epoch779.nnue : 4.7 +/- 3.1
The new Leela T80 part of the dataset was prepared by downloading test80 training data from all of Oct 2022 and Nov 2022, rescoring with syzygy 6-piece tablebases and ~600 GB of 7-piece tablebases, saving best moves to exported .plain files, removing all positions with castling flags, then converting to binpacks and using interleave_binpacks.py to merge them together. Scripts used in this data conversion process are available at:
https://github.com/linrock/lc0-data-converter
Filtering binpack data using depth6 multipv2 search was done by modifying transform.cpp in the tools branch:
https://github.com/linrock/Stockfish/tree/tools-filter-multipv2-no-rescore
Links for downloading the training data (total size: 338 GB) are available at:
https://robotmoon.com/nnue-training-data/
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 30544 W: 8244 L: 7947 D: 14353
Ptnml(0-2): 93, 3243, 8302, 3542, 92
https://tests.stockfishchess.org/tests/view/63a0d377264a0cf18f86f82b
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 32464 W: 8866 L: 8573 D: 15025
Ptnml(0-2): 19, 3054, 9794, 3345, 20
https://tests.stockfishchess.org/tests/view/63a10bc9fb452d3c44b1e016
closes https://github.com/official-stockfish/Stockfish/pull/4295
Bench 3554904
Instead of allowing .depend for specific build-related targets, filter
non-build-related targets (i.e. help, clean) so that other targets can
normally execute .depend target.
closes https://github.com/official-stockfish/Stockfish/pull/4293
No functional change
If ttMove is doubly extended, we allow a depth growth of the remaining moves.
The idea is to get a more realistic score comparison, because of the depth
difference. We take some care to avoid this extension for high depths,
in order to avoid the cost, since the search result is supposed
to be more accurate in this case.
This pull request includes some small cleanups.
STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 60256 W: 16189 L: 15848 D: 28219
Ptnml(0-2): 182, 6546, 16330, 6889, 181
https://tests.stockfishchess.org/tests/view/639109a1792a529ae8f27777
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 106232 W: 28487 L: 28053 D: 49692
Ptnml(0-2): 46, 10224, 32145, 10652, 49
https://tests.stockfishchess.org/tests/view/63914cba792a529ae8f282ee
closes https://github.com/official-stockfish/Stockfish/pull/4271
Bench: 3622368
Add a constraint so that the dependency build only occurs when users
actually run build tasks.
This fixes a bug on some systems where gcc/g++ is not available.
closes https://github.com/official-stockfish/Stockfish/pull/4255
No functional change
fixes the lowerbound/upperbound output by avoiding
scores outside the alpha,beta bracket. Since SF search
uses fail-soft we can't simply take the returned value
as score.
closes https://github.com/official-stockfish/Stockfish/pull/4259
No functional change
# Use only 'java' to analyze code written in Java, Kotlin, or both
# Use only 'javascript' to analyze code written in JavaScript, TypeScript or both
# Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
steps:
- name:Checkout repository
uses:actions/checkout@v4
with:
persist-credentials:false
# Initializes the CodeQL tools for scanning.
- name:Initialize CodeQL
uses:github/codeql-action/init@v3
with:
languages:${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.