Since no correction histories are ever used inside Movepick, and many
existing histories are closely integrated into search, it might be more
logical to separate them into their own header. PR based on #5650
closes https://github.com/official-stockfish/Stockfish/pull/5652
No functional change
Idea of this patch comes from the fact that current history heuristics
are mostly populated by low depth entries since our stat bonus reaches
maximum value at depth 5-6 and number of low depth nodes is much bigger
than number of high depth nodes. But it doesn't make a whole lost of
sense to use this low-depth centered histories to sort moves at root.
Current patch introduces special history table that is used exclusively
at root, it remembers which quiet moves were good and which quiet moves
were not good there and uses this information for move ordering.
Passed STC:
https://tests.stockfishchess.org/tests/view/66dda74adc53972b68218cc9
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 127680 W: 33579 L: 33126 D: 60975
Ptnml(0-2): 422, 15098, 32391, 15463, 466
Passed LTC:
https://tests.stockfishchess.org/tests/view/66dead2adc53972b68218d34
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 381978 W: 96958 L: 95923 D: 189097
Ptnml(0-2): 277, 42165, 105089, 43162, 296
closes https://github.com/official-stockfish/Stockfish/pull/5595
Bench: 1611283
Since simplification of quiet checks in qsearch this depth isn't used by
any function at all apart movepicker, which also doesn't use passed
qsearch depth in any way, so can be removed. No functional change.
closes https://github.com/official-stockfish/Stockfish/pull/5514
No functional change
These values represent the lowest Elo rating in the skill level calculation,
and the highest one, but it's not clear from the code where these values come
from other than the comment. This should improve code readability and
maintainability. It makes the purpose of the values clear and allows for easy
modification if the Elo range for skill level calculation changes in the
future. Moved the Skill struct definition from search.cpp to search.h header
file to define the Search::Skill struct, making it accessible from other files.
closes https://github.com/official-stockfish/Stockfish/pull/5508
No functional change
- Capitalize comments
- Reformat multi-lines comments to equalize the widths of the lines
- Try to keep the width of comments around 85 characters
- Remove periods at the end of single-line comments
closes https://github.com/official-stockfish/Stockfish/pull/5469
No functional change
Currently (after #5407), SF has the property that any PV line with a decisive
TB score contains the corresponding TB position, with a score that correctly
identifies the depth at which TB are entered. The PV line that follows might
not preserve the game outcome, but can easily be verified and extended based on
TB information. This patch provides this functionality, simply extending the PV
lines on output, this doesn't affect search.
Indeed, if DTZ tables are available, search based PV lines that correspond to
decisive TB scores are verified to preserve game outcome, truncating the line
as needed. Subsequently, such PV lines are extended with a game outcome
preserving line until mate, as a possible continuation. These lines are not
optimal mating lines, but are similar to what a user could produce on a website
like https://syzygy-tables.info/ clicking always the top ranked move, i.e.
minimizing or maximizing DTZ (with a simple tie-breaker for moves that have
identical DTZ), and are thus an just an illustration of how to game can be won.
A similar approach is already in established in
https://github.com/joergoster/Stockfish/tree/matefish2
This also contributes to addressing #5175 where SF can give an incorrect TB
win/loss for positions in TB with a movecounter that doesn't reflect optimal
play. While the full solution requires either TB generated differently, or a
search when ranking rootmoves, current SF will eventually find a draw in these
cases, in practice quite quickly, e.g.
`1kq5/q2r4/5K2/8/8/8/8/7Q w - - 96 1`
`8/8/6k1/3B4/3K4/4N3/8/8 w - - 54 106`
Gives the same results as master on an extended set of test positions from
https://github.com/mcostalba/Stockfish/commit/9173d29c414ddb8f4bec74e4db3ccbe664c66bf9
with the exception of the above mentioned fen where this commit improves.
With https://github.com/vondele/matetrack using 6men TB, all generated PVs verify:
```
Using ../Stockfish/src/stockfish.syzygyExtend on matetrack.epd with --nodes 1000000 --syzygyPath /chess/syzygy/3-4-5-6/WDL:/chess/syzygy/3-4-5-6/DTZ
Engine ID: Stockfish dev-20240704-ff227954
Total FENs: 6555
Found mates: 3299
Best mates: 2582
Found TB wins: 568
```
As repeated DTZ probing could be slow a procedure (100ms+ on HDD, a few ms on
SSD), the extension is only done as long as the time taken is less than half
the `Move Overhead` parameter. For tournaments where these lines might be of
interest to the user, a suitable `Move Overhead` might be needed (e.g. TCEC has
1000ms already).
closes https://github.com/official-stockfish/Stockfish/pull/5414
No functional change
As stockfish nets and search evolve, the existing time control appears
to give too little time at STC, roughly correct at LTC, and too little
at VLTC+.
This change adds an adjustment to the optExtra calculation. This
adjustment is easy to retune and refine, so it should be easier to keep
up-to-date than the more complex calculations used for optConstant and
optScale.
Passed STC 10+0.1:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 169568 W: 43803 L: 43295 D: 82470
Ptnml(0-2): 485, 19679, 44055, 19973, 592
https://tests.stockfishchess.org/tests/view/66531865a86388d5e27da9fa
Yellow LTC 60+0.6:
LLR: -2.94 (-2.94,2.94) <0.50,2.50>
Total: 209970 W: 53087 L: 52914 D: 103969
Ptnml(0-2): 91, 19652, 65314, 19849, 79
https://tests.stockfishchess.org/tests/view/6653e38ba86388d5e27daaa0
Passed VLTC 180+1.8 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 85618 W: 21735 L: 21342 D: 42541
Ptnml(0-2): 15, 8267, 25848, 8668, 11
https://tests.stockfishchess.org/tests/view/6655131da86388d5e27db95f
closes https://github.com/official-stockfish/Stockfish/pull/5297
Bench: 1212167
Allow for NUMA memory replication for NNUE weights. Bind threads to ensure execution on a specific NUMA node.
This patch introduces NUMA memory replication, currently only utilized for the NNUE weights. Along with it comes all machinery required to identify NUMA nodes and bind threads to specific processors/nodes. It also comes with small changes to Thread and ThreadPool to allow easier execution of custom functions on the designated thread. Old thread binding (WinProcGroup) machinery is removed because it's incompatible with this patch. Small changes to unrelated parts of the code were made to ensure correctness, like some classes being made unmovable, raw pointers replaced with unique_ptr. etc.
Windows 7 and Windows 10 is partially supported. Windows 11 is fully supported. Linux is fully supported, with explicit exclusion of Android. No additional dependencies.
-----------------
A new UCI option `NumaPolicy` is introduced. It can take the following values:
```
system - gathers NUMA node information from the system (lscpu or windows api), for each threads binds it to a single NUMA node
none - assumes there is 1 NUMA node, never binds threads
auto - this is the default value, depends on the number of set threads and NUMA nodes, will only enable binding on multinode systems and when the number of threads reaches a threshold (dependent on node size and count)
[[custom]] -
// ':'-separated numa nodes
// ','-separated cpu indices
// supports "first-last" range syntax for cpu indices,
for example '0-15,32-47:16-31,48-63'
```
Setting `NumaPolicy` forces recreation of the threads in the ThreadPool, which in turn forces the recreation of the TT.
The threads are distributed among NUMA nodes in a round-robin fashion based on fill percentage (i.e. it will strive to fill all NUMA nodes evenly). Threads are bound to NUMA nodes, not specific processors, because that's our only requirement and the OS can schedule them better.
Special care is made that maximum memory usage on systems that do not require memory replication stays as previously, that is, unnecessary copies are avoided.
On linux the process' processor affinity is respected. This means that if you for example use taskset to restrict Stockfish to a single NUMA node then the `system` and `auto` settings will only see a single NUMA node (more precisely, the processors included in the current affinity mask) and act accordingly.
-----------------
We can't ensure that a memory allocation takes place on a given NUMA node without using libnuma on linux, or using appropriate custom allocators on windows (https://learn.microsoft.com/en-us/windows/win32/memory/allocating-memory-from-a-numa-node), so to avoid complications the current implementation relies on first-touch policy. Due to this we also rely on the memory allocator to give us a new chunk of untouched memory from the system. This appears to work reliably on linux, but results may vary.
MacOS is not supported, because AFAIK it's not affected, and implementation would be problematic anyway.
Windows is supported since Windows 7 (https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity). Until Windows 11/Server 2022 NUMA nodes are split such that they cannot span processor groups. This is because before Windows 11/Server 2022 it's not possible to set thread affinity spanning processor groups. The splitting is done manually in some cases (required after Windows 10 Build 20348). Since Windows 11/Server 2022 we can set affinites spanning processor group so this splitting is not done, so the behaviour is pretty much like on linux.
Linux is supported, **without** libnuma requirement. `lscpu` is expected.
-----------------
Passed 60+1 @ 256t 16000MB hash: https://tests.stockfishchess.org/tests/view/6654e443a86388d5e27db0d8
```
LLR: 2.95 (-2.94,2.94) <0.00,10.00>
Total: 278 W: 110 L: 29 D: 139
Ptnml(0-2): 0, 1, 56, 82, 0
```
Passed SMP STC: https://tests.stockfishchess.org/tests/view/6654fc74a86388d5e27db1cd
```
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67152 W: 17354 L: 17177 D: 32621
Ptnml(0-2): 64, 7428, 18408, 7619, 57
```
Passed STC: https://tests.stockfishchess.org/tests/view/6654fb27a86388d5e27db15c
```
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 131648 W: 34155 L: 34045 D: 63448
Ptnml(0-2): 426, 13878, 37096, 14008, 416
```
fixes#5253
closes https://github.com/official-stockfish/Stockfish/pull/5285
No functional change
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.
Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.
STC 10+0.1 :
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 78496 W: 20516 L: 20135 D: 37845
Ptnml(0-2): 340, 8859, 20456, 9266, 327
https://tests.stockfishchess.org/tests/view/663d47bf507ebe1c0e9200ba
LTC 60+0.6 :
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 94872 W: 24179 L: 23751 D: 46942
Ptnml(0-2): 61, 9743, 27405, 10161, 66
https://tests.stockfishchess.org/tests/view/663e779cbb28828150dd9089
closes https://github.com/official-stockfish/Stockfish/pull/5235
Bench: 1876282
1. The current time management system utilizes limits.inc and
limits.time, which can represent either milliseconds or node count,
depending on whether the nodestime option is active. There have been
several modifications which brought Elo gain for typical uses (i.e.
real-time matches), however some of these changes overlooked such
distinction. This patch adjusts constants and multiplication/division to
more accurately simulate real TC conditions when nodestime is used.
2. The advance_nodes_time function has a bug that can extend the time
limit when availableNodes reaches exact zero. This patch fixes the bug
by initializing the variable to -1 and make sure it does not go below
zero.
3. elapsed_time function is newly introduced to print PV in the UCI
output based on real time. This makes PV output more consistent with the
behavior of trivial use cases.
closes https://github.com/official-stockfish/Stockfish/pull/5186
No functional changes
For each thread persist an accumulator cache for the network, where each
cache contains multiple entries for each of the possible king squares.
When the accumulator needs to be refreshed, the cached entry is used to more
efficiently update the accumulator, instead of rebuilding it from scratch.
This idea, was first described by Luecx (author of Koivisto) and
is commonly referred to as "Finny Tables".
When the accumulator needs to be refreshed, instead of filling it with
biases and adding every piece from scratch, we...
1. Take the `AccumulatorRefreshEntry` associated with the new king bucket
2. Calculate the features to activate and deactivate (from differences
between bitboards in the entry and bitboards of the actual position)
3. Apply the updates on the refresh entry
4. Copy the content of the refresh entry accumulator to the accumulator
we were refreshing
5. Copy the bitboards from the position to the refresh entry, to match
the newly updated accumulator
Results at STC:
https://tests.stockfishchess.org/tests/view/662301573fe04ce4cefc1386
(first version)
https://tests.stockfishchess.org/tests/view/6627fa063fe04ce4cefc6560
(final)
Non-Regression between first and final:
https://tests.stockfishchess.org/tests/view/662801e33fe04ce4cefc660a
STC SMP:
https://tests.stockfishchess.org/tests/view/662808133fe04ce4cefc667c
closes https://github.com/official-stockfish/Stockfish/pull/5183
No functional change
Also fixes searchmoves.
Drop the need of a Position object in uci.cpp.
A side note, it is still required for the static functions,
but these should be moved to a different namespace/class
later on, since sf kinda relies on them.
closes https://github.com/official-stockfish/Stockfish/pull/5169
No functional change
Part 2 of the Split UCI into UCIEngine and Engine refactor.
This creates function callbacks for search to use when an update should occur.
The benching in uci.cpp for example does this to extract the total nodes
searched.
No functional change
- fix naming convention for `workingDirectory`
- use type alias for `EvalFiles` everywhere
- move `ponderMode` into `LimitsType`
- move limits parsing into standalone static function
closes https://github.com/official-stockfish/Stockfish/pull/5098
No functional change
This introduces a form of node counting which can
be used to further tweak the usage of our search
time.
The current approach stops the search when almost
all nodes are searched on a single move.
The idea originally came from Koivisto, but the
implemention is a bit different, Koivisto scales
the optimal time by the nodes effort and then
determines if the search should be stopped.
We just scale down the `totalTime` and stop the
search if we exceed it and the effort is large
enough.
Passed STC:
https://tests.stockfishchess.org/tests/view/65c8e0661d8e83c78bfcd5ec
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 88672 W: 22907 L: 22512 D: 43253
Ptnml(0-2): 310, 10163, 23041, 10466, 356
Passed LTC:
https://tests.stockfishchess.org/tests/view/65ca632b1d8e83c78bfcf554
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 170856 W: 42910 L: 42320 D: 85626
Ptnml(0-2): 104, 18337, 47960, 18919, 108
closes https://github.com/official-stockfish/Stockfish/pull/5053
Bench: 1198939
This addresses the issue where Stockfish may output non-proven checkmate
scores if the search is prematurely halted, either due to a time control
or node limit, before it explores other possibilities where the
checkmate score could have been delayed or refuted.
The fix also replaces staving off from proven mated scores in a
multithread environment making use of the threads instead of a negative
effect with multithreads (1t was better in proving mated in scores than
more threads).
Issue reported on mate tracker repo by and this PR is co-authored with
@robertnurnberg Special thanks to @AndyGrant for outlining that a fix is
eventually possible.
Passed Adj off SMP STC:
https://tests.stockfishchess.org/tests/view/65a125d779aa8af82b96c3eb
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 303256 W: 75823 L: 75892 D: 151541
Ptnml(0-2): 406, 35269, 80395, 35104, 454
Passed Adj off SMP LTC:
https://tests.stockfishchess.org/tests/view/65a37add79aa8af82b96f0f7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 56056 W: 13951 L: 13770 D: 28335
Ptnml(0-2): 11, 5910, 16002, 6097, 8
Passed all tests in matetrack without any better mate for opponent found in 1t and multithreads.
Fixed bugs in https://github.com/official-stockfish/Stockfish/pull/4976
closes https://github.com/official-stockfish/Stockfish/pull/4990
Bench: 1308279
Co-Authored-By: Robert Nürnberg <28635489+robertnurnberg@users.noreply.github.com>
Also remove dead code, `rootSimpleEval` is no longer used since the introduction of dual net.
`iterBestValue` is also no longer used in evaluate and can be reduced to a local variable.
closes https://github.com/official-stockfish/Stockfish/pull/4979
No functional change