This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.
Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.
The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.
This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.
This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.
The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.
The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).
The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:
60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885
40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017
At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:
sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520
sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68
The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.
Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests
Integration has been discussed in various issues:
https://github.com/official-stockfish/Stockfish/issues/2823https://github.com/official-stockfish/Stockfish/issues/2728
The integration branch will be closed after the merge:
https://github.com/official-stockfish/Stockfish/pull/2825https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip
closes https://github.com/official-stockfish/Stockfish/pull/2912
This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.
Bench: 4746616
In some French games, Stockfish likes to bring the Knight to a bad outpost spot. This is evident in TCEC S18 Superfinal Game 63, where there is a Knight outpost on the queenside but is actually useless. Stockfish is effectively playing a piece down while holding ground against Leela's break on the kingside.
This patch turns the +56 mg bonus for a Knight outpost into a -7 mg penalty if it satisfies the following conditions:
* The outpost square is not on the CenterFiles (i.e. not on files C,D,E and F)
* The knight is not attacking non pawn enemies.
* The side where the outpost is located contains only few enemies, with a particular conditional_more_than_two() implementation
Thank you to apospa...@gmail.com for bringing this to our attention and for providing insights.
See https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dEXNzSIBgZU
Reference game: https://tcec-chess.com/#div=sf&game=63&season=18
Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 6960 W: 1454 L: 1247 D: 4259
Ptnml(0-2): 115, 739, 1610, 856, 160
https://tests.stockfishchess.org/tests/view/5f08221059f6f0353289477e
Passed LTC:
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 21440 W: 2767 L: 2543 D: 16130
Ptnml(0-2): 122, 1904, 6462, 2092, 140
https://tests.stockfishchess.org/tests/view/5f0838ed59f6f035328947a2
various related tests show strong test results, but so far no generalizations or simplifications of conditional_more_than_two() are found. See PR for details.
closes https://github.com/official-stockfish/Stockfish/pull/2803
Bench: 4366686
Probcut is a heuristic that wasn't changed a lot in past years,
all attempts to change it using information / writing info to transposition table failed.
This patch has a number of differences that can be summarized as follows:
* For TT write/read we use depth - 3. Because probcut search is depth - 4 but we actually do the move prior to it so effectively we do depth - 3 search;
* In any case of depth of eval from transposition table being >= depth - 3 we either produce cutoff or refuse to even do probcut search, this is allowing us to write info of probcut to transposition table because we know that we wouldn't be overwriting some deeper data with our depth - 3 search - this is an important aspect of this patch;
* For some not really known reason this patch completely ignores tte->bound() - which was the case for previous patch that made probcut interact with TT, maybe 2) is the reason, although it's unproven.
A first version of this patch passed STC and LTC
passed STC
https://tests.stockfishchess.org/tests/view/5f05908a59f6f03532894613
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 95776 W: 18300 L: 17973 D: 59503
Ptnml(0-2): 1646, 10944, 22377, 11279, 1642
passed LTC
https://tests.stockfishchess.org/tests/view/5f06b54059f6f035328946bb
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 57128 W: 7266 L: 6938 D: 42924
Ptnml(0-2): 372, 5163, 17217, 5389, 423
However, an additional bugfix was needed to avoid checking a condition on ttMove if was not available. This passed non-regression bounds on top of the first version:
at STC
https://tests.stockfishchess.org/tests/view/5f080e5059f6f03532894766
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 14096 W: 2800 L: 2628 D: 8668
Ptnml(0-2): 225, 1620, 3238, 1688, 277
at LTC
https://tests.stockfishchess.org/tests/view/5f0836a559f6f0353289479c
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 25352 W: 3228 L: 3139 D: 18985
Ptnml(0-2): 175, 2350, 7549, 2415, 187
closes https://github.com/official-stockfish/Stockfish/pull/2804
Bench 4540940
A number of engines, GUIs and tournaments start to report WDL estimates
along or instead of scores. This patch enables reporting of those stats
in a more or less standard way (http://www.talkchess.com/forum3/viewtopic.php?t=72140)
The model this reporting uses is based on data derived from a few million fishtest LTC games,
given a score and a game ply, a win rate is provided that matches rather closely,
especially in the intermediate range [0.05, 0.95] that data. Some data is shown at
https://github.com/glinscott/fishtest/wiki/UsefulData#win-loss-draw-statistics-of-ltc-games-on-fishtest
Making the conversion game ply dependent is important for a good fit, and is in line
with experience that a +1 score in the early midgame is more likely a win than in the late endgame.
Even when enabled, the printing of the info causes no significant overhead.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 197112 W: 37226 L: 37347 D: 122539
Ptnml(0-2): 2591, 21025, 51464, 20866, 2610
https://tests.stockfishchess.org/tests/view/5ef79ef4f993893290cc146b
closes https://github.com/official-stockfish/Stockfish/pull/2778
No functional change
We lower the endgame value of the evaluation when we detect that there
is only one queen left on the board (more precisely, we use a scale
factor of 37/64, or about 0.58, for the endgame part of the evaluation).
Hopefully this helps a little bit for the assessment of positions with
queen imbalance, which are one of the well-known Stockfish weaknesses.
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 21600 W: 4176 L: 3955 D: 13469
Ptnml(0-2): 351, 2457, 5003, 2598, 391
https://tests.stockfishchess.org/tests/view/5ef871b6020eec13834a94e8
LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 248328 W: 30596 L: 29720 D: 188012
Ptnml(0-2): 1544, 22345, 75665, 22911, 1699
https://tests.stockfishchess.org/tests/view/5ef87aec020eec13834a94fe
Closes https://github.com/official-stockfish/Stockfish/pull/2781
Bench: 4441323
* Supports popcnt (thanks @daylen)
* bits = 64 is now the default
Tested with g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 on ThunderX CN8890,
yields about 9% speedup.
Also tested with clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final).
closes https://github.com/official-stockfish/Stockfish/pull/2770
No functional change.
the option was, since at least 2014, not correctly implemented,
ignoring all dynamic adjustments to optimum time in search.
Instead of fixing it, remove it, no need to expose an option that
will influence time management negatively.
closes https://github.com/official-stockfish/Stockfish/pull/2765
No functional change.
The idea of this patch is that if capture can be described as
"less valuable piece takes more valuable piece" it's not really correct
to add only piece value of captured piece to static evaluation
since there can be more threats in other places and opponent can't really
do much but recapture our capturing piece which leaves us space for
more captures thus winning more material and increasing static eval.
passed STC
https://tests.stockfishchess.org/tests/view/5ef0167b122d6514328d760f
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 24736 W: 4838 L: 4607 D: 15291
Ptnml(0-2): 438, 2812, 5648, 3021, 449
passed LTC
https://tests.stockfishchess.org/tests/view/5ef073bc122d6514328d7693
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 46152 W: 5865 L: 5567 D: 34720
Ptnml(0-2): 312, 4160, 13886, 4354, 364
closes https://github.com/official-stockfish/Stockfish/pull/2761
bench 4789930
We increase a little bit the midgame value of pawns on a4, h4, a6 and h6.
Original idea by Malcolm Campbell, who tried the version restricted to the
pawns on the H column a couple of weeks ago and got a patch which almost
passed LTC. The current pull request just adds the same idea for pawns on
the A column.
Possible follow-ups: maybe tweak the a5/h5 pawn values, and/or add a malus
for very low king mobility in midgame?
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 33416 W: 6516 L: 6275 D: 20625
Ptnml(0-2): 575, 3847, 7659, 4016, 611
https://tests.stockfishchess.org/tests/view/5ee6c4e687586124bc2c10d4
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 134368 W: 16869 L: 16319 D: 101180
Ptnml(0-2): 908, 12083, 40708, 12521, 964
https://tests.stockfishchess.org/tests/view/5ee74e60aae8aec816ab756a
closes https://github.com/official-stockfish/Stockfish/pull/2747
Bench: 5299456
Tuned search constants after many search patches since the last
successful tune.
1st LTC @ 60+0.6 th 1 :
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 57656 W: 7369 L: 7036 D: 43251
Ptnml(0-2): 393, 5214, 17336, 5437, 448
https://tests.stockfishchess.org/tests/view/5ee1e074f29b40b0fc95af19
SMP LTC @ 20+0.2 th 8 :
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 83576 W: 9731 L: 9341 D: 64504
Ptnml(0-2): 464, 7062, 26369, 7406, 487
https://tests.stockfishchess.org/tests/view/5ee35a21f29b40b0fc95b008
The changes were rebased on top of a successful patch by Viz (see #2734)
and two different ways of doing this were tested. The successful test
modified the constants in the patch by Viz in a similar manner to the
tuning run:
LTC (rebased) @ 60+0.6 th 1 :
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 193384 W: 24241 L: 23521 D: 145622
Ptnml(0-2): 1309, 17497, 58472, 17993, 1421
https://tests.stockfishchess.org/tests/view/5ee43319ca6c451633a995f9
Further work: the recent patch to quantize eval #2733 affects search quit
quite a bit, so doing another tune in, say, three months time might be a
good idea.
closes https://github.com/official-stockfish/Stockfish/pull/2735
Bench 4246971
We replace the current decrease of the complexity term in initiative
when shuffling by a direct damping of the evaluation. This scheme may
have two benefits over the initiative approach:
a) the damping effect is more brutal for fortresses with heavy pieces
on the board, because the initiative term is almost an endgame term;
b) the initiative implementation had a funny side effect, almost a bug,
in the rare positions where mg > 0, eg < 0 and the tampered eval
returned a positive value (ie with heavy pieces still on the board):
sending eg to zero via shuffling would **increase** the tampered
eval instead of decreasing it, which is somewhat illogical. This
patch avoids this phenomenon.
STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 43072 W: 8373 L: 8121 D: 26578
Ptnml(0-2): 729, 4954, 9940, 5162, 751
https://tests.stockfishchess.org/tests/view/5ee008ebf29b40b0fc95ade2
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 37376 W: 4816 L: 4543 D: 28017
Ptnml(0-2): 259, 3329, 11286, 3508, 306
https://tests.stockfishchess.org/tests/view/5ee03b06f29b40b0fc95ae0c
Closes https://github.com/official-stockfish/Stockfish/pull/2727
Bench: 4757174
Conceptually group hash clusters into super clusters of 256 clusters.
This scheme allows us to use hash sizes up to 32 TB
(= 2^32 super clusters = 2^40 clusters).
Use 48 bits of the Zobrist key to choose the cluster index. We use 8
extra bits to mitigate the quantization error for very large hashes when
scaling the hash key to cluster index.
The hash index computation is organized to be compatible with the existing
scheme for power-of-two hash sizes up to 128 GB.
Fixes https://github.com/official-stockfish/Stockfish/issues/1349
closes https://github.com/official-stockfish/Stockfish/pull/2722
Passed non-regression STC:
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 37976 W: 7336 L: 7211 D: 23429
Ptnml(0-2): 578, 4295, 9149, 4356, 610
https://tests.stockfishchess.org/tests/view/5edcbaaef29b40b0fc95abc5
No functional change.
This is a non-functional code style change which provides a safe access handler for LineBB.
Also includes an assert in debug mode to verify square correctness.
closes https://github.com/official-stockfish/Stockfish/pull/2719
No functional change
Merging this code into one function `winnable()`.
Should allow common concepts used to adjust the eg value,
either by addition or scaling, to be combined more effectively.
Improve trace function.
closes https://github.com/official-stockfish/Stockfish/pull/2710
No functional change.
The idea of this patch is that if rooks are not directly attacking the opponent king,
they can support king attacks staying behind pawns or minor pieces and be really
deadly if position slightly opens up at enemy king ring ranks. Loosely based on
some stockfish games where it underestimated attacks on it king when enemy has one
or two rooks supporting pawn pushes towards it king.
passed STC
https://tests.stockfishchess.org/tests/view/5ecb093680f2c838b96550f9
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 53672 W: 10535 L: 10265 D: 32872
Ptnml(0-2): 952, 6210, 12258, 6448, 968
passed LTC
https://tests.stockfishchess.org/tests/view/5ecb639f80f2c838b9655117
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 62424 W: 8094 L: 7748 D: 46582
Ptnml(0-2): 426, 5734, 18565, 6042, 445
closes https://github.com/official-stockfish/Stockfish/pull/2700
Bench: 4663220
This patch is a combinaison of two recent parameters tweaks which had
failed narrowly (yellow) at long time control:
• improvement in move ordering during search by softening the distinction
between bad captures and good captures during move generation, leading
to improved awareness of Stockfish of potential piece sacrifices (idea
by Rahul Dsilva)
• increase in the weight of pawns in the "initiative" part of the evaluation
function. With this change Stockfish should have more incentive to exchange
pawns when losing, and to keep pawns when winning.
STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 10704 W: 2178 L: 1974 D: 6552
Ptnml(0-2): 168, 1185, 2464, 1345, 190
https://tests.stockfishchess.org/tests/view/5ec5553b377121ac09e1023d
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 60592 W: 7835 L: 7494 D: 45263
Ptnml(0-2): 430, 5514, 18086, 5817, 449
https://tests.stockfishchess.org/tests/view/5ec55ca2377121ac09e10249
Closes https://github.com/official-stockfish/Stockfish/pull/2691
Bench: 4519117
Do not send the following info string on the first call to
aligned_ttmem_alloc() on Windows:
info string Hash table allocation: Windows large pages [not] used.
The first call occurs before the 'uci' command has been received. This
confuses some GUIs, which expect the first engine-sent command to be
'id' as the response to the 'uci' command. (see https://github.com/official-stockfish/Stockfish/issues/2681)
closes https://github.com/official-stockfish/Stockfish/pull/2689
No functional change.
This is a functional simplification of the time management system.
With this patch, there is a simple equation for each of two distinct
time controls: basetime + increment, and x moves in y seconds (+increment).
These equations are easy to plot and understand making future modifications
or adding additional time controls much easier.
SlowMover is reset to 100 so that is has no effect unless a user changes it.
There are two scaling variables:
* Opt_scale is a scale factor (or percentage) of time to use for this current move.
* Max_scale is a scale factor to apply to the resulting optimumTime.
There seems to be some elo gain in most scenarios.
Better performance is attributable to one of two things:
* minThinkingTime was not allowing reasonable time calculations for very short games like 10+0 or 10+0.01. This is because adding almost no increment and substracting move overhead for 50 moves quickly results in almost 0 time very early in the game. Master depended on minThinkingTime to handle these short games instead of good time management. This patch addresses this issue by lowering minThinkingTime to 0 and adjusting moverOverhead if there are very low increments.
* Notice that the time distribution curves tail downward for the first 10 moves or so. This causes less time to attribute for very early moves leaving more time available for middle moves where more important decisions happen.
Here is a summary of tests for this version at different time controls:
SMP 5+0.05
LLR: 2.97 (-2.94,2.94) {-1.50,0.50}
Total: 46544 W: 7175 L: 7089 D: 32280
Ptnml(0-2): 508, 4826, 12517, 4914, 507
https://tests.stockfishchess.org/tests/user/protonspring
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 20480 W: 3872 L: 3718 D: 12890
Ptnml(0-2): 295, 2364, 4824, 2406, 351
https://tests.stockfishchess.org/tests/view/5ebc343e7dd5693aad4e6873
STC, sudden death
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 7024 W: 1706 L: 1489 D: 3829
Ptnml(0-2): 149, 813, 1417, 938, 195
https://tests.stockfishchess.org/tests/view/5ebc346f7dd5693aad4e6875
STC, TCEC style
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 4192 W: 1014 L: 811 D: 2367
Ptnml(0-2): 66, 446, 912, 563, 109
https://tests.stockfishchess.org/tests/view/5ebc34857dd5693aad4e6877
40/10
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 54032 W: 10592 L: 10480 D: 32960
Ptnml(0-2): 967, 6148, 12677, 6254, 970
https://tests.stockfishchess.org/tests/view/5ebc50597dd5693aad4e688d
LTC, sudden death
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 9152 W: 1391 L: 1263 D: 6498
Ptnml(0-2): 75, 888, 2526, 1008, 79
https://tests.stockfishchess.org/tests/view/5ebc6f5c7dd5693aad4e689b
LTC
LLR: 2.98 (-2.94,2.94) {-1.50,0.50}
Total: 12344 W: 1563 L: 1459 D: 9322
Ptnml(0-2): 70, 1103, 3740, 1171, 88
https://tests.stockfishchess.org/tests/view/5ebc6f4c7dd5693aad4e6899
closes https://github.com/official-stockfish/Stockfish/pull/2678
Bench: 4395562
simplify the usage of the 50 moves counter,
moving it frome the scale factor to initiative.
This patch was inspired by recent games where a blocked or semi-blocked position
was 'blundered', by moving a pawn, into a lost endgame. This patch improves this situation,
finding a more robust move more often.
for example (1s searches with many threads):
```
FEN 8/p3kp2/Pp2p3/1n2PpP1/5P2/1Kp5/8/R7 b - - 68 143
master:
6 bestmove b5c7
6 bestmove e7e8
12 bestmove e7d8
176 bestmove e7d7
patch:
3 bestmove b5c7
5 bestmove e7d8
192 bestmove e7d7
```
fixes https://github.com/official-stockfish/Stockfish/issues/2620
the patch also tests well
passed STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 50168 W: 9508 L: 9392 D: 31268
Ptnml(0-2): 818, 5873, 11616, 5929, 848
https://tests.stockfishchess.org/tests/view/5ebb07287dd5693aad4e680b
passed LTC
LLR: 2.93 (-2.94,2.94) {-1.50,0.50}
Total: 7520 W: 981 L: 870 D: 5669
Ptnml(0-2): 49, 647, 2256, 760, 48
https://tests.stockfishchess.org/tests/view/5ebbff747dd5693aad4e6858
closes https://github.com/official-stockfish/Stockfish/pull/2666
Bench: 4395562
for users that set the needed privilige "Lock Pages in Memory"
large pages will be automatically enabled (see Readme.md).
This expert setting might improve speed, 5% - 30%, depending
on the hardware, the number of threads and hash size. More for
large hashes, large number of threads and NUMA. If the operating
system can not allocate large pages (easier after a reboot), default
allocation is used automatically. The engine log provides details.
closes https://github.com/official-stockfish/Stockfish/pull/2656
fixes https://github.com/official-stockfish/Stockfish/issues/2619
No functional change
fixes https://github.com/official-stockfish/Stockfish/issues/2660
The problem was caused by .depend being created with a rule for tbprobe.o not for syzygy/tbprobe.o.
This patch keeps an explicit list of sources (SRCS), generates OBJS,
and compiles all object files to the src/ directory, consistent with .depend.
VPATH is used to search the syzygy directory as needed.
joint work with @gvreuls
closes https://github.com/official-stockfish/Stockfish/pull/2664
No functional change
The purpose of the code is to allow developers to easily and flexibly
setup SF for a tuning session. Mainly you have just to remove 'const'
qualifiers from the variables you want to tune and flag them for
tuning, so if you have:
int myKing = 10;
Score myBonus = S(5, 15);
Value myValue[][2] = { { V(100), V(20) }, { V(7), V(78) } };
and at the end of the update you may want to call
a post update function:
void my_post_update();
If instead of default Option's min-max values,
you prefer your custom ones, returned by:
std::pair<int, int> my_range(int value)
Or you jus want to set the range directly, you can
simply add below:
TUNE(SetRange(my_range), myKing, SetRange(-200, 200), myBonus, myValue, my_post_update);
And all the magic happens :-)
At startup all the parameters are printed in a
format suitable to be copy-pasted in fishtest.
In case the post update function is slow and you have many
parameters to tune, you can add:
UPDATE_ON_LAST();
And the values update, including post update function call, will
be done only once, after the engine receives the last UCI option.
The last option is the one defined and created as the last one, so
this assumes that the GUI sends the options in the same order in
which have been defined.
closes https://github.com/official-stockfish/Stockfish/pull/2654
No functional change.
This patch increases number of nodes where we produce multicut cutoffs.
The idea is that if our ttMove failed to produce a singular extension
but ttValue is greater than beta we can afford to do one more reduced search
near beta excluding ttMove to see if it will produce a fail high -
and if it does so produce muticut by analogy to existing logic.
passed STC
https://tests.stockfishchess.org/tests/view/5e9a162b5b664cdba0ce6e28
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 58238 W: 11192 L: 10917 D: 36129
Ptnml(0-2): 1007, 6704, 13442, 6939, 1027
passed LTC
https://tests.stockfishchess.org/tests/view/5e9a1e845b664cdba0ce7411
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 137852 W: 17460 L: 16899 D: 103493
Ptnml(0-2): 916, 12610, 41383, 13031, 986
closes https://github.com/official-stockfish/Stockfish/pull/2640
bench 4881443
This is a functional simplification which fixes an awkward numerical cliff.
With master king_safety, no pawns is scored higher than pawn(s) that is/are far from the king. This may motivate SF to throw away pawns to increase king safety. With this patch, there is a consistent value for minPawnDistance where losing a pawn never increases king safety.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 45548 W: 8624 L: 8525 D: 28399
Ptnml(0-2): 592, 4937, 11587, 5096, 562
https://tests.stockfishchess.org/tests/view/5e98ced630be947a14e9ddc5
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 42084 W: 5292 L: 5242 D: 31550
Ptnml(0-2): 193, 3703, 13252, 3649, 245
https://tests.stockfishchess.org/tests/view/5e98e22e30be947a14e9de07
closes https://github.com/official-stockfish/Stockfish/pull/2639
bench 4600292
This idea is loosely based on xoroshiro idea about raisedBeta and ttmoves.
If our ttmove have low enough ttvalue and is deep enough (deeper than our probcut depth) it makes little sense to try probcut moves, since the ttMove already more or less failed to produce one according to transposition table.
passed STC
https://tests.stockfishchess.org/tests/view/5e9673ddc2718dee3c822920
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 72148 W: 14038 L: 13741 D: 44369
Ptnml(0-2): 1274, 8326, 16615, 8547, 1312
passed LTC
https://tests.stockfishchess.org/tests/view/5e96b378c2718dee3c8229bf
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 89054 W: 11418 L: 10996 D: 66640
Ptnml(0-2): 623, 8113, 26643, 8515, 633
closes https://github.com/official-stockfish/Stockfish/pull/2632
bench 4952731
This patch refines the recently introduced interaction between
the space bonus and the number of blocked pawns in a position.
* pawns count as blocked also if their push square is attacked by 2 enemy pawns;
* overall dependence is stronger as well as offset;
* bonus increase is capped at 9 blocked pawns in position;
passed STC
https://tests.stockfishchess.org/tests/view/5e94560663d105aebbab243d
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 29500 W: 5842 L: 5603 D: 18055
Ptnml(0-2): 504, 3443, 6677, 3562, 564
passed LTC
https://tests.stockfishchess.org/tests/view/5e95b383c2aaa99f75d1a14d
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 63504 W: 8329 L: 7974 D: 47201
Ptnml(0-2): 492, 5848, 18720, 6197, 495
closes https://github.com/official-stockfish/Stockfish/pull/2631
bench 4956028
+-------+
| o . . | o their pawns
| x . . | x our pawns
| . x . | <- Can sacrifice to create passer?
+-------+
yes
1 2 3 4 5
+-------+ +-------+ +-------+ +-------+ +-------+
| o . . | | o r . | | o r . | | o . b | | o . b | lowercase: theirs
| x b . | | x . . | | x . R | | x . R | | x . . | uppercase: ours
| . x . | | . x . | | . x . | | . x . | | . x B |
+-------+ +-------+ +-------+ +-------+ +-------+
no no yes no yes
The value of our top pawn depends on our ability to advance our bottom
pawn, levering their blocker. Previously, this pawn configuration was
always scored as passer (although a blocked one).
Add requirements for the square s above our (possibly) sacrificed pawn:
- s must not be occupied by them (1).
- If they attack s (2), we must attack s (3).
- If they attack s with a minor (4), we must attack s with a minor (5).
The attack from their blocker is ignored because it is inherent in the
structure; we are ok with sacrificing our bottom pawn.
LTC
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 37030 W: 4962 L: 4682 D: 27386
Ptnml(0-2): 266, 3445, 10863, 3625, 316
https://tests.stockfishchess.org/tests/view/5e92a2b4be6ede5b954bf239
STC
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 40874 W: 8066 L: 7813 D: 24995
Ptnml(0-2): 706, 4753, 9324, 4890, 764
https://tests.stockfishchess.org/tests/view/5e922199af0a0143109dc90e
closes https://github.com/official-stockfish/Stockfish/pull/2624
Bench: 4828294
In master, if the received ttMove meets the prescribed conditions in the various MovePicker constructors, it is returned as the first move, otherwise we set it to MOVE_NONE. If set to MOVE_NONE, we no longer track what the ttMove was, and it will might be returned later in a list of generated moves. This may be a waste. With this patch, if the ttMove fails to meet the prescribed conditions, we simply skip the TT stages, but still store the move and make sure it's never returned.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 66424 W: 12903 L: 12806 D: 40715
Ptnml(0-2): 1195, 7730, 15230, 7897, 1160
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 45682 W: 5989 L: 5926 D: 33767
Ptnml(0-2): 329, 4361, 13443, 4334, 374
closes https://github.com/official-stockfish/Stockfish/pull/2616
Bench 4928928
Before this commit, some pawns were considered "candidate" passed pawns and given half bonus. After this commit, all of these pawns are scored as passed pawns, and they do not receive less bonus.
STC:
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 21806 W: 4320 L: 4158 D: 13328
Ptnml(0-2): 367, 2526, 5001, 2596, 413
https://tests.stockfishchess.org/tests/view/5e86b4724411759d9d098639
LTC:
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 12590 W: 1734 L: 1617 D: 9239
Ptnml(0-2): 96, 1187, 3645, 1238, 129
https://tests.stockfishchess.org/tests/view/5e86d2874411759d9d098640
This PR and commit are dedicated to our colleague Stefan Geschwentner (@locutus2), one of the most respected and accomplished members of the Stockfish developer community. Stockfish is a volunteer project and has always thrived because of Stefan's talent, insight, generosity, and dedication. Welcome back, Stefan!
closes https://github.com/official-stockfish/Stockfish/pull/2613
Bench: 4831963
In more than 100k local KNPK games, there is no discernible difference between master and master with this endgame removed: master:42971, patch:42973, draws: 3969. Removal does not seem to regress in normal games.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 46390 W: 8998 L: 8884 D: 28508
Ptnml(0-2): 707, 5274, 11163, 5300, 751
https://tests.stockfishchess.org/tests/view/5e83b18ee42a5c3b3ca2ef02
LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 44768 W: 5863 L: 5814 D: 33091
Ptnml(0-2): 251, 3918, 14028, 3905, 282
https://tests.stockfishchess.org/tests/view/5e84a82a4411759d9d0984f4
In tests with a book of endgames that can convert into KNPK, no significant difference can be seen either
```
TC 1.0+0.01
Score of patch vs master: 6131 - 6188 - 27681 [0.499] 40000
Elo difference: -0.5 +/- 1.9, LOS: 30.4 %, DrawRatio: 69.2 %
TC 2.0+0.02
Score of patch vs master: 5740 - 5741 - 28519 [0.500] 40000
Elo difference: -0.0 +/- 1.8, LOS: 49.6 %, DrawRatio: 71.3 %
``
closes https://github.com/official-stockfish/Stockfish/pull/2611
Bench 4512059
The idea behind this patch is that if static eval is really bad so capturing of current piece on spot will still produce a position with an eval much lower than alpha then our best chance is to create some kind of king attack. So captures without check are mostly worse than captures with check and can be reduced more.
passed STC
https://tests.stockfishchess.org/tests/view/5e8514b44411759d9d098543
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 46196 W: 9039 L: 8781 D: 28376
Ptnml(0-2): 750, 5412, 10628, 5446, 862
passed LTC
https://tests.stockfishchess.org/tests/view/5e8530134411759d9d09854c
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 23462 W: 3228 L: 2988 D: 17246
Ptnml(0-2): 186, 2125, 6849, 2405, 166
close https://github.com/official-stockfish/Stockfish/pull/2612
bench 4742598
There is an ambiguity between global and std clamp implementations when compiling in c++17,
and on certain toolchains that are not strictly conforming to c++11.
This is solved by putting our clamp implementation in a namespace.
closes https://github.com/official-stockfish/Stockfish/pull/2572
No functional change.
In November 2019, as a result of the simplification of rank-based outposts by 37698b0,
separate bonuses were introduced for outposts that are currently occupied and outposts
that are reachable on the next move. However, the values of these two bonuses are
quite similar, and they have remained that way for three months of development.
It appears that we can safely retire the separate ReachableOutpost parameter and
use the same Outpost bonus in both cases, restoring the basic principles of Stockfish
outpost evaluation to their pre-November state, while also reducing
the size of the parameter space.
STC:
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 47680 W: 9213 L: 9092 D: 29375
Ptnml(0-2): 776, 5573, 11071, 5594, 826
https://tests.stockfishchess.org/tests/view/5e51e33190a0a02810d09802
LTC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 14690 W: 1960 L: 1854 D: 10876
Ptnml(0-2): 93, 1381, 4317, 1435, 119
https://tests.stockfishchess.org/tests/view/5e52197990a0a02810d0980f
closes https://github.com/official-stockfish/Stockfish/pull/2559
Bench: 4697493
Current move histories are known to work well near the leaves, whilst at
higher depths they aren't very helpful. To address this problem this
patch introduces a table dedicated for what's happening at plies 0-3.
It's structured like mainHistory with ply index instead of color.
It get cleared with each new search and is filled during iterative
deepening at higher depths when recording successful quiet moves near
the root or traversing nodes which were in the principal variation
(ttPv).
Medium TC (20+0.2):
https://tests.stockfishchess.org/tests/view/5e4d358790a0a02810d096dc
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 100910 W: 16682 L: 16376 D: 67852
Ptnml(0-2): 1177, 10983, 25883, 11181, 1231
LTC:
https://tests.stockfishchess.org/tests/view/5e4e2cb790a0a02810d09714
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 80444 W: 10495 L: 10095 D: 59854
Ptnml(0-2): 551, 7479, 23803, 7797, 592
closes https://github.com/official-stockfish/Stockfish/pull/2557
Bench: 4705960
This is a patch that significantly improves playing KNNKP endgames:
```
Score of 2553 vs master: 132 - 38 - 830 [0.547] 1000
Elo difference: 32.8 +/- 8.7, LOS: 100.0 %, DrawRatio: 83.0 %
```
At the same time it reduces the evaluation of this mostly draw engame
from ~7.5 to ~1.5
This patch does not regress against master in normal games:
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 96616 W: 18459 L: 18424 D: 59733
Ptnml(0-2): 1409, 10812, 23802, 10905, 1380
http://tests.stockfishchess.org/tests/view/5e49dfe6f8d1d52b40cd31bc
LTC
LLR: 2.95 (-2.94,2.94) {-1.50,0.50}
Total: 49726 W: 6340 L: 6304 D: 37082
Ptnml(0-2): 239, 4227, 15906, 4241, 250
http://tests.stockfishchess.org/tests/view/5e4ab9ee16fb3df8c4cc01d0
Theory: KNNK is a dead draw, however the presence of the additional weakSide pawn opens up some mate opportunities. The idea is to block the pawn (preferably behind the Troitsky line) with one of the knights and press the weakSide king into a corner. If we can stalemate the king, we release the pawn with the knight (to avoid actual stalemate), and use the knight to complete the mate before the pawn promotes. This is also why there is an additional penalty for advancement of the pawn.
closes https://github.com/official-stockfish/Stockfish/pull/2553
Bench: 4981770
Fixes#2533, fixes#2543, fixes#2423.
the code that prevents false mate announcements depending on the TT
state (GHI), incorrectly used VALUE_MATE_IN_MAX_PLY. The latter
constant, however, also includes, counterintuitively, the TB win range.
This patch fixes that, by restoring the behavior for TB win scores,
while retaining the false mate correctness, and improving the mate
finding ability. In particular
no alse mates are announced with the poisened hash testcase
```
position fen 8/8/8/3k4/8/8/6K1/7R w - - 0 1
go depth 40
position fen 8/8/8/3k4/8/8/6K1/7R w - - 76 1
go depth 20
ucinewgame
```
mates are found with the testcases reported in #2543
```
position fen 4k3/3pp3/8/8/8/8/2PPP3/4K3 w - - 0 1
setoption name Hash value 1024
go depth 55
ucinewgame
```
and
```
position fen 4k3/4p3/8/8/8/8/3PP3/4K3 w - - 0 1
setoption name Hash value 1024
go depth 45
ucinewgame
```
furthermore, on the mate finding benchmark (ChestUCI_23102018.epd),
performance improves over master, roughly reaching performance with the
false mate protection reverted
```
Analyzing 6566 mate positions for best and found mates:
----------------best ---------------found
nodes master revert fixed master revert fixed
16000000 4233 4236 4235 5200 5201 5199
32000000 4583 4585 4585 5417 5424 5418
64000000 4852 4853 4855 5575 5584 5579
128000000 5071 5068 5066 5710 5720 5716
256000000 5280 5282 5279 5819 5827 5826
512000000 5471 5468 5468 5919 5935 5932
```
On a testcase with TB enabled, progress is made consistently, contrary
to master
```
setoption name SyzygyPath value ../../../syzygy/3-4-5/
setoption name Hash value 2048
position fen 1R6/3k4/8/K2p4/4n3/2P5/8/8 w - - 0 1
go depth 58
ucinewgame
```
The PR (prior to a rewrite for clarity)
passed STC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 65405 W: 12454 L: 12384 D: 40567
Ptnml(0-2): 920, 7256, 16285, 7286, 944
http://tests.stockfishchess.org/tests/view/5e441a3be70d848499f63d15
passed LTC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 27096 W: 3477 L: 3413 D: 20206
Ptnml(0-2): 128, 2215, 8776, 2292, 122
http://tests.stockfishchess.org/tests/view/5e44e277e70d848499f63d63
The incorrectly named VALUE_MATE_IN_MAX_PLY and VALUE_MATED_IN_MAX_PLY
were renamed into VALUE_TB_WIN_IN_MAX_PLY and VALUE_TB_LOSS_IN_MAX_PLY,
and correclty defined VALUE_MATE_IN_MAX_PLY and VALUE_MATED_IN_MAX_PLY
were introduced.
One further (corner case) mistake using these constants was fixed (go
mate X), which could lead to a premature return if X > MAX_PLY / 2,
but TB were present.
Thanks to @svivanov72 for one of the reports and help fixing the issue.
closes https://github.com/official-stockfish/Stockfish/pull/2552
Bench: 4932981
Align the TT allocation by 2M to make it huge page friendly and advise the
kernel to use huge pages.
Benchmarks on my i7-8700K (6C/12T) box: (3 runs per bench per config)
vanilla (nps) hugepages (nps) avg
==================================================================================
bench | 3012490 3024364 3036331 3071052 3067544 3071052 +1.5%
bench 16 12 20 | 19237932 19050166 19085315 19266346 19207025 19548758 +1.1%
bench 16384 12 20 | 18182313 18371581 18336838 19381275 19738012 19620225 +7.0%
On my box, huge pages have a significant perf impact when using a big
hash size. They also speed up TT initialization big time:
vanilla (s) huge pages (s) speed-up
=======================================================================
time stockfish bench 16384 1 1 | 5.37 1.48 3.6x
In practice, huge pages with auto-defrag may always be enabled in the
system, in which case this patch has no effect. This
depends on the values in /sys/kernel/mm/transparent_hugepage/enabled
and /sys/kernel/mm/transparent_hugepage/defrag.
closes https://github.com/official-stockfish/Stockfish/pull/2463
No functional change
This patch greatly increases the endgame penalty for having a trapped rook.
Idea was a result of witnessing Stockfish losing some games at CCCC exchanging
pieces in the position with a trapped rook which directly lead to a lost endgame.
This patch should partially fix such behavior making this penalty high even in
deep endgames.
Passed STC
http://tests.stockfishchess.org/tests/view/5e279d7cc3b97aa0d75bc1c4
LLR: 2.94 (-2.94,2.94) {-1.00,3.00}
Total: 8528 W: 1706 L: 1588 D: 5234
Ptnml(0-2): 133, 957, 1985, 1024, 159
Passed LTC
http://tests.stockfishchess.org/tests/view/5e27aee4c3b97aa0d75bc1e1
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 88713 W: 11520 L: 11130 D: 66063
Ptnml(0-2): 646, 8170, 26342, 8492, 676
Closes https://github.com/official-stockfish/Stockfish/pull/2510
Bench: 4964462
----------------------
Comment by Malcolm Campbell:
Congrats! I think this might be a common pattern - scores that seem to mainly apply
to the midgame are often better with a similar (or at least fairly big) endgame value
as well. Maybe there are others eval parameters we can tweak like this...
Currently on a normal bench run in ~0,7% of cases 'improving' is set to
true although the static eval isn't improving at all, just keeping
equal. It looks like the strict gt-operator is more appropriate here,
since it returns to 'improving' its literal meaning without sideffects.
STC {-1.00,3.00} failed yellow:
https://tests.stockfishchess.org/tests/view/5e1ec38c8fd5f550e4ae1c28
LLR: -2.93 (-2.94,2.94) {-1.00,3.00}
Total: 53155 W: 10170 L: 10109 D: 32876
Ptnml(0-2): 863, 6282, 12251, 6283, 892
non-regression LTC passed:
https://tests.stockfishchess.org/tests/view/5e1f1c0d8fd5f550e4ae1c41
LLR: 2.98 (-2.94,2.94) {-1.50,0.50}
Total: 23961 W: 3114 L: 3018 D: 17829
Ptnml(0-2): 163, 2220, 7114, 2298, 170
CLoses https://github.com/official-stockfish/Stockfish/pull/2496
bench: 4561386
This is a non-functional simplification. Looks like std::bitset works good
for the KPKBitbase. Thanks for Jorg Oster for helping get the speed up
(the [] accessor is faster than test()).
Speed testing: 10k calls to probe:
master 9.8 sec
patch 9.8 sec.
STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 100154 W: 19025 L: 18992 D: 62137
Ptnml(0-2): 1397, 11376, 24572, 11254, 1473
http://tests.stockfishchess.org/tests/view/5e21e601346e35ac603b7d2b
Closes https://github.com/official-stockfish/Stockfish/pull/2502
No functional change
This is a non-functional speed-up: master has to access SquareBB twice while this patch
determines opposite_colors just using the values of the squares. It doesn't seem to change
the overall speed of bench, but calling opposite_colors(...) 10 Million times:
master: 39.4 seconds
patch: 11.4 seconds.
The only data point I have (other than my own tests), is a quite old failed STC test:
LLR: -2.93 (-2.94,2.94) [-1.50,4.50]
Total: 24308 W: 5331 L: 5330 D: 13647
Ptnml(0-2): 315, 2577, 6326, 2623, 289
http://tests.stockfishchess.org/tests/view/5e010256c13ac2425c4a9a67
Closes https://github.com/official-stockfish/Stockfish/pull/2498
No functional change
This is a non-functional simplification. If we use the "side to move" of the entry
instead of the template, one of the classify methods goes away. Furthermore, I've
resolved the colors in some of the statements (we're already assuming direction
using NORTH), and used stm (side to move) instead of "us," since this is much clearer
to me.
This is not tested because it is non-functional, only applies building the bitbase
and there are no changes to the binary (on my machine).
Closes https://github.com/official-stockfish/Stockfish/pull/2485
No functional change
This is a non-functional simplification. Instead of passing the piece type
for remove_piece, we can rely on the board. The only exception is en-passant
which must be explicitly set because the destination square for the capture
is not the same as the piece to remove.
Verified also in the Chess960 castling case by running a couple of perft, see
the pull request discussion: https://github.com/official-stockfish/Stockfish/pull/2460
STC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 18624 W: 4147 L: 4070 D: 10407
Ptnml(0-2): 223, 1933, 4945, 1938, 260
http://tests.stockfishchess.org/tests/view/5dfeaa93e70446e17e451163
No functional change
Official release version of Stockfish 11.
Bench: 5156767
-----------------------
It is our pleasure to release Stockfish 11 to our fans and supporters.
Downloads are freely available at http://stockfishchess.org/download/
This version 11 of Stockfish is 50 Elo stronger than the last version, and
150 Elo stronger than the version which famously lost a match to AlphaZero
two years ago. This makes Stockfish the strongest chess engine running on
your smartphone or normal desktop PC, and we estimate that on a modern four
cores CPU, Stockfish 11 could give 1:1000 time odds to the human chess champion
having classical time control, and be on par with him. More specific data,
including nice cumulative curves for the progression of Stockfish strength
over the last seven years, can be found on [our progression page][1], at
[Stefan Pohl site][2] or at [NextChessMove][3].
In October 2019 Stockfish has regained its crown in the TCEC competition,
beating in the superfinal of season 16 an evolution of the neural-network
engine Leela that had won the previous season. This clash of style between an
alpha-beta and an neural-network engine produced spectacular chess as always,
with Stockfish [emerging victorious this time][0].
Compared to Stockfish 10, we have made hundreds of improvements to the
[codebase][4], from the evaluation function (improvements in king attacks,
middlegame/endgame transitions, and many more) to the search algorithm (some
innovative coordination methods for the searching threads, better pruning of
unsound tactical lines, etc), and fixed a couple of bugs en passant.
Our testing framework [Fishtest][5] has also seen its share of improvements
to continue propelling Stockfish forward. Along with a lot of small enhancements,
Fishtest has switched to new SPRT bounds to increase the chance of catching Elo
gainers, along with a new testing book and the use of pentanomial statistics to
be more resource-efficient.
Overall the Stockfish project is an example of open-source at its best, as
its buzzing community of programmers sharing ideas and daily reviewing their
colleagues' patches proves to be an ideal form to develop innovative ideas for
chess programming, while the mathematical accuracy of the testing framework
allows us an unparalleled level of quality control for each patch we put in
the engine. If you wish, you too can help our ongoing efforts to keep improving
it, just [get involved][6] :-)
Stockfish is also special in that every chess fan, even if not a programmer,
[can easily help][7] the team to improve the engine by connecting their PC to
Fishtest and let it play some games in the background to test new patches.
Individual contributions vary from 1 to 32 cores, but this year Bojun Guo
made it a little bit special by plugging a whole data center during the whole
year: it was a vertiginous experience to see Fishtest spikes with 17466 cores
connected playing [25600 games/minute][8]. Thanks Guo!
The Stockfish team
[0]: <http://mytcecexperience.blogspot.com/2019/10/season-16-superfinal-games-91-100.html>
[1]: <https://github.com/glinscott/fishtest/wiki/Regression-Tests>
[2]: <https://www.sp-cc.de/index.htm>
[3]: <https://nextchessmove.com/dev-builds>
[4]: <https://github.com/official-stockfish/Stockfish>
[5]: <https://tests.stockfishchess.org/tests>
[6]: <https://stockfishchess.org/get-involved/>
[7]: <https://github.com/glinscott/fishtest/wiki>
[8]: <https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/lebEmG5vgng%5B1-25%5D>
Based on recent improvement of futility pruning by @locutus2 : we lower
the futility margin to apply it for more nodes but as a compensation
we also lower the history threshold to apply it to less nodes. Further
work in tweaking constants can always be done - numbers are guessed
"by hand" and are not results of some tuning, maybe there is some more
Elo to squeeze from this part of code.
Passed STC
LLR: 2.98 (-2.94,2.94) {-1.00,3.00}
Total: 15300 W: 3081 L: 2936 D: 9283
Ptnml(0-2): 260, 1816, 3382, 1900, 290
http://tests.stockfishchess.org/tests/view/5e18da3b27dab692fcf9a158
Passed LTC
LLR: 2.94 (-2.94,2.94) {0.00,2.00}
Total: 108670 W: 14509 L: 14070 D: 80091
Ptnml(0-2): 813, 10259, 31736, 10665, 831
http://tests.stockfishchess.org/tests/view/5e18fc9627dab692fcf9a180
Bench: 4643972
This patch makes Stockfish search same depth again if > 60% of optimum time is
already used, instead of trying the next iteration. The idea is that the next
iteration will generally take about the same amount of time as has already been
used in total. When we are likely to begin the last iteration, as judged by total
time taken so far > 0.6 * optimum time, searching the last depth again instead of
increasing the depth still helps the other threads in lazy SMP and prepares better
move ordering for the next moves.
STC :
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 13436 W: 2695 L: 2558 D: 8183
Ptnml(0-2): 222, 1538, 3087, 1611, 253
https://tests.stockfishchess.org/tests/view/5e1618a761fe5f83a67dd964
LTC :
LLR: 2.94 (-2.94,2.94) {0.00,2.00}
Total: 32160 W: 4261 L: 4047 D: 23852
Ptnml(0-2): 211, 2988, 9448, 3135, 247
https://tests.stockfishchess.org/tests/view/5e162ca061fe5f83a67dd96d
The code was revised as suggested by @vondele for multithreading:
STC (8 threads):
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 16640 W: 2049 L: 1885 D: 12706
Ptnml(0-2): 119, 1369, 5158, 1557, 108
https://tests.stockfishchess.org/tests/view/5e19826a2cc590e03c3c2f52
LTC (8 threads):
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 16536 W: 2758 L: 2629 D: 11149
Ptnml(0-2): 182, 1758, 4296, 1802, 224
https://tests.stockfishchess.org/tests/view/5e18b91a27dab692fcf9a140
Thanks to those discussing Stockfish lazy SMP on fishcooking which made me
try this, and to @vondele for suggestions and doing related tests.
See full discussion in the pull request thread:
https://github.com/official-stockfish/Stockfish/pull/2482
Bench: 4586187
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 120120 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
This updates estimates from 1.5 year ago, and adds missing terms. All estimates
from tests run on fishtest at 10+0.1 (STC), 20000 games, error bars +- 3 Elo,
see the original message in the pull request for the full list of tests.
Noteworthy changes are step 7 (futility pruning) going from ~30 to ~50 Elo
and step 13 (pruning at shallow depth) going from ~170 to ~200 Elo.
Full list of tests: https://github.com/official-stockfish/Stockfish/pull/2401
@Rocky640 made the suggestion to look at time control dependence of these terms.
I picked two large terms (early futility pruning and singular extension), so with
small relative error. It turns out it is actually quite interesting (see figure 1).
Contrary to my expectation, the Elo gain for early futility pruning is pretty time
control sensitive, while singular extension gain is not.
Figure 1: TC dependence of two search terms

Going back to the old measurement of futility pruning (30 Elo vs today 50 Elo),
the code is actually identical but the margins have changed. It seems like a nice
example of how connected terms in search really are, i.e. the value of early futility
pruning increased significantly due to changes elsewhere in search.
No functional change.
User "adentong" reported recently of a game where Stockfish blundered a game
in a tournament because during a search there was an hash-table issue for
positions inside the tree very close to the 50-moves draw rule. This is part
of a problem which is commonly referred to as the Graph History Interaction (GHI),
and is difficult to solve in computer chess because storing the 50-moves counter
in the hash-table loses Elo in general.
Links:
Issue 2451 : https://github.com/official-stockfish/Stockfish/issues/2451
About the GHI : https://www.chessprogramming.org/Graph_History_Interaction
This patch tries to address the issue in this particular game and similar
reported games: it prevents that values from the transposition table are
getting used when the 50-move counter is close to reaching 100 (). The idea
is that in such cases values from previous searches, with a much lower 50-move
count, become less and less reliable.
More precisely, the heuristic we use in this patch is that we don't take the
transposition table cutoff when we have reached a 45-moves limit, but let the
search continue doing its job. There is a possible slowdown involved, but it will
also help to find either a draw when it thought to be losing, or a way to avoid
the draw by 50-move rule. This heuristics probably will not fix all possible cases,
but seems to be working reasonably well in practice while not losing too much Elo.
Passed non-regression tests:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 274452 W: 59700 L: 60075 D: 154677
http://tests.stockfishchess.org/tests/view/5df546116932658fe9b451bf
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 95235 W: 15297 L: 15292 D: 64646
http://tests.stockfishchess.org/tests/view/5df69c926932658fe9b4520e
Closes https://github.com/official-stockfish/Stockfish/pull/2453
Bench: 4586187
SEE (Static Exchange Evaluation) is a critical component, so we might
indulge some tricks to make it faster. Another pull request #2469 showed
some speedup by removing templates, this version uses Ronald de Man
(@syzygy1) SEE implementation which also unrolls the for loop by
suppressing the min_attacker() helper function and exits as soon as
the last swap is conclusive.
See Ronald de Man version there:
https://github.com/syzygy1/Cfish/blob/master/src/position.c
Patch testes against pull request #2469:
LLR: 2.95 (-2.94,2.94) {-1.00,3.00}
Total: 19365 W: 3771 L: 3634 D: 11960
Ptnml(0-2): 241, 1984, 5099, 2092, 255
http://tests.stockfishchess.org/tests/view/5e10eb135e5436dd91b27ba3
And since we are using new SPRT statistics, and that both pull requests
finished with less than 20000 games I also tested against master as
a speed-up:
LLR: 2.99 (-2.94,2.94) {-1.00,3.00}
Total: 18878 W: 3674 L: 3539 D: 11665
Ptnml(0-2): 193, 1999, 4966, 2019, 250
http://tests.stockfishchess.org/tests/view/5e10febf12ef906c8b388745
Non functional change
This patch excludes blockers for king from mobility area. It was tried a couple
of times by now but now it passed. Performance is not enormously good but this
patch makes a lot of sence - blockers for king can't really move until king moves
(in most cases) so logic behind it is the same as behind excluding king square
from mobility area.
STC
http://tests.stockfishchess.org/tests/view/5dec388651219d7befdc76be
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6155 W: 1428 L: 1300 D: 3427
LTC
http://tests.stockfishchess.org/tests/view/5dec4a3151219d7befdc76d3
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 120800 W: 19636 L: 19134 D: 82030
Bench: 5173081
This patch simplifies latest @MJZ1977 elo gainer. Seems like PvNode check in
condition of last capture extension is not needed. Note - even if this is a
simplification it actually causes this extension to be applied more often, thus
strengthening effect of @MJZ1977's patch.
passed STC
http://tests.stockfishchess.org/tests/view/5deb9a3eb7bdefd50db28d0e
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 80244 W: 17421 L: 17414 D: 45409
passed LTC
http://tests.stockfishchess.org/tests/view/5deba860b7bdefd50db28d11
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 21506 W: 3565 L: 3446 D: 14495
Bench: 5097036
As reported on the forum it is possible, on very rare occasions, that we are
trying to print a PV line with an invalid previousScore, although this line
has a valid actual score. This patch fixes output of PV lines with invalid
scores in a MultiPV search. This is a follow-up patch to 8b15961 and makes
the fix finally complete.
The reason is the i <= pvIdx condition which probably is a leftover from the
times there was a special root search function. This check is no longer needed
today and prevents PV lines past the current one (current pvIdx) to be flagged
as updated even though they do have a valid score.
https://github.com/official-stockfish/Stockfish/commit/8b15961349e18a9ba113973c53f53913d0cd0fadhttps://groups.google.com/forum/?fromgroups=#!topic/fishcooking/PrnoDLvMvro
No functional change.
the removed line is not needed, since with the conditions on SE, eval
equals ttValue (except inCheck), which must be larger than beta if the second condition
is true.
The removed line is also incorrect as eval might be VALUE_NONE at this
location if inCheck. This removal addresses part of https://github.com/official-stockfish/Stockfish/pull/2406#issuecomment-552642608
No functional change.
this patch extends bench to print static evaluations.
./stockfish bench 16 1 1 filename eval
will now print the evaluations for all fens in the file.
This complements the various 'go' flavors for bench and might be useful for debugging and/or tuning.
No functional change.
In a recent commit, "Introduce king flank defenders," a term was introduced
by Michael Chaly (@Vizvezdenec) to reduce king danger based on king defenders,
i.e., friendly attacks on our King Flank and Camp. This is a powerful idea
and broadly applicable to all of our pieces.
An earlier, but narrower, version of a similar idea was already coded into
king danger, with a term reducing king danger simply if we had a bishop and
king attacking the same square -- there is also a similar term for knights,
but roughly three times larger. I had attempted to tweak this term's coefficient
fairly recently, in a series of tests in early September which increased this
coefficient. All failed STC with significantly negative scores.
Now that the king flank defenders term has been introduced, it appears that
the bishop-defense term can be simplified away without compensation or
significant Elo loss.
Where do we go from here? This PR is a natural follow-up to "Introduce king
flank defenders," which proposed simplification with existing and overlapping
terms, such as this one. That PR also mentioned that the coefficient it
introduced appeared arbitrary, so perhaps this PR can facilitate a tweak to
increase king flank defenders' coefficient.
Additionally, this pull request is extremely similar to https://github.com/official-stockfish/Stockfish/pull/1821,
which was (coincidentally) merged a year ago, to the day (November 23, 2018).
That patch also simplified away a linear king danger tropism term, which was
soon after replaced with a quadratic term by @Vizvezdenec (which would not have
passed without the simplification). @Vizvezdenec, again by coincidence, has
recently been trying to implement a quadratic term, this time for defenders
rather than attackers. This history of this evaluation code suggests that
this simplification might be enough to help a patch for quadratic king-flank
defenders pass.
Bench: 4959670
STC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 22209 W: 4920 L: 4800 D: 12489
https://tests.stockfishchess.org/tests/view/5dd444d914339111b9b6bed7
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 152107 W: 24658 L: 24743 D: 102706
https://tests.stockfishchess.org/tests/view/5dd4be31f531e81cf278ea9d
Interesting discussion on Github about this pull request:
https://github.com/official-stockfish/Stockfish/pull/2424
---
This pull request was opened less than one week before the holiday of
Thanksgiving here in the United States. In keeping with the holiday
tradition of expressing gratitude, I would like to thank our generous
CPU donors, talented forum contributors, innovative developers, speedy
fishtest approvers, and especially our hardworking server maintainers
(@ppigazzini and @tomtor). Thank you all for a year of great Stockfish
progress!
This patch measures how frequently search is exploring new configurations.
This is done be computing a running average of ttHit. The ttHitAverage rate
is somewhat low (e.g. 30% for startpos) in the normal case, while it can be
very high if no progress is made (e.g. 90% for the fortress I used for testing).
This information can be used to influence search. In this patch, by adjusting
reductions if the rate > 50%. A first version (using a low ttHitAverageResolution
and this 50% threshold) passed testing:
STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26425 W: 5837 L: 5650 D: 14938
http://tests.stockfishchess.org/tests/view/5dcede8b0ebc5902563258fa
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 32313 W: 5392 L: 5128 D: 21793
http://tests.stockfishchess.org/tests/view/5dcefb1f0ebc590256325c0e
However, as discussed in pull request 2414, using a larger ttHitAverageResolution
gives a better approximation of the underlying distributions. This needs a slight
adjustment for the threshold as the new distributions are shifted a bit compared
to the older ones, and this threshold seemingly is sensitive (we used 0.53125 here).
https://github.com/official-stockfish/Stockfish/pull/2414
This final version also passed testing, and is used for the patch:
STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 16025 W: 3555 L: 3399 D: 9071
http://tests.stockfishchess.org/tests/view/5dd070b90ebc5902579e20c2
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 37576 W: 6277 L: 5998 D: 25301
http://tests.stockfishchess.org/tests/view/5dd0f58e6f544e798086f224
Closes https://github.com/official-stockfish/Stockfish/pull/2414
Bench: 4989584
This patch implements what we have been trying for quite some time -
dependance of kingdanger on balance of attackers and defenders of king
flank, to avoid overestimate attacking power if the opponent has enough
defenders of king position. We already have some form of it in bishop
and knight defenders - this is further work in this direction.
What to do based on this?
1) constant 4 is arbitrary, maybe it is not optimal
2) maybe we can use quadratic formula as in kingflankattack
3) simplification into alrealy existing terms is always a possibility :)
4) overall kingdanger tuning always can be done.
passed STC:
http://tests.stockfishchess.org/tests/view/5dcf40560ebc590256325f30
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26298 W: 5819 L: 5632 D: 14847
passed LTC:
http://tests.stockfishchess.org/tests/view/5dcfa5760ebc590256326464
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 30600 W: 5042 L: 4784 D: 20774
Closes https://github.com/official-stockfish/Stockfish/pull/2415
Bench: 4496847
Introduce OutpostRank[RANK_NB] which contains a bonus according to
the rank of the outpost. We use it for the primary Outpost bonus.
The values are based on the trends of the SPSA tuning run with some
manual tweaks.
Passed STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 27454 W: 6059 L: 5869 D: 15526
http://tests.stockfishchess.org/tests/view/5dcadba20ebc590256922f09
Passed LTC:
LLR: 2.94 (-2.94,2.94) [0.00,3.50]
Total: 57950 W: 9443 L: 9112 D: 39395
http://tests.stockfishchess.org/tests/view/5dcaea880ebc5902569230bc
Bench: 4778405
----------------------------
The inspiration for this patch came from Stefan Geschwentner's attempt
of modifying BishopPawns into a rank-based penalty. Michael Stembera
suggested that maybe the S(0, 0) ranks (3rd, 7th and also maybe 8th)
can still be tuned. This would expand our definition of Outpost and
OutpostRanks would be removed altogether. Special thanks to Mark Tenzer
for all the help and excellent suggestions.
The removed lines approximately duplicate equivalent logic in the movePicker.
Adjust the futility_move_count to componsate for some difference
(the movePicker prunes one iteration of the move loop later).
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8114 W: 1810 L: 1663 D: 4641
http://tests.stockfishchess.org/tests/view/5dc6afe60ebc5902562bd318
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 89956 W: 14473 L: 14460 D: 61023
http://tests.stockfishchess.org/tests/view/5dc6bdcf0ebc5902562bd3c0
Closes https://github.com/official-stockfish/Stockfish/pull/2407
Bench: 4256440
---------------------
How to continue from there?
It would be interesting to see if we can extract some Elo gain
from the new futility_move_count formula, for instance by somehow
incorporating the final -1 in the 5 constant, or adding a linear
term to the quadratics...
```
futility_move_count = (5 + depth * depth) * (1 + improving) / 2 - 1
```
Current master 648c7ec25d will generate an
incorrect mate score for:
```
setoption name Hash value 8
setoption name Threads value 1
position fen 8/1p2KP2/1p4q1/1Pp5/2P5/N1Pp1k2/3P4/1N6 b - - 76 40
go depth 49
```
even though the position is a draw. Generally, SF tries to display only
proven mate scores, so this is a bug.
This was posted http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72166
by Uri Blass, with the correct analysis that this must be related to the
50 moves draw rule being ignored somewhere.
Indeed, this is possible as positions and there eval are stored in the TT,
without reference to the 50mr counter. Depending on the search path followed
a position can thus be mate or draw in the TT (GHI or Graph history interaction).
Therefore, to prove mate lines, the TT content has to be used with care. Rather
than ignoring TT content in general or for mate scores (which impact search or
mate finding), it is possible to be more selective. In particular, @WOnder93
suggested to only ignore the TT if the 50mr draw ply is closer than the mate
ply. This patch implements this idea, by clamping the eval in the TT to
+-VALUE_MATED_IN_MAX_PLY. This retains the TTmove, but causes a research of
these lines (with the current 50mr counter) as needed.
This patch hardly ever affects search (as indicated by the unchanged
bench), but fixes the testcase. As the conditions are very specific,
also mate finding will almost never be less efficient (testing welcome).
It was also shown to pass STC and LTC non-regression testing, in a form
using if/then/else instead of ternary operators:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 93605 W: 15346 L: 15340 D: 62919
http://tests.stockfishchess.org/tests/view/5db45bb00ebc5908127538d4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33873 W: 7359 L: 7261 D: 19253
http://tests.stockfishchess.org/tests/view/5db4c8940ebc5902d6b146fc
closes https://github.com/official-stockfish/Stockfish/issues/2370
Bench: 4362323
This reverts the previous commit. The PSQT changes in this previous
commit originated from tests against quite an old version of master
which did not include the other PSQT changes of 474d133 for the other
pieces, and there might be some unknown interactions between the PSQT
tables. So we made a non-regression test of the last commit against the
last-but-one commit. This test failed, leading to the revert decision.
Failed non-regression test:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 95536 W: 15047 L: 15347 D: 65142
http://tests.stockfishchess.org/tests/view/5dc0ba1d0ebc5904493b0112
Closes https://github.com/official-stockfish/Stockfish/pull/2395
Bench: 4362323
As Stockfish developers, we aim to make our code as legible and as close
to simple English as possible. However, one of the more notable exceptions
to this rule concerns operations between Squares and Bitboards.
Prior to this pull request, AND, OR, and XOR were only defined when the
Bitboard was the first operand, and the Square the second. For example,
for a Bitboard b and Square s, "b & s" would be valid but "s & b" would not.
This conflicts with natural reasoning about logical operators, both
mathematically and intuitively, which says that logical operators should
commute.
More dangerously, however, both Square and Bitboard are defined as integers
"under the hood." As a result, code like "s & b" would still compile and give
reasonable bench values. This trap occasionally ensnares even experienced
Stockfish developers, but it is especially dangerous for new developers not
aware of this peculiarity. Because there is no compilation or runtime error,
and a reasonable bench, only a close review by approvers can spot this error
when a test has been submitted--and many times, these bugs have slipped past
review. This is by far the most common logical error on Fishtest, and has
wasted uncountable STC games over the years.
However, it can be fixed by adding three non-functional lines of code. In this
patch, we define the operators when the operands are provided in the opposite
order, i.e., we make AND, OR, and XOR commutative for Bitboards and Squares.
Because these are inline methods and implemented identically, the executable
does not change at all.
This patch has the small side-effect of requiring Squares to be explicitly
cast to integers before AND, OR, or XOR with integers. This is only performed
twice in Stockfish's source code, and again does not change the executable at
all (since Square is an enum defined as an integer anyway).
For demonstration purposes, this pull request also inverts the order of one AND
and one OR, to show that neither the bench nor the executable change. (This
change can be removed before merging, if preferred.)
I hope that this pull request significantly lowers the barrier-of-entry for new
developer to join the Stockfish project. I also hope that this change will improve
our efficiency in using our generous CPU donors' machines, since it will remove
one of the most common causes of buggy tests.
Following helpful review and comments by Michael Stembera (@mstembera), we add
a further clean-up by implementing OR for two Squares, to anticipate additional
traps developers may encounter and handle them cleanly.
Closes https://github.com/official-stockfish/Stockfish/pull/2387
No functional change.
Simplify the king ring initialization and make it more regular, by just
moving the king square off the edges and using PseudoAttacks by king from
this new square.
There is a small functional difference from the previous master, as the
old master excludes the original ksq square while this patch always includes
the nine squares block (after moving the king from the edges). Additionally,
master does not adjust the kingRing down if we are on relative rank 8,
while this patch treats all of the edges the same.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13263 W: 2968 L: 2830 D: 7465
http://tests.stockfishchess.org/tests/view/5db872830ebc5902d1f388aa
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 72996 W: 11819 L: 11780 D: 49397
http://tests.stockfishchess.org/tests/view/5db899c20ebc5902d1f38b5e
Closes https://github.com/official-stockfish/Stockfish/pull/2384
Bench: 4959244
Make dynamic contempt weight factor dependent on static contempt so that higher
static contempt implies less dynamic contempt and vice versa. For default contempt
24 this is a non-functional change. But tests with contempt 0 shows an elo gain.
Also today is my birthday so i have already give to myself a gift with this patch :-)!
Further proceedings:
in the past we checked for default contempt that it doesn't regress against
contempt 0. Now that the later is stronger and the former is the same strength
this should be rechecked. Perhaps the default contempt have to be lowered.
It would be interesting to get some idea of the impact of this patch outside
of the 0-24 contempt range.
STC: (both with contempt=0)
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 21912 W: 3898 L: 3740 D: 14274
http://tests.stockfishchess.org/tests/view/5db74b6f0ebc5902d1f37405
LTC: (both with contempt=0)
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 27172 W: 3350 L: 3126 D: 20696
http://tests.stockfishchess.org/tests/view/5db760020ebc5902d1f375d0
Closes https://github.com/official-stockfish/Stockfish/pull/2382
No functional change (for current default contempt 24).
This PR refactors update_quiet_stats, update_capture_stats and search to more clearly reflect what is actually done.
Effectively, all stat updates that need to be done after search is finished and a bestmove is found,
are collected in a new function ```final_stats_update()```. This shortens our main search routine, and simplifies ```update_quiet_stats```.
The latter function is now more easily reusable with fewer arguments, as the handling of ```quietsSearched``` is only needed in ```final_stats_update```.
```update_capture_stats```, which was only called once is now integrated in ```final_stats_update```, which allows for removing a branch and reusing some ```stat_bonus``` calls. The need for refactoring was also suggested by the fact that the comments of ```update_quiet_stats``` and ```update_capture_stats``` were incorrect (e.g. ```update_capture_stats``` was called, correctly, also when the bestmove was a quiet and not a capture).
passed non-regression STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 75196 W: 16364 L: 16347 D: 42485
http://tests.stockfishchess.org/tests/view/5db004ec0ebc5902c06db9e1
The diff is most easily readable as ```git diff master --patience```
No functional change
- Cleanups by Alain
- Group king attacks and king defenses
- Signature of futility_move_count()
- Use is_discovery_check_on_king()
- Simplify backward definition
- Use static asserts in move generator
- Factor a statement in move generator
No functional change
Stockfish crashes immediately if users enter a wrong file name (or even an existing
folder name) for debug log file. It may be hard for users to find out since it prints
nothing. If they enter the string via a chess GUI, the chess GUI may remember and
auto-send to Stockfish next time, makes Stockfish crashes all the time. Bug report by
Nguyen Hong Pham in this issue: https://github.com/official-stockfish/Stockfish/issues/2365
This patch avoids the crash and instead prefers to exit gracefully with a error
message on std:cerr, like we do with the fenFile for instance.
Closes https://github.com/official-stockfish/Stockfish/pull/2366
No functional change.
With the current questions and issues around threading, I had a look at
https://github.com/official-stockfish/Stockfish/issues/2299.
It seems there was a problem with data races when requesting eval via UCI while
a search was already running. To fix this an extra thread uithread was created,
presumably to avoid an overlap with Threads.main() that was causing problems.
Making this eval request seems to be outside the scope of UCI, and @vondele also
reports that the data race is not even fixed reliably by this change. I suggest
we simplify the threading here by removing this uithread and adding a comment
signaling that user should not request eval when a search is already running.
Closes https://github.com/official-stockfish/Stockfish/pull/2310
No functional change.
The current bench is missing a position with high 50 moves rule counter,
making most 'shuffle' tests based on 50mr > N seem non-functional.
This patch adds one FEN with high 50mr counter to address this issue
(taken from a recent tcec game).
Four new FENs:
- position with high 50mr counter
- tactical position with many captures, checks, extensions, fails high/low
- two losses by Stockfish in the S16 bonus games against Houdini
See the pull request for nice comments by @Alayan-stk-2 about each position
in bench: https://github.com/official-stockfish/Stockfish/pull/2338
Bench: 4590210
Previously, we used various control statements and ternary operators to divide
Outpost into four bonuses, based on whether the outpost was for a knight or
bishop, and whether it was currently an Outpost or merely a potential ("reachable")
one in the future. Bishop outposts, however, have traditionally been worth far
less Elo in testing. An attempt to remove them altogether passed STC, but failed LTC.
Here we include a narrower simplification, removing the reachable Outpost bonus
for bishops. This bonus was always suspect, given that its current implementation
conflicts directly with BishopPawns. BishopPawns penalizes our bishops based on the
number of friendly pawns on the same color of square, but by definition, Outposts
must be pawn-protected! This PR helps to alleviate this conceptual contradiction
without loss of Elo and with slightly simpler code.
On a code level, this allows us to simplify a ternary operator into the previous
"if" block and distribute a multiplication into an existing constant Score. On a
conceptual level, we retire one of the four traditional Outpost bonuses.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 22277 W: 4882 L: 4762 D: 12633
http://tests.stockfishchess.org/tests/view/5d9aeed60ebc5902b6cf9751
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51206 W: 8353 L: 8280 D: 34573
http://tests.stockfishchess.org/tests/view/5d9af1940ebc5902b6cf9cd5
Closes https://github.com/official-stockfish/Stockfish/pull/2352
Bench: 3941591
This patch changes the base aspiration window size depending on the absolute
value of the previous iteration score, increasing it away from zero. This
stems from the observation that the further away from zero, the more likely
the evaluation is to change significantly with more depth. Conversely, a
tighter aspiration window is more efficient when close to zero.
A beneficial side-effect is that analysis of won positions without a quick
mate is less prone to waste nodes in repeated fail-high that change the eval
by tiny steps.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 60102 W: 13327 L: 12868 D: 33907
http://tests.stockfishchess.org/tests/view/5d9a70d40ebc5902b6cf39ba
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 155553 W: 25745 L: 25141 D: 104667
http://tests.stockfishchess.org/tests/view/5d9a7ca30ebc5902b6cf4028
Future work : the values used in this patch were only a reasonable guess.
Further testing should unveil more optimal values. However, the aspiration
window is rather tight with a minimum of 21 internal units, so discrete
integers put a practical limitation to such tweaking.
More exotic experiments around the aspiration window parameters could also
be tried, but efficient conditions to adjust the base aspiration window size
or allow it to not be centered on the current evaluation are not obvious.
The aspiration window increases after a fail-high or a fail-low is another
avenue to explore for potential enhancements.
Bench: 4043748
Run as a simplification
a) insures that pawn attacks are always included in the pawn span
(this "fixes" the case where some outpost or reachable outpost
bonus were awarded on squares controlled by enemy pawns).
b) compute the full span only if not "backward" or not "blocked".
By looking at "blocked" instead of "opposed", we get a nice simpli-
fication and the "new" outpost detection is almost identical, except
a few borderline cases on rank 4.
passed STC
http://tests.stockfishchess.org/tests/view/5d9950730ebc5902b6cefb90
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 79113 W: 17168 L: 17159 D: 44786
passed LTC
http://tests.stockfishchess.org/tests/view/5d99d14e0ebc5902b6cf0692
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41286 W: 6819 L: 6731 D: 27736
See https://github.com/official-stockfish/Stockfish/pull/2348
bench: 3812891
It is always used as a bool, so let's make it a bool straight away.
We can always redefine it as a Piece in a later patch if we want
to use the piece type or the piece color.
No functional change.
Simplification that eliminates ONE_PLY, based on a suggestion in the forum that
support for fractional plies has never been used, and @mcostalba's openness to
the idea of eliminating it. We lose a little bit of type safety by making Depth
an integer, but in return we simplify the code in search.cpp quite significantly.
No functional change
------------------------------------------
The argument favoring eliminating ONE_PLY:
* The term “ONE_PLY” comes up in a lot of forum posts (474 to date)
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:relevance
* There is occasionally a commit that breaks invariance of the code
with respect to ONE_PLY
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:date/fishcooking/ZIPdYj6k0fk/KdNGcPWeBgAJ
* To prevent such commits, there is a Travis CI hack that doubles ONE_PLY
and rechecks bench
* Sustaining ONE_PLY has, alas, not resulted in any improvements to the
engine, despite many individuals testing many experiments over 5 years.
The strongest argument in favor of preserving ONE_PLY comes from @locutus:
“If we use par example ONE_PLY=256 the parameter space is increases by the
factor 256. So it seems very unlikely that the optimal setting is in the
subspace of ONE_PLY=1.”
There is a strong theoretical impediment to fractional depth systems: the
transposition table uses depth to determine when a stored result is good
enough to supply an answer for a current search. If you have fractional
depths, then different pathways to the position can be at fractionally
different depths.
In the end, there are three separate times when a proposal to remove ONE_PLY
was defeated by the suggestion to “give it a few more months.” So… it seems
like time to remove this distraction from the community.
See the pull request here:
https://github.com/official-stockfish/Stockfish/pull/2289
Remove temporary array of shelters and avoid iterating over it each time to find
if the shelter values after castling are better than the current value.
Work done on top of https://github.com/official-stockfish/Stockfish/pull/2277
Speed benchmark did not measure any difference.
No functional change
In lazySMP it makes sense to prune a little more, as multiple threads
search wider. We thus increase the prefactor of the reductions slowly
as a function of the threads. The prefactor of the log(threads) term
is a parameter, this pull request uses 1/2 after testing.
passed STC @ 8threads:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 118125 W: 23151 L: 22462 D: 72512
http://tests.stockfishchess.org/tests/view/5d8bbf4d0ebc59509180f217
passed LTC @ 8threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 67546 W: 10630 L: 10279 D: 46637
http://tests.stockfishchess.org/tests/view/5d8c463b0ebc5950918167e8
passed ~LTC @ 14threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 74271 W: 12421 L: 12040 D: 49810
http://tests.stockfishchess.org/tests/view/5d8db1f50ebc590f3beb24ef
Note:
A larger prefactor (1) passed similar tests at STC and LTC (8 threads),
while a very large one (2) passed STC quickly but failed LTC (8 threads).
For the single-threaded case there is no functional change.
Closes https://github.com/official-stockfish/Stockfish/pull/2337
Bench: 4088701
Fixup: remove redundant code.
A curious feature of Stockfish's current extension code is its repeated
use of "else if." In most cases, this makes no functional difference,
because no more than one extension is applied; once one extension has
been applied, the remaining ones can be safely ignored.
However, if most singular extension search conditions are true, except
"value < singularBeta", no non-singular extensions (e.g., castling) can
be performed!
Three tests were submitted, for three of Stockfish's four non-singular
extensions. I excluded the shuffle extension, because historically there
have been concerns about the fragility of its conditions, and I did not
want to risk causing any serious search problems.
- Modifying the passed pawn extension appeared roughly neutral at STC. At
best, it appeared to be an improvement of less than 1 Elo.
- Modifying check extension performed very poorly at STC
- Modifying castling extension (this patch) produced a long "yellow" run
at STC (insufficient to pass, but positive score) and a strong LTC.
In simple terms, prior to this patch castling extension was occasionally
not applied during search--on castling moves. The effect of this patch is
to perform castling extension on more castling moves. It does so without
adding any code complexity, simply by replacing an "else if" with "if" and
reordering some existing code.
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 108114 W: 23877 L: 23615 D: 60622
http://tests.stockfishchess.org/tests/view/5d8d86bd0ebc590f3beb0c88
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 20862 W: 3517 L: 3298 D: 14047
http://tests.stockfishchess.org/tests/view/5d8d99cd0ebc590f3beb1899
Bench: 3728191
--------
Where do we go from here?
- It seems strange to me that check extension performed so poorly -- clearly
some of the singular extension conditions are also very important for check
extension. I am not an expert in search, and I do not have any intuition
about which of the eight conditions is/are the culprit. I will try a
succession of eight STC tests to identify the relevant conditions, then try
to replicate this PR for check extension.
- Recent tests interacting with the castle extension may deserve retesting.
I will shortly resubmit a few of my recent castling extension tweaks, rebased
on this PR/commit.
My deepest thanks to @noobpwnftw for the extraordinary CPU donation, and to
all our other fishtest volunteers, who made it possible for a speculative LTC
to pass in 70 minutes!
Closes https://github.com/official-stockfish/Stockfish/pull/2331
Remove the RookOnPawn logic (for rook on rank 5 and above aligning with pawns
on same row or file) which was overlapping with a few other parameters.
Inspired by @31m059 interesting result hinting that a direct attack on pawns
instead of PseudoAttacks might work.
http://tests.stockfishchess.org/tests/view/5d89a7c70ebc595091801b8d
After a few attempts by me and @31m059, and some long STC greens but red LTC,
as a proof of concept I first tried a local SPSA at VSTC trying to tune related
rook psqt rows, and mainly some rook related stuff in evaluate.cpp.
Result was STC green, but still red LTC,
Finally a 100M fishtest SPSA at LTC proved successful both at STC and LTC.
All this was possible with the awesome fishtest contributors.
At some point, I had 850 workers on the last test !
Run as a simplification
STC
http://tests.stockfishchess.org/tests/view/5d8d68f40ebc590f3beaf171
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7399 W: 1693 L: 1543 D: 4163
LTC
http://tests.stockfishchess.org/tests/view/5d8d70270ebc590f3beaf63c
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41617 W: 6981 L: 6894 D: 27742
Closes https://github.com/official-stockfish/Stockfish/pull/2329
bench: 4037914
Revert the previous patch now that the binary for the super-final
of TCEC season 16 has been sent.
Maybe the feature of showing the name of compiler will be added to the
master branch in the future. But we may use a cleaner way to code it, see
some ideas using the Makefile approach at the end of pull request #2327 :
https://github.com/official-stockfish/Stockfish/pull/2327
Bench: 3618154
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 240919 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
This patch replaces the obscure expressions mapping files ABCDEFGH to ABCDDCBA
by explicite calls to an auxiliary function:
old: f = min(f, ~f)
new: f = map_to_queenside(f)
We used the Golbolt web site (https://godbolt.org) to check that the current
code for the auxiliary function is optimal.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30292 W: 6756 L: 6651 D: 16885
http://tests.stockfishchess.org/tests/view/5d8676720ebc5971531d6aa1
Achieved with a bit of help from Sopel97, snicolet and vondele, thanks everyone!
Closes https://github.com/official-stockfish/Stockfish/pull/2325
No functional change
This change to the Rook psqt encourages rook lifts to the third rank
on the two center files.
STC 10+0.1 th 1 :
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 40654 W: 9028 L: 8704 D: 22922
http://tests.stockfishchess.org/tests/view/5d885da60ebc5906dd3e9fcd
LTC 60+0.6 th 1 :
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 56963 W: 9530 L: 9196 D: 38237
http://tests.stockfishchess.org/tests/view/5d88618c0ebc5906dd3ea45f
Thanks to @snicolet for mentioning that Komodo does this a lot and
Stockfish doesn't, which gave me the idea for this patch, and to
@noobpwnftw for providing cores to fishtest which allowed very quick
testing.
Future work: perhaps this can be refined somehow to encourage this
on other files, my attempts have failed.
Closes https://github.com/official-stockfish/Stockfish/pull/2322
Bench: 3950249
Author: @nickpelling
We replace in the code the obscure expressions mapping files ABCDEFGH to ABCDDCBA
by an explicite call to an auxiliary function :
old: f = min(f, ~f)
new: f = map_to_queenside(f)
We used the Golbolt web site (https://godbolt.org) to find the optimal code
for the auxiliary function.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30292 W: 6756 L: 6651 D: 16885
http://tests.stockfishchess.org/tests/view/5d8676720ebc5971531d6aa1
No functional change
When scoring the connected pawns, replace the intricate ternary expressions
choosing the coefficient by a simpler addition of boolean conditions:
` value = Connected * (2 + phalanx - opposed) `
This is the map showing the old coefficients and the new ones:
```
phalanx and unopposed: 3x -> 3x
phalanx and opposed: 1.5x -> 2x
not phalanx and unopposed: 2x -> 2x
not phalanx and opposed: 1x -> 1x
```
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11354 W: 2579 L: 2437 D: 6338
http://tests.stockfishchess.org/tests/view/5d8151f00ebc5971531d244f
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41221 W: 7001 L: 6913 D: 27307
http://tests.stockfishchess.org/tests/view/5d818f930ebc5971531d26d6
Bench: 3959889
blah
It seems there is no other way to specify stack size on std::thread than linker
flags and the effective flags are named differently in many toolchains. On
toolchains where pthread is always available, this patch changes the stack
size change in our C++ code via pthread to ensure a minimum stack size of 8MB,
instead of relying on linker defaults which may be platform-specific.
Also raises default stack size on OSX to current Linux default (8MB) just to
be safe.
Closes https://github.com/official-stockfish/Stockfish/pull/2303
No functional change
This patch decreases the endgame scale factor using the 50 moves counter.
Looking at some games with this patch, it seems to have two effects on
the playing style:
1) when no progress can be made in late endgames (for instance in fortresses
or opposite bishops endgames) the evaluation will be largely tamed down
towards a draw value.
2) more interestingly, there is also a small effect in the midgame play because
Stockfish will panic a little bit if there are more than four consecutive
shuffling moves with an advantage: the engine will try to move a pawn or to
exchange a piece to keep the advantage, so the follow-ups of the position
will be discovered earlier by the alpha-beta search.
passed STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 23017 W: 5080 L: 4805 D: 13132
http://tests.stockfishchess.org/tests/view/5d7e4aef0ebc59069c36fc74
passed LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 30746 W: 5171 L: 4911 D: 20664
http://tests.stockfishchess.org/tests/view/5d7e513d0ebc59069c36ff26
Pull request: https://github.com/official-stockfish/Stockfish/pull/2304
Bench: 4272173
This patch finally introduces something that was tried for years: midgame score
dependance on complexity of position. More precisely, if the position is very
simplified and the complexity measure calculated in the initiative() function
is inferior to -50 by an amount d, then we add this value d to the midgame score.
One example of play of this patch will be (again!) 4 vs 3 etc same flank endgames
where sides have a lot of non-pawn material: 4 vs 3 draw mostly remains the same
draw even if we add a lot of equal material to both sides.
STC run was stopped after 200k games (and not converging):
LLR: -1.75 (-2.94,2.94) [0.50,4.50]
Total: 200319 W: 44197 L: 43310 D: 112812
http://tests.stockfishchess.org/tests/view/5d7cfdb10ebc5902d386572c
passed LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 41051 W: 6858 L: 6570 D: 27623
http://tests.stockfishchess.org/tests/view/5d7d14680ebc5902d3866196
This is the first and not really precise version, a lot of other stuff can be
tried on top of it (separate complexity for middlegame, some more terms, even
simple retuning of values).
Bench: 4248476
Follow-up to previous commit. Update the documentation for the user when using `make`,
to show the preferred bmi2 compile in the advanced examples section.
Note: I made a mistake in the previous commit comment, the documentation is shown when
using `make` or `make help`, not `make --help`.
No functional change
The only change done to the Makefile to get a somewhat faster binary as
discussed in #2291 is to add -msse4 to the compile options of the bmi2 build.
Since all processors supporting bmi2 also support sse4 this can be done easily.
It is a useful step to avoid sending around custom and poorly tested builds.
The speedup isn't enough to pass [0,4] but it is roughly 1.15Elo and a LOS of 90%:
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 93009 W: 20519 L: 20316 D: 52174
Also rewrite the documentation for the user when using `make --help`, so that
the order of architectures for x86-64 has the more performant build one on top.
Closes https://github.com/official-stockfish/Stockfish/pull/2300
No functional change
This patch greatly scales down complexity of endgames when the
following conditions are all true together:
- pawns are all on one flank
- stronger side king is not outflanking weaker side
- no passed pawns are present
This should improve stockfish evaluation of obvious draws 4 vs 3, 3 vs 2
and 2 vs 1 pawns in rook/queen/knight/bishop single flank endgames where
strong side can not make progress.
passed STC
LLR: 2.94 (-2.94,2.94) [0.50,4.50]
Total: 15843 W: 3601 L: 3359 D: 8883
passed LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 121275 W: 20107 L: 19597 D: 81571
Closes https://github.com/official-stockfish/Stockfish/pull/2298
Bench: 3954190
==========================
How to continue from there?
a) This could be a powerful idea for refining some parts of the evaluation
function, a bit like when we try quadratics or other equations to emphasize
certain situations (xoto10).
b) Some other combinaison values for this bonus can be done further, or
overall retuning of weight and offset while keeping the formula simple.
This is a simplification that integrated WeakLever into doubled pawns.
Since we already check for !support for Doubled pawns, it is trivial
to check for weak lever by just checking more_than_one(lever).
We also introduce the Score * bool operation overload to remove some
casts in the code.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26757 W: 5842 L: 5731 D: 15184
http://tests.stockfishchess.org/tests/view/5d77ee220ebc5902d384e5a4
Closes https://github.com/official-stockfish/Stockfish/pull/2295
No functional change
Maintain best move counter at the root and allow there only moves which has a counter
of zero for Late Move Reduction. For compensation only the first three moves are excluded
from Late Move Reduction per default instead the first four moves.
What we can further do:
- here we use a simple counting scheme but perhaps some aging to fade out early iterations
could be helpful
- use the best move counter also at inner nodes for LMR and/or pruning
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 17414 W: 3984 L: 3733 D: 9697
http://tests.stockfishchess.org/tests/view/5d6234bb0ebc5939d09f2aa2
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 38058 W: 6448 L: 6166 D: 25444
http://tests.stockfishchess.org/tests/view/5d62681a0ebc5939d09f2f27
Closes https://github.com/official-stockfish/Stockfish/pull/2282
Bench: 3568210
Remove one parameter in function evaluate_shelter(), making all
comparisons for castled/uncastled shelter locally in do_king_safety().
Also introduce BlockedStorm penalty.
Passed non-regression test at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65864 W: 14630 L: 14596 D: 36638
http://tests.stockfishchess.org/tests/view/5d5fc80c0ebc5939d09f0acc
No functional change
@Vizvezdenec array suggested that alternate values may be better than current
master (see pull request #2270 ). I tuned some linear equations to more closely
represent his values and it passed. These futility values seem quite sensitive,
so perhaps additional Elo improvements can be found here.
STC
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 12257 W: 2820 L: 2595 D: 6842
http://tests.stockfishchess.org/tests/view/5d5b2f360ebc5925cf1111ac
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 20273 W: 3497 L: 3264 D: 13512
http://tests.stockfishchess.org/tests/view/5d5c0d250ebc5925cf111ac3
Closes https://github.com/official-stockfish/Stockfish/pull/2272
------------------------------------------
How to continue from there ?
a) we can try a simpler version for the futility margin, this would
be a simplification :
margin = 188 * (depth - improving)
b) on the other direction, we can try a complexification by trying
again to gain Elo with an complete array of futility values.
------------------------------------------
Bench: 4330402
Replace calls to count(key) + operator[key] with a single call to find(key).
Replace the std::map with std::unordered_map which provide O(1) access,
although the map has a really small number of objects.
Test with [0..4] failed yellow:
TC 10+0.1
SPRT elo0: 0.00 alpha: 0.05 elo1: 4.00 beta: 0.05
LLR -2.96 [-2.94,2.94] (rejected)
Elo 1.01 [-0.87,3.08] (95%)
LOS 85.3%
Games 71860 [w:22.3%, l:22.2%, d:55.5%]
http://tests.stockfishchess.org/tests/view/5d5432210ebc5925cf109d61
Closes https://github.com/official-stockfish/Stockfish/pull/2269
No functional change
Non functional simplification when we find the passed pawns in pawn.cpp
and some code clean up. It also better follows the pattern "flag the pawn"
and "score the pawn".
-------------------------
The idea behind the third condition for candidate passed pawn is a little
bit difficult to visualize. Just for the record, the idea is the following:
Consider White e5 d4 against black e6. d4 can (in some endgames) push
to d5 and lever e6. Thanks to this sacrifice, or after d5xe6, we consider
e5 as "passed".
However:
- if White e5/d4 against black e6/c6: d4 cannot safely push to d5 since d5 is double attacked;
- if White e5/d4 against black e6/d5: d4 cannot safely push to d5 since it is occupied.
This is exactly what the following expression does:
```
&& (shift<Up>(support) & ~(theirPawns | dblAttackThem)))
```
--------------------------
http://tests.stockfishchess.org/tests/view/5d3325bb0ebc5925cf0e6e91
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 124666 W: 27586 L: 27669 D: 69411
Closes https://github.com/official-stockfish/Stockfish/pull/2255
No functional change
This exploits the recent fractional Skill Level, and is a result from some discussion in #2221 and the older #758.
Basically, if UCI_LimitStrength is set, it will internally convert UCI_Elo to a matching fractional Skill Level.
The Elo estimate is based on games at TC 60+0.6, Hash 64Mb, 8moves_v3.pgn, rated with Ordo, anchored to goldfish1.13 (CCRL 40/4 ~2000).
Note that this is mostly about internal consistency, the anchoring to CCRL is a bit weak, e.g. within this tournament,
goldfish and sungorus only have a 200Elo difference, their rating difference on CCRL is 300Elo.
I propose that we continue to expose 'Skill Level' as an UCI option, for backwards compatibility.
The result of a tournament under those conditions are given by the following table, where the player name reflects the UCI_Elo.
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(%)
1 Elo2837 : 2792.2 50.8 536.5 711 75 100
2 Elo2745 : 2739.0 49.0 487.5 711 69 100
3 Elo2654 : 2666.4 49.2 418.0 711 59 100
4 Elo2562 : 2604.5 38.5 894.5 1383 65 100
5 Elo2471 : 2515.2 38.1 651.5 924 71 100
6 Elo2380 : 2365.9 35.4 478.5 924 52 100
7 Elo2289 : 2290.0 28.0 864.0 1596 54 100
8 sungorus1.4 : 2204.9 27.8 680.5 1596 43 60
9 Elo2197 : 2201.1 30.1 523.5 924 57 100
10 Elo2106 : 2103.8 24.5 730.5 1428 51 100
11 Elo2014 : 2030.5 30.3 377.5 756 50 98
12 goldfish1.13 : 2000.0 ---- 511.0 1428 36 100
13 Elo1923 : 1928.5 30.9 641.5 1260 51 100
14 Elo1831 : 1829.0 42.1 370.5 756 49 100
15 Elo1740 : 1738.3 42.9 277.5 756 37 100
16 Elo1649 : 1625.0 42.1 525.5 1260 42 100
17 Elo1558 : 1521.5 49.9 298.0 756 39 100
18 Elo1467 : 1471.3 51.3 246.5 756 33 100
19 Elo1375 : 1407.1 51.9 183.0 756 24 ---
It can be observed that all set Elos correspond within the error bars with the observed Ordo rating.
No functional change
This is a functional simplification that removes the std::pow from reduction. The resulting reduction values are within 1% of master.
This is a simplification because i believe an fp addition and multiplication is much faster than a call to std::pow() which is historically slow and performance varies widely on different architectures.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23471 W: 5245 L: 5127 D: 13099
http://tests.stockfishchess.org/tests/view/5d27ac1b0ebc5925cf0d476b
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51533 W: 8736 L: 8665 D: 34132
http://tests.stockfishchess.org/tests/view/5d27b74e0ebc5925cf0d493c
Bench 3765158
This is another functional simplification to Stockfish passed pawn evaluation.
Stockfish evaluates some pawns which are not yet passed as "candidate" passed pawns, which are given half the bonus of fully passed ones. Prior to this commit, Stockfish considered a passed pawn to be a "candidate" if (a) it would not be a passed pawn if moved one square forward (the blocking square), or (b) there were other pawns (of either color) in front of it on the file. This latter condition used a fairly complicated method, forward_file_bb; here, rather than inspect the entire forward file, we simply re-use the blocking square. As a result, some pawns previously considered "candidates", but which are able to push forward, no longer have their bonus halved.
Simplification tests passed quickly at both STC and LTC. The results from both tests imply that this simplification is, most likely, additionally a small Elo gain, with a LTC likelihood of superiority of 87 percent.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12908 W: 2909 L: 2770 D: 7229
http://tests.stockfishchess.org/tests/view/5d2a1c880ebc5925cf0d9006
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20723 W: 3591 L: 3470 D: 13662
http://tests.stockfishchess.org/tests/view/5d2a21fd0ebc5925cf0d9118
Bench: 3377831
Current master code made sence when we had 2 types of bonuses for protected path to queen. But it was simplified so we have only one bonus now and code was never cleaned.
This non-functional simplification removes useless defendedsquares bitboard and removes one bitboard assignment (defendedSquares &= attackedBy[Us][ALL_PIECES] + defendedSquares & blockSq becomes just attackedBy[Us][ALL_PIECES] & blockSq also we never assign defendedSquares = squaresToQueen because we don't need it).
So should be small non-functional speedup.
Passed simplification SPRT.
http://tests.stockfishchess.org/tests/view/5d2966ef0ebc5925cf0d7659
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23319 W: 5152 L: 5034 D: 13133
bench 3361902
In Stockfish, both the middlegame and endgame bonus for a passed pawn are calculated as a product of two factors. The first is k, chosen based on the presence of defended and unsafe squares. The second is w, a quadratic function of the pawn's rank. Both are only applied if the pawn's relative rank is at least RANK_4.
It does not appear that the complexity of a quadratic function is necessary for w. Here, we replace it with a simpler linear one, which performs equally at both STC and LTC.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46814 W: 10386 L: 10314 D: 26114
http://tests.stockfishchess.org/tests/view/5d29686e0ebc5925cf0d76a1
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82372 W: 13845 L: 13823 D: 54704
http://tests.stockfishchess.org/tests/view/5d2980650ebc5925cf0d7bfd
Bench: 3328507
We recently added a bonus for double pawn attacks on unsupported enemy pawns,
on June 27. However, it is possible that the unsupported pawn may become a passer
by simply pushing forward out of the double attack. By rewarding double attacks,
we may inadvertently reward the creation of enemy passers, by encouraging both of
our would-be stoppers to attack the enemy pawn even if there is no opposing
friendly pawn on the same file.
Here, we revise this term to exclude passed pawns. In order to simplify the code
with this change included, we non-functionally rewrite Attacked2Unsupported to
be a penalty for enemy attacks on friendly pawns, rather than a bonus for our
attacks on enemy pawns. This allows us to exclude passed pawns with a simple
& ~e->passedPawns[Us], while passedPawns[Them] is not yet defined in this part
of the code.
This dramatically reduces the proportion of positions in which Attacked2Unsupported
is applied, to about a third of the original. To compensate, maintaining the same
average effect across our bench positions, we nearly triple Attacked2Unsupported
from S(0, 20) to S(0, 56). Although this pawn formation is rare, it is worth more
than half a pawn in the endgame!
STC: (stopped automatically by fishtest after 250,000 games)
LLR: -0.87 (-2.94,2.94) [0.50,4.50]
Total: 250000 W: 56585 L: 55383 D: 138032
http://tests.stockfishchess.org/tests/view/5d25795e0ebc5925cf0cfb51
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 81038 W: 13965 L: 13558 D: 53515
http://tests.stockfishchess.org/tests/view/5d25f3920ebc5925cf0d10dd
Closes https://github.com/official-stockfish/Stockfish/pull/2233
Bench: 3765158
PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.
Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.
On my POWER8 VM, this leads to a ~4.27% speedup.
Fir prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).
This leads to a ~5% speedup on my POWER8 VM.
No functional change
The current skill levels (1-20) allow for adjusting playing strengths, but
do so in big steps (e.g. level 10 vs level 11 is a ~143 Elo jump at STC).
Since the 'Skill Level' input can already be a floating point number, this
patch uses the fractional part of the input to provide the user with
fine control, allowing for varying the playing strength essentially
continuously.
The implementation internally still uses integer skill levels (needed since they pick Depths),
but non-deterministically rounds up or down the used skill level such that the average integer
skill corresponds to the input floating point one. As expected, intermediate
(fractional) skill levels yield intermediate playing strenghts.
Tested at STC, playing level 10 against levels between 10 and 11 for 10000 games
level 10.25 ELO: 24.26 +-6.2
level 10.5 ELO: 67.51 +-6.3
level 10.75 ELO: 98.52 +-6.4
level 11 ELO: 143.65 +-6.7
http://tests.stockfishchess.org/tests/view/5cd9c6b40ebc5925cf056791http://tests.stockfishchess.org/tests/view/5cd9d22b0ebc5925cf056989http://tests.stockfishchess.org/tests/view/5cd9cf610ebc5925cf056906http://tests.stockfishchess.org/tests/view/5cd9d2490ebc5925cf05698e
No functional change.
Initialization of larger hash sizes can take some time.
Don't include this time in the bench by resetting the timer after Search::clear().
Also move 'ucinewgame' command down in the list, so that it is processed
after the configuration of Threads and Hash size.
No functional change.
This is a functional change that rewards double attacks on an unsupported pawns.
STC (non-functional difference)
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 83276 W: 18981 L: 18398 D: 45897
http://tests.stockfishchess.org/tests/view/5d0970500ebc5925cf0a77d4
LTC (incomplete looping version)
LLR: 0.50 (-2.94,2.94) [0.00,3.50]
Total: 82999 W: 14244 L: 13978 D: 54777
http://tests.stockfishchess.org/tests/view/5d0a8d480ebc5925cf0a8d58
LTC (completed non-looping version).
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 223381 W: 38323 L: 37512 D: 147546
http://tests.stockfishchess.org/tests/view/5d0e80510ebc5925cf0ad320
Closes https://github.com/official-stockfish/Stockfish/pull/2205
Bench 3633546
----------------------------------
Comments by Alain SAVARD:
interesting result ! I would have expected that search would resolve such positions
correctly on the very next move. This is not a very common pattern, and when it happens,
it will quickly disappear. So I'm quite surprised that it passed LTC.
I would be even more surprised if this would resist a simplification.
Anyway, let's try to imagine a few cases.
a) If you have White d5 f5 against Black e6, and White to move
last move by Black was probably a capture on e6 and White is about to recapture on e6
b) If you have White d5 f5 against e6, and Black to move
last move by White was possibly a capture on d5 or f5
or the pawn on e6 was pinned or could not move for some reason.
and white wants to blast open the position and just pushed d4-d5 or f4-f5
Some possible follow-ups
a) Motif is so rare that the popcount() can be safely replaced with a bool()
But this would not pass a SPRT[0,4],
So try a simplification with bool() and also without the & ~theirAttacks
b) If it works, we probably can simply have this in the loop
if (lever) score += S(0, 20);
c) remove all this and tweak something in search for pawn captures (priority, SEE, extension,..)
-removes wideUnsafeSquares bitboard
-removes a couple of bitboard operations
-removes one if operator
-updates comments so they actually represent what this part of code is doing now.
passed non-regression STC
http://tests.stockfishchess.org/tests/view/5d0c1ae50ebc5925cf0aa8db
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16892 W: 3865 L: 3733 D: 9294
No functional change
Use comparison of eval with beta to predict potential cutNodes. This
allows multi-cut pruning to also prune possibly mislabeled Pv and NonPv
nodes.
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 54305 W: 12302 L: 11867 D: 30136
http://tests.stockfishchess.org/tests/view/5d048ba50ebc5925cf0a15e8
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 189512 W: 32620 L: 31904 D: 124988
http://tests.stockfishchess.org/tests/view/5d04bf740ebc5925cf0a17f0
Normally I would think such changes are risky, specially for PvNodes,
but after trying a few other versions, it seems this version is more
sound than I initially thought.
Aside from this, a small funtional change is made to return
singularBeta instead of beta to be more consistent with the fail-soft
logic used in other parts of search.
=============================
How to continue from there ?
We could try to audit other parts of the search where the "cutNode"
variable is used, and try to use dynamic info based on heuristic
eval rather than on this variable, to check if the idea behind this
patch could also be applied successfuly.
Bench: 3503788
Fixes issues #2126 and #2189 where no progress in rootDepth is made for particular fens:
8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46
the cause are the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, and progess can not be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.
The patch was tested as a bug fix and passed:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56601 W: 12633 L: 12581 D: 31387
http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52042 W: 8907 L: 8837 D: 34298
http://tests.stockfishchess.org/tests/view/5d0319420ebc5925cf09fe57
Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:
VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a
Bench: 3961247
Since root_probe() and root_probe_wdl() do not reset all tbRank values if they fail,
it is necessary to do this in rank_root_move(). This fixes issue #2196.
Alternatively, the loop could be moved into both root_probe() and root_probe_wdl().
No functional change
This is a non-functional simplification.
backmost_sq and frontmost_sq are redundant. It seems quite clear to always use frontmost_sq and use the correct color.
Non functional change.
Increase size of the pawns table by the factor 8. This decreases the number of recalculations of pawn structure information significantly (at least at LTC).
I have done measurements for different depths and pawn cache sizes.
First are given the number of pawn entry calculations are done (in parentheses is the frequency that a call to probe triggers a pawn entry calculation). The delta% are the percentage of less done pawn entry calculations in comparison to master
VSTC: bench 1 1 12
STC: bench 8 1 16
LTC: bench 64 1 20
VLTC: bench 512 1 24
VSTC STC LTC VLTC
master 82218(6%) 548935(6%) 2415422(7%) 9548071(7%)
pawncache*2 79859(6%) 492943(5%) 2084794(6%) 8275206(6%)
pawncache*4 78551(6%) 458758(5%) 1827770(5%) 7112531(5%)
pawncache*8 77963(6%) 439421(4%) 1649169(5%) 6128652(4%)
delta%(p2-m) -2.9% -10.2% -13.7% -13.3%
delta%(p4-m) -4.5% -16.4% -24.3% -25.5%
delta%(p8-m) -5.2% -20.0% -31.7% -35.8%
STC: (non-regression test because at STC the effect is smaller than at LTC)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22767 W: 5160 L: 5040 D: 12567
http://tests.stockfishchess.org/tests/view/5d00f6040ebc5925cf09c3e2
LTC:
LLR: 2.94 (-2.94,2.94) [0.00,4.00]
Total: 26340 W: 4524 L: 4286 D: 17530
http://tests.stockfishchess.org/tests/view/5d00a3810ebc5925cf09ba16
No functional change.
This is a functional simplification. This is NOT the exact version that was tested. Beyond the testing, an assignment was removed and a piece changes for consistency.
Instead of rewarding ANY square past an opponent pawn as an "outpost," only use squares that are protected by our pawn. I believe this is more consistent with what the chess world calls an "outpost."
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23540 W: 5387 L: 5269 D: 12884
http://tests.stockfishchess.org/tests/view/5cf51e6d0ebc5925cf08b823
LTC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 53085 W: 9271 L: 9204 D: 34610
http://tests.stockfishchess.org/tests/view/5cf5279e0ebc5925cf08b992
bench 3424592
Stockfish evaluates passed pawns in part based on a variable k, which shapes the passed pawn bonus based on the number of squares between the current square and promotion square that are attacked by enemy pieces, and the number defended by friendly ones. Prior to this commit, we gave a large bonus when all squares between the pawn and the promotion square were defended, and if they were not, a somewhat smaller bonus if at least the pawn's next square was. However, this distinction does not appear to provide any Elo at STC or LTC.
Where do we go from here? Many promising Elo-gaining patches were attempted in the past few months to refine passed pawn calculation, by altering the definitions of unsafe and defended squares. Stockfish uses these definitions to choose the value of k, so those tests interact with this PR. Therefore, it may be worthwhile to retest previously promising but not-quite-passing tests in the vicinity of this patch.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42344 W: 9455 L: 9374 D: 23515
http://tests.stockfishchess.org/tests/view/5cf83ede0ebc5925cf0904fb
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 69548 W: 11855 L: 11813 D: 45880
http://tests.stockfishchess.org/tests/view/5cf8698f0ebc5925cf0908c8
Bench: 3854907
This is a non-functional simplification. Since our file_bb handles either Files or Squares, using Square here removes some code. Not likely any performance difference despite the test.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6081 W: 1444 L: 1291 D: 3346
http://tests.stockfishchess.org/tests/view/5ceb3e2e0ebc5925cf07ab03
Non functional change.
We evaluate defended and unsafe squares for a passed pawn push based on friendly and enemy rooks and queens on the passed pawn's file. Prior to this patch, we further required that these rooks and queens be able to directly attack the passed pawn. However, this restriction appears unnecessary and worth almost exactly 0 Elo at LTC.
The simplified code allows rooks and queens to attack/defend the passed pawn through other pieces of either color.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 29019 W: 6488 L: 6381 D: 16150
http://tests.stockfishchess.org/tests/view/5cdcf7270ebc5925cf05d30c
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54224 W: 9200 L: 9133 D: 35891
http://tests.stockfishchess.org/tests/view/5cddc6210ebc5925cf05eca3
Bench: 3415326
This is a non-functional simplification that combines vote counting and thread selecting in the same loop.
It is possible that the best thread would be updated more frequently than master, but I'm not sure it matters here. Perhaps "mostVotes" is a better name than "bestVote?"
STC (stopped early).
LLR: 0.70 (-2.94,2.94) [-3.00,1.00]
Total: 10714 W: 2329 L: 2311 D: 6074
http://tests.stockfishchess.org/tests/view/5ccc71470ebc5925cf03d244
No functional change.
Treat all threads the same as main thread and increment
failedHighCnt on fail highs. This makes the search try
again at lower depth.
@vondele suggested also changing the reset of failedHighCnt
when there is a fail low. Tests including this passed so the
branch has been updated to include both changes. failedHighCnt
is now handled exactly the same in helper threads and the main
thread. Thanks vondele :-)
STC @ 5+0.05 th 4 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 7769 W: 1704 L: 1557 D: 4508
http://tests.stockfishchess.org/tests/view/5c9f19520ebc5925cfffd2a1
LTC @ 20+0.2 th 8 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 37888 W: 5983 L: 5889 D: 26016
http://tests.stockfishchess.org/tests/view/5c9f57d10ebc5925cfffd696
Bench 3824325
Similar to PSQT we only need one instance of the Endgames resource. The current per thread copies are identical and read only(after initialization) so from a design point of view it doesn't make sense to have them.
Tested for no slowdown.
http://tests.stockfishchess.org/tests/view/5c94377a0ebc5925cfff43ca
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17320 W: 3487 L: 3359 D: 10474
No functional change.
Store repetition info in StateInfo instead of recomputing it in
three different places. This saves some work in has_game_cycle()
where this info is needed for positions before the root.
Tested for non-regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34104 W: 7586 L: 7489 D: 19029
http://tests.stockfishchess.org/tests/view/5cd0676e0ebc5925cf044b56
No functional change.
High rootDepths, selDepths and generally searches are increasingly
common with long time control games, analysis, and improving hardware.
In this case, depths of MAX_DEPTH/MAX_PLY (128) can be reached,
and the search tree is truncated.
In principle MAX_PLY can be easily increased, except for a technicality
of storing depths in a signed 8 bit int in the TT. This patch increases
MAX_PLY by storing the depth in an unsigned 8 bit, after shifting by the
most negative depth stored in TT (DEPTH_NONE).
No regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42235 W: 9565 L: 9484 D: 23186
http://tests.stockfishchess.org/tests/view/5cdb35360ebc5925cf0595e1
Verified to reach high depths on
k1b5/1p1p4/pP1Pp3/K2pPp2/1P1p1P2/3P1P2/5P2/8 w - -
info depth 142 seldepth 154 multipv 1 score cp 537 nodes 26740713110 ...
No bench change.
This functional simplification removes the PvNode reduction and adjusts
the ttPv lmr condition accordingly. Their definitions only differ by the
inclusions of ttPv. Aside from this, shallow move pruning definition
will be the only other functional difference, but this does not seem to
matter too much.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58908 W: 12980 L: 12932 D: 32996
http://tests.stockfishchess.org/tests/view/5cd1aaaa0ebc5925cf046c6a
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20351 W: 3521 L: 3399 D: 13431
http://tests.stockfishchess.org/tests/view/5cd23fa70ebc5925cf047cd2
Bench: 3687854
In master search() may incorrectly return a draw score in the following
corner case: there was a 2-fold repetition during the game, and the
current position can be reached by a move from a repeated one. This case
is treated as an upcoming 3-fold repetition, which it is not.
Here is a testcase demonstrating the issue (note that the moves
after FEN are required). The input:
position fen 8/8/8/8/8/8/p7/2k4K b - - 0 1 moves c1b1 h1g1 b1c1 g1h1 c1b1 h1g1 b1a1 g1h1
go movetime 1000
produces the output:
[...]
info depth 127 seldepth 2 multipv 1 score cp 0 [...]
bestmove a1b1
saying that the game will be drawn by repetion. However the other possible
move for black, Kb2, avoids repetitions and wins. The patch fixes this behavior.
In particular it finds mate in 10 in the above position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 10604 W: 2390 L: 2247 D: 5967
http://tests.stockfishchess.org/tests/view/5cb373e00ebc5925cf0167bf
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19620 W: 3308 L: 3185 D: 13127
http://tests.stockfishchess.org/tests/view/5cb3822f0ebc5925cf016b2d
Bench is not changed since it does not test positions with history of moves.
Bench: 3184182
only the extension part of the shuffle patch is sufficient to
pass [0,3.5] bounds at VLTC as shown by two more tests.
http://tests.stockfishchess.org/tests/view/5cc168bc0ebc5925cf02bda8
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 120684 W: 17875 L: 17400 D: 85409
http://tests.stockfishchess.org/tests/view/5cc14d510ebc5925cf02bcb5
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 68415 W: 10250 L: 9905 D: 48260
this patch proposes to simplify back to this basic and easier to
understand version. In case there is a need to run a [-3, 1] VLTC on
this one, it can be done, but it is resource intensive, and not needed
IMO.
Bench: 3388643
Same idea as fisherman's knight protection.
passed STC
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 17133 W: 3952 L: 3701 D: 9480
http://tests.stockfishchess.org/tests/view/5cc3550b0ebc5925cf02dada
passed LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 37316 W: 6470 L: 6188 D: 24658
http://tests.stockfishchess.org/tests/view/5cc3721d0ebc5925cf02dc90
Looking at this 2 ideas being recent clean elo gainers I have a feeling that we can add also rook and queen protection bonuses or overall move this stuff in pieces loop in the same way as we do pieces attacking bonuses on their kingring... :) Thx fisherman for original idea.
Bench 3429173
Only called with Squares as argument, so remove unused variants.
As this is just syntax changes, only verified bench at high depth.
No functional change.
The "move" class variable is Movepick is removed (removes some abstraction) which saves a few assignment operations, and the effects of "filter" is limited to the current move (movePtr). The resulting code is a bit more verbose, but it is also more clear what is going on. This version is NOT tested, but is substantially similar to:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 29191 W: 6474 L: 6367 D: 16350
http://tests.stockfishchess.org/tests/view/5ca7aab50ebc5925cf006e50
This is a non-functional simplification.
We can remove the values in Pawns if we just use the piece arrays in Position. This reduces the size of a pawn entry. This simplification passed individually, and in concert with ps_passedcount100 (removes passedCount storage in pawns.).
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19957 W: 4529 L: 4404 D: 11024
http://tests.stockfishchess.org/tests/view/5cb3c2d00ebc5925cf016f0d
Combo STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17368 W: 3925 L: 3795 D: 9648
http://tests.stockfishchess.org/tests/view/5cb3d3510ebc5925cf01709a
This is a non-functional simplification.
It causes a serious regression hanging a simple fixed
depth search. Reproducible with:
position fen q1B5/1P1q4/8/8/8/6R1/8/1K1k4 w - - 0 1
go depth 13
The reason is a search tree explosion due to:
if (... && depth < 3 * ONE_PLY)
extension = ONE_PLY;
This is very dangerous code by itself because triggers **at the leafs**
and in the above position keeps extending endlessly. In normal games
time deadline makes the search to stop sooner or later, but in fixed
seacrch we just hang possibly for a very long time. This is not acceptable
because 'go depth 13' shall not be a surprise for any position.
This patch reverts commit 76f1807baa.
and fixes the issue https://github.com/official-stockfish/Stockfish/issues/2091
Bench: 3243738
Introduce a new search extension when pushing an advanced passed pawn is
also suggested by the first killer move. There have been previous tests
which have similar ideas, mostly about pawn pushes, but it seems to be
overkill to extend too many moves. My idea is to limit the extension to
when a move happens to be noteworthy in some other way as well, such as
in this case, when it is also a killer move.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 19027 W: 4326 L: 4067 D: 10634
http://tests.stockfishchess.org/tests/view/5cac2cde0ebc5925cf00c36d
LTC:
LLR: 2.94 (-2.94,2.94) [0.00,3.50]
Total: 93390 W: 15995 L: 15555 D: 61840
http://tests.stockfishchess.org/tests/view/5cac42270ebc5925cf00c4b9
For future tests, it looks like this will interact heavily with passed
pawn evaluation. It may be good to try more variants of some of the more
promising evaluations tests/tweaks.
Bench: 3666092
Instead of looping through kfrom,kto, rfrom, rto, we can use BetweenBB. This is less lines of code and it is more clear what castlingPath actually is. Personal benchmarks are all over the place. However, this code is only executed when loading a position, so performance doesn't seem that relevant.
No functional change.
The kingDanger term is intended to give a penalty which increases rapidly in the middlegame but less so in the endgame. To this end, the middlegame component is quadratic, and the endgame component is linear. However, this produces unintended consequences for relatively small values of kingDanger: the endgame penalty will exceed the middlegame penalty. This remains true up to kingDanger = 256 (a S(16, 16) penalty), so some of these inaccurate penalties are actually rather large.
In this patch, we increase the threshold for applying the kingDanger penalty to eliminate some of this unintended behavior. This was very nearly, but not quite, sufficient to pass on its own. The patch was finally successful by integrating a second kingDanger tweak by @Vizvezdenec, increasing the kingDanger constant term slightly and improving both STC and LTC performance.
Where do we go from here? I propose that in the future, any attempts to tune kingDanger coefficients should also consider tuning the kingDanger threshold. The evidence shows clearly that it should not be automatically taken to be zero.
Special thanks to @Vizvezdenec for the kingDanger constant tweak. Thanks also to all the approvers and CPU donors who made this possible!
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 141225 W: 31239 L: 30846 D: 79140
http://tests.stockfishchess.org/tests/view/5cabbdb20ebc5925cf00b86c
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 30708 W: 5296 L: 5043 D: 20369
http://tests.stockfishchess.org/tests/view/5cabff760ebc5925cf00c22d
Bench: 3445945
The sed command is a bit different in Mac OS X (why not!).
The ‘-i’ option required a parameter to tell what extension to add for the
backup file. To fix it, just add extension for backup file, for example ‘.bak’
Fix broken Trevis CI test
No functional change.
The current update only by main thread depends on the luck of
whether main thread sees any/many changes to the best move or not.
It then makes large, lumpy changes to the time to be
used (1x, 2x, 3x, etc) depending on that sample of 1.
Use the average across all threads to get a more reliable
number with a smoother distribution.
STC @ 5+0.05 th 4 :
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 51899 W: 11446 L: 11029 D: 29424
http://tests.stockfishchess.org/tests/view/5ca32ff20ebc5925cf0016fb
STC @ 5+0.05 th 8 :
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 13851 W: 2843 L: 2620 D: 8388
http://tests.stockfishchess.org/tests/view/5ca35ae00ebc5925cf001adb
LTC @ 20+0.2 th 8 :
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 48527 W: 7941 L: 7635 D: 32951
http://tests.stockfishchess.org/tests/view/5ca37cb70ebc5925cf001cec
Further work:
Similar changes might be possible for the fallingEval and timeReduction calculations (and elsewhere?), using either the min, average or max values across all threads.
Bench 3506898
Simplification which removes the pawns connected array.
Instead of storing the values in an array, the values are
calculated real-time. This is about 1.6% faster on my machines.
Performance:
master ave nps: 159,248,672
patch ave nps: 161,905,592
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20363 W: 4579 L: 4455 D: 11329
http://tests.stockfishchess.org/tests/view/5c9925ba0ebc5925cfff79a6
Non functional change.
Shuffle detection procedure :
Shuffling positions are detected if
the last 36 moves are reversible (rule50_count() > 36),
the position have been already in the TT,
there is a still a pawn on the board (to avoid special endings like KBN vs K).
The position is then judged as a draw.
An extension is realized if we already made 14 successive reversible moves in PV to accelerate the detection of the eventual draw.
To go further : we can still improve the idea. The length of the tests need a lot of ressources.
the limit of 36 is logic but must be checked again for special zugzwang positions,
this limit can be decreased in special positions,
the limit of 14 moves for extension has not been tuned.
STC
LLR: -2.94 (-2.94,2.94) [0.50,4.50]
Total: 32595 W: 7273 L: 7275 D: 18047 Elo +0.43
http://tests.stockfishchess.org/tests/view/5c90aa330ebc5925cfff1768
LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 51249 W: 8807 L: 8486 D: 33956 Elo +1.85
http://tests.stockfishchess.org/tests/view/5c90b2450ebc5925cfff1800
VLTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 137974 W: 20503 L: 19983 D: 97488 Elo +1.05
http://tests.stockfishchess.org/tests/view/5c9243a90ebc5925cfff2a93
Bench: 3548313
Adding a clamp function makes some of these range limitations a bit prettier and removes some #include's.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28117 W: 6300 L: 6191 D: 15626
http://tests.stockfishchess.org/tests/view/5c9aa1df0ebc5925cfff8fcc
Non functional change.
always use the implementation of gives_check in position, no need to
hand-inline part of the implementation in search.
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 57895 W: 12632 L: 12582 D: 32681
http://tests.stockfishchess.org/tests/view/5c9eaa4b0ebc5925cfffc9e3
No functional change.
This is a non-functional code style change.
If we add an accessor function for SquareBB we can consolidate all of the asserts. This is also a bit cleaner because all SquareBB accesses go through this method making future changes easier to manage.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 63406 W: 14084 L: 14045 D: 35277
http://tests.stockfishchess.org/tests/view/5c9ea6100ebc5925cfffc9af
No functional change.
This is covered by the line just before. If we would like to protect
against the piece value of e.g. a N == B, this could be done by an
assert, no need to do this at runtime.
No functional change.
This is a non-functional simplification/speedup.
The truth-table for popcount(support) >= popcount(lever) - 1 is:
------------------lever
------------------0-------1---------2
support--0------X-------X---------0
-----------1------X-------X---------X
-----------2------X-------X---------X
Thus, it is functionally equivalent to just do: support || !more_than_one(lever) which removes the expensive popcounts and the -1.
Result of 20 runs:
base (...h_master.exe) = 1451680 +/- 8202
test (./stockfish ) = 1454781 +/- 8604
diff = +3101 +/- 931
STC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 35424 W: 7768 L: 7674 D: 19982
Http://tests.stockfishchess.org/tests/view/5c970f170ebc5925cfff5e28
No functional change.
While looking at pruning using see_ge() (which is very valuable)
it became apparent that the !extension test is not adding any
value - simplify it away.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 56843 W: 12621 L: 12569 D: 31653
http://tests.stockfishchess.org/tests/view/5c8588cb0ebc5925cffe77f4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 78622 W: 13223 L: 13195 D: 52204
http://tests.stockfishchess.org/tests/view/5c8611cc0ebc5925cffe7f86
Further work could be to optimize the remaining see_ge() test. The idea of less pruning at higher depths is valuable, but perhaps the test (-PawnValueEg * depth) can be improved.
Bench: 3188688
On OS X threads other than the main thread are created with a reduced stack
size of 512KB by default, this is dangerously low for deep searches, so
adjust it to TH_STACK_SIZE. The implementation calls pthread_create() with
proper stack size parameter.
Verified for no regression at STC enabling the patch on all platforms where
pthread is supported.
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 50873 W: 9768 L: 9700 D: 31405
No functional change.
This is a non-functional simplification / code-style change.
This popcount16 method does nothing but initialize the PopCnt16 arrays.
This can be done in a single bitset line, which is less lines and more clear. Performance for this code is moot.
No functional change.
This is a functional simplification that removes the FutilityMoveCounts array with a simple equation using only ints.
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14175 W: 3123 L: 2987 D: 8065
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9900 W: 1735 L: 1597 D: 6568
Bench: 3380343
This is a non-functional patch which shrinks the reductions array.
This saves about 8Kb of memory.
The only slow part of master's reductions array is the calculation
of the log values, so using a separate array for those values and
calculating the rest real-time appears to be just as fast as master.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 63245 W: 13906 L: 13866 D: 35473
http://tests.stockfishchess.org/tests/view/5c7b571f0ebc5925cffdc104
No funcional change.
This removes the skipQuiets variable, as was done in an earlier round by
@protonspring, but fixes an oversight which led to wrong mate
announcements. Quiets can only be pruned when there is no mate score, so
set moveCountPruning at the right spot.
tested as a fix at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66321 W: 14690 L: 14657 D: 36974
http://tests.stockfishchess.org/tests/view/5c74f3170ebc5925cffd4b3c
and as the full patch at LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25903 W: 4341 L: 4228 D: 17334
http://tests.stockfishchess.org/tests/view/5c7540030ebc5925cffd506f
Bench: 3292342
This is a somewhat different patch. It fixes blindspots for
two knights vs pawn endgame.
With local testing starting from random KNNvKP positions where the
pawn has not advanced beyond the 4th rank (thanks @protonspring !)
at 15+0.15 (4 cores), this went +105=868-27 against master. All except
two losses were won in reverse.
The heuristic is simple but effective - the strategy in these endgames
is to push the opposing king to the corner, then move the knight that's
blocking the pawn in for the checkmate while the pawn is free to move
and prevents stalemate. This patch gives SF the little boost it needs
to search the relevant king-cornering mating lines.
See the discussion in pull request 1939 for some more good results for
this test in independant tests:
https://github.com/official-stockfish/Stockfish/pull/1939
Bench: 3310239
This is a functional simplification of the Outposts array
moving it to a single value. This is a duplicate PR because
I couldn't figure out how to fix the original one.
The idea is from @31m059 with formatting recommendations by @snicolet.
See #1940 for additional information.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23933 W: 5279 L: 5162 D: 13492
http://tests.stockfishchess.org/tests/view/5c3575800ebc596a450c5ecb
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41718 W: 6919 L: 6831 D: 27968
http://tests.stockfishchess.org/tests/view/5c358c440ebc596a450c6117
bench 3783543
This is non-functional. These 5 arrays are simple enough to calculate real-time and maintaining an array for them does not help. Decreases the memory footprint.
This seems a tiny bit slower on my machine, but passed STC well enough. Could someone verify speed?
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 44745 W: 9780 L: 9704 D: 25261
http://tests.stockfishchess.org/tests/view/5c47aa2d0ebc5902bca13fc4
The slowdown is minimal even in 32 bit case (thanks to @mstembera for testing):
Compiled using make build ARCH=x86-32 CXX=i686-w64-mingw32-c++ and benched
This patch only:
```
Results for 40 tests for each version:
Base Test Diff
Mean 1455204 1450033 5171
StDev 49452 34533 59621
p-value: 0.465
speedup: -0.004
```
No functional change.
A simple idea, but it makes sense: in current master the search is extended
for checks that are considered somewhat safe, and for for this we use the
static exchange evaluation which only considers the `to_sq` of a move.
This is not reliable for discovered checks, where another piece is giving
the check and is arguably a more dangerous type of check. Thus, if the check
is a discovered check, the result of SEE is not relevant and can be ignored.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 29370 W: 6583 L: 6274 D: 16513
http://tests.stockfishchess.org/tests/view/5c5062950ebc593af5d4d9b5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 227341 W: 37972 L: 37165 D: 152204
http://tests.stockfishchess.org/tests/view/5c5094fb0ebc593af5d4dc2c
Bench: 3611854
There was a simplification attempt last week for the tropism
term in king danger, which passed STC but failed LTC. This
was an indirect sign that maybe the tropism factor was sightly
untuned in current master, so we tried to change it from 1/4
to 5/16.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 28098 W: 6264 L: 5990 D: 15844
http://tests.stockfishchess.org/tests/view/5c518db60ebc593af5d4e306
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 103709 W: 17387 L: 16923 D: 69399
http://tests.stockfishchess.org/tests/view/5c52a5510ebc592fc7baea8b
Bench: 4016000
Remove overlapping safe checks from kingdanger:
- rook and queen checks from the same square: rook check is preferred
- bishop and queen checks form the same square: queen check is preferred
Increase bishop and rook check values as a compensation.
STC
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 27480 W: 6111 L: 5813 D: 15556
http://tests.stockfishchess.org/tests/view/5c521d050ebc593af5d4e66a
LTC
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 78500 W: 13145 L: 12752 D: 52603
http://tests.stockfishchess.org/tests/view/5c52b9460ebc592fc7baecc5
Closes https://github.com/official-stockfish/Stockfish/pull/1983
------------------------------------------
I have quite a few ideas of how to improve this patch.
- actually rethinking it now it will maybe be useful to discount
queen/bishop checks if there is only one square that they can
give check from and it's "occupied" by more valuable check. Right
now count of this squares does not really matter.
- maybe some small extra bonus can be given for overlapping checks.
- some ideas about using popcount() on safechecks can be retried.
- tune this safecheck values since they were more or less randomly handcrafted in this patch.
Bench: 3216489
This patch removes line 875 of search.cpp, which was updating pvHit after IID.
Bench testing at depth 22 shows that line 875 of search.cpp never changes the
value of pvHit at NonPV nodes, while at PV nodes it often changes the value
from true to false (and never the reverse). This is because the definition of
pvHit at line 642 is :
```
pvHit = (ttHit && tte->pv_hit()) || (PvNode && depth > 4 * ONE_PLY);
```
while the assignment after IID omits the ` (PvNode && depth > 4 * ONE_PLY) `
condition. As such, unlike the other two post-IID tte reads, this line of code
does not make SF's state more consistent, but rather introduces an inconsistency
in the definition of pvHit. Indeed, changing line 875 read
```
pvHit = (ttHit && tte->pv_hit()) || (PvNode && depth > 4 * ONE_PLY);
```
to match line 642 is functionally equivalent to removing the line entirely, as
this patch does.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 62756 W: 13787 L: 13746 D: 35223
http://tests.stockfishchess.org/tests/view/5c446c850ebc5902bb5d4b75
LTC
LLR: 3.19 (-2.94,2.94) [-3.00,1.00]
Total: 61900 W: 10179 L: 10111 D: 41610
http://tests.stockfishchess.org/tests/view/5c45bf610ebc5902bb5d5d62
Bench: 3796134
This changes 2 parts with regards to static exchange evaluation.
Currently, we do not allow pinned pieces to recapture if *all* opponent
pinners are still in their starting squares. This changes that to having
a less strict requirement, checking if *any* pinners are still in their
starting square. This makes our SEE give more respect to the pinning
side with regards to exchanges, which makes sense because it helps our
search explore more tactical options.
Furthermore, we change the logic for saving pinners into our state
variable when computing slider_blockers. We will include double pinners,
where two sliders may be looking at the same blocker, a similar concept
to our mobility calculation for sliders in our evaluation section.
Interestingly, I think SEE is the only place where the pinners bitboard
is actually used, so as far as I know there are no other side effects
to this change.
An example and some insights:
White Bf2, Kg1
Black Qe3, Bc5
The move Qg3 will be given the correct value of 0. (Previously < 0)
The move Qd4 will be incorrectly given a value of 0. (Previously < 0)
It seems the tradeoff in search is worth it. Qd4 will likely be pruned
soon by something like probcut anyway, while Qg3 could help us spot
tactics at an earlier depth.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 62162 W: 13879 L: 13408 D: 34875
http://tests.stockfishchess.org/tests/view/5c4ba1a70ebc593af5d49c55
LTC: (Thanks to @alayant)
LLR: 3.40 (-2.94,2.94) [0.00,3.50]
Total: 140285 W: 23416 L: 22825 D: 94044
http://tests.stockfishchess.org/tests/view/5c4bcfba0ebc593af5d49ea8
Bench: 3937213
This patch saves (4-1) * 64 * 64 = 12KiB of cache.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 176120 W: 38944 L: 38087 D: 99089
http://tests.stockfishchess.org/tests/view/5c4c9f840ebc593af5d4a7ce
LTC
As a pure speed up, I've been informed it should not require LTC.
No functional change
stopOnPonderhit is used to stop search quickly on a ponderhit. It is set by mainThread as part of its time management. However, master employs it as a signal between mainThread and the UCI thread. This is not necessary, it is sufficient for the UCI thread to signal that pondering finished, and mainThread should do its usual time-keeping job, and in this case stop immediately.
This patch implements this, removing stopOnPonderHit as an atomic variable from the ThreadPool,
and moving it as a normal variable to mainThread, reducing its scope. In MainThread::check_time() the search is stopped immediately if ponder switches to false, and the variable stopOnPonderHit is set.
Furthermore, ponder has been moved to mainThread, as the variable is only used to exchange signals between the UCI thread and mainThread.
The version has been tested locally (as fishtest doesn't support ponder):
Score of ponderSimp vs master: 2616 - 2528 - 8630 [0.503] 13774
Elo difference: 2.22 +/- 3.54
which indicates no regression.
No functional change.
Small changes in initiative(). For Pawn PSQT, endgame values for d6-e6 and d7-e7 are now symmetric. The MG value of d2 is now smaller than e2 (d2=13, e2=21 now compared to d2=19, e2=16 before). The MG values of h5-h6-h7 also increased so this might encourage stockfish for more h-pawn pushes.
STC
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 81141 W: 17933 L: 17777 D: 45431
http://tests.stockfishchess.org/tests/view/5c4017350ebc5902bb5cf237
LTC
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 83078 W: 13883 L: 13466 D: 55729
http://tests.stockfishchess.org/tests/view/5c40763f0ebc5902bb5cff09
Bench: 3266398
This is a non-functional simplification that removes the AdjacentFiles array.
This array is simple enough to calculate that the pre-calculated array provides
no benefit. Reduces the memory footprint.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74839 W: 16390 L: 16373 D: 42076
http://tests.stockfishchess.org/tests/view/5c3d75920ebc596a450cfb67
No functionnal change
If we define dcCandidates with & pawnsNotOn7,
we don't have to & it both times.
This seems more clear to me as well.
Tested for no regression.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 44042 W: 9663 L: 9585 D: 24794
http://tests.stockfishchess.org/tests/view/5c21d9120ebc5902ba12e84d
No functional change.
The new form is likely to trigger a bit more at LTC. Given that LTC
appears to be an improvement, I think that is fine.
The change is not very invasive: it does the same as before, use
potentially less time for moves that are very stable. Most of the
time, the full bonus was given if the bonus was given, so the gradual
part {3, 4, 5} didn't matter much. Whereas previously 'stable' was
expressed as the last 80% of iterations are the same, now I use a
fixed depth (10 iterations). For TCEC style TC, it will presumably
imply some more moves that are played quicker (and thus more time
on the clock when it potentially matters). Note that 10 iterations
of stability means we've been proposing that move for 99.9% of search
time.
passed STC
http://tests.stockfishchess.org/tests/view/5c30d2290ebc596a450c055b
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70921 W: 15403 L: 15378 D: 40140
passed LTC
http://tests.stockfishchess.org/tests/view/5c31ae240ebc596a450c1881
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17422 W: 2968 L: 2842 D: 11612
No functional change.
Introducing new concept, saving principal lines into the transposition table
to generate a "critical search tree" which we can reuse later for intelligent
pruning/extension decisions.
For instance in this patch we just reduce reduction for these lines. But a lot
of other ideas are possible.
To go further : tune some parameters, how to add or remove lines from the
critical search tree, how to use these lines in search choices, etc.
STC :
LLR: 2.94 (-2.94,2.94) [0.50,4.50]
Total: 59761 W: 13321 L: 12863 D: 33577 +2.23 ELO
http://tests.stockfishchess.org/tests/view/5c34da5d0ebc596a450c53d3
LTC :
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 26826 W: 4439 L: 4191 D: 18196 +2.9 ELO
http://tests.stockfishchess.org/tests/view/5c35ceb00ebc596a450c65b2
Special thanks to Miguel Lahoz for his help in transposition table in/out.
Bench: 3399866
This was inspired after reading about
[Multi-Cut](https://www.chessprogramming.org/Multi-Cut).
We now do non-singular cut node pruning. The idea is to prune when we
have a "backup plan" in case our expected fail high node does not fail
high on the ttMove.
For singular extensions, we do a search on all other moves but the
ttMove. If this fails high on our original beta, this means that both
the ttMove, as well as at least one other move was proven to fail high
on a lower depth search. We then assume that one of these moves will
work on a higher depth and prune.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 72952 W: 16104 L: 15583 D: 41265
http://tests.stockfishchess.org/tests/view/5c3119640ebc596a450c0be5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 27103 W: 4564 L: 4314 D: 18225
http://tests.stockfishchess.org/tests/view/5c3184c00ebc596a450c1662
Bench: 3145487
This addresses partially issue #1911 in that it documents in our
Readme the command that users can use to verifying the md5sum of
their downloaded tablebase files.
Additionally, a quick check of the file size (the size of each
tablebase file modulo 64 is 16 as pointed out by @syzygy1) has been
implemented at launch time in Stockfish.
Closes https://github.com/official-stockfish/Stockfish/pull/1927
and https://github.com/official-stockfish/Stockfish/issues/1911
No functional change.
Delay legality check of castling moves at search time,
just before making the move, as is the standard with all
the other move types.
This should avoid an useless and not trivial legality check
when the castling is then not tried later. For instance due
to a previous cut-off.
The patch is also a big simplification and allows to entirely
remove generate_castling()
Bench changes due to a different move sequence out of MovePicker.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 45073 W: 9918 L: 9843 D: 25312
http://tests.stockfishchess.org/tests/view/5c2f176f0ebc596a450bdfb3
LTC:
LLR: 3.15 (-2.94,2.94) [-3.00,1.00]
Total: 10156 W: 1707 L: 1560 D: 6889
http://tests.stockfishchess.org/tests/view/5c2e7dfd0ebc596a450bcdf4
Verified with perft both in standard and Chess960 cases.
Closes https://github.com/official-stockfish/Stockfish/pull/1929
Bench: 3559104
A single popcount in evaluate.cpp replaces all openFiles stuff in pawns. It doesn't seem to affect performance at all.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28103 W: 6134 L: 6025 D: 15944
http://tests.stockfishchess.org/tests/view/5b7d70a20ebc5902bdbb1999
No functional change.
This custom predicate filter creates an unnecessary abstraction layer, but doesn't make the code any more readable. The code is clear enough without it.
No functional change.
The extra condition is used as a shortcut to skip the following 3 assignments:
```C++
Bitboard b1 = shift<UpRight>(pawnsOn7) & enemies;
Bitboard b2 = shift<UpLeft >(pawnsOn7) & enemies;
Bitboard b3 = shift<Up >(pawnsOn7) & emptySquares;
```
In case of EVASION with no target on 8th rank (the common case), we end up performing the 3 statements for nothing because b1 = b2 = b3 = 0.
But this is just a small micro-optimization and the condition is quite confusing, so just remove it and prefer a readable code instead.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 78020 W: 16978 L: 16967 D: 44075
http://tests.stockfishchess.org/tests/view/5c27b4fe0ebc5902ba135bb0
No functional change.
I tried to improve the Readme because many people in my local
chess club do not understand some of the UCO options properly.
Starting point of this was Cfish's Readme by Ronald de Man,
some internet resources and the Stockfish code itself.
Closes https://github.com/official-stockfish/Stockfish/pull/1898
Initial commit by user @erbsenzaehler, with help from users
Adrian Petrescu, @alayan-stk-2 and Elvin Liu.
No functional change
Co-Authored-By: Alayan-stk-2 <alayan-stk-2@users.noreply.github.com>
Co-Authored-By: Adrian Petrescu <apetresc@gmail.com>
Co-Authored-By: Elvin Liu <solarlight2@users.noreply.github.com>
Recent tests by @xoto10, @Vizvezdenec, and myself seemed to hint that Elo could
be gained by expanding the number of cases where king safety is applied. Several
users (@Spliffjiffer, @Vizvezdenec) have anticipated benefits specifically in
evaluation of tactics. It appears that we actually do not need to restrict the
cases in which we initialize and evaluate king safety at all: initializing and
evaluating it in every position appears roughly Elo-neutral at STC and possibly
a substantial Elo gain at LTC.
Any explanation for this scaling is, at this point, conjecture. Assuming it is
not due to chance, my hypothesis is that initialization of king safety in all
positions is a mild slowdown, offset by an Elo gain of evaluating king safety
in all positions. At STC this produces Elo gains and losses that offset each
other, while at longer time control the slowdown is much less important, leaving
only the Elo gain. It probably helps SF to explore king attacks much earlier in
search with high numbers of enemy pieces concentrating but not essentially attacking
king ring.
Thanks to @xoto10 and @Vizvezdenec for helping run my LTC!
Closes https://github.com/official-stockfish/Stockfish/pull/1906
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35432 W: 7815 L: 7721 D: 19896
http://tests.stockfishchess.org/tests/view/5c24779d0ebc5902ba131b26
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12887 W: 2217 L: 2084 D: 8586
http://tests.stockfishchess.org/tests/view/5c25049a0ebc5902ba132586
Bench: 3163951
------------------
How to continue from there?
* Next step will be to tune all the king danger terms once more after that :-)
When iterating through 'SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX' structure, do not use structure member beyond known size.
API is guaranteed to provide us at lease one element upon successful, and no element in the structure can have a zero size.
No functional change.
This pull request fixes a rare crashing bug on Windows inside our NUMA code, first
reported by Dann Corbit in the following forum thread (thanks!):
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/gA6aoMEuOwg
The fix is to not use structure member beyond known size when iterating through
'SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX' structure. We note that the Microsoft
API is guaranteed to provide us at least one element upon successful, and no
element in the structure can have a zero size.
No functional change.
The `&& (ss-1)->killers[0] ` conditions are there seemingly to protect
accessing ss-5.
This is unneeded and not so intuitive (as the killer is checked for equality
with currentMove, and that one is non-zero once we're high enough in the stack,
this protects access to ss-5). We can just extend the stack from ss-4 to ss-5,
so we can call update_continuation_histories(ss-1, ..) always in search.
This goes a bit further than #1881 and addresses a comment in #1878.
passed STC:
http://tests.stockfishchess.org/tests/view/5c1aa8d50ebc5902ba127ad0
LLR: 3.12 (-2.94,2.94) [-3.00,1.00]
Total: 53515 W: 11734 L: 11666 D: 30115
passed LTC:
http://tests.stockfishchess.org/tests/view/5c1b272c0ebc5902ba12858d
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 140176 W: 23123 L: 23192 D: 93861
Bench: 3451321
Even when playing without endgame table bases, this particular endgame should
be a win 100% of the time when Stockfish is given a KRBK position, assuming
there are enough moves remaining in the FEN to finish the game without hitting
the 50 move rule.
PROBLEM: The issue with master here is that the PushClose difference per square
is 20, however, the difference in squares for the PushToCorners array is usually
less. Thus, the engine prefers to move the kings closer together rather than pushing
the weak king to the correct corner.
What happens is if the weak king is in a safe corner, SF still prefers pushing the
kings together. Occasionally, the strong king traps the weak king in the safe corner.
It takes a while for SF to figure it out, but often draws the game by the 50 move rule
(on shorter time controls).
This patch increases the PushToCorners values to correct this problem. We also added
an assert to catch any overflow problem if anybody would want to increase the array
values again in the future.
It was tested in a couple of matches starting with random KRBK positions and showed
increased winning rates, see https://github.com/official-stockfish/Stockfish/pull/1877
No functional change
* Update our continuous integration machinery
Ubuntu 16.04 can now be used with travis. Updating all the other stuff
when there.
Invoking the lld linker seems to save 5 minutes with clang on linux.
No functional change.
* fix
* Use a bit less code to calculate hashfull(). Change post increment to preincrement as is more standard
in the rest of stockfish. Scale result so we report 100% instead of 99.9% when completely full.
No functional change.
Although this is a compile-time constant, we stick the castlingSide into a CastlingRight, then pull it out again. This seems unecessarily complex.
No functional change.
We do not need to change the winnerKSq variable, so we can simplify
a little bit the logic of the code by changing only the loserKSq
variable when it is necessary. Also consolidate and clarify comments.
See the pull request thread for a proof that the code is correct:
https://github.com/official-stockfish/Stockfish/pull/1854
No functional change
~stronglyProtected is quite similar to ~attackedBy[Them][PAWN] & ~attackedBy2[Them],
the only difference appears to be that the former includes squares attacked twice
by both sides. The resulting logic is simpler, and the change appears to be at least
Elo-neutral at both STC and LTC.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35924 W: 7978 L: 7885 D: 20061
http://tests.stockfishchess.org/tests/view/5c14a5c00ebc5902ba11ed72
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 37078 W: 6125 L: 6030 D: 24923
http://tests.stockfishchess.org/tests/view/5c14ae880ebc5902ba11eed8
Bench: 3646542
this patch fixes a rare but reproducible segfault observed playing a
multi-threaded match, it is discussed somewhat in fishcooking.
From the core file, it could be observed that the issue was in qsearch, namely:
````
ss->pv[0] = MOVE_NONE;
````
and the backtrace shows the it arrives there via razoring, called from the rootNode:
````
(gdb) bt
alpha=-19, beta=682, depth=DEPTH_ZERO) at search.cpp:1247
````
Indeed, ss->pv can indeed by a nullptr at the rootNode. However, why is the
segfault so rare ?
The reason is that the condition that guards razoring:
````
(depth < 2 * ONE_PLY && eval <= alpha - RazorMargin)
````
is almost never true, since at the root alpha for depth < 5 is -VALUE_INFINITE.
Nevertheless with the new failHigh scheme, this is not guaranteed, and rootDepth > 5,
can still result in a depth < 2 search at the rootNode. If now another thread,
via the hash, writes a new low eval to the rootPos qsearch can be entered.
Rare but not unseen... I assume that some of the crashes in fishtest recently
might be due to this.
Closes https://github.com/official-stockfish/Stockfish/pull/1860
No functional change
As noticed in the forum, a crash in extract_ponder_from_tt could result
if hash size is set before the ponder move is printed. While it is arguably
a GUI issue (but it got me on the cli), it is easy to avoid this issue.
Closes https://github.com/official-stockfish/Stockfish/pull/1856
No functional change.
On November 30th, @xoto10 experimented with removing this threshold,
but the simplification barely failed LTC. I was inspired to try various
[0, 4] tweaks to increase its value, which would narrow the effects of
this threshold without removing it entirely. Various values repeatedly
led to Elo gains at both STC and LTC, most of which were insufficient
to pass.
After a couple of weeks, I tried again to find an Elo-gaining tweak
but noticed that I could raise the threshold higher and higher without
regression. I decided to try removing it entirely--forgetting that
@xoto10 had already attempted this. However, this now performs much
better at both STC and LTC, producing a STC Elo gain and also potentially
a smaller LTC one.
The reason appears to be a recent change in master (e8ffca3) near
this code, which interacts with this patch. This simplification
governs the conditions under which that patch's effects are applied.
Something non-obvious about that change has significantly improved
the performance of this simplification.
I recognize and thank @xoto10, who originally had this idea. Since
I ran several LTCs recently (to determine whether to open this PR,
or one for a related [0, 4]), I would also like to acknowledge the
other developers and CPU donors for their patience. Thank you all!
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13445 W: 3000 L: 2862 D: 7583
http://tests.stockfishchess.org/tests/view/5c11f01b0ebc5902ba11a6b8
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33868 W: 5663 L: 5563 D: 22642
http://tests.stockfishchess.org/tests/view/5c11ffe90ebc5902ba11a8a9
Closes https://github.com/official-stockfish/Stockfish/pull/1870
Bench: 3343286
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13323 W: 3015 L: 2818 D: 7490
http://tests.stockfishchess.org/tests/view/5c00a2520ebc5902bcedd41b
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 52294 W: 9093 L: 8756 D:34445
http://tests.stockfishchess.org/tests/view/5c00b2c40ebc5902bcedd596
Some obvious followups to this are to further tune this PSQT, or
try 8x8 for other pieces. As of now I don't plan on trying this
for other pieces as I think the majority of the ELO it brings is
for pawns and kings.
Looking at the new values, the differences between kingside and
queenside are quite significant. I am very hopeful that this a
llows SF to understand and plan pawn structures even better than
it already does. Cheers!
Closes https://github.com/official-stockfish/Stockfish/pull/1839
Bench: 3569243
I've gone through the RENAME/REFORMATTING thread and changed everything I could find, plus a few more. With this, let's close the previous issue and open another.
No functional change.
This reverts commit 33d9548218 ,
which crashed in DEBUG mode because of the following assert in position.h
````
Assertion failed: (is_ok(m)), function capture, file ./position.h, line 369.
````
No functional change
Remove the F[] array which I find unhelpful and rename `improvingFactor` to
`fallingEval` since larger values indicate a falling eval and more time use.
I realise a test was not strictly necessary, but I ran STC [-3,1] just to
check there are no foolish errors before creating the pull request:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 35804 W: 7753 L: 7659 D: 20392
http://tests.stockfishchess.org/tests/view/5bef3a0c0ebc595e0ae39c19
It was then suggested to clean the constants around `fallingEval`
to make it more clear this is a factor around ~1 that adjusts time
up or downwards depending on some conditions. We then ran a double
test with this simplification suggestion:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 68435 W: 14936 L: 14906 D: 38593
http://tests.stockfishchess.org/tests/view/5c02c56b0ebc5902bcee0184
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37258 W: 6324 L: 6230 D: 24704
http://tests.stockfishchess.org/tests/view/5c030a520ebc5902bcee0a32
No functional change
Exclude doubly protected by pawns squares when calculating attackers on
king ring. Idea of this patch is not to count attackers if they attack
only squares that are protected by two pawns.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 70040 W: 15476 L: 15002 D: 39562
http://tests.stockfishchess.org/tests/view/5c0354860ebc5902bcee1106
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 16530 W: 2795 L: 2607 D: 11128
http://tests.stockfishchess.org/tests/view/5c0385080ebc5902bcee14b5
This is third king safety patch in recent times so we probably need
retuning of king safety parameters.
Bench: 3057978
Official release version of Stockfish 10.
This is also the 10th anniversary version of the Stockfish project, which
started exactly ten years ago! I wish to extend a huge thank you to
all contributors and authors in our amazing community :-)
Bench: 3939338
The patch was tested for correctness by running bench with and
without the change against current master, and the tablebase hit
numbers were found to be identical in both cases. See the pull
request comments for details:
https://github.com/official-stockfish/Stockfish/pull/1826
No functional change.
On November 16th, before the removal of the depth condition, I tried
revising castling extensions to only handle castling moves, rather than
moves that change castling rights generally. It appeared to be a slight
Elo gain at STC but insufficient to pass [0, 4] (+0.5 Elo), but what I
overlooked was that it made pos.can_castle(us) irrelevant and should
have been a simplification. Recent discussion with @Chess13234 and
Michael Chaly (@Vizvezdenec) inspired me to take a second look, and
the simplification continues to pass when rebased on the current master.
This replaces two conditions with one, because type_of(move) == CASTLING
implies pos.can_castle(Us), allowing us to remove the latter condition.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 110948 W: 24209 L: 24263 D: 62476
http://tests.stockfishchess.org/tests/view/5bf8f65c0ebc5902bced3a63
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 88283 W: 14681 L: 14668 D: 58934
http://tests.stockfishchess.org/tests/view/5bf994a60ebc5902bced4349
Bench: 3939338
When running on a cloud VM (n1-highcpu-96) with several NVMe SSDs and
some non-SSDs for tablebases, I noticed that the average SSD request size was
more than 256 kB. This doesn't make a lot of sense for Syzygy tablebases,
which have a block size of 32 bytes and very low locality.
Seemingly, the tablebase access patterns during probing make the OS,
at least Linux, think that readahead is advantageous; normally, it
gives up doing readahead if there are too many misses, but it doesn't,
perhaps due to the fairly high overall hit rates. (It seems the kernel cannot
distinguish between reading a block that was paged in because the userspace
wanted it explicitly, and one that was read as part of readahead.)
Setting MADV_RANDOM effectively turns off readahead, which causes
the request size to drop to 4 kB. In the aforemented cloud VM test,
this roughly tripled the amount of I/O requests that were able to go
through, while reducing the total traffic from 2.8 GB/sec to 56 MB/sec
(moving the bottleneck to the non-SSDs; it seems the SSDs could have
sustained many more requests).
Closes https://github.com/official-stockfish/Stockfish/pull/1829
No functional change.
Don't do an extra TT update in case of a fail-high,
but simply break off the moves loop and let the TT update
at the end of qsearch do this job.
Same workflow/logic as in our main search function now.
Tested for no regression to be on the safe side.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30237 W: 6665 L: 6560 D: 17012
http://tests.stockfishchess.org/tests/view/5bf928e80ebc5902bced3f3a
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51067 W: 8625 L: 8553 D: 33889
http://tests.stockfishchess.org/tests/view/5bf937180ebc5902bced3fdc
No functional change.
Tropism in kingdanger was simplified away in this pull request #1821.
This patch reintroduces tropism in kingdanger with using quadratic scaling.
Passed STC http://tests.stockfishchess.org/tests/view/5bf7c1b10ebc5902bced1f8f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 52803 W: 11835 L: 11442 D: 29526
Passed LTC http://tests.stockfishchess.org/tests/view/5bf816e90ebc5902bced24f1
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17204 W: 2988 L: 2795 D: 11421
How do we continue from there?
I've recently tried to introduce tropism difference term in kingdanger which
passed STC 6 times but failed LTC all the time. Maybe using quadratic scaling
for it will also be helpful.
Bench 4041387
A recent LTC tuning session by @candirufish showed this term decreasing significantly. It appears that it can be removed altogether without significant Elo loss.
I also thank @GuardianRM, whose attempt to remove tropism from king danger inspired this one.
After this PR is merged, my next step will be to attempt to tune the coefficients of this new, simplified kingDanger calculation.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12518 W: 2795 L: 2656 D: 7067
http://tests.stockfishchess.org/tests/view/5befadda0ebc595e0ae3a289
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 164771 W: 26463 L: 26566 D: 111742
http://tests.stockfishchess.org/tests/view/5befcca70ebc595e0ae3a343
LTC 2, rebased on Stockfish 10 beta:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 75226 W: 12563 L: 12529 D: 50134
http://tests.stockfishchess.org/tests/view/5bf2e8910ebc5902bcecb919
Bench: 3412071
Because of aggressive time management and optimistic assumptions
about move overhead, it's still very easy to get Stockfish to forfeit
on time when we hit an endgame and have Syzygy EGTB on a spinning
drive. The latency from serving a few thousand EGTB probes (~10ms each),
of which there can currently be up to 4000 outstanding before a time
check, will easily overwhelm the default Move Overhead of 30ms.
This problem was first raised by Gian-Carlo Pascutto and some solutions
and improvements were discussed in the following pull requests:
https://github.com/official-stockfish/Stockfish/pull/1471https://github.com/official-stockfish/Stockfish/pull/1623https://github.com/official-stockfish/Stockfish/pull/1783
This patch is a minimal change proposed by Marco Costalba to lower
the impact of the bug. We now force a check of the clock right after
each tablebase read.
No functional change.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 51883 W: 11297 L: 10915 D: 29671
http://tests.stockfishchess.org/tests/view/5bf1e2ee0ebc595e0ae3cacd
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 15859 W: 2752 L: 2565 D: 10542
http://tests.stockfishchess.org/tests/view/5bf337980ebc5902bcecbf62
Notes:
(1) The bonus value has not been carefully tested, so it may be possible
to find slightly better values.
(2) Plan is to now try adding similar restriction for pawns. I wanted to
include that as part of this pull request, but I was advised to do it as
two separate pull requests. STC is currently running here, but may not add
enough value to pass green.
Bench: 3679086
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 84697 W: 18173 L: 18009 D: 48515
http://tests.stockfishchess.org/tests/view/5bea366f0ebc595e0ae34793
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 157625 W: 25533 L: 24893 D: 107199
http://tests.stockfishchess.org/tests/view/5be8b69e0ebc595e0ae33024
Personally, I feel like SF has been tuned to death recently and that we
need to step away from existing-parameter tunes for a bit and focus more
on new ideas. I don't really think there's much more ELO in these tunes
(for now). For me at least, this was the last existing-parameter tune I'll
be running for quite a while. Cheers!
Bench: 3572567
It does not appear to be not necessary or advantageous to
conditionally initialize kingRing[Us] or kingAttackersCount[Them],
so the 'else' can be removed.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22873 W: 4923 L: 4804 D: 13146
http://tests.stockfishchess.org/tests/view/5be9a8270ebc595e0ae33c7e
No functional change
To top the rating lists and get more interesting middle play, it
is a good habit to set the default contempt to the highest value
that does not regress against contempt=0. We recently decreased
PawnValueEg it is logical that to raise a little bit the default
higher contempt because of the following internal dependency in
line 334 of search.cpp :
````
int ct = int(Options["Contempt"]) * PawnValueEg / 100; // From centipawns
````
STC: contempt=24 passed non-regression vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6d7f80ebc595e0ae21e14
LTC: contempt=24 passed non-regression LTC vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6e0980ebc595e0ae21f07
On 2018-11-01, we also tested the effects of contempt=21 and contempt=24
against Stockfish 9, and the net result was neutral:
Contempt 21
ELO: 51.68 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 9487 L: 3581 D: 26932
http://tests.stockfishchess.org/tests/view/5bdb1a140ebc595e0ae2620a
Contempt 24
ELO: 52.21 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 9759 L: 3793 D: 26448
http://tests.stockfishchess.org/tests/view/5bdb1b680ebc595e0ae2620d
Bench: 3459874
This patch will make possible to free mapped TB files with "ucinewgame" command.
We wrote this patch specifically to address a problem that arose while
running Stockfish with 7-piece tablebases as a kibitzer at TCEC for
extended periods of time across multiple games. It was noted that after
some time, the NPS of the kibitzing Stockfish (which is usually 3x faster
than the Stockfish actually competing) would drop precipitously, eventually
falling to preposterously low numbers until restarted.
Their eval bot basically inputs FEN, go infinite, stop and loop, it probably
didn't do ucinewgame either. As time goes it gradually slowed down and OS
starts to use swap, this is not reasonable since the engine only uses 16GB
hash and the machine has 1TB physical RAM and does nothing else.
Author : noobpwnftw
Closes https://github.com/official-stockfish/Stockfish/pull/1790
No functional change.
We don't need to pass the king square as an explicit parameter to the functions
king_safety() and do_king_safety() since we already pass in the position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 69686 W: 14894 L: 14866 D: 39926
http://tests.stockfishchess.org/tests/view/5be84ac20ebc595e0ae3283c
No functional change.
We calculate tropism as a sum of two factors. The first is the number of squares in our kingFlank and Camp that are attacked by the enemy; the second is number of these squares that are attacked twice. Prior to this commit, we excluded squares we defended with pawns from this second value, but this appears unnecessary. (Doubly-attacked squares near our king are still dangerous.) The removal of this exclusion is a possible small Elo gain at STC (estimated +1.59) and almost exactly neutral at LTC (estimated +0.04).
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20942 W: 4550 L: 4427 D: 11965
http://tests.stockfishchess.org/tests/view/5be4e0ae0ebc595e0ae308a0
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 56941 W: 9172 L: 9108 D: 38661
http://tests.stockfishchess.org/tests/view/5be4ec340ebc595e0ae30938
Bench: 3813986
The recently committed Fail-High patch (081af90805)
had a number of changes beyond adjusting the depth of search on fail high, with
some undesirable side effects.
1) Decreasing depth on PV output, confusing GUIs and players alike as described in
issue #1787. The depth printed is anyway a convention, let's consider adjustedDepth
an implementation detail, and continue to print rootDepth. Depth, nodes, time and
move quality all increase as we compute more. (fixing this output has no effect on
play).
2) Fixes go depth output (now based on rootDepth again, no effect on play), also
reported in issue #1787
3) The depth lastBestDepth is used to compute how long a move is stable, a new move
found during fail-high is incorrectly considered stable if based on adjustedDepth
instead of rootDepth (this changes time management). Reverting this passed STC
and LTC:
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 82982 W: 17810 L: 17808 D: 47364
http://tests.stockfishchess.org/tests/view/5bd391a80ebc595e0ae1e993
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 109083 W: 17602 L: 17619 D: 73862
http://tests.stockfishchess.org/tests/view/5bd40c820ebc595e0ae1f1fb
4) In the thread voting scheme, the rank of the fail-high thread is now artificially
low, incorrectly since the quality of the move is much better than what adjustedDepth
suggests (e.g. if it takes 10 iterations to find VALUE_KNOWN_WIN, it has very low
depth). Further evidence comes from a test that showed that the move of highest
depth is not better than that of the last PV (which is potentially of much lower
adjustedDepth).
I.e. this test http://tests.stockfishchess.org/tests/view/5bd37a120ebc595e0ae1e7c3
failed SPRT[0, 5]:
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10609 W: 2266 L: 2345 D: 5998
In a running 5+0.05 th 8 test (more than 10000 games) a positive Elo estimate is
shown (strong enough for a [-3,1], possibly not [0,4]):
http://tests.stockfishchess.org/tests/view/5bd421be0ebc595e0ae1f315
LLR: -0.13 (-2.94,2.94) [0.00,4.00]
Total: 13644 W: 2573 L: 2532 D: 8539
Elo 1.04 [-2.52,4.61] / LOS 71%
Thus, restore old behavior as a bugfix, keeping the core of the fail-high patch
idea as resolving scheme. This is non-functional for bench, but changes searches
via time management and in the threaded case.
Bench: 3556672
This helps resolving consecutive FH's during aspiration more efficiently
STC:
http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4992 W: 1134 L: 980 D: 2878 Elo +10.72
LTC:
http://tests.stockfishchess.org/tests/view/5bc868050ebc592439f857ef
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8123 W: 1363 L: 1210 D: 5550 Elo +6.54
No-Regression test with 8 threads, tc=15+0.15:
http://tests.stockfishchess.org/tests/view/5bc874ca0ebc592439f85938
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 24740 W: 3977 L: 3863 D: 16900 Elo +1.60
This was a cooperation between me and Michael Stembera:
-me recognizing SF having problems with resolving FH's efficiently at
high depths, thus starting some tests based on consecutive FH's.
-mstembera picking up the idea with first success at STC & LTC (so full
credits to him!)
-me suggesting how to resolve the issues pinpointed by S.G on PR #1768
and finally restricting the logic to the main thread so that it don't
regresses at multi-thread.
bench: 3314347
Enable numa machinery only for STRICTLY MORE than 8 threads. Reason for this
change is that nowadays SMP tests are always done with 8 threads. That is a
problem for multi-socket Windows machines running on fishtest.
No functional change
The patch adds a small random component (+-1) to VALUE_DRAW for the evaluation
of draw positions (mostly 3folds). This random component is not static, but
potentially different for each visit of the node (hence derived from the node
counter). The effect is that in positions with many 3fold draw lines, different
lines are followed at each iteration. This keeps the search much more dynamic,
as opposed to being locked to one particular 3fold.
An example of a position where master suffers from 3fold-blindness and this patch
solves quickly is the famous TCEC game 53:
FEN: 3r2k1/pr6/1p3q1p/5R2/3P3p/8/5RP1/3Q2K1 b - - 0 51
master doesn't see that this is a lost position (draw eval up to depth 50) as
Qf6-e6 d4-d5 (found by patch at depth 23) leads to a loss.
The 3fold-blindness is more important at longer TC, the patch was yellow STC and
LTC, but passed VLTC:
STC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 46328 W: 10048 L: 9953 D: 26327
http://tests.stockfishchess.org/tests/view/5b9c0ca20ebc592cf275f7c7
LTC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 54663 W: 8938 L: 8846 D: 36879
http://tests.stockfishchess.org/tests/view/5b9ca1610ebc592cf27601d3
VLTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 31789 W: 4512 L: 4284 D: 22993
http://tests.stockfishchess.org/tests/view/5b9d1a670ebc592cf276076d
Credit to @crossbr for pointing to this problem repeatedly, and giving the hint
that many draw lines are typical in those situations.
Bench: 4756639
Currently we update (track up) the pv even in the fail high case.
However most times in such cases the pv in the ply below remains unset
because there we have value == alpha and so finally we see truncated
pv's (=just one move) in fail high cases.
Of course tracking down these pv's (+sending them to the gui) comes at a
certian cost, but no-regression tests passed:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16300 W: 3556 L: 3424 D: 9320
http://tests.stockfishchess.org/tests/view/5b9b73500ebc592cf275ea92
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 202411 W: 32734 L: 32897 D: 136780
http://tests.stockfishchess.org/tests/view/5b9baed10ebc592cf275ef6d
N.B.: Digging also into qsearch was tried in another version but seemed
not to pass the tests. This means that we don't always will get a pv
until the very tips.
No functional change
This PR is a combination of two unrelated [0, 4] patches that appeared promising
but not quite strong enough to pass on their own. The combination initially failed
STC with a positive score after a long run, and the subsequent speculative LTC test
passed.
* tweak_threatOnQueen4 :
Increase the middlegame components of ThreatByMinor[QUEEN]
and ThreatByRook[QUEEN] by 15 each. Bryan's (@crossbr) analysis of CCC Bonus Game 10
inspired several tests on penalizing a queen with limited safe mobility. While
attempting to implement this idea, I noticed that when I did not include the queen's
current square in the calculations, the Elo gains seemed to vanish--and only then did
I have the idea to revisit ThreatByMinor[QUEEN] and ThreatByRook[QUEEN], adding a
corresponding value to each. Without Bryan's work, this test would never have been
submitted. I would also like to recognize the efforts and contributions of @SFisGOD,
who also vigorously worked on this idea.
* Use pure static eval for null move pruning :
This idea was directly re-purposed from a promising test by Jerry Donald Watson
(@jerrydonaldwatson) in August. It was also independently developed and tested by
Stefan Geschwentner (@locutus2) previously.
Thank you all!
STC (failed yellow):
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 83913 W: 17986 L: 17825 D: 48102
http://tests.stockfishchess.org/tests/view/5bbc59300ebc592439f76aa5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 137198 W: 22351 L: 21772 D: 93075
http://tests.stockfishchess.org/tests/view/5bbce35f0ebc592439f77639
Bench: 4312846
Note by snicolet: I use this non-functional change patch
as a pretext to correct the wrong bench number I introduced
in the message of the previous commit.
Bench: 4059356
Storing unconditionally the current generation and bound is equivalent to master.
Part of the condition was added as a speed optimization in #429.
Here the branch is fully eliminated.
passed STC single-threaded:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 73515 W: 16378 L: 16359 D: 40778
http://tests.stockfishchess.org/tests/view/5b2fc38c0ebc5902b2e57fd5
passed STC multi-threaded:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 63725 W: 12916 L: 12874 D: 37935
http://tests.stockfishchess.org/tests/view/5b307b8f0ebc5902b2e5895f
The multithreaded test was run after a plausible suggestion by @mstembera that the effect of this could be larger with many cores. The result seems to indicate this doesn't really matter on the 8core architecture abundantly available on fishtest.
No functional change
a) Reduce PSQT values along the long diagonals on non-central squares
and increase the LongDiagonal bonus accordingly. The effect is to penalise
bishops on the long diagonal which can not "see" the 2 central squares.
The "good" bishops still have more or less the same bonus as current master.
b) For a bishop on a central square, because of the "| s" term in the code,
the LongDiagonalBonus was always given. So while being there, remove the "| s"
and compensate the central Bishop PSQT accordingly.
Passed STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 44498 W: 9658 L: 9323 D: 25517
http://tests.stockfishchess.org/tests/view/5b8992770ebc592cf2748942
Passed LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 63092 W: 10324 L: 9975 D: 42793
http://tests.stockfishchess.org/tests/view/5b89a17a0ebc592cf2748b59
Closes https://github.com/official-stockfish/Stockfish/pull/1760
bench: 4693901
It looks like PawnsOnBothFlanks can be removed from initiative().
A barrage of tests seem to confirm that the adjustment to -110
does not gain elo to offset any potential loss by removing
PawnsOnBothFlanks.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22014 W: 4760 L: 4639 D: 12615
http://tests.stockfishchess.org/tests/view/5b7f50cc0ebc5902bdbb3a3e
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40561 W: 6667 L: 6577 D: 27317
http://tests.stockfishchess.org/tests/view/5b801f9f0ebc5902bdbb4467
The barrage of 0,4 tests on the -136 value are in my ps_tunetests branch.
http://tests.stockfishchess.org/tests/user/protonspring
Closes https://github.com/official-stockfish/Stockfish/pull/1751
Bench: 4413173
-------------
How to continue from there?
The fact that endgames with all the pawns on only one flank are
drawish is a well-known chess idea, so it seems quite strange that
this can be removed so easily without losing Elo.
In the past there had been attempts to improve on PawnsOnBothFlanks
with similar concepts (for instance using the pawn span value), but
the tests were at best neutral. Maybe Stockfish is now mature enough
that these refined ideas would work to replace PawnsOnBothFlanks?
There is no need to make this as large as 65536 just for the sake of the
single 7-man tablebase that happens to have the key 0xf9247fff. Idea for the
fix by Ronald de Man, who suggested simply to allow more buckets past the end.
We also implement Robin Hood hashing for the hash table, which takes the worst
-case search for full 7-man tablebases down from 68 to 11 probes (Also takes
the average probe length from 2.06 to 2.05). For a table with 8K entries, the
corresponding numbers would be worst-case from 9 to 4, with average from 1.30
to 1.29.
https://github.com/official-stockfish/Stockfish/pull/1747
No functional change
This commit tries to make the new pure static eval code more readable by
splitting up the nested assignments into separate lines and making a few
more cosmetic tweaks.
No functional change.
This is a non-functional change. By pre-incrementing minKingPawnDistance
instead of post-incrementing, we can remove this -1.
This also makes DistanceRing more consistent with the rest of stockfish
since it now holds an actual "distance" instead of a less natural distance-1.
In current master, PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][0]
With this patch, it will be PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][1]
ie squares at distance 1 from the king. This is more natural use of distance.
The current array size DistanceRingBB[SQUARE_NB][8] is still OK with the new
definition, because maximum distance between two squares on a chess board is
seven (for example Kh1 and a8).
No functional change.
Follow-up for the previous patch: we use an affine formula to mix stats
and evaluation in search. The idea is to give a bonus if the previous
move of the opponent was historically bad, and a malus if the previous
move of the opponent was historically good.
More precisely, if x is the stat score of the previous move by the opponent,
we implement the following formulas to tweak the evaluation at an internal
node of the tree for our pruning decisions at this node:
if x = 0, use v' = eval(P)
if x > 0, use v' = eval(P) - 5 - x/1024
if x < 0, use v' = eval(P) + 5 - x/1024
For reference, the previous master had this simpler rule:
if x > 0, use v' = eval(P) - 10
if x <= 0, use v' = eval(P)
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29322 W: 6359 L: 6088 D: 16875
http://tests.stockfishchess.org/tests/view/5b76a5980ebc5902bdba957f
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 30893 W: 5154 L: 4910 D: 20829
http://tests.stockfishchess.org/tests/view/5b76ca6d0ebc5902bdba9914
Closes https://github.com/official-stockfish/Stockfish/pull/1740
Bench: 4592766
Mix search stats with evaluation: if the opponent's move has a good historyStat,
then decrease the evaluation of the internal node a bit for the pruning decisions
during search.
STC;
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 72083 W: 15683 L: 15203 D: 41197
http://tests.stockfishchess.org/tests/view/5b74c3ea0ebc5902bdba7d41
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29104 W: 4867 L: 4630 D: 19607
http://tests.stockfishchess.org/tests/view/5b7565000ebc5902bdba851b
Closes https://github.com/official-stockfish/Stockfish/pull/1738
Bench: 4514101
-----------
How to continue from there?
• the use of the previous stat score can probably be simplified in lines 587 and 716
• we could try to use a continuous bonus based on the previous stat score, instead
of just a fixed offset of -10 when the opponent previous move was good.
----------
Comments by Stefan Geschwentner:
Interesting idea. Because only the eval in search is tweak this should only
influence the eval and static eval used at inner nodes, and not on the return
search value (which comes in the end from quiescence search), except through
saving in TT followed by a TT cutoff.
So essentialy this effects diverse pruning/reduction parts -- eval and static
eval are lowered for good opponent moves:
• tt cutoff (ttValue)
• improving (static eval)
• more razoring (eval)
• less futility pruning (eval)
• less null move pruning (eval + static eval) (but with little more depth)
• more probcut (static eval)
• more move futility pruning (static eval)
We double in this patch the weight of the capture history table in the
local scoring of captures for move ordering.
The capture history table is indexed by the triplet (capturing piece,
capture square, captured piece) and gets information like "it seems to
have been historically good in that part of the search tree to capture
a pawn with a rook on g3, even if it seems to lose material", and affect
the normaly pure « Most Valuable Victim » ordering of captures.
Finished yellow at STC after 228842 games (posting a +1.36 Elo gain):
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 228842 W: 50894 L: 50152 D: 127796
http://tests.stockfishchess.org/tests/view/5b714bb00ebc5902bdba332d
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 43251 W: 7425 L: 7131 D: 28695
http://tests.stockfishchess.org/tests/view/5b71c7d40ebc5902bdba3e51
Thanks to user Vizvezdenec for running the LTC test.
Closes https://github.com/official-stockfish/Stockfish/pull/1736
Bench: 4272361
This patch introduces a non-linear bonus for pawns, along with some
(linear) corrections for the other pieces types.
The original values were obtained by a massive non-linear tuning of both
pawns and other pieces by GuardianRM, while Alain Savard and Chris Cain
later simplified the patch by observing that, apart from the pawn case, the
tuned corrections were in fact almost affine and could be incorporated in
our current code base via the piece values in types.h (offset) and the diagonal
of the quadratic matrix (slope). See discussion in PR#1725 :
https://github.com/official-stockfish/Stockfish/pull/1725
STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 42948 W: 9662 L: 9317 D: 23969
http://tests.stockfishchess.org/tests/view/5b6ff6e60ebc5902bdba1d87
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 19683 W: 3409 L: 3206 D: 13068
http://tests.stockfishchess.org/tests/view/5b702dbd0ebc5902bdba216b
How to continue from there?
- Maybe the non-linearity for the pawn value could be somewhat tempered
again and a simpler linear correction for pawns would work?
Closes https://github.com/official-stockfish/Stockfish/pull/1734
Bench: 4681496
We simplify the razoring logic by applying it to all nodes at depth 1 only.
An added advantage is that only one razor margin is needed now, and we treat
PV and Non-PV nodes in the same manner.
How to continue?
- There may be some conditions in which depth 2 razoring is beneficial.
- We can see whether the razor margin can be tuned, perhaps even with a
different value for PV nodes.
- Perhaps we can unify the treatment of PV and Non-PV nodes in other parts
of the search as well.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 5474 W: 1281 L: 1127 D: 3066
http://tests.stockfishchess.org/tests/view/5b6de3b20ebc5902bdba0d1e
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 62670 W: 10749 L: 10697 D: 41224
http://tests.stockfishchess.org/tests/view/5b6dee340ebc5902bdba0eb0
In addition, we ran a fixed LTC test against a similar patch which also
passed SPRT [-3, 1]:
ELO: 0.23 +-2.1 (95%) LOS: 58.6%
Total: 36412 W: 6168 L: 6144 D: 24100
http://tests.stockfishchess.org/tests/view/5b6e83940ebc5902bdba1485
We are opting for this patch as the more logical and simple of the two,
and it appears to be no less strong. Thanks in particular to @DU-jdto
for input into this patch.
Bench: 4476945
Currently, we do not consider pawns passed if there is another pawn of
the same color in front of them. It appears that this condition is not
necessary. The idea is that the doubled pawns are likely to be weak and
one of them will be likely captured anyway. On the other hand, if we do
somehow manage to promote a pawn, then the pawn behind it becomes passed
as well. In any case, the end result is we end up with an extra
potentially passed pawn. The current evaluation for passed pawns already
handles this case by also scaling down this effect.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28291 W: 6287 L: 6178 D: 15826
http://tests.stockfishchess.org/tests/view/5b6c4b960ebc5902bdb9f256
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30717 W: 5256 L: 5151 D: 20310
http://tests.stockfishchess.org/tests/view/5b6c82980ebc5902bdb9f863
Bench: 4938285
Unify the "quiet" and "non-quiet" reduction rules for use at any kind of moves.
The idea behind it was that both rules reduce at similiar cases in master:
one directly for late previous moves and the other indirectly by using a
bad stat score which is used for most move sorting and so approximates the
late move condition.
For captures/promotions the old rule was triggered in 25% but the new
rule only for 3% of all cases (so now more reductions are done, whereas
for quiet moves reductions keep the same level).
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 162327 W: 35976 L: 36134 D: 90217
http://tests.stockfishchess.org/tests/view/5b6a9a430ebc5902bdb9d5c1
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 29570 W: 5083 L: 4976 D: 19511
http://tests.stockfishchess.org/tests/view/5b6bc5d00ebc5902bdb9e9d6
Bench: 4526980
Currently, we first calculate some bitboards at the top of Evaluation::space()
and then check whether we actually need them. Invert the ordering. Of course this
does not make a difference in current master because the constexpr bitboard
calculations are in fact done at compile time by any decent compiler, but I find
my version a bit healthier since it will always meet or exceed current implementation
even if we eventually change the spaceMask to something not contsexpr.
No functional change.
After a session of tuning for King Psqt I got some new values, which was later
tweaked manually by me Fauzi, to result in an Elo-gain patch which seems to scale
pretty well:
STC: LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 100653 W: 22550 L: 22314 D: 55789 [Yellow patch]
LTC: LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 147079 W: 25584 L: 24947 D: 96548 [Green Patch]
Bench: 4669050
Introduce voting system for best move selction in multi-threads mode.
Joint work with Stefan Geschwentner, based on ideas introduced by
Michael Stembera.
Moves are upvoted by every thread using the margin to the minimum score
across threads and the completed depth.
First thread voting for the winner move is selected as best thread.
Passed STC, LTC. A further LTC test with only 4 threads failed with positive
score. A LTC with 31 threads was stopped with LLR 0.77 after 25k games to
avoid use of excessive resources (equivalent to 1.5M STC games).
Similar ideas were proposed by Michael Stembera 2 years ago #507, #508.
This implementation seems simpler and more understandable, the results
slightly more promising.
Further possible work:
1) Tweak of the formula using for assigning votes.
2) Use a different baseline for the score dependent part: maximum score
or winning probability could make more sense.
3) Assign votes in `Thread::Search` as iterations are completed and use
voting results to stop search.
4) Select best thread as the threads voting for best move with the highest
completed depth or, alternatively, vote on PV moves.
Link to SPRT tests
[stopped LTC, 31 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b61dc090ebc5902bdb95192)
LLR: 0.77 (-2.94,2.94) [0.00,5.00]
Total: 25602 W: 3977 L: 3850 D: 17775
Elo: 1.70 [-0.68,4.07] (95%)
[passed LTC, 8 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b5df5180ebc5902bdb9162d)
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 44478 W: 7602 L: 7300 D: 29576
Elo: 1.92 [-0.29,3.94] (95%)
[failed LTC, 4 threads 20+0.02](http://tests.stockfishchess.org/tests/view/5b5f39ef0ebc5902bdb92792)
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 29922 W: 5286 L: 5285 D: 19351
Elo: 0.48 [-1.98,3.10] (95%)
[passed STC, 4 threads 5+0.05](http://tests.stockfishchess.org/tests/view/5b5dbf0f0ebc5902bdb9131c)
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9108 W: 2033 L: 1858 D: 5217
Elo: 6.11 [1.26,10.89] (95%)
No functional change (in simple threat mode)
Use operator const T&() instead of operator T() to avoid possible
costly hidden copies of non-scalar nested types.
Currently StatsEntry has a single member T, so assuming
sizeof(StatsEntry) == sizeof(T) it happens to work, but it's
better to use the size of the proper entry type in std::fill.
Note that current code works because std::array items are ensured
to be allocated in contiguous memory and there is no padding among
nested arrays. The latter condition does not seem to be strictly
enforced by the standard, so be careful here.
Finally use address-of operator instead of get() to fully hide the
wrapper class StatsEntry at calling sites. For completness add
the arrow operator too and simplify the C++ code a bit more.
Same binary code as previous master under the Clang compiler.
No functional change.
As a note, current 2 LMR conditions on stat score
could be simplified in a single line:
r -= ((ss->statScore >= 0) - ((ss-1)->statScore >= 0)) * ONE_PLY;
We keep them splitted in 2 "if" statements because are easier
to (immediately) read.
No functional change.
This is the first patch teaching Stockfish how to use the 7-pieces
Syzygy tablebase currently calculated by Bujun Guo (@noobpwnftw) and
Ronald de Man (@syzygy1). The 7-pieces database are so big that they
required a change in the internal format of the files (technically,
some DTZ values are 16 bits long, so this had to be stored as wide
integers in the Huffman tree).
Here are the estimated file size for the 7-pieces Syzygy files,
compared to the 151G of the 6-pieces Syzygy:
```
7.1T ./7men_testing/4v3_pawnful (ongoing, 120 of 325 sets remaining)
2.4T ./7men_testing/4v3_pawnless
2.3T ./7men_testing/5v2_pawnful
660G ./7men_testing/5v2_pawnless
117G ./7men_testing/6v1_pawnful
87G ./7men_testing/6v1_pawnless
```
Some pointers to download or recalculate the tables:
Location of original files, by Bujun Guo:
ftp://ftp.chessdb.cn/pub/syzygy/
Mirrors:
http://tablebase.sesse.net/ (partial)
http://tablebase.lichess.ovh/tables/standard/7/
Generator code:
https://github.com/syzygy1/tb/
Closes https://github.com/official-stockfish/Stockfish/pull/1707
Bench: 5591925 (No functional change if SyzygyTB is not used)
----------------------
Comment by Leonardo Ljubičić (@DragonMist)
This is an amazing achievement, generating and being able to use 7 men syzygy
on the fly. Thank you for your efforts @noobpwnftw !! Looking forward how this
will work in real life, and expecting some trade off between gaining perfect
play and slow disc Access, but once the disc speed and space is not a problem,
I expect 7 men to yield something like 30 elo at least.
-----------------------
Comment by Michael Byrne (@MichaelB7)
This definitely has a bright future. I turned off the 50 move rule (ala ICCF
new rules) for the following position: `[d]8/8/1b6/8/4N2r/1k6/7B/R1K5 w - - 0 1`
This position is a 451 ply win for white (sans the 50 move rule, this position
was identified by the generator as the longest cursed win for white in KRBN v KRB).
Now Stockfish finds it instantly (as it should), nice work 👊👍 .
```
dep score nodes time
7 +132.79 4339 0:00.00 Rb1+ Kc4 Nd6+ Kc5 Bg1+ Kxd6 Rxb6+ Kc7 Be3 Rh2 Bd4
6 +132.79 1652 0:00.00 Rb1+ Kc4 Nd2+ Kd5 Rxb6 Rxh2 Nf3 Rf2
5 +132.79 589 0:00.00 Rb1+ Kc4 Rxb6 Rxh2 Nf6 Rh1+ Kb2
4 +132.79 308 0:00.00 Rb1+ Kc4 Nd6+ Kc3 Rxb6 Rxh2
3 +132.79 88 0:00.00 Rb1+ Ka4 Nc3+ Ka5 Ra1+ Kb4 Ra4+ Kxc3 Rxh4
2 +132.79 54 0:00.00 Rb1+ Ka4 Nc3+ Ka5 Ra1+ Kb4
1 +132.7
```
This patch adds the tropism measure as a new term in the king danger variable.
Since we then trasform this variable as a Score via a quadratic formula, the
main effect of the patch is the positive correlation of the tropism measure
with some checks and pins information already present in the king danger code.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 6805 W: 1597 L: 1431 D: 3777
http://tests.stockfishchess.org/tests/view/5b5df8d10ebc5902bdb91699
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 32872 W: 5782 L: 5523 D: 21567
http://tests.stockfishchess.org/tests/view/5b5e08d80ebc5902bdb917ee
How to continue from there?
• it may be possible to use CloseEnemies=S(7,0)
• we may want to try incorporating other strategic features in the quadratic
king danger.
Closes https://github.com/official-stockfish/Stockfish/pull/1717
Bench: 5591925
The previous commit wouldn't compile on the Microsoft Virtual Studio C++ compiler. So use a more compatible style for the same idea (which we already use in numerous places of evaluate.cpp, for instance in line 563).
Under the Clang compiler, both versions generate exactly the same machine code (same md5 signatures for the two binaries).
No functional change.
Remove a popcount for HinderPassedPawn, and compensate by doubling
the bonus from S(4,0) to to S(8,0).
Maybe it was pure luck, but we got the idea of this Elo gaining patch by
seing the simplification attempt by Mike Whiteley in pull request #1703.
This suggests that whenever we have a passed evaluation simplification,
we should consider the possibility that the master bonus has become
slightly out of tune with time, and we should try a few Elo gaining [0..4]
tests by hand-tuning the master bonus.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 19136 W: 4388 L: 4147 D: 10601
http://tests.stockfishchess.org/tests/view/5b59be6f0ebc5902bdb8ac06
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 99382 W: 17324 L: 16843 D: 65215
http://tests.stockfishchess.org/tests/view/5b59d2410ebc5902bdb8afa8
Closes https://github.com/official-stockfish/Stockfish/pull/1710
Bench: 4688817
This tweak excludes files D and E from the KingFlank bitboard when our
king is on the A or H files respectively. As far as I can tell, this
affects two things: the calculation for CloseEnemies and PawnlessFlank.
Aside from filtering out slightly less relevant attacks in the flank,
I suspect this helps with king prophylaxis, avoiding attacks and moving
towards the center when the pawns start to come off.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 56755 W: 12881 L: 12489 D: 31385
http://tests.stockfishchess.org/tests/view/5b58a94c0ebc5902bdb88c72
LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 130205 W: 22536 L: 21957 D: 85712
http://tests.stockfishchess.org/tests/view/5b58b7580ebc5902bdb89029
How to continue: Tweaking the two bonuses mentioned might give some
gain, although as far as I can tell, CloseEnemies is very sensitive to
even small changes.
Closes https://github.com/official-stockfish/Stockfish/pull/1705
Bench: 5026009
When evaluating threat by safe pawn and pawn push the same expression is used.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19444 W: 4540 L: 4309 D: 10595
http://tests.stockfishchess.org/tests/view/5b5a6e150ebc5902bdb8c5c0
Closes https://github.com/official-stockfish/Stockfish/pull/1709
No functional change.
--------------------
Comments by Stéphane Nicolet:
I don't measure any speed-up on my system, with two parallel benches at depth 22:
Total time (ms) : 74989
Nodes searched : 144830258
Nodes/second : 1931353
master
Total time (ms) : 75341
Nodes searched : 144830258
Nodes/second : 1922329
testedpatch
And anyway, like Stefan Geschwentner, I don't think that a 0.3% speed-up would
be enough to pass a [0..5] LTC test -- as a first approximation, we have this
rule of thumb that 1% speed-up gives about 1 Elo point.
However, considering the facts that the reformatting by itself is interesting,
that this is your first green test and that you played by the rules by running
the SPRT[0..5] test before opening the pull request, I will commit the change.
I will only take the liberty to change the occurrences of safe in lines 590 and
591 to b, to make the code more similar to lines 584 and 585.
So approved, and congrats :-)
This patch implements some idea by Alain Savard and Mike Whiteley taken from the perpertual renaming/reformatting thread.
This is a pure code cleaning patch (so no change in functionality), but I use it as a pretext to correct the bogus bench number that I introduced in the previous commit.
Bench: 4413383
This patch reverts the recent commit called "Tweak reductions formula, etc."
The decisions for the revert decision were as follows:
1) The original commit called "Tweak reductions formula: 0.88 * depth + 0.12"
showed bad scaling at in a Very Long Time Control (VLTC) test:
VLTC (180+1.8):
LLR: -1.59 (-2.94,2.94) [0.00,5.00]
Total: 14968 W: 2247 L: 2257 D: 10464
http://tests.stockfishchess.org/tests/view/5b559ffa0ebc5902bdb84f36
2) So there was a suspicion that the original fast passing LTC test which lead
us to accept the patch may have been a statistical accident, so we organized
a match against the previous master at LTC to get an Elo estimate for the
patch:
LTC match:
ELO: -1.83 +-2.1 (95%) LOS: 4.3%
Total: 36018 W: 6018 L: 6208 D: 23792
http://tests.stockfishchess.org/tests/view/5b55f8110ebc5902bdb8526f
3) Based on these results, we ran a simplification test with [-3..1] bounds
for the revert at LTC:
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41501 W: 7107 L: 7020 D: 27374
http://tests.stockfishchess.org/tests/view/5b5738670ebc5902bdb86932
4) So we revert.
Bench: 4491691
There seems to be some strange interaction between Overload and Connectivity.
Overload encourages us to not have too many defended and attacked pieces,
as this may expose us to various tactics. This feels somewhat like it is in
conflict with Connectivity, where pieces are defended preemptively.
Here I take the "pick one or the other" approach and just remove connectivity,
while strengthening the effect of Overload to compensate. The reasoning is that
if we defend our pieces preemptively, then it does get attacked, we want to do
something about it so we don't get penalized by Overload. On the other
hand, if it doesn't get attacked, then there's no need to defend it.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 27734 W: 6174 L: 6064 D: 15496
http://tests.stockfishchess.org/tests/view/5b5073bd0ebc5902bdb7ba5c
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51606 W: 8897 L: 8827 D: 33882
http://tests.stockfishchess.org/tests/view/5b50aa900ebc5902bdb7bf29
Bench: 4658006
After some recent big tuning session, the values for King Protector were
simplified to only be used on minor pieces. This patch tries to further
simplify by just using a single value, since current S(6,5) and S(5,6)
are close to each other. The value S(6,6) ended up passing, although
S(5,5) was also tried and failed STC.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 14261 W: 3288 L: 3151 D: 7822
http://tests.stockfishchess.org/tests/view/5b4ccdf50ebc5902bdb77f65
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19606 W: 3396 L: 3273 D: 12937
http://tests.stockfishchess.org/tests/view/5b4ce4280ebc5902bdb7803b
Bench: 5448998
Extend the bonus for Overload to cases where our side
has more than one attacker to a non pawn piece.
Based on an idea by Bryan in the forum. For instance,
now black gets the overload bonus in this position:
8/5R1k/6pb/p6p/P1N4P/1Pp5/2K3P1/2N4r b - - 6 46
because two black pieces are attacking the knight on c1
that is defended only by the king.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 57446 W: 12762 L: 12711 D: 31973
http://tests.stockfishchess.org/tests/view/5b4ca9970ebc5902bdb77a88
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42113 W: 7295 L: 7209 D: 27609
http://tests.stockfishchess.org/tests/view/5b4ccea00ebc5902bdb77f69
Bench: 4667263
For the rationale to allow this, see commit
a66c73deef
This was broken when cuckoo hashing was added, and
subtly broke (for example) lichess' Android application,
thus illustrating the original judgement was sound.
No functional change.
Various king and pawn eval values tuned after 2 million games. Rounding
slightly adjusted.
LTC: http://tests.stockfishchess.org/tests/view/5b477a260ebc5978f4be3ed4
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 32783 W: 5852 L: 5588 D: 21343
STC: http://tests.stockfishchess.org/tests/view/5b472d420ebc5978f4be3e4d
LLR: 3.23 (-2.94,2.94) [0.00,4.00]
Total: 44380 W: 10201 L: 9841 D: 24338
I think I reached the limit of the fishtest framework. It frequently
crashed at 2 million games already. The small values also moved a lot
throughout the entire tuning session though with smaller margin. The
passed danger and close enemies values seems the most sensitive (changing
close enemies alone to 6 failed before but now it passes), whether or not
they are close to optimal I don't know, but it seems some parameters are
also correlated to others.
Closes https://github.com/official-stockfish/Stockfish/pull/1670
bench: 5103722
In the current master, ThreatByKing is an array of two Scores, one for
when we have a single attack and one for when we have many. The latter
case is very rarely called during bench and was recently given a strange
negative value during a tuning run, as pointed out by @candirufish on
commit efd4ca2. Here, we simplify away this second case entirely, and
increase the remaining ThreatByKing to compensate.
Although I derived the parameter tweak independently, with the goal of
preserving the same average bonus, I later noticed that a very similar
Score had already been derived by an ongoing SPSA tuning session.
I therefore recognize @candirufish for first discovering these values.
I would also like to thank @Rocky640 for valuable feedback that pointed
me in the direction of ThreatByKing.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7677 W: 1772 L: 1623 D: 4282
http://tests.stockfishchess.org/tests/view/5b3db0320ebc5902b9ffe97a
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 108031 W: 18329 L: 18350 D: 71352
http://tests.stockfishchess.org/tests/view/5b3dbf4b0ebc5902b9ffe9db
Closes https://github.com/official-stockfish/Stockfish/pull/1666
Bench: 4678861
In current master, the function make_bitboard() does nothing apart from
helping initialize the SquareBB[] array. This seems like an unnecessary
abstraction layer.
The advantage of make_bitboard() is we can define a bitboard, in a simple
and general way, not only from a single square but also from a list of
squares. It is more elegant, faster and readable than combining multiple
SquareBB explicitly, but the last complex use case in evaluation was
simplified away a few months ago.
If make_bitboard() becomes useful again to define complicated bitboards,
it will be easy enough to reintroduce it using this pull request as
an implementation reference.
No functional change.
Make sure each piece is not scored more than once as a passed pawn "hinderer",
by scoring only the blockers along the passed pawn path. Inspired by TCEC Game 29.
Passed STC as a simplification
http://tests.stockfishchess.org/tests/view/5b3016d00ebc5902b2e58552
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 75388 W: 16656 L: 16641 D: 42091
Passed LTC as a simplification
http://tests.stockfishchess.org/tests/view/5b302ed90ebc5902b2e587fc
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49157 W: 8460 L: 8386 D: 32311
Current master was also counting the number of attacks along a passed pawn path,
which might be misleading:
a) a defender might be counted many times for the same pawn path. For example a
White rook on a1 attacking a black pawn on a7 would score the bonus * 6 but
would be probably better placed on a8
b) a defender might be counted on different pawn paths and might be overloaded. For
example a Ke4 or Qe4 against pawns on d6 and f6 would score the bonus * 6.
Counting each blocker or attacker only once is more complicated, and does not help
either: http://tests.stockfishchess.org/tests/view/5b2ff1cb0ebc5902b2e582b2
After this small simplification, there might be ways to increase the HinderPassedPawn
penalty.
Closes https://github.com/official-stockfish/Stockfish/pull/1661
Bench: 4520519
Silences the following warnings when compiling with GCC 8.
The fix is to use an intermediate pointer to anonymous function:
```
misc.cpp: In function 'int WinProcGroup::get_group(size_t)':
misc.cpp:241:77: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun1_t' {aka 'bool (*)(_LOGICAL_PROCESSOR_RELATIONSHIP, _SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*, long unsigned int*)'} [-Wcast-function-type]
auto fun1 = (fun1_t)GetProcAddress(k32, "GetLogicalProcessorInformationEx");
^
misc.cpp: In function 'void WinProcGroup::bindThisThread(size_t)':
misc.cpp:309:71: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun2_t' {aka 'bool (*)(short unsigned int, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun2 = (fun2_t)GetProcAddress(k32, "GetNumaNodeProcessorMaskEx");
^
misc.cpp:310:67: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun3_t' {aka 'bool (*)(void*, const _GROUP_AFFINITY*, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun3 = (fun3_t)GetProcAddress(k32, "SetThreadGroupAffinity");
^
```
No functional change.
Compiling the current master with MSVC gives the following error:
```
search.cpp(956): error C2660: 'operator *': function does not take 1 arguments
types.h(303): note: see declaration of 'operator *'
```
This was introduced in commit:
https://github.com/official-stockfish/Stockfish/commit/88de112b84a5285c2afb3e075a05c2ab8ad3fd33
We use a suggestion by @vondele to fix the error, thanks!
No functional change.
[STC](http://tests.stockfishchess.org/tests/view/5b2614000ebc5902b8d17193)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17733 W: 3996 L: 3866 D: 9871
[LTC](http://tests.stockfishchess.org/tests/view/5b264d0f0ebc5902b8d17206)
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55524 W: 9535 L: 9471 D: 36518
Use pawn count scaling also for opposite bishops endings with additional material, with a slope of 2 instead of 7. This simplifies slightly the code.
This PR is a functionally equivalent refactoring of the version which was submitted.
Four versions tried, 2 passed both STC and LTC. I picked the one which seemed more promising at LTC.
Slope 4 passed STC (-0.54 Elo), LTC not attempted
Slope 3 passed STC (+2.51 Elo), LTC (-0.44 Elo)
Slope 2 passed STC (+2.09 Elo), LTC (+0.04 Elo)
Slope 1 passed STC (+0.90 Elo), failed LTC (-3.40 Elo)
Bench: 4761613
I believe using foward_file_bb() here is fewer instructions.
a) Fewer instructions and probably more clear (debatable).
b) Possible that a lookup is slower than a few local operations, but the
forward_file_bb table is probably used often enough that it is always
cached.
Passed
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21004 W: 4263 L: 4141 D: 12600
http://tests.stockfishchess.org/tests/view/5b1cad830ebc5902ab9c6239
Closes https://github.com/official-stockfish/Stockfish/pull/1644
No functional change.
After a helpful suggestion from AppVeyor support staff, moving the Stockfish
execution from ps to cmd seems to work. Alternative to PR #1624 tested in PR #1637.
No functional change.
The '- 1' subtrahend was introduced for guarding against null move
search at root, which would be nonsense. But this is actually already
guaranteed by the !PvNode condition. This followed from the discussion
in pull request 1609: https://github.com/official-stockfish/Stockfish/pull/1609
No functional change
The function Position::has_repeated() is used by Tablebases::root_probe()
to determine whether we can rank all winning moves with the same value, or
if we need to strictly rank by dtz in case the position has already been
repeated once, and we are risking to run into the 50-move rule and thus
losing the win (especially critical in some very complicated endgames).
To check whether the current position or one of the previous positions
after the last zeroing move has already been occured once, we start looking
for a repetition of the current position, and if that is not the case, we
step one position back and repeat the check for that position, and so on.
If you now look at how this was done before the new root ranking patch was
merged two months ago, it seems quite obvious that it is a simple oversight:
https://github.com/official-stockfish/Stockfish/commit/108f0da4d7f993732aa2e854b8f3fa8ca6d3b46c
More specifically, after we stepped one position back with
```
stc = stc->previous;
```
we now have to start checking for a repetition with
```
StateInfo* stp = stc->previous->previous;
```
and not with
```
StateInfo* stp = st->previous->previous;
```
Closes https://github.com/official-stockfish/Stockfish/pull/1625
No functional change
Fix an error when compiling current master with MSVC due to the
ambiguity of which operator* overload was intended (reported by
Jarrod Torriero).
No functional change.
Makes sure the potential benefit of first touch does not depend on
the order of the UCI commands Threads and Hash, by reallocating the
hash if a Threads is issued. The cost is zeroing the TT once more
than needed. In case the prefered order (first Threads than Hash)
is employed, this amounts to zeroing the default sized TT (16Mb),
which is essentially instantaneous.
Follow up for https://github.com/official-stockfish/Stockfish/pull/1601
where additional data and discussion is available.
Closes https://github.com/official-stockfish/Stockfish/pull/1620
No functional change.
Stockfish currently takes a while to clear the TT when using larger hash sizes.
On one machine with 128 GB hash it takes about 50 seconds with a single thread,
allowing it to use all allocated cores brought that time down to 4 seconds on
some Linux systems. The patch was further tested on Windows and refined with
NUMA binding of the hash initializing threads (we refer to pull request #1601
for the complete discussion and the speed measurements).
Closes https://github.com/official-stockfish/Stockfish/pull/1601
No functional change
I was able to get this to pass which reduces BlockedByPawn to one dimension
with NO distance from edge offset.
GOOD) It's more simple and may provide additional clarity for further
simplifications. Facilitates migrating unblocked to one dimension as well.
BAD) If there is indeed a distance component to BlockedStorm (may or may
not be the case), this obfuscates this component into ShelterStrength and
UnblockedStorm. This may be more convoluted. Also, it may be more convenient
to have each of the three arrays (ShelterStrength, BlockedStorm, and UnBlocked)
be the same size.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 96173 W: 19326 L: 19343 D: 57504
http://tests.stockfishchess.org/tests/view/5b04544d0ebc5914abc12965
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49818 W: 7441 L: 7363 D: 35014
http://tests.stockfishchess.org/tests/view/5b0487d50ebc5914abc12990
Closes https://github.com/official-stockfish/Stockfish/pull/1611
Bench: 5133208
define Color us and use this instead of pos.side_to_move() and nmp_odd. The latter allows to clarify the nmp verification criterion.
Tested for no regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76713 W: 15303 L: 15284 D: 46126
http://tests.stockfishchess.org/tests/view/5b046a0d0ebc5914abc12971
No functional change.
Simplifying away all the progressKey stuff gives exactly the same bench,
without any speed impact. Tested for speed against master with two benches
at depth 22 ran in parallel:
**testedpatch**
Total time (ms) : 92350
Nodes searched : 178962949
Nodes/second : 1937877
**master**
Total time (ms) : 92358
Nodes searched : 178962949
Nodes/second : 1937709
We also tested the patch at STC for no-regression with [-3, 1] bounds:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 57299 W: 11529 L: 11474 D: 34296
http://tests.stockfishchess.org/tests/view/5b015a1c0ebc5914abc126e5
Closes https://github.com/official-stockfish/Stockfish/pull/1603
No functional change.
Default template parameters values and recursive functions do not play well
together. Fix for below errors that showed up after updating to latest MSVC.
````
tbprobe.cpp(1156): error C2672:
'search': no matching overloaded function found
tbprobe.cpp(1198): error C2783:
'Tablebases::WDLScore `anonymous-namespace'::search(Position &,Tablebases::ProbeState *)':
could not deduce template argument for 'CheckZeroingMoves'
````
Closes https://github.com/official-stockfish/Stockfish/pull/1594
No functional change.
A position which has a move which draws by repetition, or which could have
been reached from an earlier position in the game tree, is considered to be
at least a draw for the side to move.
Cycle detection algorithm by Marcel van Kervink:
https://marcelk.net/2013-04-06/paper/upcoming-rep-v2.pdf
----------------------------
How does the algorithm work in practice? The algorithm is an efficient
method to detect if the side to move has a drawing move, without doing any
move generation, thus possibly giving a cheap cutoffThe most interesting
conditions are both on line 1195:
```
if ( originalKey == (progressKey ^ stp->key)
|| progressKey == Zobrist::side)
```
This uses the position keys as a sort-of Bloom filter, to avoid the expensive
checks which follow. For "upcoming repetition" consider the opening Nf3 Nf6 Ng1.
The XOR of this position's key with the starting position gives their difference,
which can be used to look up black's repeating move (Ng8). But that look-up is
expensive, so line 1195 checks that the white pieces are on their original squares.
This is the subtlest part of the algorithm, but the basic idea in the above game
is there are 4 positions (starting position and the one after each move). An XOR
of the first pair (startpos and after Nf3) gives a key matching Nf3. An XOR of
the second pair (after Nf6 and after Ng1) gives a key matching the move Ng1. But
since the difference in each pair is the location of the white knight those keys
are "identical" (not quite because while there are 4 keys the the side to move
changed 3 times, so the keys differ by Zobrist::side). The loop containing line
1195 does this pair-wise XOR-ing.
Continuing the example, after line 1195 determines that the white pieces are
back where they started we still need to make sure the changes in the black
pieces represents a legal move. This is done by looking up the "moveKey" to
see if it corresponds to possible move, and that there are no pieces blocking
its way. There is the additional complication that, to match the behavior of
is_draw(), if the repetition is not inside the search tree then there must be
an additional repetition in the game history. Since a position can have more
than one upcoming repetition a simple count does not suffice. So there is a
search loop ending on line 1215.
On the other hand, the "no-progress' is the same thing but offset by 1 ply.
I like the concept but think it currently has minimal or negative benefit,
and I'd be happy to remove it if that would get the patch accepted. This
will not, however, save many lines of code.
-----------------------------
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 36430 W: 7446 L: 7150 D: 21834
http://tests.stockfishchess.org/tests/view/5afc123f0ebc591fdf408dfc
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 12998 W: 2045 L: 1876 D: 9077
http://tests.stockfishchess.org/tests/view/5afc2c630ebc591fdf408e0c
How could we continue after the patch:
• The code in search() that checks for cycles has numerous possible variants.
Perhaps the check need could be done in qsearch() too.
• The biggest improvement would be to get "no progress" to be of actual benefit,
and it would be helpful understand why it (probably) isn't. Perhaps there is an
interaction with the transposition table or the (fantastically complex) tree
search. Perhaps this would be hard to fix, but there may be a simple oversight.
Closes https://github.com/official-stockfish/Stockfish/pull/1575
Bench: 4550412
At PvNodes allow bonus for prior counter move that caused a fail low
for depth 1 and 2. Note : I did a speculative LTC on yellow STC patch
since history stats tend to be highly TC sensitive
STC (Yellow):
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 64295 W: 13042 L: 12873 D: 38380
http://tests.stockfishchess.org/tests/view/5af507c80ebc5968e6524153
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22407 W: 3413 L: 3211 D: 15783
http://tests.stockfishchess.org/tests/view/5af85dd40ebc591fdf408b87
Also use local variable excludedMove in NMP (marotear)
Bench: 5294316
Use the whole kingRing for pawn attackers instead of only the squares directly
around the king. This tends to give quite a lot more kingAttackersCount, so to
compensate and to avoid raising the king danger too fast we lower the values
in the KingAttackWeights array a little bit.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51892 W: 10723 L: 10369 D: 30800
http://tests.stockfishchess.org/tests/view/5af6d4dd0ebc5968e652428e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 24536 W: 3737 L: 3515 D: 17284
http://tests.stockfishchess.org/tests/view/5af709890ebc5968e65242ac
Credits to user @xoroshiro for the idea of using the kingRing for pawn attackers.
How to continue? It seems that the KingAttackWeights[] array stores values
which are quite Elo-sensitive, yet they have not been tuned with SPSA recently.
There might be easy Elo points to get there.
Closes https://github.com/official-stockfish/Stockfish/pull/1597
Bench: 5282815
Simplification: in king danger, include all blockers and not only pinned
pieces, since blockers enemy pieces can result in discovered checks which
are also bad.
STC http://tests.stockfishchess.org/tests/view/5af35f9f0ebc5968e6523fe9
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 145781 W: 29368 L: 29478 D: 86935
LTC http://tests.stockfishchess.org/tests/view/5af3cb430ebc5968e652401f
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76398 W: 11272 L: 11232 D: 53894
I also incorrectly scheduled STC with [0,5] which it failed.
http://tests.stockfishchess.org/tests/view/5af283c00ebc5968e6523f33
LLR: -2.94 (-2.94,2.94) [0.00,5.00]
Total: 12338 W: 2451 L: 2522 D: 7365
Closes https://github.com/official-stockfish/Stockfish/pull/1593
bench: 4698290
----------------------------------------
Thanks to @vondele and @Rocky640 for a cleaner version of the patch,
and the following comments!
> Most of the pinned, (or for this pull request, blocking) squares were
> already computed in the unsafeChecks, the only missing squares being:
>
> a) squares attacked by a Queen which are occupied by friendly piece
> or "unsafe". Note that adding such squares never passed SPRT[0,5].
>
> b) squares not in mobilityArea[Us].
>
> There is a strong relationship between the blockers and the unsafeChecks,
> but the bitboard unsafeChecks is still useful when the checker is not
> aligned with the king, and the checking square is occupied by friendly
> piece or is "unsafe". This is always the case for the Knight.
This patch simplifies the control flow in search(), removing an if
and a goto. A side effect of the patch is that Stockfish is now a
little bit more selective at low depths, because we allow razoring,
futility pruning and probcut pruning after a null move.
passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32035 W: 6523 L: 6422 D: 19090
http://tests.stockfishchess.org/tests/view/5af142ca0ebc597fb3d39bb6
passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 41431 W: 6187 L: 6097 D: 29147
http://tests.stockfishchess.org/tests/view/5af148770ebc597fb3d39bc1
Ideas for further work:
• Use the nodes credit opened by the patch (the increased selectivity)
to try somewhat higher razoring, futility or probcut margins at [0..4].
Bench: 4855031
We can view the patch version as adding some "undermining bonus" for
level pawns, when the defending side can not easily avoid the exchange
by advancing her pawn.
• Case 1) White b2,c3, Black a3,b3:
Black is breaking through, b2 deserves a penalty
• Case 2) White b2,c3, Black a3,c4:
if b2xa3 then White ends up with a weak pawn on a3
and probably a weak pawn on c3 too.
In either case, White can still not safely play b2-b3 and make a
phalanx with c3, which is the essence of a backward pawn definition.
Passed STC in SPRT[0, 4]:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 131169 W: 26523 L: 26199 D: 78447
http://tests.stockfishchess.org/tests/view/5aefa4d50ebc5902a409a151
ELO 1.19 [-0.38,2.88] (95%)
Passed LTC in SPRT[-3, 1]:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 24824 W: 3732 L: 3617 D: 17475
http://tests.stockfishchess.org/tests/view/5af04d3f0ebc5902a88b2e55
ELO 1.27 [-1.21,3.70] (95%)
Closes https://github.com/official-stockfish/Stockfish/pull/1584
How to continue from there?
There were some promising tests a couple of months ago about adding
a lever condition for king danger in evaluate.cpp, maybe it would
be time to re-try this after all the recent changes in pawns.cpp
Bench: 4773882
The two lines of code in the patch seem to be just as good as master.
1. We now only look at the current square to see if it is currently backward,
whereas master looks there AND further ahead in the current file (master would
declare a pawn "backward" even though it could still safely advance a little).
This simplification allows us to avoid the use of the difficult logic with
`backmost_sq(Us, neighbours | stoppers)`.
2. The condition `relative_rank(Us,s) < RANK_5` is simplified away.
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 68132 W: 14025 L: 13992 D: 40115
http://tests.stockfishchess.org/tests/view/5aedc97a0ebc5902a4099fd6
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23789 W: 3643 L: 3527 D: 16619
http://tests.stockfishchess.org/tests/view/5aee4f970ebc5902a409a03a
Ideas for further work:
• The new code flags some pawns on the 5th rank as backward, which was not the
case in the old master. So maybe we should test a version with that included?
• Further tweaks of the backward condition with [0..5] bounds?
Closes https://github.com/official-stockfish/Stockfish/pull/1583
Bench: 5122789
When we are using the "Bitboard + Square" overloaded operators,
the compiler uses the interpediate SquareBB[s] to transform the
square into a Bitboard, and then calculate the result.
For instance, the following code:
```
b = pos.pieces(Us, PAWN) & s
```
generates in fact the code:
```
b = pos.pieces(Us, PAWN) & SquareBB[s]`
```
The bug introduced by Stéphane in the previous patch was the
use of `b = pos.pieces(Us, PAWN) & (s + Up)` which can result
in out-of-bounds errors for the SquareBB[] array if s in the
last rank of the board.
We coorect the bug, and also add some asserts in bitboard.h to
make the code more robust for this particular bug in the future.
Bug report by Joost VandeVondele. Thanks!
Bench: 5512000
Simplification: remove BlockedByKing from storm array and use a special rule.
The BlockedByKing section in the storm array is substantially similar to the
Unopposed section except for two extreme values V(-290), V(-274). Turns out
removing BlockedByKing and using a special rule for these two values shows
no Elo loss. All the other values in the BlockedByKing section are apparently
irrelevant. BlockedByKing now falls under unopposed which (to me) is a bit
more logical since there is no defending pawn on this file. Also, retuning
the Unopposed section may be another improvement.
GOOD) This is a simplification because the entire BlockedByKing section of
the storm array goes away reducing a few lines of code (and less values to
tune). This also brings clarity because the special rule is self documenting.
BAD) It takes execution time to apply the special rule. This should be negli-
gible because it is based on a template parameter and is boiled down to two
bitwise AND's.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33470 W: 6820 L: 6721 D: 19929
http://tests.stockfishchess.org/tests/view/5ae7b6e60ebc5926dba90e13
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47627 W: 7045 L: 6963 D: 33619
http://tests.stockfishchess.org/tests/view/5ae859ff0ebc5926dba90e85
Closes https://github.com/official-stockfish/Stockfish/pull/1574
Bench: 5512000
-----------
How to continue after this patch?
This patch may open the possibility to move the special rule to evaluate.cpp
in the evaluate::king() function, where we could refine the rule using king
danger information. For instance, with a king in H2 blocking an opponent pawn
in H3, it may be critical to know that the opponent has no safe check in G2
before giving the bonus :-)
This is a further step in the long quest for a simple way of determining
scale factors for the endgame.
Here we remove the artificial restriction in evaluate_scale_factor()
based on endgame score. Also SCALE_FACTOR_ONEPAWN can be simplified
away. The latter is a small non functional simplification with respect
to the version that was testedin the framework, verified on bench with
depth 22 for good measure.
Passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49438 W: 9999 L: 9930 D: 29509
http://tests.stockfishchess.org/tests/view/5ae20c8b0ebc5963175205c8
Passed LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 101445 W: 15113 L: 15110 D: 71222
http://tests.stockfishchess.org/tests/view/5ae2a0560ebc5902a1998986
How to continue from there?
Maybe the general case could be scaled with pawns from both colors
without losing Elo. If that is the case, then this could be merged
somehow with the scaling in evaluate_initiative(), which also uses
a additive malus down when the number of pawns in the position goes
down.
Closes https://github.com/official-stockfish/Stockfish/pull/1570
Bench: 5254862
Currently the make strip target is broken on mingw as the exe name is wrong (stockfish instead of stockfish.exe).
Needs some testing by mingw users (both profile-build and strip, native and cross).
No functional change.
Queen was recently excluded from the mobility area of friendly minor
pieces. Exclude queen also from the mobility area of friendly majors too.
Run as a simplification:
STC
http://tests.stockfishchess.org/tests/view/5ade396f0ebc59602d053742
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46972 W: 9511 L: 9437 D: 28024
LTC
http://tests.stockfishchess.org/tests/view/5ade64b50ebc5949f20a24d3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66855 W: 10157 L: 10105 D: 46593
How to continue from there?
The mobilityArea is used in various places of the evaluation as a
soft proxy for "not attacked by the opponent pawns". Now that the
mobility area is getting smaller and smaller, it may be worth to
hunt for Elo gains by trying the more direct ~attackedBy[Them][PAWN]
instead of mobilityArea[Us] in these places.
Bench: 4650572
Remove the distinction between the king file and the two neighbours
files in the ShelterStrength[] array. Instead we initialize the safety
variable in the evaluate_shelter() function with a -10 penalty if our
king is on a semi-open file (ie. if our king is on a file without a pawn
protection).
Also rename shelter_storm() to evaluate_shelter() while there.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 23153 W: 4795 L: 4677 D: 13681
http://tests.stockfishchess.org/tests/view/5adcb83d0ebc595ec7ff8aa7
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25728 W: 3934 L: 3821 D: 17973
http://tests.stockfishchess.org/tests/view/5adcdcb60ebc595ec7ff8adb
See the commit history in PR#1559 for the proof that the committed
version is equivalent to the version in the tests above:
https://github.com/official-stockfish/Stockfish/pull/1559
Full credit to @protonspring for the renormalized values of the
ShelterStrength[] array used for the simplification. Thanks!
Bench: 4703935
Change the operators of the Option type in uci.h to accept floating
point numbers in double precision on input as the numerical type for
the "spin" values of the UCI protocol.
The output of Stockfish after the "uci" command is unaffected.
This change is compatible with all the existing GUI (as they will
continue sending integers that we can interpret as doubles in SF),
and allows us to pass double parameters to Stockfish in the console
via the "setoption" command. This will be useful if we implement
another tuner as an alternative for SPSA.
Closes https://github.com/official-stockfish/Stockfish/pull/1556
No functional change.
---------------------
A example of the new functionality in action in the branch `tune_float2'`:
https://github.com/snicolet/Stockfish/commit/876c322d0f20ee232da977b4d3489c4cc929765e
I have added the following lines in ucioptions.cpp:
```C++
void on_pi(const Option& o)
{
double x = Options["PI"]; // or double x = o;
std::cerr << "received value is x = " << x << std::endl;
}
...
o["PI"] << Option(3.1415926, -10000000, 10000000, on_pi);
```
Then I can change the value of Pi in Stockfish via the command line, and
check that Stockfish understands a floating point:
````
> ./stockfish
> setoption name PI value 2.7182818284
received value is x = 2.71828
````
On output, the default value of Pi is truncated to 3 (to remain compatible
with the UCI protocol and GUIs):
````
> uci
[...]
option name SyzygyProbeLimit type spin default 6 min 0 max 6
option name PI type spin default 3 min -10000000 max 10000000
uciok
````
This patch is non-functional. Current master does four operations to determine
whether an enemy pawn on this file is blocked by the king or not
```
f == file_of(ksq) && rkThem == relative_rank(Us, ksq) + 1 )
```
By adding a direction (based on the template color), this is reduced to two
operations. This works because b is limited to enemy pawns that are ahead of
the king and on the current file.
```
shift<Down>(b) & ksq
```
I've added a line of code, but the number of executing instructions is reduced
(I think). I'm not sure if this counts as a simplification, but it should
theoretically be a little faster (barely). The code line length is also reduced
making it a little easier to read.
Closes https://github.com/official-stockfish/Stockfish/pull/1552
No functional change.
This patch corrects both MultiPV behaviour and "go searchmoves" behaviour
for tablebases.
We change the logic of table base probing at root positions from filtering
to ranking. The ranking code is much more straightforward than the current
filtering code (this is a simplification), and also more versatile.
If the root is a TB position, each root move is probed and assigned a TB score
and a TB rank. The TB score is the Value to be displayed to the user for that
move (unless the search finds a mate score), while the TB rank determines which
moves should appear higher in a multi-pv search. In game play, the engine will
always pick a move with the highest rank.
Ranks run from -1000 to +1000:
901 to 1000 : TB win
900 : normally a TB win, in rare cases this could be a draw
1 to 899 : cursed TB wins
0 : draw
-1 to -899 : blessed TB losses
-900 : normally a TB loss, in rare cases this could be a draw
-901 to -1000 : TB loss
Normally all winning moves get rank 1000 (to let the search pick the best
among them). The exception is if there has been a first repetition. In that
case, moves are ranked strictly by DTZ so that the engine will play a move
that lowers DTZ (and therefore cannot repeat the position a second time).
Losing moves get rank -1000 unless they have relatively high DTZ, meaning
they have some drawing chances. Those get ranks towards -901 (when they
cross -900 the draw is certain).
Closes https://github.com/official-stockfish/Stockfish/pull/1467
No functional change (without tablebases).
This patch introduces an Analysis Contempt UCI combo box to control
the behaviour of contempt during analysis. The possible values are
Both, Off, White, Black. Technically, the engine is supposed to be in
analysis mode if UCI_AnalyseMode is set by the graphical user interface
or if the user has chosen infinite analysis mode ("go infinite").
Credits: the idea for the combo box is due to Michel Van den Bergh.
No functional change (outside analysis mode).
-----------------------------------------------------
The so-called "contempt" is an optimism value that the engine adds
to one color to avoid simplifications and keep tension in the position
during its search. It was introduced in Stockfish 9 and seemed to give
good results during the TCEC 11 tournament (Stockfish seemed to play a
little bit more actively than in previous seasons).
The patch does not change the play during match or blitz play, but gives
more options for correspondance players to decide for which color(s) they
would like to use contempt in analysis mode (infinite time). Here is a
description of the various options:
* Both : in analysis mode, use the contempt for both players (alternating)
* Off : in analysis mode, use the contempt for none of the players
* White : in analysis mode, White will play actively, Black will play passively
* Black : in analysis mode, Black will play actively, White will play passively
This patch adds some documentation and code cleanup to tablebase code.
It took me some time to understand the relation among the differrent
structs, although I have rewrote them fully in the past. So I wrote
some detailed documentation to avoid the same efforts for future readers.
Also noteworthy is the use a standard hash table implementation with a
more efficient 1D array instead of a 2D array. This reduces the average
lookup steps of 90% (from 343 to 38 in a bench 128 1 16 run) and reduces
also the table from 5K to 4K
entries.
I have tested on 5-men and no functional and no slowdown reported. It
should be verified on 6-men that the new hash does not overflow. It is
enough to run ./stockfish with 6-men available: if it does not assert at
startup it means everything is ok with 6-men too.
EDIT: verified for 6-men tablebase by Jörg Oster. Thanks!
No functional change.
We remove an unnecessary condition in the definition of safe squares
in the space evaluation. Only the squares which are occupied by our
pawns or attacked by our opponent's pawns are now excluded.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21096 W: 4321 L: 4199 D: 12576
http://tests.stockfishchess.org/tests/view/5acbf7510ebc59547e537d4e
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 23437 W: 3577 L: 3460 D: 16400
http://tests.stockfishchess.org/tests/view/5acc0f750ebc59547e537d6a
It may be possible to further refine the definition of such safe squares.
Bench: 5351765
This patch applies a S(10, 5) bonus for every square that is:
- Occupied by an enemy piece which is not a pawn
- Attacked exactly once by our pieces
- Defended exactly once by enemy pieces
The idea is that these pieces must be defended. Their defenders have
dramatically limited mobility, and they are vulnerable to our future
attack.
As with connectivity, there are probably many more tests to be run in
this area. In particular:
- I believe @snicolet's queen overload tests have demonstrated a potential
need for a queen overload bonus above and beyond this one; however, the
conditions for "overload" in this patch are different (excluding pieces
we attack twice). My next test after this is (hopefully) merged will be
to intersect the Bitboard I define here with the enemy's queen attacks and
attempt to give additional bonus.
- Perhaps we should exclude pieces attacked by pawns--can pawns really be
overloaded? Should they have the same weight, or less? This didn't work
with a previous version, but it could work with this one.
- More generally, different pieces may need more or less bonus. We could
change bonuses based on what type of enemy piece is being overloaded, what
type of friendly piece is attacking, and/or what type of piece is being
defended by the overloaded piece and attacked by us, or any intersection
of these three. For example, here attacked/defended pawns are excluded,
but they're not totally worthless targets, and could be added again with
a smaller bonus.
- This list is by no means exhaustive.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17439 W: 3599 L: 3390 D: 10450
http://tests.stockfishchess.org/tests/view/5ac78a2e0ebc59435923735e
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 43304 W: 6533 L: 6256 D: 30515
http://tests.stockfishchess.org/tests/view/5ac7a1d80ebc59435923736f
Closes https://github.com/official-stockfish/Stockfish/pull/1533
Bench: 5248871
----------------
This is my first time opening a PR, so I apologize if there are errors.
There are too many people to thank since I submitted my first test just
over a month ago. Thank you all for the warm welcome and here is to more
green patches!
In particular, I would like to thank:
- @crossbr, whose comment in a FishCooking thread first inspired me to
consider the overloading of pieces other than queens,
- @snicolet, whose queen overload tests inspired this one and served as
the base of my first overload attempts,
- @protonspring, whose connectivity tests inspired this one and who provided
much of the feedback needed to take this from red to green,
- @vondele, who kindly corrected me when I submitted a bad LTC test,
- @Rocky640, who has helped me over and over again in the past month.
Thank you all!
In master, we already remove the King from the mobility area of minor pieces
because the King simply stands in the way of other pieces, and since opponent
cannot capture the King, any piece which "protects" the King cannot recapture.
Similarly, this patch introduces the idea that it is rarely a need for a Queen
to be "protected" by a minor (unless it is attacked only by a Queen, in fact).
We used to have a LoosePiece bonus, and in a similar vein the Queen was excluded
from that penalty.
Idea came when reviewing an old game of Kholmov. He was a very good midgame
player, but in the opening his misplace his Queen (and won in the end :-) :
http://www.chessgames.com/perl/chessgame?gid=1134645
Both white queen moves 10.Qd3 and 13.Qb3 are in the way of some minor piece.
I would prefer to not give a bishop mobility bonus at move 10 for the square d3,
or later a knight mobility bonus at move 13 for the square b3. And the textbook
move is 19.Qe3! which prepares 20.Nb3. This short game sample shows how much a
queen can be "in the way" of minor pieces.
STC
http://tests.stockfishchess.org/tests/view/5ac2c15f0ebc591746423fa3
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22066 W: 4561 L: 4330 D: 13175
LTC
http://tests.stockfishchess.org/tests/view/5ac2d6500ebc591746423faf
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 25871 W: 3953 L: 3738 D: 18180
Closes https://github.com/official-stockfish/Stockfish/pull/1532
Ideas for future work in this area:
• tweak some more mobility areas for other piece type.
• construct a notion of global mobility for the whole piece set.
• bad bishops.
Bench: 4989125
Simplify ThreatBySafePawn evaluation by removing the 'if (weak)' speed
optimization check from threats evaluation. This is a non functional
change as it removes just a speed optimization conditional which was
probably useful before but does no longer provide benefits. This section
section had a few more lines not long ago, with ThreatByHangingPawn and
a loop through the threatened pieces, but now there is not much left.
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47775 W: 9696 L: 9624 D: 28455
http://tests.stockfishchess.org/tests/view/5ac298910ebc591746423f8b
Closes https://github.com/official-stockfish/Stockfish/pull/1531
Non functional change.
1) Use make_bitboard() in Bitboards::init()
2) Fix MSVC warning: search.h(85): warning C4244: '=': conversion from
'TimePoint' to 'int', possible loss of data.
Closes https://github.com/official-stockfish/Stockfish/pull/1524
No functional change.
When we reach a position with only two opposite colored bishops and
one pawn on the board, current master would give it a scale factor
of 9/64=0.14 in about one position out of 7200, and a scale factor
of 0.0 in the 7199 others. The patch gives a scale factor of 0.0 in
100% of the cases.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c
We also have exhaustive coverage analysis of this patch effect by
Alain Savard, comparing the perfect evaluation given by the Syzygy
tablebase with the heuristic play after this patch for the set of
all legal positions of the KBPKP endgame with opposite bishops, in
the comments thread for this pull request:
https://github.com/official-stockfish/Stockfish/pull/1520
Alain's conclusion:
> According to this definition and the data, I consider this PR is
> identical to master to "solve for draw" and slightly better than
> master to solve earlier for "wins".
Note: this patch is a side effect of an ongoing effort to improve
the evaluation of positions involving a pair of opposite bishops.
See the GitHub diff of this LTC test which almost passed at sprt[0..5]
for a discussion:
http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93
No functional change (at small bench depths)
Include some not fully supported levers in the (candidate) passed pawns
bitboard, if otherwise unblocked. Maybe levers are usually very short
lived, and some inaccuracy in the lever balance for the definition of
candidate passed pawns just triggers a deeper search.
Here is a example of a case where the patch has an effect on the definition
of candidate passers: White c5/e5 pawns, against Black d6 pawn. Let's say
we want to test if e5 is a candidate passer. The previous master looks
only at files d, e and f (which is already very good) and reject e5 as
a candidate. However, the lever d6 is challenged by 2 pawns, so it should
not fully count. Indirectly, this patch will view such case (and a few more)
to be scored as candidates.
STC
http://tests.stockfishchess.org/tests/view/5abcd55d0ebc5902926cf1e1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 16492 W: 3419 L: 3198 D: 9875
LTC
http://tests.stockfishchess.org/tests/view/5abce1360ebc5902926cf1e6
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21156 W: 3201 L: 2990 D: 14965
This was inspired by this test of Jerry Donald Watson, except the case of
zero supporting pawns against two levers is excluded, and it seems that
not excluding that case is bad, while excluding is it beneficial. See the
following tests on fishtest:
https://github.com/official-stockfish/Stockfish/pull/1519http://tests.stockfishchess.org/tests/view/5abccd850ebc5902926cf1ddhttp://tests.stockfishchess.org/tests/view/5abcdd490ebc5902926cf1e4
Closes https://github.com/official-stockfish/Stockfish/pull/1521
Bench: 5568461
----
Comments by Jerry Donald Watson:
> My thinking as to why this works:
>
> The evaluation is either called in an interior node or in the qsearch.
> The calls at the end of the qsearch are the more important as they
> ultimately determine the scoring of each move, whereas the internal
> values are mainly used for pruning decisions with a margin. Some strong
> engines don't even call the eval at all nodes. Now the whole point of
> the qsearch is to find quiet positions where captures do not change the
> evaluation of the position with regards to the search bounds - i.e. if
> there were good captures they would be tried.* So when a candidate lever
> appears in the evaluation at the end of the qsearch, the qsearch has
> guaranteed that it cannot just be captured, or if it can, this does not
> take the score past the search bounds. Practically this may mean that
> the side with the candidate lever has the turn, or perhaps the stopping
> lever pawn is pinned, or that side is forced for other reasons to make
> some other move (e.g. d6 can only take one of the pawns in the example
> above).
>
> Hence granting the full score for only one lever defender makes some
> sense, at least, to me.
>
> IMO this is also why huge bonuses for possible captures in the evaluation
> (e.g. threat on queen and our turn), etc. don't tend to work. Such things
> are best left to the search to figure out.
We now use per-thread dynamic contempt. This patch has the following
effects:
* for Threads=1: **non-functional**
* for Threads>1:
* with MultiPV=1: **no regression, little to no ELO gain**
* with MultiPV>1: **clear improvement over master**
First, I tried testing at standard MultiPV=1 play with [0,5] bounds.
This yielded 2 yellow and 1 red test:
5+0.05, Threads=5:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 82689 W: 16439 L: 16190 D: 50060
http://tests.stockfishchess.org/tests/view/5aa93a5a0ebc5902952892e6
5+0.05, Threads=8:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27164 W: 4974 L: 4983 D: 17207
http://tests.stockfishchess.org/tests/view/5ab2639b0ebc5902a6fbefd5
5+0.5, Threads=16:
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 41396 W: 7127 L: 7082 D: 27187
http://tests.stockfishchess.org/tests/view/5ab124220ebc59029516cb62
Then, I tested with Skill Level=17 (implicitly MutliPV=4), showing
a clear improvement:
5+0.05, Threads=5:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 3498 W: 1316 L: 1135 D: 1047
http://tests.stockfishchess.org/tests/view/5ab4b6580ebc5902932aeca2
Next, I tested the patch with MultiPV=1 again, this time checking for
non-regression ([-3, 1]):
5+0.5, Threads=5:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65575 W: 12786 L: 12745 D: 40044
http://tests.stockfishchess.org/tests/view/5ab4e8500ebc5902932aecb3
Finally, I ran some tests with fixed number of games, checking if
reverting dynamic contempt gains more elo with Skill Level=17 (i.e.
MultiPV) than applying the "prevScore" fix and this patch. These tests
showed, that this patch gains 15 ELO when playing with Skill Level=17:
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITHOUT this patch":
ELO: -11.43 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 7085 L: 7743 D: 5172
http://tests.stockfishchess.org/tests/view/5ab636450ebc590295d88536
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITH this patch":
ELO: -26.42 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 6661 L: 8179 D: 5160
http://tests.stockfishchess.org/tests/view/5ab62e680ebc590295d88524
---
***FAQ***
**Why should this be commited?**
I believe that the gain for multi-thread MultiPV search is a sufficient
justification for this otherwise neutral change. I also believe this
implementation of dynamic contempt is more logical, although this may
be just my opinion.
**Why is per-thread contempt better at MultiPV?**
A likely explanation for the gain in MultiPV mode is that during
search each thread independently switches between rootMoves and via
the shared contempt score skews each other's evaluation.
**Why were the tests done with Skill Level=17?**
This was originally suggested by @Hanamuke and the idea is that with
Skill Level Stockfish sometimes plays also moves it thinks are slightly
sub-optimal and thus the quality of all moves offered by the MultiPV
search is checked by the test.
**Why are the ELO differences so huge?**
This is most likely because of the nature of Skill Level mode --
since it slower and weaker than normal mode, bugs in evaluation have
much greater effect.
---
Closes https://github.com/official-stockfish/Stockfish/pull/1515.
No functional change -- in single thread mode.
Extends valgrind/sanitizer testing to cover syzygy code.
The script downloads 4 man syzygy as needed. The time needed for the
additional testing is small (in fact hard to see a difference compared
to the large fluctuations in testing time in travis).
Possible follow-ups:
* include more TB sensitive positions in bench.
* include the test script of recent commit "Refactor tbprobe.cpp".
* verify unchanged bench with TB (with a long run).
* make the TB part of the continuation integration tests optional.
Closes https://github.com/official-stockfish/Stockfish/pull/1518
and https://github.com/official-stockfish/Stockfish/pull/1490
No functional change.
Adjust criterion for applying extra reduction if not improving.
We now add an extra ply of reduction if r > 1.0, instead of the
previous condition Reductions[NonPV][imp][d][mc] >= 2.
Why does this work? Previously, reductions when not improving had
a discontinuity as the depth and/or move count increases due to the
Reductions[NonPV][imp][d][mc] >= 2 condition. Hence, values of r
such that 0.5 < r < 1.5 would be mapped to a reduction of 1, while
1.5 < r < 2.5 would be mapped to a reduction of 3. This patch allows
values of r satisfying 1.0 < r < 1.5 to be mapped to a reduction of 2,
making the reduction formula more continuous.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35908 W: 7382 L: 7087 D: 21439
http://tests.stockfishchess.org/tests/view/5aba723a0ebc5902a4743e8f
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 23087 W: 3584 L: 3378 D: 16125
http://tests.stockfishchess.org/tests/view/5aba89070ebc5902a4743ea9
Ideas for future work:
- We could look at retuning the LMR formula.
- We could look at adjusting the reductions in PV nodes if not improving.
Bench: 5326261
Use rootMoves[PVIdx].previousScore instead of bestValue for
dynamic contempt. This is equivalent for MultiPV=1 (bench remained the
same, even for higher depths), but more correct for MultiPV.
STC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2657 W: 1079 L: 898 D: 680
http://tests.stockfishchess.org/tests/view/5aaa47cb0ebc590297330403
LTC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2390 W: 874 L: 706 D: 810
http://tests.stockfishchess.org/tests/view/5aaa593a0ebc59029733040b
VLTC 240+2.4 (MultiPV=3):
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 2399 W: 861 L: 694 D: 844
http://tests.stockfishchess.org/tests/view/5aaf983e0ebc5902a182131f
LTC (MultiPV=4, Skill Level=17):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 747 W: 333 L: 175 D: 239
http://tests.stockfishchess.org/tests/view/5aabccee0ebc5902997ff006
Note: although the ELO differences seem huge, they are inflated by the
nature of Skill Level / MultiPV search, so I don't think they can be
reasonably compared with classic ELO strength.
See https://github.com/official-stockfish/Stockfish/pull/1491 for some
verifications searches with MultiPV = 10 at depths 12 and 24 from the
starting position and the position after 1.e4, comparing the outputs
of the full PV by the old master and by this patch.
No functional change for MultiPV=1
This involves:
* replacing the union hacks with simply reusing the EntryPiece arrays
for the no-pawns case
* merging the PairsData structure with the EntryPiece/-Pawn structs
(with credit to Marco: @mcostalba)
* simplifying some HashTable functions
* thanks to previous changes, removing the ugly memsets
* simplifying the template logic for WDL/DTZ distinction
(now we distinguish based on an enum type, not the entry classes)
* removing the unneeded Atomic wrapper
-----------------------------
For reference, here is a manual way to check that patches concerning
table bases code are non-functional changes:
0) Download the Syzygy table bases (up to 6 men).
1) Make sure you have branches master and the pull request pointing to
the right commits.
2) Download the bench calculation scripts from the following URL:
https://gist.github.com/WOnder93/b5fcf9c989b4a1715684d5c82367cdbe
and copy into src inside your Stockfish repo.
3) Make the scripts executable (chmod +x *.sh).
4) Run the following command to use TBs located at <path>:
export SYZYGY_PATH='<path>'
5) After that, run this (it will take a long time, this is a deep bench):
BENCH_ARGS='128 1 22' ./check_benches.sh master tbprobe_cleanup 2>/dev/null`
==> You should see two equal numbers printed.
(Of course, now we have to trust that the script itself is correct :)
-----------------------------
Closes https://github.com/official-stockfish/Stockfish/pull/1477
No functional change.
This is a patch to fix issue #1498, switching the time management variables
to 64 bits to avoid overflow of time variables after 25 days.
There was a bug in Stockfish 9 causing the output to be wrong after
2^31 milliseconds search. Here is a long run from the starting position:
info depth 64 seldepth 87 multipv 1 score cp 23 nodes 13928920239402
nps 0 tbhits 0 time -504995523 pv g1f3 d7d5 d2d4 g8f6 c2c4 d5c4 e2e3 e7e6 f1c4
c7c5 e1g1 b8c6 d4c5 d8d1 f1d1 f8c5 c4e2 e8g8 a2a3 c5e7 b2b4 f8d8 b1d2 b7b6 c1b2
c8b7 a1c1 a8c8 c1c2 c6e5 d1c1 c8c2 c1c2 e5f3 d2f3 a7a5 b4b5 e7c5 f3d4 d8c8 d4b3
c5d6 c2c8 b7c8 b3d2 c8b7 d2c4 d6c5 e2f3 b7d5 f3d5 e6d5 c4e5 a5a4 e5d3 f6e4 d3c5
e4c5 b2d4 c5e4 d4b6 e4d6 g2g4 d6b5 b6c5 b5c7 g1g2 c7e6 c5d6 g7g6
We check at compile time that the TimePoint type is exactly 64 bits long for
the compiler (TimePoint is our alias in Stockfish for std::chrono::milliseconds
-- it is a signed integer type of at least 45 bits according to the C++ standard,
but will most probably be implemented as a 64 bits signed integer on modern
compilers), and we use this TimePoint type consistently across the code.
Bug report by user "fischerandom" on the TCEC chat (thanks), and the
patch includes code and suggestions by user "WOnder93" and Ronald de Man.
Fixes issue: https://github.com/official-stockfish/Stockfish/issues/1498
Closes pull request: https://github.com/official-stockfish/Stockfish/pull/1510
No functional change.
Make kingRing always eight squares, extending the bitboard to the
F file if the king is on the H file, and to the C file if the king
is on the A file. This may deal with cases where Stockfish (like
many other engines) would shift the king around on the back rank
like g1h1, not because there is some imminent threat, but because
it makes king safety look a little better just because the king ring
had a smaller area.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 34000 W: 7167 L: 6877 D: 19956
http://tests.stockfishchess.org/tests/view/5ab8216d0ebc5902932cbe64
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22574 W: 3576 L: 3370 D: 15628
http://tests.stockfishchess.org/tests/view/5ab84e6a0ebc5902932cbe72
How to continue from there?
This patch probably makes it easier to tune the king safety evaluation,
because the new regularity of the king ring size will make the king
safety function more continuous.
Closes https://github.com/official-stockfish/Stockfish/pull/1512
Bench: 5934103
Rewrite the MovePicker class using lambda expressions for move filtering.
Includes code style changes by @mcostalba.
Verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04
This rewrite of MovePicker.cpp seems to trigger less random crashes on Ryzen
machines than the version in previous master (reported by Bojun Guo).
Closes https://github.com/official-stockfish/Stockfish/pull/1454
No functional change.
To more clearly distinguish them from "const" local variables, this patch
defines compile-time local constants as constexpr. This is consistent with
the definition of PvNode as constexpr in search() and qsearch(). It also
makes the code more robust, since the compiler will now check that those
constants are indeed compile-time constants.
We can go even one step further and define all the evaluation and search
compile-time constants as constexpr.
In generate_castling() I replaced "K" with "step", since K was incorrectly
capitalised (in the Chess960 case).
In timeman.cpp I had to make the non-local constants MaxRatio and StealRatio
constepxr, since otherwise gcc would complain when calculating TMaxRatio and
TStealRatio. (Strangely, I did not have to make Is64Bit constexpr even though
it is used in ucioption.cpp in the calculation of constexpr MaxHashMB.)
I have renamed PieceCount to pieceCount in material.h, since the values of
the array are not compile-time constants.
Some compile-time constants in tbprobe.cpp were overlooked. Sides and MaxFile
are not compile-time constants, so were renamed to sides and maxFile.
Non-functional change.
Implements renaming suggestions by Marco Costalba, Günther Demetz,
Gontran Lemaire, Ronald de Man, Stéphane Nicolet, Alain Savard,
Joost VandeVondele, Jerry Donald Watson, Mike Whiteley, xoto10,
and I hope that I haven't forgotten anybody.
Perpetual renaming thread for suggestions:
https://github.com/official-stockfish/Stockfish/issues/1426
No functional change.
If search depth is less than ONE_PLY call qsearch(), no need to check the
depth condition at various call sites of search().
Passed STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14568 W: 3011 L: 2877 D: 8680
http://tests.stockfishchess.org/tests/view/5aa846190ebc59029781015b
Also helps gcc to find some optimizations (smaller binary, some speedup).
Thanks to Aram and Stefan for identifying an oversight in an early version.
Closes https://github.com/official-stockfish/Stockfish/pull/1487
No functional change.
The NO_BSF does not cover any real life use-case today. The only compilers that
can compile SF today, with the current Makefile and no source code changes, are
either GCC compatible (define __GNUC__) or MSVC compatible (define _MSC_VER). So
they all support LSB/MSB intrinsics.
This patch simplifies away the software fall-backs of LSB/MSB that were still
in Stockfish code, but unused in any of the officially supported compilers.
Note the (legacy) MSVC/WIN32 case, where we use a 32-bit BSF/BSR solution, as
64-bit intrinsics aren't available there.
Discussed in: https://github.com/official-stockfish/Stockfish/pull/1447
and: https://github.com/official-stockfish/Stockfish/pull/1479
No functional change.
Adjust ProbCut rBeta by whether the score is improving, and also
set improving to false when in check. More precisely, this patch
has two parts:
1) the increased beta threshold for ProbCut is now adjusted based
on whether the score is improving
2) when in check, improving is always set to false.
Co-authored by Joost VandeVondele (@vondele) and Bill Henry (@VoyagerOne).
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13480 W: 2840 L: 2648 D: 7992
http://tests.stockfishchess.org/tests/view/5aa693fe0ebc59029781004c
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 25895 W: 4099 L: 3880 D: 17916
http://tests.stockfishchess.org/tests/view/5aa6ac940ebc59029781006e
In terms of opportunities for future work opened up by this patch,
the ProbCut rBeta formula could probably be tuned to gain more Elo.
Closes https://github.com/official-stockfish/Stockfish/pull/1485
Bench: 5328254
King and pawn endgames are typically decisive, and a small
advantage is often sufficient to win. Therefore we now take
this into account when computing the initiative adjustment.
This idea came from a series of patches by Gian-Carlo Pascutto.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48770 W: 10203 L: 9845 D: 28722
http://tests.stockfishchess.org/tests/view/5aa58cce0ebc59029780ff8d
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22252 W: 3572 L: 3366 D: 15314
http://tests.stockfishchess.org/tests/view/5aa5b27c0ebc59029780ffad
Ideas for future developement:
- There have been a number of changes to the initiative
calculation lately. Perhaps the coefficients could be
tuned again.
- It may be possible to add special knowledge for other
endgames in the initiative calculation.
Closes https://github.com/official-stockfish/Stockfish/pull/1481
Bench: 5750110
"Loose pieces drop, in blitz keep everything protected"
Adding a small S(2,2) bonus for knights, bishops, rooks, and
queens that are "connected" to each other (in the sense that
they are under attack by our own pieces) apparently is a good
thing. It probably helps the pieces work together a bit better.
STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 12317 W: 2655 L: 2467 D: 7195
http://tests.stockfishchess.org/tests/view/5aa2d86b0ebc590297cb6474
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35725 W: 5516 L: 5263 D: 24946
http://tests.stockfishchess.org/tests/view/5aa2fc6f0ebc590297cb64a8
How to continue from there (by Stefan Geschwentner)?
• First we should identify all other eval terms which have an overlap
with new connectivity bonus (like the outpost bonus). A simple way
would be subtract the connectivity bonus from them and look if this
better, or use a SPSA session for these terms.
• Tuning Connectivity himself with SPSA seems not so promising because
of the small range which is useful. Here manual testing changes of
Connectivity like +-1 seems better.
• The eg value is more important because in endgame the position gets
more open and so attacks on pieces are easier. Another important point
is that when defending/fortress-like positions each defending piece
needs a protection, otherwise attacks on them can break defense.
Closes https://github.com/official-stockfish/Stockfish/pull/1474
Bench: 5318575
Avoid duplicated code after recent commit "Use evaluation trend
to adjust futility margin". We initialize the improving variable
to true in the check case, which allows to avoid redundant code
in the general case.
Tested for speed by snicolet, patch seems about 0.4% faster.
No functional change.
Note: initializing the improving variable to false in the check
case was tested as a functional change, ending yellow in both STC
and LTC. This change is not included in the commit, but it is an
interesting result that could become part of a future patch about
improving or LMR. Reference of the LTC yellow test:
http://tests.stockfishchess.org/tests/view/5aa131560ebc590297cb636e
Allow a potential slider threat from a square currently occupied
by a harmless attacker, just as the recent "knight on queen" patch.
Also from not completely safe squares, use the mobilityArea instead
of excluding all pawns for both SlidersOnQueen and KnightOnQueen
We now compute the potential sliders threat on queen only if opponent
has one queen.
Run as SPRT [0,4] since it is some kind of simplification but maybe
not clearly one.
STC:
http://tests.stockfishchess.org/tests/view/5aa1ddf10ebc590297cb63d8
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 22997 W: 4817 L: 4570 D: 13610
LTC:
http://tests.stockfishchess.org/tests/view/5aa1fe6b0ebc590297cb63e5
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11926 W: 1891 L: 1705 D: 8330
After this patch is committed, we may try to:
• re-introduce some "threat by queen" bonus to make Stockfish's queen
more aggressive (attacking aspect)
• introduce a concept of "queen overload" to force the opponent queen
into passivity and protecting duties (defensive aspect)
• more generally, re-tune the queen mobility array since patches in the
last three months have affected a lot the location/activity of queens.
Closes https://github.com/official-stockfish/Stockfish/pull/1473
bench: 5788691
We give a S(21,11) bonus for knight threats on the next moves
against enemy queen. The threats are from squares which are
"not strongly protected" and which may be empty, contain enemy
pieces or even one of our piece at the moment (N,B,Q,R) -- hence
be two-steps threats in the later case because we will have to
move our piece and *then* attack the enemy queen with the knight.
STC: http://tests.stockfishchess.org/tests/view/5a9e442e0ebc590297cb6162
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35129 W: 7346 L: 7052 D: 20731
LTC: http://tests.stockfishchess.org/tests/view/5a9e6e620ebc590297cb617f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 42442 W: 6695 L: 6414 D: 29333
How to continue from there?
• Trying to refine the threat condition ("not strongly protected")
• Trying the two-steps idea for bishops or rooks threats against queen
Bench: 6051247
Similar removal of superposition code trick as in the
"Simplify tropism computation" patch. This simplification
of the space() function will allow us to specify space
masks which can reach into enemy territory.
passed STC:
LLR: 3.38 (-2.94,2.94) [-3.00,1.00]
Total: 184630 W: 40581 L: 40758 D: 103291
http://tests.stockfishchess.org/tests/view/5a8433360ebc590297cc80c5
passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 231799 W: 37647 L: 37858 D: 156294
http://tests.stockfishchess.org/tests/view/5a96a34a0ebc590297cc8cfd
No functional change.
Add a logarithmic term in the optimism computation, increase
the maximal optimism and lower the contempt offset.
This increases the dynamics of the optimism aspects, giving
a boost for balanced positions without skewing too much on
unbalanced positions (but this version will enter panic mode
faster than previous master when behind, trying to draw faster
when slightly behind). This helps, since optimism is in general
a good thing, for instance at LTC, but too high optimism
rapidly contaminates play.
passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 159343 W: 34489 L: 33588 D: 91266
http://tests.stockfishchess.org/tests/view/5a8db9340ebc590297cc85b6
passed LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 47491 W: 7825 L: 7517 D: 32149
http://tests.stockfishchess.org/tests/view/5a9456a80ebc590297cc8a89
It must be mentioned that a version of the PR with contempt 0
did not pass STC [0,5]. The version in the patch, which uses
default contempt 12, was found to be as strong as current master
on different matches against SF7 and SF8, both at STC and LTC.
One drawback maybe is that it raises the draw rate in self-play
from 56% to 59%, giving a little bit less sensitivity for SF
developpers to find evaluation improvements by selfplay tests
in fishtest.
Possible further work:
• tune the values accurately, while keeping in mind the drawrate issue
• check whether it is possible to remove linear and offset term
• try to simplify the S-shape curve
Bench: 5934644
Use a recursive std::array with variadic template
parameters to get rid of the last redundacy.
The first template T parameter is the base type of
the array, the W parameter is the weight applied to
the bonuses when we update values with the << operator,
the D parameter limits the range of updates (range is
[-W * D, W * D]), and the last parameters (Size and
Sizes) encode the dimensions of the array.
This allows greater flexibility because we can now tweak
the range [-W * D, W * D] for each table.
Patch removes more lines than what adds and streamlines
the Stats soup in movepick.h
Closes PR#1422 and PR#1421
No functional change.
The first depth 2 margin triggers the verification quiescence search.
This qsearch() result has to be better then the second lower margin,
so we only skip the razoring when the qsearch gives a significant
improvement.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 32133 W: 7395 L: 7101 D: 17637
http://tests.stockfishchess.org/tests/view/5a93198b0ebc590297cc8942
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17382 W: 3002 L: 2809 D: 11571
http://tests.stockfishchess.org/tests/view/5a93b18c0ebc590297cc89c2
This Elo-gaining version was further simplified following a suggestion
of Marco Costalba:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15553 W: 3505 L: 3371 D: 8677
http://tests.stockfishchess.org/tests/view/5a964be90ebc590297cc8cc4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13253 W: 2270 L: 2137 D: 8846
http://tests.stockfishchess.org/tests/view/5a9658880ebc590297cc8cca
How to continue after this patch?
Reformating the razoring code (step 7 in search()) to unify the
depth 1 and depth 2 treatements seems quite possible, this could
possibly lead to more simplifications.
Bench: 5765806
In pawn structures like white pawns f6,h6 against black pawns f7,g6,h7
the attack on the king is blocked by the own pawns. So decrease the
penalty for king safety.
See diagram and discussion in
https://github.com/official-stockfish/Stockfish/pull/1434
A sample position that this patch wants to avoid is the following
1rr2bk1/3q1p1p/2n1bPpP/pp1pP3/2pP4/P1P1B3/1PBQN1P1/1K3R1R w - - 0 1
White pawn storm on the king side was a disaster, it locked the king
side completely. Therefore, all the king tropism bonus that white have
on the king side are useless, and kingadjacent attacks too. Master
gives White a static +4.5 advantage, but White cannot win that game.
The patch is lowering this evaluation artefact.
STC:
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 16467 W: 3750 L: 3537 D: 9180
http://tests.stockfishchess.org/tests/view/5a92102d0ebc590297cc87d0
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 64242 W: 11130 L: 10745 D: 42367
http://tests.stockfishchess.org/tests/view/5a923dc80ebc590297cc8806
This version includes reformatting and speed optimization by Alain Savard.
Bench: 5643527
Using a SPSA tuning session to optimize the time management
parameters.
With SPSA tuning it is not always possible to say where improvements
came from. Maybe some variables changed randomly or because result
was not sensitive enough to them. So my explanation of changes will
not be necessarily correct, but here it is.
• When decrease of thinking time was added by Joost a few months ago
if best move has not changed for several plies, one more competing
indicator was introduced for the same purpose along with increase
in score and absence of fail low at root. It seems that tuning put
relatively more importance on that new indicator what allowed to save
time.
• Some of this saved time is distributed proportionally between all
moves and some more time were given to moves when score dropped a lot
or best move changed.
• It looks also that SPSA redistributed more time from the beginning to
later stages of game via other changes in variables - maybe because
contempt made game to last longer or for whatever reason.
All of this is just small tweaks here and there (a few percentages changes).
STC (10+0.1):
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 18970 W: 4268 L: 4029 D: 10673
http://tests.stockfishchess.org/tests/view/5a9291a40ebc590297cc8881
LTC (60+0.6):
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 72027 W: 12263 L: 11878 D: 47886
http://tests.stockfishchess.org/tests/view/5a92d7510ebc590297cc88ef
Additional non-regression tests at other time controls
Sudden death 60s:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14444 W: 2715 L: 2608 D: 9121
http://tests.stockfishchess.org/tests/view/5a9445850ebc590297cc8a65
40 moves repeating at LTC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 10309 W: 1880 L: 1759 D: 6670
http://tests.stockfishchess.org/tests/view/5a9566ec0ebc590297cc8be1
This is a functional patch only for time management, but the bench
does not reflect this because it uses fixed depth search, so the number
of nodes does not change during bench.
No functional change.
This is the sequel of the previous patch, we now let the parent node initialize
stat score to zero once for all grandchildren.
Initialize statScore to zero for the grandchildren of the current position.
So statScore is shared between all grandchildren and only the first grandchild
starts with statScore = 0. Later grandchildren start with the last calculated
statScore of the previous grandchild. This influences the reduction rules in
LMR which are based on the statScore of parent position.
Tests results against the previous patch:
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 23676 W: 5417 L: 5157 D: 13102
http://tests.stockfishchess.org/tests/view/5a9423a90ebc590297cc8a46
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 35485 W: 6168 L: 5898 D: 23419
http://tests.stockfishchess.org/tests/view/5a9435550ebc590297cc8a54
Bench: 5643520
Let the parent node initialize stat score to zero once for all siblings.
Initialize statScore to zero for the children of the current position.
So statScore is shared between sibling positions and only the first sibling
starts with statScore = 0. Later siblings start with the last calculated
statScore of the previous sibling. This influences the reduction rules in
in LMR which are based on the statScore of parent position.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 22683 W: 5202 L: 4946 D: 12535
http://tests.stockfishchess.org/tests/view/5a93315f0ebc590297cc894f
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 48548 W: 8346 L: 8035 D: 32167
http://tests.stockfishchess.org/tests/view/5a933ba90ebc590297cc8962
Bench: 5833683
The current master applies Eval::Tempo even to leaves evaluated
as draw by some of the static evaluation functions of endgame.cpp
(for instance KNN vs K or stalemates in KP vs K). This results in
some lines being reported as +0.07 or -0.07 when the terminal
position has reached such endgames (0.07 being about the value
of a tempo for Stockfish).
This patch does not apply Eval::tempo to these positions. This leads
to more nodes being evaluated as VALUE_DRAW during search, giving more
opportunities for cut-offs in alpha-beta.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 52602 W: 11776 L: 11403 D: 29423
http://tests.stockfishchess.org/tests/view/5a8cb8f60ebc590297cc8546
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,4.00]
Total: 156613 W: 26820 L: 26158 D: 103635
http://tests.stockfishchess.org/tests/view/5a8f452d0ebc590297cc865a
Bench: 4924749
To compute dicovered check or pinned pieces we use some bitwise
operators that are not really needed because already accounted for
at the caller site.
For instance in evaluation we compute:
pos.pinned_pieces(Us) & s
Where pinned_pieces() is:
st->blockersForKing[c] & pieces(c)
So in this case the & operator with pieces(c) is useless,
given the outer '& s'.
There are many places where we can use the naked blockersForKing[]
instead of the full pinned_pieces() or discovered_check_candidates().
This path is simpler than original and gives around 1% speed up for me.
Also tested for speed by mstembera and snicolet (neutral in both cases).
No functional change.
Apply -mdynamic-no-pic in a single place in the Makefile instead of 5 places.
Verified on three different Macs:
- a MacBook from 2013
- a MacBook running MacOS 10.9.5
- an iMac running MacOS 10.13.3
No functional change.
The previous asymmetry measure of the pawn structure only used to
consider the number of pawns on semi-opened files in the position.
With this patch we also increase the measure by the number of passed
pawns for both players.
Many thanks to the community for the nice feedback on the previous
version, with special mentions to Alain Savard and Marco Costalba
for clarity and speed suggestions.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13146 W: 3038 L: 2840 D: 7268
http://tests.stockfishchess.org/tests/view/5a91dd0c0ebc590297cc877e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 27776 W: 4771 L: 4536 D: 18469
http://tests.stockfishchess.org/tests/view/5a91fdd50ebc590297cc879b
How to continue after this patch?
Stockfish will now evaluate more positions with passed pawns, so
tuning the passed pawns values may bring Elo. The patch has also
consequences on the initiative term, where we might want to give
different weights to passed pawns and semi-openfiles (idea by
Stefano Cardanobile).
Bench: 5302866
Move the first killer move out of the capture stage, combining treatment
of first and second killer move.
passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55777 W: 12367 L: 12313 D: 31097
http://tests.stockfishchess.org/tests/view/5a88617e0ebc590297cc8351
Similar to an earlier proposition of Günther Demetz, see pull request #1075.
I think it is more robust and readable than master, why hand-unroll the loop
over the killer array, and duplicate code ?
This version includes review comments from Marco Costalba.
Bench: 5227124
The previous asymmetry measure of the pawn structure only used to
consider the number of pawns on semi-opened files in the postions.
With this patch we also increase the measure by the number of passed
pawns for both players.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13146 W: 3038 L: 2840 D: 7268
http://tests.stockfishchess.org/tests/view/5a91dd0c0ebc590297cc877e
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 27776 W: 4771 L: 4536 D: 18469
http://tests.stockfishchess.org/tests/view/5a91fdd50ebc590297cc879b
How to continue from there: Stockfish will now evaluate more positions
with passed pawns, so tuning the passed pawns values may bring Elo.
The patch also has consequences on the initiative term.
Bench: 5302866
When iid (Internal iterative deepening) is invoked, the prior value of ttValue is
not guaranteed to be VALUE_NONE. As such, it is currently possible to enter a state
in which ttValue has a specific value which is inconsistent with tte->bound() and
tte->depth(). Currently, ttValue is only used within the search in a context that
prevents this situation from making a difference (and so this change is non-functional,
but this is not guaranteed to remain the case in the future.
For instance, just changing the tt depth condition in singular extension node to be
tte->depth() >= depth - 4 * ONE_PLY
instead of
tte->depth() >= depth - 3 * ONE_PLY
interacts badly with the absence of ttMove in iid. For the ttMove to become a singular
extension candidate, singularExtensionNode needs to be true. With the current master,
this requires that tte->depth() >= depth - 3 * ONE_PLY. This is not currently possible
if tte comes from IID, since the depth 'd' used for the IID search is always less than
depth - 4 * ONE_PLY for depth >= 8 * ONE_PLY (below depth 8 singularExtensionNode can
never be true anyway). However, with DU-jdto/Stockfish@251281a , this condition can be
met, and it is possible for singularExtensionNode to become true after IID. There are
then two mechanisms by which this patch can affect the search:
• If ttValue was VALUE_NONE prior to IID, the fact that this patch sets ttValue allows
the 'ttValue != VALUE_NONE' condition of singularExtensionNode to be met.
• If ttValue wasn't VALUE_NONE prior to IID, the fact that this patch modifies ttValue's
value causes a different 'rBeta' to be calculated if the singular extension search is
performed.
Tested at STC for non-regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76981 W: 17060 L: 17048 D: 42873
http://tests.stockfishchess.org/tests/view/5a7738b70ebc5902971a9868
No functional change
The razoring heuristic is quite a drastic pruning technique,
using a depth 0 search at internal nodes of the search tree
to estimate the true value of depth n nodes. This patch limits
this razoring to the case of internal nodes of depth 1.
Author: Jarrod Torriero (DU-jdto)
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8043 W: 1865 L: 1716 D: 4462
http://tests.stockfishchess.org/tests/view/5a90a9290ebc590297cc86c1
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32890 W: 5577 L: 5476 D: 21837
http://tests.stockfishchess.org/tests/view/5a90c8510ebc590297cc86d5
Opportunities opened by this patch: it would be interesting to
know if it brings Elo to re-introduce razoring or soft razoring
at depth >= 2, maybe using a larger margin to compensate for the
increased pruning effect.
Bench: 5227124
This is one of the most difficult to understand but also
most important and speed critical functions of SF.
This patch rewrites some part of it to hopefully
make it clearer and drop some redundant variables
in the process.
Same speed than master (or even a bit more).
Thanks to Chris Cain for useful feedback.
No functional change.
Where variable names are explicitly incorrect, I feel morally obligated to at least
suggest an alternative. There are many, but these two are especially egregious.
No functional change.
As far as can tell, semiopenFiles are set if there is a pawn anywhere on
the file. The removed condition would be true even if the pawns were very
advanced, which doesn't make sense if we're looking for a trapped rook.
Seems the engine fairs better with this removed. My guess s that the
condition that mobility is 3 or less does this well enough.
Begs the question whether this is a mobility issue alone... not sure.
Should I do LTC test?
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 13377 W: 3009 L: 2871 D: 7497
http://tests.stockfishchess.org/tests/view/5a855be40ebc590297cc8166
Passed LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16288 W: 2813 L: 2685 D: 10790
http://tests.stockfishchess.org/tests/view/5a8575a80ebc590297cc817e
Bench: 5006365
Make contempt dependent on the current score of the root position.
The idea is that we now use a linear formula like the following to decide
on the contempt to use during a search :
contempt = x + y * eval
where x is the base contempt set by the user in the "Contempt" UCI option,
and y * eval is the dynamic part which adapts itself to the estimation of
the evaluation of the root position returned by the search. In this patch,
we use x = 18 centipawns by default, and the y * eval correction can go
from -20 centipawns if the root eval is less than -2.0 pawns, up to +20
centipawns when the root eval is more than 2.0 pawns.
To summarize, the new contempt goes from -0.02 to 0.38 pawns, depending if
Stockfish is losing or winning, with an average value of 0.18 pawns by default.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 110052 W: 24614 L: 23938 D: 61500
http://tests.stockfishchess.org/tests/view/5a72e6020ebc590f2c86ea20
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 16470 W: 2896 L: 2705 D: 10869
http://tests.stockfishchess.org/tests/view/5a76c5b90ebc5902971a9830
A second match at LTC was organised against the current master:
ELO: 1.45 +-2.9 (95%) LOS: 84.0%
Total: 19369 W: 3350 L: 3269 D: 12750
http://tests.stockfishchess.org/tests/view/5a7acf980ebc5902971a9a2e
Finally, we checked that there is no apparent problem with multithreading,
despite the fact that some threads might have a slightly different contempt
level that the main thread.
Match of this version against master, both using 5 threads, time control 30+0.3:
ELO: 2.18 +-3.2 (95%) LOS: 90.8%
Total: 14840 W: 2502 L: 2409 D: 9929
http://tests.stockfishchess.org/tests/view/5a7bf3e80ebc5902971a9aa2
Include suggestions from Marco Costalba, Aram Tumanian, Ronald de Man, etc.
Bench: 5207156
This patch simplifies the time management code, removing the extra
thinking time for moves with draw PV and increasing thinking time
for all moves proportionally by around 4%.
Last time when the time management was carefully tuned was 1.5-2 years
ago. As new patches were getting added, time management was drifting out
of optimum. This happens because when search becomes more precise pv and
score are becoming more stable, there are less fail lows, best move is
picked earlier and there are less best move changes. All this factors are
entering in time management, and average time per move is decreasing with
more and more good patches. For individual patches such effect is small
(except some) and may be up or down, but when there are many of them,
effect is more substantial. The same way benchmark with more and more
patches is slowly drifting down on average.
So my understanding that back in October adding more think time for draw
PV showed positive Elo because time management was not well tuned, there
was more time available, and think_hard patch applied this additional time
to moves with draw PV, while just retuning back to optimum would recover Elo
anyway. It is possible that absence of contempt also helped, as SF9 is showing
less 0.0 scores than the October version.
Anyway, to me it seems that proper place to deal with draw PV is search, and
contempt sounds as much better solution. In time management there is little
additional elo, and if some code is not helping like removed here, it is better
to discard it. It is simpler to find genuine improvement if code is clean.
• Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20487 W: 4558 L: 4434 D: 11495
http://tests.stockfishchess.org/tests/view/5a7706ec0ebc5902971a9854
• Passed LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41960 W: 7145 L: 7058 D: 27757
http://tests.stockfishchess.org/tests/view/5a778c830ebc5902971a9895
• Passed an additional non-regression [-5..0] test at the time control
of 60sec for the game (sudden death) with disabled draw adjudication:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 8438 W: 1675 L: 1586 D: 5177
http://tests.stockfishchess.org/tests/view/5a7c3d8d0ebc5902971a9ac0
• Passed an additional non-regression [-5..0] test at the time control
of 1sec+1sec per move with disabled draw adjudication:
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 27664 W: 5575 L: 5574 D: 16515
http://tests.stockfishchess.org/tests/view/5a7c3e820ebc5902971a9ac3
This is a functional change for the time management code.
Bench: 4983414
The 'eval' debugging command in Terminal did not initialize the Eval::Contempt
variable, leading to random output during debugging sessions (normal search
was unaffected by the bug).
Example of session where the two 'eval' commands should give the same output,
but did not:
./stockfish
position startpos
d
eval
go depth 20
d
eval
The bug is fixed by initializing Eval::Contempt to SCORE_ZERO in Eval::trace
No functional change.
The current logic in master is to continue return quiet moves if their
history score is above 0. It appears as though this check can be
removed, which is also more logically consistent with the “skipQuiets”
semantics used in search.cpp.
This patch may open new opportunitiesto get Elo by changing or
tuning the definition of 'moveCountPruning' in line 830 of search.cpp,
because obeying skipQuiets without checking the history scores makes
the search more sensitive to 'moveCountPruning'.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34780 W: 7680 L: 7584 D: 19516
http://tests.stockfishchess.org/tests/view/5a79f8d80ebc5902971a99db
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 38757 W: 6732 L: 6641 D: 25384
http://tests.stockfishchess.org/tests/view/5a7afebe0ebc5902971a9a46
Bench 4954595
Enable link-time optimization in the Makefile when compiling with clang.
Also update travis.yml to use clang++-5.0 and llvm-5.0-dev.
No functional change.
For these recaptures, we’re are only considering those captures
that recapture the recapture square (small portion of all the
captures). Therefore, scoring all of the captures and pick_besting
out of the whole group is not necessary.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 85583 W: 18978 L: 18983 D: 47622
http://tests.stockfishchess.org/tests/view/5a717faa0ebc590f2c86e9a7
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20231 W: 3533 L: 3411 D: 13287
http://tests.stockfishchess.org/tests/view/5a73ad330ebc5902971a96ba
Bench: 5023593
The difference between QCAPTURES_1 and QCAPTURES_2 quiescence search stages
boils down to a simple check of depth. The way it's being done now is
unnecessarily complex.
This patch is simpler, clearer, and easier to understand.
Passed SPRT[-3..1] test at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 99755 W: 22158 L: 22192 D: 55405
http://tests.stockfishchess.org/tests/view/5a71f41c0ebc590f2c86e9cb
No functional change.
It seems to be a waste of time to loop through all remaining root moves
after finishing each PV line. This patch skips this until we have reached
the last PV line (this is the way it was done in Glaurung and very early
versions of Stockfish).
No functional change in Single PV mode.
MultiPV=3 STC and LTC tests
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 3113 W: 1248 L: 1064 D: 801
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2260 W: 848 L: 679 D: 733
Bench: 5023629
It does the following:
- If a TB win or loss value allows an alpha or beta cutoff, the cutoff is taken.
- Otherwise, the search of the current subtree continues. In PV nodes, the final value returned is adjusted to reflect that the position is a TB win (or loss).
The patch also fixes a potential problem caused by root_probe() and root_probe_wdl() dirtying the root-move scores.
This patch removes the limitation of current master that a mate is never found if the root position is not yet in the TBs, but the path to mate at some point enters the TBs. The patch is intended to preserve the efficiency and effectiveness of the current TB probing approach.
No functional change (withouth TB)
Set the default contempt value of Stockfish to 20 centipawns.
The contempt feature of Stockfish tries to prevent the engine from
simplifying the position too quickly when it feels that it is very
slightly behind, instead keeping the tension a little bit longer.
Various tests in November 2017 have proved that our current imple-
mentation works well against SF7 (which is about 130 Elo weaker than
current master) and than the Elo gain is an increasing function of
contempt, going (against SF7) from +0 Elo when contempt is set at
zero centipawns, to +30 Elo when contempt is 40 centipawns.
See pull request 1325 for details:
https://github.com/official-stockfish/Stockfish/pull/1325
This november discussion left open the decision of which "default"
value for contempt we should use for Stockfish, taking into account
the various uses ofStockfish (opening preparation for humans, computer
online tournaments,analysis tool for web pages, human/computer play,
etc).
This pull request proposes to set the default contempt value of SF
to twenty centipawns, which turns out to be the highest value which
is not a regression against current master, as this seemed to be a
good compromise between risk and safety. A couple of SPRT[-3..1]
tests were done to bisect this value:
Contempt 10: http://tests.stockfishchess.org/tests/view/5a5d42d20ebc5902977e2901 (PASSED)
Contempt 15: http://tests.stockfishchess.org/tests/view/5a5d41740ebc5902977e28fa (PASSED)
Contempt 20: http://tests.stockfishchess.org/tests/view/5a5d42060ebc5902977e28fc (PASSED)
Contempt 25: http://tests.stockfishchess.org/tests/view/5a5d433f0ebc5902977e2904 (FAILED)
Surprisingly, a test at "very long time control" hinted that using
contempt 20 is not only be non-regressive against contempt 0, but
may actually exhibit some small Elo gain, giving a likehood of superio-
rity of 88.7% after 8500 games:
VLTC:
ELO: 2.28 +-3.7 (95%) LOS: 88.7%
Total: 8521 W: 1096 L: 1040 D: 6385
http://tests.stockfishchess.org/tests/view/5a60b2820ebc590297b9b7e0
Finally, there was some concerns that a contempt value of 20 would
be worse than a value of 7, but a test with 20000 games at STC was
neutral:
STC:
ELO: 0.45 +-3.1 (95%) LOS: 61.2%
Total: 20000 W: 4222 L: 4196 D: 11582
http://tests.stockfishchess.org/tests/view/5a64d2fd0ebc590297903868
See the comments in pull request 1361 for the long, nice discussion
(180 entries :-)) leading to the decision to propose contempt 20 as
the default value:
https://github.com/official-stockfish/Stockfish/pull/1361
Whether Stockfish should strictly adhere to the Komodo and Houdini
semantics and add the UCI commands to force the contempt to be White
in the so-called "analysis mode" is still under discussion, and may
be or may not be the object of a future commit.
Bench: 5783344
1. avoid recursive call of verification.
For the interested side to move recursion makes no sense.
For the other side it could make sense in case of mutual zugzwang,
but I was not able to figure out any concrete problematic position.
Allows the removal of 2 local variables.
2. avoid further reduction by removing R += ONE_PLY;
Benchmark with zugzwang-suite (see #1338), max 45 secs per position:
Patch solves 33 out of 37
Master solves 31 out of 37
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e
bench: 5340015
as discussed in issue #1349, the way pages are allocated with calloc might imply some overhead on first write.
This overhead can be large and slow down the first search after a TT resize significantly, especially for large TT.
Using an explicit clear of the TT on resize fixes this problem.
Not implemented, but possibly useful for large TT, is to do this zero-ing using all search threads. Not only would this be faster, it could also lead to a more favorable memory allocation on numa systems with a first touch policy.
No functional change.
The heuristic to avoid thread binding if less than 8 threads are requested resulted in the first 7 threads not being bound.
The branch was verified to yield a roughly 13% speedup by @CoffeeOne on the appropriate hardware and OS, and an earlier version of this patch tested well on his machine:
http://tests.stockfishchess.org/tests/view/5a3693480ebc590ccbb8be5a
ELO: 9.24 +-4.6 (95%) LOS: 100.0%
Total: 5000 W: 634 L: 501 D: 3865
To make sure all threads (including mainThread) are bound as soon as the total number exceeds 7, recreate all threads on a change of thread number.
To do this, unify Threads::init, Threads::exit and Threads::set are unified in a single Threads::set function that goes through the needed steps.
The code includes several suggestions from @joergoster.
Fixes issue #1312
No functional change
For efficiency reasons current master only allows for transposition table sizes that are N = 2^k in size, the index computation can be done efficiently as (hash % N) can be written instead as (hash & 2^k - 1). On a typical computer (with 4, 8... etc Gb of RAM), this implies roughly half the RAM is left unused in analysis.
This issue was mentioned on fishcooking by Mindbreaker:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
Recently a neat trick was proposed to map a hash into the range [0,N[ more efficiently than (hash % N) for general N, nearly as efficiently as (hash % 2^k):
https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
namely computing (hash * N / 2^32) for 32 bit hashes. This patch implements this trick and now allows for general hash sizes. Note that for N = 2^k this just amounts to using a different subset of bits from the hash. Master will use the lower k bits, this trick will use the upper k bits (of the 32 bit hash).
There is no slowdown as measured with [-3, 1] test:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 128498 W: 23332 L: 23395 D: 81771
There are two (smaller) caveats:
1) the patch is implemented for a 32 bit hash (so that a 64 bit multiply can be used), this effectively limits the number of clusters that can be used to 2^32 or to 128Gb of transpostion table. That's a change in the maximum allowed TT size, which could bother those using 256Gb or more regularly.
2) Already in master, an excluded move is hashed into the position key in rather simple way, essentially only affecting the lower 16 bits of the key. This is OK in master, since bits 0-15 end up in the index, but not in the new scheme, which picks the higher bits. This is 'fixed' by shifting the excluded move a few bits up. Eventually a better hashing scheme seems wise.
Despite these two caveats, I think this is a nice improvement in usability.
Bench: 5346341
Current master can yield different staticEvals depending on the path
used to reach the position. The reason for this is that the evaluation after a
null move is always computed subtracting 2 * Eval::Tempo, while this is not
the case for lazy or specialized evals. This patch always adds tempo to evals,
which doesn't affect playing strength:
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 59911 W: 7616 L: 7545 D: 44750
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 104947 W: 18897 L: 18919 D: 67131
Fixes issue #1335
Bench: 5208264
Simplify the other check penalty computation. Compared to current master,
a) it uses a 143 kingDanger penalty instead of S(10, 10) for the "otherCheck"
(credits to ElbertoOne for finding a suitable kingDanger range to replace the score
and to Guardian for showing this could also be a neutral change at LTC).
This makes our king safety model more consistent and simpler.
b) it might also score more than one "otherCheck" penalty for a given piece type instead of just one
c) it might score many pinned penalties instead of just one.
d) It also remove 3 conditionals and uses simpler expressions.
So it was tested as a SPRT[-3, 1]
Passed STC
http://tests.stockfishchess.org/tests/view/5a2b560b0ebc590ccbb8ba6b
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11705 W: 2217 L: 2080 D: 7408
And LTC
http://tests.stockfishchess.org/tests/view/5a2bfd0d0ebc590ccbb8bab0
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 26812 W: 3575 L: 3463 D: 19774
Trying to improve on b) another attempt was made to score also the
"otherchecks" for piece types which had some safe checks, but this
failed STC http://tests.stockfishchess.org/tests/view/5a2c79e60ebc590ccbb8badd
bench: 5149133
* A better contempt implementation for Stockfish
The round 2 of TCEC season 10 demonstrated the benefit of having a nice contempt implementation: it gives the strongest programs in the tournament the ability to slow down the game when they feel the position is slightly worse, prefering to stay in a complicated (even if slightly risky) middle game rather than simplifying by force into a drawn endgame.
The current contempt implementation of Stockfish is inadequate, and this patch is an attempt to provide a better one.
Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370
This contempt implementation is showing promising results in certains situations. For instance, it obtained a nice +30 Elo gain when playing with contempt=40 against Stockfish 7, compared to current master:
• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo
This was the result of real cooperative work from the Stockfish team, with key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo) while most of the community helped with feedback and computer time.
In this commit the bench is unchanged by default, but you can test at home with the new contempt in the UCI options. The style of play will change a lot when using contempt different of zero (I repeat: not done in this version by default, however)!
The Stockfish team is still deliberating over the best default contempt value in self-play and the best contempt modeling strategy, to help users choosing a contempt value when playing against much weaker programs. These informations will be given in future commits when available :-)
Bench: 5051254
* Remove the prefetch
No functional change.
Currently the NORTH/WEST/SOUTH/EAST values are of type Square, but conceptually they are not squares but directions. This patch separates these values into a Direction enum and overloads addition and subtraction to allow adding a Square to a Direction (to get a new Square).
I have also slightly trimmed the possible overloadings to improve type safety. For example, it would normally not make sense to add a Color to a Color or a Piece to a Piece, or to multiply or divide them by an integer. It would also normally not make sense to add a Square to a Square.
This is a non-functional change.
Add the -fno-exceptions flag to the Makefile to avoid the unecessary exceptions support in the executable (we do not use any exception in Stockfish at the moment).
This change gives a 9.2% reduction in size for the executable binary.
Before : executable size = 376956 bytes
After: executable size = 347652 bytes
No functional change.
Four very minor edits. Note that tte->save() uses posKey and
not pos.key() in other places.
Originally I also added a futility_move_counts() function to
make things more consistent with the futility_margin() and
reduction() functions. But then razor_margin[] should probably
also be turned into a function, etc. Maybe a good idea, maybe not.
So I did not include it.
Non functional change.
The new "weak" expression helps simplify the safe check calculations for rooks or minors, (but the end result for all the safe checks is the exactly the same as in current master)
The only functional change is for the "outer king ring" (for example, squares f3 g3 h3 when white king is on g1). In current master, there was a 191 penalty if any of these was not defended at all.
With this pr, there is this 191 penalty if any of these is not defended at all or is only defended by a white queen.
Tested as a simplification
STC
http://tests.stockfishchess.org/tests/view/59fb03d80ebc590ccbb89fee
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 66167 W: 12015 L: 11971 D: 42181
(against master (Update Copyright year inMakefile))
LTC
http://tests.stockfishchess.org/tests/view/5a0106ae0ebc590ccbb8a55f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15790 W: 2095 L: 1968 D: 11727
(against master (Handle BxN trade as good capture when history scor))
same as #1296 but rebased on latest master
bench: 5109559
In terms of technical changes this patch eliminates the return
statements from the main loop of pos.see_ge() and replaces two conditional
computations with a single bitwise negation.
No functional change
Stockfish currently relies on the "filter_root_moves" function also
having the side effect of clamping Cardinality against MaxCardinality
(the actual piece count in the tablebases). So if we skip this function,
we will end up probing in the search even without tablebases installed.
We cannot bail out of this function before this check is done, so move
the MultiPV hack a few lines below.
If the PV leads to a draw (3-fold / 50-moves) position
and we're ahead of time, think a little longer, possibly
finding a better way.
As this is most likely effective at higher draw rates,
tried speculative LTC after a yellow STC:
STC:
http://tests.stockfishchess.org/tests/view/59eb173a0ebc590ccbb8975d
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 56095 W: 10013 L: 9902 D: 36180
elo = 0.688 +- 1.711 LOS: 78.425%
LTC:
http://tests.stockfishchess.org/tests/view/59eba1670ebc590ccbb897b4
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 59579 W: 7577 L: 7273 D: 44729
elo = 1.773 +- 1.391 LOS: 99.381%
bench: 5234652
In x/y time controls there was a theoretical possibility
to use all available time few moves before the clock will
be updated with new time. This patch fixes that issue.
Tested at 60/15 time control:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 113963 W: 20008 L: 20042 D: 73913
The test was done without adjudication rules!
Bench 5234652
A band-aid patch to workaround current TB code
limitations with multi PV.
Hopefully this will be removed after committing the
big update of TB impementation, now under discussion.
No functional change.
If the search is quit before skill.pick_best is called,
skill.best_move might be MOVE_NONE.
Ensure skill.best is always assigned anyhow.
Also retire the tricky best_move() and let the underlying
semantic to be clear and explicit.
No functional change.
The first change (ss->statScore >= 0) does nothing.
The second change ((ss-1)->statScore >= 0 ) has a massive change.
(ss-1)->statScore is not set until (ss-1) begins to apply LMR to moves.
So we now increase the reduction for bad quiets when our opponent is
running through the first captures and the hash move.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57762 W: 10533 L: 10181 D: 37048
LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19973 W: 2662 L: 2480 D: 14831
Bench: 5037819
This patch lets ss->ply be equal to 0 at the root of the search.
Currently, the root has ss->ply == 1, which is less intuitive:
- Setting the rootNode bool has to check (ss-1)->ply == 0.
- All mate values are off by one: the code seems to assume that mated-in-0
is -VALUE_MATE, mate-1-in-ply is VALUE_MATE-1, mated-in-2-ply is VALUE_MATE+2, etc.
But the mate_in() and mated_in() functions are called with ss->ply, which is 1 in
at the root.
- The is_draw() function currently needs to explain why it has "ply - 1 > i" instead
of simply "ply > i".
- The ss->ply >= MAX_PLY tests in search() and qsearch() already assume that
ss->ply == 0 at the root. If we start at ss->ply == 1, it would make more sense to
go up to and including ss->ply == MAX_PLY, so stop at ss->ply > MAX_PLY. See also
the asserts testing for 0 <= ss->ply && ss->ply < MAX_PLY.
The reason for ss->ply == 1 at the root is the line "ss->ply = (ss-1)->ply + 1" at
the start for search() and qsearch(). By replacing this with "(ss+1)->ply = ss->ply + 1"
we keep ss->ply == 0 at the root. Note that search() already clears killers in (ss+2),
so there is no danger in accessing ss+1.
I have NOT changed pv[MAX_PLY + 1] to pv[MAX_PLY + 2] in search() and qsearch().
It seems to me that MAX_PLY + 1 is exactly right:
- MAX_PLY entries for ss->ply running from 0 to MAX_PLY-1, and 1 entry for the
final MOVE_NONE.
I have verified that mate scores are reported correctly. (They were already reported
correctly due to the extra ply being rounded down when converting to moves.)
The value of seldepth output to the user should probably not change, so I add 1 to it.
(Humans count from 1, computers from 0.)
A small optimisation I did not include: instead of setting ss->ply in every invocation
of search() and qsearch(), it could be set once for all plies at the start of
Thread::search(). This saves a couple of instructions per node.
No functional change (unless the search searches a branch MAX_PLY deep), so bench
does not change.
This shoudl reduce time losses experienced by
users after new time management code.
Verified for no regression in very short TC (4sec + 0.1)
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35262 W: 7426 L: 7331 D: 20505
Bench 5322108
Use different penalties for weaknesses in the pawn shelter
depending on whether it is on the king's file or not.
STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 71617 W: 13471 L: 13034 D: 45112
LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48708 W: 6463 L: 6187 D: 36058
Bench: 5322108
Two simplifications:
- Remove the initialisation to 0 of occupied, which is now unnecessary.
- Remove the initial check for nextVictim == KING
If nextVictim == KING, then PieceValue[MG][nextVictim] will be 0, so that
balance >= threshold is true. So see_ge() returns true anyway.
No functional change.
Compile with -Werror flag. To make debugging easier
also show compile ourput.
This flag is enabled only in Travis CI, not in the shipped
Makefile becuase we can't test on every possible platform.
In light of issue #1232, a test was performed about the value of '-fno-exceptions' and a second one of the combination '-fno-exceptions -fno-rtti'. It turns out these options are can be removed without introducing slowdown.
STC for removing '-fno-exceptions'
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 13678 W: 2572 L: 2439 D: 8667
STC for removing '-fno-exceptions -fno-rtti' (current patch)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32557 W: 6074 L: 5973 D: 20510
No functional change.
During TB initialisation, Stockfish checks for the presence of WDL
tables but not for the presence of DTZ tables. When attempting to probe
a DTZ table, it is therefore possible that the table is not present.
In that case, Stockfish should neither exit nor report an error.
To verify the bug:
$ ./stockfish
setoption name SyzygyTable value <path_to_WDL_dir>
position fen 8/8/4r3/4k3/8/1K2P3/3P4/6R1 w - -
go infinite
Could not mmap() /opt/tb/regular/KRPPvKR.rtbz
$
(On my system, the WDL tables are in one directory and the DTZ tables
in another. If they are in the same directory, it will be difficult
to trigger the bug.)
The fix is trivial: check the file descriptor/handle after opening
the file.
No functional change.
After increasing the number of threads, the histories were not cleared,
resulting in uninitialized memory usage.
This patch fixes this by clearing threads histories in Thread c'tor as
is the idomatic way.
This fixes issue 1227
No functional change.
credit goes to @mstembera for suggesting this approach.
SEE now deals with castling, promotion and en passant in a similar way.
passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32902 W: 6079 L: 5979 D: 20844
passed LTC
LLR: 3.92 (-2.94,2.94) [-3.00,1.00]
Total: 110698 W: 14198 L: 14145 D: 82355
Bench: 5713905
And set x86 and x64 platforms for real.
Currently this is broken and the same binary is compiled for all platforms.
This is becuase we use a custom build step. OTH the default
build step seems not compatible with cmake generated *sln file.
No functional change.
If any thread found a 'mate in x' stop the search. Previously only
mainThread would do so. Requires the bestThread selection to be
adjusted to always prefer mate scores, even if the search depth is less.
I've tried to collect some data for this patch. On 30 cores, mate finding
seems 5-30% faster on average. It is not so easy to get numbers for this,
as the time to find a mate fluctuates significantly with multi-threaded runs,
so it is an average over 100 searches for the same position. Furthermore,
hash size and position make a difference as well.
Bench: 5965302
Avoid constructing, passing as a parameter and binding a useless empty tuple of pointers in the qsearch move picker constructor.
Also reformat the scoring function in movepicker.cpp and do some cleaning in evaluate.cpp while there.
No functional change.
What this patch does is:
* increase safety margin from 40ms to 60ms. It's worth noting that the previous
code not only used 60ms incompressible safety margin, but also an additional
buffer of 30ms for each "move to go".
* remove a whart, integrating the extra 10ms in Move Overhead value instead.
Additionally, this ensures that optimumtime doesn't become bigger than maximum
time after maximum time has been artificially discounted by 10ms. So it keeps
the code more logical.
Tested at 3 different time controls:
Standard 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58008 W: 10674 L: 10617 D: 36717
Sudden death 16+0
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59664 W: 10945 L: 10891 D: 37828
Tournament 40/10
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16371 W: 3092 L: 2963 D: 10316
bench: 5479946
Add tests for:
- Positions with move list
- Chess960 positions
Now bench covers almost all cases, only few endgames
are still out of reach (verified with lcov)
It is a non functionality patch, but bench
changed because we added new test positions.
bench: 5479946
Rewrite perft to be placed naturally inside new
bench code. In particular we don't have special
custom code to run perft anymore but perft is
just a new parameter of 'go' command.
So user API is now changed, old style command:
$perft 5
becomes
$go perft 4
No functional change.
First step in improving bench to handle
arbitrary UCI commands so to test many
more code paths.
This first patch just set the new code
structure.
No functional change.
In particular clarify that 'sd'
parameter is used only in !movesToGo
case.
Verified with Ivan's check tool it is
equivalent to original code.
No functional change.
The only call site of Time.maximum() corrected by 10.
Do this directly in remaining().
Ponder increased Time.optimum by 25% in init(). Idem.
Delete unused includes.
No functional change.
Avoid a couple of redundant rebuilds and compile
with 2 threads since travis gives 2vCPUs.
Also enable -O1 optimization for valgrind and
sanitizers, it should be safe withouth false
positives and it gives a very sensible speed
up, especially with valgrind.
The spee dup allow us to increase testing to
depth 10, useful for thread sanitizer.
No functional change.
Current update formula ensures that the
possible value range is [-32 * D, 32 * D].
So we never overflow if abs(32 * D) < INT16_MAX
Thanks to Joost and mstembera to clarify this.
No functional change.
The trick is to create an ambiguity for the
compiler in case an unwanted conversion to
Move is attempted like in:
ExtMove m1{Move(17),4}, m2{Move(4),17};
std::cout << (m1 < m2) << std::endl; // 1
std::cout << (m1 > m2) << std::endl; // 1(!)
This fixes issue #1204
No functional change.
Reduces memory footprint by ~1.2MB (per thread).
Strong pressure: small but mesurable gain
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510
Low pressure: no regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458
Strong pressure + LTC: elo gain confirmed
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662
Tested for crashing on overflow and after 70K
games at STC we have only 4 time losses,
possible candidate for an overflow.
No functional change.
We use Position::set() to set root position across
threads. But there are some StateInfo fields (previous,
pliesFromNull, capturedPiece) that cannot be deduced
from a fen string, so set() clears them and to not lose
the info we need to backup and later restore setupStates->back().
Note that setupStates is shared by threads but is accessed
in read-only mode.
This fixes regression introduced by df6cb446ea
Tested with 3 threads at STC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14436 W: 2304 L: 2196 D: 9936
Bench: 5608839
Some warnings after a run of:
$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11
I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.
I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.
Thanks to erbsenzaehler for the original idea.
No functional change.
Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.
Also some renaming and better comment while
there.
No functional change.
The case of three or more bishops against a long king must look at all of the
bishops and not just the first two in the piece lists. This patch makes sure
that the position is treated as a win when there are bishops on opposite
colors. This functional change is very small because bench remains the same.
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 24249 W: 4349 L: 4275 D: 15625
http://tests.stockfishchess.org/tests/view/598186530ebc5916ff64a218
Bench: 5608839
In this rare case (e.g. go infinite on a stalemate),
just spin till ponderhit/stop comes.
The Thread::wait() is a renmant of the old YBWC
code, today with lazy SMP, threads don't need to
wait when outside of their idle loop.
No functional change.
But this time correctly set Threads.ponder
We avoid using 'limits' for passing pondering
flag because we don't want to have 2 ponder
variables in search scope: Search::Limits.ponder
and Threads.ponder. This would be confusing also
because limits.ponder is set at the beginning of
the search and never changes, instead Threads.ponder
can change value asynchronously during search.
No functional change.
This reverts commit 5410424e3d.
After the commit pondering is broken, so revert for now. I will
resubmit with a proper fix.
The issue is mine, Joost original code is correct.
No functional change.
Limits::ponder was used as a signal between uci and search threads,
but is not an atomic variable, leading to the following race as
flagged by a sanitized binary.
Expect input:
```
spawn ./stockfish
send "uci\n"
expect "uciok"
send "setoption name Ponder value true\n"
send "go wtime 4000 btime 4000\n"
expect "bestmove"
send "position startpos e2e4 d7d5\n"
send "go wtime 4000 btime 4000 ponder\n"
sleep 0.01
send "ponderhit\n"
expect "bestmove"
send "quit\n"
expect eof
```
Race:
```
WARNING: ThreadSanitizer: data race (pid=7191)
Read of size 4 at 0x0000005c2260 by thread T1:
Previous write of size 4 at 0x0000005c2260 by main thread:
Location is global 'Search::Limits' of size 88 at 0x0000005c2220 (stockfish+0x0000005c2260)
```
The reason of teh race is that ponder is not just set in UCI go()
assignment but also is signaled by an async ponderhit in uci.cpp:
else if (token == "ponderhit")
Search::Limits.ponder = 0; // Switch to normal search
The fix is to add an atomic bool to the threads structure to
signal the ponder status, letting Search::Limits to reflect just
what was passed to 'go'.
No functional change.
Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.
Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.
We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.
No functional change.
as a lower level routine, movepicker should not depend on the
search stack or the thread class, removing a circular dependency.
Instead of copying the search stack into the movepicker object,
as well as accessing the thread class for one of the histories,
pass the required fields explicitly to the constructor (removing
the need for thread.h and implicitly search.h in movepick.cpp).
The signature is thus longer, but more explicit:
Also some renaming of histories structures while there.
passed STC [-3,1], suggesting a small elo impact:
LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
elo = -0.438 +- 0.660 LOS: 9.7%
No functional change.
A pawn (according to all the searched positions of a bench run) is not supported 85% of the time,
(in current master it is either isolated, backward or "unsupported").
So it made sense to try moving the S(17, 8) "unsupported" penalty value into the base pawn value hoping for a more representative pawn value, and accordingly
a) adjust backward and isolated so that they stay more or less the same as master
b) increase the mg connected bonus in the supported case by S(17, 0) and let the Connected formula find a suitable eg value according to rank.
Tested as a simplification SPRT(-3, 1)
Passed STC
http://tests.stockfishchess.org/tests/view/5970dbd30ebc5916ff649dd6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19613 W: 3663 L: 3540 D: 12410
Passed LTC
http://tests.stockfishchess.org/tests/view/597137780ebc5916ff649de3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24721 W: 3306 L: 3191 D: 18224
Bench: 5581946
Closes#1179
in the last month a couple of timeouts have been seen in travis valgrind testing, leading to undesired false positives. The precise cause of this is unclear: a normal valgrind instrumented run is about 6min, the timeout is 10min. Either there are rare hangs (not reproduced locally), or maybe the actual runtime fluctuates on the travis infrastructure (which uses VMs on AWS as far as I know). This patch leads to roughly a 2x speedup of the instrumented testing by reducing the depth from 10 to 9. If timeouts persist, it needs further analysis.
No functional change.
Closes#1171
For some reason, although game phase is used
only in material, it is computed in Position.
Move computation to material, where it belongs,
and remove the useless call chain.
No functional change.
Instead of having Signals in the search namespace,
make the stop variables part of the Threads structure.
This moves more of the shared (atomic) variables towards
the thread-related structures, making their role more clear.
No functional change
Closes#1149
Optimization options for official stockfish should be
consistent, easy, future proof and simple.
We don't want to optimize for any specific version of gcc
No functional change
Closes#1165
Addition of correction values in case of Imbalance of queens,
depending on the number of light pieces on the side without a queen.
Passed patch:
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29036 W: 5379 L: 5130 D: 18527
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13680 W: 1836 L: 1674 D: 10170
Bench: 6258930
Closes#1155
It is not needed becuase the only case is a real special
one (bench on depth with many threads) and can be easily
rewritten to avoid sharing rootDepth.
Verified with ThreadSanitizer.
No functional change.
Closes#1159
Only one remains (also in tbprobe.cpp), but is bougus.
As a side note, tbprobe.cpp is almost clean, only the last 3
functions probe_wdl(), root_probe() and root_probe_wdl()
are still the original ones and are quite hacky.
Somewhere in the future we will reformat also the last 3
ones. The reason why has not been done before it is because
these functions are really wrong by design and should be
rewritten entirely, not only reformatted.
No functional change.
Closes#1160
Make magic_index() a member of Magic since it uses all it's members
and keep us from having to pass the function pointer around to
init_magics().
No functional change
Closes#1146
The main change of the patch is that now time check
is done only by main thread. In the past, before lazy
SMP, we needed all the threds to check for available
time because main thread could have been blocked on
a split point, now this is no more the case and main
thread can do the job alone, greatly simplifying the logic.
Verified for regression testing on STC with 7 threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11895 W: 1741 L: 1608 D: 8546
No functional change.
Closes#1152
Now we don't need anymore the tricky pointer to
show the failed test. Added some few tests too.
Also small rename in see_ge() while there.
No functional change
Closes#1151
The idea is that chances are the tt-move is best and will be difficult to raise alpha when playing a quiet move.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7582 W: 1415 L: 1259 D: 4908
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 59553 W: 7885 L: 7573 D: 44095
Bench: 5725676
Closes#1147
This patch puts the evaluation helper functions inside EvalInfo struct, which simplifies a bit their signature and (most importantly, IMHO) makes their C++ code much cleaner and simpler to read (by removing the "ei." qualifiers all around in evaluate.cpp).
Also rename the EvalInfo struct into Evaluation class to get a natural invocation v = Evaluation(p).value() to evaluation position p.
The downside is an increase of 20 lines in evaluate.cpp (for the prototypes of the helper functions). The upsides are better readability and a speed-up of 0.6% (by generating all the helpers for the NO_TRACE case together, which helps the instruction cache).
No functional change
Closes#1135
the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.
The patch passed an STC check for no regression:
http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593
Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.
No functional change.
Closes#1130Closes#1129
The change passed an STC regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59350 W: 10793 L: 10738 D: 37819
I verified that there was no change in performance on my machine, but of course YMMV:
Results for 40 tests for each version:
Base Test Diff
Mean 2014338 2016121 -1783
StDev 62655 63441 3860
p-value: 0.678
speedup: 0.001
No functional change.
Closes#1137
TT.new_search() was being called by mainThread in Thread::search(). However, mainThread is the last to start searching, and helper threads could reach a measured rootDepth 10 (on 64 cores) before mainThread increments the TT generation. Fixed by moving the call to MaintThread::search() before helper threads start searching.
No functional change.
Closes#1134
Gather all magic relevant data into a struct.
This changes memory layout putting everything necessary for processing a single square
in the same memory location thus speeding up access.
Original patch by @snicolet
No functional change.
Closes#1127Closes#1128
In the current version a search stops when the current position is the same as
any position earlier in the search stack,
including the root position but excluding positions before the root.
The new version makes an exception for repeating the root position.
This gives correct scores for moves in the MultiPV > 1 mode.
Fixes#948 (see it for the detailed description of the bug).
LTC: http://tests.stockfishchess.org/tests/view/587910bc0ebc5915193f754b
ELO: 0.38 +-1.7 (95%) LOS: 66.8%
Total: 40000 W: 5166 L: 5122 D: 29712
STC: http://tests.stockfishchess.org/tests/view/5922e6230ebc59035df34a50
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 94622 W: 17059 L: 17064 D: 60499
LTC: http://tests.stockfishchess.org/tests/view/59273a000ebc59035df34c03
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61259 W: 7965 L: 7897 D: 45397
Bench: 6599721
Closes#1126
Rearrange and rename all history heuristic code. Naming
is now based on chessprogramming.wikispaces.com conventions
and the relations among the various heuristics are now more
clear and consistent.
No functional change.
The previous patch has added a fraction of the king danger score to the
endgame score of the tapered eval, so it seems natural to perform the
king danger computation later in the endgame.
With this patch we extend the limit of such check analysis down to the
material of Rook+Knight, when we have at least two pieces attacking the
opponent king zone.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7446 W: 1409 L: 1253 D: 4784
http://tests.stockfishchess.org/tests/view/591c097c0ebc59035df3477c
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 14234 W: 1946 L: 1781 D: 10507
http://tests.stockfishchess.org/tests/view/591c24f10ebc59035df3478c
Bench: 5975183
Closes#1121
Fixes a bug in Search::clear, where the filling of CounterMoveStats&, overwrote (currently presumably unused) memory because sizeof(cm) returns the size in bytes, whereas elements was needed.
No functional change
Closes#1119
In current master the size of the king ring varies abruptly from eight
squares when the king is in g8, to 12 squares when it is in g7. Because
the king ring is used for estimating attack strength, this may lead to
an overestimation of king danger in some positions. This patch limits
the king ring to eight squares in all cases.
Inspired by the following forum thread:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/xrUCQ7b0ObE
Passed STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9244 W: 1777 L: 1611 D: 5856
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 87121 W: 11765 L: 11358 D: 63998
Bench: 6121121
Closes#1115
execute an implied ucinewgame upon entering the UCI::loop,
to make sure that searches starting with and without an (optional) ucinewgame
command yield the same search.
This is needed now that seach::clear() initializes tables to non-zero default values.
No functional change
Closes#1101Closes#1104
Replacing the old Protector table with a simple linear formula which takes into account a different slope for each different piece type.
The idea of this simplification of Protector is originated by Alain (Rocky)
STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70382 W: 12859 L: 12823 D: 44700
LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 61554 W: 8098 L: 8031 D: 45425
Bench: 6107863
Closes#1099
In general, this patch handles the cases where we don't have a valid score for each PV line in a multiPV search. This can happen if the search has been stopped in an unfortunate moment while still in the aspiration loop. The patch consists of two parts.
Part 1: The new PVIdx was already part of the k-best pv's in the last iteration, and we therefore have a valid pv and score to output from the last iteration. This is taken care of with:
bool updated = (i <= PVIdx && rootMoves[i].score != -VALUE_INFINITE);
Case 2: The new PVIdx was NOT part of the k-best pv's in the last iteration, and we have no valid pv and score to output. Not from the current nor from the previous iteration. To avoid this, we are now also considering the previous score when sorting, so that the PV lines with no actual but with a valid previous score are pushed up again, and the previous score can be displayed.
bool operator<(const RootMove& m) const {
return m.score != score ? m.score < score : m.previousScore < previousScore; } // Descending sort
I also added an assertion in UCI::value() to possibly catch similar issues earlier.
No functional change.
Closes#502Closes#1074
Testing the release candidate revealed only one minor issue, namely a new warning -Wimplicit-fallthrough (part of -Wextra) triggers in the movepicker. This can be silenced by adding a comment, and once we move to c++17 by adding a standard annotation [[fallthrough]];.
No functional change.
Closes#1090
StepAttacks[] is misdesigned, the color dependance is specific
to pawns, and trying to generalise to king and knights, proves
neither useful nor convinient in practice.
So this patch reformats the code with the following changes:
- Use PieceType instead of Piece in attacks_() functions
- Use PseudoAttacks for KING and KNIGHT
- Rename StepAttacks[] into PawnAttacks[]
Original patch and idea from Alain Savard.
No functional change.
Closes#1086
ss->killers can change while the movepicker is active.
The reason ss->killers changes is related to the singular
extension search in the moves loop that calls search<>
recursively with ss instead of ss+1,
effectively using the same stack entry for caller and callee.
By making a copy of the killers,
the movepicker does the right thing nevertheless.
Tested as a bug fix
STC:
http://tests.stockfishchess.org/tests/view/58ff130f0ebc59035df33f37
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70845 W: 12752 L: 12716 D: 45377
LTC:
http://tests.stockfishchess.org/tests/view/58ff48000ebc59035df33f3d
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28368 W: 3730 L: 3619 D: 21019
Bench: 6465887
Closes#1085
history related scores are not related to evaluation based scores.
For example, can easily exceed the range -VALUE_INFINITE,VALUE_INFINITE.
As such the current type is confusing, and a plain int is a better match.
tested for no regression:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43693 W: 7909 L: 7827 D: 27957
No functional change.
Closes#1070
the order of elements returned by std::partition is implementation defined (since not stable) and could depend on the version of libstdc++ linked.
As std::stable_partition was tested to be too slow (http://tests.stockfishchess.org/tests/view/585cdfd00ebc5903140c6082).
Instead combine partition with our custom implementation of insert_sort, which fixes this issue.
Implementation based on a patch by mstembera (http://tests.stockfishchess.org/tests/view/58d4d3460ebc59035df3315c), which suggests some benefit by itself.
Higher depth moves are all sorted (INT_MIN version), as in current master.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33116 W: 6161 L: 6061 D: 20894
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 88703 W: 11572 L: 11540 D: 65591
Bench: 6256522
Closes#1058Closes#1065
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 58462 W: 10615 L: 10558 D: 37289
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65061 W: 8539 L: 8477 D: 48045
It is worth noting that an attempt to only increase the bonus passed STC but failed LTC, and
an attempt to remove the cap without increasing the bonus is still running at STC, but will probably fail after more than 100k.
Bench: 6188591
Closes#1063
By adding pos.non_pawn_material(pos.side_to_move()) as a precondition in step 13,
which is already in use in Futility Pruning (child node) and Null Move Pruning for similar reasons.
Pawn endgames, especially those with only 1 or 2 pawns, are simply heavily influenced by zugzwang situations.
Since we are using a bitbase for KPK endgames, I see no reason to accept buggy evals as shown in #760
Patch looks neutral at STC
LLR: 2.32 (-2.94,2.94) [-3.00,1.00]
Total: 79580 W: 10789 L: 10780 D: 58011
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27071 W: 3502 L: 3390 D: 20179
Bench: 6259071
Closes#1051Closes#760
a single Xeon Phi can present itself as a single numa node with up to 288 threads (4 threads per hardware core).
Tested to work as expected with a Xeon Phi CPU 7230 up to 256 threads.
No functional change
Closes#1045
If we can moveCountPrune and next quiet move has negative stats,
then go directly to the next move stage (Bad_Captures).
Reduction formula is tweaked to compensate for the decrease in move count that is used in LMR.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 6847 W: 1276 L: 1123 D: 4448
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48687 W: 6503 L: 6226 D: 35958
Bench: 5919519
Closes#1036
Instead of having a continuous increasing bonus for our number of pawns when calculating imbalance, use a separate lookup array with tuned values.
Idea by GuardianRM.
STC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 16155 W: 2980 L: 2787 D: 10388
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 100478 W: 13055 L: 12615 D: 74808
Bench: 6128779
Closes#1030
Simplifies away all associated checks, leading to a ~0.5% speedup.
The code now explicitly checks if moves are OK, rather than using nullptr checks.
Verified for no regression:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32218 W: 5762 L: 5660 D: 20796
No functional change
Closes#1021
This patch removes the empty rows at the beginning and at the end of
MobilityBonus[] and Protector[] arrays:
• reducing the size of MobilityBonus from 768 bytes to 512 bytes
• reducing the size of Protector from 1024 to 512 bytes
Also adds some comments and cleaner code for the arrays in pawns.cpp
No speed penalty (measured speed-up of 0.4%).
No functional change.
Closes#1018
Replaces the HalfDensity array with an equivalent, compact implementation.
Includes suggestions by mcostalba & snicolet.
No functional change
Closes#1004
By defining "strongly protected" as "protected by a pawn, or protected
by two pieces and not attacked by two enemy pieces".
Passed STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 17050 W: 3128 L: 2931 D: 10991
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 120995 W: 15852 L: 15343 D: 89800
Bench : 6269229
Closes#1016
This eliminates alignment padding and reduces size from 48 to 40 bytes.
This makes the material HashTable smaller and more cache friendly.
No functional change
Closes#1013
Initial protective idea by Snicolet for knight, for other pieces too
Patch add penalties and bonuses for pieces, depending on the distance from the own king
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 21192 W: 3919 L: 3704 D: 13569
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26177 W: 3642 L: 3435 D: 19100
Bench : 6687377
Closes#1012
Positions with pawns on only one flank tend to be more drawish. We add
a term to the initiative bonus to help the attacking player keep pawns
on both flanks.
STC: yellowish run stopped after 257137 games
LLR: -0.92 (-2.94,2.94) [0.00,5.00]
Total: 257137 W: 46560 L: 45511 D: 165066
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 15602 W: 2125 L: 1956 D: 11521
Bench : 6976310
Closes#1009
A tuning patch which cover the following changes:
increase the importance of queen and rook mobility in endgame and
decrease it in mg, since if we use the heavy pieces too early in the game
we will just make opponent develop their pieces by threatening ours.
King Psqt:
1)King will be encouraged more to stay in the first ranks in the MG
2)and will be encouraged more to go to the middle of the board/last ranks in the EG
Bishop scale better in EG
Logical changes on various psqt tables
1/6 of the changes of the last tuning session on mobility tables
STC: LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 227879 W: 41240 L: 40313 D: 146326
LTC : LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 167047 W: 21871 L: 21291 D: 123885
Bench: 5695960
Closes#1008
Fixes failing build for
make ARCH=x86-32 clean && make ARCH=x86-32 optimize=no build
by passing -m32 also to the link step.
Extend travis testing accordingly.
No functional change.
Closes#999
Minor non-functional simplifications in computing the scale factor.
In my opinion, the code is now slightly more readable:
- remove one condition which can never be satisfied.
- immediately return instead of assigning the sf variable.
Tested for non-regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 62162 W: 11166 L: 11115 D: 39881
No functional change
Closes#992
Also the penalty/bonus function is misleading, we
should simply change it to stat_bonus(depth) for
bonus and -stat_bonus(depth+ ONE_PLY) for extra
penalty.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 11656 W: 2183 L: 2008 D: 7465
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 11152 W: 1531 L: 1377 D: 8244
Bench: 6101931
Results for 20 tests for each version (pgo-builds):
Base Test Diff
Mean 2110519 2118116 -7597
StDev 8727 4906 10112
p-value: 0,774
speedup: 0,004
Further verified for no regression:
http://tests.stockfishchess.org/tests/view/5885abd10ebc5915193f79e6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 21786 W: 3959 L: 3840 D: 13987
No functional change
Move more code into eval_init, removing some
clutter in the main routine.
Write eval_init only from "our" point of view
(do not init the attackedBy[Them] bitboards).
Add mobilityArea to the evalinfo
A few edits while being there
tested for non-regression at STC
http://tests.stockfishchess.org/tests/view/587fab230ebc5915193f77d9
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39585 W: 7183 L: 7094 D: 25308
Non functional change.
After we have taken into account all cheap evaluation
terms, we check whether the score exceeds a given threshold.
If this is the case, we return a scaled down evaluation.
STC:
LLR: 3.35 (-2.94,2.94) [0.00,5.00]
Total: 12575 W: 2316 L: 2122 D: 8137
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 67480 W: 9016 L: 8677 D: 49787
Current version is the one rewritten by ceebo
further edited by me.
Bench: 5367704
After the history simplifications, we are only using Value Stats for CounterMoveHistory table. Therefore the parameter CM is not necessary.
No functional change.
Add asserts to check for overflow in Score * int multiplication.
There is no overflow in current master, but it would be easy to
create one as the scale of the current eval does not leave many
spare bits. For instance, adding the following unused variables
in master at the end of evaluate() (line 882 of evaluate.cpp)
overflows:
Score s1 = score * 4; // no overflow
Score s2 = score * 5; // overflow
Assertion failed: (eg_value(result) == (i * eg_value(s))),
function operator*, file ./types.h, line 336.
Same md5 checksum as current master for non debug compiles.
No functional change.
Order the enum and the array the same way they appear around line 250.
Makes it much easier to follow.
Add comments in the array definition and critical rows.
Use same terminology as elsewhere in pawns.cpp
No functional change.
Previously, we had duplicated History:
- one with (piece,to) called History
- one with (from,to) called FromTo
Now that we have only one, rename it to History, which is the generally accepted
name in the chess programming litterature for this technique.
Also correct some comments that had not been updated since the introduction of CMH.
No functional change.
Perform more complex verification and validation.
- signature.sh : extract and optionally compare Bench/Signature/Node count.
- perft.sh : verify perft counts for a number of positions.
- instrumented.sh : run a few commands or uci sequences through valgrind/sanitizer instrumented binaries.
- reprosearch.sh : verify reproducibility of search.
These script can be used from directly from the command line in the src directory.
Update travis script to use these shell scripts.
No functional change.
Now taht we correctly value-initialize Thread objects,
we don't need c'tors anymore because tables will be
zero-initialized by the compier when Thread object
is instanced.
Verified that we have no errors with Valgrind.
No functional change.
If not explicitly initialized in a class constructor,
then all data members are default-initialized when
the corresponing struct/class is instanced.
For array and built-in types (int, char, etc..)
default-initialization is a no-op and we need to
explicitly zero them.
No functional change.
Tested using syzygy bench method:
- 2016 random positions ranging between 3 and 10 pieces
- each searched using bench at depth=10
Same node count (and no speed regression).
No functional change.
EasyMove is cleared after every iteration of the
search if the 3rd move in the PV of the main thread
changes from the previous iteration. Therefore,
clearing EasyMove during a search iteration may be
excessive. The tests show that this is indeed unnecessary.
In the new version the EasyMove variable is used only in
the Thread::search function.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47719 W: 8438 L: 8362 D: 30919
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 122841 W: 15448 L: 15457 D: 91936
bench: 5468995
Implement a threefold repetition detection. Below are the examples of
problems fixed by this change.
Loosing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested a loosing move "bestmove a8a7", the new code suggests "bestmove a8b7" leading to a draw.
Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0", the new code evaluation is around "cp -50" which is adequate.
Brings 0.5-1 ELO gain. Passes [-3.00,1.00].
STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746
LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397
Patch has been rewritten into current form for simplification and
logic slightly changed so that return a draw score if the position
repeats once earlier but after or at the root, or repeats twice
strictly before the root. In its original form, repetition at root
was not returned as an immediate draw.
After retestimng testing both version with SPRT[-3, 1], both passed
succesfully, but this version was chosen becuase more natural. There is
an argument about MultiPV in which an extended draw at root may be sensible.
See discussion here:
https://github.com/official-stockfish/Stockfish/pull/925
For documentation, current version passed both at STC and LTC:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51562 W: 9314 L: 9245 D: 33003
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 115663 W: 14904 L: 14906 D: 85853
bench: 5468995
Non-functional changes
a) splitting the threat array to avoid using an enum
b) reorder the scores according to functions where they are used.
c) declarations in evaluate_pieces after the const(s) like elsewhere
d) more compact definitions of KingFlank,
now that we need it also for the PanwLessFlank penalty.
e) reuse CenterFiles in evaluate_space
f) move one line inside next popcount
No functional change.
It was a bit of a hack, without intrinsic value, but rather compensating for the
fact that checks were mistuned.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 88308 W: 15553 L: 15545 D: 57210
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53115 W: 6741 L: 6662 D: 39712
bench 5468995
Fixes the only exception, in razoring.
The code already does assert(PvNode || (alpha == beta - 1)), and it can be verified by studying the program flow that this is indeed the case, also for the modified line.
No functional change.
make skipEarlyPruning a search argument instead of managing this by hand.
Verified for no regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 96754 W: 17089 L: 17095 D: 62570
No functional change.
* Refactor bonus and penalty calculation
Compute common terms in a helper function.
No functional change.
* Further refactoring
Remove some parenthesis that are now useless.
Define prevSq once, use repeatedly.
No functional change.
bench: 5884767 (bench of previous patch is wrong)
This patch tweaks some pawn values to favor flank attacks.
The first part of the patch increases the midgame psqt values of external pawns to launch more attacks (credits to user GuardianRM for this idea), while the second part increases the endgame connection values for pawns on upper ranks.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 34997 W: 6328 L: 6055 D: 22614
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 13844 W: 1832 L: 1650 D: 10362
Bench: 5884767
GCC compiles builtin_clzll to “63 ^ BSR”. BSR is processor instruction "Bit Scan Reverse".
So old msb() function is basically 63 - 63 ^ BSR.
Unfortunately, GCC fails to simplify this expression.
Old function compiles to
bsrq %rdi, %rdi
movl $63, %eax
xorq $63, %rdi
subl %edi, %eax
ret
New function compiles to
bsrq %rdi, %rax
ret
BTW, Clang compiles both function to the same (optimal) code.
No functional change.
The needed Windows API for processor groups could be missed from old Windows
versions, so instead of calling them directly (forcing the linker to resolve
the calls at compile time), try to load them at runtime. To do this we need
first to define the corresponding function pointers.
Also don't interfere with running fishtest on numa hardware with Windows.
Avoid all stockfish one-threaded processes will run on the same node
No functional change.
This patch fixed bugs #859 and #882.
At initialization we generate a new random key (Zobrist::noPawns).
It's added to the pawn key of all positions, so that the pawn key
of a pawnless position is no longer 0.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 21307 W: 3738 L: 3618 D: 13951
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 45270 W: 5737 L: 5648 D: 33885
No functional change.
Under Windows it is not possible for a process to run on more than one
logical processor group. This usually means to be limited to use max 64
cores. To overcome this, some special platform specific API should be
called to set group affinity for each thread. Original code from Texel by
Peter sterlund.
Tested by Jean-Paul Vael on a Xeon E7-8890 v4 with 88 threads and confimed
speed up between 44 and 88 threads is about 30%, as expected.
No functional change.
This refines the profile-build target to avoid 'touch'ing the sources,
keeping meaningful modification dates and avoiding editor warnings like vi's:
WARNING: The file has been changed since reading it!!!
Do you really want to write to it (y/n)?
Instead of touching sources, the (instrumented) object files are removed,
which has the same effect of rebuilding them in the next step.
As a side effect, this simplifies the Makefile a bit.
No functional change.
A position can never repeat the one on the previous move.
Thus we can start searching for a repetition from the 4th
ply behind. In the case:
std::min(st->rule50, st->pliesFromNull) < 4
We don't need to do any more calculations. This case happens
very often - in more than a half of all calls of the function.
No functional change.
Anonymous struct inside anonymous unions are a GCC extension.
This patch uses named structs to stick to the C+11 standard.
Avoids a string of warnings on the Clang compiler.
Non functional change (same bench and same MD5 signature,
so compiled code is exactly the same as in current master)
Makes the actual number of nodes searched match closely
the number of nodes requested, by increasing the frequency
of checking the number of nodes searched at low node count.
All other searches retain the default checking frequency of
once per 4096 nodes, and are thus unaffected.
Passed STC as non-regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26643 W: 4766 L: 4655 D: 17222
No functional change.
In 10 of 12 calls total to Position::do_move()the givesCheck argument is
simply gives_check(m). So it's reasonable to make an overload without
this parameter, which wraps the existing version.
No functional change.
Currently, we only check if there is a pawn in place
to make the en-passant capture. Now also check that
there is a pawn that could just have advanced two
squares. Also update the corresponding comment.
This makes the parsing of FENs a bit more robust, and
now correctly handles positions like the one reported by Dann Corbit.
position fen rnbqkb1r/ppp3pp/3p1n2/3P4/8/2P5/PP3PPP/RNBQKB1R w KQkq e6
d
+---+---+---+---+---+---+---+---+
| r | n | b | q | k | b | | r |
+---+---+---+---+---+---+---+---+
| p | p | p | | | | p | p |
+---+---+---+---+---+---+---+---+
| | | | p | | n | | |
+---+---+---+---+---+---+---+---+
| | | | P | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | P | | | | | |
+---+---+---+---+---+---+---+---+
| P | P | | | | P | P | P |
+---+---+---+---+---+---+---+---+
| R | N | B | Q | K | B | | R |
+---+---+---+---+---+---+---+---+
Fen: rnbqkb1r/ppp3pp/3p1n2/3P4/8/2P5/PP3PPP/RNBQKB1R w KQkq - 0 1
No functional change.
Non functional change, tests under sanitizers OK.
Rationales for change
- Offset in code is in range -4 ... 2
- There was an error by (pathological) corner case MAX_PLY=0
No functional change.
Casting a pointer to a different type with stricter alignment
requirements yields to implementation dependent behaviour.
Practicaly everything is fine for common platforms because the
CPU/OS/compiler will generate correct code, but anyhow it is
better to be safe than sorry.
Testing with dbg_hit_on() shows that the unalignment accesses are
very rare (below 0.1%) so it makes sense to split the code in a
fast path for the common case and a slower path as a fallback.
No functional change (verified with TB enabled).
Small fixes for compilation with sanitize=yes optimize=no,
by always adding -fsanitize=undefined to the LDFLAGS as required.
Updates config-sanity to check&report the status of the flag.
No functional change.
Rewrite the code in SF style, simplify and
document it.
Code is now much clear and bug free (no mem-leaks and
other small issues) and is also smaller (more than
600 lines of code removed).
All the code has been rewritten but root_probe() and
root_probe_wdl() that are completely misplaced and should
be retired altogheter. For now just leave them in the
original version.
Code is fully and deeply tested for equivalency both in
functionality and in speed with hundreds of games and
test positions and is guaranteed to be 100% equivalent
to the original.
Tested with tb_dbg branch for functional equivalency on
more than 12M positions.
stockfish.exe bench 128 1 16 syzygy.epd
Position: 2016/2016
Total 12121156 Hits 0 hit rate (%) 0
Total time (ms) : 4417851
Nodes searched : 1100151204
Nodes/second : 249024
Tested with 5,000 games match against master, 1 Thread,
128 MB Hash each, tc 40+0.4, which is almost equivalent
to LTC in Fishtest on this machine. 3-, 4- and 5-men syzygy
bases on SSD, 12-moves opening book to emphasize mid- and endgame.
Score of SF-SyzygyC++ vs SF-Master: 633 - 617 - 3750 [0.502] 5000
ELO difference: 1
No functional change.
Avoid shifting negative signed integers and use typed
enum to avoids decrementing a variable beyond its defined
range, like:
for (Rank r = RANK_8; r >= RANK_1; --r)
Changes were tested individually and passed SPRT[-3, 1].
With this patch gcc --sanitize builds cleanly.
No functional change.
Instead of outputting "info nodes ... time ..." when the last
iteration is interrupted, simply call UCI::pv() to output the PV.
I thought about calling UCI:pv() with bounds -VALUE_INFINITE, VALUE_INFINITE
to avoid "lowerbound" or "upperbound" appearing in it, but I'm not sure that
would be any better.
This patch fixes rare inconsistencies between the first move of
the last PV output and the bestmove played. It also makes sure
that all the latest statistics are sent to the GUI (not only nodes
and time but also nps, tbhits, hashfull).
No functional change.
Restore original behaviour to reset
the counter before a new move search.
Also fixed some warnings and added const
qualifier to a couple of functions, as
suggested by m_stembera.
Thanks to Werner Bergmans for reporting
the regression.
No functional change.
Use a per-thread counter to reduce contention
with many cores and endgame positions.
Measured around 1% speed-up on a 12 core and 8%
on 28 cores with 6-men, searching on:
7R/1p3k2/2p2P2/3nR1P1/8/3b1P2/7K/r7 b - - 3 38
Also retire the unused set_nodes_searched() and fix
a couple of return types and naming conventions.
No functional change.
For a default bench, this fixes the last valgrind
error (jump on uninitialised value).
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 187869 W: 33303 L: 33463 D: 121103
No functional change.
We reference (ss-1)->currentMove, i.e. we peek
current move of the parent node, so currentMove
should be valid in the main move loop, when we
search() the subtree, but outside of main loop
it is useless.
No functional change.
Also a speedup since we don't need to recalculate SEE
for extensions...as it already determined to be positive.
Results for 12 tests for each version:
Base Test Diff
Mean 2132395 2191002 -58607
StDev 128058 85917 134239
p-value: 0.669
speedup: 0.027
Non functional change.
The target:
Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:
float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.
-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.
m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.
These modifications should introduce change only for non Android armv7 build.
No functional change.
The target:
Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:
float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.
-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.
m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.
These modifications should introduce change only for non Android armv7 build.
No functional change.
Stephane's patch removes the only usage of Position::see, where the
returned value isn't immediately compared with a value. So I replaced
this function by its optimised and more specific version see_ge. This
function also supersedes the function Position::see_sign.
bool Position::see_ge(Move m, Value v) const;
This function tests if the SEE of a move is greater or equal than a
given value. We use forward iteration on captures instread of backward
one, therefore we don't need the swapList array. Also we stop as soon
as we have enough information to obtain the result, avoiding unnecessary
calls to the min_attacker function.
Speed tests (Windows 7), 20 runs for each engine:
Test engine: mean 866648, st. dev. 5964
Base engine: mean 846751, st. dev. 22846
Speedup: 1.023
Speed test by Stephane Nicolet
Fishtest STC test:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26040 W: 4675 L: 4442 D: 16923
http://tests.stockfishchess.org/tests/view/57f648990ebc59038170fa03
No functional change.
This is a bit tricky because we don't want
to prune the only legal evasions, even if
with negative SEE. So add an assert to avoid
this subtle bug to slip in later.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 14140 W: 2625 L: 2421 D: 9094
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11558 W: 1555 L: 1379 D: 8624
bench: 5256717
Rename shift_bb() to shift(), and DELTA_S to SOUTH, etc.
to improve code readability, especially in evaluate.cpp
when they are used together:
old b = shift_bb<DELTA_S>(pos.pieces(PAWN))
new b = shift<SOUTH>(pos.pieces(PAWN))
While there fix some small code style issues.
No functional change.
Drop useless condition
abs(ttValue) < VALUE_KNOWN_WIN
And extend singular extension search to cases when ttValue
stores a mate score. This improves mate finding and does
not introduce any regression.
Yery tested this patch against current master on the 6500+
Chest mate suite with 200K fixed nodes:
shortest mates found: master: 1206 patch:1205
any mate found: master: 1903 patch: 2003
with 1 sec time:
shortest mates found: master: 2667 patch: 2628
any mate found: master: 3585 patch: 3646
Verified for no regression:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25655 W: 4578 L: 4465 D: 16612
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66247 W: 8618 L: 8557 D: 49072
bench: 6335042
Both Tablebases::filter_root_moves() and
extract_ponder_from_tt(9 were unable to handle
a mate/stalemate position.
Spotted and reported by Dann Corbit.
Added some mate/stalemate positions to bench so
to early catch this regression in the future.
No functional change.
Use the following transformations:
- to check that A is included in B, testing "(A & ~B) == 0" is faster
than "(A & B) == A"
- to remove the intersection of A and B from A, doing "A &= ~B;" is as
fast as "if (A & B) A &= ~B;" but is simpler.
Overall, the simpler patch version is 0.3% than current master.
No functional change.
Correct pinners calculation and fix bug with pinned
pieces giving check. With this patch 'pinners' only
returns sliders with exactly one defensive piece between
the slider and the attacked square (in other words, pinners
returns exact pinners).
This was a co-operation between Marco Costalba,
Stphane Nicolet and me.
Special thanks to Ronald de Man for reporting the bug with
pinned pieces giving check, discussed here:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/S_4E_Xs5HaE
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 132118 W: 23578 L: 23645 D: 84895
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 36424 W: 4770 L: 4670 D: 26984
bench: 6272231
Instrumentation shows that in make_score(mg, eg) calls, the mg value is
zero in 25,9% of the calls while the eg value is zero in 36,8% of the
calls.
Swapping the internal fields of mg and eg in the internal
representation of Score allows the compiler to optimize away the shift
in (eg << 16) + mg in more cases, thus resulting in a 0.3% speed-up
overall.
No functional change
Rescales the king danger variables in evaluate_king() to
suppress the KingDanger[] array. This avoids the cost of
the memory accesses to the array and simplifies the non-linear
transformation used.
Full credits to "hxim" for the seminal idea and implementation,
see pull request #786.
https://github.com/official-stockfish/Stockfish/pull/786
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9649 W: 1829 L: 1689 D: 6131
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53494 W: 7254 L: 7178 D: 39062
Bench: 6116200
Drops a scalability bottleneck due to memory contention
of a single shared table across threads. The effect starts
to be sensible with a high number of threads. Specifically
we have a small regression with 7 threads both at 60 and
180 seconds TC:
10000 @ 60+0.6 th 7
ELO: -2.46 +-3.2 (95%) LOS: 6.5%
Total: 9896 W: 1037 L: 1107 D: 7752
5000 @ 180+0.6 th 7
ELO: -1.95 +-4.1 (95%) LOS: 17.7%
Total: 5000 W: 444 L: 472 D: 4084
We have a regression because counterMoveHistory table is
quite big and it takes time for a single thread to fill it.
Sharing the table yields to a higher fill rate and better
quality of moves and up to 7 threads the benefits of sharing
more then compensate the loss in speed due to contention.
Interestingly even with a 3X longer TC, so with more time
for the single thread to catch up, the improvment is quite
limited and below noise level. It seems we really need much
longer TC to saturate the table.
When we move to high threads number it's another story:
5000 @ 60+0.6 th 22
ELO: 3.49 +-4.3 (95%) LOS: 94.6%
Total: 4880 W: 490 L: 441 D: 3949
2000 @ 60+0.6 th 32
ELO: 8.34 +-6.9 (95%) LOS: 99.1%
Total: 2000 W: 229 L: 181 D: 1590
As expected the speed-up more than compensates the filling
rate, and we expect that with tournament TC, where single
thread is able to saturate the table, the difference will
be even stronger. For instance for TCEC 9 super-final time
control will be 180 minutes + 15 seconds and this scalability
improvement seems definitely the way to go.
So, summarizing:
GOOD:
Measured big improvement in high core scenario
Suitable for TCEC 9 superfinal (big hardware, very long TC)
Consistent and natural patch that extends to counterMoveHistory
what we already do for remaining history tables, that are all per-thread
Non functional change for the common case of a single core
Very simple (just 6 lines modified, no added ones)
BAD:
Small regression (within 2-3 ELO) with few threads and short TC
bench: 5341477
Measured bench speed up goes from 0,7% to 2%,
given the unreliable measure a reverse simmplification
test was done on fishtest:
master vs patch
LLR: -2.94 (-2.94,2.94) [-3.00,1.00]
Total: 15499 W: 2685 L: 2867 D: 9947
Test result is positive, master is weaker.
No functional change.
This is the most compact and neatest version
is was able to produce.
On normal builds I have a small slowdown:
normal builds base vs. simplification (gcc 4.8.1 Win7-64 i7-3770 @ 3.4GHz x86-64-modern)
Results for 20 tests for each version:
Base Test Diff
Mean 1974744 1969333 5411
StDev 11825 10281 5874
p-value: 0,178
speedup: -0,003
On pgo-builds however I measure a nice 1.1% speedup
pgo-builds base vs. simplification
Results for 20 tests for each version:
Base Test Diff
Mean 1974119 1995444 -21325
StDev 8703 5717 4623
p-value: 1
speedup: 0,011
No functional change.
Don't allow pinned pieces to attack the exchange-square as long all
pinners (this includes also potential ones) are on their original
square.
As soon a pinner moves to the exchange-square or get captured on it, we
fall back to standard SEE behaviour.
This correctly handles the majority of cases with absolute pins.
bench: 6883133
In evaluate, we start by initializing the pos.psq_score
and adding the material imbalance. After that, we check
whether a specialized eval exists and if yes we return
that value and discard whatever we have computed until now.
It sounds more logical to first probe material entry and
return if we have a specialized eval, and only if it is
not the case initialize eval with some values. There is
no measurable speed-difference on my computer.
Non functional change.
In case we have installed a not complete set of 6-men tables and
there is 6 piece position on board, but no corresponding
tablebase engine is not using any syzygy at all.
Reported by Jouni Uski, fix by Peter Österlund,
confirmed as a bug by Ronald de Man.
bench: 7591630
If the opponent has a cramped position, opening a file often
helps him/her to exchange pieces, so it makes sense to reduce
the space bonus if there are open files.
Credits: Leonardo Ljubičić for the strategic idea, Alain Savard for the
implementation of the open files calculation, "CrunchyNYC" for the
compensation of the numerator.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 49112 W: 9239 L: 8900 D: 30973
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 89415 W: 12014 L: 11601 D: 65800
Bench: 7591630
This greately simplifies usage because hides to the
search the implementation specific CheckInfo.
This is based on the work done by Marco in pull request #716,
implementing on top of it the ideas in the discussion: caching
the calls to slider_blockers() in the CheckInfo structure,
and simplifying the slider_blockers() function by removing its
first parameter.
Compared to master, bench is identical but the number of calls
to slider_blockers() during bench goes down from 22461515 to 18853422,
hopefully being a little bit faster overall.
archlinux, gcc-6
make profile-build ARCH=x86-64-bmi2
50 runs each
bench:
base = 2356320 +/- 981
test = 2403811 +/- 981
diff = 47490 +/- 1828
speedup = 0.0202
P(speedup > 0) = 1.0000
perft 6:
base = 175498484 +/- 429925
test = 183997959 +/- 429925
diff = 8499474 +/- 469401
speedup = 0.0484
P(speedup > 0) = 1.0000
perft 7 (but only 10 runs):
base = 185403228 +/- 468705
test = 188777591 +/- 468705
diff = 3374363 +/- 476687
speedup = 0.0182
P(speedup > 0) = 1.0000
$ ./pyshbench ../Stockfish/master ../Stockfish/test 20
run base test diff
...
base = 2501728 +/- 182034
test = 2532997 +/- 182034
diff = 31268 +/- 5116
speedup = 0.0125
P(speedup > 0) = 1.0000
No functional change.
This non-functional change patch is a deep work to allow SF to be independent
from the actual value of ONE_PLY (currently set to 1). I have verified SF is
now independent for ONE_PLY values 1, 2, 4, 8, 16, 32 and 256.
This patch gives consistency to search code and enables future work, opening
the door to safely tweaking the ONE_PLY value for any reason.
Verified for no speed regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 95643 W: 17728 L: 17737 D: 60178
No functional change.
Rewritten in a way to have explicit in the search
the bonus/penalty we apply: hopefully this will lead
to further simplification/fix of current rather messy
stats update code.
No functional change.
STC: (Yellow)
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 69115 W: 12880 L: 12797 D: 43438
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 124163 W: 16923 L: 16442 D: 90798
Note: Note based off past experiments / patches... history pruning
is quite TC sensitive. I believe the reason for this TC dependency
is that the CMH/FMH is a very large table that takes time to fill
up with. In addition having more time for will increase the accuracy
of the stats' value.
Bench: 7351698
GCC uses SSE instructions to move data but in 32-bit gcc version used
by abrok the stack is not 16-byte aligned due to a bug.
This patch workaround teh bug by not using the stack
to store KingFlank[]
Fixes issue #721.
No functional change.
Checking for legality of a possible ponder move
must be done before we undo the first pv move,
of course. (spotted by mohammed li.)
This obviously only has any effect when playing in ponder mode.
No functional change.
Take advantage that VALUE_NONE = 32002 to remove
the condition.
Commented out and not removed becuase it is tricky
to rely on the hidden value of VALUE_NONE and code
can break in case we change VALUE_NONE in the future.
No functional change.
This code was added before the accurate pv patch, when
we retrieved PV directly from TT.
It's not required for correct (and long) PVs any more and
should be safe to remove it.
Also, allowing helper threads to repeatedly over-write
TT doesn't seem to make sense(that was probably an un-intended
side-effect of lazy smp). Before Lazy SMP only Main Thread used
to run ID loop and insert PV into TT.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74346 W: 13946 L: 13918 D: 46482
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47265 W: 6531 L: 6447 D: 34287
bench: 8819179
Allow to specifiy the log file name, this comes
handy in case of self-matches so that each SF
instance writes into a different log file.
No functional change.
Currently root moves are copied to all teh threads
but are DTZ filtered only in main thread at the
beginning of teh search.
This patch moves the TB filtering before the
copy of root moves fixing issue #679https://github.com/official-stockfish/Stockfish/issues/679
No bench change.
There are two concepts with this patch:
Limit check extensions by using move count.
The idea is to limit search explosion.
Always extend check if the first move gives check.
The idea is to save expensive SEE calls, since the vast
majority of first move will have SEE value >= 0, also
first move may still be strong even if the SEE is negative.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 16503 W: 3068 L: 2873 D: 10562
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 37202 W: 5261 L: 5014 D: 26927
bench: 8543366
Moving a few lines from evaluate_threats to evaluate_pieces allows to
a) Remove a condition pos.count<QUEEN>(Them) == 1
b) use precalculated s instead of pos.square(Them)
c) do not check the condition at all in queenless endings
Passed STC
http://tests.stockfishchess.org/tests/view/5752e0410ebc59029919b1f4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 67175 W: 12194 L: 12152 D: 42829
and LTC
http://tests.stockfishchess.org/tests/view/57587b140ebc59029919b2f4
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20276 W: 2774 L: 2653 D: 14849
bench: 7907962
In this position: 3K4/8/3k4/8/4p3/4B3/5P2/8 w - - 0 5
Current DTZ probe returns 1 instead of 15
What happens is that the double push f4 is erroneously detected as a win move.
After the push we have:
[D]3K4/8/3k4/8/4pP2/4B3/8/8 b - f3 0 5
And here the code misses the possible ep capture exf3.
The bug is in probe_dtz_no_ep() where is used probe_ab() that is
blind to ep captures so it returns v == 2 (win) for position
3K4/8/3k4/8/4pP2/4B3/8/8 b - f3 0 5
Note that at the caller site the original position did not have any
possible ep capture, so probe_dtz() returns immediately after calling
probe_dtz_no_ep().
The fix is to call the ep-aware probe_wdl() instead of probe_ab()
I have verified that DTZ is correct now and also there are no more
mistmatches compared to the new 'syzygy' branch. Tested on a set of
more than 600 endgame positions, included some tricky ones.
For people interested to redo the test or doing additional tests
please pull branch tb_dbg from https://github.com/mcostalba/Stockfish repo.
bench: 8450534 (bench unaffected because syzygy is not exercized during bench)
A middle ground patch of two successful tuning patches,
one at STC, the other at LTC, which now passed both.
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 67893 W: 12777 L: 12384 D: 42732
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 30165 W: 4189 L: 3960 D: 22016
bench: 9209507
Introducing a new multi-purpose penalty related to King safety, which
includes all kind of potential checks (from unsafe or unavailable
squares currently occupied by some other piece)
This will indirectly detect and reward some pins, discovered checks, and
motifs such as square vacation, or rook behind its pawn and aligned with
King (example Black Rg8, g7 against Kg1),
and penalize some pawn blockers (if they move, it allows a discovered
check by the pawn).
And since it looks also at protected squares, it detects some potential
defense overloading.
Finally, the rook contact checks had been removed some time ago. This
test will give a small bonus for them, as well as for bishop contact
checks.
Passed STC
http://tests.stockfishchess.org/tests/view/5729ec740ebc59301a354b36
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 13306 W: 2477 L: 2296 D: 8533
and LTC
http://tests.stockfishchess.org/tests/view/572a5be00ebc59301a354b65
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 20369 W: 2750 L: 2565 D: 15054
bench: 9298175
Just use _mm_popcnt_u64() that is available
both for MSVC abd Intel compiler.
Verified on MSVC that the produced assembly
has the hardware 'popcnt' instruction.
No functional change.
Currently, helper threads will only search up to the
specified depth limit. Now let them search until the
main thread has finished the specified depth.
On the other hand, we don't want to pick a thread with
a higher search depth.
This may be considered cheating. ;-)
No functional change.
It seems that icc used our fallback version of popcount.
Now use intrinsics.
icc version 16.0.2 (gcc version 5.3.0 compatibility)
bmi2 compile
uname -r 4.5.1-1-ARCH
20xbench gives a nice speedup
./stockfish-icc-master 2161515 +- 34462
./stockfish-icc-sse42 2260857 +- 50349
I could not find anything documented that is necessary that prepending -mbmi to -mbmi2 gives some benefit.
Instead at
https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions
The following built-in functions are available when -mbmi is used. All of them generate the machine instruction that is part of the name.
unsigned int __builtin_ia32_bextr_u32(unsigned int, unsigned int);
unsigned long long __builtin_ia32_bextr_u64 (unsigned long long, unsigned long long);
The following built-in functions are available when -mbmi2 is used. All of them generate the machine instruction that is part of the name.
unsigned int _bzhi_u32 (unsigned int, unsigned int)
unsigned int _pdep_u32 (unsigned int, unsigned int)
unsigned int _pext_u32 (unsigned int, unsigned int)
unsigned long long _bzhi_u64 (unsigned long long, unsigned long long)
unsigned long long _pdep_u64 (unsigned long long, unsigned long long)
unsigned long long _pext_u64 (unsigned long long, unsigned long long)
and at
https://gcc.gnu.org/ml/gcc/2014-02/msg00204.html
( "... The real optimization comes from being able to use pext
(parallel bit extract), which can implement several bextr expressions in
parallel.")
Apart from that we don't use all -msse -msse2 -msse3 -msse4.2 etc. but just -msse3 (or -msse4.2) only.
As regards to the speedup within noise level - this pull request is actually reversal of mcostalba#198 wherein prepending -mbmi to -mbmi2 was claimed to be 0.3% faster and here (removing -mbmi) gives 0.4% speed gain.
Seems to give around 1% speed-up for CPUs with popcnt support.
Seems to give a very minor speed-up for CPUs without popcnt.
No functional change
Resolves#646
In this position we should have draw for repetition:
position fen rnbqkbnr/2pppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 moves g1f3 g8f6 f3g1
go infinite
But latest patch broke it.
Actually we had two(!) very subtle bugs, the first is that Position::set()
clears the passed state and in particular 'previous' member, so
that on passing setupStates, 'previous' pointer was reset.
Second bug is even more subtle: SetupStates was based on std::vector
as container, but when vector grows, std::vector copies all its contents
to a new location invalidating all references to its entries. Because
all StateInfo records are linked by 'previous' pointer, this made pointers
go stale upon adding more element to setupStates. So revert to use a
std::deque that ensures references are preserved when pushing back new
elements.
No functional change.
And passed in do_move(), this ensures maximum efficiency and
speed and at the same time unlimited move numbers.
The draw back is that to handle Position init we need to
reserve a StateInfo inside Position itself and use at
init time and when copying from another Position.
After lazy SMP we don't need anymore this gimmick and we can
get rid of this special case and always pass an external
StateInfo to Position object.
Also rewritten and simplified Position constructors.
Verified it does not regress with a 3 threads SMP test:
ELO: -0.00 +-12.7 (95%) LOS: 50.0%
Total: 1000 W: 173 L: 173 D: 654
No functional change.
When starting search in a mate or stalemate position, Stockfish does not
even care to reinitialize and start worker threads. However after search
all threads are checked for the best move.
This can lead to bestmove and info beeing carried over from the last
search.
Example session:
setoption name threads value 7
go movetime 4000
position startpos moves f2f3 e7e5 g2g4 d8h4
go movetime 4000
Actual output is like (almost always):
[...]
bestmove e2e4
info depth 0 score mate 0
info depth 20 seldepth 29 multipv 1 score cp 28 [...] pv e2e4
bestmove e2e4
Expected output / output after fix:
[...]
bestmove e2e4 ponder e7e6
info depth 0 score mate 0
bestmove (none)
Resolves#623
Broken after "32-bit/64-bit Makefile fix" commit.
Ubuntu "Precise" 12.04.5 supports multilib only until
g++ 4.6 that is not enough to compile Stockfish.
So move to Ubuntu 14.04.4 LTS (Trusty Tahr)
No functional change.
There was already a penalty for squares only defended by King (undefended)
This test records a penalty for completely undefended squares in the so called extended king-ring
(so if we exclude squares defended by a Kg8 for example, we only look at h6 g6 and f6)
We also exclude squares occupied by opponent pieces in this computation,
based on the following results
Was yellow at STC
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 112499 W: 20649 L: 20293 D: 71557
and passed LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 36805 W: 5100 L: 4857 D: 26848
Bench: 8430233
Resolves: #619
On top of the usual conditions
a) some opponent in front (but no lever)
b) some neighbours (in front) (but no neighbour behind or same rank)
c) < rank_5
to find out if a pawn is backward we look at the squares in front of this pawn to reach the same rank as the next neighbour.
In current master, a pawn is backward if any of those squares is controlled by an enemy pawn on an adjacent file
In this version, a pawn is ALSO backward if any of those squares is occupied by an enemy pawn.
STC:
http://tests.stockfishchess.org/tests/view/56fe7efd0ebc59301a3541f1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19051 W: 3557 L: 3433 D: 12061
LTC:
http://tests.stockfishchess.org/tests/view/56febc2d0ebc59301a354209
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40810 W: 5619 L: 5526 D: 29665
Bench: 7525245
Resolves#614
Also a speedup(about 1%) on 64-bit w/o hardware popcnt
Retire Max15 and Full template parameters
(Contributed by Marco Costalba)
Now that we have just SW and HW versions, use
template default parameter to get rid of explicit
template parameters.
Retire bitcount.h and move the only defined
function to bitboard.h
No functional change
Resolves#620
Counter intuitively, make build ARCH=x86-32 does NOT produce a 32-bit compile
when running a 64-bit OS. Nor would ARCH=x86-64 produce a 64-bit compile when
running a 32-bit OS (assuming it compiled w/o errors).
No functional change
Resolves#621
lsb(b) and msb(b) are undefined when b == 0. This can lead to subtle bugs, where
the resulting code behaves differently on different configurations:
- It can be the home grown software LSB/MSB
- It can be the compiler generated software LSB/MSB (when using compiler
intrinsics without the right compiler flags to allow compiler to use hardware
LSB/MSB). Which of course depends on the compiler.
- It can be hardware LSB/MSB generated by the compiler.
- Not to mention that hardware LSB/MSB can return different value on different
hardware when b == 0.
No functional change
Resolves#610
Use compiler intrinsics when possible to
avoid writing platform specific asm code.
Tested on Windows 7 with MSVC 2013 and mingw 4.8.3 (32 and 64 bit)
and on Linux Mint with g++ 4.8.4 and clang 3.4 (32 and 64 bit).
No functional change
Resolves#609
Give more weight to the pawns number and
the vertical king distance in evaluate_initiative()
Passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26729 W: 5067 L: 4825 D: 16837
and LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 60480 W: 8338 L: 8016 D: 44126
Bench: 8295162
Resolves#594
PredictedDepth can be negative, causing the futility_margin to be negative.
It will be very difficult to tweak moveCount pruning and reduction formula, as they are tuned to prevent this behavior.
No functional change
Resolves#587
There is no reason to compile 3 different copies of search(). PV nodes are on
the cold path, and PvNode is a template parameter, so there is no cost in
computing:
const bool RootNode = PvNode && (ss-1)->ply == 0;
And this simplifies code a tiny bit as well.
Speed impact is negligible on my machine (i7-3770k, linux 4.2, gcc 5.2):
nps +/-
test 2378605 3118
master 2383128 2793
diff -4523 2746
Bench: 7751425
No functional change.
Resolves#568
In case of a very high material score, we can
overflow VALUE_INFINITE.
This patch fixes an assert with:
position fen 7k/QQQQR3/2B5/4KN1Q/3QQ3/8/8/4R3 b - - 0 1
go depth 1
No functional change.
Resolves#546
Make it explicit that those variables are not globals, but
are used only by main thread. I think it is a sensible
clarification because easy move is already tricky enough
and current patch makes the involved actors explicit.
No functional change.
Resolves#537
Tuned the global mobility factor for each piece, as well as some +- delta,
The master mobility factor was {266,334} and tuning gave
{267, 362} +S(-2,-2) for the Knight
{249, 328} +S( 0,-2) for the Bishop
{298, 353} +S(1,1) for the Rook
{265, 358} +S(2,-1) for the Queen
Passed STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 49402 W: 9367 L: 9037 D: 30998
and LTC
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 26831 W: 3871 L: 3658 D: 19302
Bench: 8355485
Resolves#536
Fix a bug where we could stop the search after only 10% of time used due to a matching easy move but later switch to a different move that was never pre-screened as easy due to SMP thread select.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27227 W: 4910 L: 4800 D: 17517
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40368 W: 5826 L: 5733 D: 28809
Resolves#521
Simplify time management code by removing hard stops for unchanging first root moves.
Search is now stopped earlier at the end iteration if it did not have fail-lows at root.
This simplification also fixes pondering bug. Ponder flag was true by default
and cutechess-cli doesn't change it to false even though no pondering is possible.
Fix the issue by setting the default value of 'Ponder' flag to false.
10+0.1:
ELO: 3.51 +-3.0 (95%) LOS: 99.0%
Total: 20000 W: 3898 L: 3696 D: 12406
40+0.4:
ELO: 1.39 +-2.7 (95%) LOS: 84.7%
Total: 20000 W: 3104 L: 3024 D: 13872
60+0.06:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 37231 W: 5333 L: 5236 D: 26662
Stopped run at 100+1:
LLR: 1.09 (-2.94,2.94) [-3.00,1.00]
Total: 37253 W: 4862 L: 4856 D: 27535
Resolves#523Fixes#510
Also inline defintions of SpaceMask and CenterBindMask.
Verified from assembly that compiler computes the values
at compile time, so it is also theoretical faster.
While there factor out scale factor evaluation.
No functional change.
This is used by std::stable_sort() to sort moves from highest score to lowest score.
1) The comment is incorrect since highest to lowest means descending.
2) It's more natural to implement a less operator using another less operator rather than a greater operator.
No functional change.
Resolves#504
Comment is based on a misunderstanding of what unaligned memory access is. Here
is an article that explains it very clearly:
https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt
No matter how we define TTEntry or TTCluster, there will never be any unaligned
memory access. This is because the complier knows the alignment rules, and does
the necessary adjustments to make sure unaligned memory access does not occur.
The issue being adressed here has nothing to do with unaligned memory access. It
is about cache performance. In order to achieve best cache performance:
- we prefetch the cacheline as soon as possible.
- we ensure that TT clusters do not spread across two cachelines. If they did,
we would need to prefetch 2 cachelines, which could hurt cache performance.
Therefore the true conditions to achieve this are:
1/ start adress of TT is cache line aligned. void TranspositionTable::resize()
enforces this.
2/ TT cluster size should *divide* the cache line size. Currently, we pack 2
clusters per cache lines. It used to be 1 before "TT sardines". Does not matter
what the ratio is, all we want is to fit an integer number of clusters per cache
line.
No functional change.
Resolves#506
Instead of creating a running std::thread and
returning, wait in Thread c'tor that the native
thread of execution goes to sleep in idle_loop().
In this way we can simplify how search is started,
because when main thread is idle we are sure also
all other threads will be idle, in any case, even
at thread creation and startup.
After lazy smp went in, we can simpify and rewrite
a lot of logic that is now no more needed. This is
hopefully the final big cleanup.
Tested for no regression at 5+0.1 with 3 threads:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 17411 W: 3248 L: 3198 D: 10965
No functional change.
Now that we don't have anymore TimerThread, there is
no need of this long class hierarchy.
Also assorted reformatting while there.
To verify no regression, passed at STC with 7 threads:
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 30990 W: 4945 L: 4942 D: 21103
No functional change.
When we reach the maximum depth, we can finish the
search without a raise of Signals.stop. However, if
we are pondering or in an infinite search, the UCI
protocol states that we shouldn't print the best move
before the GUI sends a "stop" or "ponderhit" command.
It was broken by lazy smp. Fix it by moving the stopping
of the threads after waiting for GUI.
No functional change.
Indeed, if we use a depth >= DEPTH_MAX, we start having negative depth in the
TT (due to int8_t cast).
No functional change in single thread mode
Resolves#490
Unfortunately std::condition_variable::wait_for()
is not accurate in general case and the timer thread
can wake up also after tens or even hundreds of
millisecs after time has elapsded. CPU load, process
priorities, number of concurrent threads, even from
other processes, will have effect upon it.
Even official documentation says: "This function may
block for longer than timeout_duration due to scheduling
or resource contention delays."
So retire timer and use a polling scheme based on a
local thread counter that counts search() calls and
a small trick to keep polling frequency constant,
independently from the number of threads.
Tested for no regression at very fast TC 2+0.05 th 7:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32969 W: 6720 L: 6620 D: 19629
TC 2+0.05 th 1:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7765 W: 1917 L: 1765 D: 4083
And at STC TC, both single thread
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15587 W: 3036 L: 2905 D: 9646
And with 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8149 W: 1367 L: 1227 D: 5555
bench: 8639247
The only interesting change is the moving of
stack[MAX_PLY+4] back to its original position
in id_loop (now renamed Thread::search).
No functional change.
Reduce the variation in Root Depth between different threads. This
prevents threads from searching at a depth much higher than Main Thread.
Performed well at STC 24 Threads:
ELO: 3.44 +-3.8 (95%) LOS: 96.1%
Total: 10000 W: 1627 L: 1528 D: 6845
And LTC 24 Threads
LLR: 1.43 (-2.94,2.94) [0.00,4.00]
Total: 3804 W: 500 L: 420 D: 2884
ELO : +7.31
p-value: 73.16%
Passed no regression at STC 3 Threads:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40457 W: 7148 L: 7060 D: 26249
And LTC 3 Threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17704 W: 2489 L: 2364 D: 12851
Raising a pull request early as 24 Thread tests are very expensive and
this is clearly a positive gain at high thread counts and high time
controls. The change is a small parameter tweak with no additional
logic.
No functional change for single thread mode.
Resolves#481
Rely on well defined behaviour for message passing, instead of volatile. Three
versions have been tested, to make sure this wouldn't cause a slowdown on any
platform.
v1: Sequentially consistent atomics
No mesurable regression, despite the extra memory barriers on x86. Even with 15
threads and extreme time pressure, both acting as a magnifying glass:
threads=15, tc=2+0.02
ELO: 2.59 +-3.4 (95%) LOS: 93.3%
Total: 18132 W: 4113 L: 3978 D: 10041
threads=7, tc=2+0.02
ELO: -1.64 +-3.6 (95%) LOS: 18.8%
Total: 16914 W: 4053 L: 4133 D: 8728
v2: Acquire/Release semantics
This version generates no extra barriers for x86 (on the hot path). As expected,
no regression either, under the same conditions:
threads=15, tc=2+0.02
ELO: 2.85 +-3.3 (95%) LOS: 95.4%
Total: 19661 W: 4640 L: 4479 D: 10542
threads=7, tc=2+0.02
ELO: 0.23 +-3.5 (95%) LOS: 55.1%
Total: 18108 W: 4326 L: 4314 D: 9468
As suggested by Joona, another test at LTC:
threads=15, tc=20+0.05
ELO: 0.64 +-2.6 (95%) LOS: 68.3%
Total: 20000 W: 3053 L: 3016 D: 13931
v3: Final version: SeqCst/Relaxed
threads=15, tc=10+0.1
ELO: 0.87 +-3.9 (95%) LOS: 67.1%
Total: 9541 W: 1478 L: 1454 D: 6609
Resolves#474
Using less parameters and code to compute Threats
Includes also a few spacing edits.
Run as a simplification.
Passed STC 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18879 W: 3725 L: 3600 D: 11554
Passed LTC 60+0.4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74116 W: 11001 L: 10958 D: 52157
bench: 8004751
Fishtest is a key factor of SF success.
Thanks to Fishtest we have not only greately
improved ELO but, even more important, we
have enabled a kind of joint development and
testing that it is the herat of on open
source project like SF.
Open sourcing is not just about open code, it is
about commuity development. In case of a chess engine
this has never been possible before due to missing
a strong and strict testing environment that allows
many people to contribute in a safe and coordinate way.
Fishtest is a new way of developing chess engines,
something that has never exsisted before.
No functional change.
Collect and give a second try to some almost passed tuning attempts and
one-line tweaks from the last month.
Passed STC
LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781
Bench: 8855226
Resolves#464
Start all threads searching on root position and
use only the shared TT table as synching scheme.
It seems this scheme scales better than YBWC for
high number of threads.
Verified for nor regression at STC 3 threads
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40232 W: 6908 L: 7130 D: 26194
Verified for nor regression at LTC 3 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28186 W: 3908 L: 3798 D: 20480
Verified for nor regression at STC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 3607 W: 674 L: 526 D: 2407
Verified for nor regression at LTC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 4235 W: 671 L: 528 D: 3036
Tested with fixed games at LTC with 20 threads
ELO: 44.75 +-7.6 (95%) LOS: 100.0%
Total: 2069 W: 407 L: 142 D: 1520
Tested with fixed games at XLTC (120secs) with 20 threads
ELO: 28.01 +-6.7 (95%) LOS: 100.0%
Total: 2275 W: 349 L: 166 D: 1760
Original patch of mbootsector, with additional work
from Ivan Ivec (log formula), Joerg Oster (id loop
simplification) and Marco Costalba (assorted formatting
and rework).
Bench: 8116244
Apply bonus for the prior CMH that caused a fail low.
Balance Stats: CMH and History bonuses are updated differently.
This eliminates the "fudge" factor weight when scoring moves. Also
eliminated discontinuity in the gravity history stat formula. (i.e. stat
scores will no longer inverse when depth exceeds 22)
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234
Bench: 7677367
Add Travis CI support to GitHub repo.
After every push to master, Travis will build
the sources directly from GitHub repo according
to .travis.yml and verify everything is ok.
No functional change.
Fix issues after a run of PVS-STUDIO analyzer.
Mainly false positives but warnings are anyhow
useful to point out not very readable code.
Noteworthy is the memset() one, where PVS prefers ss-2
instead of stack. This is because memeset() could
be optimized away by the compiler when using 'stack',
due to stack being a local variable no more used after
memset. This should normally not happen, but when
it happens it leads to very sublte and difficult
to find bug, so better to be safe than sorry.
No functional change.
When changing 'search' and 'splitPointsSize' we have to
use thread locks, not split point ones, because can_join()
is called under the formers.
Verified succesfully with 24 hours toruture tests with 20
cores machine by Louis Zulli: it does not hangs.
Verifyed for no regressions with STC, 7 threads:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 52804 W: 8159 L: 8087 D: 36558
No functional change.
Only refresh TT entry when it's really necessary.
This should give a small speed boost for some machines.
And it's a risk-free change.
No functional change.
Resolves#429
Louis Zulli reported that Stockfish suffers from very occasional hangs with his 20 cores machine.
Careful SMP debugging revealed that this was caused by "a ghost split point slave", where thread
was marked as a split point slave, but wasn't actually working on it.
The only logical explanation for this was double booking, where due to SMP race, the same thread
is booked for two different split points simultaneously.
Due to very intermittent nature of the problem, we can't say exactly how this happens.
The current handling of Thread specific variables is risky though. Volatile variables are in some
cases changed without spinlock being hold. In this case standard doesn't give us any kind of
guarantees about how the updated values are propagated to other threads.
We resolve the situation by enforcing very strict locking rules:
- Values for key thread variables (splitPointsSize, activeSplitPoint, searching)
can only be changed when the thread specific spinlock is held.
- Structural changes for splitPoints[] are only allowed when the thread specific spinlock is held.
- Thread booking decisions (per split point) can only be done when the thread specific spinlock is held.
With these changes hangs didn't occur anymore during 2 days torture testing on Zulli's machine.
We probably have a slight performance penalty in SMP mode due to more locking.
STC (7 threads):
ELO: -1.00 +-2.2 (95%) LOS: 18.4%
Total: 30000 W: 4538 L: 4624 D: 20838
However stability is worth more than 1-2 ELO points in this case.
No functional change
Resolves#422
v = value without ep capture being considered
v1 = value of the ep capture
The correct logic is:
if without e.p. capture we are losing, and the value of e.p is either draw, or win or "loss, but 50 move rule saves us", then we should use the value of ep capture.
Credit and thanks to syzygy1 and lantonov !
No functional change (except with syzygy bases)
Resolves#415Resolves#394
Instead of using hard coded Min and Max values for history,
always adjust the old value slightly downwards before adding a new value.
The adjustment acts like gravity that prevents the value escaping too
far from zero.
Bench: 8020484
Resolves#407
Apart from usual renaiming, take advantage of
C++11 function template default parmeter to
get rid of Eval trampoline functions.
Some triviality fixes while there.
No functional change.
Align the behaviour with reductions. Initially castling moves had to be
treated differently, because the SEE did not handle them correctly. But now it
does.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 83750 W: 15722 L: 15711 D: 52317
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 97183 W: 15120 L: 15115 D: 66948
bench 7759837
Resolves#403
PawnSafePush, with the value S(5,5) proved not "necessary"
possibly due to recent changes to MobilityArea and other changes to Connected bonus.
STC:
LLR: 3.22 (-2.94,2.94) [-3.00,1.00]
Total: 98528 W: 18757 L: 18759 D: 61012
LTC:
LLR: 5.30 (-2.94,2.94) [-3.00,1.00]
Total: 204194 W: 31698 L: 31734 D: 140762
Bench: 7620871
Resolves#396
Use Position::square and Position::squares instead.
This allow us to remove king_square(), simplify
endgames and to have more naming uniformity.
Moreover, this is a prerequisite step in case
in the future we decide to retire piece lists
altoghter and use pop_lsb() to loop across
pieces and serialize the moves. In this way
we just need to change definition of Position::square
to something like:
template<PieceType Pt> inline
Square Position::square(Color c) const {
return lsb(byColorBB[c]);
}
No functional change.
Based off of Pull request #383:
Include squares occupied by some pawns in the MobilityArea
a) not blocked
b) on rank 4 and above
c) or captures
Passed STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8157 W: 1644 L: 1516 D: 4997
And LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 26086 W: 4274 L: 4051 D: 17761
-----------
Then, a simplification test failed, trying to remove b and c)
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6048 W: 1117 L: 1288 D: 3643
Another simplification test, was run to remove just (c)
Passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28073 W: 5364 L: 5255 D: 17454
And LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34652 W: 5448 L: 5348 D: 23856
A parameter tweak test showed that changing b) for "on rank 3 and above"
does not work
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 5233 W: 937 L: 1077 D: 3219
Finally, a small rewrite, and we have this version
Include squares occupied by some pawns in the MobilityArea which are
a) not blocked
b) on rank 4 and above
Bench: 8977899
Resolves#385
Under Windows with MSVC we use the IDE to compile,
in this case we infer some compiler flags usually
set by Makefile.
The condition to check this was wrong, namely when compiling
with mingw under Windows 64 bit we always set IS_64BIT and
USE_BSFQ even if compiled with ARCH=x86-32 (this is how I
found it).
Small code style touches while there.
No functional change.
Error is:
a value of type "int" cannot be assigned to an entity of type "Value"
Also retire the now unused squares_of_color() function.
No functional change.
In Chess960 we can have legal positions with
opponent rook in A or H file and with castling
available, for instance:
4k3/pppppppp/8/8/8/8/PPPPPPPP/rR2K3 w Q - 0 1
In those cases we pick up the wrong rook when
setting castling.
Fix it by checking the color of the rook.
Bug reported by Matthew Lai.
No functional change.
Instead of crafting a clever formula to calculate the array offset, simply use a
3 dimensional array. Remove the comment while at it, because now the code is
self-documenting.
No functional change.
Resolves#344
Introduce helper function Search::reset() which clears all kind of search
memory, in order to restore a deterministic search state.
Generalize TT.clear() into Search::reset() for the following use cases:
- bench: needed to guarantee deterministic bench (ie. if you call bench from
interactive command line twice in a row you get the same value).
- Clear Hash: restore clean search state, which is the purpose of this button.
- ucinewgame: ditto.
No functional change.
Resolves#346
Based on an idea and patch by VoyagerOne.
Small simplification, but was tedted for an ELO gain anyway.
STC:
LLR: 2.95 (-2.94,2.94) [-1.00,4.00]
Total: 5375 W: 1119 L: 977 D: 3279
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 17893 W: 2984 L: 2792 D: 12117
bench 8322847
Use symmetry along vertical middle axis of the board
to reduce the number of parameters.
For instance psqt value of SQ_A5 == SQ_A4 and value of
SQ_F8 == SQ_F1.
This is always true, at least until now nobody came in
with an asymmetric psqt table that worked.
Original patch by Lucas.
No functional change.
Easier for tuning psq tables:
TUNE(myParameters, PSQT::init);
Also move PSQT code in a new *.cpp file, and retire the
old and hacky psqtab.h that required to be included only
once to work correctly, this is not idiomatic for a header
file.
Give wide visibility to psq tables (previously visible only
in position.cpp), this will easy the use of psq tables outside
Position, for instance in move ordering.
Finally trivial code style fixes of the latest patches.
Original patch of Lucas Braesch.
No functional change.
In ei.attackedBy, Queen does not x-ray through Rook, but the Rook does
X-ray through the Queen.
So most of the rook contact checks supported by queen are, in fact,
Queen Contact Checks and they are already scored separately.
Bench: 7762189
Resolves#338
Currently Zobrist::castling[] are not properly zeroed
and rely on the compiler to do this at startup, but this
makes Position::init() to set different values every time
it is called!
This is a bit odd, and although not impacting normal usage,
can yield to subtle misbehaviour, very difficult to track
down, in case we happen to call it more than once for some
reason. I found this while developing tuning support and
it took me a while to track it down.
So properly init Zobrist::castling[]
No functional change.
Resolves#329
When running more games in parallel, or simply when running a game
with a background process, due to how OS scheduling works, there is no
guarantee that the CPU resources allocated evenly between the two
players. This introduces noise in the result that leads to unreliable
result and in the worst cases can even invalidate the result. For
instance in SF test framework we avoid running from clouds virtual
machines because are a known source of very unstable CPU speed.
To overcome this issue, without requiring changes to the GUI, the idea
is to use searched nodes instead of time, and to convert time to
available nodes upfront, at the beginning of the game.
When nodestime UCI option is set at a given nodes per milliseconds
(npmsec), at the beginning of the game (and only once), the engine
reads the available time to think, sent by the GUI with 'go wtime x'
UCI command. Then it translates time in available nodes (nodes =
npmsec * x), then feeds available nodes instead of time to the time
management logic and starts the search. During the search the engine
checks the searched nodes against the available ones in such a way
that all the time management logic still fully applies, and the game
mimics a real one played on real time. When the search finishes,
before returning best move, the total available nodes are updated,
subtracting the real searched nodes. After the first move, the time
information sent by the GUI is ignored, and the engine fully relies on
the updated total available nodes to feed time management.
To avoid time losses, the speed of the engine (npms) must be set to a
value lower than real speed so that if the real TC is for instance 30
secs, and npms is half of the real speed, the game will last on
average 15 secs, so much less than the TC limit, providing for a
safety 'time buffer'.
There are 2 main limitations with this mode.
1. Engine speed should be the same for both players, and this limits
the approach to mainly parameter tuning patches.
2. Because npms is fixed while, in real engines, the speed increases
toward endgame, this introduces an artifact that is equivalent to an
altered time management. Namely it is like the time management gives
less available time than what should be in standard case.
May be the second limitation could be mitigated in a future with a
smarter 'dynamic npms' approach.
Tests shows that the standard deviation of the results with 'nodestime'
is lower than in standard TC, as is expected because now all the introduced
noise due the random speed variability of the engines during the game is
fully removed.
Original NIT idea by Michael Hoffman that shows how to play in NIT mode
without requiring changes to the GUI. This implementation goes a bit
further, the key difference is that we read TC from GUI only once upfront
instead of re-reading after every move as in Michael's implementation.
No functional change.
And reformat a bit time manager code.
Note that now we set starting search time in think() and
no more in ThreadPool::start_thinking(), the added delay
is less than 1 msec, so below timer resolution (5msec) and
should not affect time lossses ratio.
No functional change.
Code like this is more a case of showing off one's C++ knowledge, rather than
using it adequately, IMHO.
**First loop (std::generate)**
Iterators are inadequate here, because they lose the key information which is
idx. As a result, we need to carry a redundant idx variable, and increment it
along the way. Very clumsy.
Usage of std::generate and a lambda function only obfuscate the code, which is
merely a simple and stupid loop over the elements of a vector.
**Second loop (std::accumulate)**
This code is thoroughlly incomprehensible. Restore the original, which was much
simpler to understand.
**Third loop (range based loop)**
Again, a range based loop is inadequate, because we lose idx! To resolve this
artificially created problem, the data model was made redundant (idx is a data
member of db[] elements!?), which is ugly and unjustified. A simple and stupid
for loop with idx does the job much better.
No functional change.
Resolves#313
Small speed-up in pawn.cpp
Results for 10 tests for each version:
Base Test Diff
Mean 1435636 1445238 -9602
StDev 22576 23189 1848
p-value: 1
speedup: 0.007
No functional change
Resolves#295
If the sum of CounterMoveHistory heuristic and History heuristic is below zero,
then reduce an extra ply in cut nodes
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6479 W: 1099 L: 967 D: 4413
Bench: 7773299
Resolves#315
Currently if we call it more than once, we crash.
This is not a real problem, because this function is
indeed called just once. Nevertheless with this small fix,
that gets rid of a hidden 'static' variable, we cleanly
resolve the issue.
While there, fix also ThreadPool::exit to return in a
consistent state. Now all the init() functions but
UCI::init() are reentrant and can be called multiple
times.
No functional change.
Profiling shows that resetting attacks table after
a failed candidate magic attempt is the biggest
time consumer, so rewrite the logic avoiding the
memset()
Magics init for rook+bishop goes from 200msecs to
under 100msec.
No functional change.
To sync UI with main thread it is enough a single
condition variable because here we have a single
producer / single consumer design pattern.
Two condition variables are strictly needed just for
many producers / many consumers case.
Note that this is possible because now we don't send to
sleep idle threads anymore while searching, so that now
only UI can wake up the main thread and we can use the
same ConditionVariable for both threads.
The natural consequence is to retire wait_for_think_finished()
and move all the logic under MainThread class, yielding the
rename of teh function to join()
No functional change.
Now that we use spinlocks everywhere and don't put
threads to sleep while idle, we can use the slower
(but no more in hot path) std::condition_variable_any
instead of our homwgrown ConditionVariable struct.
Verified fo rno regression at STC with 7 threads:
ELO: -0.66 +-2.7 (95%) LOS: 31.8%
Total: 20000 W: 3210 L: 3248 D: 13542
No functional change
There's no reason to define it as a Bitboard, so for consistency, use bool.
This is even a speedup on my machine: i7-3770k, using gcc 4.9.1 (linux):
stat test master diff
mean 2,341,338 2,327,998 13,134
stdev 15,765 14,717 5,405
speedup 0.56%
P(speedup>0) 100.0%
No functional change.
Resolves#298
Avoid redundant 'while' conditions. It is enough to
check them in the outer loop.
Quick tested for no regression 10K games at 4 threads
ELO: -1.32 +-3.9 (95%) LOS: 25.6%
Total: 10000 W: 1653 L: 1691 D: 6656
No functional change.
Spinlock must be used instead.
Tested for no regression at 15+0.05 th 4:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25928 W: 4303 L: 4190 D: 17435
No functional change.
Resolves#297
Unify the quites moves loop for both cases,
the compiler optimizes away the
if (is_ok((ss-1)->currentMove))
inside loop, so that the result is same
speed as original.
No functional change.
During the search, do not block on condition variable, but instead use std::this_thread::yield().
Clear gain with 16 threads. Again results vary highly depending on hardware, but on average it's a clear gain.
ELO: 12.17 +-4.3 (95%) LOS: 100.0%
Total: 7998 W: 1407 L: 1127 D: 5464
There is no functional change in single thread mode
Resolves#294
Both are the result of a SPSA tuning session with a custom book, 50k iterations each.
After an additional tuning session of the mobility values, tuning the delta values, with following result.
40k games at 9+0.05:
ELO: 4.13 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 8581 L: 8106 D: 23313
and LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 36518 W: 6049 L: 5782 D: 24687
Bench: 8567402
Resolves#284
Fixes reported startup error about missing libwinpthread-1.dll
when the dll is not in the path.
The current -static-xxxx flags, introduced with:
https://github.com/official-stockfish/Stockfish/commit/373503f4a9a990054b5
Only take in account standard libraries, but not thread
library.
No functional change.
Resolves#289
Idea and original implementation by Stephane Nicolet
7 threads 15+0.05
ELO: 3.54 +-2.9 (95%) LOS: 99.2%
Total: 17971 W: 2976 L: 2793 D: 12202
There is no functional change in single thread mode
Spend much less time in positions where one move is much better than all other alternatives.
We carry forward pv stability information from the previous search to identify such positions.
It's based on my old InstaMove idea but with two significant improvements.
1) Much better instability detection inside the search itself.
2) When it's time to make a FastMove we no longer make it instantly but still spend at least 10% of normal time verifying it.
Credit to Gull for the inspiration.
BIG thanks to Gary because this would not work without accurate PV!
20K
ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067
STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112
LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448
Resolves#283
Introduce a counter move history table which additionally is indexed by the last move's piece and target square.
For quiet move ordering use now the sum of standard and counter move history table.
STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853
Because of reported low NPS on multi core test
STC (7 threads):
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829
Bench: 7725341
Resolves#282
Use Mutex instead.
This is in preparaation for merging with master branch,
where we stilll don't have spinlocks.
Eventually spinlocks will be readded in some future
patch, once c++11 has been merged.
No functional change.
minKingPawnDistance is used only as local variable in one place so we don't need it to be part of "Pawns::Entry" structure.
No functional change.
Resolves#277
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.
To use the default (fastest) settings, compile with:
make build ARCH=x86-64-modern
To test the clang settings, compile with
make build ARCH=x86-64-modern COMP=clang
Beware that the clang settings may provide a slightly slower (6%)
executable.
Backported from master.
No functional change
Resolves#275
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.
To use the default (fastest) settings, compile with:
make build ARCH=x86-64-modern
To test the clang settings, compile with
make build ARCH=x86-64-modern COMP=clang
Beware that the clang settings may provide a slightly slower (6%)
executable.
No functional change
Resolves#275
Now that c++11 branch has been merged in master,
disable unconditionally the spinlocks and use mutex
instead. This will allow to run fishtest even on HT
machines withouth changes.
In the future we will reintorduce spinlocks, once
we will have took care of fishtest.
No functional change.
And use mutex instead. You may never want to do this.
It is a workaround to run c++11 on fishtest where many
machiens have HTenabled and this can be a problem when
number of cores set is higher than number of physical cores.
To disable spinlocks, just compile with -DNO_SPINLOCK flag
No functional change.
Calling lock.test_and_set() in a tight loop creates expensive
memory synchronizations among processors and penalize other
running threads. So syncronize only only once at the beginning
with fetch_sub() and then loop on a simple load() that puts much
less pressure on the system.
Reported about 2-3% speed up on various systems.
Patch by Ronald de Man.
No functional change.
It is reported to be defenitly faster with increasing
number of threads, we go from a +3.5% with 4 threads
to a +15% with 16 threads.
The only drawback is that now when testing with more
threads than physical available cores, the speed slows
down to a crawl. This is expected and was similar at what
we had setting the old sleepingThreads to false.
No functional change.
Initialization is more complex than what I'd like due
to MSVC compatibility that for some reason does not like:
std::atomic_flag lock = ATOMIC_FLAG_INIT;
No functional change.
It should be used slavesMask.count() instead.
Verified 100% equivalent when sp->allSlavesSearching:
dbg_hit_on(sp->allSlavesSearching, sp->slavesCount != sp->slavesMask.count());
No functional change.
Document and clarify that we cannot rejoin on ourselves
and that we never late join if we are master and all
slaves have finished, inded in this case we exit idle_loop.
No functional change.
In case of Threads.size() == 2 we have that sp->allSlavesSearching
is always false (because we have finished our search), bestSp is
always NULL and we never late join, so there is no need to special
case here.
Tested with dbg_hit_on(sp && sp->allSlavesSearching) and
verified it never fires.
No functional change.
And retire a redundant field. This is important also
from a concept point of view becuase we want to keep
SMP structures as simple as possible with the only
strictly necessary data.
Verified with
dbg_hit_on(sp->spLevel != level)
that the values are 100% the same out of more 50K samples.
No functional change.
Balance threads between split points.
There are huge differences between different machines and autopurging makes it very difficult to measure the improvement in fishtest, but the following was recorded for 16 threads at 15+0.05:
For Bravone (1000 games): 0 ELO
For Glinscott (1000 games): +20 ELO
For bKingUs (1000 games): +50 ELO
For fastGM (1500 games): +50 ELO
The change was regression for no one, and a big improvement for some, so it should be fine to commit it.
Also for 8 threads at 15+0.05 we measured a statistically significant improvement:
ELO: 6.19 +-3.9 (95%) LOS: 99.9%
Total: 10325 W: 1824 L: 1640 D: 6861
Finally it was verified that there was no (significant) regression for
4 threads:
ELO: 0.09 +-2.8 (95%) LOS: 52.4%
Total: 19908 W: 3422 L: 3417 D: 13069
2 threads:
ELO: 0.38 +-3.0 (95%) LOS: 60.0%
Total: 19044 W: 3480 L: 3459 D: 12105
1 thread:
ELO: -1.27 +-2.1 (95%) LOS: 12.3%
Total: 40000 W: 7829 L: 7975 D: 24196
Resolves#258
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):
stat test master diff
mean 2,403,118 2,390,904 12,214
stdev 12,043 10,620 3,677
speedup 0.51%
P(speedup>0) 100.0%
No functional change.
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):
stat test master diff
mean 2,403,118 2,390,904 12,214
stdev 12,043 10,620 3,677
speedup 0.51%
P(speedup>0) 100.0%
No functional change.
Use integer arithmetic instead of floating point arithmetic.
Floating point arithmetic was causing different results for some 32-bit compiles
No functional change
Resolves#249Resolves#250
Also remove useless StateCopySize64 optimization:
compiler uses SSE movups instruction anyhow and
does not need this trick (verified with fishbench).
No functional change.
Funny enough, gcc __builtin_prefetch() expects
already a void*, instead Windows's _mm_prefetch()
requires a char*.
The patch allows to remove ugly casts from caller
sites.
No functional change.
Results for 10 tests for each version (gcc 4.8.3 on mingw):
Base Test Diff
Mean 1502447 1507917 -5470
StDev 3119 1364 4153
p-value: 0,906
speedup: 0,004
Results for 10 tests for each version (MSVC 2013):
Base Test Diff
Mean 1400899 1403713 -2814
StDev 1273 2804 2700
p-value: 0,851
speedup: 0,002
No functional change.
I went through all the individual compile options that differ between
-fprofile-generate/-fprofile-use and -fprofile-arcs/-fbranch-probabilities
and distilled the speed difference down to only turning off
-fno-peel-loops and -fno-tracer. Using this we still get the full speedup
(maybe a bit more because other optimizations stay on) and it's also much cleaner
because we can get rid of the "@rm -f ucioption.gc*" hack for all versions of gcc.
No functional change.
Resolves#237
Follow the usual approach to delay computation
as far as possible, in case an earlier killer
cut-offs we avoid to do useless work.
This also greatly simplifies the code.
No functional change.
Verified with perft there is no speed regression,
and code is simpler. It is also conceptually correct
becuase an extended move is just a move that happens
to have also a score.
No functional change.
This is an old patch from Jean-Francois Romang to send
UCI hashfull info to the GUI:
https://github.com/mcostalba/Stockfish/pull/60/files
It was wrongly judged as a slowdown, but it takes much
less than 1 ms to run, indeed on my core i5 2.6Ghz it
takes about 2 microsecs to run!
Regression test is good:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7352 W: 1548 L: 1401 D: 4403
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61432 W: 10307 L: 10251 D: 40874
I have set the name of the author to the original
one.
No functional change.
This patch has two positive effects:
- Retire a hackish formula and leave
just a natural, simple and plain one.
- Reduce strenght at very low level, but
don't impact medium/high levels.
Actually even at level 0, SF is still too
strong for many beginners (this was reported
many times for instance on Droidfish user
comments on Google Play).
Test on fishtest shows that ELO drop is around
170 ELO at level 0 (good!), 130 ELO at level 1
and smoothly reduces (as expected) until level
10 where the drop is just of 8 ELO.
No functional change.
- Fix a skill level problem: Don't allow move pruning at root node
- Revert "Fix profile build for gcc on Mac OSX". Results for a faster binary in x86-64.
- Fix a MSVC warning
Bench: 8918745
Seems to be a performance regression for standard build.
For SF6 people compiling on Mac OSX using profile-build option
just need to make necessary adjustments manually...
No functional change
Resolves#223
Instead of swapping sub-optimal move in Skill
d'tor, make it explicitly at the end of the search.
Also streamline and clarify relation with multiPV
and pass it directly instead of relying on the hacky
'candidates' member.
No functional change.
Warning is C4512 (assignment operator could not be generated)
Now, apart the foreign syzygy code, everything compiles
without warnings at warning level 4.
Backported from C++11 branch.
No functional change.
- Fix a compilation issue related to BMI2 PEXT instruction
- Retrieve a ponder move from TT if PV is only one move long
Bench: 8080602
No functional change
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is of not using _pext_u64 directly.
This patch fixes a compile error with some mingw64
package when compiling with x86-64.
No functional change.
Resolves#222
In case we stop the search during a fail-high
it is possible we return to GUI without a ponder
move. This patch try harder to find a ponder move
retrieving it from TT. This is important in games
played with 'ponder on'.
bench: 8080602
Resolves#221
When not using Makefile, e.g. with MSVC, if hardware
supports BMI2 instructions, then USE_PEXT should be
added in project configuration to enable pext support.
No functional change.
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is of not using _pext_u64 directly.
This patch fixes a compile error with some mingw64
package when compiling with x86-64.
No functional change.
Warning is C4512 (assignment operator could not be generated)
Now, apart the foreign syzygy code, everything compiles
without warnings at warning level 4.
No functional change.
Coverity scan warns about uninitialized 'sf' argument when
calling probe(). Actually it is a false positive because
argument is passed by reference and assigned inside
probe(). Nevertheless it is a hint that fucntion signature
is a bit tricky, so rewrite it in a more conventional way,
assigning 'sf' from probe() return value.
No functional change.
On platforms where size_t is 32 bit, we
can have an overflow in this expression:
(mbSize * 1024 * 1024)
Fix it setting max hash size of 2GB on platforms
where size_t is 32 bit.
A small rename while there: now struct Cluster
is definied inside class TranspositionTable so
we should drop the redundant TT prefix.
No functional change.
On Android-ARM current TB code crashes at
random times even in single thread mode.
Reported, debugged, fixed and verified
by Peter Osterlund.
No functional change.
Resolves#201
This optimization is aimed at old hardware only (withouth popcount), and even on
non popcount compile (ARCH=x86-64), it provides no mesurable speedup:
stat test master diff
mean 2,341,779 2,354,699 -12,920
stdev 12,910 14,770 18,150
speedup -0.55%
P(speedup>0) 23.8%
No functional change.
Resolves#187
- Change UCI::value() signature
This function should only return the value,
lowerbound and upperbound info is up to the
caller because it requires external knowledge,
out of the scope of this little helper.
- Retire 'key' command
It is not an UCI command and is absolutely
useless: never used.
- Comments fixing and other trivia
No functional change.
And reshuffle a bit the functions to place
them in a consistent order.
To be on the safe side, patch has been
validated for no regression/crashes with
a small 8K games test with 3 threads:
ELO: 3.98 +-4.4 (95%) LOS: 96.3%
Total: 8388 W: 1500 L: 1404 D: 5484
No functional change.
It is up to material (and pawn) table look up
code to know where the per-thread tables are,
so change API to reflect this.
Also some comment fixing while there
No functional change.
Move all in evaluation.
Simplify the code and concentrate in a single place
all the logic behind space evaluation, making it much
more clear.
Verified also at STC it does not regress due to a possible
slow down:
LLR: 3.91 (-2.94,2.94) [-3.00,1.00]
Total: 65744 W: 13285 L: 13194 D: 39265
No functional change.
Commenst are obsolete now, an updated description
would be quite obscure, so better let the code
to talk and remove them all together.
No functional change.
Use the same template of other pawns moves generation,
make the code more uniform, simplify generate_promotions
that has now been renamed.
No functional change (verified also with perft).
In some UNIX systems "rm" prompts user for confirmation.
However "rm -f" is always a guaranteed forced deletion.
Also move gcc profiling hack under the correct target
No Functional change
Resolves#168
ShelterWeakness and Stormdanger array are now indexed additionally by
file pair (a/h,b/g,c/f,d/e). The special case of king blocking a pawn
is incorporated in the StormDanger array. Finally the 93 parameters
are tuned by SPSA on LTC.
STC
ELO: 3.46 +-2.2 (95%) LOS: 99.9%
Total: 40000 W: 8275 L: 7877 D: 23848
LTC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 10311 W: 1876 L: 1721 D: 6714
Bench: 9498821
Resolves#163
The evaluation is already done by the specialized
function, don't need to add something elese later.
With this patch following positions are evaluated
correctly as draws:
8/6p1/1Pkp1p1p/2nNn2P/2P1K1P1/8/8/3B4 w - - 7
8/1k4p1/1P1p1p1p/3NnK1P/2P3P1/1n6/4B3/8 w - -
Verified it not regress with an STC test:
LLR: 3.15 (-2.94,2.94) [-3.00,1.00]
Total: 49812 W: 10095 L: 10016 D: 29701
Reported by Arjun Temurnikar.
bench: 8289983
In particular seems more natural to return
bool and TTEntry on the same line, actually
we should pass and return them as a pair,
but due to limitations of C++ and not wanting
to use std::pair this can be an acceptable
compromise.
No functional change.
Resolves#157
In some cases we want to go direcly to the moves loop
without checking for early return. The patch make this
logic more clear and consistent.
Tested for no regression, passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25282 W: 5136 L: 5022 D: 15124
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 72007 W: 12133 L: 12095 D: 47779
bench: 9316798
This patch replaces RKISS by a simpler and faster PRNG, xorshift64* proposed
by S. Vigna (2014). It is extremely simple, has a large enough period for
Stockfish's needs (2^64), requires no warming-up (allowing such code to be
removed), and offers slightly better randomness than MT19937.
Paper: http://xorshift.di.unimi.it/
Reference source code (public domain):
http://xorshift.di.unimi.it/xorshift64star.c
The patch also simplifies how init_magics() searches for magics:
- Old logic: seed the PRNG always with the same seed,
then use optimized bit rotations to tailor the RNG sequence per rank.
- New logic: seed the PRNG with an optimized seed per rank.
This has two advantages:
1. Less code and less computation to perform during magics search (not ROTL).
2. More choices for random sequence tuning. The old logic only let us choose
from 4096 bit rotation pairs. With the new one, we can look for the best seeds
among 2^64 values. Indeed, the set of seeds[][] provided in the patch reduces
the effort needed to find the magics:
64-bit SF:
Old logic -> 5,783,789 rand64() calls needed to find the magics
New logic -> 4,420,086 calls
32-bit SF:
Old logic -> 2,175,518 calls
New logic -> 1,895,955 calls
In the 64-bit case, init_magics() take 25 ms less to complete (Intel Core i5).
Finally, when playing with strength handicap, non-determinism is achieved
by setting the seed of the static RNG only once. Afterwards, there is no need
to skip output values.
The bench only changes because the Zobrist keys are now different (since they
are random numbers straight out of the PRNG).
The RNG seed has been carefully chosen so that the
resulting Zobrist keys are particularly well-behaved:
1. All triplets of XORed keys are unique, implying that it
would take at least 7 keys to find a 64-bit collision
(test suggested by ceebo)
2. All pairs of XORed keys are unique modulo 2^32
3. The cardinality of { (key1 ^ key2) >> 48 } is as close
as possible to the maximum (65536)
Point 2 aims at ensuring a good distribution among the bits
that determine an TT entry's cluster, likewise point 3
among the bits that form the TT entry's key16 inside a
cluster.
Details:
Bitset card(key1^key2)
------ ---------------
RKISS
key16 64894 = 99.020% of theoretical maximum
low18 180117 = 99.293%
low32 305362 = 99.997%
Xorshift64*, old seed
key16 64918 = 99.057%
low18 179994 = 99.225%
low32 305350 = 99.993%
Xorshift64*, new seed
key16 65027 = 99.223%
low18 181118 = 99.845%
low32 305371 = 100.000%
Bench: 9324905
Resolves#148
Currently Search::RootMoves is accessed and even
modified by TB probing functions in a hidden
and sneaky way.
This is bad practice and makes the code tricky.
Instead explicily pass the vector as function
argument so to clarify that the vector is modified
inside the functions.
No functional change.
Simplified also some logic while there.
TBLargest needs renaming too, but itis for
a future patch because touches also syzygy
directory stuff.
No functional change.
We really don't need to uglify in this way
our nice count() API with this ad-hoc hack.
So remove the hack and use the already
existing infrastructure.
No functional change.
Resolves#134
Updating the makefile so that the clean and gcc-profile-clean targets also
remove the profiling data files in the syzygy directory.
No functional change.
Resolves#136
Just like in Physics, the ratio of 2 things in the same unit, should be
without unit.
Example use case:
- Ratio of a Depth by a Depth (eg. ONE_PLY) should be an int.
- Ratio of a Value by a Value (eg. PawnValueEg) should be an int.
Remove a bunch of useless const while there.
No functional change.
Resolves#128
Adds support for Syzygy tablebases to Stockfish. See
the Readme for information on using the tablebases.
Tablebase support can be enabled/disabled at the Makefile
level as well, by setting syzygy=yes or syzygy=no.
Big/little endian are both supported.
No functional change (if Tablebases are not used).
Resolves#6
It is possible that we won't have a ponder move if our PV
is too short. In that case, just don't print a ponder move.
No functional change
Resolves#130
When evaluating double pawns we use always
lsb() to extract the frontmost square.
This breaks evaluation color symmetry as is
possible to verify with an instrumented evaluate()
Value evaluate(const Position& pos) {
Value v = do_evaluate<false>(pos);
Position p = pos;
p.flip();
assert(v == do_evaluate<false>(p));
return v;
}
Passed no regression test:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21035 W: 4244 L: 4122 D: 12669
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39839 W: 6662 L: 6572 D: 26605
bench: 8255966
It is a non functional change, but because
we now skip copying of pv[] in SpNode, patch
has been tested for regression with 3 threads:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54668 W: 9873 L: 9809 D: 34986
No functional change.
This is a regression from 428962a
We have to cast to char here, otherwise the compiler
interprets it as an integer, and writes a number.
No functional change
Resolves#122
This is the first of a patch series to
rearrange and simplify accurate PV.
In this patch there is simple coding
style and reformatting stuff.
Verified with fishtest it does not crash
with MAX_PLY = 8
No functional change.
Previously token would keep its value from the previous line when an empty
line was input, leading to unexpected behaviour.
No functional change
Resolves#119
When comparing to a Depth it is more
consistent to use DEPTH_MAX instead
of a int.
This is a subtle difference because we use
ply and depth almost interchangably in SF,
but they are different. FOr counting plies
makes ense to continue using ints, while
for Depth we have our specific enum.
This cleanly fixes a new Clang 3.5 warning:
No functional change.
This gives SF accurate PVs, such that the evaluation of the leaf node in
the PV matches the score backed up to the root (99% of the time.
q-search will use the value stored in the hash table instead of the eval
value sometimes).
One drawback is that fail-high/low only get a minimal 2 move PV.
It doesn't add any additional overhead to the non-PV codepath except an
extra eight bytes to the SearchStack structure in multi-threaded
searches.
A core part of this is not pruning based on TT score in PV nodes. This
was measured as not being a regression at multiple TCs, except for one
exception, fast TC with huge hash, which is not realistic for longer
searches.
STC - 1 thread, 128 mb hash
ELO: 1.42 +-3.1 (95%) LOS: 81.9%
Total: 20000 W: 4078 L: 3996 D: 11926
STC - 3 thread, 128 mb hash
ELO: -3.60 +-2.9 (95%) LOS: 0.8%
Total: 20000 W: 3575 L: 3782 D: 12643
STC - 3 thread, 8 mb hash
ELO: 0.12 +-2.9 (95%) LOS: 53.3%
Total: 20000 W: 3654 L: 3647 D: 12699
LTC - 3 thread, 32mb hash
ELO: 2.29 +-2.0 (95%) LOS: 98.8%
Total: 35740 W: 5618 L: 5382 D: 24740
Bench: 6984058
Resolves#102
Daniel Jose reported that it was an elo gain in his engine:
http://www.talkchess.com/forum/viewtopic.php?t=54290
STC: Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33067 W: 6670 L: 6571 D: 19826
LTC: Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41181 W: 7008 L: 6920 D: 27253
And another one to verify no regression with hash pressure:
STC: Hash=4
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 25085 W: 5059 L: 4991 D: 15035
Verified that qsearch does not explode after this patch (recapture threshold).
Bench 7962287
Resolves#112
Real men jump/branch by hand...but
we prefer the humble way.
Moved also some uci info code where it
belongs, while there.
No functional change.
Resolves#110
16MB for 1" searches is more comensurate with the average use case.
And 16 is the default used by stockfish bench, so it makes sense to be
consistent, if only to have the same minimum memory requirement for using
SF and compiling it with PGO.
No functional change.
Resolves#109
Now we can directly replace it with
the definition resulting in simpler
and possibly faster code because
PvNode is evaluated at compile time.
No functional change.
* remove some erroneous comments, that were based on the ONE_PLY == 2.
* rename hd to d, because there's no more half-depth in SF.
* rescope variables into the for loops.
* reindent the for loops correctly.
* add a comment to explain the eval improving part (not so obvious to read
this code as array has 4 dimensions).
No functional change.
This should reduce search inconsistencies, and doesn't seem to have a measurable ELO Impact:
STC with Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49264 W: 10076 L: 10007 D: 29181
LTC with Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82149 W: 14044 L: 14023 D: 54082
Plus an extra test, to make sure it doesn't regress with strong hash pressure:
STC with Hash=4
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 21498 W: 4327 L: 4246 D: 12925
Bench: 7302735
Resolves#100
Idea is to apply king safety later in the endgame. Previously, we didn't
apply KS in a RR vs. Q ending for example, which causes poor play.
Now we calculate king attacks when the attacking side has a queen or more.
STC with 8moves_v3
LLR: 3.06 (-2.94,2.94) [0.00,4.00]
Total: 38481 W: 6228 L: 5952 D: 26301
LTC with 2moves_v1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51053 W: 8670 L: 8353 D: 34030
Bench: 7514010
Resolves#98
UCI specification is not clear on the size of
integers that are exchanged in the protocol, so
instead of a simple int, assume 'nodes' is a
int64_t because we need a bigger size to store
this value in many real cases, especialy with
very long searches.
No functional change.
Resolves#75
Clang 3.5 issues warning on constructs like: abs(f1 - f2). The thing is that
f1 and f2 are enum types, and the range given (all positive) allows the
compiler to choose an unsigned type (efficiency being one reason to prefer
unsigned arithmetic). If f1 < f2 are unsigned, then f1 - f2 wraps around zero
and the abs() becomes a no-op. It's the reinterpretation of the unsigned
result (large value) as a signed int that happens to give the correct result,
thanks to 2's complement. This is all tricky and dangerous!
In the spirit of the standard, we assume nothing on the signedness of enums,
and simply calculate the rank and file distances as:
- rank_dist(r1, r2) = r1 < r2 ? r2 - r1 : r1 - r2
- file_dist(f1, f2) = f1 < f2 ? f2 - f1 : f1 - f2
this logic can in fact be applied to any enum we may use, so for better
generality and to avoid code duplication, we use a template function diff()
here.
No functional change.
Resolves#95
This area has become obscure and tricky over the course of incremental
changes that did not respect the original's consistency and clarity. Now,
it's not clear why we use MAX_PLY = 120, or why we use MAX_PLY+6, among
other things.
This patch does the following:
* ID loop: depth ranges from 1 to MAX_PLY-1, and due to TT constraint (depth
must fit into an int8_t), MAX_PLY should be 128.
* stack[]: plies now range from 0 to MAX_PLY-1, hence stack[MAX_PLY+4],
because of the extra 2+2 padding elements (for ss-2 and ss+2). Document this
better, while we're at it.
* Enforce 0 <= ply < MAX_PLY:
- stop condition is now ss->ply >= MAX_PLY and not ss->ply > MAX_PLY.
- assert(ss->ply < MAX_PLY), before using ss+1 and ss+2.
- as a result, we don't need the artificial MAX_PLY+6 range. Instead we
can use MAX_PLY+4 and it's clear why (for ss-2 and ss+2).
* fix: extract_pv_from_tt() and insert_pv_in_tt() had no reason to use
MAX_PLY_PLUS_6, because the array is indexed by plies, so the range of
available plies applies (0..MAX_PLY before, and now 0..MAX_PLY-1).
Tested with debug compile, using MAX_PLY=16, and running bench at depth 17,
using 1 and 7 threads. No assert() fired. Feel free to submit to more severe
crash-tests, if you can think of any.
No functional change.
SSE4.2 has nothing to do with POPCNT. We must dispell this myth, because
Stockfish is a reference and many will copy this mistake if they see it in Stockfish:
* SSE is an SIMD instruction set, relative to vectorization (using special 128-bit registers).
* POPCNT/LZCNT work on normal registers (eg. AL, AX, EAX, RAX).
The confusion comes from the fact that, in the Intel product line, it just
so happens that SSE4.2 and POPCNT/LZCNT came out at the same time. But this
is not true for AMD. For example, all AMD Pheniom II have SSE3 but no
POPCNT/LZCNT, and that is why the modern compile uses -msse3 -popcnt and not -msse4.2.
No functional change.
Resolves#86
Objects that are only accessible at file-scope should be put in the anonymous namespace.
This is what the C++ standard recommends, rather than using static, which is really C-style and results in static linkage.
Stockfish already does this throughout the code. So let's weed out the few exceptions,
because... they have no reason to be exceptional.
No functional change.
Resolves#84
It is more idiomatic, we didn't used it
in the past because Position::pretty(Move)
had a calling argument, but now we can.
As an added benefit, we avoid a lot of string
copies in the process because now we avoid
std::ostringstream ss.
No functional change.
This commit fixes two issues:
1) Don't print PVs after the search has been interrupted
This solves the "mate 0 upperbound" scores that sometimes
creep up when a multi-PV analysis gets interrupted with
the `stop` command.
2) Print multipv before score
Shredder Classic fails to identify the main PV
(the one with multipv 1) if `score` comes first.
This leads to an eval graph that doesn't reflect
the scores actually reported by Stockfish when
doing a multiPV analysis.
No functional change
Closes#76
Helper function should just know how to find the
biggest piece type in a bitboard. All the threat
logic and data shoud be in evaluate_threats().
This nicely separates the scope of the two functions
in a more consistent way and simplifies the code.
No functional change.
Use the max_threat() helper function to estimate more precisely the
best hanging piece threat. Also retunes the Threat array using SPSA.
STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7598 W: 1596 L: 1468 D: 4534
LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 7896 W: 1495 L: 1350 D: 5051
Bench: 6816504
Resolves#73
Instead of hard-code the weights in a big table,
we prefer to calculate them out of few parameters
at startup. This allows to keep low the number of
independent parameters and hence is good for tuning
and for a better insight in the meaning of the numbers.
No functional change.
Make even more clear what are the terms that
contribute to evaluate connected pawns, and
completely separate them from the weights
that are now fully looked up in a table.
For future tuning makes sense to init the table with
a formula instead of hard-code it. This allows to
reduce problem space cardinality and makes tuning
easier.
And fix a MSVC warning while there:
warning C4804: '>>' : unsafe use of type 'bool' in operation
No functional change.
These two notions are very correlated. Since connected has the most
generality, it makes sense to generalize it to encompass what is
covered by candidate.
STC:
LLR: 4.03 (-2.94,2.94) [-3.00,1.00]
Total: 11970 W: 2577 L: 2379 D: 7014
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13194 W: 2389 L: 2255 D: 8550
bench 7328585
Now that half plies have been removed from the engine, we can encode
TT depth into an int8_t.
Range is -128 to +127, so it goes still further than the previous
limit of 121 plies (with ONE_PLY == 2 where depth - DEPTH_NONE was
encoded as an uint8_t).
No functional change.
Resolved#60
Now that half-plies are no more used we can simplify
the code assuming that ONE_PLY is 1 and no more 2.
Verified with a SMP test:
LLR: 2.95 (-2.94,2.94) [-4.50,0.00]
Total: 8926 W: 1712 L: 1607 D: 5607
No functional change.
On top of previous patch, rename time variables to
reflect the simplification of UCI parameters.
It is more correct to use as varibales directly the
corresponding UCI option, without intorducing redundant
intermediate variables.
This allows also to simplify the code.
No functional change.
In case of a succesful late join we set again
'searching' flag, so we can restart search
immediately without an useless lock/unlock
cycle.
No functional change.
Now "Write Search Log" will pring moves in UCI format, consistent with all the rest. This functionality is
not aimed at end-users anyway. It's hardly useful at all, in fact. Also, pretty-printing SAN moves is
something that better belongs in the GUI than in the engine.
No functional change.
Where they better belong.
Also, this removes '#include <string>' from types.h, which reduces the amount of code to compile (every
translation unit includes types.h).
No functional change.
Unify various perft functions and move all the code
to search.cpp.
Avoid perft implementation to be splitted between
benchmark.cpp (where it has no reason to be) and
search.cpp
No functional and no speed change (tested).
So that one can redirect cout to /dev/null and only print print cerr in the terminal (for more accurate speed
tests).
Suggested by Marco.
No functional change.
The compiler tries to cast Options["Hash"] into a string, using:
Option::operator std::string() const {
assert(type == "string");
return currentValue;
}
And, as expected, the assert() fails.
std::to_string() would be the right solution, but it's C++11. And using a stringstream is too much code to
achieve so little. Let's keep it the way it was: hardcoded (ie. default hash defined in two places).
No functional change.
The eval already returns zero in KK, KBK, KNK (see material.cpp). The difference is:
- we lose the "TB pruning" benefit of the draw rule (ie. search goes on even if eval is zero)
- we gain some speed by removing a useless test from the hot path
STC:
LLR: 0.05 (-2.94,2.94) [-3.00,1.00]
Total: 128000 W: 21357 L: 21560 D: 85083
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33023 W: 4613 L: 4509 D: 23901
bench 7461881
First, remove some dead code (function never called with a Move argument).
Then, remove printing of legal moves, which does not belong here. Let's keep commands orthogonal and minimal:
- the "d" command should display the board, nothing more, or less.
- "perft 1" will display the list of legal moves.
No functional change.
Stockfish allocates the default hash (32MB) in main(), before entering UCI::loop(). If there is not enough
memory, the program will crash even before UCI::loop() is entered and the GUI is given a change to specify a
lower Hash value.
This defective design could be resolved by doing a lazy allocation upon "isready" command, as the UCI protocol
guarantees that "isready" will be sent at least once before any search. But it's a bit cumbersome when using
Stockfish "manually" to have to remember to type "isready" everytime.
So leave the current design, but reduce the default hash to 16MB instread of 32MB. In order to perform such
quick searches (depth=13), there is no reason to use so much Hash anyway. Another benefit is to introduce a
bit of hash pressure in bench, which increases chances to detect rare bugs related to TT replacement, for
example.
This is not a functional change, although it obviously changes the bench.
bench 7461879
Instead of defining it both in ucioption.cpp and benchmark.cpp. Obviously changing the default Hash will
change the bench as a result.
No functional change.
The main purpose of perft is to help debugging. But without the breakdown in sum of perft(N-1), it is a
completely useless debugging tool.
So perft now displays the breakdown, and divide is therefore removed.
No functional change.
After commit 94b1bbb68b, in case available root moves are less than multiPV, we
could never reach condition:
PVIdx + 1 == multiPV
and as a consequence UCI output is not printed.
Fixed suggested by Joerg Oster.
No functional change.
Despite being neutral at STC, it turned out to be regressive at LTC:
40k games at LTC with Hash=8
ELO: -2.06 +-1.9 (95%) LOS: 1.4%
Total: 39720 W: 5740 L: 5976 D: 28004
40k games at LTC with Hash=128
ELO: -2.69 +-1.9 (95%) LOS: 0.2%
Total: 39149 W: 5702 L: 6005 D: 27442
bench 7477963
Also raise the admissible bounds to (-100,100), as there is no reason to prevent users from using high
values if they want to.
Does not regress in self play:
ELO: 0.10 +-2.0 (95%) LOS: 53.7%
Total: 40000 W: 7084 L: 7073 D: 25843
master vs SF 3
ELO: 182.86 +-2.7 (95%) LOS: 100.0%
Total: 40000 W: 21843 L: 2541 D: 15616
Contempt = 20 vs SF 3
ELO: 189.25 +-2.8 (95%) LOS: 100.0%
Total: 40000 W: 22721 L: 2859 D: 14420
Diff is therefore 6.4 +/- 3.9 elo against a 180-190 elo weaker engine, which is significantly positive,
as expected. This elo difference is likely understated, because of FishTest aggressive draw adjudication
though.
We could push Contempt further, but after 20cp, it would get in the way of FishTest draw adjudication
rule, and is likely to reduce the testing throughput as a result.
bench 8198667
Steamline a bit the implementation of
skill levels. As a side effect we can
retire MultiPV global and use a local
variable instead.
No functional change.
- Currently broken
- Never been really useful
- Does not work well with new splitting model
Verified for no regression at STC with 3 threads:
LLR: 2.96 (-2.94,2.94) [-6.00,0.00]
Total: 81905 W: 12122 L: 12381 D: 57402
No functional change
Bitboard init code is already noteasy to follow,
so don't make it even harder using 'smart' code.
Also reindent a while loop in standard way.
No functional change.
With Eelco's patch "Don't special case for abs(beta) >= VALUE_MATE_IN_MAX_PLY" condition "abs(ttValue) < VALUE_KNOWN_WIN" has been removed from singular extension search, and condition "abs(beta) < VALUE_KNOWN_WIN" was added to the SingularExtensionNode definition.
This might lead to problems, especially in positions, where a mate is due.
For example, this position 5rk1/4K1pp/8/5PPP/8/8/8/1R6 w - - 12 1 triggers an assert.
stockfish: search.cpp:434: Value {anonymous}::search(Position&, Search::Stack*, Value, Value, Depth, bool) [with {anonymous}::NodeType NT = (<unnamed>::NodeType)2u; bool SpNode = false]: Assertion `-VALUE_INFINITE <= alpha && alpha < beta && beta <= VALUE_INFINITE' failed.
So let's re-insert the removed condition.
First spotted by Uri Blass, fix by me.
Bench: 8759675
* /boot/common was removed from Haiku
* The equivalent path now that package management has been implemented is /boot/system/non-packaged
No functional change
Bench: 8759681
The assert:
assert(ttValue != VALUE_NONE);
Could fire for multiple reasons (although is very rare),
for instance after an IID we can have ttMove != MOVE_NONE
while ttValue is still set at VALUE_NONE.
But not only this, actually SMP is a source of corrupted
ttValue and anyhow we can detect the condition:
ttMove != MOVE_NONE && ttValue == VALUE_NONE
even north of IID.
Reported by Ronald de Man.
It is so rare that bench didn't change.
bench: 7710548
Remove from the search this special case and apply
null search and razoring also in mate positions.
Tested in no-regression mode and passed both
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65431 W: 10860 L: 10810 D: 43761
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34928 W: 4814 L: 4713 D: 25401
This patch kicks in only in mate positions and in
these cases it seems beneficial in finding mates
faster as Yery Spark measured on the Chest mate suite:
Total number of positions 6425
Fixed nodes 200K per position
master: 1049
new: 1154
And also the 5446 'hard' positions again with 2000K nodes
(those not found by both engines in 200K nodes):
master: 1069
new: 1395
bench: 7710548
It seems this flag is only for gcc and
yields a warning under OSX Mavericks:
clang: warning: argument unused during compilation: '-ansi'
No functional change.
Here MSVC is worried that
StepAttacksBB[PAWN][psq]
could overflow, so change psq initialization
to clarify psq is always less than 64.
No functional change.
Don't take the split lock if we don't have
available slaves (about 30-40% of times).
This new condition allows to retire the now
redundant one on number of threads.
No functional change.
Split previous patch in 2 steps: first remove
the MOVE_NULL hack, then retire nullChild.
The first step is a prerequisite
for second one and affects bench.
The second step (next patch) just removes nullChild
without affecting bench.
bench: 8205159
Are broken for big-endian case and
I have verified with MSVC 2013 Premium
bench is correct and there is no
miscompilation, so the main reason
to change the original code drops.
No functional change.
Another attempt at retiring current asymmetric
king evaluation and use a much simpler symmetric
one. As a good side effect we can avoid recalculating
eval after a null move.
Tested in no-regression mode and passed
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21580 W: 3752 L: 3632 D: 14196
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18253 W: 2593 L: 2469 D: 13191
And a LTC regression test against SF DD to
verify we don't have regression against
weaker engines due to some kind of 'contempt'
effect:
ELO: 54.69 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 11072 L: 4827 D: 24101
bench: 8205159
Before it was working by accident in case of
see_sign() and failing with see() due to how
castle moves are coded (king captures the rook).
Better to explicitly filter out castling moves
and use see() without any surprise/trick.
No functional case.
Book handling belongs to GUI, we kept this code
for historical reasons, but nowdays there is
really no need of this old, (mostly) unused
and especially incorrect designed functionality.
It is up to the GUI to choose the book (far easier for
the user) and to select the book parameters. In no
place, including fishtest, TCEC, rating lists, etc.
the "own book" is used, moreover currently SF is
released without any book and even if in the future we
bundle a book in the release package, it will be the GUI
that will take care of it.
This corrects a wrong design decision that Galurung
and later Stockfish inherited from what was common
practice many yeas ago.
No functional change.
There is really little that user can achieve (apart
from a weakened engine) tweaking these parameters
that are already tuned and have no immediate or visible
effect.
So better do not expose them to the user and avoid the
typical "What is the best setup for my machine?" kind of
question (by far the most common, by far the most useless).
No functional change.
To show perft numbers for each move. Just
use 'divide' instead of 'perft', for instance:
position startpos moves e2e4 e7e5
divide 4
Inspired by Ronald de Man.
No functional change.
Retire current asymmetric king evaluation
and use a much simpler symmetric one.
As a side effect retire the infamous
'Aggressiveness' and 'Cowardice' UCI
options.
Tested in no-regression mode,
Passed both STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33855 W: 5863 L: 5764 D: 22228
And LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40571 W: 5852 L: 5760 D: 28959
bench: 8321835
At root we start counting plies from 1,
instead pv[] array starts from 0. So
the variable 'ply' we use in extract_pv_from_tt
to index pv[] is misnamed, indeed it is
not the real ply, but ply-1.
The fix is to leave ply name in extract_pv_from_tt
but assign it the correct start value and
consequentely change all the references to pv[].
Instead in insert_pv_in_tt it's simpler to rename
the misnamed 'ply' in 'idx'.
The off-by-one bug was unhidden when trying to use
'ply' for what it should have been, for instance in
this position:
position fen 8/6R1/8/3k4/8/8/8/2K5 w - - 0 1
at depth 24 mate line is erroneusly truncated due
to value_from_tt() using the wrong ply.
Spotted by Ronald de Man.
bench: 8732553
If razoring conditions are satisfied and
depth is low, then directly drop in qsearch.
Passed both STC
LLR: 2.98 (-2.94,2.94) [-1.50,4.50]
Total: 12914 W: 2345 L: 2208 D: 8361
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 50600 W: 7548 L: 7230 D: 35822
bench: 8739659
When there aren't legal moves after
a search, instead of returning imediately,
save bestValue in TT as in the usual case.
There is really no reason to special case
this one.
With this patch is fully fixed (again) follwing
position:
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Also in SMP case.
bench: 8802105
After last Joona's patch there is no measurable
difference between the option set or unset.
Tested by Andreas Strangmüller with 16 threads
on his Dual Opteron 6376.
After 5000 games at 15+0.05 the result is:
1 Stockfish_14050822_T16_on : 3003 5000 (+849,=3396,-755), 50.9 %
2 Stockfish_14050822_T16_off : 2997 5000 (+755,=3396,-849), 49.1 %
bench: 880215
Instead of waiting to be allocated, actively search
for another split point to join when finishes its
search. Also modify split conditions.
This patch has been tested with 7 threads SMP and
passed both STC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2885 W: 519 L: 410 D: 1956
And a reduced-LTC at 25+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 4401 W: 684 L: 566 D: 3151
Was then retested against regression in 3 thread case
at standard LTC of 60+0.05:
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 40809 W: 5446 L: 5406 D: 29957
bench: 8802105
On a final fixed game number test it failed
to prove better than standard version.
STC 15+0.05
ELO: -0.86 +-1.7 (95%) LOS: 15.8%
Total: 57578 W: 10070 L: 10213 D: 37295
bench: 8802105
Unfortunatly we have a slow down that causes
a regression in STC with no-regression mode:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22454 W: 3836 L: 4029 D: 14589
bench: 8678654
The bug was found to be elsewhere. This version
is correct and also is able to detect as draw
positions like:
8/8/5b2/8/8/4k1p1/6P1/5K2 b - - 6 133
bench: 8678654
Position is win also if strong side has a bishop
and a knight (plus other material, otherwise
KBNK would be triggered instead of KXK).
This fixes a subtle bug where a search on position
k7/8/8/8/8/P7/PB6/K7 b - - 6 1
Instead of returning a draw score, suddendly returns
a big score. This happens because at one point in
search we reach this position:
8/Pk6/8/8/8/4B3/P7/K7 w - - 3 8
Where white can promote. In case of rook promotion (and also in case of
queen promotion) the resutling position gets a huge static eval that is
above VALUE_KNOWN_WIN (from the point of view of white). So for rook
promotion it is
&& futilityBase > -VALUE_KNOWN_WIN
that prevents futility pruning in qsearch. (Removing this condition indeed
lets the problem occur). Raising the static eval for K+B+N+X v K to a value
higher than VALUE_KNOWN_WIN fixes this particular problem without having to
introduce an extra futility pruning condition in qsearch.
I just checked and it seems K+R v K, K+2B v K and even K+B+N v K already get
a huge static eval. Why not K+B+N+P v K?
I think this fix corrects an oversight. There is special code for KBNK, but
KBNXK is handled by KXK, so the test for sufficient material should also test
for B+N.
bench: 8678654
In the (rare) cases when the two conditions
are true, then fully check again with a slow
but correct MoveList<LEGAL>(pos).size().
This is able to detect false positives like
this one:
8/8/8/Q7/5k1p/5P2/4KP2/8 b - - 0 17
When we have a possible simple pawn push that
is not stored in attacks[] array. Because the
third condition triggers very rarely, even if
it is slow, it does not alters in a measurable
way the average speed of the engine.
bench: 8678654
Currently a stealmate position is misevaluated
in a negative/positive score, this leads qsearch(),
that does not detects stealmates too, to return the
wrong score and this yields to some kind of endgames
to be completely misevaluated.
With this patch is fully fixed follwing position
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Also in SMP case.
Correct root cause analysys by Ronald de Man.
bench: 8678654
After reverting to the original Tord's
endgame, a search on position
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Reports, correctly, a draw score instead of
an advantage for white.
Issue reported by Uri Blass.
bench: 8678654
Remove the optimization for Intel, is not
standard and can break at any time, moreover
our release build is not done with Intel C++
anymore so we don't need to sqeeze the extra
speed out from this compiler.
No functional change.
If we return from split with a stale value
due to a stop or a cutoff upstream occurred,
then we exit moves loop and save a stale value
in TT before returning search().
This patch, from Joona, fixes this.
bench: 8678654
We can never have bestValue == -VALUE_INFINITE at
the end of move loop because if no legal move exists
we detect it with previous condition on !moveCount,
if a legal move exists we never prune it due to
futility pruning condition:
bestValue > VALUE_MATED_IN_MAX_PLY
So this code never executes, as I have also verified
directly.
Issue reported by Joona.
No functional change.
Intel compiler is very picky:
"error: this operation on an enumerated type requires an
applicable user-defined operator function"
Reported by Tony Gaor.
No functional change.
Put the division at the end to reduce
rounding errors. This alters the bench
due to different rounding errors, but
should not alter ELO in any way.
bench: 7615217
This apparentely silly tweak allows
to speed up the bench by almost 3%.
Not clear why, repeating with perft,
the speed up vanishes.
Suggested by Jonathan Calovski.
No functional change.
Believed to be a speed optimization as benched
on Windows with bench realtime affinity 0x1 deleting
highest and lowest runs:
Base Test
1549259 1608202
1538115 1583934
1543168 1556938
1536365 1554179
1533026 1582010
Signature remains unchanged and gives anywhere from 1-2% nps
boost in analysis depending on number of cores used.
No functional change.
This is a very discussed patch with many
argumentations pro and against. The fact is
it passed both STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16305 W: 3001 L: 2855 D: 10449
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 34273 W: 5180 L: 4931 D: 24162
Although it is true that a correct test should
include foreign engines, we commit it anyhow so
people can test it out in the wild, under broader
conditions.
bench: 7384368
Idea from Lyudmil Tsvetkov.
The value seems to be raised a bit abruptly, but as
Gary said, a blocked pawn on the sixth rank has been
instrumental in limiting king mobility in multiple
losses that I've seen from SF. A blocked pawn on fifth
rank is much less serious on the king safety impact.
Passed both STC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14551 W: 2750 L: 2607 D: 9194
and LTC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 43595 W: 6917 L: 6618 D: 30060
And even a retest at 60" fixed games 40K
ELO: 1.79 +-1.9 (95%) LOS: 97.0%
Total: 39889 W: 6018 L: 5813 D: 28058
bench: 7154916
Adding BMI1 allows the compiler to use _blsr_u64
automatically (the advertised 0.3% speed gain).
I verified that the compiler does not use this
instruction with the -mbmi2 flag only. Also, all
processors supporting BMI2 is also supporting BMI1.
No functional change
Prefer
file_of(s) < file_of(ksq)
to the inidrect
file_of(ksq) < FILE_E
To evaluate if semiopen side to check is the left side.
Also other small touches while there.
No functional change.
Right now the Makefile is cluttered with OS X equivalents
of all the x86 targets. We can get rid of all of them and
just check UNAME against "Darwin" for the few OS X-specific
things we need to do.
We also disable Clang LTO when using BMI2 instructions. For
some reason, LLVM cannot find the PEXT instruction when using
LTO. I don't know why, but disabling LTO for BMI2 fixes it.
No functional change.
Intel Haswell and newer CPUs can calculate sliders
attacks using special PEXT asm instructions instead
of magic bitboards. This gives a +3% speed up.
To enable it just compile with ARCH=x86-64-bmi2
No functional change.
Retire software pext and introduce hardware
call when USE_PEXT is defined during compilation.
This is a full complete implementation of sliding
attacks using PEXT.
No functional change.
Reshuffle functions to define them in reverse
calling order (C style).
This allow us to define templates before they are
used. Currently it is not like this, for instance
evaluate_pieces is defined after do_evaluate that
calls it. This happens to work for some strange
reason (two phase lookup?) but we want to avoid
code that works 'by magic'.
As a nice side-effect we can now remove the function
prototypes.
No functional change.
This is more consistent with what other engines are doing.
Often people thinks that SF's scores are overblown. In the
end, it just boils down to the arbitrary way of rescaling them.
No functional change.
I have noticed that increasing the bench depth produces
progressively smaller and slightly faster executables at
the cost of longer compile times. Also using bench "time"
instead of "depth" seems to produce slightly smaller/faster
executables given comparable compile times.
I have made a new Makefile that generates smaller and
about 1% to 2% faster profile executables at only a
little extra compile time. On my mobile 2GHz i7 a
full profile build time goes from 3'48" to 4'13" and
the exe goes down by 5% from 416,310 bytes to 395,567
bytes.
No functional change.
Small simplification.
Passed SPRT(-3,1) both at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17051 W: 3132 L: 3005 D: 10914
and LTC:
LLR: 4.55 (-2.94,2.94) [-3.00,1.00]
Total: 24890 W: 3842 L: 3646 D: 17402
The rationale behind this is that I've never managed to add a
Queen on 7th rank bonus in DiscoCheck, because it never showed
to be positive (evne slightly) in testing. The only thing that
worked is Rook on 7th rank.
In terms of SF code, it seemed natural to group it with QueenOnPawn
as well as those are done together. I know you're against groupping
in general, but when it comes to non regression test, you are being
more conservative by groupping. If the group passes SPRT(-3,1) it's
safer to commit, than test every component in SPRT(-3,1) and end up
with the risk of commiting several -1 elo regression instead of just
one -1 elo regression.
In chess terms, perhaps it's just easier to manouver a Queen (which
can more also diagonaly) than a Rook. Therefore you can let the search
do its job without needing eval ad-hoc terms to guide it. For the Rook
which takes more moves to manouver such eval terms can be (marginally)
useful.
bench: 7473314
We chose this instead of negamax sign convention
(ie. from the point of view of the side to move)
because it is more in line to how the eval
table is presented.
Also some tweak to formatting while there.
No functional change.
In some legal positions like this one:
R6R/3Q4/1Q4Q1/4Q3/2Q4Q/Q4Q2/Np1Q4/kB1N1KB1 b -- 0 1
We can have a very high score, in this case 30177 and 29267
for midgame and endgame respectively, and because
VALUE_INFINITE = 30001 we have an assert in interpolate()
Midgame and endgame scores are stored in 16 bit signed integers
so we can rise VALUE_INFINITE a little bit. This does not fix
the possibility of overflow in general case, just makes the
condition more difficult to trigger and anyhow better uses all
the score width.
Raising VALUE_INFINITE to 32000 seems to fix the problem for this
particular case.
No functional change.
Tested directly at LTC because previous long
test series on this topic shows it is TC dependant.
Tested with no-regression mode because gets rid of
an ugly and ad-hoc rule.
Test at LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 67918 W: 10590 L: 10541 D: 46787
bench: 7926803
Here the new idea is to link pinned pieces
with king safety.
Passed both STC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 10047 W: 1867 L: 1737 D: 6443
And LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 10419 W: 1692 L: 1543 D: 7184
bench: 8325087
We add a penalty for each pawn which is not protected by another pawn
of the same color.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 12107 W: 2411 L: 2272 D: 7424
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 9204 W: 1605 L: 1458 D: 6141
bench: 7682173
We want all the UCI options are printed in the order in which are
assigned, so we use an index that, depending on Options.size(),
increases after each option is added to the map. The problem is
that, for instance, in the first assignment:
o["Write Debug Log"] = Option(false, on_logger);
Options.size() can value 0 or 1 according if the l-value (that
increments the size) has been evaluated after or before the
r-value (that uses the size value).
The culprit is that assignment operator in C++ is not a
sequence point:
http://en.wikipedia.org/wiki/Sequence_point
(Note: to be nitpick here we actually use std::map::operator=()
that being a function can evaluate its arguments in any order)
So there is no guarantee on what term is evaluated first and
behavior is undefined by standard in this case. The net result
is that in case r-value is evaluated after l-value the last
idx is not size() - 1, but size() and in the printing loop
we miss the last option!
Bug was there since ages but only recently has been exposed by
the removal of UCI_Analyze option so that the last one becomes
UCI_Chess960 and when it is missing engine cannot play anymore
Chess960.
The fix is trivial (although a bit hacky): just increase the
last loop index.
Reported by Eric Mullins that found it on an ARM and MIPS
platforms with gcc 4.7
No functional change.
Thanks to std::bitset we can easily increase
the limit of active threads above 64.
Thanks to Lucas Braesch for pointing at the
correct solution of using std::bitset.
No functional change.
Using memset on a std::vector is undefined behavior,
so manually init all the data memebers of LimitsType.
Bug intorduced in 41641e3b1e
No functional change.
Because we test for available slaves before
entering split(), we almost always allocate a
slave, only in the rare case of a race (less
then 2% of cases) this is not true, but to
special case this occurrence is not worth
the added complexity.
bench: 7451319
Split delta value in aspiration window so that when
search depth is less than 24 a smaller delta value
is used. The idea is that the search is likely to
be more accurate at lower depths and so we can exclude
more possibilities, 25% to be exact.
Passed STC
LLR: 2.96 (-2.94, 2.94) [-1.50, 4.50]
Total: 20430 W: 3775 L: 3618 D: 13037
And LTC
LLR: 2.96 (-2.94, 2.94) [0.00, 6.00]
Total: 5032 W: 839 L: 715 D: 3478
Bench: 7451319
During endgame initialization we get the material
hash key of each endgame forging and ad-hoc position
that in same cases is illegal (leaves teh king under
capture). This is not a problem for the material key,
but rises an assert when SF is run in debug mode with
'testKingCapture' set in pos_is_ok().
So rewrite the code to always produce legal positions.
No functional change.
Big simplification of pawn move check.
Code has been tested with a brute force approach: for
every position reached during a bench search, the function
has been called for each combinations of Move(from, to)
and verified the result is the same of old code.
Actually this function is very critical becuase is the
one that ensures corrupted TT moves are discarded, so
to properly test it a simple bench is not enough.
Verified also speed is not changed.
No functional chnage.
It has been obsoleted out already some time ago
and currently there is no point in changing eval
score according to if we are in game or analyzing.
So retire the option.
No functional change.
Try to avoid repetition draws at early midgame,
this should give an edge against weaker opponents
and reduce draw rate.
Tested for regressions with SPRT[-3, 1] and
passed both short TC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 68498 W: 12928 L: 12891 D: 42679
And long TC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40212 W: 6386 L: 6295 D: 27531
bench: 7990513
When running the following position:
8/kPp5/2P3p1/p1P1p1P1/2PpPp2/3p1p2/3P1P2/5K2 w - - 0 1
An assert is raised at depth 92:
assert(-VALUE_INFINITE <= alpha && alpha < beta && beta <= VALUE_INFINITE);
This is because it happens that beta = 29832,
so rbeta = 30032 that is > VALUE_INFINITE
Bug spotted and analyzed by Uri, fix suggested by Joerg.
Other fixes where possible but this one is pointed
exactly at the source of the bug, so it is the best
from a code documentation point of view.
bench: 8430785
Under some very rare case 100 plies of search
could be not enough. Increasing more could lead
to crashes due to reached stack size limit on
some platforms.
Strongly requested by Uri.
bench: 8430785
Currently king has no material key associated because
it can never happen to find a legal position without
both kings, so there is no need to keep track of it.
The consequence is that a position with only the two
kings has material key set at zero and if the material
hash table is empty any entry will match and this is
wrong.
Normally bug is hidden becuase the checking for a draw
with pos.is_draw() is done earlier than evaluate() call,
so that we never check in gameplay the material key of a
position with two kings.
Nevertheless the bug is there and can be reproduced setting
at startup a position with only two kings and typing
'eval' from prompt.
The fix is very simple: add a random key also for the king.
Also fixed the condition in material.cpp to avoid asserting
when a 'just 2 kings' postion is evaluated.
No functional change.
Actually MultiCut is too different from current scheme.
Note that neither ProbCut is exactly what we do because
we try just a handful of captures instead of all moves,
nevertheless it seems more in line with what we do.
Suggested by Joona.
No functional change.
Makes more sense than returning a draw score. Tested
with reduced MAX_PLY = 30 and passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17434 W: 3345 L: 3194 D: 10895
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 2610 W: 488 L: 373 D: 1749
With current limit of MAX_PLY = 100 the patch should not
introduce any measurable change, nevertheless is the correct
approach.
Idea of returning eval is from Michel Van den Bergh.
bench: 8430785
Fix small overflow error while converting
magic boosters from right rotate to left rotate,
in particular booster 38 was converted to 4122
instead of the corrcet value 26.
Formula used was:
s1 = original & 63, s2 = (original >> 6) & 63;
new = (64 - s1) | ((64 - s2) << 6);
Instead of:
s1 = original & 63, s2 = (original >> 6) & 63;
new = ((64 - s1) & 63) | (((64 - s2) & 63) << 6);
This has no impact in number of cycles needed, but
just in the resultig number that yields to a rotate
amount bigger than 63.
Spotted by Ehsan Rashid.
No functional change.
Latest master triggers a compiler warning due
to comparing int64_t to uint64_t.
notation.cpp: In Funktion »std::string pretty_pv(Position&, int, Value, int64_t, Move*)«:
notation.cpp:230:30: Warnung: Vergleich zwischen vorzeichenbehafteten und vorzeichenlosen Ganzzahlausdrücken [-Wsign-compare]
This patch should fix it.
No functional change.
When initializing the magic numbers used to
compute sliding attacks, we endless generate a
random and test it as a possible magic.
In the general case this takes a lot of iterations,
but here, insteaad of picking a casual random, we
rotate it a couple of times and generate a number that
we know has a good probability to be a magic candidate.
This is becuase the quantities by which we rotate the
number are known in advance to produce quickly a good
canidate.
The patch, inspired by DON, just moves the shuffle to RKISS
changing the boosters to take in account a left rotation
instead of a right rotation as in the original.
No functional change.
Although does not change ELO level, it seems
verification is useful in many zugzwang positions
as reported by many sources.
So revert this simplification.
bench: 8430785
Another SEE speed up that passed the SPRT short TC test!
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 81337 W: 15060 L: 14745 D: 51532
No functional change.
Actually race conditions do exist in an engine, just
think for a moment to TT concurrent access. Racy code
is not a problem per se, if the consequences are well
known and correctly handled.
In case of TT access we ensure that the TT move is validated
before to be tried, here we just retry the same move in less
that 1 case out of a million: this is totally harmless considering
that very probably the second time the move is tried we get
immediately a TT hit and search quickly returns.
So we simplify the code for no harm.
No fuctional change (in single thread case)
Tested with SPRT in simplification mode [-4.00,0.00],
this ensures that the patch is (very probably) not
a regression.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 27543 W: 4278 L: 4209 D: 19056
And long TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 39483 W: 7325 L: 7305 D: 24853
bench: 8347121
Hopefully this patch makes the code more:
* Self-documenting: Null search is always a zero window search,
because it is testing for a fail high. It should never be done
on a full window! The current code only works because we don't
do it at PV nodes, and therefore (alpha, beta) = (beta-1, beta):
that's the kind of "clever" trick we should avoid.
* Idiot-proof: If we want to enable null search at PV nodes, all we
need to do now is comment out the !PvNode condition. It's that simple!
In theory, null search should not be done at PV nodes, because PV nodes
should never fail high. But in practice, they DO fail high, because of
aspiration windows, and search inconsistencies, for example. So it makes
sense to keep that flexibility in the code.
No functional change.
Use ralpha instead of rbeta
* rbeta is confusing people. It took THREE attempts to code razoring
at PV nodes correctly in a recent test, because of the rbeta trick.
Unnecessary tricks should be avoided.
* The more correct and self-documenting way of doing this, is to say
that we use a zero window around alpha-margin, not beta-margin.
The fact that, because we only do it at PV nodes, alpha happens to be
beta-1 and that the current stuff with rbeta works, may be correct,
but is confusing.
Remove the misleading and partially erroneous comment about returning
v + margin:
* comments should explain what the code does, not what it could have done.
* this comment is partially wrong in saying that v+margin is "logical",
and that it is "surprising" that is doesn't work.
From a theoretical perspective, at least 3 ways of doing this are equally
defendable:
1/ fail hard: return alpha: The most conservative. We bet that the search
will fail low, but we don't know by how much and don't want to take risks.
2/ aggressive fail soft: return v (what the current code does). This
corresponds to normal fail soft, with the added assumption that we don't
care about the reduction effect (see below point 3/)
3/ conservative fail soft: return v + margin. If the reduced search (qsearch)
gives us a score <= v, we bet that the non reduced search will give us a
score <= v + margin.
* Saying that 2/ is "logical" implies that 1/ and 3/ are not, which is
arguably wrong. Besides, experimental results tell us that 2/ beats 3/,
and that's not something we can argue against: experimental results are
the only trusted metric.
* Also, with the benefit of hindsight, I don't think the fact that 2/ is
better than 3/ is surprising at all. The point is that it is YOUR turn to
move, and you are assuming that by NOT playing (and letting the opponent
capture your hanging pieces in QS) you cannot generally GAIN razor_margin(depth).
No functional change.
With some positions like
8/8/8/2p2K2/1pp5/br1p1b2/2p2r2/qqkqq3 w - -
The eval score is higher than VALUE_INFINITE because
is the sum of VALUE_KNOWN_WIN plus a big material
advantage. This leads to an assert. Here are the
steps to reproduce:
Compile SF with debug=yes then do
./stockfish
position fen 8/8/8/2p2K2/1pp5/br1p1b2/2p2r2/qqkqq3 w - -
go depth 1
This patch fixes the issue in this case, but do exsist
other positions for which the patch is not enough and
we will need to limit the eval score to be sure not
overflow the limit.
Note that is not possible to increase the value of
VALUE_INFINITE because should remain within int16_t
type to be stored in a TT entry.
bench: 7356053
Depth is already dependent on the actual value
of ONE_PLY, in particular can be expressed like:
Depth = n * ONE_PLY
And because formula is used to calculate R that is
also dependent on the value of ONE_PLY and can be
expressed like:
R = x * ONE_PLY
We don't want to divide depth by a 'ply' value but
directly by an integer number.
Spotted by sf-x
No functional change.
Instead of a fixed reduction of ONE_PLY, now
Null move dynamic reduction based on value can
grow larger in case we are above beta of a value
much higher then PawnValueMg.
Note that now an eval returning VALUE_KNOWN_WIN
makes null search to drop in qsearch.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
And long TC:
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
bench: 7356053
Verified there are no hidden bugs and is
actually a speed optimization:
Fixed games at 15+0.05 TC
ELO: 1.72 +-2.9 (95%) LOS: 87.5%
Total: 20000 W: 3741 L: 3642 D: 12617
No functional change
When time remaining is less than Emergency Move Time,
we won't even complete one iteration and engine reports
a stale +M0 score.
To reproduce run "go wtime 10"
info depth 1 seldepth 1 score mate 0 upperbound nodes 2 nps 500 time 4 multipv 1 pv a2a3
info nodes 2 time 4
bestmove a2a3 ponder (none)
This patch fixes the issue.
Tested by Binky at very short TC: 0.05+0.05
ELO: 5.96 +-12.9 (95%) LOS: 81.7%
Total: 1458 W: 394 L: 369 D: 695
And at a bit longer TC:
ELO: 1.56 +-3.7 (95%) LOS: 79.8%
Total: 16511 W: 3983 L: 3909 D: 8619
bench: 7804908
TCEC season 3, which is due to start in a few weeks, just
had its server upgraded to 64GB RAM and will therefore allow
16GB hash to be used per engine.
This is almost the upper limit without changing the
type of size and hashMask. After this we need to
move to uint64_t instead of uint32_t.
No functional change.
Retire KmmKm evaluation function. Instead give a very drawish
scale factor when the material advantage is small and not much
material remains.
Retire NoPawnsSF array. Pawnless endgames without a bishop will
now be scored higher. Pawnless endgames with a bishop pair will
be scored lower. The effect of this is hopefully small.
Consistent results both at short TC (fixed games):
ELO: -0.00 +-2.1 (95%) LOS: 50.0%
Total: 40000 W: 7405 L: 7405 D: 25190
And long TC (fixed games):
ELO: 0.77 +-1.9 (95%) LOS: 78.7%
Total: 39690 W: 6179 L: 6091 D: 27420
bench: 7213723
When we have a fail-high of a quiet move, store it in
a Followupmoves table indexed by the previous move of
the same color (instead of immediate previous move as
is in countermoves case).
Then use this table for quiet moves ordering in the same
way we are already doing with countermoves.
These followup moves will be tried just after countermoves
and before remaining quiet moves.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10350 W: 1998 L: 1866 D: 6486
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 14066 W: 2303 L: 2137 D: 9626
bench: 7205153
This hacky rule allows to get an about right eval out of this position:
r2qk2r/ppp2p2/2npbn2/2b1p3/2P1P1P1/2NB1PPp/PPNP3K/R1BQ1R2 b kq - 0 13
And, more importantly, passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 6239 W: 1249 L: 1127 D: 3863
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 38371 W: 6165 L: 5888 D: 26318
bench: 8183238
As pointed out by Joona, Lucas and otehr people in
the forum, this endgame is not a known, there are many
positions where it takes more than 50 moves to claim the
win and becasue exact rules is not possible better to
retire and allow the search to workout the endgame for us.
bench: 8502826
While editing original Uri's messy patch
I have incorrectly simplified the logic
condition. Here is the correct original
version, as it was tested.
bench: 8502826
Remove the easy move code and add the condition to
play instantly if only one legal move is available.
Verified there is no regression at 60+0.05
ELO: 0.17 +-1.9 (95%) LOS: 57.0%
Total: 40000 W: 6397 L: 6377 D: 27226
bench: 8502826
If we are still at first move, without a fail-low and
current iteration is taking too long to complete then
stop the search.
Passed short TC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 26030 W: 4959 L: 4785 D: 16286
Long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18019 W: 2936 L: 2752 D: 12331
And performed well at 40/30
ELO: 4.33 +-2.8 (95%) LOS: 99.9%
Total: 20000 W: 3480 L: 3231 D: 13289
bench: 8502826
First tested with 50K games at very short TC of 5+0.05
ELO: 3.11 +-2.0 (95%) LOS: 99.9%
Total: 49665 W: 10941 L: 10497 D: 28227
Then retested with usual SPRT at short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16875 W: 3198 L: 3049 D: 10628
And at long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5890 W: 985 L: 857 D: 4048
bench: 7800379
In case ply is very high, function will round
to zero (although mathematically it is always
bigger than zero). On my system this happens at
movenumber 6661.
Although 6661 moves in a game is, of course,
probably impossible, for safety and to be locally
consistent makes sense to ensure returned value
is positive.
Non functional change.
Function move_importance() is already always
positive, so we don't need to add a constant
term to ensure it.
Becuase move_importance() is used to calculate
ratios of a linear combination (as explained in
previous patch), result is not affected. I have
also verified it directly.
No functional change.
Drop a useless parameter. This works because ratio1 and ratio2
are ratios of linear combinations of thisMoveImportance and
otherMovesImportance and so the yscale cancels out.
Therefore the values of ratio1 and ratio2 are independent
of yscale and yscale can be retired.
The same applies to yshift, but here we want to ensure
move_importance() > 0, so directly hard-code this safety
guard in function definition.
Actually there are some small differences due to rounding errors
and usually are at most few millisecond, that's means below 1% of
returned time, apart from very short intervals in which a difference
of just 1 msec can raise to 2-3% of total available time.
No functional change.
The flag raises also in case of a pawn duo, i.e.
when we have two adjacent pawns on the same rank,
and not only in case of a chain, i.e. when the two
pawns are on a diagonal line.
See this for a reference:
http://en.wikipedia.org/wiki/Connected_pawns
Renaming suggested by Ralph.
No functional change.
Use a skew-logistic function to replace the
MoveImportance[] array.
Verified it does not regress at fixed number
of games both at short TC:
LLR: -2.91 (-2.94,2.94) [-1.50,4.50]
Total: 39457 W: 7539 L: 7538 D: 24380
And long TC:
ELO: -0.49 +-1.9 (95%) LOS: 31.0%
Total: 39358 W: 6135 L: 6190 D: 27033
bench: 7335588
Passed short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 18331 W: 3608 L: 3453 D: 11270
And scored above 50% on a very long test in long TC
LLR: -2.97 (-2.94,2.94) [0.00,6.00]
Total: 51533 W: 8181 L: 8047 D: 35305
bench: 7335588
In endgame it's better to have pawns on both wings.
So give a bonus according to file distance between left
and right outermost pawns.
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 39073 W: 7749 L: 7536 D: 23788
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6149 W: 1040 L: 910 D: 4199
bench: 7665034
Reduce eval discontinuity becuase now we kick in
king safety evaluation in many more cases.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8708 W: 1742 L: 1613 D: 5353
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 6743 W: 1122 L: 990 D: 4631
bench: 6835416
Tighter lower bound for pawn attacks so to
activate king safety in some cases like here:
6k1/2B3p1/2Pp1p2/2nPp3/2Q1P2K/P2n1qP1/R6P/1R6 w
Original patch by Chris, further simplified by
Jörg Oster.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 30171 W: 5887 L: 5700 D: 18584
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 20706 W: 3402 L: 3204 D: 14100
bench: 7607562
Add a bonus according if the attacking
pieces are minor or major.
Passed both short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 13142 W: 2625 L: 2483 D: 8034
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18059 W: 3031 L: 2844 D: 12184
bench: 7425809
A great simplification that shows no regression
and it seems even a bit scalable.
Tested with fixed number of games:
Short TC
ELO: 0.60 +-2.1 (95%) LOS: 71.1%
Total: 39554 W: 7477 L: 7409 D: 24668
Long TC
ELO: 2.97 +-2.0 (95%) LOS: 99.8%
Total: 36424 W: 5894 L: 5583 D: 24947
bench: 8184352
Change updating rule after a TT hit to match
the same one at the end of the search.
Small change in functionality, but we want to
have uniform rules in the code.
bench: 7767864
We already update killers so it is natural to extend to
history and counter move too.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 52690 W: 9955 L: 9712 D: 33023
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 5555 W: 935 L: 808 D: 3812
bench: 7876473
After a fail high in LMR, if reduction is very high do
a research at lower depth before teh full depth one.
Chances are that the re-search will fail low and the
full depth one is skipped.
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 11363 W: 2204 L: 2069 D: 7090
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 7292 W: 1195 L: 1061 D: 5036
bench: 7869223
Since all ENPASSANT moves are now considered dangerous, this
change of order should give a slight speedup.
Also simplify futilityValue formula.
No functional change.
We avoid to use an ad-hoc table at the cost of a
relative_rank() call in advanced_pawn_push().
On my 32 bit system it is even slightly faster (on 64bit
may be different). This is the speed in nps alternating
old and new bench runs:
new
368890
368825
369972
old
367798
367635
368026
No functional change.
Instead of a passed pawn now we just require the pawn to
be in the opponent camp to be considered a dangerous
move. Added some renaming to reflect the change.
Passed both short TC test
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10358 W: 2033 L: 1900 D: 6425
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 21459 W: 3486 L: 3286 D: 14687
bench: 8322172
To align to same named Position function and
avoid using std::cout directly.
Also remove some stale <iostream> include while
there.
No functional change.
Add a Mac SSE4.2 target. Also change the Mac OS X minimum version to
10.6. Rationale: 97% of Macs run at least 10.6, version 10.9 is now
free, and using 10.6 as the minimum version gives a small 5% boost in
benchmark speed over versions using 10.0 as the minimum version.
Finally, enable Clang’s Link Time Optimization when compiling for the
Mac.
No functional change.
An old idea retested at SPRT(0, 3) with 60+0.05 TC:
LLR: 2.95 (-2.94,2.94) [0.00,3.00]
Total: 98872 W: 15549 L: 15123 D: 68200
This is a very small elo increase patch so it really
stresses the limits of fishtest.
bench: 8596156
It seems to intorduce a regression when tested
with 3 threads at 15+0.05:
ELO: -2.26 +-2.2 (95%) LOS: 2.4%
Total: 30000 W: 4813 L: 5008 D: 20179
bench: 8331357
Tested setting FakeSplit to true and running
./stockfish bench 128 2
There is a different signature with and without
the patch so it affects functionality but
only in SMP case.
bench: 8331357
SMP case is very tricky and raises an assert in stage_moves():
assert(stage == KILLERS_S1 || stage == QUIETS_1_S1 || stage == QUIETS_2_S1)
So rewrite the code to just return moves[] when we are sure
we are in quiet moves stages.
Also rename stage_moves to quiet_moves to reflect that.
No functional change (but needs testing in SMP case)
Use MovePicker moves[] to access already tried
quiet moves. A bit of care shall be taken
to avoid calling stage_moves() when we are still
at ttMove stage, because moves are yet to be
generated. Actually our staging move generation
makes this code a bit more tricky than what I'd
like, but removing an ausiliary redundant
array like quietsSearched[] is a good thing.
Idea by DiscoCheck
bench: 9355734
Use the newly introduced LineBB[] to simplify this
super hot-path function.
Verified with perft we don't have any speed regression, although
the number of squares removed is less than before in case of
contact check.
Insipred by DiscoCheck implementation.
Perft numbers are the same, but we have an harmless functional
change due to reorder of moves, because now some illegal moves
are no more detected at generation time, but in the search.
bench: 8331357
This seems a die hard idea :-)
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17485 W: 3307 L: 3156 D: 11022
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 38181 W: 6002 L: 5729 D: 26450
bench: 8659830
Actually, it is not used, as both arrays have the
same values. Some local tests in either direction
showed no improvement.
Also some minor corrections in the comments.
No functional change.
Previously some squares could be "incorrectly" awarded
to a pinned piece.
e.g. in 3k4/1q6/3b4/3Q4/8/5K2/B7/8 b - - 0 1 the black
bishop get 4 squares too many and the white queen gets 6.
Passed both short TC.
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4871 W: 934 L: 817 D: 3120
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 38968 W: 6113 L: 5837 D: 27018
bench: 9282549
But compensate by reducing rook and queen
value by 53 = (160 / 3)
Material imbalances are affected as follows:
Red. Major Rook Queen Total
QRR +160 -2*53 -53 +1
QR +160 -53 -53 +54
RR +160 -2*53 0 +54
R 0 -53 0 -53
Q 0 0 -53 -53
so that the imbalance changes by at most 54 + 53 = 107 units.
This corresponds to appromximately 3.5cp in the final evaluation.
Verified with fixed number 40000 games at both short
and long TC it does not regress.
Short TC 15+0.05
ELO: 1.93 +-2.1 (95%) LOS: 96.6%
Total: 40000 W: 7520 L: 7298 D: 25182
Long TC 60+0.05
ELO: -0.33 +-1.9 (95%) LOS: 36.5%
Total: 39663 W: 6067 L: 6105 D: 27491
bench: 6703846
As, Gary (that analyzed the bug) says:
SF does not print a PV when the original best move fails low,
we hit our time allowance, and stop the search. The output from
the SF search is below. It was failing low on Ne1 at depth 34.
Then, we get bestmove Qd3, but no PV change.
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info nodes 18235667001 time 969824
bestmove e2d3 ponder c8d7
Looking at the code, if we hit Signals.stop, we return from id_loop
before printing any PV. It is possible for us to have resorted the
RootMove list though, which will change the move that is actually
played.
No functional change.
1/ eval margin and gains removed:
16bit are now free on TT entries, due to the removal of eval margin. may be useful
in the future :) gains removed: use instead by Value(128). search() and qsearch()
are now consistent in this regard.
2/ futility_margin()
linear formula instead of complex (log(depth), movecount) formula.
3/ unify pre & post futility pruning
pre futility pruning used depth < 7 plies, while post futility pruning used
depth < 4 plies. Now it's always depth < 7.
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench 7243575
1/ eval margin and gains removed:
- gains removed by Value(128): search() and qsearch() now behave consistently!
2/ futility_margin()
- testing showed that there is no added value in this weird (log(depth), movecount)
formula, and a much simpler linear formula is just as good. In fact, it is most
likely better, as it is not yet optimally tuned.
- the new simplified formula also means we get rid of FutilityMargins[], its
initialization code, and more importantly ss->futilityMoveCount, and the hacky
code that updates it throughout the search().
- the current formula gives negative futility margins, and there is a hidden interaction
between the move coutn pruning formula and the futility margin one: what happens is
that MCP is supposed to be triggered before we use the non-sensical negative futility
margins.
3/ unify pre & post futility pruning
- pre futility pruning (what SF calls value based pruning) used depth < 7 plies,
while post futility pruning (what SF calls static null move pruning) used depth < 4 plies.
- also the condition depth < 7 in pre futility pruning was not obvious, and it seemd
to be depth < 16 (futility_margin() returns an infinite value when depth >= 7).
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 10206576
RedundantRook and RedundantQueen replaced by simple
variable RedundantMajor. Also the SameColor coefficient
for Queen<->Queen has been set by definition to 0.
The remaining 5 parameters:
LinearCoefficients[ROOK]
LinearCoefficients[QUEEN]
QuadraticCoefficientsSameColor[ROOK][ROOK]
QuadraticCoefficientsSameColor[QUEEN][ROOK]
RedundantMajor
are sufficient to equate the material imbalances for the
5 common material configurations of R, RR, Q, QR and QRR
to any desired values simultaneously.
With the chosen parameters there should be no functional
change unless one side has more than 2 rooks or more
than 1 queen. For example bench from the start position
using the commands:
./stockfish
go depth 16
produces identical output except for one extra node
in the last iteration.
bench: 8198094
Coefficients for Bishop<->BishopPair and Bishop<->Bishop
are also pretty much redundant. By altering the values
in LinearCoefficients[] these coefficients can be zeroed
without changing the imbalance calculations in any position
with less than 3 bishops for one side.
bench: 7995098
First coefficient in the SameColor array does an
equivalent job when folded into the LinearCoefficients
array.
All of the diagonal terms in the OppositeColor array
are redundant due to cancellation.
No functional change.
In case we find a very good move after a
troubled start, we don't return immediately
anymore.
Tested directly at long TC where it passed:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 13910 W: 2397 L: 2228 D: 9285
bench: 7995098
And remove a complex (and broken) formula.
Indeed previous code was broken in case of TC with big
time increments where available_time() was too similar
to total time yielding to many time losses, so for instance:
go wtime 2600 winc 2600
info nodes 4432770 time 2601 <-- time forfeit!
maximum search time = 2530 ms
available_time = 2300 ms
For a reference and further details see:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dCPAvQDcm2E
Speed tested with bench disabling timer alltogheter vs timer set at
max resolution, showed we have no speed regressions both in single
core and when using all physical cores.
No functional change.
A combo of two patches that failed SPRT with score
higher than 50% but togheter they succeed:
SPRT at 60+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 7312 W: 1276 L: 1139 D: 4897
bench: 8029334
If the game got late enough that move_importance(currentPly) * slowMover / 100
rounds to 0, then we ended up dividing 0 by 0 when only looking 1 move ahead.
This apparently caused the search to almost immediately abort and Stockfish
would blunder in long games. So convert thisMoveImportance to a double.
No functional change.
This seems more a material imbalance topic,
anyhow test is good and so patch is applied
as is.
Passed both short TC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 17391 W: 3548 L: 3393 D: 10450
And long TC:
LLR: 3.00 (-2.94,2.94) [0.00,6.00]
Total: 34660 W: 5972 L: 5700 D: 22988
bench: 8291883
Dumb down a bit the code and trade some possible
speed (but this is far from hot path anyhow) for
some added readability for the layman.
No functional change.
The normalising transformation is computed all at
once by the helper function get_flip_sq and then
applied immediately to the relevant squares as soon
as they are loaded from the position class.
bench: 8350690
New formula mathces the old formula until d = 45
Test code:
int main() {
for(int d=1; d<=45; d++)
{
int a = int(log(double(d * d) / 2) / log(2.0) + 1.001);
int b = int(2.9 * log(double(d)));
if (a != b) std::cout << d << std::endl;
}
return 0;
}
bench: 8455956
This is the first chain bonus version
from Ralph that also passed both
Short TC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23460 W: 4727 L: 4556 D: 14177
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 31858 W: 5497 L: 5240 D: 21121
And performed better against current
committed version, always at 60secs:
LLR: -2.94 (-2.94,2.94) [-3.00,3.00]
Total: 26301 W: 4477 L: 4580 D: 17244
This test was done by Leonid.
bench: 8455956
After 40K games at 60 secs, result is still
not clear, but not a regression against SF 4
After
ELO: 50.11 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 10547 L: 4817 D: 24636
Before
ELO: 49.51 +-2.1 (95%) LOS: 100.0%
Total: 40000 W: 10483 L: 4821 D: 24696
So re-apply the patch to avoid to
special-case this one.
bench: 7403882
It was never updated !
Currently it only affects evaluate_passed_pawns()
and in particularly the rule to increase the bonus
if we have more non-pawn pieces. We could simply use
popcount() instead and avoid the little slowdown
in put_piece() and remove_piece(), but this would
leave a very subtle and tricky hole where people
are forced to remember that pos.count<ALL_PIECES>()
does not work. This is not obvious and so dangerous.
Thanks to Ronald de Man for spotting this.
bench: 7931424
Due to a strange issue (bug?) the ternary
operator does not return a BitCountType for
icc, so revert to the expression.
The same patch was already applied in
9749f1f14c
Thanks to NssY Wanyonyi for pointing out
this.
No functional change.
This reverts commit 4bc2374450 for
two reasons.
First regression testing shows almost equal
score:
Before the patch:
ELO: 49.75 +-2.5 (95%) LOS: 100.0%
Total: 27205 W: 7113 L: 3244 D: 16848
After the patch:
ELO: 48.87 +-2.9 (95%) LOS: 100.0%
Total: 20860 W: 5478 L: 2563 D: 12819
Second, and more sensible to me, this patch
increases safe check bonuses to 4 times their
original value (!) and considering:
- Values were already well tuned
- Values are highly critical
- King safety is highly critical, very TC
dependent and very difficult to test
- Our testing coverage is partial (self-testing,
blitz times)
I think is better to be safe than sorry and so
I revert the patch.
bench: 8440524
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10466 W: 2087 L: 1953 D: 6426
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 26334 W: 4540 L: 4310 D: 17484
And also proved stronger than a slightly
different patch, also succesful against master:
https://github.com/mcostalba/Stockfish/commit/dc6830a3b4ed12
But losing against current one in a match
at 60secs with SPRT [-3, 3]:
LLR: -2.96 (-2.94,2.94) [-3.00,3.00]
Total: 44484 W: 7360 L: 7463 D: 29661
bench: 9160831
Also the drawing criteria has been slightly loosened.
It now detects a draw if the king is ahead of all the
pawns and on the same file or the adjacent file.
bench: 7700683
This lost position 8/8/3q4/8/5k2/2P1R3/2K2P2/8 w - - 0 1
was previously evaluated as a draw.
The king and rook need to be correctly placed with
respect to the _same_ pawn.
(Note also that the check for the pawn being on RANK_2
in the old version is redundant: it must be on RANK_2 if
it hopes to protect a rook on RANK_3)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Good both at short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 5448 W: 1133 L: 1012 D: 3303
And at long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40509 W: 6836 L: 6541 D: 27132
bench: 7700683
Don't need a struct here. Speed test shows
result is teh same. Moreover RKISS is used
mainly at startup to compute magics, so
prefer to keep it simple...RKISS ;-)
Also some assorted triviality while there.
No functional change.
These two changes go in opposite directions and it
seems that the combination is stronger than original.
Here are the positive tests at various TC:
15+0.05
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 24561 W: 4946 L: 4772 D: 14843
60+0.05
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 15259 W: 2598 L: 2423 D: 10238
40/30
LLR: 2.96 (-2.94,2.94) [-3.00,3.00]
Total: 2570 W: 527 L: 422 D: 1621
Unfortunately there is also a bad result
with one sec time increment that needs
to be further investigated:
12+1
LLR: -2.97 (-2.94,2.94) [-3.00,3.00]
Total: 2694 W: 438 L: 543 D: 1713
bench: 8340585
Rationale:
- Speed of double and float is about the same (not on the hot path anyway)
- Double makes code prettier (no need to write 1.0f, just 1.0)
- Only practical advantage of float is to use less memory, but since we never
store large arrays of double, we don't care.
No functional change.
Increase bench default depth from 12 to 13 and
add 15 new endgame positions to have broader
coverage and also more reliable nps calulcation
used for fishtest framework.
Due to the new endgame positions, where nps is higher,
the total nps is increased of about 15%.
Thanks to Lucas and Jörg for the suggestions.
No functional change, but bench number is now:
bench: 8336338
For some users -stack_size,0x4000 does not work,
so revert for now.
osX 10.6.8
gcc version 4.7.3 (MacPorts gcc47 4.7.3_2)
g++: error: unrecognized command line option '-stack_size,0x4000'
make[2]: *** [stockfish] Error 1
make[1]: *** [gcc-profile-make] Error 2
make: *** [profile-build] Error 2
No functional change.
This reverts commit 800410eef1 and instead increases
stack size.
I went through the old emails with Daylen that reported the
crash issue on Mac OS X and was fixed by 0049d3f337.
It was reported default stack size for a thread in Mac OS X is 8
megabytes while the patch that we are reverting allows to reduce
stack size at max of about 217KB, so the reason for the crash was
only marginal in MAX_MOVES value. On those emails Daylen also
hinted how to increase stack size for Mac OS X to 16MB.
So prefer to increase stack size to 16MB instad of re-inventing
the wheel and do our home grown stack as we did with the patch
that we are now reverting (it will remain anyhow in git history
for documentation purposes).
No functional change.
Unify extensions between PV and not PV nodes
and remove all but check extensions.
This is a simplification so tested at fixed number
of games where proved to not regress.
About 45k games at 15+0.05
ELO: 1.23 +-2.0 (95%) LOS: 88.5%
Total: 45643 W: 9107 L: 8946 D: 27590
About 45k games at 60+0.05
ELO: 1.07 +-1.8 (95%) LOS: 87.8%
Total: 46786 W: 7728 L: 7584 D: 31474
bench: 3172206
If the uci option 'Best Book Move' is set to true the lookup still
returns a move at random instead of the move with the highest
weight.
No functional change.
This should be enough for any legal position, even
the handcrafted ones, like the one presented by Reuven:
1Q5R/4Q1K1/B1Q5/B4Q2/N2Q4/pQ4Q1/pn2Q3/krQ4R w - -
Where currently we crash. This reverts the patch
0049d3f337 of 8/4/2012 where stack
was shrinked due to crashes while in deep analysys.
No functional change.
This greately reduces stack usage and is a
prerequisite for next patch.
Verified with 40K games both in single and SMP
case that there are no regressions.
No functional change.
This is an even safer setup proposed and tested
by Alexandre Meirelles.
Regression testing of 40K games at 10+0.05 show
result is stable both against current master:
ELO: -0.29 +-2.2 (95%) LOS: 39.7%
Total: 40000 W: 8010 L: 8043 D: 23947
and again original master (the one with smallest
time parameters):
ELO: 1.71 +-2.2 (95%) LOS: 93.8%
Total: 40000 W: 8325 L: 8128 D: 23547
Alexandre verified with LittleBlitzer time losses are
greately reduced with this setup:
Games Completed = 2100 of 3000 (Avg game length = 35.745 sec)
Settings = RR/128MB/15000ms+50ms/M 1000cp for 12 moves, D 150 moves/
Time = 39200 sec elapsed, 16800 sec remaining
1. Stockfish 190913 1091.5/2100 803-720-577 (L: m=313 t=1 i=0 a=406) (D: r=278 i=91 f=136 s=8 a=64) (tpm=212.5 d=14.75 nps=925427)
2. Houdini 2.0 w32 1008.5/2100 720-803-577 (L: m=250 t=299 i=0 a=254) (D: r=278 i=91 f=136 s=8 a=64) (tpm=204.1 d=12.04 nps=1326351)
No functional change.
Goes in the direction of avoiding time losses and seems
equivalent after almost 40K games at super fast TC of 10+0.05
ELO: 2.61 +-2.2 (95%) LOS: 99.1%
Total: 39869 W: 8258 L: 7959 D: 23652
No functional change.
Goes in the direction of avoiding time losses and seems
equivalent after almost 40K games at super fast TC of 10+0.05
ELO: 2.41 +-2.3 (95%) LOS: 98.1%
Total: 37222 W: 7843 L: 7585 D: 21794
No functional change.
The ideal setting for super-blitz might be something like:
"Emergency Base Time" = 50
"Emergency Move Time" = 5
This would give a total emergency time buffer of:
50 + 40 * 5 = 250 ms
This setup replaces the previous half cooked hack
"Don't blunder under extreme time pressure".
Test results are very good at super blitz, but keep good even
at 60 secs.
At 5+0.05
ELO: 24.30 +-2.4 (95%) LOS: 100.0%
Total: 37802 W: 10060 L: 7420 D: 20322
At 15+0.05
ELO: 13.41 +-2.9 (95%) LOS: 100.0%
Total: 22271 W: 4853 L: 3994 D: 13424
At 60+0.05
ELO: 5.30 +-3.2 (95%) LOS: 99.9%
Total: 16000 W: 2897 L: 2653 D: 10450
No functional change.
Instead of current code, give a bonus according to the frontmost
square among candidate + passed pawns.
This is a big simplification that removes a lot of accurate code
substituting it with a statistically based one using the common
'bonus' scheme, leaving to the search to sort out the details.
Results are equivalent but code is much less and, as an added bonus,
we now store candidates bitboard in pawns hash and allow this
info to be used in evaluation. This paves the way to possible
candidate pawns evaluations together with all the other pieces,
as we do for passed.
Patch passed short TC
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16927 W: 3462 L: 3308 D: 10157
Then failed (quite quickly) at long TC
LLR: -2.95 (-2.94,2.94) [0.00,6.00]
Total: 8451 W: 1386 L: 1448 D: 5617
But when ran with a conclusive 40K fixed games at 60 secs it proved
almost equivalent to original one.
ELO: 1.08 +-2.0 (95%) LOS: 85.8%
Total: 40000 W: 6739 L: 6615 D: 26646
bench: 3884003
Now that we use pre-increment on enums, it
make sense, for code style uniformity, to
swith to pre-increment also for native types,
although there is no speed difference.
No functional change.
ENABLE_OPERATORS_ON has incorrect definitions of
post-increment and post-decrement operators.
In particularly the returned value is the variable
already incremented/decremented, while instead they
should return the variable _before_ inc/dec.
This has no real effect because are only used in loops
and where the returned value is never used, neverthless
it is wrong. The fix would be to copy the variable to a
dummy, then inc/dec the variable, then return the dummy.
So instead, rename to pre-increment that can be implemented
without the dummy, actually the current implementation
it is already the correct pre-increment, with the only change
to return a reference (an l-value) and not a copy, so
to properly mimic the pre-increment on native integers.
Spotted by Kojirion.
No functional change.
We always attempt to keep at least this emergencyBaseTime
at clock. But if available time is very low it means that
we will force ourself to play immediately to satisfy the
emergencyBaseTime constrain and so leading to blunders.
Patch is good at short and very short TC (15secs and 5secs respectively)
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 26590 W: 5426 L: 5245 D: 15919
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 5767 W: 1397 L: 1268 D: 3102
Instead seems has no influence at longer TC (60 secs)
LLR: -2.96 (-2.94,2.94) [0.00,6.00]
Total: 79862 W: 13623 L: 13339 D: 52900
So it is committed to have a broader testing but is
to be consider still EXPERIMENTAL and can be reverted
easily.
No functional change.
In case we have less then 10ms to think as soon as
we wake up the timer, it immediately fires and calls
check_time() where due to condition:
elapsed > TimeMgr.maximum_time() - 2 * TimerResolution
the stop flag is set and search returns immediately, without
actually search anything.
Here the somewhat hacky fix is to start the timer after
at least one iteration as been completed.
No functional change.
The possible maximum mobility cardinality (plus one in case of
zero squares available) is:
- Knights: max. 8 squares -> max. 9 entries
- Bishops: max. 13 squares -> max. 14 entries
- Rooks: max. 14 squares -> max. 15 entries
- Queen: max. 27 squares -> max. 28 entries
So remove the extra entries in the table.
Spotted by Dariusz Orzechowski.
No functional change.
Union of
- LMR >= 3 plies from Gary tests.stockfishchess.org/tests/view/522522960ebc595d328fcafd
- allows() tweak from Reuven tests.stockfishchess.org/tests/view/5225fa1c0ebc595d328fcb53
Both passed Step I and failed Step II.
Instead this union passed both short TC:
LLR: 2.95 (-2.94,2.94)
Total: 14525 W: 3063 L: 2874 D: 8588
And long TC
LLR: 2.94 (-2.94,2.94)
Total: 31075 W: 5566 L: 5308 D: 20201
bench: 4238160
Idea is sound but implementation is partial. Ryan and Joona noticed that
we leave an hole in material table. Also we got another report by an user
of an odd behaviour. Namely, if you start stockfish and from the prompt
give 'bench' you get 3453941, then if you run again bench you get 3453940.
The reason is that two different positions with the same number of pieces,
but one with a bishop pair and another without have the same material key.
But after Eelco patch also different material imbalance and this yields
to this issue.
Restesting at long TC shows the patch does not really contribute at
ELO improvement. Actually patch failed at long TC.
LLR: -2.97 (-2.94,2.94)
Total: 23109 W: 4104 L: 4092 D: 14913
So revert.
bench: 3453945
Prefer pos.bishop_pair() to pos.count<BISHOP>(WHITE) > 1
because the first checks that the two bishops are on
different color squares.
Although the change seems to kick in only in very rare cases,
quite surprisingly it was able to pass SPRT test at short TC.
LLR: 2.95 (-2.94,2.94)
Total: 39818 W: 8174 L: 7956 D: 23688
bench: 3453941
This was thought to be a draw but the bishops generally win. However,
it takes up to 66 moves. The position in the diagram was thought to be
a draw for over one hundred years, but tablebases show that White wins
in 45 moves. All of the long wins go through this type of semi-fortress
position. It takes several moves to force Black out of the temporary
fortress in the corner; then precise play with the bishops prevents Black
from forming the temporary fortress in another corner (Nunn 1995:265ff).
Before computer analysis, Speelman listed this position as unresolved,
but "probably a draw" (Speelman 1981:109).
bench: 3453945
STANDALONE-TOOLCHAIN.html in Android NDK says:
It is recommended to use the -mthumb compiler flag to force the generation
of 16-bit Thumb-1 instructions (the default being 32-bit ARM ones).
If you want to target the 'armeabi-v7a' ABI, you will need ensure that the
following two flags are being used:
CFLAGS='-march=armv7-a -mfloat-abi=softfp'
Note: The first flag enables Thumb-2 instructions, and the second one
enables H/W FPU instructions while ensuring that floating-point
parameters are passed in core registers, which is critical for
ABI compatibility. Do *not* use these flags separately!
Thanks to Peter Osterlund for pointout this doc and for showing me
an example Makefile to follow.
No functional change.
Becuase castle is coded as "king captures the rook"
the to_sq(move), A1/8 or H1/8 is empty after the move,
leading to assert assert(p != NO_PIECE) in color_of().
Teach allows() asserts about castle and fix the crash.
Bug reported by Ryan Takker and tracked down by Tom Vijlbrief.
No functional change.
If our rook is behind a passed pawn, all
squares are defended.
One of the longest tests to pass !
Passed both short TC
LLR: 2.97 (-2.94,2.94)
Total: 44560 W: 9518 L: 9281 D: 25761
And long TC
LLR: 2.96 (-2.94,2.94)
Total: 61348 W: 11618 L: 11192 D: 38538
bench: 3787694
With
position fen 7k/8/8/8/8/7P/6K1/7B w - - 0 1
go depth 25
The evaluation at depth 22 is not draw as it should be. The reason is that
when search reaches the position 8/6kP/8/8/8/3B4/6K1/8 w - - 0 1 if white plays
h8R or h8N then we get a position that is a "KNOWN_WIN" and is _not_ a check, so
futility pruning in qsearch kicks in and black may think that it is "futile"
to reply Kxh8 since, according to the logic of the code, it cannot raise the score
back towards a draw.
bench: 4728533
The case of two lone kings on the board is already considered
by the "No pawns" scaling factor rules in material.cpp as is
KBK and KNK.
Moreover we had a small leak in endgames map because for
KK endgame it happens white and black material keys are the
same (both equal to zero), so when adding the black endgame in
Endgames::add() we were overwriting the already exsisting
white one, leading to a memory leak found by Valgrind.
So remove the endgames althogheter and rely on scaling
to correctly set the endgames value to a draw.
No functional change.
Instead of classical flags, throw an
exception when we want to immediately halt
the search. Currently only one type
is used for both UCI stop and threads
cut off.
No functional change.
Idea originated from a post of Don Dailey
on talkchess and reported by Eelco.
This is the last succesful attempt of a long
series of trials (as usually happens, the
'idea' alone is not enough).
Passed both short 15secs TC
LLR: 2.97 (-2.94,2.94)
Total: 7629 W: 1645 L: 1515 D: 4469
And long 60secs TC
LLR: 2.96 (-2.94,2.94)
Total: 10218 W: 1932 L: 1775 D: 6511
bench: 4944581
With the new automatic setting of split depth
instead of a default, the user no longer needs
guidance on setting the split point.
Also threads now defaults to one.
No functional change.
Search::RootColor is a global parameter set
before to start a search, it is not something
trace() should change.
This patch allows to add trace() calls, for
debugging, inside search itself without altering
the bench, and also ensures that the values
returned by trace() and evaluate() are fully
equivalent.
No functional change.
The rounding formula is different between
positive and negative scores due to the
GrainSize/2 term that is asymmetric.
So use truncation instead of rounding. This
guarantees that evaluation is rounded to zero
in the same way for both positive and negative
scores.
Found with position's flip
bench: 4634244
Because VALUE_NONE is 30002, it happens that
after a check the next move is never an improving
one.
After this patch bench signature is independent from
VALUE_NONE actual value.
bench: 4303194
Apply to LMR the same Eelco's idea
applied to move count pruning.
This is the result of a series of
attempts started by Thomas Kolarik.
Passed both short TC
LLR: 2.95 (-2.94, 2.94)
Total: 5675 W: 1241 L: 1117 D: 3317
And long TC:
LLR: 2.95 (-2.94, 2.94)
Total: 8748 W: 1689 L: 1539 D: 5520
bench: 4356801
Not a real functional change, but bench changed due to different piecelist
reordering. To verify it a temporary my canonicalize_rooks function was
written as follows. It just ensures that the rook on the "smaller" square
is listed first.
void Position::canonicalize_rooks(Color c)
{
if (pieceCount[c][ROOK] == 2)
{
Square s0 = pieceList[c][ROOK][0];
Square s1 = pieceList[c][ROOK][1];
if (s0 > s1)
{
pieceList[c][ROOK][0] = s1;
pieceList[c][ROOK][1] = s0;
index[s0] = 1;
index[s1] = 0;
}
}
}
With this both bench and the test on Chess960 positions
./stockfish bench 128 1 8 Chess960.epd file > /dev/null
Gives same result.
bench: 4424151
Set threads number always to 1 at startup and let the
user explicitly to chose the number of threads.
Also preserve the useful behavior of automatically set
"Min Split Depth" according to the requested threads,
indeed this parameter is too technical for a casual user,
so, when left to zero, we set it on a sensible value.
No functional change
The new Position methods add_piece, move_piece, and remove_piece
now manage the member variables pieceList, pieceCount, and index,
and 9 blocks of code in Position that used to manipulate those
data structures by hand now call the new methods.
There is a slightly slowdown (< 1%) on Clang and on perft,
but the cleanup compensates the little speed loss.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce ThreadBase struct that is search
agnostic and just handles low level stuff,
and derive all the other specialized classes
form here.
In particular TimerThread does not hinerits
anymore all the search related stuff from Thread.
Also some renaming while there.
Suggested by Steven Edwards
No functional change.
At thread creation start_routine() is called
and from there the virtual function idle_loop()
because we do this inside Thread c'tor, where the
virtual mechanism is disabled, it could happen that
the base class idle_loop() is called instead.
The issue happens with TimerThread and MainThread
where, at launch, start_routine calls
Thread::idle_loop instead of the derived ones.
Normally this bug is hidden because c'tor finishes
before start_routine() is actually called in the
just created execution thread, but on some platforms
and in some cases this is not guaranteed and the
engine hangs.
Reported by Ted Wong on talkchess
No functional change.
The endgame king + minor vs king is erroneusly
detected as king + minor vs king + minor
Here the fix is to detect king + minor earlier,
in particular to add these trivial cases to
endgame evaluation functions.
Spotted by Reuven Peleg
bench: 4727133
And #ifdef instead of #if defined
This is more standard form (see for example iostream file).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A big simplification and removing of useless code.
Finished at 50% both at short TC (with SPRT) than
at long TC at fixed number of games:
ELO: -0.14 +-3.4 (95%) LOS: 46.8%
Total: 15206 W: 2836 L: 2842 D: 9528
bench: 5059948
This reverts commit 4b3a0fdab0.
As Gary says: " It failed when I tried it at long TC previously, and only
barely passed this time. Some anecdotal evidence is that it hurts vs other
engines as well (the Lightspeed rating list showed a 16 elo drop from previous
best version - still +- 5 error bars on both, but that's still significant)"
I also agree that if we have some doubts (like in this case) it is better to
be safe than sorry.
bench: 4615572
Reduces the influence of PSQT for entries such as
the extended center and the h-file.
Passed both short TC test:
LLR: 2.95 (-2.94,2.94)
Total: 23919 W: 5207 L: 5029 D: 13683
And long TC one:
LLR: 2.96 (-2.94,2.94)
Total: 5762 W: 1108 L: 974 D: 3680
Bench: 4617880
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This very speed critical code was full of clever (!)
tricks and subtle details.
So I have rewritten it in a more straithforward way
and, as very often happens, result is even faster
than original.
No functional change.
This reverts commit 3e95800814
For some reason it fails the short TC test:
LLR: -2.96 (-2.94,2.94)
Total: 20033 W: 4214 L: 4265 D: 11554
bench: 4769737
It is somewhat unbilievable but our SEE is broken !
If the first SEE move is a king capture and square is
defended then SEE continues instead of breaking.
The bug shows only on normal SEE, not see_sign() so
probing with a:
dbg_hit_on_c(slIndex==1, captured == KING);
reports just a tiny:
Total 3465656 Hits 6646 hit rate (%) 0
Bug was there since Retire seeValues[] and move PieceValue[] out of Position of 26/6/2011 (!)
although for some reason didn't show immediately, indeed the
bougous patch was a "No functional change" (!!)
bench: 4699504
It is somewhat unbilievable but our SEE is broken !
If the first SEE move is a king capture and square is
defended then SEE continues instead of breaking.
The bug shows only on normal SEE, not see_sign() so
probing with a:
dbg_hit_on_c(slIndex==1, captured == KING);
reports just a tiny:
Total 3465656 Hits 6646 hit rate (%) 0
Bug was there since 351ef5c85b of 26/6/2011 (!)
although for some reason didn't show immediately, indeed the
bougous patch was a "No functional change" (!!)
bench: 4793754
Without patch we have 333198 nps, with patch 334249.
A very small +0.3%, not a lot manily becuase this is a
side path that is taken very few times.
Anyhow idea is correct becuase first 'quick' condition
has an hit rate of about 95%.
No functional change.
But still keep the same original
margin for score.
Passed both short TC test
LR: 2.95 (-2.94,2.94)
Total: 3710 W: 845 L: 726 D: 2139
And long TC
LLR: 2.95 (-2.94,2.94)
Total: 57859 W: 10939 L: 10532 D: 36388
bench: 4769737
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Partially revert previous patch and use
unlikey() just as code annotation.
Actually it is better to rely on a profiler for branch prediction:
http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html
"In fact, even when only one in ten thousand values is nonzero,
we're still at only roughly the break-even point"
No functional change,
It is somewhat redundant and could make SF
name too long, so use just Version, in case
of a signature build Version will be set to
'sig-xxx' otherwise, if left empty, we fall
back on usual date stamp.
No functional change.
When compiling with:
make signature-build ARCH=xxx COMP=xxx
After binary has been roduced, it will be run to
get the signature 'stockfish bench' and this
number will be used as Version, so that it
will be easy to track the original sources
from a binary.
No functinal change.
Due to a strange issue (bug?) the ternary
operator does not return a BitCountType for
icc, so revert to the expression used
before bcbc9bfd1f
No functional change.
Here speed up is the name of the game.
Speed up is gained:
- Removing the useless enoughMaterial code
- Limiting trapped rook evaluation to where it counts
Tested at long TC:
LLR: 2.97 (-2.94,2.94)
Total: 10061 W: 1948 L: 1790 D: 6323
bench: 4558173
Thanks to Don, Miguel, Louis and the other people
of talkchess forum for the suggestion:
http://www.talkchess.com/forum/viewtopic.php?t=48612
Also sync polyglot.ini with current UCI options
No functional change.
Unfortunatly a reverse test at long TC failed:
master^ vs master
LLR: 1.37 (-2.94,2.94)
Total: 33682 W: 6294 L: 6071 D: 21317
So becuase short TC score is 50% there is a good
possibility patch is not scalable.
So revert it.
bench: 4507288
Do not use threat move to detect the condition. This
let us to retire the big allows() function.
Test at short TC was within 50% score:
LLR: -2.95 (-2.94,2.94)
Total: 38272 W: 7941 L: 7940 D: 22391
To be verified with reverse long TC
bench: 4191565
Here the main difference is that now we center
aspiration window on last returned score. This allows
to simplify handling of mate scores.
We have done a reversed SPRT tests, where we wanted to
verify if master is stronger than this patch.
Long TC: master vs this patch (reverse test)
LLR: -2.95 (-2.94,2.94)
Total: 37992 W: 7012 L: 6920 D: 24060
bench: 4507288
Temporary revert aspiration window patch
so to be visible to everybody: it will be
re-applied with next patch
No functional change (together with next one)
Here the main difference is that now we center
aspiration window on last returned score. This allows
to simplify handling of mate scores.
We have done a reversed SPRT tests, where we wanted to
verify if master is stronger than this patch.
Long TC: master vs this patch (reverse test)
LLR: -2.95 (-2.94,2.94)
Total: 37992 W: 7012 L: 6920 D: 24060
bench: 4507288
A simplification of the 'dangerous' definition.
Seems neutral at reverse test at long TC
master vs patch
LLR: -2.96 (-2.94,2.94)
Total: 16974 W: 3122 L: 3139 D: 10713
bench: 4689029
Function calloc() already initializes memory to
zero, so avoid calling clear() afterwards.
Also some renaming while there (inspired by DiscoCheck).
No functional change.
Here we skip the call to pos.attacks_from<ROOK>(s) in the 98%
of cases, testing the first 2 members first. Unfortunatly
code is a bit triky and not clear. So we give up to the
speed optimization in exchange of more code clarity.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So when we are doing a LMR search at the parent ALL node.
This patch didn't prove stronger at 60" TC
LLR: -2.97 (-2.94,2.94)
Total: 22398 W: 4070 L: 4060 D: 14268
But, first, it scores at 50%, second (and most important for me) the opposite,
i.e. normal reduction when parent node is not reduced, seems very bad:
LLR: -2.95 (-2.94,2.94)
Total: 7036 W: 1446 L: 1534 D: 4056
According to Don, this idea of increased reduction of CUT nodes
works because if parent node is reduced, missing a cut-off due to
reduced depth search (meaning position is somehow tricky) forces
a full depth research at parent node, giving due insight in this
set of sensible positions.
IOW if we expect a node to fail-high at depth n, then we assume it
should fail-high also at depth n-1, if this doesn't happen it means
position is tricky enough to deserve a research at depth n+1.
bench: 4687419
We got a good result from this tweak, in line with
what was already found by Don Dailey.
At short TC:
LLR: 2.95 (-2.94,2.94)
Total: 13097 W: 2742 L: 2598 D: 7757
At long TC:
LLR: 2.97 (-2.94,2.94)
Total: 7281 W: 1408 L: 1265 D: 4608
bench: 5108393
Follow Don Dailey definition of cut/all node:
"If the previous node was a cut node, we consider this an ALL node.
The only exception is for PV nodes which are a special case of ALL nodes.
In the PVS framework, the first zero width window searched from a PV
node is by our definition a CUT node and if you have to do a re-search
then it is suddenly promoted to a PV nodes (as per PVS search) and only
then can the cut and all nodes swap positions. In other words, these
internal search failures can force the status of every node in the subtree
to swap if it propagates back to the last PV nodes."
http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=519741&t=47577
With this definition we have an hit rate higher than 90% on:
if (!PvNode && depth > 4 * ONE_PLY)
dbg_hit_on_c(cutNode, (bestValue >= beta));
And an hit rate of just 28% on:
if (!PvNode && depth > 4 * ONE_PLY)
dbg_hit_on_c(!cutNode, (bestValue >= beta));
No functional change.
Fix was wrong becuase search starts from ss+1,
code is a bit tricky here, so rewrite in a way
to be more easy to read and understand.
Spotted by Eelco.
No functional change.
In case of we pick a sub-optimal move be
sure to print this, and not the best one
on seach log file.
Bug spotted by Guenther Demetz.
No functional change.
Search is started after setting a position and
issuing UCI 'go' command. Then if we stop the search
and call 'go' again without setting a new position it
is assumed that the previous setup is preserved, but
this is not the case because what happens is that
SetupStates is reset to NULL, leading to a crash as
soon as RootPos.is_draw() is called because st->previous
is now stale.
UCI protocol is not very clear about requiring that a
position is setup always before launching a search,
so here we easy the life of GUI developers assuming
that the current state is preserved after returning
from a 'stop' command.
Bug reported by Gregor Cramer.
No functional change.
A small number of tests with simulated
annealing at 15s indicated these values
may be better
And this is verified at long 60+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 40658 W: 7821 L: 7501 D: 25336
bench: 4931544
This patch is the sum of:
- Grainsize of 4 instead of 8
- Removing "depth < DEPTH_ZERO"
- Change DEPTH_QS_RECAPTURES = -5 to -7
All the patches individually failed to pass SPRT but scored
around 50%.
Together they pass easily short TC:
LLR: 2.96 (-2.94,2.94)
Total: 4429 W: 964 L: 844 D: 2621
And with some difficult long TC of 60+0.05:
LLR: 2.95 (-2.94,2.94)
Total: 64133 W: 11968 L: 11532 D: 40633
bench: 4821467
Add MOVE_NONE at the tail, this allows to loop
across MoveList checking for *it != MOVE_NONE,
and because *it is used imediately after compiler
is able to reuse it.
With this small patch perft speed increased of 3%
And it is also a semplification !
No functional change.
Most of the time we cut-off earlier, at captures, so this
results in useless work.
There is a small functionality change becuase 'ss' can change
from MovePicker c'tor to when killers are tried due, for
instance, to singular search.
bench: 4603795
Very good at long 60"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 5954 W: 1151 L: 1016 D: 3787
[edit: slightly changed form original patch to avoid useless loop
across killers when killer is MOVE_NONE]
bench: 4327405
Performed more or less well at short TC
LLR: 2.95 (-2.94,2.94)
Total: 50517 W: 9815 L: 9574 D: 31128
And a bit better at long TC
LLR: 2.96 (-2.94,2.94)
Total: 15564 W: 2805 L: 2624 D: 10135
bench: 4375253
It seems that do not limiting checking the
trapped rook only on rank 1 improves the
score.
At long TC
LLR: 2.97 (-2.94,2.94)
Total: 6581 W: 1346 L: 1204 D: 4031
bench: 4985012
Don't update refutation table in case of
previous move is MOVE_NULL or MOVE_NONE
and don't try refutation if is already
a killer move.
Pass both short TC
LLR: 2.96 (-2.94,2.94)
Total: 4310 W: 953 L: 869 D: 2488
And long one
LLR: 2.95 (-2.94,2.94)
Total: 6707 W: 1254 L: 1184 D: 4269
bench: 4785954
Very good result both at short TC 15+0.05
LLR: 2.95 (-2.94,2.94)
Total: 2803 W: 596 L: 483 D: 1724
And at long TC 60+0.05
LLR: 2.95 (-2.94,2.94)
Total: 2862 W: 548 L: 431 D: 1883
bench: 4329221
Unortunatly we have no guarantee that the call to
operator~(Color c) is resolved at compile time.
Perhaps the solution would be to use C++11 const_expr,
but for now simply use the good old-style ternary operator
that works as expected.
No functional change.
A rook is trapped if on rank 1 as is the king.
Currently the condition aloows for the rook
to be also in front of the pawns as long
as king is on first rank.
Verified with short TC test:
LLR: -1.71 (-2.94,2.94)
Total: 23234 W: 4317 L: 4317 D: 14600
Here what it counts is that after 23K games
result is equal.
bench: 4696542
When it is already defined(_WIN32).
According to Microsoft documentation:
http://msdn.microsoft.com/en-us/library/b0084kay.aspx
_WIN32 Defined for applications for Win32 and Win64. Always defined.
_WIN64 Defined for applications for Win64.
Patch suggested by Joona.
No functional change.
This info is normally printed together with
PV info in uci_pv() but when search is stopped,
for instance when max search time is reached,
uci_pv is not called and we miss this bits.
Suggested by gravy_train
No functional change.
But this time do not play with pointers, in
particular do not assume that size_t is an
unsigned type of the same width as pointers.
This code should be fully portable.
No functional change.
This fixes an assert while testing with debug on.
Assert was due to static null pruning returning value
eval - futility_margin(depth, (ss-1)->futilityMoveCount)
That was sometimes higher than VALUE_INFINITE triggering
an assert at the caller site.
Because eval con be equal to ttValue and anyhow is read from
TT that can be corrupted in SMP case, we need to sanity
check it before to use.
bench: 4176431
Let TT clusters (16*4=64 bytes) to hold on a singe cache line.
This avoids the need for the double prefetch.
Original patches by Lucas and Jean-Francois that has also tested
on his AMD FX:
BIG HASHTABLE
./stockfish bench 1024 1 18 > /dev/null
Before:
1437642 nps
1426519 nps
1438493 nps
After:
1474482 nps
1476375 nps
1475877 nps
SMALL HASHTABLE
./stockfish bench 128 1 18 > /dev/null
Before:
1435207 nps
1435586 nps
1433741 nps
After:
1479143 nps
1471042 nps
1472286 nps
No functional change.
Crash is due to uninitialized ss->futilityMoveCount that
when happens to be negative, yields to an out of range
access in futility_margin().
Bug is subtle because it shows itself only in SMP case.
Indeed in single thread mode we only use the
Stack ss[MAX_PLY_PLUS_2];
Allocated at the begin of id_loop() and due to pure
(bad) luck, it happens that for all the MAX_PLY_PLUS_2
elements, ss[i].futilityMoveCount >= 0
Note that the patch does not prevent futilityMoveCount
to be overwritten after, for instance singular search
or null verification, but to keep things readable and
because the effect is almost unmeasurable, we here
prefer a slightly incorrect but simpler patch.
bench: 4311634
Instead of a pointer. This should fix the issue of
remaining with a stale pointer when for instance calling
IID, but also null search verification, singular search
and razoring where we call search with the same ss
pointer. In this case ss->ei is overwritten in the
search() call and upon returning remains stale.
This patch could have a performance hit because Eval::Info
is big (176 bytes) and during splitting we copy 4 ss entries.
On the good side, this patch is a clean solution.
Proposed by Gary.
No functional change.
Allow to use EvalInfo struct, populated by
evaluation(), in search.
In particular we allocate Eval::Info on the stack
and pass a pointer to this to evaluate().
Also add to Search::Stack a pointer to Eval::Info,
this allows to reference eval info of previous/next
nodes.
WARNING: Eval::Info is NOT initialized and is populated
by evaluate(), only if the latter is called, and this
does not happen in all the code paths, so care should be
taken when accessing this struct.
No functional change.
The idea is to fail high more easily in static
null test if in the parent node we are already
very deep in the move list, so the propability
to fail high there is very low.
[edit: I have slightly changed the functionality
moving
ss->futilityMoveCount = moveCount;
At the end of the pruning code, this should not affect
ELO in anyway, but makes code more natural and logic]
Test with SPRT is good at 15+0.05
LLR: 2.96 (-2.94,2.94)
Total: 50653 W: 10024 L: 9780 D: 30849
And at 60+0.05
LLR: 2.97 (-2.94,2.94)
Total: 40799 W: 7227 L: 6921 D: 26651
bench: 4530093
When we use sysconf(_SC_NPROCESSORS_ONLN) to get number of
cores, we have to include sysconf library that is unistd.h
Sometimes it happens to work just becuase unistd.h indirectly
included by some other libraries, but not always.
Reported and fixed by Eyal BD
No functional change.
The idea is to penalize a bishop in case of
its pawns are on the same colored squares.
Good at short 15"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 4252 W: 925 L: 806 D: 2521
And at longer 60"+0.05 TC
LLR: 2.95 (-2.94,2.94)
Total: 15006 W: 2743 L: 2564 D: 9699
bench: 5274705
It seems stronger both at fast 15+0.05 TC with fixed game number test:
ELO: 2.74 +-2.7 (95%) LOS: 97.6%
Total: 24000 W: 4698 L: 4509 D: 14793
And also at long 60+0.05 TC with SPRT
LLR: 3.05 (-2.94,2.94)
Total: 38986 W: 6845 L: 6547 D: 25594
bench: 5157061
Use an optinal argument instead of a template
parameter. Interestingly, not only is simpler,
but also faster, perhaps due to less L1 instruction
cache pressure because we don't duplicate the very
used SEE code path.
No functional change.
Rationale:
- Current settings seem to make engine *significantly* weaker in analysis mode.
- In practice this setting only has effect when king safety scores are high.
- Even in analysis mode its far more important to know if one side is getting mated,
rather than get evaluation correct with 1cp accuracy.
No functional change
According to Jean-Paul this setup should be stronger
than default.
And SPRT test seems to confirm it:
At fast TC 15"+0.05
ELO: 3.33 +-2.7 (95%) LOS: 99.2%
Total: 25866 W: 5461 L: 5213 D: 15192
At longer TC 60"+0.05
ELO: 7.27 +-5.0 (95%) LOS: 99.8%
Total: 6544 W: 1212 L: 1075 D: 4257
bench: 5473339
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Increasing depth limit to 10 plies seems stronger
after 16K games at 15"+0.05 (ELO: +3.56) and also
repeating the test at 60"+0.05 TC:
ELO: 2.08 +-3.1 (95%) LOS: 90.9%
Total: 16000 W: 2641 L: 2545 D: 10814
Moreover setting the limit to 12 is proved stronger
then limit set to 10 by direct SPRT test at 15"+0.05:
ELO: 2.56 +-2.0 (95%) LOS: 99.5%
Total: 46568 W: 9240 L: 8897 D: 28431
So we directly set the limit to 12, the strongest setup.
bench: 4361224
Master IID formula is depth / 2
Previous patch is depth - 4 * ONE_PLY
This one is the middle way:
(dept/2 + depth-4*ONE_PLY)/2 -> depth-2*ONE_PLY-depth/4
After 16000 games at 60+0.05 th 1
ELO: 4.08 +-3.1 (95%) LOS: 99.5%
Total: 16000 W: 2742 L: 2554 D: 10704
bench: 4781239
Same idea of 5af8179647
in qsearch() but applied to search()
After 15500 games at 15+0.05
ELO: 4.48 +-3.4 (95%) LOS: 99.5%
Total: 15500 W: 3061 L: 2861 D: 9578
bench: 4985829
Always before pruning the move, it's important to check that:
bestValue > VALUE_MATED_IN_MAX_PLY
See example position:
8/2p1p3/P1NpP3/3k4/1P1BN3/2P1P3/2Q5/6K1 w - - 0 1
http://support.stockfishchess.org/discussions/problems/268-wrong-declaring-a-forced-mate-in-3-moves
This problem was present in 2.3.1, then it was fixed by my patch.
After 24000 games at 15+0.05
ELO: 2.40 +-4.4 (95%) LOS: 95.7%
Total: 24000 W: 4774 L: 4608 D: 14618
bench: 4465997
This reverts commit a24da071f0
Seems a regression when tested against 2.3.1
With this patch, have after 20000 games at 60+0.05, we have
ELO: 13.42 +-4.8 (95%) LOS: 100.0%
Total: 20000 W: 3746 L: 2974 D: 13280
Instead with the patch reverted:
ELO: 16.62 +-4.8 (95%) LOS: 100.0%
Total: 20000 W: 3816 L: 2860 D: 13324
Although we are within error bounds here we take the conservative
approach of not introducing changes that are not proved stronger
It doesn't mean that the change shall be weaker, simply that we
don't want to take any risk.
No functional change.
Here the rational seems to be that if after one try easy
move detection fails then the easy move is not so easy :-)
After 15563 games at 60+0.05
ELO: 3.04 +-5.5 (95%) LOS: 97.0%
Total: 15563 W: 2664 L: 2528 D: 10371
No functional change.
Increase MaxRatio to use more time when in trouble.
After 16000 games at 60+0.05
ELO: 4.89 +-5.4 (95%) LOS: 99.9%
Total: 16000 W: 2700 L: 2475 D: 10825
No functional change.
This seems good at short TC controls.
After 10000 games at 20+0.05
ELO: 9.56 +-6.8 (95%) LOS: 100.0%
Total: 10000 W: 1949 L: 1674 D: 6377
Testing at long TC and regression testing is still
ongoing. So this is a bit speculative commit and
could be reverted in the future.
Also re-testing at long TC the SEE pruning in PV nodes
seems less effective (perhaps even a regression, but
still ongoing) so disabled for now.
bench: 4968764
This reverts commit 0d68b523a3.
After easy move semplification this machinery is not
needed anymore (because of we don't need to know if a
root move is a recapture)
No functional change.
Detect a move as easy only if it is the only one ;-)
or if is much better than remaining ones after we
have spent 20% of search time.
Tests are ongoing, but it seems this semplification
stands. Anyhow it is experimental for now and could
be reverted/improved with further work Gary is
testing right now.
No functional change.
After previous patch if split point master is
waiting for job and "Use Sleeping Threads" is
false (our condition for official releases) then
it will lock/unlock splitPoint mutex in a super
tight loop badly affecting performance.
Rewrite the code to lock only when we are about
to finish. Note that race condition on slavesMask
is anyhow fixed.
No functional change.
There is no point searching a move that is forced.
It wastes time while allowing computer opponents to
fill hash with 100% accuracy.
[edit: Condition moved together with "easy move" ones]
Bench identical: 4922272
We detect an easy move as a recapture with an
high margin on the second best move.
Unfortunatly the recapture detection is broken
becuase we identify as a recapture any move that
follows an opponent's previous capture !
This patch fix the logic to correctly detect a
real re-capture.
No functional change.
Note that we read shared data without lock
protection, so code is theoretically prone to
torn reads. But, first splitPoint pointer
never changes, and alpha is of integer type so
it is read in a single DWORD access.
No functional change.
This patch is actually the sum of two contributions that
have been tested independently:
1) Pruning of negative SEE moves in PV
After 10000 games at 20+0.05
ELO: 5.18 +-7 (95%) LOS: 99.2%
Total: 10000 W: 1952 L: 1803 D: 6245
2) Remove of bestValue > VALUE_MATED_IN_MAX_PLY condition
After 23000 games at 20+0.05
ELO: 1.63 +-4 (95%) LOS: 88.1%
Total: 23000 W: 4232 L: 4124 D: 14644
The whole patch as been re-tested at long TC with positive results:
After 10000 games at 60+0.05
ELO: 4.31 +-7 (95%) LOS: 98.3%
Total: 10000 W: 1765 L: 1641 D: 6594
kf = (kf == FILE_A) ? kf++ : ....
is tricky becuase kf is updated twice and it happens
to do the right thing just by accident.
Rewrite in a better way.
Spotted by pdimov
No functional change.
Give a bonus if a bishop can pin a piece or can
give a discovered check through an x-ray attack.
Seems good after 24000 games at 15"+0.05 (single thread):
ELO: 12.30 +- 99%: 5.79 95%: 4.40 LOS: 100.00%
Total: 24000 W: 4931 L: 4082 D: 14987
bench: 4917064
Rename startPosPly to gamePly and increment/decrement
the variable after each do/undo move. This adds a little
overhead in do_move() but we will need to have the
game ply during the search for the next patches now
under test.
Currently we don't increment gamePly in do_null_move()
becuase it is not needed at the moment. Could change
in the future.
As a nice side effect we can now remove an hack in
startpos_ply_counter().
No functional change.
Another trick, along the same lines of previous
patch. This time we first check positions with
white side to move that, becuase we start with
pawn on rank 7, are easily classified as wins,
then black ones.
Number of cycles reduced to 15 !
Becuase now it is faster we can remove a lot of
code to detect theoretical draws. We will calculate
them anyhow, although a bit slower, but the speed
up trick more than compensates it.
Verified that generated bitbases match original ones.
No functional change.
Change the way the index is coded so that
now looping from 0 to IndexMax generates
the pawns from RANK_7 down to RANK2.
Becuase positions with pawns at RANK_7
are easily classified as wins/draws, this
small trick allows to reduce the number
of needed iterations from 30 down to 26!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Rename to insertion_sort so to avoid confusion
with std::sort, also move it to movepicker.cpp
and use the bit slower std::stable_sort in
search.cpp where it is used in not performance
critical paths.
No functional change.
We can encode the ClusterSize directly in the
hashMask, this allows to skip the left shift.
There is no real change, but bench number is now
different because instead of using the lowest order
bits of the key to index the start of the cluster,
now we don't use the last two lsb bits that are
always set to zero (cluster size is 4). So for
instance, if 10 bits are used to index the cluster,
instead of bits [9..0] now we use bits [11..2].
This changes the positions that end up in the same
cluster affecting TT hits and so bench is different.
Also some renaming while there.
bench: 5383795
Do a 32bit bitwise 'and' instead of a 64bit
subtract and bitwise 'and'.
This is possible because even in the biggest
hash table case (8GB) the number of entries
is 2^29 so storable in an unsigned int.
No functional change.
Save the current active position in each Thread
instead of keeping a centralized array in struct
SplitPoint.
This allow to skip a memset() call at each split.
No functional change.
Obsolete renmant of when position was directly
passed to the search instead of being copied
for the main thread as is now.
From Jundery.
No functional change.
The syntax splitPoints() should force the compiler to
value-initialize the array and because there is no
user defined c'tor it falls back on zero-initialization.
Unfortunatly this is broken in MSVC compilers, because
value initialization for non-POD types is not supported,
so left splitPoints un-initialized and add in split()
initialization of slavesPositions, that is the only
member not already set at split time.
This fixes an assert under MSVC when running with
more than one thread.
Spotted and reported by Jundery.
No functional change.
It is a draw if pawns are on G or B files, weaker pawn is
on rank 7 and bishop can't attack the pawn.
No functional change (because it is very rare and does not appear in bench)
To return a pointer to the available
thread instead of a bool. This allows
to simplify the core loop in split().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This function "returns" two values: bestValue and bestMove
Instead of returning one and passing as pointer the other
be consistent and pass as pointers both.
No functional change.
Currently a ttMove is reduced with ss->reduction = DEPTH_ZERO,
so it is actually not reduced (as it should be), but the
trick works just becuase it happens that ttMove is the first
to be tried and
reduction(depth, 1)
Always returns zero. So explicitly forbid reduction of ttMove
in the LMR condition. This is much clear and self-documented.
No functional change.
Surprisingly this rare case was not considered
when scoring a capture.
Also take in account that in the promotion case
we gain a new piece (typically a queen) but we
lose the promoting pawn.
These small issues were present since Glaurung times!
Found while browsing DiscoCheck sources
bench: 5400063
Handling of History and Gains is almost the same, with
the exception of the update logic, so unify both
classes under a single Stats struct.
No functional change.
And handle the castle directly in do/undo_move().
This allow to greatly simplify the code.
Here the beast is the nasty Chess960 that is
really tricky to get it right because could be
that 'from' and 'to' squares are the same or
that king's 'to' square is rook's 'from' square.
Anyhow should work: verified on all Chess960
starting positions.
No functional and no speed change also in Chess960.
Use a more traditional approach, along the same lines
of do_move().
It is true that we copy more in do_null_move(), but we
save the work in undo_null_move(). Speed test shows the
new code to be even a bit faster.
No functional change.
We have only one call place so inline its content.
BTW, function is already declared as FORCE_INLINE.
Also some small refactoring while there.
No functional change.
When a thread is allocated a bit is set in slavesMask.
This bit corresponds to the thread's index field that,
because it happens to be the position in the threads
array, eventually it is equal to the loop index 'i'.
But instead of relying on this 'coincidence', explicitly
use the 'idx' field so to clarify slavesMask usage.
Backported from c++11 branch.
No functional change.
Intel Compiler has 'invented' this pearl:
warning #1476: field uses tail padding of a base class
Just becuase we have subclassed MainThread and added
the field 'bool thinking'.
Pure nosense. Silence the warning.
No functional change.
Fix again TimerThread::idle_loop() to prevent a
theoretical race with 'exit' flag in ~Thread().
Indeed in Thread d'tor we raise 'exit' and then
call notify() that is lock protected, so we
have to check again for 'exit' before going to
sleep in idle_loop().
Also same change in Thread::idle_loop() where we
now check for 'exit' before to go to sleep.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Great code simplification: - instead do not futility
prune threat refutations. allows_move() is therefore removed.
4000 games at 50,000 nodes/move:
1085-989-1926 [51.2%] LOS=98.3%
4000 games in 10"+0.1"
756-751-2493 [50.1%] LOS=55.1%
EDIT: I have retested the patch of Lucas in a slightly different form
(without pruning in PvNode) and test mre or less confirms that
60 lines of code are totally unuseful:
After 6195 games at 15"+0.05"
1333 - 1325 - 3537 ELO 0
bench 5140990
Silly logic bug introduced in dda7de17e7
Timer thread, when msec = 0, instead of going
to sleep, calls check_time() in an endless loop.
Spotted and reported by snino64 due to abnormally
high CPU usage.
No functional change.
Subclass MainThread and TimerThread and declare
idle_loop() virtual. This allow us to cleanly
remove a good bunch of hacks, relying on C++
polymorphism to do the job.
No functional change.
Rename it is_finished and use it only in main
thread to signal search is finished. This allows
us to simplify the complex SMP logic.
Ultra tricky patch: deep test is required under
wide conditions like pondering on and option
"Use Sleeping Threads" set to false.
No functional change.
This reverts commit 869c924410
I misunderstood here. Actually it can happen that
thread is created but still not entered idle_loop
and at the same time start_searching() is called.
Becuase 'do_sleep' is set start_searching() will
set it to false and start the search, but when,
at last, the thread enters idle_loop(), resets
the flag and goes to sleep: not what we want.
Revert the hack waiting for a better solution
in the next patches.
No functional change.
Finally we can now merge the 'ponderhit' case with
'stop' and 'quit'.
The patches have been done step by step to help debugging
becuase this is really tricky code.
No functional change.
Reset Limits.ponder only if search continue, but if
we are going to stop the search there is no need
(and is also confusing) to clear the 'ponder' flag.
This mimics the behaviour upon rceiving 'stop' when
pondering.
No functional change.
In SAN notation when two pieces of the same type can
move to a given destination square, a disambiguation
additional info (like starting file) shall be added
to the SAN move.
If one of the two pieces is pinned, the corresponding
move _could_ be illegal and in this case disambiguation
is not needed. But to be pinned alone it is not enough
to deduce that the move is illegal, for instance in this
position:
R3rk2/2r6/8/8/8/8/8/K7 b - - 0 1
The move Rc8 is ambiguous although the rook in e8 is pinned
and the correct SAN notation should be Rcc8.
No functional change.
Don't wait for the search to finish after a 'stop'
command, but keep processing the GUI input if any.
Also explicitly wake up the main thread (that could be
sleeping) after a 'stop' or 'quit' command and do not
rely on wait_for_search_finished() doing it for us.
This patch cleans up the code and functions's definitions,
but it is risky and needs a good test under different
conditions to be sure it does not introduces hungs up.
No functional change.
Revert patch c581b7ea36
Seems a regression after testing from Gary:
ELO: 7.24 +- 99%: 17.03 95%: 12.93
LOS: 97.86%
Wins: 439 Losses: 381 Draws: 1962
And mine:
After 5410 games at 15"+0.05
Wins: 936 Losses: 1141 Draws: 3333 ELO -13
Moreover we know that there is a regression in the range
of patches which include the fromNull patch.
Probably this is not the only regression since 2.3.1 and
perhaps the idea under fromNull is good, but at the moment,
while in deep regression hunting, better to be on the safe
side and revert it entirely.
My guess on why this is a regression is that using the
negated evaluation of previous ply in case of null search
fails to take in account the king safety asymmetry between
the two colors. This is of course just a guess.
bench 5503830
They are not self-describing and create a lot of user
requests about them.
Given that the values are already well tuned there
is no need to expose them as UCI options.
No functional change.
Sometimes is faster, but not always and on very long mates
produces strange scores probably due to truncation of PV
artifacts.
So simply perform normal search also in case of UCI 'mate x'
command, with the only difference that when a mate in x is
found search returns immediately.
No functional change.
Now that we can call 'bench' command also from
interactive terminal it makes no more sense to
exit the application if the user types a wrong
file name.
No functional change.
Since &-ing with SpaceMask restricts the set to the home
half of the board, it is possible to use just one popcount
instead of 2 by shifting "safe" to the other half of the
board. This gives a small speedup especially on systems
where hardware popcount is not available.
Patch kindly sent by Richard Vida.
No functional change.
It is now possible to run SF on a 'mate in x' testsuite.
For instance in case of a file with fen strings of
positions with mate in 10 we can now 'bench' on it:
stockfish bench 128 1 10 mate_in_10.epd mate
No functional change.
Following a user request I added the handling of UCI:
go mate x
Currently we just return from a PV node if x moves have been
done. Probably not the best approach. I have looked at Fruit/Toga
sources and there is even simpler: engine falls back on a fixed
depth search.
No functional change.
And return on using TT as backing store for position
evaluations.
Tests (even on single thread) show eval cache was a regression.
In multi thread result should be even worst because eval cache
is a per-thread struct, while TT is shared.
After 4957 games at 15"+0.05 (single thread)
eval cache vs master 969 - 1093 - 2895 -9 ELO
So previous reported result of +18 ELO was probably due to an
issue in the testing framework (a bug in cutechess-cli) that
has been fixed in the meanwhile.
bench: 5386711
In case of null search at low depths returns a fail low
due to a threat then, rather than return beta-1 (to cause
a re-search at full depth in the parent node), we set a flag
threatExtension = true (false by default) that will cause
moves that prevent the threat to be extended of one ply
in the following search.
Idea and patch is by Lucas Braesch.
Lucas also did the tests:
1500 games in 5"+0.05":
SF_threatExtension vs SF_20121222: 366 - 331 - 803 [51.2%] LOS=90.8%
3000 games in 10"+0.1":
SF_threatExtension vs SF_20121222: 610 - 559 - 1831 [50.8%] LOS=93.2%
Tests confirmed by Gary after 10570 games,
ELO: 2.79 +- 99%: 8.72 95%: 6.63
LOS: 94.08%
Wins: 1523 Losses: 1438 Draws: 7607
And finally by me at 15"+0.05, single thread, 3824 games
threatExtension vs master 768 - 692 - 2364 +7 ELO
bench 4918443
This is difficult code becuase a bug here could lead
to very subtle crashes in case of SMP games where we
have TT move corruption due to concurrent access.
Anyhow I have fully verified te code throwing at it
random moves. It shoudl work.
No functional change.
Test by Joona prooves the new feature don't value 70 added lines of code.
Grand totals after 10040 games (crashes: 0) for tt_both
master_9edc7 - 6a93488_6a934: 1756 - 1688 - 6596 ELO +2 (+- 2.7)
Confirmed by test of Gary:
After 8680 games:
ELO: 0.80 +- 99%: 9.62 95%: 7.31
LOS: 65.38%
Wins: 1288 Losses: 1268 Draws: 6130
Thanks a lot to both for testing it !!!
bench 5149248
This is more complex than what I'd like but I
was unable to split in small chunks.
Here we add 2 slots to TTEntry (valueUpper and depthUpper)
so that sizeof(TTEntry) returns to the original 16 bytes
and we can pack exactly 4 entries in a 64 bytes cache line.
Now we save an upper bound score alongside a lower (exact)
score. The idea is to increase TT cut-offs rates becuase
there is now an higher probability for a node to use TT info.
This patch is highly experimental and probably needs further
steps as is hinted by an unrealistic bench number:
bench: 2022385
In almost all cases we already know in advance that
color_of() argument is different from NO_PIECE.
So avoid the check for NO_PIECE in color_of() and
test at caller site in the very few places where
this case could occur.
As a nice side effect, this patch fixes a (bogus)
warning under some versions of gcc.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use an eval cache instead of TT to store node
position evaluations.
It is already an improvment and, because it frees
two TT entry slots, paves the way to extend TT to
store both upper and lower bounds.
After 4855 games, single thread, 15"+0.05
Mod vs Orig 1165 -920 - 2770 ELO +18
bench: 5149248
And document why this is an hard limit. It
seems for some (lucky) people 32 threads
are not enough.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In search(), after we evalute the position, in case there
isn't any TT entry we create one with just the evaluation
score.
This patches removes that code. The reason becuase the patch
deserves a single commit it is becuase introduces a (very small)
functional change due to the fact that the total number of
TT stores is less now and this slightly alters the TT hits
of our benchmark.
bench: 4983262
Rely fully on eval cache. Note that we still save eval
info to TT, this is not needed at this moment and will be
removed in future patches. We keep it so to have a "non
functional change" patch.
No functional change.
With this patch series we want to introduce a per-thread
evaluation cache to store node evaluation and do not
rely anymore on the TT table for this.
This patch just introduces the infrastructure.
No functional change.
In case of a RootNode or a SpNode move has
been already checked for legality so we can
skip a redundant check.
Spotted by Frank Genot.
No functional change.
In qsearch we should update the bestValue as we do
in case of futilityValue < beta, also when pruning
moves with non-positive see.
Spotted by Lucas Braesch
Bench: 5695710
Send the PV lines to GUI only once at the end of the
PV search loop or just in case of long searches.
We need to sync also sending of "currmove" info to
avoid sending info on current move without first
informing the GUI on the PV line we are searching on.
No functional change.
At this point we have already verified (value > alpha)
and this implies, in case of a non-PV node, where search
window size is zero, that value >= beta.
This is not so self-evident, so document the code with
an assert condition.
No functional change.
Let the caller to decide where to redirect (cout or cerr) the
ASCII representation of the position. Rename the function to
reflect this.
Renamed also from_fen() and to_fen() to set() and fen() respectively.
No functional change.
In case a PvNode node has a static evaluation above alpha but
no available moves we want to flag the node as BOUND_EXACT,
not as BOUND_UPPER as is currently.
The behaviour was recently introduced with patch d471c49700
of 3/10/2012
Spotted by Hongzhi Cheng.
bench: 5558464
Both Lucas re-test and Jean-Francois confirrm it
is a regression.
Here Jean-Francois's results after 3600 games :
Score of 96d3b1c92b vs 3b87314: 690 - 729 - 2181 [0.495] 3600
ELO: -3.86 +- 99%: 14.94 95%: 11.35
LOS: 15.03%
Wins: 690 Losses: 729 Draws: 2181 Total: 3600
Bench: 5404066
Don't prune and eventually extend check moves of type
DISCO_CHECK (pun intended, Lucas will understand :-) ).
Patch from Lucas Braesch that has also tested it:
Result: 879-661-2137, score=52.96%, LOS=100.00% (time 10"+0.1")
I have started a verification test right now.
bench: 6004966
From Jean-Francois's
Final result after 5000 games :
Score of c581b7e vs a878312: 1163 - 970 - 2867 [0.519] 5000
ELO: 13.35 +- 99%: 12.71 95%: 9.65
LOS: 100.00%
Wins: 1163 Losses: 970 Draws: 2867 Total: 5000
From me
After 3266 games at 20"+0,05
Score of c581b7e vs a878312: 612 - 607 - 2047
So no regression at longer TC and perhaps a little gain at
fast TC.
In this case we try a rather drastic
approach: we simply don't futility prune
in qsearch when arriving from a null move.
So we save evaluating and also save to mess
with eval margins at all because margin is used
only in futility.
Also accuracy should not be affected, actually it
improves because we don't prune anything anymore.
bench: 5404066
Performs well at very short TC of 40/4+0.05 (courtesy of Jean-Francois):
Wins: 2503 Losses: 2146 Draws: 5581 Total: 10230 +12 ELO
But is poor at longer TC of 20"+0.05
Wins: 321 Losses: 373 Draws: 1141 Total: 1808 -10 ELO
The patch was clearly a tradoff between speed and accuracy and
the most interesting part of it are test results that can be
commented as follows:
- A short TC is very sensible to any speed increase
- A longer TC is more sensible to accuracy and less to speed
So a patch that does not change speed is suitable to be tested at
short TC, while a speed/accuracy compromise patch is IMO better to
be tested at longer TC to verify loss of accuracy can be tolerated.
In this case the revert is only temporary. We will come back again
once we will be able to preserve the evaluation margin.
bench: 5809010
Reuse the evaluation of the parent with inverted sign and
set margin to zero (this is an hack!).
This is done only in qsearch where almost 15% of calls are
from a null move. In normal search the number of nodes where
(ss-1)->currentMove == MOVE_NULL
is almost zero and so there is no need of using this trick.
The big advantage of this patch is a speed-up due to skipped
evaluate() calls, that are very costly.
Functionality is of course affected and we will need to proper
test it later. For now we just register a 3-4% speed up.
Suggested by Hongzhi Cheng.
bench: 5051328
When testing if a move blocks the threat path there is no
reason to require the threat to be a slider. Indeed threat
can be a double pawn push like in this example:
r1bq1rk1/ppp1np1p/4n1p1/3p4/3P2Q1/2P1B3/PPBN2PP/R4RK1 w - - 0 16
Where white's move Rf6 blocks the threat f5.
As a nice side effect we can retire the now useless helper
piece_is_slider().
This patch kicks in only very rare cases, indeed the bench is
still the same!
bench: 5809010
There is no guarantee that split() consumes all the node's
moves. Indeed split() can return without performing any job
for instance because MAX_SPLITPOINTS_PER_THREAD is reached
or becuase no available threads are found (this latter case
is much more common).
So search must continue in those cases and we cannot force
exiting from move's loop.
Bug introduced by 1ac417edb8 of 5/10/2012
Spotted by Frank Genot.
No functional change.
If a previous move attacks the king (with the piece
of the threat move removed) then must be a discovered
check, otherwise it means that first move gave check
and we were not able to do a null move.
Also renamed stuff to better document the function's
context.
No functional change.
When testing if a piece is moving through the squares
vacated by a previous move there is no reason to require
the piece to be a slider, indeed we can have a double
pawn push like in this example:
r1q2rk1/2p1bppp/2Pp4/pN5b/Q1P1p3/4B2P/PP1R1PP1/1K5R w - - 3 18
Where black's move f5 is connected to previous move Be7 that
frees the path.
Or we can have a castle move:
r1bqkb1r/pppp1ppp/2n1pn2/1B6/4P3/2N2N2/PPPP1PPP/R1BQK2R b KQkq - 5 1
Where a previous move Bb5 allows the white to castle king side.
This time patch is mine ;-)
new bench: 5809010
We send to GUI multi-pv info after each cycle,
not just once at the end of the PV loop. This is
because at high depths a single root search can
be very slow and we want to update the gui as
soon as we have a new PV score.
Idea is good but implementation is broken because
sort() takes as arguments a pointer to the first
element and one past the last element.
So fix the bug and rename sort arguments to better
reflect their meaning.
Another hit by Hongzhi Cheng. Impressive!
No functional change.
When checking if the moving piece p1 in a previous
move m1 defends the destination square of a move m2
we have to use the occupancy with the from square of
m2 removed so to take in account the case in which
f2 will block an x-ray attack from p1.
For instance in this position:
r2k3r/p1pp1pb1/qn3np1/1N2P3/1p3P2/2B5/PPP3QP/R3K2R b KQ - 1 9
The move eXf6 is connected to the previous move Bc3 that
defends the destination square f6.
With this patch we have about 10% more moves detected as
'connected'. Anyhow the absolute number is very low, about
4000 more moves out of 6M nodes searched.
Another issue spotted by Hongzhi "Hawk Eye" Cheng ;-)
new bench: 5757373
On Intel, perhaps due to 'lea' instruction this way of
zeroing the lsb of *b seems faster than a shift+negate.
On perft (where any speed difference is magnified) I
got a 6% speed up on my Intel i5 64bit.
Suggested by Hongzhi Cheng.
No functional change.
Compiler complies that 'cnt' is initialized but
unused (in !CheckThreeFold case). Moving the
definition of 'cnt'out of the loop seems to do
the trick.
No functional change.
Instead of use a variable so to resolve many conditions
already at compile time. In quiesce is also where we
have most of the InCheck nodes and is one of the most
performance critical code paths.
Speed up of 1.5% with Clang and 1% with gcc
Suggested by Hongzhi Cheng.
No functional change.
When checking if a move defends the threatened piece
we correctly remove from the occupancy bitboard the
moved piece. This patch removes from the occupancy also
the threatening piece so to consider the cases of moves
that defend the threatened piece x-raying through the
threat move.
As example in this position:
r3k2r/p1ppqp2/Bn4p1/3p1n2/4P1N1/5Q1P/PPP2P1P/R3K2R w KQkq - 1 10
The threat black move is dxe4. With this patch we include
(and so don't prune) white's Bb7 that would be pruned otherwise.
The number of affected position is very low, around 1% of cases,
so we don't expect ELO changes, neverthless this is the logical
and natural thing to do.
Patch suggested by Hongzhicheng.
new bench: 5323798
ReducedStateInfo is a redundant struct that is also
prone to errors, indeed must be updated any time is
updated StateInfo. It is a trick to partial copy a
StateInfo object in do_move().
This patch takes advantage of builtin macro offsetof()
to directly calculate the number of quad words to copy.
Note that we still use memcpy to do the actual job of
copying the (48 bytes) of data.
Idea by Richard Vida.
No functional and no performance change.
Only in not performance critical code like pretty_pv(),
otherwise continue to use the good old C-style arrays
like in extract/insert PV where I have done some code
refactoring anyhow.
No functional change.
Silly typo (introduced in e304db9d1e) completely
messed up move notation in case of promotions causing
"Illegal move" warning in cutechess-cli.
Reported by Jörg Oster.
No functional change.
In multi-threads runs with debug on we experience some
asserts due to the fact that TT access is intrinsecally
racy and its contents cannot be always trusted so must
be validated before to be used and this is what the
patch does.
No functional case.
And restore old behaviour of not returning from a RootNode
without updating RootMoves[].
Also renamed is_draw() template parameters to reflect a
'positive' logic (Check instead of Skip) that is easier
to follow.
New bench: 5312693
When signal 'stop' is raised we return bestValue
that could be still set at -VALUE_INFINITE and
this triggers an assert. Fix it by returning
a value we know for sure is not +-VALUE_INFINITE.
Reported by 平岡拓也 Hiraoka.
No functional change.
Note that the actual pickup is done in the class
d'tor so to be sure it is always triggered, even
in case of a sudden exit due to a 'stop' signal.
No functional change.
Function move_to_san() requires the Position to be
passed by referenced because a do/undo move is done
inside the function to detect a possible mate and to
add to the san string the corresponding '#' suffix.
Instead of passing a copy of current position pass
directly the original position object after const
casting it. This has the advantage to avoid a costly
Position copy, on the down side a bench test could
report different searched nodes if print(move) is
used, due to the additionals do_move() calls.
No functional change.
Existing Makefile is buggy for PowerPC, it has no
SSE, yet it is given it if Prefetch is enabled,
because it isn't ARMv7.
Patch from Matthew Brades.
No functional change.
We can have ttValue == VALUE_NONE when we use a TT
slot to just save a position static evaluation, but
in this case we also save DEPTH_NONE so to avoid
using the ttValue in search. This happens to work,
but due to a number of lucky and tricky cases that
we now documnet through a bunch of asserts and a
little change to value_from_tt() that has no real
effect but clarifing the code.
No functional change.
On some platforms 128 MB of RAM for TT is too much,
so run 'bench' with the default 32 MB size.
No functional change although of course now 'bench'
reports a different number: 5545018
Use popcount() instead in the only calling place.
It is used only at initialization so there is no
speed regression and anyhow even initialization
itself is not slowed down: magic bitboard setup
stays around 175 msec on my slow 32bit Core Duo.
No functional change.
Simplify the code and doing this introduce a couple
of (very small) functional changes:
- Always compare to depth even in "mate value" condition
- TT cut-off in qsearch also in case of PvNode, as in search
Verified against regression with 2500 games at 30"+0.05
on 2 threads: 451 - 444 - 1602
Functional changed: new bench is 5544977
As Joona says: "The problem is that when doing full
window search (-VALUE_INFINITE, VALUE_INFINITE), and
pruning all the moves will return fail low which is
mate score, because only clause touching alpha is
"mate distance pruning". So we are returning mate score
although we are just pruning all the moves. In reality
there probably is no mate in sight.
Bug spotted and fixed by Joona.
Implement lsb/msb using armv7 assembly instructions.
msb is the easiest one, using a gcc intrinsic that generates
code using the ARM's clz instruction. lsb is also using this
clz instruction, but with the help of ARM's 'rbit' (bit
reversing) instruction. This leads to a >2% speed gain.
I also renamed 'arm-32' to the more meaningfull 'armv7' in the Makefile
No functional change.
Port to qsearch() the same changes we recently
added to search().
Overall this search refactoring series shows
almost 2% speed up on gcc compile.
No functional change.
When using the Makefile (as for the mingw case),
IS_64BIT and USE_BSFQ are already set with
ARCH=x86-64 and do not need to be redefined.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
First disable Contempt Factor during analysis, then
calculate the modified draw score from the point of
view of the player, so from the point of view of
RootPosition color.
Thanks to Ryan Taker for suggesting the fixes.
No functional change.
These kind of arch specific code is really nasty
to make it right becuase you need to verify on
all the platforms.
Now should compile properly also on ARM
Reported by Jean-Francois.
No functional change.
In particular on ARM processors. Original patch by
Jean-Francois, sligtly modified by me to preserve
the meaning of NO_PREFETCH flag.
Verified with gcc, clang and icc that prefetch instruction
is correctly created.
No functional change.
A simplification and also a small speed-up of
about 1% mainly due to reducing calls to
thisThread->cutoff_occurred().
Worst case split point recovering time after a
cut-off occurred is limited to 3 msec on my slow
PC, and usually is below 1 msec, so it seems safe
to remove the cutoff_occurred() check.
No functional change.
This is very crude and very basic: simply in case
of a draw for repetition or 50 moves rule return
a negative score instead of zero according to the
contempt factor (in centipawns). If contempt is
positive engine will try to avoid draws (to use
with weaker opponents), if negative engine will
try to draw. If zero (default) there are no changes.
No functional change.
Use a value related to PawnValue instead.
This is a different patch from previous one because
could affect game play and skill levels, although
in a mostly unmeasurable way. Indeed thresold has
been raised so easy move is a bit harder to trigger
and skill level is a bit more prone to blunders.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Show the real value in the code, not hide it
behind a variable name, especially when there
is only once occurence.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Extend for an extra half-ply in case the node is (probably)
going to fail high. In this case the added overhead is limited.
A novelity is the way this patch has been tested: Always in
self-play but with a much longer TC to allow the singular
extension to fully kick in and also (my impression) to have
less noisy results.
Ater 1015 games on my QUAD at 60"+0.05
Mod vs Orig 173 - 150 - 692 ELO +8
Handle also the SMP case. This has been quite tricky, not
trivial to enforce the node limit in SMP case becuase
with "helpful master" concept we can have recursive split
points and we cannot lock them all at once so there is the
risk of counting the same nodes more than once.
Anyhow this patch should be race free and counted nodes are
correct.
No functional change.
As long as isPvMove (renamed to pvMove) is set after
legality check, we can postpone legality even in PV case.
Patch aligns the PV case with the common non-pv one.
No functional change.
Patch and tuning by Gary Linscott from an idea of Ryan Taker.
Double tested by Gary:
Wins: 3390 Losses: 2972 Draws: 11323
LOS: 99.999992%
ELO: 8.213465 +- 99%: 6.746506 95%: 5.124415
Win%: 51.181792 +- 99%: 0.969791 95%: 0.736740
And by me:
After 5612 games 1255 1085 3271 +11 ELO
We have a crash with this position:
rkqbnnbr/pppppppp/8/8/8/8/PPPPPPPP/RKQBNNBR w HAha -
What happens is that even if we are castling QUEEN_SIDE,
in this case we have kfrom (B8) < kto (C8) so the loop
that checks for attackers runs forever leading to a crash.
The fix is to check for (kto > kfrom) instead of
Side == KING_SIDE, but this is slower in the normal case of
ortodhox chess, so rewrite generate_castle() to handle the
chess960 case as a template parameter and allow the compiler
to optimize out the comparison in case of normal chess.
Reported by Ray Banks.
Broken by commit a44c5cf4f7 of 3 /12 / 2011 that
was labeled "No functional change" because our 'bench'
test didn't triggered that particular endgame. Indeed
we need to run a specific bench on a set of endgames
position when touching endgame.cpp because normal bench
does not cover endgames properly.
Found by MSVC 2012 code analyzer.
It seems Intel is unable to properly workout templates with 'static'
storage specifier.
Workaround using an anonymous namespace instead.
No functional change.
Currently we exit the loop when
abs(bestValue) >= VALUE_KNOWN_WIN
but there is no logical reason for this. It seems more
natural to re-search again with full open window.
This has practically no impact in most cases, we have a
'no functional change' running 'bench' command.
The trick here is to check for legality only in the
(rare) cases we have pinned pieces or a king move
or an en-passant.
This trick is able to increase the speed of perft
of more then 20%!
No functional change.
Simply reshuffling the code inverting the condition in next_attacker()
yields a miraculous speed up of more than 3% under gcc!
On my laptop a bench run goes from 320Knps to 330Knps
No functional change.
This allows to reduce the scanning for new X-ray attacks
according to the capturing piece type.
It seems to be just a very small speed increase in MSVC 64
bit and gcc 32 bit, I guess cache issues value more than some
instruction less to execute (as usual).
No functional change.
It is very difficult and risky to assure
that a running thread doesn't access a global
variable. This is currently true, but could
change in the future and we don't want to rely
on code that works 'by accident'. The threads
are still running when ThreadPool destructor is
called (after main() returns) and this could
lead to crashes if a thread accesses a global
that has been already freed. The solution is to
use an exit() function and call it while we are
still in main(), ensuring global variables are
still alive at threads termination time.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When many threds concurrently print you need to serialize
the access to std::cout to avoid output lines are intermixed
with the contents of each thread.
This is not strictly needed at the moment because
only main thread prints out, although some ad-hoc
test could trigger UCI::loop() printing while searching.
Anyhow we want to lift this pretty avoidable constrain
also as a prerequisite for future work.
This patch just introduces the support, next one will enable
the serialization.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Before the search we setup the starting position doing all the
moves (sent by GUI) from start position to the position just
before to start searching.
To do this we use a set of StateInfo records used by each
do_move() call. These records shall be kept valid during all
the search because repetition draw detection uses them to back
track all the earlier positions keys. The problem is that, while
searching, the GUI could send another 'position' command, this
calls set_position() that clears the states! Of course a crash
follows shortly.
Before searching all the relevant parameters are copied in
start_searching() just for this reason: to fully detach data
accessed during the search from the UCI protocol handling.
So the natural solution would be to copy also the setup states.
Unfortunatly this approach does not work because StateInfo
contains a pointer to the previous record, so naively copying and
then freeing the original memory leads to a crash.
That's why we use two std::auto_ptr (one belonging to UCI and another
to Search) to safely transfer ownership of the StateInfo records to
the search, after we have setup the root position.
As a nice side-effect all the possible memory leaks are magically
sorted out for us by std::auto_ptr semantic.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Should be more clear from where the 'magic' numbers
come from.
Also bit of reformat while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of just size(). Although code is longer,
should be more immediate to understand when reading.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
To mimics C++11 std::mutex and std::condition_variable,
also rename locks and condition variables to be more
uniform across the classes.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Reduce of one instruction. It seems a tad faster on
the profiler now. Very slightly but anyhow it is a
code semplification.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This better mimics std::vector::operator[] and
fixes a warning with MSVC 64bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Further reduce redundancy in move generation.
Veirifed no speed regression on MSVC, Clang and gcc.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Greatly simplify these very performace critical functions.
Amazingly we don't have any speed regression actually under
MSVC we have the same assembly for generate_moves() !
In generate_direct_checks() 'target' is calculated only
once being a loop invariant.
On Clang there is even a slight speed up.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Are used by Position but do not belong to that class,
there is only one instance of them (that's why were
defined as static), so move to a proper namespace instead.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Check we are the last slave of the split point
before to wake up the master. This should avoid
spurious wakes up.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can detect the split point master also from within idle_loop,
so we can call the function without parameters and remove an
overloaded member hack in Thread class.
Note that we don't need to take a lock around curSplitPoint
when entering idle_loop() because if we are the master then
curSplitPoint cannot change under our feet (because is_searching
is set and so we cannot be reallocated), if we are a slave
we enter idle_loop() only upon Thread creation and in that case
is always splitPointsCnt == 0. This is true even in the very rare
case that curSplitPoint != NULL, if we have been already allocated
even before entering idle_loop().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Directly use the underlying std::map instead and avoid
a useless inheritance.
As a nice side-effect Options global object has now a
default c'tor avoiding possible issues with globals
initializations.
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Templetize MovePicker::next_move() member function instead. It
is easier and we also avoid the forwarding of MovePicker() c'tor
arguments in the common case.
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
First change: If Haiku is host platform, change
installation prefix to /boot/common/bin
Second change: Only link in pthreads if Haiku isn't
host platform.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And group there all the formatting functions but
uci_pv() that requires access to search.cpp variables.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the helpers to format the PV info but without
writing to output stream (file or cout). Message
formatting and sending are two logically different
task.
Incidentaly reintroduce the pretty_pv() name,
from Glaurung memories :-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Simplifies the code and seems more natural.
We have a very small fucntional change becuase now
at PV nodes castles are extended one ply anyhow.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With warning level 4 MSVC complains that a default
assignment operator could not be generated due to
member 'file' is a reference (warning C4512).
Use a pointer instead of a reference and move
struct Tie outisde class Logger while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems more accurate: lsb is clear while 'first
bit' depends from where you look at the bitboard.
And fix compile in case of 64 bits platforms that
do not use BSFQ intrinsics.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of -O4 option that does not work with both mingw and
Linux gcc (tested with Clang 3.1).
As reported by Reed Kotler:
Turns out that -O4 is not a valid option for clang unless you have
the proper gold linker and plugins built. That's because -O4 enables
LTO, which writes out bitcode files during the compile, and then loads
those and optimizes them during the link phase.
It requires a linker that supports LLVM's LTO. There is a plugin for
Gold available as part of LLVM.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We have a signed integer here so let the return type
take in account that.
Found by Clang with -Weverything option.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Return a reference instead of void so to enable
chained assignments like
"p = q = Position(...);"
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Although a (little) functional change, we have no ELO change
but formula it is now more clear.
After 13019 games at 30"+0.05
Mod vs Orig 2075 - 2088 - 8856 ELO 0 (+- 3.4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems the standard behaviour as implemented
in most engines although UCI protocol does not
specify what to do upon "ucinewgame" command.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Calculate the attacks only for the piece to disambiguate,
not for all.
Also some reformatting while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
At endgame time push the king near his pawns (actually
one of them).
Original idea is from Critter (although slightly different),
implementation is mine and is completely different from the
original, in particular it is different the algorithm to
compute the minimum distance from pawns.
After 19895 games at 15"+0.05
Mod vs Orig 3638 - 3248 - 13009 ELO +7
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Error is due to ambiguous overloading of operator^
because we have both the built-in operator^(int, int)
and the user defined operator^(Bitboard, Square).
This error does not trigger when using Makefile becuase,
due to luck, the user defined operator^(Bitboard, Square)
happens to be always defined _after_ opposite_colors() so
that compiler does not claim. But in case of Microsoft MSVC
we don't have a Makefile and the order of files compilation
is chosen by the compiler (in an unpredictable way). So it
could still happen that error is not detected (as in my case),
but in another case the order of compilation of the files could
be so that at some point both operator^ were defined before
opposite_colors() and this triggers the error.
The fix is much simpler than the explanation :-)
Reported by Quocvuong82.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently when we fail to open a book file, for instance
if it doesn't exsist, we leave Book::open() with ifstream
failbit set. If then the book file is added, we correctly
open it at next attempt, but failbit is still set so that
after opening we exit because ifstream::good() returns false.
The fix is to reset failbit upon exiting.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In case a book entry has 'count' field set to 0
we crash. Spotted by Clang's static analyzer.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So to be fully in sync with StateInfo, and move struct
to position.h, just below StateInfo.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Revision 2aac860db3 of 27 / 4 / 2012
changed can_castle() signatures from bool to int and
this broke the code that calculates polyglot hash key
in book.cpp
Instead of directly fixing the code we prefer to change
castling rights definitions to align to the polyglot ones
(as we did in previous patch). After this step we can simply
take internal castle rights as they are and use them
directly to calculate polyglot book hash key, as we do
in this patch that fixes the regression.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
To be aligned with PolyGlot book castle right definitions.
This will be used by next patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Early classify as known draws the positions
where white king is trapped on the rook file.
Suggested by Dan Honeycutt.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After extensive testing (I was off line this week and let
the test go on) it seems this change is useless:
After 33968 games on a QUAD (4 threads) at 15"+0.05
Mod vs Orig 5425 - 5550 - 22993 ELO -1 (+-2.1)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Only in case of promotion we care about an upper case
promotion piece char, so std::transform() is overkill
for the task.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Assumption: Junior sends promotions according to the side to move (ucase/lcase).
Fact: Stockfish generally handles promotion lcase.
Patch: Handling position fen input moves always with lcase promotions.
Ported back by Portfish. No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems ADL lookup is broken with the STLPort library. Peter says:
The compiler is gcc 4.4.3, but I don't know how many patches they
have applied to it. I think gcc has had support for Koenig lookup
a long time. I think the problem is the type of the vector iterator.
For example, line 272 in search.cpp:
if (bookMove && count(RootMoves.begin(), RootMoves.end(), bookMove))
gives the error:
jni/stockfish/search.cpp:272: error: 'count' was not declared in this scope
Here RootMoves is:
std::vector<RootMove> RootMoves;
If std::vector<T>::iterator is implemented as T*, then Koenig lookup
would fail because RootMove* is not in namespace std.
I compile with the stlport implementation of STL, which in its vector
class has:
typedef value_type* iterator;
I'm not sure if this is allowed by the C++ standard. I did not find
anything that says the iterator type must belong to namespace std.
The consensus in this thread
http://compgroups.net/comp.lang.c++.moderated/argument-dependent-lookup/433395
is that the stlport iterator type is allowed.
Report and patch by Peter Osterlund.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Let first argument to be the 'color'. This allows to align
pos.pieces() signatures with piece_list(), piece_count() and
all the other functions where 'color' argument is passed as
first one.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Hex representation doesn't add any value in those cases.
Preserve hex representation where more self-documenting
for instance for binary masks values.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems to increase SMP performances. To note that
this patch goes in the opposite direction of "Active
reparenting" where we try to reparent an idle slave
as soon as possible. Instead here we prefer to keep
it idle instead of splitting on a shallow / near the
leaves node.
After 11550 games on a QUAD (4 threads) at 15"+0.05
Mod vs Orig 1972 - 1752 - 7826 ELO +6 (+-3.6)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Set optimization level to 4 and get a 2.564% faster binary:
Stockfish (Clang, Level 4) bench:
$ make build ARCH=osx-x86-64 COMP=clang
(Clang does not support PGO)
Average of 4 trials:
Total time (ms): 5137.5
Nodes searched: 5631135
Nodes/second: 1096084.5
Stockfish (Clang, Level 3) bench:
$ make build ARCH=osx-x86-64 COMP=clang
(Clang does not support PGO)
Average of 4 trials:
Total time (ms): 5269.25
Nodes searched: 5631135
Nodes/second: 1068679.25
Stockfish (GCC, PGO) bench:
$ make profile-build ARCH=osx-x86-64
Average of 4 trials:
Total time (ms): 5286
Nodes searched: 5631135
Nodes/second: 1065292.25
Suggestion and performance tests by Daylen Yang.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A PvNode that givesCheck has been already granted
an extension of ONE_PLY in previous condition.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Makefile modified to support the clang compiler under Mac.
This was tested using clang 4 under Mountain Lion, but should
also work fine under Lion and possibly under Snow Leopard.
It requires the 'Xcode 4.x Command Line Tools' to be installed.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Since revision 374c9e6b63
we use also castling information to calculate king safety.
So before to reuse the cached king safety score we have to
veify not only king position, but also castling rights are
the same of the pre-calculated ones.
This is a very subtle bug, found only becuase even after
previous patch, consecutives runs of 'bench' _still_ showed
different numbers. Pawn tables are not cleared between 'bench'
runs and in the second run a king safety score, previously
evaluated under some castling rights, was reused by another
position with different castling rights instead of being
recalculated from scratch.
Bug spotted and tracked down by Balint Pfliegel
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we can call bench multiple times
from command prompt we need to ensure searched
nodes remain constant across different runs.
Spotted by Blint Pfliegel.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 6K games at 60" + 0.1 on QUAD with 4 threads
this implementation fails to show a measurable increase,
result is well within error bar.
Perhaps with 8 or more threads resut is better but we
don't have the hardware to test. So retire for now and
in case re-add in the future if it proves good on big
machines.
The only good news is that we don't have a regression and
implementation is stable and bug-free, so could be reused
somewhere in the future.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The check for detecting when a split point has all the
slaves still running is done with:
slavesMask == allSlavesMask
When a thread reparents, slavesMask is increased, then, if
the same thread finishes, because there are no more moves,
slavesMask returns to original value and the above condition
returns to be true. So that the just finished thread immediately
reparents again with the same split point, then starts and
then immediately exits in a tight loop that ends only when a
second slave finishes, so that slavesMask decrements and the
condition becomes false. This gives a spurious and anomaly
high number of faked reparents.
With this patch, that rewrites the logic to avoid this pitfall,
the reparenting success rate drops to a more realistical 5-10%
for 4 threads case.
As a side effect note that now there is no more the limit of
maxThreadsPerSplitPoint when reparenting.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Check for a cutoff occurred also high in
the tree and not only at current split
point.
This avoids some more wasted reparenting.
No functional chnage.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is more correct given what the function does. In
particular single_bit() returns true also in case of
empty bitboards.
Of course also the usual renaming while there :-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of reparenting to oldest split point, try to reparent
to latest. The nice thing here is that we can use the YBWC
helpful master condition to allow the reparenting of a split
point master as long as is reparented to one of its slaves.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And update master->splitPointsCnt under lock
protection. Not stricly necessary because
single_bit() condition takes care of false
positives anyhow, but it is a bit tricky and
moving under lock is the most natural thing
to do to avoid races with "reparenting".
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In Young Brothers Wait Concept (YBWC) available slaves are
booked by the split point master, then start to search below
the assigned split point and, once finished, return in idle
state waiting to be booked by another master.
This patch introduces "Active Reparenting" so that when a
slave finishes its job on the assigned split point, instead
of passively waiting to be booked, searches a suitable active
split point and reprents itselfs to that split point. Then
immediately starts to search below the split point in exactly
the same way of the others split point's slaves. This reduces
to zero the time waiting in idle loop and should increase
scalability especially whit many (8 or more) cores.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Apart from the semplification it is now more clear that
the actual Tempo added was half of the indicated score.
This is becuase at start compute_psq_score() added half
Tempo unit and in do_move() white/black were respectively
adding/subtracting one Tempo unit.
Now we have directly halved Tempo constant and everything
is more clear.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is still enabled during fixed limit search so to
use it during fixed depth/nodes/time matches.
Bug reported by Daylen.
No functional changes.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Shrinking from [16] to [2][2] is able to speedup
perft of start position of almost 5% !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Shrink dimensions of the biggest stack consumers arrays.
In particular movesSearched[] can be safely shrinked
without any impact on strenght or risk of crashing.
Also MAX_PLY can be reverted to 100 with almost no impact
so to limit search recursion and hence stack allocation.
A different case is for MAX_MOVES (used by Movepicker's
moves[]), because we know that do exsist some artificial
position with about 220 legal moves, so in those cases SF
will crash. Anyhow these cases are never found in games.
An open risk remains perft, especially run above handcrafted
positions.
This patch originates from a report by Daylen that found
SF crashing on his Mac OS X 10.7.3 while in deep analysys
on the following position:
8/3Q1pk1/5p2/4r3/5K2/8/8/8 w - - 0 1
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now a fen file with Chess960 positions is
correctly parsed. But it is mandatory to set
"UCI_Chess960" option _before_ to call bench.
Note that this was not needed/possible before
adding the possibility to call 'bench' from
command prompt.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we can call bench on current position
we can directly use it to perform our perft.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we can call bench from command prompt
has a sense to teach bench to run the current
set position. To do this is enough to call bench
with 'current' as fen source parameter.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After extensive test Gary says:
"So, after 16k games at 10"+1" on an i7, the undefended rook test
looks to be not good (albeit by a very small margin).
3063 - 3093 - 9844 (-1).
I doubt that is causing the regression, but even so, it looks like
it's not worth keeping, and we can go back to the simpler undefended
minors check."
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unfortunatly accessing thread local variable
is much slower than object data (see previous
patch log msg), so we have to revert to old code
to avoid speed regression.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Much faster then pthread_getspecific() but still a
speed regression against the original code.
Following are the nps on a bench:
Position
454165
454838
455433
tls
441046
442767
442767
ms (Win)
450521
447510
451105
ms (pthread)
422115
422115
424276
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After we release the SplitPoint lock the master, suppose
is main thread, can safely return and if a "quit" command
is pending, main thread exits and associated Thread object
is freed. So when we access master->is_searching a crash
occurs.
I have never found such a race that is of course very rare
becuase assumes that from lock releasing we go to sleep for
a time long enough for the main thread to end the search and
return. But you can never know, and anyhow a race is a race.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So to avoid a crash when setting the moves in
UCI "position startpos moves ...." command.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
But use the newly introduced local storage
for this. A good code semplification and also
the correct way to go.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use thread local storage to store a pointer to the thread we
are running on. This will allow to remove thread info from
Position class.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this change sources are fully endianess
independent, so we can simplify the Makefile.
Somewhat surprisingly we don't have any speed
regression !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Erroneusly adds default positions to the fens
loaded from external file.
Bug introduced in adb71b8096
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The 2 overload functions map() accept a pointer to
EndgameBase<Value> or a pointer to EndgameBase<ScaleFactor>.
Because Endgame<E> is derived from one of them we can
directly use a pointer to this class to resolve the
overload as is needed in Endgames::add().
Also made class Endgames fully parametrized and no more
hardcoded to the types (Value or ScaleFactor) of endgames
stored.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A std::set (that is a rb_tree) seems really
overkill to store at most a handful of moves
and nothing in the common case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is possible to start with 'stockfish', then from
command prompt type 'bench' and SF will do what you expect.
Old behaviour is anyhow preserved. As a bonus we can now
start from command line any UCI command understood by
Stockfish. The difference is that after execution of a
command from arguments SF quits, while at the end of the
same command from prompt SF stays in UCI loop.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Allows some code semplification and avoids directly
allocation and managing heap memory.
Also the usual renaming while there.
No functional change and no speed regression.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In particualr before to wake up main thread that
could take some random time. Until we don't reset
search time we are not able to correctly track
the elapsed search time and this can be dangerous
under extreme time pressure.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We need to wake up main thread if it is sleeping
waiting for stop or ponderhit, so we cannot skip
calling wait_for_search_finished().
Found by Othello1984.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In this case SF stop searching and goes sleeping
waiting for a stop / ponderhit before to return
best move. So when a "stop" arrives we need to wake
up the main thread again.
Another regression introduced by 3aa471f2a9,
hopefully the last one.
Thanks to Otello1984 to reporting this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Renamed stuff and added comments. The aim is to make more
readable, at least by me ;-) , this newly added part of code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Incredible typo from my side!
The 2 tables are completely different, one counts 1s the
other returns the msb position. Even more incredible
the 'stockfish bench' command returns the same number
of nodes!!!
Spotted by Justin Blanchard.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add more detailed pawn shelter/storm evaluation
After 10670 games at 10"+0.05
Mod vs Orig 2277 - 1941 - 6452 ELO +11 !!!
The first real increase since 2.2.2, congratulations Gary !!!
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fixes a not so rare crash (once every 100 games)
newly introduced. Unfortunatly I am still not
able to figure out why :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
There is no need to "invent" different names
from the original UCI parameters.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Penalty for undefended rook
Almost no change at longer TC, but perhaps there
is a tiny increase....
After 17522 games at 10"+0.05
Mod vs Orig 3064 - 2967 - 11491 ELO +2
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When quitting we should avoid RootPosition to be
destroyed while threads are still running, leading
to a crash. In case of a "stop" or "ponderhit"
command there is no need for the UI thread to wait.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And add final touches to this long patch series.
All the series has been verified against regression with
20K games at fast TC.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We have a clash with start_fn defined both as a
Thread memeber and as a function pointer type in
pthread_create().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We cannot set do_sleep flag of main thread before
"bestmove" is sent to GUI, otherwise GUI could send
immediately the next "go" command that triggers
start_thinking() and because do_sleep is set UI
thread resets the flag to launch a new search. But
when shortly after main thread returns to main_loop()
flag is incorrectly reset and main thread goes to sleep
hanging the engine.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We store pointers instead of Thread objects because
Thread is not copy-constructible nor copy-assignable
and default ones are not suitable. So we cannot store
directly in a std::vector.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Associate platform OS thread to the Thread class instead of
creating it from ThreadsManager.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Split the data allocation, now done (mostly once)
in read_uci_options(), from the wake up and sleeping
of the slave threads upon entering/exiting the search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems is the cause of strange and rare hangs
reported by some users where Stockfish stops
responding to GUI. It is not clear why but for
the moment revert the patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Not correct warning about use of an uninitialized
variable. The warning is not correct becuase we can
never reach the warned code when in SpNode, anyhow
the fix is simple.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A somewhat tricky function pointer cast allows us
to move the platform specifics to lock.h, the cast
is tricky because return type is not the same of the
casted function in Linux (for Windows return type is
a DWORD that is a long) but this should not be a
problem as long as the size is the same;
From: http://stackoverflow.com/questions/188839/function-pointer-cast-to-different-signature
"OpenSSL was only casting functions pointers to
other function types taking and returning the same
number of values of the same exact sizes, and this
(assuming you're not dealing with floating-point)
happens to be safe across all the platforms and
calling conventions I know of. However, anything
else is potentially unsafe."
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This should avoid some aliasing issues
with TT table access.
After 3913 games at 10"+0.05
Mod vs Orig 662 - 651 - 2600 ELO +0 (+- 6.4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Even if not under attack. This seems to be good
especially on openings.
After 12112 games at 10"+0.05
Mod vs Orig 2175 - 1997 - 7940 ELO +5 (+- 3.7)
[Patch series from Gary, little edited by me]
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We need splitted Tie classes because MSVC stream library
takes a lock on buffer both on reading and on writing and
this causes an hang because, while searching, the I/O
thread is locked on getline() and when main thread is
trying to std::cout() something it blocks on the same
lock waiting for I/O thread getting some input and
releasing the lock.
The solution is to use separated streambuf objects for
cin and cout.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A big code simplification and cruft removing, make
Logger class a singleton and fully self conteined.
Also add direction indicators (">>" and "<<") to
better differentiate input and output lines in the
log file.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Using "o" as a parameter with the on_xxx(const UICOption& o)
functions is a bit dangerous because of confusion with "0".
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Some trial was needed to find the correct recipe but now
we log both stdin and stdout to file "io_log.txt".
Link http://spec.winprog.org/streams/ was very useful
to understand the details of iostreams implementation.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
By means of "Use Debug Log" UCI option it is possible to toggle
the logging of std::cout to file "out.txt" while preserving
the usual output to stdout. There is zero overhead when logging
is disabled and we achieved this without changing a single line
of exsisting code, in particular we still use std::cout as usual.
The idea and part of the code comes from this article:
http://groups.google.com/group/comp.lang.c++/msg/1d941c0f26ea0d81
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In particular before initialization. So that SF
seems more snappy at startup.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
As it was in Glaurung times. Also rearranged order
so that byTypeBB[0] is accessed before byTypeBB[x]
to be more cache friendly. It seems there is even
a small speedup.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also some microoptimizations, were there from ages
but hidden: the renaming suddendly made them visible!
This is a good example of how better naming lets you write
better code. Naming is really a kind of black art!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Stick to UCI protocol that says:
* by default all the opening book handling is done by the GUI,
but there is an option for the engine to use its own book
("OwnBook" option, see below)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
UCI protocol it is not clear about what the engine
should be supposed to do when "ucinewgame" is
received. Stockfish simply sets the position to
start FEN, but it is redundant becuase the GUI always
resends the position after "ucinewgame" command, so
it seems we can safely ignore that command.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When a button fires UCIOption::operator=() is called and from
there the on_change() function. Now it happens that in case of
a button the on_change() function resets option's value to
"false" triggering again UCIOption::operator=() that calls again
on_change() and so on in an endless loop that is experienced
by the user as an application hang.
Rework the button logic to fix the issue and also be more clear
about how button works.
Reported by several people working with Scid and tracked down
to the "Clear Hash" UCI button by Steven Atkinson.
Bug recently introduced by 2ef5b4066e.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now we are forced to just use C++ iostream becuase
buffers are independent and using C library functions
like printf() or scanf() could yield to issues.
Speed up of about 1%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Result of t.time * 1000 should be a 64 bit value, not an int.
Bug reported by several users.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fine tune newly introduced pinner bonus score:
After 34696 games at 2"+0.05
Mod vs Orig 7474 - 7087 - 20135 ELO +3 (+- 2.4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So to be done only once at startup and in the (unlikely)
cases that a relevant UCI parameter is changed, instead
of doing it at the beginning of each search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce 'on change' actions that are triggered as soon as
an UCI option is changed by the GUI. This allows to set hash
size before to start the game, helpful especially on very fast
TC and big TT size.
As a side effect remove the 'button' type option, that now
is managed as a 'check' type.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Self-documenting code instead of a tricky
bitwise tweak, not known by everybody.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add a bonus if a slider is pinning an enemy piece.
Idea from Critter.
After 27443 games at 2"+0.05
Mod vs Orig 5900 - 5518 - 16025 ELO +4 (+- 2.7)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce and use a new Time class designed after
QTime, from Qt framework. Should be a more clear and
self documented code.
As an added benefit we now use 64 bits internally to get
millisecs from system time. This avoids to wrap around
to 0 every 2^32 milliseconds, which is 49.71 days.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Visual Studio 11 is worried that shift result could
overflow an integer, this is impossible becuase max
value of the shift is 4, but compiler cannot know it.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Pre compute castle path so to quickly test
for impeded rule.
This speeds up perft on starting position
of more than 2%.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fix this warning with MSVC 64 bits:
warning C4244: '=' : conversion from 'std::streampos' to 'size_t',
possible loss of data
Point is that std::streampos could be negative, while size_t
is always non-negative.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And introduce SPlitPoint bestMove to pass back the
best move after a split point.
This allow to define as const the search stack passed
to split.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is a prerequisite for next patch and simplifies
the function. testing at ultra fast TC shows no
regression.
After 24302 games at 2"+0.05
Mod vs Orig 5122 - 5038 - 13872 ELO +1 (+- 2.9)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Reverse the meaning of castleRightsMask[sq] so that now
is stored the castling right that will be removed in
case a move starts from or arrives to sq square. This
allows to simplify the code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of by square. This is a more conventional
approach, as reported also in:
http://chessprogramming.wikispaces.com/Zobrist+Hashing
We shrink zobEp[] from 64 to 8 keys at the cost of an extra
'and 7' at runtime to get the file out of the ep square.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We shouldn't need lock protection to increment
splitPointsCnt and set curSplitPoint of masterThread.
Anyhow because this code is very tricky and prone to
races bound the change in a single patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When updating castleRights in do_move() perform only one
64bit xor with zobCastle[] instead of two.
The trick here is to define zobCastle[] keys of composite
castling rights as a xor combination of the keys of the
single castling rights, instead of 16 independent keys.
Idea from Critter although implementation is different.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Because TT table is shared tte->move() could change
under our feet, in particular we could validate
tte->move() then the move is changed by another
thread and we call pos.do_move() with a move different
from the original validated one !
This leads to a very rare but reproducible crash once
every about 20K games.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
There is no need to limit the maximum ply searched to
100, with deep exclusion search extensions we could
reach it even with much smaller search depths.
The only drawback is an increase in stack usage, but
is limited mainly to id_loop(), in particular the
recursive search() functions are not affected.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Replace a 64 bit 'and' by two 32 bits ones and
use unsigned instead of int.
This simple patch increases perft speed of 6% on
my Intel Core 2 Duo !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
But only when needed, after a split point. This behaviour
does not apply when useSleepingThreads is false, becuase
in this case threads are not woken up at split points so
must be already running.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Rule says should be reset only after a capture and/or
a pawn move.
This incredible bug was here since Glaurung times !
Spotted by Kiriakos.
No functional change in the test bench because we
don't reach the 50 moves limits.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With default value of 100 no change in regard of current
behaviour. Increasing the value makes SF to think a
longer time for each move. Decreasing the value makes SF
to move faster.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This method belongs to Thread, not to ThreadsManager.
Reshuffle stuff in thread.cpp while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Release split point lock before to wake up
master thread. This seems to increase speed
in case "sleeping threads" are used:
After 7792 games with 4 threads at very fast TC (2"+0.05)
Mod vs Orig 1722 - 1627 - 4443 ELO +4 (+- 5.1)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When allocating a slave we set both is_searching
and splitPoint under lock protection.
Unfortunatly the order in which the variables are
set is not defined. This article was very clarifying:
http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/
So when in idle loop we test for is_searching and then
access splitPoint, it could happen that splitPoint is still
not updated leading to a possible crash.
Fix the race lock protecting splitPoint access.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With current code we could raise bestValue above beta,
not what is intended for.
Spotted by Richard Vida.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Simplify and streamline the code. Verified all the
resulting bitbases are not changed.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fix an issue where the log file stores an incorrect +0.00
eval after a search has been stopped.
Bug reported by Ajedrecista.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This allows to retire ClearMaskBB[] and use just
one SquareBB[] array to set and clear a bit.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this we should compeltely remove the need
of installing third party POSIX threads library
when compiling with mingw-gcc under Windows.
Spotted by Trung Tu.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not clear the advantage and we don't want
to risk of introducing regressions on this
very critical parameter. So revert to old limit.
After 16003 games
Mod vs Orig 2496 - 2421 - 11086 ELO +1 (+-3)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Apart from some renaming the biggest change
is the retire of split_point_finished()
replaced by slavesMask flags. As a side
effect we now take also split point lock
when allocation available threads.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of Posix threads. This seems to fix time
losses of the gcc compiled version for Windows.
The patch replaces the MSVC specific _MSC_VER flag
with _WIN32 and _WIN64 that are defined both by
MSVC and mingw-gcc.
Workaround found by Jim Ablett.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of by SEE. Almost no ELO change but it is
a bit easier and is a more natural choice given
that good captures are ordered in the same way.
After 10424 games
Mod vs Orig 1639 - 1604 - 7181 ELO +1 (+-3.8)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
pass references (Windows style) instead of
pointers (Posix style) as function arguments.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The previous line, LDFLAGS += $(CXXFLAGS), does not make sense, and
breaks profile-build, thus changing it into: LDFLAGS += -flto.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate TT_MOVE step into the first state. This allows to
avoid the first call to next_phase() in case of a TT move.
And use overflow detection instead of the bunch of STOP_XX
states to detect end of moves.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In case of a PvNode could happen that alpha == beta - 1,
for instance in case the same previous node was visited
with same beta during a non-pv search, the node failed low
and stored beta-1 in TT. Then the node is searched again
in PV mode, TT value beta-1 is retrieved and updates alpha
that now happens to be beta-1.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Almost no functional change because multiple recaptures
to same square are very rare, but neverthless it seems
the correct thing to do.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
They are almost the same, if the function arguments would
have been the same they could even be integrated.
Also a bit of renaming while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And increase LMR limit. Tests show no change ELO wise,
but we prefer to take the risk to commit anyhow becuase
is a 'prune reducing' patch.
After 10749 games
Mod vs Orig: 1670 - 1676 - 7403 ELO 0 (+-3.7)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Skip calling promotion generation functions in
the very common case of no possible promotion
evasion. Also retire generate_pawn_captures()
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fix an incorrect warning: 'bm may be used uninitialized'
with the old but still commonly used gcc 4.4
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we don't support anymore popcount detection
at runtime this target is obsolete.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use $(CXX) instead of assuming compiler name is 'gcc'
Spotted by Louis Zulli.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
All the piece dependant data is passed now as
function arguments so that the code is exactly
the same for bishop and rook.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is more correct and specific. Another naming
improvement while reading Critter sources.
No functional changes.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And directly pass RootMoves instead of SearchMoves
to main thread. A class declaration is better suited
in a header and slims a bit the fatty search.cpp
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We just need to verify if a legal move is among the
SearchMoves, so we don't need a vector for this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
New gcc 4.7 complains about casting a volatile pointer
to void* so assign the variables directly.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It could lead to terrible mistakes otherwise, as
it happened during a game on playchess when on
this position (after white's f4):
2q4r/4b1k1/p3rpp1/3np2p/PpNpNP1P/1P1P2PQ/2P1R3/4R1K1 b - - 0 1
SF moves immediately e5xf4 instead of the correct f5.
In general during engine matches it is impossible the
opponent leaves a piece hanging or anyhow starts a
clear losing sequence. So avoid to fall in subtle traps.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It should help to avoid recalculating check squares
of sliding attackers for queen when already done for
bishops and rooks. Of course this helps when there are
bishop, rook and queen on the board !
Fixed also a subtle bug (use of same variable b in while
condition and in condition body) introduced recently by
revision d655147e8c that triggers
in case we have at least 2 non-pawn discovered check pieces.
This is very rare that's why didn't show in the node count
verification where we actually have a case of 2 dc pieces
in position 14, but one is a pawn.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
During generation of non-captures checks (in qsearch)
we don't consider castling moves that give check,
this patch includes also this rare case. Verified with
perft that all the non-capture checks are now generated.
There should be a very little slowdown due to the extra
work, but actually I failed to measure it. I don't expect
any ELO improvment, there is even no functional change on
the standard depth 12 search, it is just to have a correct
move generator.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The full movegen patch series shows a speed up of almost
6% (!) on perft, and code is much more readable too.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
MV_CHECK is an alias of the more appropiate named
MV_NON_CAPTURE_CHECK so use only the latter.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And make CRITICAL_SECTION locks the only option for Windows.
This guarantees backward compatibility with all the Windows
versions (even XP and older) and an hassle free experience
when compiling for Windows. Tests performed by Ingo and
reported on talkchess confirm there is no speed penalty
against the most modern SRW locks:
http://www.talkchess.com/forum/viewtopic.php?t=41835&start=20
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
On that platform non-bracketed casting are not supported.
Reported by Richard Lloyd.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Further increase safety against time losses. After this
change (tested on LittleBlitzer and cutechess) I had no
more time losses at 2" and 1"+0.02 TC both on Windows
and Linux on more than 10000 games.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We try hard not to lose on time even under extreme
time pressure. We achieve this through 3 different but
coordinated steps:
1) Increase max frequency of timer events
2) Quickly return after a stop signal
3) Take in account timer resolution
With these SF played under LittleBlitzer at 1"+0.02 and 3"+0
without losing on time even one game.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Before creating main thread we set its do_sleep flag to true,
then thread is created and it will go to sleep in main_loop()
after resetting do_sleep.
But if after the setting of do_sleep and before its resetting
the UI thread calls start_thinking() it will not wait on:
if (!asyncMode)
while (!main.do_sleep)
cond_wait(&sleepCond, &main.sleepLock);
as it should but will immediately return before the main thread has
started the search. This very rare race show itself during bench,
when the first position is erroneusly skipped so that bench node count
results of 5309038 instead of the correct 5457475.
The patch is somewhat tricky, but is simple and it works!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Locals left and right shadow two same named
variables in the std::ifstream base class.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems it yields to missing wake-up events with the
result of SF loosing on time as reported by many people.
So revert the patch and use a more robust approach: assume
there can be spurious wake ups events and make the code to
work also in those cases.
While debugging I found that WaitForSingleObject() had wrong
parameter 0 instead of INFINITE yielding to a crash while
exiting under Windows, strangely unnoticed til now.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The aim is to have shorter names without losing
readibility but, if possible, increasing it.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Retire open(), close() and name() from public visibility
and greately simplify the code. It is amazing how much
can be squeezed out of an already mature code !
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In Windows when OLD_LOCKS is defined we use SetEvent() to mimic
the semantic of the POSIX pthread_cond_signal().
Unfortunatly there is not a direct mapping because with SetEvent()
the state of an event object remains signaled until it is set
explicitly to the nonsignaled state or until a single waiting thread
has been released. Instead in case of pthread_cond_signal(), if there
are no waiting threads it has no effect. What we may want is something
like PulseEvent() instead of SetEvent(). Unfortunatly it is documented
by Mcrosoft as 'unreliable' due to spurious wakes up that could
filter out the signal resetting. So we opt to reset manually any
pending signaled state before to go to sleep.
This fixes the strange misbehaves during 'stockfish bench'
when using OLD_LOCKS under Windows.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that HasPopCnt is a compile time constant we can
centralize and unify the BitCountType selection.
Also rename count_1s() in the more standard popcount()
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was meant to build a single binary optimized
for any kind of CPU: with and without hardware POPCNT.
This is a nice idea but in practice was never used, or
people builds binary with popcnt enabled or not, mainly
according to their type of CPU. And it was also never
used in the official Jim's builds where, in case, would
be easier for a number of reasons, do build two different
versions: with and without SEE42 support.
So retire this feature and simplify the code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This leads to a further and unexpected simplification
of this already very size optimized code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently after a 'quit' command UI thread raises stop
signal, exits from uci_loop() and calls Threads.exit()
while the search threads are still active.
In Threads.exit() main thread is asked to terminate, but
if it is parked in idle_loop() it will exit and free its
resources (in particular the shared Movepicker object) while
sibling slaves are still active and this leads to a crash.
The fix is to let the UI thread always wait for main thread
to finish the search before to return from uci_loop().
Found by Valgrind when running with 8 threads.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Greatly improves the usage. User defined conversions
are a novelity for SF, another amazing C++ facility
at work !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The condition for a mate score was wrong:
abs(v) < VALUE_MATE - PLY_MAX * ONE_PLY
instead of
abs(v) < VALUE_MATE_IN_PLY_MAX
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
During the search we score a mate as "plies to mate
from the root" to compare in an homogeneous way the
values returned by different sub-trees. However we
store in TT a mate score as "plies to mate from the
current position" the let the TT value remain valid
across the game.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is simple but somewhat tricky code that deserves
a bit of documentation. A bit of renaming while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add the check that alpha < beta - 1 if and only if PvNode is true.
The current code would not flag PvNode and alpha == beta - 1. In
other words, the || is not an exclusive OR!.
Also sync assert conditions of search() and qsearch()
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Make a better use of C++ operators overloading to
streamline the APIs.
Also sync polyglot.ini file while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
I am not able to reproduce the speed regression anymore,
and also we were using cout even before speed regression
so probably the reason is not there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Be consistent with the way these operators are defined
in plain C (and in C++).
Spotted by Lucas Braesch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We don't use killers to order evasions, so it
seems natural do not consider an evasion cut-off
move as a possible killer. Test shows almost no
change, as it should be becuase this is a really
tiny change, but neverthless seems the correct
thing to do.
After 11893 games
Mod vs Orig 1773 - 1696 - 8424 ELO +2 (+-3.4)
Idea from Critter.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Take advantage of argument-dependent lookup (ADL) to
avoid specifying std:: qualifier in some STL functions.
When a function argument refers to a namespace (in this
case std) then the compiler will search the unqualified
function in that namespace too.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Partially revert efd2167998
Without this patch SF does not send "bestmove" to GUI.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Seems sensibly faster: On a
./stockfish bench > /dev/null
We have +2% on mingw and even +5% on MSVC !
Also removed the nice but complex enum set960 machinery,
use directly the underlying move_to_uci() function.
Speed regression reported by Heinz van Saanen.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Do not return the book move if is not among the
RootMoves,in particular if we have been asked to
search on a move subset with "searchmoves" then
return book move only if it is among this subset.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce pv_info_to_log() and pv_info_to_uci() and
greatly cleanup this stuff.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Actually after last patch it happens that delta
starts always with the fixed value of 16.
So further remove useless code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems that we just need to look at previous score to
compute aspiration window size.
After 5350 games:
Mod vs Orig 800 - 803 - 3647 ELO +0 (+- 5.2)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Diretcly use the underlying std::vector<Move> and the
STL algorithms. Also a bit of cleanup while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is ok to redirect st pointer to startState, but the latter
should be updated with the content pointed by the st of the
original position. The bug is hidden when startState and *st
are the same as is the case of searching from start position,
but as soon as moves are made (as is the case when splitting)
the bug leads to a crash.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The Position object used by UI thread is a local variable in
uci_loop(), so after receiving "quit" command the function
returns and the position is freed from the stack.
This should not be a problem becuase in start_thinking() we copy
the position to RootPosition that is the one used by main search
thread. Unfortunatly we blindly copy also StateInfo pointer that
still points to the startState struct inside UI position. So the
pointer becomes stale as soon as UI thread leaves uci_loop() and
because this happens while main search thread is still recovering
after the 'stop' signal we have a crash.
The fix is to update the pointer to the correct startState after
the copy.
Found with Valgrind.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Comments should be informative but not pedantic / obvious.
The only exception is the function description where we
indulge a bit on the "chatty" side, but has always been like
this since Glaurung times, so we continue with this tradition.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Tested togheter with previous patch; shows no regression and
is a semplification.
After 5817 games:
Mod vs Orig 939 - 892 - 3986 ELO +2 (+- 5.1)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Triggered by a comment of Eelco on talkchess. Also
a bit of cleanup while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Consider negative captures as good if
still enough to reach beta.
After 7502 games:
Mod vs Orig 1225 - 1158 - 5119 ELO +3 (+- 4.5)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A cast rarely is the right solution. In this case was enough
to redifine 3 variables with type size_t instead of int
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems we don't have any added value. Note that the
moves that were used to be extended are still flagged
as dangerous so to avoid at least pruning them.
After 9555 games
Mod vs Orig 1562 - 1540 - 6453 ELO +0 (+- 4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
As Heinz says:
"Function empty() should have a constant run-time even
on lousy compilers and you spare the not.
The change is even measurable: + 100-150 nodes/sec. Wow:-)"
Patch from Heinz van Saanen
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is more idiomatic for a functor (a function object) as are
the endgames.
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A pinned piece cannot move and so does not play any role
in SAN disambiguation.
Reported by Steven Edwards.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was never clear to me why we needed this trick, and now
that we rely only on C++ std::getline() and std::cout for
input / output it is even more a mistery what this code does.
So disable it and wait to see if someone screams ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Detach from the UI thread the input arguments used by
the search threads so that the UI thread is able to receive
and process any command sent by the GUI while other threads
keep searching.
With this patch there is no more need to block the UI
thread after a "stop", so it is a more reliable and
robust solution than the previous patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unfortunatly xboard sends immediately the new position to
search after sending "stop" when we have a ponder miss.
Becuase main thread position is not copied but is referenced
directly from root position and the latter is modified by
the "position.." UCI command we end up with the working position
that changes under our feet while the search is still recovering
after the "stop" and this causes a crash.
This happens only with the (broken) xboard, native UCI does not
have this problem.
Reported by otello1984
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fixes an hang when playing with ponder ON. Perhaps there is still
a very small race but now it seems engine does not hang anymore.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move global search-related variables under "Search" namespace.
As a side effect we can move uci_async_command() and
wait_for_stop_or_ponderhit() away from search.cpp
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the starting thread to wait for GUI input and instead use
the other threads to search. The consequence is that now think()
is alwasy started on a differnt thread than the caller that
returns immediately waiting for input. This reformat greatly
simplifies the code and is more in line with the common way
to implement this feature.
As a side effect now we don't need anymore Makefile tricks
with sleep() to allow profile builds.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If we are pondering we will stop the search only when
GUI sends "ponderhit" or "stop" commands or when we reach
maximum depth. In all the other cases we continue to search
so there is no need to verify for available time.
Also better clarify why wait_for_stop_or_ponderhit() before
to exit in some cases.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In case there is only 1 legal move we will stop the
search at depth 10 anyway because the exclusion search
probe will fail low.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In the "easy move" paradigm we verify if the best move has
ever changed over a good number of iterations and if the
biggest part of the searched nodes are consumed on it.
This is a kind of hacky ad indirect heuristic to deduce
if one move is much better than others.
Rewrite the early stop condition to verify directly if one
move is much better than others performing an exclusion
search.
Idea to use exclusion search for time management if form Critter.
After 12245 games at 30"+0.1
Mod vs Orig 1776 - 1775 - 8694 ELO +0 (+-3.4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Tuned with CLOP against a pool of 3 engines. Result
verified with a direct match:
After 11720 games at 10"+0.1
Mod vs Orig 1922 - 1832 - 7966 ELO +2 (+-3.6)
So no change in self match but if CLOP is right it should
be a bit better against an engine pool.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The value returned by root search it is actually
our best value, so rename the variable to reflect this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of binding link time optimization to the choice of
popcount support, do the right thing and add -flto option
when gcc 4.5 or later is detected.
Although it should be supported also under mingw, it happens
that it doesn't, at least on my 4.6.1 due to some known bugs.
Thanks to Mike for helping me with this patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove the bonus for no *friendly* pieces in the pawn's path and
reduce a bit the bonus based on kings proximity.
This patch is part of to the ongoing effort to remove form evaluation
all the terms that do not add value.
After 16284 games:
Mod vs Orig 2728 - 2651 - 10911 ELO +1 (+- 3.1)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After a "stop" due to a ponder miss Xboard sends
immediately the new position to search, without
waiting for engine to effectively stop the search.
It is not clear if this is a GUI bug (as I suspect)
or allowed behaviour, but because it won't be fixed
anyway workaround this issue making listener thread
to switch to in-sync mode as soon as a "stop" command
is received.
Thanks to Mike Whiteley for reporting this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If GUI sends stop while we are waiting for
a command do not reply with a silly:
Unknown command: stop
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Oliver reports profile builds error with new gcc 4.6, he says:
"We need to add -lgov with profile-generate AND profile-use.
So it has to be added to the second stage of building too.
The problem occurred first with the introduction of gcc4.6 and
I think this is because the previous version did find the gcov
library automatically. gcc4.6 needs more precise options and
does less guesses. I have seen it in debian, Ubuntu and also with
mingw on Windows. And all use gcc4.6."
This patch fixes the issue.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After async I/O patches 'bench' changed behaviour and now waits for
input at the end of the test run. This is due to listener thread stay
blocked on std::getline() even after test run is finished, as soon as
we feed something the thread unblocks and then quickly exits.
This is not a big problem, but has the bad side effect of breaking
profile builds that hang forever at the end of the test run.
The tricky workaround is to create a pipe that connects to stockfish
input and then, when test run is finished, breaking the pipe: this
makes std::getline() immediately return.
So this patch adds a 'sleep 10' piped into 'stockfish bench' test run
command. After 10 seconds sleep ends, the pipe breaks and 'bench'
finishes as usual.
Thanks to Oliver Korff for reporting the issue, and to Mike Whiteley
for having co-authored this solution.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use do_uci_async_cmd() instead of process input commands
directly and clarify that what we are waiting for is
something that is able to raise StopRequest flag.
Also fix some stale comments in do_uci_async_cmd(). Here
we need to reset Limits.ponder only upon receiving "ponderhit".
In the case of "quit" or "stop" resetting Limits.ponder has no
effect because the search is going to be stopped anyway.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The timer will be fired asynchronously to handle
time management flags, while other threads are
searching.
This implementation uses a thread waiting on a
timed condition variable instead of real timers.
This approach allow to reduce platform dependant
code to a minimum and also is the most portable given
that timers libraries are very different among platforms
and also the best ones are not compatible with olds
Windows.
Also retire the now unused polling code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With our new listener thread we don't need anymore
this ugly and platform dependent code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of polling for input use a dedicated listener
thread to read commands from the GUI independently
from other threads.
To do this properly we have to delegate to the listener
all the reading from the GUI: while searching but also
while waiting for a command, like in std::getline().
So we have two possible behaviours: in-sync mode, in which
the thread mimics std::getline() and the caller blocks until
something is read from GUI, and async mode where the listener
continuously reads and processes GUI commands while other
threads are searching.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The std::min() template function requires both arguments
to be of the same type.
But here we have the integer MAX_THREADS compared to a long:
long sysconf(int name);
So cast to integer and fix the compile.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add comments and rename stuff to better clarify what the
magic bitboard initialization code does.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We test if the piece moved in 'to' attacks the square 's' with:
bit_is_set(attacks_from(piece, to), s))
But we should instead consider the new occupancy, changed after
the piece is moved, and so test with:
bit_is_set(attacks_from(piece, to, occ), s))
Otherwise we can miss some cases, for instance a queen in b1 that
moves in c1 is not detected to attack a1 while instead she does.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not possible to unify due to the fact that the
sequence steps are reversed. What we can do is to try
to sync comments and code as much as we can to easy
reading and documentation.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is called only in do_move() that now has been fully
expanded. This is the most time consuming function of
all the engine.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Is called in just one place. And rename set_castle() in the
now free to use and more appropiate set_castle_right().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
They don't add any value given that the corresponding
table has global visibility anyhow.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is a prerequisite to allow changing piece values
at runtime, needed for tuning.
Also use scores instead of separated midgame and endgame values.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
First tuning with CLOP against a pool of 3 engines. Result
verified with a direct match:
After 8736 games at 10"+0.1
Mod vs Orig 1470 - 1496 - 5770 ELO -1 (+-4.3)
So no change in self match but if CLOP is right it should
be a bit better against an engine pool.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In particular add that we can have an harmless false positive
in case StopRequest or thread.cutoff_occurred() are set.
Reported by David Lee.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
As a side effect now log file is open and closed every
time it is used instead of remaining open for the whole
thinking time.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Mainly used to log stuff to a file while playing, when
stdout is used for the comunication with the GUI.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Build broken by commit 3141490374
where we renamed move_is_ok() in is_ok() and this clashes
with the same named method in Position that overrides the
move's one causing compile errors.
The fix is to rename the method in Position.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Justin reports that it breaks the compilation on Fedore 15 and as Tom says:
-static is only needed to work around the gcc on ubuntu 11.10 beta bug.
If -static introduces issues on its own then it is better to remove it.
It will not be needed in most environments.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Partially revert 1036cadcec because UCI protocol
in case of multipv explicitly requires:
for the best move/pv add "multipv 1" in the string when you send the pv.
in k-best mode always send all k variants in k strings together.
Thanks to Justin Blanchard for pointing this out.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is enabled when selecting x86-64-modern target, this gives
another nice speed up:
On a Core i5-2500 (3300 Mhz, Sandy Bridge):
64 bit download version: 1597151 n/s
-flto : 1659664 n/s
-flto -msse3: 1732344 n/s
Patch suggested by Tom Vijlbrief.
Also unify flto, popcount and msse3 optimization under "modern"
target, note that this can break the "modern" build on old gcc that
do not support -flto option: in this case update gcc ;-) or default
to the standard build.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Just by adding the -flto option to CXXFLAGS link command
we can gain a few percent in speed.
On a Core i5-2500 (3300 Mhz, Sandy Bridge):
64 bit download version:
Without -flto: 1597151 n/s
With -flto : 1659664 n/s
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This was not clear to someone on talkchess and actually
is not trivial to understand.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems we have a very rare crash under Linux, once
every 10K games without this patch.
Is faster to wake up all the threads, especially on SMP,
where the threads can then exit in parallel while the main
thread is waiting for the next one to terminate.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Almost no increase but seems the logic thing to do.
After 16707 games 2771 - 2595 - 11341 ELO +3 (+- 3.2)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Restore old locking scheme changed with
commit 1e92df6b20.
This seems to prevent a very rare crash that occurs
once every 5-10K games.
With this patch we have no crashes after 33K games.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is more natural than using the family subtype and also
use two single maps instead of a std::pair.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When initializing endgames map we build a faked FEN string
in mat_key() to get the position hash's key.
This fen string lacks full move numbers, so when parsing the
fen in Position::from_fen() we leave startPosPly un-initialized.
Spotted by Valgrind (this is a kind of bug that is almost impossible
for humans to find).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Another stupid remark to quiet out:
remark #2259: non-pointer conversion from "int" to "UINT16={unsigned short}"
may lose significant bits
In this case icc always converts to an integer the result of a shift operation
if the bit size of the operand is smaller, hence the warning when assignin
back to n.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we have just two mutually exclusive thread's states
we can repleace them by a simple boolean.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that Rml ordering is based on normal MovePicker logic,
apart for the ttMove that is given, we can avoid to score
all the root moves at depth 1. We only need it for easy move
detection logic, but in this case we just need to score the
first two best moves and not all the Rml set.
No regression after 6400 games
Mod vs Orig 1052 1012 4336 ELO +2 (+- 4.9)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Allocation of pawn and material hash tables should
be strictly bounded to the change of the number of
activeThreads, so move the code inside set_size().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use proper way to detect for thread terimnation instead of
our homegrown flag.
It adds more code than it removes and adds also platform specific
code, neverthless I think is the way to go becuase Thread::TERMINATED
flag is intrinsecly racy given that when we raise it thread is still
_not_ terminated nor it can be, and also we don't want to reinvent
the (broken) wheel of thread termination detection when there is
already available the proper one.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Was used to prevent issues when creating multiple threads
on Windows, but now it seems we can remove it safely.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we can split at root it happens that SendSearchedNodes
works only once at the end of the iteration, but this is useless
becuase speed info is sent anyhow toghter with the pv line.
So retire for now, waiting to find something SMP compatible.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Change qsearch() to reflect alpha update logic
of search().
To be consistent changed also moves loop condition and
futility pruning condition.
No regression after 5072 games at TC 10"+0.1
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Start a slave as soon as is allocated.
No functional change with faked split.
Regression tested the full split() series and after
2000 games no regression and no crash.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When calling split or we immediately return because unable to
find available slaves, or we start searching on _all_ the moves
of the node or until a cut-off occurs, so that when returning
from split we immediately leave the moves loop.
Because of this we don't need to change alpha inside split() and
we can use a signature similar to search() so to better clarify
that split() is actually a search on the remaining node's moves.
No functional change with faked split.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Allocate and initialize a new split point
out of lock becuase modified data is local to
master thread only.
Also better document why we need a lock at
the end of split().
No functional change with faked split.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When StopRequest is raised we cannot immediately exit the
move loop but first we need to update bestValue so to avoid
assert:
assert(bestValue > -VALUE_INFINITE && bestValue < VALUE_INFINITE);
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Another great success by Joona !
After 5876 games at 10"+0.1
Mod vs Orig: 1073 - 849 - 3954 ELO +13 (+- 5.2)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we don't special case the root moves anymore
we don't need to pass NodeType anymore as template parameter,
a simple bool to detect a SpNode will be enough.
Spotted by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After issuing "go"-command, at the end of the search
SF shows: "Unknown command: ...".
Spotted by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of our home grown function to perform a case
insensitive compare on option names as required by UCI
protocol.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems FritzGUI already remembers the old lines, so
we just need to update PV info only for the new lines.
Also introduced prevScore field in RootMove to avoid
a bulk copy of Rml.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This should fix following issue:
Suppose the search with MultiPVIteration == 0 returns an exact score
move = Nxf4, score = 100
Now search with MultiPVIteration == 1 and get two scores
move = Qg8, score = 150
move = Ra1, score = 180
If we now reorder all the moves in one step we end up with
pv[0] = Ra1, pv[1] = Qg8
Instead reordering as the current patch we end up in:
pv[0] = Ra1, pv[1] = Nxf4
preserving the first searched move.
No functional change in single PV.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch temporarily breaks MultiPV and searchmove
features, but they will be re-implemented in future
patches.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We missed to set chess960 flag into the std::stringstream used to
setup the PV line.
Bug introduced with commit f803f33e63
of 30/12/2010 when we started to print PV line into a std::stringstream
instead of directly into cout, where the chess960 flag is correctly set.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
As a side effect now root position can be directly
allocated on the stack and doesn't need to be defined
static anymore.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is misnamed because it is not a parser, perhaps a
tokenizer, anyhow better call it for what it is, an
input string stream.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Return the correct number of played plies at the end
of the setup moves. Currently it always returns 0 when
starting from start position, like in real games.
We fix this adding st->pliesFromNull that starts from 0
and is incremented after each setup move and is never
reset in the setup phase but only once search is started.
It is an hack because startpos_ply_counter() will return
different values during the search and is correct only
at the beginning of the search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This fixes a regression on real games due to the fact that
we have some mismatches:
history[st->gamePly - i] != stp->key
when st->gamePly - i == 0,this is due to a nasty bug I have
introduced when using std::vector<> as StateInfo backup. The
point is that StateInfo keeps inside a pointer to the previous
StateInfo in a kind of linked list. But when std::vector<> is
resized reallocates a larger chunk of memory and moves the
data there so these pointers became stale.
This patch fixes the issue.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We just need startup value to calculate available
thinking time. So remove from state.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Avoid the ugly and anyhow incorrect hard limit on the
maximum number of moves and allow to handle an arbitrary
number of moves to search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This allow to retire do_setup_move() and also to simplify
draw detection logic becuase now we always have:
Min(st->rule50, st->gamePly) = st->rule50
This was already true when starting from starting position,
but now is true even when starting from a FEN string because
now we take in account fullmove number in counting gamePly so
that it is always.
st->rule50 <= st->gamePly
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fixes the reported KNNK ending problem:
http://talkchess.com/forum/viewtopic.php?t=39347
Joona says:
Now I finally had a time to take a look at on this issue.
I've reproduced the problem starting from this position:
1B6/1B2k3/P7/1P3p2/1K6/8/4b3/4b3 w - - 6 85
I made Stockfish play as white and Fruit as black.
I repeated test ten times and once SF was not able to deliver mate.
But I observed several times that SF had reported on last something like mate in 10.
However next time it played move with score mate in 15.
Easiest way to solve the problem is attached as a patch. I tested it several times and SF always
ended up playing the optimal move. Of course the downside is that now delivering mate
takes a bit longer, but IMO it's better to lose once in a while by time in sudden death
game than not being able to deliver simple mate with long time controls.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We have a bug (possibly because of returning draw from
root move list), it is possible to see when looking at
games with a GUI, we can see rarely but consistently the
score return as #0 for many depths until it comes back to
normal values.
Revert patches until it is not fixed.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Running following command:
position startpos moves e1e8
Makes SF to assert in debug mode in do_move() but to accept
bad input and continue in release mode where probably it is
going to crash little later.
So validate input before to feed do_move().
Suggestion by Yakovlev Vadim.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It's only necessary to do the checking at the end of every non-const
member (including the constructors and from_fen()) of class Position.
Once the post-condition of every modifier guarantees the class invariant,
we don't need to verify sanity of the position as preconditions for outside
callers such as movegen, search etc. For non-class types such as Move and
Square we still need to assert of course.
Suggested by Rein Halbersma.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After previous patch is no more needed to pass
the color, becuase it is always the side to move.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It worked by accident because we always called both directions,
but definition was wrong.
Functional change due to different generation order, but
perft numbers are the correct ones.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Functional change due to the fact that now pick_best() is
stable, but should be no change in strenght.
Example code and ideas by Rein Halbersma.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And pass correct currentPly to TimeManager::init().
This restores old behaviour, in particular now black has
a different timing than white becuase is no more:
currentPly = 2 * fullMoveNumber;
but becomes
2 * (fullMoves - 1) + int(sideToMove == BLACK)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
No functional change also in faked split mode
To be sure verified in real games with 4 threads TC 2"+0.1
After 11125 games 2497 - 2469 - 6159 ELO +0 (+- 4.4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fix also an incredible 3% speed regression by an almost
never called function. I guess this is due to mingw very low
quality standard libraries implementation.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And also tolerate a 0 value for full move number.
Revert BUG_41 patch, now we set initial King file only
if a castling is possible, so we don't need the fix
anymore in case of correct FEN.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If we cannot castle castleRightsMask[] could be not valid,
for instance when king initial file is FILE_A as queen rook.
In this case castleRightsMask[] at initialQRFile is different
from the expected (ALL_CASTLES ^ WHITE_OOO).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
See:
http://talkchess.com/forum/viewtopic.php?t=39327
After 8130 games on QUAD at 20"+0.1
1342 - 1359 - 5429 ELO +0 (+- 4.4)
Tried also version with just king settings changed:
After 5932 games 962 - 1052 - 3918 ELO -5 (+- 5.2)
And with just mobility settings changed:
After 4114 games 618 - 619 - 2877 ELO +0 (+- 5.9)
Frank has tested only 1200 games, but at longer TC and
against many engines, so because PHQ results are not worst
than other combination and not worst than original let's
commit his version.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We miss to account as a capture a promotion capture !
Incredible bug in this critical function that is here
since a long time (b50921fd5c of 21/10/2009 !!)
This patch fixes the bug and readds the faster
move_is_capture_or_promotion() that slightly increases
overall speed.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is now much more modular than before and also we
always send the seldepth when we send the depth, this
avoids to make seldepth disappearing from GUI at the
start of a new iteration.
Print also fails high/low pv lines at high enough
search depths.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Easy, almost trivial simplification, I don't understand
how I missed this before !!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This trivial change gives an impressive 2,5% speedup !!!!
Also retire one unused move_gives_check() overload.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Restore original behaviour, before root unification and
remove a now useless ugly hack for alpha in multi-pv case.
No functional change
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is not correct to use an iterator stick on a vector that
is sorted becuase iterator is invalidated in general case.
It happens to work by accident because iterators are implemented
as pointers and so they behave in the same (correct) way then
using array indices, but the latters are the correct thing to use.
Also better document the code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is a fossil from the root_search() era, no more
needed today.
Spotted by Onno
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids search explosion in qsearch for some
patological cases like:
r1n1n1b1/1P1P1P1P/1N1N1N2/2RnQrRq/2pKp3/3BNQbQ/k7/4Bq2 w - - 0 1
After 9078 games 20"+0.1 QUAD:
Mod vs Orig 1413 - 1319 - 6346 ELO +3 (+- 4)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The trick is to classify more position at first cycle,
so to reduce following work. Speed up is of about 50% !
Also some cleanup while there.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Seems there is no regression so prefer to prune less.
After 8278 games
Mod vs Orig 1246 - 1265 - 5767 +0 ELO (+- 4.2)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We should start from i = 0, it works by accident because
static storage BSFTable[] is init to zero by default.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In case we have more than one promotion move, prefer
the one that captures the biggest piece.
Almost no functional change, anyhow I don't expect any
ELO change, it is just the correct thing to do.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems much worst in number of nodes seacrhed to reach
the depth and anyhow does not give any advantage to the
Onno's oroginal one.
So revert by now and perhaps readd when we find something
clearly better.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Allow to choose among 4096 instances of pseudo-random
sequences instead of the previous 64 so the probability
to find a better sequence increases and actually we have
a much better 64 bit case and we can also use the 64 bit
version of pick_magic() also for 32 bits and althoug sub
optimal, because now we can have more choices results are
even slightly better also for 32 bit.
Use also a faster submask().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 12613 games at 20"+0.1 on QUAD
Mod vs Orig 1870 - 1863 - 8880 ELO +0 (+- 3.3)
So no performance change but it is a code semplification
and also is more easy to understand.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Good result for 32 bit case where computation is very fast,
still not satisfying on 64 bit case where the magics seem
a bit harder to get.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Due to a -2% speed penalty. This patch takes the best
of the previous series without the regression due to
introduction of Magic struct.
Speedup against previous revision is of almost 3% !!!!
No functional change both in 32 and 64 bits.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can calculate them counting the masks bits.
Also small tweak to sliding_attacks()
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Here the idea is to test probcut not only after bad
captures, but after any bad move, i.e. any move that
leaves the opponent with a good capture.
Ported by a patch from Onno, the difference from
original version is that we have moved probcut after
null search.
After 7917 games 4 threads 20"+0.1
Mod vs Orig: 1261 - 1095 - 5561 ELO +7 (+- 4.2) LOS 96%
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We really want PV moves and also Split Point moves to be
legal to avoid messing the move counter and corresonding
PV move detection or shared Split Point's counter variable.
This fixes a real bug where a position with only one move
allowed returns bestValue == -VALUE_INFINITE if the move
turns out to be illegal.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We disjoint pseudo legal detection from full legal detection.
It will be used by future patches.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
When we are in check and we move the king then testing with
pl_move_is_legal(m, pinned) is not enough becuase we cannot
rely on attackers_to() but we have to explicitly remove the
king form the occupied bitboard to catch as invalid moves like
b1a1 when opposite queen is on c1.
Our move generator already produces correct evasions so we
just need to add the extra verification to move_is_legal().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Found another missed control in move_is_legal() thanks to
brute force testing.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Add a new check in move_is_legal()
Avoid useless casting in move.h while there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is interesting the fact that we need to test for
move_is_castle(m) anyway and not relying on testing
if destination square is attacked. Indeed the latter
condition fails if the castling rook is attacked,
castling is coded as "king captures the rook" but it
is legal in that case.
Verified no functional change with beginning of the series.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove the check for castling moves because it is
already implicit in the check for king moves and castling
is so rare that doing the check is just a slow down.
Thanks to Marek Kwiatkowski.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The structure of move is changed so should also the two
functions. It happens that it works by accident !
Bug spotted by Marek Kwiatkowski
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We already test this condition in see_sign() and
so it is almost always a redundant verification.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
- git log HEAD | grep "\b[Bb]ench[ :]\+[0-9]\{7\}" | head -n 1 | sed "s/[^0-9]*\([0-9]*\).*/\1/g" > git_sig
- export benchref=$(cat git_sig)
- echo "Reference bench:" $benchref
#
# Compiler version string
- $COMPILER -v
#
# Verify bench number against various builds
- export CXXFLAGS="-Werror -D_GLIBCXX_DEBUG"
- make clean && make -j2 ARCH=x86-64 optimize=no debug=yes build && ../tests/signature.sh $benchref
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32 optimize=no debug=yes build && ../tests/signature.sh $benchref; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-32 build && ../tests/signature.sh $benchref; fi
#
# Check perft and reproducible search
- export CXXFLAGS="-Werror"
- make clean && make -j2 ARCH=x86-64 build
- ../tests/perft.sh
- ../tests/reprosearch.sh
#
# Valgrind
#
- export CXXFLAGS="-O1 -fno-inline"
- if [ -x "$(command -v valgrind )" ]; then make clean && make -j2 ARCH=x86-64 debug=yes optimize=no build > /dev/null && ../tests/instrumented.sh --valgrind; fi
- if [ -x "$(command -v valgrind )" ]; then ../tests/instrumented.sh --valgrind-thread; fi
#
# Sanitizer
#
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-64 sanitize=undefined optimize=no debug=yes build > /dev/null && ../tests/instrumented.sh --sanitizer-undefined; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then make clean && make -j2 ARCH=x86-64 sanitize=thread optimize=no debug=yes build > /dev/null && ../tests/instrumented.sh --sanitizer-thread; fi
are already enabled and no configuration is needed.
### Support on Windows
The use of large pages requires "Lock Pages in Memory" privilege. See
[Enable the Lock Pages in Memory Option (Windows)](https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows)
on how to enable this privilege. Logout/login may be needed
afterwards. Due to memory fragmentation, it may not always be
possible to allocate large pages even when enabled. A reboot
might alleviate this problem. To determine whether large pages
are in use, see the engine log.
## Compiling Stockfish yourself from the sources
Stockfish has support for 32 or 64-bit CPUs, certain hardware
instructions, big-endian machines such as Power PC, and other platforms.
On Unix-like systems, it should be easy to compile Stockfish
directly from the source code with the included Makefile in the folder
`src`. In general it is recommended to run `make help` to see a list of make
targets with corresponding descriptions.
```
cd src
make help
make build ARCH=x86-64-modern
```
When not using the Makefile to compile (for instance with Microsoft MSVC) you
need to manually set/unset some switches in the compiler command line; see
file *types.h* for a quick reference.
When reporting an issue or a bug, please tell us which version and
compiler you used to create your executable. These informations can
be found by typing the following commands in a console:
```
./stockfish
compiler
```
## Understanding the code base and participating in the project
Stockfish's improvement over the last couple of years has been a great
community effort. There are a few ways to help contribute to its growth.
### Donating hardware
Improving Stockfish requires a massive amount of testing. You can donate
your hardware resources by installing the [Fishtest Worker](https://github.com/glinscott/fishtest/wiki/Running-the-worker:-overview)
and view the current tests on [Fishtest](https://tests.stockfishchess.org/tests).
### Improving the code
If you want to help improve the code, there are several valuable resources:
* [In this wiki,](https://www.chessprogramming.org) many techniques used in
Stockfish are explained with a lot of background information.
* [The section on Stockfish](https://www.chessprogramming.org/Stockfish)
describes many features and techniques used by Stockfish. However, it is
generic rather than being focused on Stockfish's precise implementation.
Nevertheless, a helpful resource.
* The latest source can always be found on [GitHub](https://github.com/official-stockfish/Stockfish).
Discussions about Stockfish take place in the [FishCooking](https://groups.google.com/forum/#!forum/fishcooking)
group and engine testing is done on [Fishtest](https://tests.stockfishchess.org/tests).
If you want to help improve Stockfish, please read this [guideline](https://github.com/glinscott/fishtest/wiki/Creating-my-first-test)
first, where the basics of Stockfish development are explained.
## Terms of use
Stockfish is free, and distributed under the **GNU General Public License version 3**
(GPL v3). Essentially, this means that you are free to do almost exactly
what you want with the program, including distributing it among your
friends, making it available for download from your web site, selling
it (either by itself or as part of some bigger software package), or
using it as the starting point for a software project of your own.
The only real limitation is that whenever you distribute Stockfish in
some way, you must always include the full source code, or a pointer
to where the source code can be found. If you make any changes to the
source code, these changes must also be made available under the GPL.
For full details, read the copy of the GPL v3 found in the file named
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.