Monday, August 24, 2015

Correspondence Chess - Reform is Long Overdue


In a previous blog post, I mentioned GM Arno Nickel's open letter concerning the excessive draw problem in correspondence chess.

GM Nickel promised to distribute a summary of responses to all who responded to his survey. Today my copy arrived. I found the results most interesting and wish to thank GM Nickel for his efforts. The survey was about his proposal to alter the scoring of drawn games. However, his conclusions about engine detection must be considered opinion not supported by facts. GM Nickel's comments are shown below in quotation marks.

"A radical measure claimed over and over again is the engine ban. It has to be distinguished from the additional offer of "engine - free play" which is supported, for example, by the BdF and by many free chess servers, namely as a kind of competition based on voluntary arrangement."

Not being able to read German well, I cannot discover a relationship between BdF and the server, but I did locate a non-engine event. Here is a game between two strong players. The Top 3 analysis below was run with Stockfish 6 to a depth of 30 plies, multi-pv=3. T1 marks a move that matches the engine's first choice, T2 its second. "Book" or database opening moves are not counted in Top 3, nor are "forced" moves (f) such as responding to checks, or moves that recover material during exchanges.

[Event "non-engine"]
[Site ""]
[Date "?"]
[Round "?"]
[White "redacted"]
[Black "redacted"]
[Result "1/2-1/2"]
[ECO "C68"]
[WhiteElo "2300+"]
[BlackElo "2400+"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Bxc6 dxc6 5. O-O Bg4 6. h3 h5 7. d3 Qf6
8. Nbd2 Ne7 9. Re1 Ng6 10. d4 {end of book: Chow - Ivanov, Dallas 1996,
1/2 - 1/2 in 45 moves.}

10. ...          Nf4    T1
11. dxe5   T1    Qg6    T1
12. Nh4    T1    Bxd1   T1
13. Nxg6   (f)   Nxg6   T1
14. Rxd1   (f)   O-O-O  T1
15. e6     T1    fxe6   T1
16. Re1    T1    Ne5    T1
17. Nb3    T1    Bb4    T2
18. c3     T1    Bd6    T1
19. Bg5    T1    Rd7    T1
20. Red1   T1    Rf8    T2
21. a4     T1    Rdf7   T1
22. Nd4    T1    Rxf2   T1
23. Nxe6   (f)   R8f7   T1
24. Nd8    T1    Rf8    T1
25. Ne6    T1    R8f7   T1
26. Nd8    T1    Rf8    T1
27. Ne6    T1

White has T1 14/14 = 100% agreement with the engine's top recommendations.
Black has T1 15/17 = 88% agreement with the engine's top recommendations.
So much for engine-free, gentleman's agreement chess. As GM Nickel famously quips, "black sheeps can be everywhere."
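
For readers who want to check the arithmetic, the tallies for this game can be reproduced with a few lines of Python. This is only a sketch of the counting step: the tags are transcribed by hand from the move list above, and the function name t1_rate is my own label, not part of any published tool. Book moves are already left out of the list, and forced moves (f) are dropped before counting.

```python
# Annotations transcribed from the move list above.
# "T1"/"T2" = rank of the played move among the engine's top choices,
# "f" = forced move, excluded from the count.
WHITE = "T1 T1 f f T1 T1 T1 T1 T1 T1 T1 T1 f T1 T1 T1 T1".split()   # moves 11-27
BLACK = "T1 T1 T1 T1 T1 T1 T1 T2 T1 T1 T2 T1 T1 T1 T1 T1 T1".split()  # moves 10-26


def t1_rate(tags):
    """Return (T1 matches, counted moves, percent rounded to a whole number)."""
    counted = [t for t in tags if t != "f"]          # drop forced moves
    hits = sum(1 for t in counted if t == "T1")      # first-choice matches
    return hits, len(counted), round(100 * hits / len(counted))


if __name__ == "__main__":
    print("White:", t1_rate(WHITE))
    print("Black:", t1_rate(BLACK))
```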

Using the Top 3 methodology described in this article, I have never found a correspondence game played prior to 1980 (i.e., before chess micro devices, before personal computers) where T1 matching exceeded 71%. In most games, the percentage was far lower, even at the world championship level. That some of today's amateur cc players match engine output 80%, 90%, even 100% of the time in non-engine tournaments strains credulity.

Looks like BdF's "voluntary arrangement" doesn't work any better than the "honor system" here in the States, where engine abuse is rampant in no-engine events.

"A strict 'ban' is more than this, as it goes for active control methods and drastic sanctions in order to guarantee a fair competition and avoid as far as possible grey areas. As control mechanism some players think of copying methods from real time on - line play. This way a player who’s moves are, for example, more than 70% the same as the first choice by engines might be considered as convicted of cheating. In my view such an approach is no more than wishful thinking as it totally ignores the differences between real time on - line play (mostly blitz and rapid chess) and correspondence chess ... Already the use of databases in correspondence chess would be a problem to find the right move, where to start with controls. While in on - line chess time usage is an important criterion for proving cheating (players spend the same time on easy moves as highly difficult moves), this aspect does not apply to correspondence chess." 

Top 3 analysis is based on frequency counts and their percentage of total moves played. It does not use "secret methods," "time" or any of the methods of on-line server admins to police cheating in fast play (blitz) chess. Active control methods are a necessary part of engine detection, but "drastic sanctions" aren't the best solution. What business can survive by driving away its customers? Organizations should simply create a separate division for advanced chess, just like they do for Chess960. New players are free to choose which type of chess they wish to play, traditional (no engine) chess or advanced (engine-assisted) chess. Separate rules, events and rating lists would be maintained. Traditional players guilty of engine abuse are permanently moved into the advanced chess group. Players may not participate in both groups; advanced chess players may never move to the traditional (no engine) group.
     As explained above, Top 3 calculations do not include "book" or database opening moves - these moves are ignored. Players in traditional no-engine chess may follow any published (publicly available) chess game played before 1980, and any published OTB game played after 1980. Published correspondence chess games played after 1980 are "verboten"; copying such moves in traditional cc games is committing Top 3 suicide because they will drive up the matching percentage.

"Besides it is completely unclear how to value a correspondence of the played move with an “engine move”, when there are many equivalent computer moves like in stages of mainly positional play. This could happen just by chance, same as in case of forced moves a correspondence with “engine moves” would have to exist as the player would otherwise just lose (a piece or more)."

As explained above, "forced" moves are not included in Top 3 calculations. It's also clear that combinations to win material and mating attacks are something players can find without an engine's help, but such sequences are the exception. The majority of positions are "quiet" ones where several moves have nearly equal evaluations. When the evals are that close to one another, unaided human players can't distinguish differences of a few hundredths of a pawn. Yet some players' moves match an engine's top recommendations well in excess of 70%.
      In positions where two or more moves are tied at a 0.00 eval, and the engine can't break the tie no matter how many plies are examined, it's a theoretical draw. The engine-assisted player will unerringly steer the game to a draw, even if it takes 100 or more moves. Natural players often lose such positions because they play like humans, failing to find the precise drawing line.
     Strong players can estimate a half-pawn advantage in a position (engine eval for this is 0.50). Human players cannot discern a few hundredths of a pawn. When the engine determines the top 3 moves are +0.08, +0.04 and +0.01, the odds are strongly against a natural player making the T1 move, especially matching such minuscule T1 evals many times in a game. Alarmists are always claiming Top 3 will wrongly convict innocent people who play "sharp" (tactical) chess, or that players could match computer output "by chance." Such critics display ignorance of Top 3 methodology or a total lack of understanding of mathematical probability, or both. In contrast to natural chess play, one rarely sees combinations and mating attacks in advanced chess because engines don't ever fall into such predicaments. Contrary to "conventional wisdom", it is those non-tactical, "quiet" positions that will be most revealing of engine abuse.
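
To put a rough number on "by chance," here is a back-of-the-envelope sketch, not part of the Top 3 method itself. It rests on two assumptions of mine: that each counted move is an independent trial, and that an unaided player matches T1 with a fixed probability p. For p I use 0.71, the highest pre-1980 matching rate mentioned earlier in this post, which is deliberately generous to the player. The function name prob_at_least is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more T1 matches in n counted moves."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: 0.71 is a generous per-move T1 match rate for an unaided human.
P_HUMAN = 0.71

# Counts taken from the game shown earlier: White 14 of 14, Black 15 of 17.
white = prob_at_least(14, 14, P_HUMAN)
black = prob_at_least(15, 17, P_HUMAN)

if __name__ == "__main__":
    print(f"White, 14 or more of 14: {white:.4f}")
    print(f"Black, 15 or more of 17: {black:.4f}")
```

Independence between moves is a simplification, and one game is a small sample, so treat the output as an order-of-magnitude guide only. The same function can be applied to a player's counted moves pooled over many games, or rerun with a smaller p.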

"There are many other arguments, why the concept of an enforced engine ban is definitively condemned to failure, no matter how we think about usage of engines in correspondence chess. Support of engine - free play makes only sense as an additional offer and it requires that all players strive for nothing more than fun and honour, without precious prices and qualifications, as otherwise cheating would dramatically increase. As has been reported in forums even fun tournaments with engine - free play are by no means absolutely safe from cheating. Black sheeps can be everywhere."

Cannot agree engine detection is "condemned to failure," but separating traditional chess from advanced chess is a long overdue reform! The two types of chess can co-exist peacefully in an organization (just don't commingle them). There is no need to promote one at the expense of the other. Players are customers and organizations need to respect the wishes of all players.

GM Nickel concludes: "However, we are still waiting for long-term plans how to deal with the challenge posed by the increasing strength and domination of chess programs, and correspondence chess databases that are getting bigger and bigger. The answers given by ICCF appear to be evasive and defensive; to my mind they also fail to include the ICCF members in a discussion about the question how correspondence chess sees and presents itself. The answers to the survey show above all one thing: in view of the rising number of draws and the dominance of computer engines a huge number of ICCF members and a lot of chess fans feel the need for concrete action to keep correspondence chess attractive or, if there is no other choice, to reinvent its attraction."

Almost immediately following the ChessBase open letter, survey analysis and conclusions, ICCF rushed to post an interview with Ron Langeveld, ICCF's 26th World Champion.

In this interview, GM Langeveld offers some opinions supportive of ICCF's status quo regarding advanced chess rules. He states: "A high draw ratio in itself is not a problem. It’s not that game replay would suffer due to an increased draw ratio." Not sure typical players and fans agree with that. Playing through games of 75-100 moves (and more) that end in draw after draw after draw is pretty boring stuff. Gone are the brilliant combinations and slashing attacks of yesterday. GM Langeveld calls any player in disagreement "mediocre." The overwhelming majority of chess players are not super GMs, and it is their entry fees that sustain an organization. It is incongruous for GM Langeveld to belittle average players and then conclude his interview with a wish for lower entry fees.