6 types of robots vs Zenith Daylong: GIB, Argine and Ben vs the Zenith players
#1
Posted 2023-December-12, 04:24
These are the robot scores over 1,920 boards (all of the deal pools in Zenith)
Click for the leaderboard.
Zenith Dec 10th
Gib Advanced = 53.91%
Gib Basic = 52.58%
Argine Advanced = 50.96%
Thinking Ben = 50.32%
Argine Simplified = 48.38%
Instant Ben = 45.64%
Argine is set to play 2/1 (it played 2/1 in the previous simulations too).
The Ben model used is trained on GIB advanced for the bidding, and on ACBL human hands for the play of hand.
Instant Ben just plays based on what the neural network says.
Thinking Ben runs some simulations using info from the neural network before making a move.
#2
Posted 2023-December-12, 06:33
#3
Posted 2023-December-12, 10:08
#4
Posted 2023-December-12, 11:16
mycroft, on 2023-December-12, 10:08, said:
May well be so if most or all club players are playing systems similar to Gib 2/1.
My experience is that the robot pair does much worse (low 40s) when they don't understand the opponents' bidding.
#5
Posted 2023-December-12, 21:07
#7
Posted 2023-December-13, 02:25
pilowsky, on 2023-December-12, 21:07, said:
We are doing another run with the same bot playing both North and South. That setting is better for comparing bots to each other: how does a pair of GIBs compare to a pair of Argines, to a pair of Bens, and so on. Making a bot play with a different bot as partner puts it at a disadvantage (especially true for Argine).
#9
Posted 2023-December-14, 03:21
Both North and South are played by the same robot, so each robot plays with itself as partner.
Argine Advanced = 53.72%
Gib Advanced = 53.17%
Thinking Ben = 50.77%
Argine Basic = 50.62%
Gib Basic = 49.83%
Instant Ben = 45.47%
This new setup has helped Argine and has hurt Gib Basic.
My interpretation is that Argine has fewer misunderstandings now that she plays with herself as partner (Argine does play 2/1, but a slightly different flavor of 2/1 than GIB).
GIB Basic's performance dropped because it now has a weaker partner (itself).
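To make the comparison between the two runs concrete, here is a small Python sketch that subtracts the first-run scores from the second-run scores. The numbers are copied from the two lists in this thread; the names `run1`, `run2` and `deltas` are just illustrative. Only bots that appear under the same name in both lists are compared, so "Argine Simplified" (first run) and "Argine Basic" (second run) are left out rather than assumed to be the same bot.

```python
# Scores (%) from the first run (Zenith Dec 10th, mixed partnerships).
run1 = {
    "Gib Advanced": 53.91,
    "Gib Basic": 52.58,
    "Argine Advanced": 50.96,
    "Thinking Ben": 50.32,
    "Argine Simplified": 48.38,
    "Instant Ben": 45.64,
}

# Scores (%) from the second run (same bot sitting North and South).
run2 = {
    "Argine Advanced": 53.72,
    "Gib Advanced": 53.17,
    "Thinking Ben": 50.77,
    "Argine Basic": 50.62,
    "Gib Basic": 49.83,
    "Instant Ben": 45.47,
}


def deltas(before, after):
    """Change in percentage points for bots listed under the same name in both runs."""
    return {
        bot: round(after[bot] - before[bot], 2)
        for bot in before
        if bot in after
    }


if __name__ == "__main__":
    # Print the largest gains first.
    for bot, change in sorted(deltas(run1, run2).items(), key=lambda kv: -kv[1]):
        print(f"{bot:16s} {change:+.2f}")
```

On these figures the largest moves are Argine Advanced gaining 2.76 points and Gib Basic losing 2.75, while the other bots move by less than a point.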
#10
Posted 2023-December-23, 04:19
diana_eva, on 2023-December-12, 04:24, said:
These are the robot scores over 1,920 boards (all of the deal pools in Zenith)
Click for the leaderboard.
Here is another interesting thing you might do with the data from this experiment: for each human h, calculate Advanced GIB's percentage P_h on the set of 16 boards played by h. What percentage of humans were beaten by the robot when playing the same set of boards? How did the robot's performance vary over sets, e.g., what were the minimum and maximum values of P_h?
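A minimal sketch of that calculation in Python, in case it helps. The data layout is an assumption, not something taken from the Zenith results: `robot_pct` maps each board id to the robot's percentage on that board, `human_boards` maps each human to the boards they played, and `human_pct` maps each human to their own average over those boards. The function name `robot_vs_humans` is also made up for illustration.

```python
from statistics import mean


def robot_vs_humans(robot_pct, human_boards, human_pct):
    """Compare the robot with each human on that human's own set of boards.

    robot_pct:    board id -> robot's percentage on that board (assumed layout)
    human_boards: human    -> list of board ids that human played
    human_pct:    human    -> that human's average percentage on those boards
    """
    # P_h: the robot's average percentage over the boards played by human h.
    p = {
        h: mean(robot_pct[b] for b in boards)
        for h, boards in human_boards.items()
    }
    # A human counts as "beaten" when the robot's P_h is strictly higher.
    beaten = [h for h in p if p[h] > human_pct[h]]
    return {
        "P_h": p,
        "share_beaten": 100.0 * len(beaten) / len(p),
        "min_P_h": min(p.values()),
        "max_P_h": max(p.values()),
    }


if __name__ == "__main__":
    # Toy data with 2 boards per human instead of 16.
    robot_pct = {1: 60.0, 2: 40.0, 3: 70.0}
    human_boards = {"a": [1, 2], "b": [2, 3]}
    human_pct = {"a": 55.0, "b": 50.0}
    print(robot_vs_humans(robot_pct, human_boards, human_pct))
```

Ties are counted as "not beaten" here; whether a tie should count for the robot is a choice the sketch makes arbitrarily.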
#12
Posted 2023-December-24, 21:23
Not sure how you would do it, of course, and it may be very time-consuming.
Sorry for thinking idly - maybe even how often the contract was the deciding factor.
Also pondering annoyingly: I wonder how the likes of Qplus and other 2/1 engines would stack up
- and I may be a contrarian, but I like the idea of random pairings of different 2/1 bots - isn't that the point of bidding systems?
Maybe a simple interpretative AI interface on each - what does GiB mean lol