Playing against a board game L'Oaf AI

I wrote about making a L'Oaf AI and learning strategies from it a week or so ago. In that article I mentioned letting readers play against the AI, and after getting permission from the game's designer, Bart de Jong, I put together a playable version. Before finishing it, I added a risky-AI mode (one that tries to win instead of drawing when things get hard), fixed a minor mistake in my previous code, and re-ran the analysis. Scroll down to the bottom of the post if you want to try playing against the bot right away.

Let me know in a comment if you end up winning. I don't think I ever won, but I did find a strategy to reliably draw against the normal bot. Winning some of the time should be possible, as even the random playstyle could eke out a few wins here and there.

Learning how to play a game from AI
Teaching an AI to play L’oaf and seeing what a human can learn from it.

Re-running the analysis and adding a risky bot

After publishing the original post, I noticed an edge case where my implementation drifted from the rules. If a round’s effect targeted the highest or lowest reputation player and both players were tied, my code applied the effect to only one player. The rules (and the intent of the condition) are that tied players should both be targeted.
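As a minimal sketch of the fix (the function name and the reputation-list representation are my assumptions, not the post's actual code): instead of picking a single extreme player, collect every player index whose reputation matches the extreme value, so ties target everyone.

```python
def effect_targets(reputations, target="lowest"):
    """Return the indices of all players the round effect applies to.

    Per the rules, if players are tied for the highest or lowest
    reputation, the effect targets all of them, not just one.
    """
    extreme = min(reputations) if target == "lowest" else max(reputations)
    return [i for i, rep in enumerate(reputations) if rep == extreme]

# Both players tied at reputation 3: both are targeted.
effect_targets([3, 3], target="highest")  # → [0, 1]
```

The buggy version was the equivalent of targeting only `reputations.index(extreme)`, which silently drops the second tied player.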

This does not flip most games on its own, but it can nudge optimal play, and it definitely matters when you are training and comparing strategies. I fixed the targeting rule, retrained the models, and re-ran the same analysis scripts. One notable change: 8 is now the least favored card, instead of 9.

Updated ELO or performance summary after the rules fix

A risky bot mode

While building the playable version, I ran into a practical issue: the strongest bot is good in a way that feels a bit boring. When the position gets close, it often goes into mutual failure where both players end up fired, which produces a draw. This is of course a good strategy if your objective is to avoid losing, but it is not the most fun opponent.

So I trained a second version of the 128×128 model with a deliberately skewed reward: win +2, draw −1, loss −1. This makes draws feel as bad as losses, which pushes the policy toward trying to win rather than forcing a draw. You can switch between the normal and risky model in the game UI. Below are some plots for this new model, and it performs as expected: a bit worse than the all-rounder models, but still good. It learns at roughly the same pace and, I think, will still generally beat humans. What's interesting is that it plays differently from the other models, using high cards more often.
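The reward skew amounts to swapping one terminal-reward table for another. A sketch, where the risky values (+2/−1/−1) are the ones stated above and the baseline values are my assumption of a conventional win/draw/loss scheme:

```python
# Baseline rewards are an assumed conventional scheme (+1 / 0 / -1);
# the risky values are the ones described in the post.
NORMAL_REWARDS = {"win": 1.0, "draw": 0.0, "loss": -1.0}
RISKY_REWARDS = {"win": 2.0, "draw": -1.0, "loss": -1.0}

def terminal_reward(outcome, risky=False):
    """Reward handed to the agent at game end, from its own perspective."""
    rewards = RISKY_REWARDS if risky else NORMAL_REWARDS
    return rewards[outcome]
```

Under the risky table a draw is exactly as bad as a loss, so the policy has no incentive to steer close games into mutual failure; any chance of winning dominates a guaranteed draw.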

Win/draw/loss comparison: normal vs risky
Agreement plot including risky model
Self-play progress plot including risky model
ELO-style ranking including risky model

If the embed is too small, open it in a new tab: loaf.measuredthinking.nl