An extremely basic illustration of a heuristic function used for decision making during gameplay. It was just something to do on my holiday, inspired by a Vsauce2 (YouTube) video on hexapawn that a friend showed me.
I am quite unfamiliar with Python, so any improvements are welcome. I was forced to use it due to internet difficulties and having to use another person's Windows 😭 laptop.
- Python 3.8.1
- Colorama 0.4.3
Board:
Board in game:
['x', 'x', 'x']
[' ', ' ', ' ']
['o', 'o', 'o']
Movements allowed:
- A pawn can move one square forward if that square is empty.
- A pawn can capture an opponent's pawn that is one square diagonally forward.
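The two movement rules above can be sketched as a small move generator. This is only an illustration, not the project's actual code: the function name `legal_moves`, the move format, and the assumption that `'o'` advances up the board while `'x'` advances down are all mine.

```python
# Hypothetical helper, not taken from the project's source.
def legal_moves(board, player):
    """Return a list of ((from_row, from_col), (to_row, to_col)) moves."""
    opponent = 'x' if player == 'o' else 'o'
    step = -1 if player == 'o' else 1  # assumption: 'o' moves up, 'x' moves down
    moves = []
    for r, row in enumerate(board):
        for c, cell in enumerate(row):
            if cell != player:
                continue
            nr = r + step
            if not 0 <= nr < 3:
                continue
            # Rule 1: one square forward, only if that square is empty.
            if board[nr][c] == ' ':
                moves.append(((r, c), (nr, c)))
            # Rule 2: capture an opponent's pawn diagonally forward.
            for nc in (c - 1, c + 1):
                if 0 <= nc < 3 and board[nr][nc] == opponent:
                    moves.append(((r, c), (nr, nc)))
    return moves


start = [['x', 'x', 'x'],
         [' ', ' ', ' '],
         ['o', 'o', 'o']]

print(legal_moves(start, 'o'))
# [((2, 0), (1, 0)), ((2, 1), (1, 1)), ((2, 2), (1, 2))]
```

From the starting position each player has exactly three moves, all of them straight forward, because no captures are available yet.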
Three ways to win:
- Get one of your pawns to the opponent's starting side.
- The opponent is left with no valid moves.
- Capture all of the opponent's pawns.
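As a rough sketch, the three win conditions could be checked like this. The names `has_move` and `winner` are hypothetical, and the sketch assumes `'o'` starts on the bottom row and `'x'` on the top row, as in the board shown earlier.

```python
# Hypothetical helpers, not taken from the project's source.
def has_move(board, player):
    """True if `player` has at least one legal move."""
    opponent = 'x' if player == 'o' else 'o'
    step = -1 if player == 'o' else 1  # assumption: 'o' moves up, 'x' moves down
    for r in range(3):
        for c in range(3):
            nr = r + step
            if board[r][c] != player or not 0 <= nr < 3:
                continue
            if board[nr][c] == ' ':
                return True
            if any(0 <= nc < 3 and board[nr][nc] == opponent
                   for nc in (c - 1, c + 1)):
                return True
    return False


def winner(board, to_move):
    """Return 'x', 'o', or None. `to_move` is the player about to play."""
    # 1. A pawn reached the opponent's starting side.
    if 'o' in board[0]:
        return 'o'
    if 'x' in board[2]:
        return 'x'
    # 2. All of one player's pawns were captured.
    cells = [cell for row in board for cell in row]
    if 'x' not in cells:
        return 'o'
    if 'o' not in cells:
        return 'x'
    # 3. The player about to move has no valid moves.
    if not has_move(board, to_move):
        return 'x' if to_move == 'o' else 'o'
    return None
```

Calling `winner(board, to_move)` after every move is enough to detect the end of a game in this sketch.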
Learning:
- When a player loses, all of the moves (states) they made are discouraged by reducing their points, thus disciplining the computer.
- Otherwise, good moves (states) are encouraged by increasing their points.
Points:
- New state: 50
- Increase amount: 5
- Decrease amount: 5, or the points are set to 0 when the result would be negative.
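A minimal sketch of this scoring scheme, assuming the points are kept in a dictionary keyed by state. The dictionary layout and the names `update_points`, `NEW_STATE_POINTS`, and `STEP` are assumptions for illustration, not the project's actual code.

```python
NEW_STATE_POINTS = 50  # points given to a state seen for the first time
STEP = 5               # amount added after a win, removed after a loss


# Hypothetical helper, not taken from the project's source.
def update_points(points, visited_states, won):
    """Adjust the points of every state a player went through in one game."""
    for state in visited_states:
        current = points.get(state, NEW_STATE_POINTS)
        if won:
            points[state] = current + STEP
        else:
            # Never drop below zero.
            points[state] = max(0, current - STEP)
    return points


points = {'a': 3}
update_points(points, ['a', 'b'], won=False)
print(points)  # {'a': 0, 'b': 45}
```

A state that reaches 0 points is effectively phased out, since it no longer has any weight when decisions are made.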
To pick a move, each player generates a random value and checks which decision range the value falls into, then returns the state that owns that range. This is based on Monte Carlo simulation.
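One way to read the "decision range" idea is as weighted random selection, sketched below. It assumes each candidate state owns a range whose size equals its points, which may not match the project's code exactly. The name `choose_state` and the uniform random fallback for the case where every candidate has 0 points are part of the sketch.

```python
import random


# Hypothetical helper, not taken from the project's source.
def choose_state(candidates, points, rng=random):
    """Pick one candidate state, weighted by its points."""
    weights = [points.get(state, 50) for state in candidates]  # new states get 50
    total = sum(weights)
    if total == 0:
        # Every option has been phased out: choose uniformly at random.
        return rng.choice(candidates)
    pick = rng.uniform(0, total)
    upper = 0
    chosen = None
    for state, weight in zip(candidates, weights):
        if weight == 0:
            continue  # a state with 0 points owns no range
        upper += weight
        chosen = state
        if pick < upper:
            return state
    return chosen  # only reached if pick landed exactly on the total


points = {'advance': 60, 'capture': 20, 'retreat': 0}
print(choose_state(['advance', 'capture', 'retreat'], points))
```

With these example points, `'advance'` is returned about three times as often as `'capture'`, and `'retreat'` is never returned.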
It was a fun experiment and I learned a bit about Python 3 syntax. As a side note, Colorama is a very cool project, do have a look! In terms of the game and the basic machine learning, I found it quite interesting that after a simulation of a hundred games the 'players' would still encounter 'unknown' states, and that during this simulation none of the decisions were phased out (reached a zero value). Another interesting find was that after a thousand games the player who generally lost would be unable to make a decision, because all of their decisions led to a bad outcome. To resolve this, I let the player in this situation choose a decision at random rather than make no decision.
