Add a stronger CPU opponent based on Minimax search.#8

Closed
maksverver wants to merge 0 commits into ElTh0r0:main from maksverver:main

@maksverver
Contributor

@maksverver maksverver commented Oct 8, 2024

This player is significantly stronger than AdvancedCPU.

@ElTh0r0
Owner

ElTh0r0 commented Oct 8, 2024

Awesome! Thank you so much for this! I have heard of Minimax but never had the time to look into it.
I'll test the new CPU as soon as I can (but I already gave up trying to understand everything in your script 😉).

Since you are the first to contribute a new CPU: was the interface the game provides "enough" to develop an opponent, or are you missing any data from the game, or would you have preferred the data in a different format?

@maksverver
Contributor Author

Thanks, I hope you enjoy it! The AI essentially looks two full moves ahead, so it can avoid a lot of simple traps. I haven't been able to beat it yet myself.
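
For readers new to the technique, here is a minimal sketch of depth-limited minimax in its negamax form. It is not the code from this pull request: the `game` object with `legalMoves`, `applyMove` and `evaluate` is a hypothetical interface, and a toy take-away game stands in for the real rules. If a "full move" means one turn by each player, looking two full moves ahead would correspond to a depth of 4.

```javascript
// Negamax form of minimax: a score is always from the viewpoint of the
// player to move, so the opponent's best score is negated.
// `game` is a hypothetical interface, not the StackAndConquer CPU API.
function negamax(game, state, depth) {
  const moves = game.legalMoves(state);
  if (depth === 0 || moves.length === 0) {
    return { score: game.evaluate(state), move: null };
  }
  let best = { score: -Infinity, move: null };
  for (const move of moves) {
    const child = negamax(game, game.applyMove(state, move), depth - 1);
    const score = -child.score;
    if (score > best.score) {
      best = { score, move };
    }
  }
  return best;
}

// Toy game for illustration only: players alternately take 1 or 2 stones
// from a pile; whoever takes the last stone wins.
const toyGame = {
  legalMoves: (pile) => [1, 2].filter((n) => n <= pile),
  applyMove: (pile, n) => pile - n,
  // With an empty pile, the player to move has already lost.
  evaluate: (pile) => (pile === 0 ? -1 : 0),
};

// From a pile of 4, taking 1 leaves the opponent in a lost position.
const result = negamax(toyGame, 4, 4);
console.log(result.move, result.score);
```

A real opponent would replace `evaluate` with a heuristic for positions where the search stops before the game ends, and would usually add alpha-beta pruning to cut down the number of positions visited.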

Overall I found the CPU interface easy to understand and I had little trouble integrating with it, though so far I ignored the trickier parts like irregularly shaped boards, or games with more than 2 players (which the UI doesn't seem to support either).

The only information that seems to be really missing is the player scores. This is relevant because there can be cases where a player can allow the opponent to take a tower, knowing that it will be able to take a tower in return the next turn. It would make sense to do this, for example, when you're playing best-of-three and you're already one point ahead, but it doesn't make sense when the opponent is just one point away from winning. Currently, there is no way to distinguish these situations. Ideally, the game should inform the AI of the current scores and the target score to win the match.
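
The two situations can be told apart with a small check once the scores are known. The sketch below is only an illustration: `myScore`, `opponentScore` and `targetScore` are hypothetical parameters that the CPU interface does not provide at the time of this discussion.

```javascript
// Classify a line of play that concedes one tower to the opponent now in
// exchange for taking one tower on the next turn.
// All three parameters are hypothetical; the CPU interface discussed in
// this thread does not expose them yet.
function classifyTowerTrade(myScore, opponentScore, targetScore) {
  // The opponent scores first, so the trade loses the match if that one
  // point is enough for them.
  if (opponentScore + 1 >= targetScore) {
    return "losing";
  }
  // Otherwise the tower taken in return may decide the match in our favour.
  if (myScore + 1 >= targetScore) {
    return "winning";
  }
  return "neutral";
}

// Best-of-three (target 2), one point ahead: the trade wins the match.
console.log(classifyTowerTrade(1, 0, 2));
// Opponent one point away from winning: the trade loses the match.
console.log(classifyTowerTrade(0, 1, 2));
```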

Related to this: I noticed that the game does not allow you to create a winning tower for the opponent, but I didn't see anything in the official game rules that would forbid this. While this is rarely useful in practice anyway, it might come up occasionally, for example if this is the only legal move that exists (in which case you're not allowed to pass), or when you can prove that giving away a point to your opponent allows you to score a point on the next turn. To be honest I'm not sure how common either scenario is.

Another piece of missing data is that the game does not explicitly tell the AI what its opponent's last move was, which is relevant because it's illegal to undo it. I ended up using the legalMoves array that is passed to callCPU() as the source of truth, which seems otherwise a little superfluous, since any AI that is advanced enough to look more than 1 move ahead necessarily has to be able to generate legal moves itself.
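
One way to treat `legalMoves` as the source of truth is to intersect the moves the AI generates itself with the list the game passes in, so that anything the game rejects (such as the forbidden undo move) is dropped at the root of the search. This is a sketch of that idea, not the code from the pull request; it only assumes that moves are plain arrays that can be compared by their JSON representation.

```javascript
// Keep only those self-generated moves that the game also lists as legal.
// Moves are compared via JSON, which works for plain arrays of numbers.
function filterByLegalMoves(generatedMoves, legalMoves) {
  const allowed = new Set(legalMoves.map((m) => JSON.stringify(m)));
  return generatedMoves.filter((m) => allowed.has(JSON.stringify(m)));
}

// The second generated move is missing from legalMoves, so it is dropped.
const kept = filterByLegalMoves([[1, 1, 2], [2, 1, 1]], [[1, 1, 2]]);
console.log(JSON.stringify(kept));
```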

Finally, I was a little confused about which information is available where. It seems like most of the “fixed” state of the game (like rectangular dimensions of the board) is available to initCPU() via the game object, but then details about the board shape (which can include holes apparently?) are only available from the jsonBoard argument passed to callCPU(). Similarly, I don't really understand why getNumOfPlayers() is a method on the game object, while nDirection is passed as an argument to callCPU(). Maybe there are historical reasons for this. If I had to design the interface from scratch I'd probably try to make this a bit more consistent. That being said, this was not really a hurdle to implementing the CPU interface. I just wanted to share my experience since you asked.

@ElTh0r0
Owner

ElTh0r0 commented Oct 12, 2024

Thanks for the feedback, I really appreciate it!

> so far I ignored the trickier parts like irregularly shaped boards, or games with more than 2 players (which the UI doesn't seem to support either).

Both are planned and the UI is prepared for it. If you are compiling StackAndConquer yourself, you can already experiment with it. You need to change this line for 3 (or more) players, and the game dialog will then offer the option accordingly (the rules dialog is prepared for this game mode as well and will show "Rules à trois" if the value below is increased).

m_nMaxPlayers(2),

To enable the board selection, you would need to change this line. If you want to try another shape, I uploaded a test board here: https://gist.github.com/ElTh0r0/25d23d7cbd0b73da40165a0b9789d0f0; you just need to copy it into the boards subfolder (or you can put it in your user folder; if you are on Linux: ~/.local/share/stackandconquer/boards/)

m_pUi->cbBoard->setEnabled(false);

> The only information that seems to be really missing is the player scores. This is relevant because there can be cases where a player can allow the opponent to take a tower, knowing that it will be able to take a tower in return the next turn. It would make sense to do this, for example, when you're playing best-of-three and you're already one point ahead, but it doesn't make sense when the opponent is just one point away from winning. Currently, there is no way to distinguish these situations. Ideally, the game should inform the AI of the current scores and the target score to win the match.

Good idea, I like it! I will add it to the interface!

> Related to this: I noticed that the game does not allow you to create a winning tower for the opponent, but I didn't see anything in the official game rules that would forbid this. While this is rarely useful in practice anyway, it might come up occasionally, for example if this is the only legal move that exists (in which case you're not allowed to pass), or when you can prove that giving away a point to your opponent allows you to score a point on the next turn. To be honest I'm not sure how common either scenario is.

Correct, for 2 players the rules don't forbid creating a winning tower for the opponent. The main reason I implemented it like that (as far as I can remember 😉) was that my first "dummy" CPU just picks a random legal move, and this was the quickest way to keep the dummy CPU from selecting such a "suicide move".
The rules for more than 2 players do forbid it. But since I hadn't thought about the best-of-three strategies you described above, I'll change the code accordingly and remove "suicide moves" only for >2 players.

> Another piece of missing data is that the game does not explicitly tell the AI what its opponent's last move was, which is relevant because it's illegal to undo it. I ended up using the legalMoves array that is passed to callCPU() as the source of truth, which seems otherwise a little superfluous, since any AI that is advanced enough to look more than 1 move ahead necessarily has to be able to generate legal moves itself.

Got it. Yes, we can discuss this, and I am open to adding this information to the interface. What do you think is more convenient: receiving the last move, or directly receiving the illegal "undo move"?

> Finally, I was a little confused about which information is available where. It seems like most of the “fixed” state of the game (like rectangular dimensions of the board) is available to initCPU() via the game object, but then details about the board shape (which can include holes apparently?) are only available from the jsonBoard argument passed to callCPU(). Similarly, I don't really understand why getNumOfPlayers() is a method on the game object, while nDirection is passed as an argument to callCPU(). Maybe there are historical reasons for this. If I had to design the interface from scratch I'd probably try to make this a bit more consistent. That being said, this was not really a hurdle to implementing the CPU interface. I just wanted to share my experience since you asked.

Good question 🙂 The interface just "somehow" grew without my really thinking about that! But yes, the main "fixed" data is provided by the game object. nDirection is dynamic (but only relevant for >2 players, as the direction changes if a player must pass).
Thinking about the interface, I might move legalMoves to the game object, so the AI developer can decide whether this information is needed or not - even if this breaks the fixed/dynamic convention above. I haven't decided yet.

@maksverver
Contributor Author

> Both are planned and the UI is prepared for it.

Very cool! I'll try to experiment with them a bit when I have some free time.

> What do you think is more convenient: Receiving the last move or directly receiving the illegal "undo move"?

I don't think it matters too much, but my preference would be the last move played in the same format used by legalMoves. (On the first turn, this could be undefined. If the previous player passed, this could be null or an empty array [] which is what I use internally to represent a pass.) It's trivial to reverse the move in the CPU code anyway. If you decide to use the reverse move, it might make sense to only provide it if it would be otherwise valid. Most moves aren't actually undoable; the move is undoable only if the number of stones left behind was equal to the number of stones moved.
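
To illustrate how little code the reversal needs, here is a sketch. The move format `[from, count, to]` is an assumption made for this example (the thread does not spell out the legalMoves format), placements of new stones are not modelled, and `stonesLeftBehind` is a hypothetical value the CPU would read from the board at the source field after the move.

```javascript
// Assumed move format (not confirmed by this thread): [from, count, to],
// where `from` and `to` are field indices and `count` is the number of
// stones moved. A pass is the empty array [], as suggested above.
function forbiddenUndoMove(lastMove, stonesLeftBehind) {
  // First turn (undefined) or a pass (null or []): nothing to undo.
  if (!lastMove || lastMove.length === 0) {
    return null;
  }
  const [from, count, to] = lastMove;
  // As described above, a move can only be undone if as many stones
  // stayed behind on the source field as were moved away.
  if (stonesLeftBehind !== count) {
    return null;
  }
  return [to, count, from];
}

// Two stones moved from field 3 to field 7, two left behind: undoable.
console.log(JSON.stringify(forbiddenUndoMove([3, 2, 7], 2)));
// Only one stone left behind: the reverse move would not be valid anyway.
console.log(forbiddenUndoMove([3, 2, 7], 1));
```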

> But yes, main "fixed" data is provided from the game object.

Got it. In that case, it might make sense to also provide the board layout in the game object so it can be used to precompute moves in initCPU(). Currently, there is only the width and height, but since the actual board can have fields missing, that's not enough. Essentially I would like to have the entire board definition from triangle.stackboard available.

(It's not critical; I can work around it by doing the initialization in callCPU(), but it seems conceptually more appropriate to do it in initCPU().)
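
As an illustration of the kind of precomputation meant here, the sketch below builds a neighbour table for a board with holes. The `layout` format (an array of strings with `.` for a field and `#` for a hole) is made up for this example and is not the .stackboard format, and the use of eight directions is likewise an assumption of the sketch.

```javascript
// Precompute, for every playable field, its playable neighbours.
// `layout` is a hypothetical format (array of strings, "." = field,
// "#" = hole), not the actual .stackboard file format.
function precomputeNeighbours(layout) {
  // Eight directions (orthogonal and diagonal) are assumed here.
  const directions = [
    [-1, -1], [-1, 0], [-1, 1],
    [0, -1], [0, 1],
    [1, -1], [1, 0], [1, 1],
  ];
  const isField = (r, c) =>
    r >= 0 && r < layout.length &&
    c >= 0 && c < layout[r].length &&
    layout[r][c] === ".";
  const neighbours = new Map();
  for (let r = 0; r < layout.length; r++) {
    for (let c = 0; c < layout[r].length; c++) {
      if (!isField(r, c)) continue;
      const list = [];
      for (const [dr, dc] of directions) {
        if (isField(r + dr, c + dc)) list.push([r + dr, c + dc]);
      }
      // Keyed by "row,col" so lookups during search are a single Map access.
      neighbours.set(`${r},${c}`, list);
    }
  }
  return neighbours;
}

// A 2x2 board with a hole in the top-right corner.
const table = precomputeNeighbours([".#", ".."]);
console.log(table.size, JSON.stringify(table.get("0,0")));
```

With only the width and height available, a table like this would wrongly include the hole, which is why the full layout is needed at initialization time.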

> nDirection is dynamic (but only relevant for >2 players as the direction changes if a player must pass).

Aha. I hadn't read up on the 3-player rules before. But are you sure that the direction can change? From the rules here:

> In the case where the current player has no move available, the right to move is given back to the previous player, which means effectively the turn order is changed once to anti-clockwise.

The way I interpret this is that if it is player 2's turn (for example), and he has only moves that would let player 3 win, he must pass and play transfers back to player 1, but the word “once” here suggests that this doesn't permanently change the direction of play. So from player 1's perspective, the next player is always player 2.
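
Under this interpretation the turn order needs no stored direction at all; the next player depends only on the current player and on whether they passed. A small sketch, with players numbered 1 to `numPlayers` (the function name and parameters are made up for illustration):

```javascript
// Next player under the "once" interpretation: a pass hands the move back
// to the previous player for that single turn, and play then continues in
// the normal order. Players are numbered 1..numPlayers.
function nextPlayer(current, numPlayers, passed) {
  const step = passed ? -1 : 1;
  // Shift to 0-based and add numPlayers before the modulo so the
  // intermediate value never goes negative.
  return ((current - 1 + step + numPlayers) % numPlayers) + 1;
}

// Player 2 must pass: the move goes back to player 1 ...
console.log(nextPlayer(2, 3, true));
// ... and after player 1 moves, it is player 2's turn again.
console.log(nextPlayer(1, 3, false));
```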

(This is not all that closely related to the topic at hand. Maybe we can discuss the three-player rules in another issue.)

@maksverver
Contributor Author

Something else I just noticed: the number of stones per player is not exposed in the CPU API currently, which is a problem because it is configurable in the board file format.

I've created separate issues for the things we discussed, because it's getting a little difficult to keep track of them in this thread. I hope you don't mind. (If you do, you can always close them.)
