This study examines how Reinforcement Learning (RL) can be used to model player behavior on Electronic Gaming Machines (EGMs) found in venues such as casinos. The gaming industry is keen to understand the different types of player behavior and to create virtual players that mimic them. To this end, we grouped playing styles with K-means clustering, determined termination states for one of the resulting playing behaviors, and trained RL agents to reproduce that behavior. We implemented two models, Proximal Policy Optimization (PPO) and Actor Critic using Kronecker-Factored Trust Region (ACKTR), and rewarded the agents based on their proximity to the termination states. Our findings suggest that the ACKTR model outperformed the PPO model, with the generated playing behavior showing a high level of statistical similarity to real-world player behavior within the selected cluster.
Article ID: 2023L15
Publisher: Canadian Artificial Intelligence Association
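
As an illustration of rewarding an agent based on its proximity to termination states, the following is a minimal sketch and not the study's implementation. It assumes that states are numeric feature vectors and that the reward is the negative Euclidean distance to the nearest termination state. The function name `proximity_reward`, the two-dimensional session features, and the example values are all hypothetical.

```python
import math


def proximity_reward(state, termination_states):
    """Return a reward that increases as `state` approaches a termination state.

    Assumption: the reward is the negative Euclidean distance to the closest
    termination state. The actual reward shaping used in the study is not
    specified in the abstract.
    """
    nearest = min(math.dist(state, t) for t in termination_states)
    return -nearest


# Hypothetical termination states for one player cluster, expressed as
# (number of games played, remaining balance). These values are made up.
termination_states = [(120.0, 0.0), (80.0, 1.5)]

# A state close to a termination state earns a higher (less negative) reward
# than a state far from every termination state.
near = proximity_reward((117.0, 4.0), termination_states)
far = proximity_reward((10.0, 50.0), termination_states)
print(near > far)
```

Under these assumptions, an agent that maximizes this reward is pushed toward ending its session in states similar to those where real players in the cluster stopped playing.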