RL Board Game Agent

Python · Keras · PyQt5 · Reinforcement Learning


The Problem

Board games provide an ideal testbed for reinforcement learning algorithms — they have clear rules, discrete action spaces, and measurable outcomes. This project tackles the challenge of training an agent to play a 7×7 board game from scratch, learning purely through self-play without any human-provided strategy knowledge.

The Approach

The agent uses Deep Q-Learning, implemented with a Deep Q-Network (DQN), to learn a policy for the game. Starting with no knowledge beyond the rules, it improves through millions of self-play episodes.

Deep Q-Network

A neural network approximates the Q-value function, mapping game states to action values. The network architecture is configurable — allowing experimentation with different depths and widths to find the best capacity for the game's complexity.
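As a rough illustration of the state-to-action-values mapping, here is a framework-free sketch of a fully connected network with configurable hidden layers. The project itself builds its network in Keras; the names `init_mlp` and `q_values`, the flattened 49-cell input, and the one-output-per-cell action encoding are assumptions made for this example, not the project's actual architecture.

```python
import random

BOARD_CELLS = 7 * 7  # flattened 7x7 board; assumes one input per cell


def init_mlp(hidden_sizes, n_inputs=BOARD_CELLS, n_actions=BOARD_CELLS, seed=0):
    """Create weights for a fully connected net with the given hidden widths."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_actions]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        scale = (2.0 / fan_in) ** 0.5  # He-style scaling, suited to ReLU layers
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def q_values(layers, state):
    """Forward pass: ReLU on hidden layers, linear output (one Q-value per action)."""
    activations = list(state)
    for index, (weights, biases) in enumerate(layers):
        outputs = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_last = index == len(layers) - 1
        activations = outputs if is_last else [max(0.0, v) for v in outputs]
    return activations
```

Changing `hidden_sizes` (for example `[64, 64]` versus `[256]`) is the kind of depth and width experiment the configurable architecture is meant to support.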

Experience Replay

The agent stores past experiences (state, action, reward, next state) in a replay buffer and samples random mini-batches for training. This breaks the correlation between consecutive experiences and stabilizes learning.
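A minimal sketch of such a buffer, using only the standard library, might look like this. The class name `ReplayBuffer`, the default capacity, and the extra `done` flag (marking terminal moves) are assumptions for illustration and may differ from the project's implementation.

```python
import random
from collections import deque, namedtuple

# `done` is an assumed extra field; the text lists only the first four.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    def __init__(self, capacity=10_000, seed=None):
        self._memory = deque(maxlen=capacity)  # oldest transitions drop out first
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return self._rng.sample(list(self._memory), batch_size)

    def __len__(self):
        return len(self._memory)
```

A bounded `deque` keeps memory use fixed: once the buffer is full, each new transition evicts the oldest one.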

Self-Play Training

The agent plays against copies of itself, continuously improving its strategy. As the agent gets stronger, so does its opponent — creating an ever-increasing difficulty curve.
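One common way to arrange this is to train against a frozen snapshot of the agent and refresh that snapshot periodically. The sketch below assumes that scheme; the `play_episode` callback and the `sync_every` interval are hypothetical, and the project may refresh its opponent differently.

```python
import copy


def self_play(learner, play_episode, n_episodes, sync_every=100):
    """Train `learner` against a frozen copy that is refreshed periodically."""
    opponent = copy.deepcopy(learner)  # frozen snapshot of the current agent
    for episode in range(1, n_episodes + 1):
        # Hypothetical callback: plays one game and updates `learner` in place.
        play_episode(learner, opponent)
        if episode % sync_every == 0:
            opponent = copy.deepcopy(learner)  # opponent catches up with the learner
    return opponent
```

Freezing the opponent between refreshes gives the learner a stable target to improve against, while each refresh raises the difficulty to the learner's current level.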

Key Features

  • Deep Q-Learning with experience replay and target networks
  • Configurable neural network — experiment with different architectures
  • Self-play training loop for continuous improvement without human data
  • Multiple play modes — Player vs Player, Player vs AI, AI vs AI
  • PyQt5 GUI for interactive gameplay and visualization
  • 7×7 board with custom game rules
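The target network mentioned in the first feature is typically a slowly updated copy of the Q-network used to compute training targets. The sketch below shows the standard one-step target, r + γ · max Q_target(s', a'), under stated assumptions: `target_q_fn` is a hypothetical callable returning the target network's Q-values for a state, each transition carries a `done` flag, and `next_state` is the position at the agent's next turn.

```python
def td_targets(batch, target_q_fn, gamma=0.99):
    """Compute reward + gamma * max_a' Q_target(next_state, a') per transition."""
    targets = []
    for state, action, reward, next_state, done in batch:
        if done:
            targets.append(reward)  # no future value after a terminal move
        else:
            targets.append(reward + gamma * max(target_q_fn(next_state)))
    return targets
```

Computing targets from a separate, less frequently updated network keeps them from shifting on every gradient step, which is the usual motivation for the technique.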

Architecture

graph LR
    A[Game State] --> B[DQN]
    B --> C[Q-Values per Action]
    C --> D[ε-Greedy Selection]
    D --> E[Action]
    E --> F[Environment Step]
    F --> G[Reward + Next State]
    G --> H[(Replay Buffer)]
    H --> B
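The ε-greedy selection step in the diagram can be sketched as follows. Restricting the choice to `legal_actions` is an assumption (the custom rules are not described here), as are the function name and the integer action indices.

```python
import random


def epsilon_greedy(q_values, legal_actions, epsilon, rng=random):
    """Pick a random legal action with probability epsilon, else the best legal one."""
    if rng.random() < epsilon:
        return rng.choice(legal_actions)  # explore
    return max(legal_actions, key=lambda a: q_values[a])  # exploit
```

Training usually starts with a high `epsilon` and decays it over time, so the agent explores widely early on and relies more on its learned Q-values later.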

Tech Stack

| Component      | Technology                |
|----------------|---------------------------|
| RL Framework   | Custom DQN implementation |
| Neural Network | Keras                     |
| GUI            | PyQt5                     |
| Language       | Python                    |