Deep Q-Learning (DQN) combines Q-learning with a neural network that approximates action values, teaching the agent optimal gameplay.
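A minimal sketch of what such a network could look like in PyTorch. The layer sizes and the 11-dimensional state vector are illustrative assumptions, not details taken from the original project.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a game-state vector to one Q-value per action (up, down, left, right)."""
    def __init__(self, state_size: int = 11, hidden_size: int = 128, num_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden_size),  # sizes are placeholder values
            nn.ReLU(),
            nn.Linear(hidden_size, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)
```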
The agent observes the game state: snake position, food location, and obstacles. This becomes the input to the neural network.
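To illustrate what that input might look like, here is a hypothetical state encoder. The specific features (relative food position, current direction, danger flags for adjacent cells) are assumptions for the sketch, not the author's exact encoding.

```python
import numpy as np

def encode_state(snake, food, direction, grid_size) -> np.ndarray:
    """Hypothetical encoding. Assumes `snake` is a list of (x, y) cells with the
    head first, `food` is an (x, y) cell, and `direction` is a 4-element one-hot."""
    head_x, head_y = snake[0]
    food_dx = float(np.sign(food[0] - head_x))  # -1, 0, or 1
    food_dy = float(np.sign(food[1] - head_y))

    def blocked(x, y):
        # A cell is dangerous if it is outside the grid or occupied by the snake's body.
        return x < 0 or y < 0 or x >= grid_size or y >= grid_size or (x, y) in snake

    danger = [blocked(head_x, head_y - 1),  # up (assumes y grows downward)
              blocked(head_x, head_y + 1),  # down
              blocked(head_x - 1, head_y),  # left
              blocked(head_x + 1, head_y)]  # right
    return np.array([food_dx, food_dy, *direction, *danger], dtype=np.float32)
```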
The neural network predicts Q-values for each action (up, down, left, right). The agent picks the action with the highest predicted Q-value, i.e. the highest expected future reward.
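A sketch of that selection step. The epsilon-greedy exploration term is a standard DQN ingredient assumed here rather than stated in the text above.

```python
import random
import torch

ACTIONS = ["up", "down", "left", "right"]

def select_action(q_network, state, epsilon: float = 0.1) -> int:
    """Explore with probability epsilon, otherwise pick the highest-Q action."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        q_values = q_network(torch.as_tensor(state).unsqueeze(0))  # shape (1, num_actions)
    return int(q_values.argmax(dim=1).item())
```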
After each move, the agent receives a reward (+10 for food, -10 for collision). The network updates its weights using backpropagation to improve future decisions.
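One way such an update can look, using the standard Q-learning target r + γ · max Q(s′, a′). The discount factor and optimizer settings are assumptions; the optimizer could be created as, for example, torch.optim.Adam(q_network.parameters(), lr=1e-3).

```python
import torch
import torch.nn.functional as F

GAMMA = 0.9  # assumed discount factor

def train_step(q_network, optimizer, state, action, reward, next_state, done):
    """Single gradient step toward the Q-learning target for one transition."""
    state = torch.as_tensor(state).unsqueeze(0)
    next_state = torch.as_tensor(next_state).unsqueeze(0)

    q_pred = q_network(state)[0, action]  # Q-value of the action actually taken
    with torch.no_grad():
        q_next = q_network(next_state).max(dim=1).values[0]
        target = torch.tensor(reward, dtype=torch.float32)
        if not done:
            target = target + GAMMA * q_next  # bootstrap from the best next action

    loss = F.mse_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()  # backpropagation updates the network weights
    optimizer.step()
    return loss.item()
```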
Past experiences are stored in memory and randomly sampled during training. This breaks correlation between consecutive states and stabilizes learning.
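A compact sketch of such a replay buffer; the capacity and batch size are placeholder values.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (state, action, reward, next_state, done) tuples and samples them uniformly."""
    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences drop out automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int = 64):
        # Uniform random sampling breaks the correlation between consecutive states.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```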
The timeless Snake game reimagined as an AI training environment
Neural networks learn optimal strategies through trial and error
Watch performance improve over thousands of training episodes