A tabular Q-Learning agent that learns to balance a pole on a cart from scratch. It discretises the continuous state space into bins and builds an action-value table over 10,000 episodes of trial and error, improving from near-immediate failure (~25 steps per episode) to sustained balancing (150+ steps per episode).
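The two core ideas, binning a continuous observation and updating a table entry, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the project: the four-variable observation layout, the `STATE_BOUNDS` limits, `N_BINS`, the `ALPHA` and `GAMMA` values, the epsilon-greedy exploration rule, and all function names.

```python
import random
from collections import defaultdict

# Assumed bounds for (cart position, cart velocity, pole angle, pole angular velocity).
# These limits and the bin count are illustrative, not the project's actual settings.
STATE_BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.5, 3.5)]
N_BINS = 10
N_ACTIONS = 2   # push left, push right

ALPHA = 0.1     # learning rate (assumed)
GAMMA = 0.99    # discount factor (assumed)


def discretise(observation):
    """Map a continuous observation to a tuple of bin indices."""
    indices = []
    for value, (low, high) in zip(observation, STATE_BOUNDS):
        clipped = min(max(value, low), high)
        fraction = (clipped - low) / (high - low)
        # fraction == 1.0 would give N_BINS, so cap at the last bin.
        indices.append(min(int(fraction * N_BINS), N_BINS - 1))
    return tuple(indices)


def make_q_table():
    """Action-value table; unseen states start with all-zero action values."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(q_table, state, epsilon):
    """Epsilon-greedy selection (an assumed exploration rule)."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    values = q_table[state]
    return values.index(max(values))


def q_update(q_table, state, action, reward, next_state, done):
    """One Q-learning step: move Q(s, a) towards the bootstrapped target."""
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (target - q_table[state][action])
```

In a training loop, each environment step would call `discretise` on the new observation, pick an action with `choose_action`, and then apply `q_update` with the observed reward. A dictionary keyed by bin tuples is used here so that only visited states take up memory; a dense array indexed by bin would work equally well.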