Interactive tabular RL in four rooms

A browser-based walkthrough of model-free Q-learning, Dyna-Q, and hindsight experience replay in a classic four-rooms gridworld. Fill in the JavaScript updates, switch to the ground truth, and watch the learned values and policies change in real time.
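
To give a sense of the kind of update you would fill in, here is a minimal sketch of a tabular Q-learning step in JavaScript. The function names (`makeQTable`, `qLearningUpdate`), the row-per-state table layout, and the default step size and discount are illustrative assumptions, not the walkthrough's actual code.

```javascript
// Hypothetical helper: Q table with one row per state and one column per action.
function makeQTable(numStates, numActions) {
  return Array.from({ length: numStates }, () => new Array(numActions).fill(0));
}

// One tabular Q-learning update for the transition (s, a, r, sNext).
// alpha (step size) and gamma (discount) defaults are assumed values.
function qLearningUpdate(Q, s, a, r, sNext, done, alpha = 0.1, gamma = 0.99) {
  // Bootstrap from the greedy value of the next state unless the episode ended.
  const nextBest = done ? 0 : Math.max(...Q[sNext]);
  const target = r + gamma * nextBest;
  // Move the current estimate a fraction alpha toward the target.
  Q[s][a] += alpha * (target - Q[s][a]);
  return Q[s][a];
}
```

Dyna-Q can reuse the same update on transitions replayed from a learned model, and hindsight experience replay can reuse it on transitions relabeled with a different goal, so under these assumptions only the source of the `(s, a, r, sNext)` tuples changes.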