Fundamentals of Reinforcement Learning: Estimating the Action-Value Function
In this article, we introduce fundamental concepts of reinforcement learning, including the k-armed bandit problem, action-value function estimation, and the exploration vs. exploitation dilemma.