The Multi-Armed Bandit Problem and Its Solutions

The algorithms are implemented for a Bernoulli bandit in lilianweng/multi-armed-bandit.

Exploitation vs Exploration

The exploration vs exploitation dilemma arises in many aspects of life. Say your favorite restaurant is right around the corner. If you go there every day, you can be confident of what you will get, but you miss the chance of discovering an even better option. If you try new places all the time, you will very likely have to eat unpleasant food from time to time. Similarly, online advertisers try to balance the known most attractive ads against new ads that might be even more successful.
