Epsilon first gambler.
For a list of all members of this type, see EpsilonFirstGambler Members.
System.Object
Gambler
GamblerBase
EpsilonFirstGambler
Public static (Shared in Visual Basic) members of this type are safe for multithreaded operations. Instance members are not guaranteed to be thread-safe.
The epsilon first strategy is similar to the one used by EpsilonGreedyGambler, except that all the exploration is performed up front: a pure exploration phase is followed by a pure exploitation phase.
Note that, by design, the epsilon first strategy is not an online strategy. The number of rounds that will be played has to be known in advance; otherwise it is impossible to decide how many rounds to spend in the exploration phase.
This algorithm is analysed in "PAC Bounds for Multi-Armed Bandit and Markov Decision Processes" by Even-Dar, Mannor and Mansour (COLT 2002).
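The two-phase schedule described in the remarks can be illustrated with a minimal, language-neutral sketch. It is written in Python and is not the API of the Bandit library: the function epsilon_first, its parameters, and the representation of arms as callables are all hypothetical. The sketch also assumes that exploration picks arms uniformly at random and that the exploitation phase commits to the arm with the best empirical mean at the end of exploration; the actual EpsilonFirstGambler may differ on both points.

```python
import random


def epsilon_first(arms, total_rounds, epsilon, rng=None):
    """Play `total_rounds` rounds with an epsilon-first schedule (illustrative only).

    `arms` is a list of callables; each takes a random.Random and returns a reward.
    Returns (total_reward, pulls_per_arm).
    """
    if not arms:
        raise ValueError("at least one arm is required")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    rng = rng or random.Random()

    # The horizon must be known in advance to size the exploration phase.
    explore_rounds = int(epsilon * total_rounds)

    reward_sums = [0.0] * len(arms)
    pulls = [0] * len(arms)
    total_reward = 0.0
    best = 0

    def mean(k):
        # Arms that were never pulled rank last.
        return reward_sums[k] / pulls[k] if pulls[k] else float("-inf")

    for t in range(total_rounds):
        if t < explore_rounds:
            # Pure exploration (assumption: uniform random choice of arm).
            arm = rng.randrange(len(arms))
        else:
            if t == explore_rounds:
                # Assumption: commit once to the best empirical mean.
                best = max(range(len(arms)), key=mean)
            arm = best
        reward = arms[arm](rng)
        reward_sums[arm] += reward
        pulls[arm] += 1
        total_reward += reward

    return total_reward, pulls
```

For instance, calling epsilon_first with two arms, total_rounds=100 and epsilon=0.2 spends roughly the first 20 rounds exploring and all remaining rounds on whichever arm looked best at that point. Because the split is computed from total_rounds before the first pull, the sketch cannot be run without a fixed horizon, which is the sense in which the strategy is not online.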
Namespace: Bandit.Stochastic
Assembly: Bandit (in Bandit.dll)
EpsilonFirstGambler Members | Bandit.Stochastic Namespace