public class IntervalEstimationGambler : GamblerBase

Thread Safety

Public static (Shared in Visual Basic) members of this type are safe
for multithreaded operations. Instance members are not guaranteed to be
thread-safe.

Remarks

Intuitively, the Interval Estimation algorithm makes an optimistic reward estimate for each lever: the upper bound of a 100 (1 - alpha) % confidence interval on the lever's mean reward. The lever with the highest estimated upper bound is then pulled. Note that a smaller alpha leads to more exploration, because it widens the confidence intervals. To compute the upper bound of the mean estimate, the current implementation relies on the assumption that the lever mean estimate is normally distributed.
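
The selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of this class: the type IntervalEstimationSketch and its methods UpperBound and SelectLever are hypothetical names, the upper bound is assumed to take the usual normal-theory form mean + z * stdDev / sqrt(n), and the caller is assumed to supply z, the (1 - alpha) quantile of the standard normal distribution (for example, about 1.645 for alpha = 0.05).

```csharp
using System;

// Hypothetical helper for illustration only; not part of this library's API.
public static class IntervalEstimationSketch
{
    // Optimistic estimate of a lever's mean reward, assuming the mean
    // estimate is normally distributed: mean + z * stdDev / sqrt(n).
    // z is the (1 - alpha) standard normal quantile, supplied by the caller.
    public static double UpperBound(double mean, double stdDev, int observations, double z)
    {
        if (observations <= 0)
        {
            // Assumption: a lever that was never pulled gets an unbounded
            // estimate, so it is tried before any observed lever.
            return double.PositiveInfinity;
        }
        return mean + z * stdDev / Math.Sqrt(observations);
    }

    // Returns the index of the lever with the highest upper bound
    // (the first such lever in case of ties).
    public static int SelectLever(double[] means, double[] stdDevs, int[] counts, double z)
    {
        int best = 0;
        double bestBound = double.NegativeInfinity;
        for (int i = 0; i < means.Length; i++)
        {
            double bound = UpperBound(means[i], stdDevs[i], counts[i], z);
            if (bound > bestBound)
            {
                bestBound = bound;
                best = i;
            }
        }
        return best;
    }
}
```

With means { 0.5, 0.4 }, standard deviations { 0.1, 0.5 }, observation counts { 100, 4 } and z = 1.645, the bounds are 0.51645 and 0.81125, so SelectLever returns 1: the rarely pulled lever wins despite its lower mean. With z = 0 the same call returns 0, which illustrates why a larger quantile (a smaller alpha) leads to more exploration.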

The Interval Estimation algorithm is due to Kaelbling (1993), Learning in Embedded Systems (MIT Press).