On Universal Sequential Classification from Sequentially Observed Empirical Statistics
Journal
2022 IEEE Information Theory Workshop, ITW 2022
ISBN
9781665483414
Date Issued
2022-01-01
Author(s)
Abstract
The focus of this paper is on binary classification of a sequentially observed stream of i.i.d. samples, based on sequentially observed empirical statistics. The decision maker (classifier) sequentially observes a sequence of testing data sampled i.i.d. from one of two unknown distributions P0 and P1. In addition, it receives two sequences of training data, which are also sequentially sampled in an i.i.d. fashion from the two unknown distributions, respectively. Since the distributions are unknown, it is natural to impose a constraint, either on the expected stopping times or on the error probabilities, that has to be satisfied universally over all possible pairs of distributions (P0, P1). For both settings, we develop tests that are asymptotically optimal within the class of tests satisfying the respective universality constraints. For expected-stopping-time universality, the optimal error exponents are shown to be the Rényi divergences of order α/(1+α), where α is the ratio of the length of a training data sequence to that of the testing data sequence. For error-probability universality, the optimal expected stopping times, normalized by the logarithm of the error probability, are the reciprocals of the α-weighted generalized Jensen-Shannon (GJS) divergences. The proposed sequential tests are both based on threshold tests of two statistics, each of which is the α-weighted GJS divergence of the type of the testing sequence from that of a training sequence. Interestingly, to achieve asymptotic optimality, the stopping and decision rules in the two universality setups take a "matching" and a "discriminating" viewpoint, respectively, in designing the threshold tests.
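
As a rough illustration of the statistics described in the abstract, the following Python sketch computes the α-weighted GJS divergence between the empirical type of the testing sequence and that of each training sequence, and runs a simple threshold-based stopping/decision rule. This is only a schematic reading of the abstract, not the paper's actual tests: the function names, the fixed threshold, and the particular ("discriminating"-style) stopping rule are illustrative assumptions. The GJS expression used is the standard Gutman-style definition GJS(P, Q, α) = α D(P‖M) + D(Q‖M) with M = (αP + Q)/(1 + α).

    import numpy as np

    def kl(p, q):
        # Kullback-Leibler divergence D(p || q) on a finite alphabet,
        # with the convention 0 * log(0/q) = 0.
        mask = p > 0
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

    def gjs(train_type, test_type, alpha):
        # alpha-weighted generalized Jensen-Shannon divergence (Gutman-style):
        # GJS(P, Q, alpha) = alpha * D(P || M) + D(Q || M), M = (alpha*P + Q)/(1+alpha).
        m = (alpha * train_type + test_type) / (1.0 + alpha)
        return alpha * kl(train_type, m) + kl(test_type, m)

    def empirical_type(samples, alphabet_size):
        counts = np.bincount(np.asarray(samples), minlength=alphabet_size)
        return counts / counts.sum()

    def sequential_classify(test_stream, train0, train1, alpha, alphabet_size, threshold):
        # Hypothetical "discriminating"-style rule: stop once the testing type is far
        # (in GJS divergence) from one training type while still close to the other,
        # and declare the opposite class. Assumes a non-empty testing stream and
        # training streams long enough to supply alpha * n samples at each time n.
        for n in range(1, len(test_stream) + 1):
            m = max(1, int(alpha * n))  # training samples revealed by testing time n
            q = empirical_type(test_stream[:n], alphabet_size)
            s0 = gjs(empirical_type(train0[:m], alphabet_size), q, alpha)
            s1 = gjs(empirical_type(train1[:m], alphabet_size), q, alpha)
            if s0 > threshold and s1 <= threshold:
                return 1, n  # testing data looks unlike class 0 -> declare class 1
            if s1 > threshold and s0 <= threshold:
                return 0, n
        return (0 if s0 < s1 else 1), len(test_stream)

    # Toy usage on a binary alphabet (illustrative parameters only)
    rng = np.random.default_rng(0)
    P0, P1 = np.array([0.7, 0.3]), np.array([0.3, 0.7])
    train0 = rng.choice(2, size=2000, p=P0)
    train1 = rng.choice(2, size=2000, p=P1)
    test = rng.choice(2, size=1000, p=P1)
    print(sequential_classify(test, train0, train1, alpha=2.0, alphabet_size=2, threshold=0.1))

In the paper's setting, the thresholds would be chosen as functions of the target error probability or expected stopping time so as to meet the respective universality constraint; the fixed constant threshold above is purely for illustration.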
Type
conference paper
