Dataset listing

The univariate and multivariate classification problems are available in three formats: Weka ARFF, simple text files and sktime ts format. Weka does not allow for unequal length series, so in the ARFF files the unequal length problems are all padded with missing values. The ts format does allow for unequal length series.
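The padding described above can be illustrated with a minimal sketch. It assumes missing values are represented as NaN once the data is in memory, and the helper names `pad_series` and `strip_padding` are hypothetical, not part of Weka or sktime.

```python
import math

def pad_series(series_list, missing=float("nan")):
    """Pad every series to the length of the longest one with a missing marker."""
    max_len = max(len(s) for s in series_list)
    return [list(s) + [missing] * (max_len - len(s)) for s in series_list]

def strip_padding(series):
    """Drop trailing missing values, recovering the original length."""
    end = len(series)
    while end > 0 and math.isnan(series[end - 1]):
        end -= 1
    return series[:end]

# Two toy series of unequal length (illustrative values only).
raw = [[1.0, 2.0, 3.0], [4.0, 5.0]]
padded = pad_series(raw)                        # both now have length 3
restored = [strip_padding(s) for s in padded]   # back to lengths 3 and 2
```

Note that stripping trailing missing values cannot tell padding apart from genuinely missing observations at the end of a series, so this round trip is only safe when the original data has none.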

Univariate Weka formatted ARFF files and .txt files (about 500 MB).

Univariate sktime formatted ts files (about 300 MB).

Multivariate Weka formatted ARFF files (and .txt files) (about 2 GB).

Multivariate sktime formatted ts files (about 1.5 GB).

Lists of the data, including which are unequal length, can be found here. Details on loading sktime data with the Python package are here.
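To give a feel for the ts layout before turning to the package, here is a minimal pure-Python sketch of a reader. It is not the sktime loader. It assumes header lines start with `@`, data follows an `@data` line, dimensions are separated by `:`, values by `,`, missing values are written `?`, and a class label ends every data line; timestamps are not handled. The function name `parse_ts` and the toy sample are hypothetical.

```python
def parse_ts(text):
    """Parse ts-formatted text into (cases, labels).

    Each case is a list of dimensions; each dimension is a list of floats.
    Assumes no timestamps and that a class label ends every data line.
    """
    cases, labels = [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not in_data:
            if line.lower().startswith("@data"):
                in_data = True
            continue  # header metadata is ignored in this sketch
        *dims, label = line.split(":")
        cases.append([
            [float("nan") if v.strip() == "?" else float(v) for v in d.split(",")]
            for d in dims
        ])
        labels.append(label.strip())
    return cases, labels

# Hypothetical toy problem with two univariate cases of unequal length.
sample = """\
@problemName Toy
@timeStamps false
@univariate true
@classLabel true a b
@data
1.0,2.0,3.0:a
4.0,5.0:b
"""
cases, labels = parse_ts(sample)
```

Because each case keeps its own list of values, the second series stays at length 2 with no padding, which is the property that distinguishes ts from ARFF here.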

To store multivariate series in ARFF we take advantage of relational attributes. These are fairly unintuitive, so we have provided an overview of them, and of other basic features of loading data and building classifiers, here.
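As a rough illustration of how a relational attribute holds a multivariate series, the sketch below parses a single data row. It assumes the row is a quoted string in which each dimension is a comma-separated list and dimensions are separated by a literal `\n` escape, followed by a comma and the class label. The function name `parse_relational_row` and the example row are hypothetical, so check them against the actual files before relying on this.

```python
def parse_relational_row(row):
    """Split one multivariate ARFF data row into (dimensions, label).

    Assumes the layout '<dim0>\\n<dim1>...',<label>, where the quoted
    relational value separates dimensions with a literal backslash-n.
    """
    # The class label follows the last comma; everything before it is
    # the quoted relational value.
    quoted, _, label = row.strip().rpartition(",")
    body = quoted.strip().strip("'\"")
    dims = [
        [float("nan") if v == "?" else float(v) for v in d.split(",")]
        for d in body.split("\\n")
    ]
    return dims, label.strip()

# Hypothetical row: two dimensions of length three, class "walking".
row = r"'1.0,2.0,3.0\n4.0,5.0,6.0',walking"
dims, label = parse_relational_row(row)
```

Missing values written as `?` are mapped to NaN, which is how the padding for unequal length series would surface under these assumptions.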

These files provide a simple list of the data characteristics for univariate and multivariate problems.

To see how accurate different classifiers are on these data, see the results page.

More information on the datasets is given below. Problems with variable length series are listed as length 0.

Dataset Train Size Test Size Length No. of Classes Type