The univariate and multivariate classification problems are available in three formats: Weka ARFF, simple text files and aeon ts format. Weka does not allow for unequal length series, so the unequal length problems are all padded with missing values in the ARFF files. The ts format does support unequal length series.
Univariate Weka formatted ARFF files and .txt files (about 500 MB).
Univariate aeon formatted ts files (about 300 MB).
Multivariate Weka formatted ARFF files (and .txt files) (about 2 GB).
Multivariate aeon formatted ts files (about 1.5 GB).
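As a rough illustration of what padding unequal length series with missing values means, the sketch below pads a few toy series to a common length. It is not the code used to produce the ARFF files: the function name `pad_with_missing` and the toy values are hypothetical, and NaN is assumed here as the missing-value marker (ARFF itself writes missing values as `?`).

```python
import numpy as np


def pad_with_missing(series_list):
    """Pad 1-D series of unequal length to a common length with NaN."""
    max_len = max(len(s) for s in series_list)
    # Start from an all-NaN array, then copy each series into the front of its row.
    padded = np.full((len(series_list), max_len), np.nan)
    for i, s in enumerate(series_list):
        padded[i, : len(s)] = s
    return padded


# Toy, made-up series of lengths 3, 2 and 1 (not taken from any archive dataset).
series = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
padded = pad_with_missing(series)
print(padded.shape)
print(padded)
```

Every row ends up with the length of the longest series, and the trailing positions of the shorter series hold the missing marker.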
Lists of the datasets, including which contain unequal length series, can be found here. You can load data directly from this website using the aeon toolkit. Run pip install aeon, then run:

    from aeon.datasets import load_classification

    # return_metadata=True asks for the metadata dictionary as a third return value
    X, y, meta_data = load_classification("GunPoint", return_metadata=True)
    print(" Shape of X = ", X.shape)
    print(" Meta data = ", meta_data)

More details on loading are here.
To store multivariate series in ARFF we take advantage of relational attributes. These are fairly unintuitive, so we have provided an overview of this and other basic features of loading data and building classifiers here.
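To make the relational layout a little more concrete, here is a minimal sketch of how one data line of a multivariate ARFF file might be unpacked. It assumes that the relational attribute is written as a single quoted string, with one dimension per row and rows separated by the two characters `\n`, followed by the class label. The data line, the label `walking` and the helper `parse_relational_line` are all hypothetical, and the sketch does not handle missing values (`?`) or the ARFF header.

```python
import numpy as np

# Hypothetical data line: two dimensions of length three, then the class label.
# The raw string keeps "\n" as a literal backslash followed by "n".
line = r"'1.0,2.0,3.0\n4.0,5.0,6.0',walking"


def parse_relational_line(line):
    """Split one data line into a (n_dimensions, series_length) array and a label."""
    # The class label is the last comma-separated field.
    quoted, label = line.rsplit(",", 1)
    body = quoted.strip("'\"")
    # Each dimension is separated by a literal backslash + n, not a real newline.
    rows = body.split("\\n")
    values = [[float(v) for v in row.split(",")] for row in rows]
    return np.array(values), label


X, label = parse_relational_line(line)
print(X.shape, label)
```

In practice a proper ARFF reader, or the ts files loaded through aeon, is the safer route; the sketch only shows why each case carries a small table rather than a flat row of values.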
To see how accurate different classifiers are on these data, see the results page.
More information on the datasets is given below. Problems with variable length series are listed as length 0.
|Dataset||Train Size||Test Size||Length||No. of Classes||Type|