This page includes results we did not include in the paper and experiments we have run since the paper's publication. When displaying these results, we default to the tsml implementation of HC2 but may use the aeon implementation if it is more relevant, for example if we are testing a new component which is implemented in aeon but not tsml, such as variations on ROCKET or TSFresh.
We provide two implementations of the HIVE-COTE 2.0 algorithm: a Java implementation in the tsml package and a Python implementation in the aeon package. We compare both implementations of HC2 and its components (as of 18/03/2022). As the aeon package develops, the implementations may become more efficient.
tsml results:
| | HC2 | DrCIF | TDE | STC | Arsenal |
---|---|---|---|---|---|
Accuracy | 0.8912 | 0.8637 | 0.8589 | 0.8585 | 0.8659 |
Average Train Time (Minutes) | 187.1848 | 24.3228 | 40.3987 | 66.823 | 14.9539 |
Total Train Time (Hours) | 349.4112 | 45.4025 | 75.4109 | 124.7362 | 27.9139 |
aeon results:
| | HC2 | DrCIF | TDE | STC | Arsenal |
---|---|---|---|---|---|
Accuracy | 0.8926 | 0.8637 | 0.8609 | 0.8555 | 0.8657 |
Average Train Time (Minutes) | 275.856 | 85.3483 | 45.2662 | 125.9338 | 5.0892 |
Total Train Time (Hours) | 514.9313 | 159.3168 | 84.497 | 235.0764 | 9.4999 |
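None of the classifiers differ significantly in accuracy from their counterparts in the other language. The Python implementation is considerably slower overall: HC2 takes roughly 1.5 times as long to train in aeon as in tsml, in part because STC is contracted for 2 hours in aeon instead of 1. Arsenal is the exception, training roughly three times faster in aeon. Both HC2 implementations are capable of contracting the full classifier.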
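To make the timing gap concrete, the following minimal sketch divides the aeon average train times by the tsml ones, using the figures from the two tables above. The dictionary and function names are illustrative only and are not part of either package.

```python
# Average train times in minutes, copied from the tsml and aeon tables above.
tsml_minutes = {"HC2": 187.1848, "DrCIF": 24.3228, "TDE": 40.3987,
                "STC": 66.823, "Arsenal": 14.9539}
aeon_minutes = {"HC2": 275.856, "DrCIF": 85.3483, "TDE": 45.2662,
                "STC": 125.9338, "Arsenal": 5.0892}


def slowdown(tsml, aeon):
    """Return the aeon train time divided by the tsml train time per classifier."""
    return {name: aeon[name] / tsml[name] for name in tsml}


ratios = slowdown(tsml_minutes, aeon_minutes)
for name, ratio in ratios.items():
    # A ratio above 1 means the aeon version is slower, below 1 means faster.
    print(f"{name}: aeon takes {ratio:.2f}x the tsml train time")
```

A ratio below 1 (as for Arsenal) indicates that the aeon version is the faster of the two.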
In HIVE-COTE 2.0 we replace the cross-validation train accuracy estimates with out-of-bag estimates, which require only a single model. Here we compare both estimates for the HC2 components. Except for TDE, out-of-bag estimates are noticeably faster. TDE evaluates all members of its ensemble using leave-one-out cross-validation as part of the building process; by retaining those estimates, the full ensemble estimate is essentially free.
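The general idea of an out-of-bag estimate can be sketched as follows. This is a toy illustration of generic bagging with a 1-nearest-neighbour base learner on scalar data, not the procedure used by any HC2 component; all names (`nearest_label`, `oob_accuracy`, the toy data) are hypothetical. The point it shows is that every training instance is scored only by ensemble members that did not see it, so a single ensemble build yields the estimate, whereas k-fold cross-validation would need k further builds.

```python
import random
from collections import Counter


def nearest_label(bag, x):
    """1-NN on scalar features: return the label of the closest point in the bag."""
    return min(bag, key=lambda item: abs(item[0] - x))[1]


def oob_accuracy(data, n_members=25, seed=0):
    """Build one bagged ensemble and estimate train accuracy from out-of-bag votes."""
    rng = random.Random(seed)
    n = len(data)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_members):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
        bag = [data[i] for i in in_bag]
        # Only instances this member never saw may receive a vote from it.
        for i in set(range(n)) - set(in_bag):
            votes[i][nearest_label(bag, data[i][0])] += 1
    # Score each instance that was out-of-bag at least once by majority vote.
    scored = [(v.most_common(1)[0][0], data[i][1]) for i, v in enumerate(votes) if v]
    return sum(pred == label for pred, label in scored) / len(scored)


# Toy data: (feature, label) pairs forming two well-separated classes.
data = [(i / 10, 0) for i in range(10)] + [(2 + i / 10, 1) for i in range(10)]
estimate = oob_accuracy(data)
print(f"Out-of-bag train accuracy estimate: {estimate:.3f}")
```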
Out-of-bag accuracy estimates:
| | Arsenal-oob | TDE-oob | DrCIF-oob | STC-oob |
---|---|---|---|---|
Train Accuracy | 0.8542 | 0.8475 | 0.8422 | 0.8709 |
Difference to Test Accuracy | -0.0117 | -0.0114 | -0.0215 | 0.0124 |
Average Train Time (Minutes) | 8.9383 | 3.564 | 15.6845 | 11.4425 |
Total Train Time (Hours) | 8.7794 | 2.4126 | 4.6385 | 5.1934 |
Cross-validation accuracy estimates:
| | Arsenal-cv | TDE-cv | DrCIF-cv | STC-cv |
---|---|---|---|---|
Train Accuracy | 0.8621 | 0.8866 | 0.8546 | 0.8823 |
Difference to Test Accuracy | -0.0038 | 0.0277 | -0.0091 | 0.0238 |
Average Train Time (Minutes) | 67.2417 | 0.0002 | 31.2117 | 102.5649 |
Total Train Time (Hours) | 65.0597 | 0.00004 | 7.1529 | 42.5626 |
We compare four versions of HIVE-COTE 2.0. HC2-oob is the version used in our publication, where each component uses out-of-bag error. For HC2-cv, all components use cross-validation. HC2-fastest uses the fastest train estimate method for each component: cross-validation for TDE and out-of-bag error for the rest. Lastly, HC2-closest uses, for each component, the method whose train accuracy estimate is closest to the actual test accuracy: out-of-bag estimates for TDE and STC and cross-validation estimates for DrCIF and Arsenal.
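The choices behind HC2-fastest and HC2-closest can be reproduced from the out-of-bag and cross-validation tables above. The sketch below is illustrative only: the `estimates` dictionary and the `pick` helper are hypothetical names, and the numbers are copied from those two tables.

```python
# Per component and estimate method, copied from the tables above:
# (difference to test accuracy, average train time in minutes).
estimates = {
    "Arsenal": {"oob": (-0.0117, 8.9383), "cv": (-0.0038, 67.2417)},
    "TDE": {"oob": (-0.0114, 3.564), "cv": (0.0277, 0.0002)},
    "DrCIF": {"oob": (-0.0215, 15.6845), "cv": (-0.0091, 31.2117)},
    "STC": {"oob": (0.0124, 11.4425), "cv": (0.0238, 102.5649)},
}


def pick(estimates, score):
    """For each component, return the estimate method with the lowest score."""
    return {
        name: min(methods, key=lambda method: score(methods[method]))
        for name, methods in estimates.items()
    }


# HC2-fastest: lowest average train time per component.
fastest = pick(estimates, score=lambda figures: figures[1])
# HC2-closest: smallest absolute gap between train estimate and test accuracy.
closest = pick(estimates, score=lambda figures: abs(figures[0]))

print("HC2-fastest:", fastest)
print("HC2-closest:", closest)
```

With these figures, the selection agrees with the description above: only TDE uses cross-validation in HC2-fastest, while HC2-closest mixes out-of-bag estimates (TDE, STC) with cross-validation estimates (DrCIF, Arsenal).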
Of the HC2 variants, HC2-fastest is unsurprisingly the fastest. Unexpectedly, it is also significantly better than HC2-oob, the out-of-bag estimate version.
| | HC2-fastest | HC2-cv | HC2-oob | HC2-closest |
---|---|---|---|---|
Accuracy | 0.8917 | 0.892 | 0.8912 | 0.8912 |
Average Train Time (Minutes) | 183.4415 | 348.9466 | 187.1848 | 261.1385 |
Total Train Time (Hours) | 342.4237 | 651.3661 | 349.4112 | 487.458 |
With the exception of TDE, which did not have multivariate capabilities prior to HC2, all component classifiers use the multivariate versions used in the 2021 multivariate time series bake off. STC creates an ensemble of classifiers, one built on each dimension in the dataset. Given that each classifier is also contracted for an hour, this can get out of hand quite quickly. DuckDuckGeese has 1345 dimensions, so in theory it would take 56 days to train if each ensemble member was trained sequentially, not including the required train accuracy estimate.
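The 56 day figure follows directly from the numbers in the paragraph above, assuming every per-dimension ensemble member uses its full one hour contract and members are trained one after another:

```python
n_dimensions = 1345   # dimensions in DuckDuckGeese (from the text)
contract_hours = 1    # contract time per ensemble member (from the text)

# Sequential training: one contracted classifier per dimension.
total_hours = n_dimensions * contract_hours
total_days = total_hours / 24

print(f"{total_hours} hours is about {total_days:.0f} days")
```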
We test a faster solution, where the dimension for each shapelet extracted in the shapelet transform portion of the classifier is randomly selected. We find no significant difference in accuracy between these methods, but the new multivariate STC is significantly faster, reducing the average train time from about 270 hours to under 2.5 hours.
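A minimal sketch of the random dimension selection described above is given below. It only illustrates the one stated change, picking a dimension at random for each candidate shapelet; how STC chooses shapelet lengths, start positions and quality measures is not described here, so the uniform sampling and every name in the code (`sample_shapelet`, its parameters, the toy series) are assumptions for illustration.

```python
import random


def sample_shapelet(series, min_length, max_length, rng):
    """Draw one candidate shapelet from a multivariate series.

    `series` is a list of dimensions, each a list of floats of equal length.
    Returns (dimension index, start position, shapelet values).
    """
    dim = rng.randrange(len(series))  # randomly selected dimension
    series_length = len(series[dim])
    # Assumed for illustration: length and start are drawn uniformly.
    length = rng.randint(min_length, min(max_length, series_length))
    start = rng.randrange(series_length - length + 1)
    return dim, start, series[dim][start:start + length]


rng = random.Random(0)
# Toy series with 3 dimensions of length 20.
series = [[float(t + 100 * d) for t in range(20)] for d in range(3)]
candidates = [sample_shapelet(series, 3, 8, rng) for _ in range(5)]
for dim, start, values in candidates:
    print(f"dimension {dim}, start {start}, length {len(values)}")
```

Because a single pool of shapelets is drawn across all dimensions, one transform and one classifier are built, rather than one contracted classifier per dimension.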
| | HC2-NewSTC | HC2 | NewSTC | STC |
---|---|---|---|---|
Accuracy | 0.7445 | 0.7448 | 0.7123 | 0.7032 |
Average Train Time (Hours) | 21.6148 | 439.1926 | 2.3552 | 269.988 |
Total Train Time (Days) | 23.4161 | 475.792 | 2.3957 | 292.487 |