
The goal of compenginets is to provide all the time series from https://www.comp-engine.org/

Installation

compenginets is not currently on CRAN. You can install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("robjhyndman/compenginets")

CompEngine: A self-organizing database of time-series data

CompEngine is an online time-series database that allows users to upload data and interactively compare it with similar time series. The website was built by Nick Jones and Ben Fulcher, based on earlier work by Ben D. Fulcher, Max A. Little, and Nick S. Jones (2013). To find time series similar to the data a user uploads, it computes features of the data and finds existing time series that match those features. The list of features and their detailed descriptions can be found on the CompEngine website.
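The feature-matching idea described above can be illustrated with a toy sketch in base R. This is not CompEngine's implementation: the three features (mean, standard deviation, lag-1 autocorrelation), the helper `toy_features`, the simulated "database", and the use of Euclidean distance on standardised features are all assumptions made for illustration.

```r
# Toy feature extractor. These three summaries are illustrative only and
# are NOT the feature set used by CompEngine.
toy_features <- function(x) {
  x <- as.numeric(x)
  c(
    mean = mean(x),
    sd   = sd(x),
    acf1 = acf(x, lag.max = 1, plot = FALSE)$acf[2]
  )
}

set.seed(1)

# A small hypothetical "database" of simulated series
database <- list(
  white_noise = rnorm(200),
  random_walk = cumsum(rnorm(200)),
  sine_wave   = sin(seq(0, 20 * pi, length.out = 200))
)

# A newly "uploaded" series to compare against the database
query <- cumsum(rnorm(200))

# One row of features per database series
feat_db <- t(sapply(database, toy_features))
feat_q  <- toy_features(query)

# Standardise each feature with the database's centre and spread so that
# no single feature dominates the distance
centre <- colMeans(feat_db)
spread <- apply(feat_db, 2, sd)
z_db <- sweep(sweep(feat_db, 2, centre), 2, spread, "/")
z_q  <- (feat_q - centre) / spread

# Euclidean distance from the query to every database series,
# smallest (most similar under these toy features) first
dists <- sqrt(rowSums(sweep(z_db, 2, z_q)^2))
sort(dists)
```

With so few series and features, the ranking is only a demonstration of the mechanics; a real feature-based search would rely on a much richer feature set.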

Usage

This package provides an easy way to access CompEngine data from R. The function get_cets returns time series with a specified name or within a certain category. By default, get_cets returns the first 10 pages of time series (at most 10 series per page, so up to 100 series) in the category that matches the argument key.

library(compenginets)

# Get series within Finance category (including subcategory)
cets_finance <- get_cets("finance")
length(cets_finance)
#> [1] 100
str(cets_finance[[1]])
#>  Time-Series [1:4197] from 1 to 4197: 6890 6783 6553 6453 6508 ...
#>  - attr(*, "name")= chr "M4_D3333_Finance_1"
#>  - attr(*, "description")= chr ""
#>  - attr(*, "samplingInformation.name")= chr "M4_D3333_Finance_1"
#>  - attr(*, "samplingInformation.description")= chr ""
#>  - attr(*, "samplingInformation.samplingInformation")='data.frame':  1 obs. of  2 variables:
#>   ..$ samplingRate: chr "1.00"
#>   ..$ samplingUnit: chr "/day"
#>  - attr(*, "tags")= chr [1:3] "finance" "M4" "Daily"
#>  - attr(*, "category.name")= chr "Finance"
#>  - attr(*, "category.uri")= chr "real/finance/"
#>  - attr(*, "sfi.name")= chr [1:22] "DN_HistogramMode_5" "DN_HistogramMode_10" "CO_Embed2_Dist_tau_d_expfit_meandiff" "CO_f1ecac" ...
#>  - attr(*, "sfi.prettyName")= chr [1:22] "DN_HistogramMode_5" "DN_HistogramMode_10" "CO_Embed2_Dist_tau_d_expfit_meandiff" "CO_f1ecac" ...
#>  - attr(*, "sfi.value")= num [1:22] 36 56.9 72.9 56.8 50 ...
#>  - attr(*, "source")= chr NA

# Set the number of pages to retrieve with the maxpage argument
# Each page holds at most 10 time series
cets_finance_20 <- get_cets("finance", maxpage = 2)
length(cets_finance_20)
#> [1] 20

# Switch category to FALSE to get the time series matching a name
W138_finance_m4 <- get_cets("M4_W138_Finance_1", category = FALSE)
str(W138_finance_m4)
#>  Time-Series [1:1044] from 1 to 1044: 2062 2086 2026 2076 2077 ...
#>  - attr(*, "name")= chr "M4_W138_Finance_1"
#>  - attr(*, "description")= chr ""
#>  - attr(*, "samplingInformation.samplingRate")= chr "1.00"
#>  - attr(*, "samplingInformation.samplingUnit")= chr "/week"
#>  - attr(*, "tags")= chr [1:3] "finance" "M4" "weekly"
#>  - attr(*, "category.name")= chr "Finance"
#>  - attr(*, "category.uri")= chr "real/finance/"
#>  - attr(*, "sfi.name")= chr [1:22] "DN_HistogramMode_5" "DN_HistogramMode_10" "CO_Embed2_Dist_tau_d_expfit_meandiff" "CO_f1ecac" ...
#>  - attr(*, "sfi.prettyName")= chr [1:22] "DN_HistogramMode_5" "DN_HistogramMode_10" "CO_Embed2_Dist_tau_d_expfit_meandiff" "CO_f1ecac" ...
#>  - attr(*, "sfi.value")= num [1:22] 84 86.8 65.2 14.1 23.8 ...
#>  - attr(*, "source")= logi NA
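
The metadata shown in the output above is stored as attributes on each time series, so it can be read with base R's attr(). The sketch below uses a hand-built mock object (`mock_series`, with only a few of the attributes and values taken from the output above) so that it runs without network access; the helper `has_tag` is hypothetical and not part of compenginets.

```r
# Mock object mimicking part of the structure shown above. A real object
# would come from get_cets(); this one is built by hand for illustration.
mock_series <- ts(c(2062, 2086, 2026, 2076, 2077))
attr(mock_series, "name") <- "M4_W138_Finance_1"
attr(mock_series, "tags") <- c("finance", "M4", "weekly")
attr(mock_series, "category.name") <- "Finance"
attr(mock_series, "sfi.name") <- c("DN_HistogramMode_5", "DN_HistogramMode_10")
attr(mock_series, "sfi.value") <- c(84, 86.8)

# Read a single metadata field
attr(mock_series, "name", exact = TRUE)

# Pair feature names with their values as a named numeric vector
features <- setNames(
  attr(mock_series, "sfi.value", exact = TRUE),
  attr(mock_series, "sfi.name", exact = TRUE)
)
features["DN_HistogramMode_5"]

# Hypothetical helper: does a series carry a given tag?
has_tag <- function(x, tag) tag %in% attr(x, "tags", exact = TRUE)

# Keep only the series tagged "weekly" from a list of series
series_list <- list(mock_series)
weekly <- Filter(function(x) has_tag(x, "weekly"), series_list)
length(weekly)
```

The same pattern should apply to a list returned by get_cets, assuming its elements carry the attributes shown in the str() output.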

A list of categories can be obtained from the CompEngine website with category_scraping:

cate_path <- category_scraping()
str(cate_path, list.len = 10)
#> List of 195
#>  $ real                                     : chr [1:55] "real" "audio" "ecology" "economics" ...
#>  $ synthetic                                : chr [1:139] "synthetic" "flow" "iterative map" "periodic" ...
#>  $ unassigned                               : chr "unassigned"
#>  $ audio                                    : chr [1:7] "audio" "animal sounds" "human speech" "music" ...
#>  $ ecology                                  : chr [1:2] "ecology" "zooplankton growth"
#>  $ economics                                : chr "economics"
#>  $ finance                                  : chr [1:8] "finance" "crude oil prices" "exchange rate" "gas prices" ...
#>  $ industry                                 : chr "industry"
#>  $ medical                                  : chr [1:10] "medical" "boc" "chest volume" "ecg" ...
#>  $ meteorology                              : chr [1:14] "meteorology" "air pressure" "air temperature" "carbon dioxide" ...
#>   [list output truncated]
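
The category list above is a named list of character vectors, so it can be explored with ordinary list tools. The sketch below works on a small hand-built stand-in (`cate_mock`, containing only entries visible in the truncated output above) and assumes, based on that output, that each element starts with the category's own name followed by its subcategories.

```r
# Hypothetical stand-in for the list returned by category_scraping(),
# limited to entries visible in the truncated output above
cate_mock <- list(
  real      = c("real", "audio", "ecology", "economics"),
  ecology   = c("ecology", "zooplankton growth"),
  economics = "economics",
  finance   = c("finance", "crude oil prices", "exchange rate", "gas prices")
)

# Subcategories of one category, dropping the category's own name
finance_sub <- setdiff(cate_mock[["finance"]], "finance")
finance_sub

# Categories that list no subcategories
leaf_categories <- names(cate_mock)[lengths(cate_mock) == 1]
leaf_categories

# Which entries contain a given category name?
containing <- names(Filter(function(x) "exchange rate" %in% x, cate_mock))
containing
```

A name found this way could then be passed to get_cets as the category key, as in the get_cets("finance") call earlier.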

License

This package is free and open-source software, licensed under CC0.