compenginets.Rmd
The goal of compenginets is to provide all the time series from http://www.comp-engine.org/. compenginets currently isn’t on CRAN. You can install the development version from GitHub.
CompEngine is an online time-series database that allows users to upload data and interactively compare it with similar time series. The website was built by Nick Jones and Ben Fulcher, based on earlier work by Ben D. Fulcher, Max A. Little, and Nick S. Jones (2013). To find time series similar to the data a user uploads, it computes features of the data and finds existing time series that match those features. The list of features, with detailed descriptions, can be found on the CompEngine website.
This package aims to make CompEngine data easy to access from R. The function get_cets returns either the time series with a specified name or the time series within a given category. By default, get_cets returns the first 10 pages of time series (at most 10 series per page) in the category that matches the argument key.
library(compenginets)
# Get series within Finance category (including subcategory)
cets_finance <- get_cets("finance")
length(cets_finance)
#> [1] 100
str(cets_finance[[1]])
#> Time-Series [1:4197] from 1 to 4197: 6890 6783 6553 6453 6508 ...
#> - attr(*, "name")= chr "M4_D3333_Finance_1"
#> - attr(*, "description")= chr ""
#> - attr(*, "samplingInformation.name")= chr "M4_D3333_Finance_1"
#> - attr(*, "samplingInformation.description")= chr ""
#> - attr(*, "samplingInformation.samplingInformation")='data.frame': 1 obs. of 2 variables:
#> ..$ samplingRate: chr "1.00"
#> ..$ samplingUnit: chr "/day"
#> - attr(*, "tags")= chr [1:3] "finance" "M4" "Daily"
#> - attr(*, "category.name")= chr "Finance"
#> - attr(*, "category.uri")= chr "real/finance/"
#> - attr(*, "sfi.name")= chr [1:16] "CO_Embed2_Basic_tau.incircle_1" "CO_Embed2_Basic_tau.incircle_2" "FC_LocalSimple_mean1.taures" "DN_HistogramMode_10" ...
#> - attr(*, "sfi.prettyName")= chr [1:16] "Autocorrelation measure" "Autocorrelation measure" "Predictability measure" "Distribution measure" ...
#> - attr(*, "sfi.value")= num [1:16] 58.8 56.2 69.1 50.7 27.2 ...
#> - attr(*, "source")= chr NA
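The metadata shown above is stored as attributes on each series, so it can be collected with base R. The following is a minimal sketch that runs offline on a mock series. The helper name feature_table is hypothetical and the numbers are illustrative, not real CompEngine data.

```r
# Mock series standing in for one element of get_cets("finance").
# Values and feature numbers are illustrative only.
x <- ts(c(6890, 6783, 6553, 6453, 6508))
attr(x, "name") <- "M4_D3333_Finance_1"
attr(x, "sfi.name") <- c(
  "CO_Embed2_Basic_tau.incircle_1", "CO_Embed2_Basic_tau.incircle_2",
  "FC_LocalSimple_mean1.taures", "DN_HistogramMode_10"
)
attr(x, "sfi.prettyName") <- c(
  "Autocorrelation measure", "Autocorrelation measure",
  "Predictability measure", "Distribution measure"
)
attr(x, "sfi.value") <- c(58.8, 56.2, 69.1, 50.7)

# Hypothetical helper: gather the feature attributes into one data frame
feature_table <- function(series) {
  data.frame(
    feature = attr(series, "sfi.name"),
    description = attr(series, "sfi.prettyName"),
    value = attr(series, "sfi.value"),
    stringsAsFactors = FALSE
  )
}

features <- feature_table(x)
# Keep only the rows describing autocorrelation measures
features[features$description == "Autocorrelation measure", ]
```

The same helper should work on a series returned by get_cets, provided it carries the three sfi.* attributes seen in the output above.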
# Supply the number of pages needed with the maxpage argument
# Each page holds at most 10 time series
cets_finance_20 <- get_cets("finance", maxpage = 2)
length(cets_finance_20)
#> [1] 20
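Since each page holds at most 10 series, the value of maxpage needed for a given number of series is a simple ceiling division. The sketch below assumes that page size. pages_needed is a hypothetical helper, not part of compenginets.

```r
# Hypothetical helper: pages required to cover at least n series,
# assuming each page holds at most per_page series (10, as stated above).
pages_needed <- function(n, per_page = 10) {
  stopifnot(n >= 1, per_page >= 1)
  as.integer(ceiling(n / per_page))
}

pages_needed(25)   # 25 series need 3 pages
pages_needed(100)  # 100 series need 10 pages, the default

# With the package installed and network access, the result could be passed on:
# cets_finance_25 <- head(get_cets("finance", maxpage = pages_needed(25)), 25)
```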
# Set category = FALSE to get the time series matching a name
W138_finance_m4 <- get_cets("M4_W138_Finance_1", category = FALSE)
str(W138_finance_m4)
#> Time-Series [1:1044] from 1 to 1044: 2062 2086 2026 2076 2077 ...
#> - attr(*, "name")= chr "M4_W138_Finance_1"
#> - attr(*, "description")= chr ""
#> - attr(*, "samplingInformation.samplingRate")= chr "1.00"
#> - attr(*, "samplingInformation.samplingUnit")= chr "/week"
#> - attr(*, "tags")= chr [1:3] "finance" "M4" "weekly"
#> - attr(*, "category.name")= chr "Finance"
#> - attr(*, "category.uri")= chr "real/finance/"
#> - attr(*, "sfi.name")= chr [1:16] "CO_Embed2_Basic_tau.incircle_1" "CO_Embed2_Basic_tau.incircle_2" "FC_LocalSimple_mean1.taures" "DN_HistogramMode_10" ...
#> - attr(*, "sfi.prettyName")= chr [1:16] "Autocorrelation measure" "Autocorrelation measure" "Predictability measure" "Distribution measure" ...
#> - attr(*, "sfi.value")= num [1:16] 58.812 0.207 39.916 83.933 18.947 ...
#> - attr(*, "source")= logi NA
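Because every series carries a tags attribute, a list of series can be narrowed down after download. The following sketch uses mock data so that it runs offline. mock_series and has_tag are hypothetical helpers. The comparison ignores case because the tags in the output above are capitalised inconsistently ("Daily" versus "weekly").

```r
# Hypothetical constructor for a mock series with "name" and "tags" attributes
mock_series <- function(values, name, tags) {
  x <- ts(values)
  attr(x, "name") <- name
  attr(x, "tags") <- tags
  x
}

# Mock list shaped like the result of get_cets(); values are illustrative
cets_mock <- list(
  mock_series(c(6890, 6783, 6553), "M4_D3333_Finance_1", c("finance", "M4", "Daily")),
  mock_series(c(2062, 2086, 2026), "M4_W138_Finance_1", c("finance", "M4", "weekly"))
)

# Hypothetical helper: TRUE if the series has the given tag, ignoring case
has_tag <- function(series, tag) {
  tolower(tag) %in% tolower(attr(series, "tags"))
}

weekly <- Filter(function(s) has_tag(s, "weekly"), cets_mock)
vapply(weekly, function(s) attr(s, "name"), character(1))
```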
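The list of categories can be obtained with category_scraping, which returns a named list. In the output below, each element appears to hold a category name followed by its subcategories.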
cate_path <- category_scraping()
str(cate_path, list.len = 10)
#> List of 194
#> $ real : chr [1:54] "real" "audio" "ecology" "economics" ...
#> $ synthetic : chr [1:139] "synthetic" "flow" "iterative map" "periodic" ...
#> $ unassigned : chr "unassigned"
#> $ audio : chr [1:7] "audio" "animal sounds" "human speech" "music" ...
#> $ ecology : chr [1:2] "ecology" "zooplankton growth"
#> $ economics : chr "economics"
#> $ finance : chr [1:8] "finance" "crude oil prices" "exchange rate" "gas prices" ...
#> $ industry : chr "industry"
#> $ medical : chr [1:10] "medical" "boc" "chest volume" "ecg" ...
#> $ meteorology : chr [1:13] "meteorology" "air pressure" "air temperature" "precipitation amount" ...
#> [list output truncated]
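Given a list of this shape, a lookup can show which entries contain a given label, for example to choose a key for get_cets. This is a minimal sketch on a mock excerpt of the output above. find_category is a hypothetical helper, and the interpretation of each vector as a category plus its subcategories is an assumption based on that output.

```r
# Mock excerpt with the same shape as the category_scraping() output above
cate_mock <- list(
  ecology = c("ecology", "zooplankton growth"),
  economics = "economics",
  finance = c("finance", "crude oil prices", "exchange rate", "gas prices")
)

# Hypothetical helper: names of the entries whose vector contains `label`
find_category <- function(categories, label) {
  hits <- vapply(categories, function(members) label %in% members, logical(1))
  names(categories)[hits]
}

find_category(cate_mock, "exchange rate")
find_category(cate_mock, "economics")
```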