The document discusses time series representations, which can be used to reduce dimensionality, remove noise, and emphasize patterns in time series data. It introduces the TSrepr R package, which implements various time series representation methods like PAA, DWT, DFT, and SAX. It allows creating representation matrices from multiple time series and provides functions for normalization, windowing, and extending the package with custom representations. Time series representations help with tasks like clustering, classification, and forecasting of time series data.
Time series representations for better data mining
1. Time Series Representations for Better Data Mining
What can we do with time series data?
Classification
Clustering
Anomaly (outlier) detection
Forecasting
What are the problems with time series data?
High-dimension
Noise
Concept-drift (trend-shift etc.)
2. Time Series Representations
What can we do for solving these problems?
Use time series representations!
They are excellent to:
Reduce memory load.
Accelerate subsequent machine learning algorithms.
Implicitly remove noise from the data.
Emphasize the essential characteristics of the data.
Help to find patterns in data (or motifs).
5. TSrepr
TSrepr - CRAN [1], GitHub [2]
R package for computing time series representations
A large number of methods are implemented
Several useful support functions are also included
Easy to extend and to use
library(TSrepr)
data <- rnorm(1000)
# PAA: aggregate each non-overlapping window of q = 10 points by its median
repr_paa(data, func = median, q = 10)
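To make the idea behind the call above concrete, here is a minimal base-R sketch of PAA. The helper name `paa_sketch` is hypothetical and this is not the package's own implementation; it assumes the series length is an exact multiple of the window size q.

```r
# Hypothetical helper: a plain-R sketch of Piecewise Aggregate Approximation.
# Assumes length(x) is a multiple of q; this is not TSrepr's own code.
paa_sketch <- function(x, q, func = mean) {
  stopifnot(length(x) %% q == 0)
  # matrix() fills column-wise, so each column holds one window of q points
  windows <- matrix(x, nrow = q)
  # Aggregate every window (column) into a single value
  apply(windows, 2, func)
}

set.seed(42)
data <- rnorm(1000)
repr <- paa_sketch(data, q = 10, func = median)
length(repr)  # 100: one aggregated value per window of 10 points
```

The series shrinks by a factor of q, and swapping `func` (mean, median, max, ...) changes which characteristic of each window is kept.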
[1] https://CRAN.R-project.org/package=TSrepr
[2] https://github.com/PetoLau/TSrepr/
6. All types of time series representation methods are implemented; so far these:
PAA - Piecewise Aggregate Approximation ( repr_paa )
DWT - Discrete Wavelet Transform ( repr_dwt )
DFT - Discrete Fourier Transform ( repr_dft )
DCT - Discrete Cosine Transform ( repr_dct )
PIP - Perceptually Important Points ( repr_pip )
SAX - Symbolic Aggregate Approximation ( repr_sax )
PLA - Piecewise Linear Approximation ( repr_pla )
Mean seasonal profile ( repr_seas_profile )
Model-based seasonal representations based on linear model ( repr_lm )
FeaClip - Feature extraction from clipping representation ( repr_feaclip )
Additional useful functions are also implemented:
Windowing ( repr_windowing )
Matrix of representations ( repr_matrix )
Normalisation functions - z-score ( norm_z ), min-max ( norm_min_max )
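As a rough illustration of the two normalisations named in the list above, the following base-R sketch shows what z-score and min-max scaling do to a series. The helper names are hypothetical, not the package's `norm_z` and `norm_min_max`, and the sketch does not handle constant series (which would divide by zero).

```r
# Hypothetical helpers sketching the two normalisations; not TSrepr's own code.
# Neither guards against a constant series (sd or range equal to zero).
z_score_sketch <- function(x) (x - mean(x)) / sd(x)
min_max_sketch <- function(x) (x - min(x)) / (max(x) - min(x))

x <- c(2, 4, 6, 8, 10)
z  <- z_score_sketch(x)   # zero mean, unit standard deviation
mm <- min_max_sketch(x)   # rescaled to the interval [0, 1]
```

Normalising before computing a representation makes series with different magnitudes comparable by shape, which is usually what clustering needs.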
7. Usage of TSrepr
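# `mat` stands for a numeric matrix with one time series per row (supply your own data)
# Option 1: model-based representation via a robust linear model with seasonal
# periods of 48 and 48*7 points, computed on z-score normalised series
mat_reprs <- repr_matrix(mat, func = repr_lm,
                         args = list(method = "rlm", freq = c(48, 48*7)),
                         normalise = TRUE, func_norm = norm_z)
# Option 2: FeaClip representation computed over windows of 48 points
mat_reprs <- repr_matrix(mat, func = repr_feaclip,
                         windowing = TRUE, win_size = 48)
# Cluster the representations into 20 groups
clustering <- kmeans(mat_reprs, 20)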
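The same workflow (normalise each series, compute a representation, cluster the result) can be tried end to end without real data. The sketch below uses only base R and synthetic series; the helper `seas_profile_sketch` is a hypothetical stand-in for a mean seasonal profile, and the data, group structure, and number of clusters are assumptions made for illustration.

```r
set.seed(1)
n_series <- 30   # number of synthetic time series (assumed)
freq     <- 48   # points per seasonal period, e.g. half-hourly data per day
n_days   <- 7

# Synthetic data: three groups of periodic series that differ only in phase
phase    <- rep(c(0, pi / 2, pi), each = n_series / 3)
time_idx <- seq_len(freq * n_days)
mat <- t(sapply(phase, function(p) {
  sin(2 * pi * time_idx / freq + p) + rnorm(length(time_idx), sd = 0.2)
}))

# Hypothetical helper: average all values that share a position in the period
seas_profile_sketch <- function(x, freq) rowMeans(matrix(x, nrow = freq))

# z-score normalise every row, then reduce it to its mean seasonal profile
mat_reprs <- t(apply(mat, 1, function(x) {
  seas_profile_sketch((x - mean(x)) / sd(x), freq)
}))
dim(mat_reprs)  # 30 rows (series) by 48 columns (profile values)

# Cluster the reduced representations instead of the raw 336-point series
clustering <- kmeans(mat_reprs, centers = 3, nstart = 10)
table(clustering$cluster)
```

Clustering runs on 48 values per series rather than 336, and averaging over the seven periods also suppresses much of the added noise.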
11. Conclusions
Time Series Representations:
They are our friends in clustering, forecasting, classification, etc.
Implemented in TSrepr
Questions: Peter Laurinec tsreprpackage@gmail.com
Code: https://github.com/PetoLau/TSrepr/
More research: https://petolau.github.io/research
Blog: https://petolau.github.io
And of course: install.packages("TSrepr")