Publications
2026
- Flexible grouping of linear segments for highly accurate lossy compression of time series data. Xenophon Kitsios, Panagiotis Liakos, Katia Papakonstantinopoulou, Nikos Giatrakos, and Yannis Kotidis. Proc. ACM Manag. Data, 2026
2024
- Flexible grouping of linear segments for highly accurate lossy compression of time series data. The VLDB Journal, Sep 2024
Approximating a series of timestamped data points through a sequence of line segments with a maximum error guarantee is a fundamental data compression problem, termed as Piecewise Linear Approximation (PLA). As the demand for analyzing large volumes of time-series data across various domains continues to grow, the significance of this problem has recently received considerable attention. Recent PLA algorithms have emerged to help us handle the overwhelming amount of information, albeit at the expense of some precision loss. More precisely, these algorithms involve a delicate balance between the maximum acceptable precision loss and the space savings that can be achieved. In our recent work, we proposed Sim-Piece, offering a fresh perspective on the long-standing challenge of PLA approximation. Sim-Piece identifies similarities among line segments in a PLA representation, enabling their grouping and joint representation. This way, Sim-Piece delivers space-saving advantages that outperform even the optimal PLA approximation. In this work, we present Mix-Piece, an improved PLA compression algorithm that builds upon the core idea of Sim-Piece (i.e., exploiting similar PLA segments) but further improves its performance by (1) considering multiple candidate PLA segments when ingesting a time series, (2) enabling grouping of additional segments not utilized by Sim-Piece, and (3) making use of a versatile output format that exploits all segment similarities. Our experimental evaluation demonstrates that Mix-Piece outperforms Sim-Piece and previous competing techniques, attaining compression ratios with more than twofold improvement on average over what PLA algorithms can offer. This allows for providing significantly higher accuracy with equivalent space requirements.
@article{mix-piece,
  author = {Kitsios, Xenophon and Liakos, Panagiotis and Papakonstantinopoulou, Katia and Kotidis, Yannis},
  title = {Flexible grouping of linear segments for highly accurate lossy compression of time series data},
  journal = {The VLDB Journal},
  year = {2024},
  month = sep,
  day = {01},
  volume = {33},
  number = {5},
  pages = {1569--1589},
  issn = {0949-877X},
  doi = {10.1007/s00778-024-00862-z},
  url = {https://doi.org/10.1007/s00778-024-00862-z},
}
2023
- Sim-Piece: Highly Accurate Piecewise Linear Approximation through Similar Segment Merging. Proc. VLDB Endow., Apr 2023
Approximating series of timestamped data points using a sequence of line segments with a maximum error guarantee is a fundamental data compression problem, termed as piecewise linear approximation (PLA). Due to the increasing need to analyze massive collections of time-series data in diverse domains, the problem has recently received significant attention, and recent PLA algorithms that have emerged do help us handle the overwhelming amount of information, at the cost of some precision loss. More specifically, these algorithms entail a trade-off between the maximum precision loss and the space savings achieved. However, advances in the area of lossless compression are undercutting the offerings of PLA techniques in real datasets. In this work, we propose Sim-Piece, a novel lossy compression algorithm for time-series data that optimizes the space requirements of representing PLA line segments, by finding the minimum number of groups we can organize these segments into, to represent them jointly. Our experimental evaluation demonstrates that our approach readily outperforms competing techniques, attaining compression ratios with more than twofold improvement on average over what PLA algorithms can offer. This allows for providing significantly higher accuracy with equivalent space requirements. Moreover, our algorithm, due to the simplicity of its merging phase, imposes little overhead while compacting the PLA description, offering a significantly improved trade-off between space and running time. The aforementioned benefits of our approach significantly improve the efficiency with which we can store time-series data, while allowing a tight maximum error in the representation of their values.
@article{sim-piece,
  author = {Kitsios, Xenophon and Liakos, Panagiotis and Papakonstantinopoulou, Katia and Kotidis, Yannis},
  title = {Sim-Piece: Highly Accurate Piecewise Linear Approximation through Similar Segment Merging},
  year = {2023},
  issue_date = {April 2023},
  publisher = {VLDB Endowment},
  volume = {16},
  number = {8},
  issn = {2150-8097},
  url = {https://doi.org/10.14778/3594512.3594521},
  doi = {10.14778/3594512.3594521},
  journal = {Proc. VLDB Endow.},
  month = apr,
  pages = {1910--1922},
  numpages = {13},
}
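As a rough illustration of the PLA problem that the abstracts above describe (line segments with a maximum error guarantee), the following is a minimal Python sketch of a generic greedy approach. It is not the Sim-Piece or Mix-Piece algorithm and performs no segment grouping. The function names `greedy_pla` and `reconstruct` are hypothetical, and the sketch assumes evenly spaced timestamps (the list index serves as the timestamp) and that each segment starts exactly at a data point.

```python
from typing import List, Tuple

# (start index, value at start, slope); a hypothetical layout for this sketch only
Segment = Tuple[int, float, float]


def greedy_pla(values: List[float], eps: float) -> List[Segment]:
    """Greedy PLA sketch: extend each segment while at least one slope
    keeps every covered point within eps of the line."""
    segments: List[Segment] = []
    n = len(values)
    i = 0
    while i < n:
        start, origin = i, values[i]
        lo, hi = float("-inf"), float("inf")  # feasible slope interval
        j = i + 1
        while j < n:
            dt = j - start
            # Slopes that keep point j within [value - eps, value + eps]
            new_lo = max(lo, (values[j] - eps - origin) / dt)
            new_hi = min(hi, (values[j] + eps - origin) / dt)
            if new_lo > new_hi:
                break  # no single slope fits point j as well; close the segment
            lo, hi = new_lo, new_hi
            j += 1
        # A segment covering a single point has no slope constraint; use 0.0
        slope = 0.0 if j == start + 1 else (lo + hi) / 2.0
        segments.append((start, origin, slope))
        i = j
    return segments


def reconstruct(segments: List[Segment], n: int) -> List[float]:
    """Rebuild an approximation of the n original values from the segments."""
    out: List[float] = []
    for k, (start, origin, slope) in enumerate(segments):
        end = segments[k + 1][0] if k + 1 < len(segments) else n
        out.extend(origin + slope * (t - start) for t in range(start, end))
    return out


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 3.0, 10.0, 9.0, 8.0, 7.0]
    segs = greedy_pla(data, eps=0.5)
    approx = reconstruct(segs, len(data))
    print(len(segs), max(abs(a - v) for a, v in zip(approx, data)))
```

Each segment in this sketch carries an interval of feasible slopes before one is picked. The abstracts state that Sim-Piece and Mix-Piece exploit similarities among such segments to represent them jointly; how they do so is specified in the papers and is not reproduced here.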