Elf: Erasing-based Lossless Floating-Point Compression
A Compact and Efficient Erasing-based Lossless Floating-Point Compression Algorithm
Update time: 2023-03-08 21:30:00+08:00

Brief Introduction

Elf is an erasing-based lossless compression algorithm for floating-point data with a high compression ratio. By erasing the last few bits of each value, Elf greatly increases the number of trailing zeros in the XORed results, which improves the compression ratio with a theoretical guarantee. Elf also uses a carefully designed encoding strategy for XORed results with many trailing zeros.

Features

Elf can greatly increase the number of trailing zeros in XORed results, which enhances the compression ratio with a theoretical guarantee.
The Elf algorithm takes only O(1) in both time complexity and space complexity.
Elf adopts a unique encoding method for XORed results with many trailing zeros.
The erasing operation in this project can be used as a preprocessing step for any XOR-based compression algorithm.
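
The effect of erasing on the XORed result can be illustrated with a small Java sketch. It is not the Elf implementation: the class name ErasingSketch, the helper names, and the fixed erase width of 40 bits are assumptions made for illustration only. The sketch does not show how Elf decides how many bits to erase or how the erased bits are restored during decompression; the paper and the source code cover that.

```java
// Illustrative sketch only; not the code from the Elf repository.
class ErasingSketch {

    /** Clears the lowest k bits (0 <= k <= 52) of the IEEE 754 bit pattern of v. */
    public static long eraseLowBits(double v, int k) {
        long bits = Double.doubleToRawLongBits(v);
        long mask = -1L << k; // ones everywhere except the lowest k bits
        return bits & mask;
    }

    /** Trailing zeros of the XOR of two values after erasing k low bits of each. */
    public static int xorTrailingZeros(double prev, double cur, int k) {
        long xored = eraseLowBits(prev, k) ^ eraseLowBits(cur, k);
        return Long.numberOfTrailingZeros(xored); // 64 if the XOR is zero
    }

    public static void main(String[] args) {
        double prev = 3.17, cur = 3.18; // hypothetical consecutive values
        System.out.println("trailing zeros without erasing: "
                + xorTrailingZeros(prev, cur, 0));
        System.out.println("trailing zeros after erasing 40 bits: "
                + xorTrailingZeros(prev, cur, 40));
    }
}
```

Because both operands have their lowest k bits cleared, the XORed result always has at least k trailing zeros, which is the property an XOR-based encoder can exploit.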

Github Source Code

https://github.com/Spatio-Temporal-Lab/elf

Paper download

Elf: Erasing-based Lossless Floating-Point Compression

Prerequisites

1. Install JDK 1.8 on your computer: https://www.oracle.com/fr/java/technologies/javase/javase8-archive-downloads.html
2. Install IntelliJ IDEA on your computer: https://www.jetbrains.com/idea/
3. Install Git on your computer: https://git-scm.com/download

Demonstration Tutorials

Here is the screen recording of our demonstration.
Now, let's try Elf by following the steps below one by one.

1. Clone code

Open IntelliJ IDEA, find the Git menu, select Clone..., and enter https://github.com/Spatio-Temporal-Lab/elf.git in the URL field.

2. Set JDK

File -> Project Structure -> Project -> Project SDK -> add SDK
Click JDK and select the location where you want to download JDK 8.
The screenshot is shown as follows:
Wait for the Maven project to finish building.

3. Test

Select the org/urbcomp/startdb/compress/elf package in the test folder, which includes tests for 64-bit double data and 32-bit float data.
Float-type experiments and Beta experiments can also be run here; just run the corresponding test class.

4. Result

The test results are saved in the result folder under resource.
For each dataset in the experiment, the results include the compression ratio, the compression time, and the decompression time.
To support further analysis, the results also include the average, median, maximum, and minimum of the compression time and the decompression time.
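
For readers who want to reproduce these summary statistics on their own timing measurements, the following Java sketch computes the average, median, minimum, and maximum of a list of samples. The class name TimingSummary and the sample values are made up for illustration; the project's test classes may compute and format these figures differently.

```java
import java.util.Arrays;

// Illustrative sketch only; not the code from the Elf repository.
class TimingSummary {
    public final double average;
    public final double median;
    public final double min;
    public final double max;

    public TimingSummary(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        double sum = 0;
        for (double s : sorted) {
            sum += s;
        }
        average = sum / n;
        // For an even count, the median is the mean of the two middle samples.
        median = (n % 2 == 1) ? sorted[n / 2]
                              : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        min = sorted[0];
        max = sorted[n - 1];
    }

    public static void main(String[] args) {
        double[] compressionMillis = {4.0, 1.0, 3.0, 2.0}; // made-up sample times
        TimingSummary s = new TimingSummary(compressionMillis);
        System.out.println("avg=" + s.average + " median=" + s.median
                + " min=" + s.min + " max=" + s.max);
    }
}
```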

5. Add new dataset

To run the experiments on your own dataset, put the data file in the resource folder and then add the dataset's file name and path to the test class.
As an example, we add a new dataset named demo.csv and run the double-type compression experiments on it.
The screenshot is shown as follows:
First, put the dataset in the resource folder.
Second, add the path of the dataset to the test class, which here is TestCompressor.
The results for the new dataset then appear in the experimental results, as shown in the screenshot below.
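
Before adding a new dataset, it can help to check that the file parses cleanly as floating-point values. The following Java sketch does this under the assumption that the file holds one value per line; the class name DatasetLoaderSketch and the method readValues are hypothetical and are not part of the Elf repository, whose test classes may read files differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; assumes one floating-point value per line.
class DatasetLoaderSketch {

    /** Parses every non-blank line of the source as a double. */
    public static List<Double> readValues(Reader source) {
        List<Double> values = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // Throws NumberFormatException if a line is not a valid number.
                values.add(Double.parseDouble(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a file such as demo.csv.
        List<Double> values = readValues(new StringReader("3.17\n3.18\n\n3.19\n"));
        System.out.println(values.size() + " values read");
    }
}
```

To check a real file, pass a java.io.FileReader for its path instead of the StringReader.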