A Compact and Efficient Erasing-based Lossless Floating-Point Compression Algorithm
Brief Introduction
Elf is an erasure-based floating-point data compression algorithm with a high compression ratio. Elf can
greatly increase the number of trailing zeros in XORed results by erasing the last few bits, which enhances
the compression ratio with a theoretical guarantee. Elf also uses its own elaborate encoding strategy for
XORed results with many trailing zeros.
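The effect described above can be illustrated with a minimal Java sketch. It is not the Elf implementation: the class `ErasureSketch`, its method names, and the fixed erase count `k = 44` are assumptions made purely for illustration, whereas the real algorithm chooses how many bits to erase so that compression stays lossless. The sketch zeroes the low `k` bits of two doubles and compares the trailing zeros of their XOR before and after.

```java
// Illustrative sketch only; NOT the Elf implementation from the repository.
final class ErasureSketch {

    // Zero the lowest k bits of the IEEE 754 bit pattern of v (0 <= k <= 52).
    static double eraseLowBits(double v, int k) {
        long bits = Double.doubleToRawLongBits(v);
        long mask = -1L << k; // ones everywhere except the lowest k bits
        return Double.longBitsToDouble(bits & mask);
    }

    // Trailing zeros of the XOR of two doubles' bit patterns (64 if identical).
    static int xorTrailingZeros(double a, double b) {
        long xor = Double.doubleToRawLongBits(a) ^ Double.doubleToRawLongBits(b);
        return Long.numberOfTrailingZeros(xor);
    }

    public static void main(String[] args) {
        double prev = 3.25;
        double cur = 3.17;
        int k = 44; // hypothetical fixed choice, for illustration only

        int before = xorTrailingZeros(prev, cur);
        int after = xorTrailingZeros(eraseLowBits(prev, k), eraseLowBits(cur, k));

        System.out.println("trailing zeros of XOR before erasing: " + before);
        System.out.println("trailing zeros of XOR after erasing:  " + after);
        System.out.println("erased value of cur: " + eraseLowBits(cur, k));
    }
}
```

Masking the low bits of a positive double can only lower it, so the erased value never exceeds the original. For values in [2, 4) and `k = 44`, the difference stays below 2^-7.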
Features
Elf can greatly increase the number of trailing zeros in XORed results, which enhances the compression ratio
with a theoretical guarantee.
The Elf algorithm takes only O(1) time and O(1) space.
Elf adopts a unique coding method for XORed results with many trailing zeros.
The erase operation in this project can be used as a preprocessing step for any XOR-based compression algorithm.
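To make the preprocessing idea concrete, here is a hedged sketch of why erasure composes with a generic XOR-based compressor. Masking commutes with XOR, i.e. `(a & m) ^ (b & m) == (a ^ b) & m`, so erasing low bits can only shrink the span of "center bits" between the leading and trailing zeros of each XOR. The class `XorPreprocessSketch`, the cost function `centerBits`, and the fixed `k` are hypothetical names and choices for illustration; they are not taken from this project's code.

```java
// Illustrative sketch only; names and the fixed k are assumptions.
final class XorPreprocessSketch {

    // Zero the lowest k bits of a 64-bit pattern (0 <= k <= 52).
    static long erase(long bits, int k) {
        return bits & (-1L << k);
    }

    // Bits between the leading and trailing zeros of an XOR result; 0 if it is 0.
    static int centerBits(long xor) {
        if (xor == 0) {
            return 0;
        }
        return 64 - Long.numberOfLeadingZeros(xor) - Long.numberOfTrailingZeros(xor);
    }

    // Sum of center bits over consecutive XORs, after erasing k low bits per value.
    static int totalCenterBits(double[] series, int k) {
        if (series.length == 0) {
            return 0;
        }
        int total = 0;
        long prev = erase(Double.doubleToRawLongBits(series[0]), k);
        for (int i = 1; i < series.length; i++) {
            long cur = erase(Double.doubleToRawLongBits(series[i]), k);
            total += centerBits(prev ^ cur);
            prev = cur;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] series = {3.17, 3.25, 3.31, 3.28}; // made-up sample values
        System.out.println("center bits without erasing: " + totalCenterBits(series, 0));
        System.out.println("center bits with k = 44:     " + totalCenterBits(series, 44));
    }
}
```

The XOR stage itself is untouched here, which is the point: only the values fed into it change.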
GitHub Source Code
Paper download
Prerequisites
Demonstration Tutorials
Here is the screen recording of our demonstration.
Now, let's experience Elf by executing the following statements one by one.
1. Clone code
2. Set JDK
File -> Project Structure -> Project -> Project SDK -> add SDK
Click JDK, then select the directory where you want to download jdk-8.
The screenshot is shown as follows:
Wait for the Maven project to build.
3. Test
Select the org/urbcomp/startdb/compress/elf package in the test folder, which includes tests for 64-bit
double data and 32-bit float data.
To run the float experiments or the Beta experiments, just run the corresponding test class.
4. Result
The test results are saved in the result folder in resource.
The experimental results include the compression ratio, compression time, and decompression time of each dataset
in this experiment.
To aid analysis, they also include the average, median, maximum, and minimum values of
compression time and decompression time.
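For readers who want to post-process the timing results themselves, the summary statistics mentioned above can be computed as in the following minimal sketch. The class `TimingSummary` and the sample values are hypothetical; this is not the project's reporting code, and the layout of the result files is not assumed here.

```java
import java.util.Arrays;

// Illustrative sketch only; not the project's reporting code.
final class TimingSummary {
    final double mean;
    final double median;
    final double min;
    final double max;

    private TimingSummary(double mean, double median, double min, double max) {
        this.mean = mean;
        this.median = median;
        this.min = min;
        this.max = max;
    }

    // Compute mean, median, min, and max of a non-empty array of timings.
    static TimingSummary of(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        double sum = 0.0;
        for (double x : sorted) {
            sum += x;
        }
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        return new TimingSummary(sum / n, median, sorted[0], sorted[n - 1]);
    }

    public static void main(String[] args) {
        double[] compressMillis = {4.0, 1.0, 3.0, 2.0}; // made-up timings
        TimingSummary t = TimingSummary.of(compressMillis);
        System.out.println("mean=" + t.mean + " median=" + t.median
                + " min=" + t.min + " max=" + t.max);
    }
}
```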
5. Add new dataset
When you want to use your own dataset for experiments, put the data in the resource folder, and then
add the dataset's path in the test class.
We give an example where the new dataset is named demo.csv. We perform double-type compression experiments on the new dataset.
The screenshots are shown as follows:
First, put this dataset in the resource folder.
Second, add the path of this dataset in the test class, which here is TestCompressor.
The results for the new dataset then appear in the experimental results, as shown in the screenshot below.