bastie/be42
Compression with permutations
idea
Nearly(? or) all compression algoritm see a file as byte-stream with explicite information byte-offset and byte-value. Most of them replacing repeating sequences with references, and yes this works great.
In different this compression algorithm look different and use another view on file, the file structure as compression base. All files can be descripe as repeated part time permutation of values. This permutation ends with first repeated value. Also you can see this value as part who concat two permutations, because one end with this value and another start with this value. At the end it can be a rest.
For example (based on Nibbles instead of Bytes):
01 0A 07 04 0A 03 03 05 0A 07 04 0A 0F
| | | |
Start | | |
Repeated | |
also Start | |
Repeated Repeated
also Start also RestProperties of Nibble-permutation are btw. max size is 16 (0123456789ABCDEF) but in random data the birthday hint tells us most of that are maybe 5 to 6 elements long. Also the probality of repeated value increases with next value, beginning with 1/15, 1/14, 1/13 ... 1/1. Its like a markov chain.
Note: The algorithm is based on human intelligence, but part of the code was first written by artificial intelligence (vibe coding) and than modified by human because the result is the priority.
compare
enwik8
| compressor | version | parameter | size in bytes | % | time | bytes per second | comment | | ---------- | ------- | --------- | ------------- | ----- | -------- | ---------------- | ------------------ | | ben | 0.42 | | 32.280.526 | 32.28 | 2:58.30 | 181.046 | non optimized code | | ben+xz | | xz -9ekf | 31.300.064 | 31.30 | 3:06.06 | 168.225 | | | bzip2 | 1.0.8 | -9zkf | 29.008.758 | 29.01 | 0:04.75 | 6.107.106 | | | gzip | 479 | -9kf | 36.475.811 | 36.48 | 0:03.52 | 10.362.446 | fast | | xz | 5.8.2 | -9ekf | 24.831.656 | 24.83 | 0:56.00 | 443.422 | | | zopfli | 1.0.3 | --i100 | 34.955.165 | 34.96 | 10:10.79 | 57.229 | | | zpaq | 7.15 | -m5 | 19.625.015 | 19.63 | 4:32.29 | 71.907 | best | | zstd | 1.5.7 | -k19f | 26.944.227 | 26.94 | 0:41.83 | 664.136 | |
Dependencies
No dependency for be42 library.
CLI tool ben needed swift-argument-parser and be42.
License
Apache License Version 2.0
Version
0.42.0
First public version, perhaps the 42nd attempt at a functioning implementation. Better than gzip.
Package Metadata
Repository: bastie/be42
Stars: 1
Forks: 0
Open issues: 0
Default branch: main
Primary language: swift
License: Apache-2.0
Topics: bastie, compression, compression-algorithm, nibble, nibbles, permutation, transformer
README: README.md