jjrscott/lzw

This project's purpose was to educate me (jjrscott) on how the LZW works. While build this project I realised that the LZW algorithm as originally states has two distinct parts:

Rosetta Code

From Rosetta Code:

The Lempel-Ziv-Welch (LZW) algorithm provides loss-less data compression.
You can read a complete description of it in the Wikipedia article on the subject. It was patented, but it entered the public domain in 2004.

Classic example

For some reason all demos use TOBEORNOTTOBEORTOBEORNOT as the input string. The following example takes a string and produces an integer array, just as the original demos do.

import LZW

let original = "TOBEORNOTTOBEORTOBEORNOT"

let compressed = LZW.compress(original.utf8.map({$0}))

print("compressed: \(compressed)")

let encoded = compressed.map { unit in
    switch unit {
    case .element(let value):
        return UInt(value)
    case .index(let index):
        return UInt(UInt8.max) + UInt(index)
    }
}

print("encoded: \(encoded)")

Output:

compressed: [.element(84), .element(79), .element(66), .element(69), .element(79),
             .element(82), .element(78), .element(79), .element(84), .index(0),
             .index(2), .index(4), .index(9), .index(3), .index(5), .index(7)]
encoded: [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]

Algorithm

A Technique for High-Performance Data Compression:

Initialize table to contain single-character strings
Read first input character K:
        K → prefix string ω;
Step: Read next input character K
         If no such K (input exhausted):
                    code(ω) → output;
                    EXIT.
         If ωK exists in string table:
                    ωK → ω;
                    repeat Step.
         else ωK not in string table:
                    code(ω) → output;
                    ωK → string table;
                    K → ω;
                    repeat Step.

Package Metadata

Repository: jjrscott/lzw

Default branch: main

README: README.md