# UTF-8 Benchmark

The idea is to benchmark several UTF-8 aspects of the Haskell `text` package, namely UTF-8 encoding/decoding and various Unicode casing operations. Hopefully, we'll find ways to improve `text`'s performance.

For the time being, we're comparing the Text implementation with ICU's C implementation. In the future, I also plan to test it against C++ Boost and the Rust standard library.

# Implemented so far

- [x] UTF-8 decoding from file.
  - [x] English
  - [ ] Chinese
  - [ ] French
  - [ ] Russian
- [ ] UTF-8 encoding to file.
  - [ ] English
  - [ ] Chinese
  - [ ] French
  - [ ] Russian


# Usage

```bash
nix-shell
make all
```

# Findings

## UTF-8 Decoding

```
hyperfine ./haskell-read-utf8 ./icu-read-utf8
Benchmark #1: ./haskell-read-utf8
  Time (mean ± σ):      23.3 ms ±   0.9 ms    [User: 14.9 ms, System: 8.3 ms]
  Range (min … max):    22.0 ms …  26.1 ms    111 runs

Benchmark #2: ./icu-read-utf8
  Time (mean ± σ):      12.5 ms ±   0.8 ms    [User: 7.6 ms, System: 4.9 ms]
  Range (min … max):    11.5 ms …  16.1 ms    176 runs

Summary
  './icu-read-utf8' ran
    1.85 ± 0.14 times faster than './haskell-read-utf8'
```

Initial benchmark (2021-06-29 21:02:27 +02:00): comparing Haskell's Text with ICU on the EN corpus.

Run `make all` to run the benchmark.
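
For context, the Haskell side of such a benchmark could look roughly like the sketch below. It is an assumption about the shape of `haskell-read-utf8`, not its actual source: it reads a file given on the command line as raw bytes, decodes it with `decodeUtf8'` from `Data.Text.Encoding`, and prints the character count so the decoded text is actually consumed. The helper name `decodedLength` and the command-line interface are hypothetical.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')
import System.Environment (getArgs)
import System.Exit (die)

-- | Decode a strict ByteString as UTF-8 and return the number of
-- characters in the result. Returns 'Left' with the decoder's error
-- message when the input is not valid UTF-8.
-- (Hypothetical helper, not taken from the benchmark's source.)
decodedLength :: B.ByteString -> Either String Int
decodedLength bytes =
  case decodeUtf8' bytes of
    Left err   -> Left (show err)
    Right text -> Right (T.length text)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      -- Read the whole file as raw bytes; no decoding happens here.
      bytes <- B.readFile path
      case decodedLength bytes of
        Left err -> die ("invalid UTF-8: " ++ err)
        Right n  -> print n
    _ -> die "usage: read-utf8 FILE"
```

Reading the file as a strict `ByteString` first keeps file I/O and decoding as separate steps, although a whole-process timing such as hyperfine's still includes both.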