Performance
TUIX Core is designed for high throughput. The compiled C core handles rendering, compositing, and input, while the Python layer manages configuration and orchestration.
Benchmark Summary
Benchmarks were measured on Windows 10 with a 12th Gen Intel Core i5-12450HX and 24 GB RAM.
| Scenario | Mean Latency | Ops/sec | Peak RSS |
|---|---|---|---|
| Startup | 302 µs | 3,311 | 3.9 MB |
| Tree Construction (10 items) | 110 µs | 9,126 | 3.9 MB |
| Tree Construction (100 items) | 104 µs | 9,635 | 3.9 MB |
| Tree Construction (1000 items) | 97 µs | 10,309 | 4.0 MB |
| Re-render (no changes) | 10 µs | 96,154 | 3.9 MB |
| Content Update | 134 µs | 7,463 | 3.9 MB |
| Layout Stress (10×4 grid) | 1,142 µs | 876 | 4.0 MB |
| Scroll Stress (2000 items) | 112 µs | 8,929 | 4.1 MB |
| Virtual List (100K items) | 70 µs | 14,388 | 18.3 MB |
Rendering Optimizations
Delta Rendering
The renderer processes only the rows that changed between frames. Each row is hashed with FNV-1a over its raw pixel data (before color quantization), and the hash is compared with the one from the previous frame. Unchanged rows are skipped entirely: no quantization, no SGR emission, no output.
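As a rough illustration of row-level change detection, here is a minimal Python sketch. The real logic lives in the compiled C core and is not shown in this document, so the names `fnv1a_32` and `changed_rows` are hypothetical, and the 32-bit FNV-1a variant is an assumption (the text above does not say which width is used).

```python
from __future__ import annotations

FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Return the 32-bit FNV-1a hash of data (XOR the byte, then multiply)."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def changed_rows(rows: list[bytes], previous: list[int | None]) -> list[int]:
    """Return indices of rows whose hash differs from the previous frame.

    `previous` holds one hash per row and is updated in place, so the next
    call compares against this frame. Hypothetical helper, not a TUIX API.
    """
    dirty = []
    for index, row in enumerate(rows):
        row_hash = fnv1a_32(row)
        if previous[index] != row_hash:
            previous[index] = row_hash
            dirty.append(index)
    return dirty
```

On the first frame every row is reported as dirty because no previous hash exists. After that, only rows whose bytes changed are returned, so the later stages can skip everything else.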
Color Quantization LUT
A 128 KB lookup table maps each of the 65,536 RGB565 values to the closest terminal color in ANSI16, ANSI256, or truecolor. The LUT is built once, on the first render call; after that, color matching is an O(1) lookup per pixel.
SGR Grouping
Consecutive pixels with identical foreground color, background color, and style flags are grouped into a single SGR (Select Graphic Rendition) escape sequence. This reduces the number of escape-sequence bytes emitted per row.
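The grouping step can be sketched in Python as follows. The cell layout `(char, fg, bg, bold)` and the function name `render_row` are assumptions made for illustration. The escape codes themselves are standard SGR parameters: `38;5;N` and `48;5;N` select 256-color foreground and background, `1` is bold, `22` is normal intensity, and `0` resets.

```python
from itertools import groupby


def render_row(cells):
    """Render one row, emitting one SGR sequence per run of identical attributes.

    cells: iterable of (char, fg, bg, bold) tuples, where fg and bg are
    256-color palette indices. Hypothetical layout, not the TUIX pixel format.
    """
    out = []
    # groupby merges only consecutive cells whose (fg, bg, bold) key matches.
    for (fg, bg, bold), run in groupby(cells, key=lambda cell: cell[1:]):
        weight = "1" if bold else "22"
        out.append(f"\x1b[{weight};38;5;{fg};48;5;{bg}m")
        out.append("".join(cell[0] for cell in run))
    out.append("\x1b[0m")  # reset attributes at the end of the row
    return "".join(out)
```

A row of 120 cells that all share the same attributes produces one SGR sequence plus the final reset, instead of 120 separate sequences.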
Chunked Output
Rendered ANSI output is accumulated in a 256 KB buffer and flushed in chunks. This reduces the number of write() system calls and improves throughput on both Windows and POSIX.
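A minimal Python sketch of the buffering pattern, assuming a hypothetical `ChunkedWriter` class and a flush-when-full policy. The actual flush policy of the C core is not documented here.

```python
class ChunkedWriter:
    """Accumulate output and hand it to `sink` in large chunks.

    `sink` is any callable that accepts bytes. Hypothetical helper written
    for illustration; it is not part of the TUIX API.
    """

    def __init__(self, sink, capacity=256 * 1024):
        self._sink = sink
        self._capacity = capacity
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data
        # Flush once the buffer reaches capacity; a chunk can exceed the
        # capacity if a single write pushes it past the limit.
        if len(self._buffer) >= self._capacity:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._sink(bytes(self._buffer))
            self._buffer.clear()
```

To write to the terminal, the sink could be `sys.stdout.buffer.write`, with a final `flush()` at the end of each frame so partial chunks are not left behind.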
Scene Pointer Caching
The main loop caches the active scene pointer across frames. Scene names are interned strings compared by pointer rather than value, making the per-frame scene lookup O(1).
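Python has no raw pointers, but `sys.intern` plus the identity operator `is` gives a close analogue. The sketch below is an illustration only. `SceneManager` and its methods are hypothetical names, and the C core's data structures may look quite different.

```python
import sys


class SceneManager:
    """Cache the active scene so each frame avoids a name lookup."""

    def __init__(self):
        self._scenes = {}
        self._active_name = None
        self._active_scene = None  # cached across frames

    def add(self, name, scene):
        # Interning guarantees one canonical object per distinct name.
        self._scenes[sys.intern(name)] = scene

    def activate(self, name):
        name = sys.intern(name)
        # Identity check (pointer comparison in C), not a character-by-character
        # string comparison.
        if name is not self._active_name:
            self._active_scene = self._scenes[name]
            self._active_name = name

    def current(self):
        # Per-frame access is a plain attribute read.
        return self._active_scene
```

The dictionary lookup happens only when the active scene actually changes. Every other frame reuses the cached reference.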
Memory Profile
TUIX Core has a small memory footprint. Base RSS at startup is approximately 3.9 MB. Memory growth is primarily driven by the number of pixels in active buffers. A full-screen canvas at 120×30 uses approximately 120 × 30 × sizeof(TuixPixel) bytes.
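To put numbers on that formula, the sketch below evaluates it for a few pixel sizes. The value of `sizeof(TuixPixel)` is not stated in this document, so 8, 12, and 16 bytes are assumptions chosen only to show the scale. `canvas_bytes` is a hypothetical helper.

```python
def canvas_bytes(width, height, pixel_size):
    """Bytes needed for one canvas buffer of width x height pixels."""
    return width * height * pixel_size


# Hypothetical sizeof(TuixPixel) values; the real size is not documented here.
for pixel_size in (8, 12, 16):
    total = canvas_bytes(120, 30, pixel_size)
    print(f"sizeof(TuixPixel) = {pixel_size:2d} B -> {total:,} bytes")
```

A 120×30 canvas has 3,600 pixels, so under these assumptions one buffer takes between 28,800 and 57,600 bytes, which is small next to the 3.9 MB base RSS.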
Comparisons
TUIX Core was benchmarked against several terminal UI frameworks across languages:
| Framework | Language | Tree 1000 Items | Re-render | Peak RSS |
|---|---|---|---|---|
| TUIX Core | Python/C | 97 µs | 10 µs | 3.9 MB |
| Ratatui | Rust | 125 µs | — | 3.2 MB |
| Bubble Tea | Go | 311 µs | — | 11.2 MB |
| blessed | Node.js | 5,800 µs | 124 µs | 152 MB |
| Ink | Node.js (React) | 3,200 µs | — | 87 MB |
Running Benchmarks
# Full benchmark suite (25 scenarios, outputs JSON + CSV)
python tests/benchmarks/full_benchmark.py
# Micro scene benchmark (command buffer execution speed)
python tests/benchmarks/micro_scene_bench.py

The full benchmark outputs benchmark.json, benchmark_summary.csv, and benchmark_frames.csv. The micro benchmark measures per-object command buffer execution cost in microseconds.