internetarchive/trough

GitHub

Trough is a Python-based system for querying very large datasets by splitting the data into many small SQLite databases, sharded on a key chosen to give reliable worst-case performance. It is designed to leverage distributed storage (each SQLite database is a flat file) rather than large CPU and RAM clusters. The aim is predictable performance: the largest shard, loaded locally, indicates the worst case a query will face.
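The sharding idea can be illustrated with a minimal sketch that uses only Python's standard `sqlite3` module. This is not Trough's API: the function names (`shard_path`, `write_rows`, `query_shard`, `largest_shard`), the `capture` table, and the one-file-per-key naming scheme are all assumptions made for illustration. Each shard key gets its own SQLite file, a query touches exactly one file, and the largest file on disk is the one to benchmark for the worst case.

```python
import os
import sqlite3
import tempfile


def shard_path(root, shard_key):
    """Map a shard key to its own SQLite file (hypothetical naming scheme)."""
    return os.path.join(root, f"{shard_key}.sqlite")


def write_rows(root, shard_key, rows):
    """Insert (url, status) rows into the shard that owns shard_key."""
    conn = sqlite3.connect(shard_path(root, shard_key))
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS capture (url TEXT, status INTEGER)"
            )
            conn.executemany(
                "INSERT INTO capture (url, status) VALUES (?, ?)", rows
            )
    finally:
        conn.close()


def query_shard(root, shard_key, sql, params=()):
    """Run a query against a single shard; no other shard is opened."""
    conn = sqlite3.connect(shard_path(root, shard_key))
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()


def largest_shard(root):
    """Return the key of the biggest shard file, the worst case to benchmark."""
    files = [name for name in os.listdir(root) if name.endswith(".sqlite")]
    biggest = max(files, key=lambda name: os.path.getsize(os.path.join(root, name)))
    return biggest[: -len(".sqlite")]


if __name__ == "__main__":
    # Illustrative data only: shard by hostname, one SQLite file per host.
    root = tempfile.mkdtemp()
    write_rows(root, "example.org", [(f"http://example.org/{i}", 200) for i in range(5000)])
    write_rows(root, "example.com", [("http://example.com/", 404)])
    print(largest_shard(root))
    print(query_shard(root, "example.com", "SELECT url, status FROM capture"))
```

Because every shard is an ordinary file, timing a representative query against the file that `largest_shard` returns gives a rough upper bound for that query on any other shard, under the assumption that query cost grows with shard size.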

