RetroCat

~~hero-subtitle The Retrievable Object Catalog~~

RetroCat is a combined database and blob store intended for large catalogs of similarly formatted media objects. It was originally intended as a deduplicating blob store for video game ROMs, but I also intend to use it for archival data in general.

What's the point?

The goal is to decompose and recompose the contents of objects ingested into its blob storage, so that the contents can be queried individually or the entire object reconstructed 1:1 as its original file. In addition, if you have two closely related files that differ only by minor changes to an object inside them, RetroCat would store only those changes.

Object contents would be reversibly deconstructed, decrypted/decompressed, and loaded into a content-addressable blob store backed by a database, which could then deduplicate and differentially compress across multiple items of the same type.
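To make the idea concrete, here is a minimal sketch of a content-addressed store with deduplication. Everything in it is an assumption for illustration: the names `BlobStore`, `import_object`, and `export_object` are hypothetical, SHA-256 is just one reasonable choice of content address, and `zlib` stands in for whatever compression RetroCat ends up using. It is not RetroCat's actual design or API.

```python
import hashlib
import zlib


class BlobStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # hex digest -> compressed bytes

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical content
        # always maps to the same key and is stored only once.
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = zlib.compress(data)
        return key

    def get(self, key: str) -> bytes:
        return zlib.decompress(self._blobs[key])

    def __len__(self) -> int:
        return len(self._blobs)


def import_object(store, members):
    """Store each (name, bytes) member; return an ordered manifest of (name, key)."""
    return [(name, store.put(data)) for name, data in members]


def export_object(store, manifest):
    """Rebuild the original ordered list of (name, bytes) members from a manifest."""
    return [(name, store.get(key)) for name, key in manifest]


# Two made-up "ROMs" that share one SDK library.
store = BlobStore()
rom_a = [("sdk.lib", b"shared sdk code"), ("game.bin", b"title A code")]
rom_b = [("sdk.lib", b"shared sdk code"), ("game.bin", b"title B code")]
manifest_a = import_object(store, rom_a)
manifest_b = import_object(store, rom_b)
```

After both imports the store holds three blobs rather than four, because the shared `sdk.lib` content hashes to the same key, and `export_object` gives back each ROM's members unchanged. A real container format would also need to record whatever metadata is required to rebuild the file byte-for-byte; this sketch only tracks member order and content.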

In a video game library, for example, this would let common components that are part of the original ROMs, such as SDK libraries, shared assets, or update files, be deduplicated away as your library grows. Carrying multiple regional or version revisions of the same title would store only the differences (e.g., patches to the code, or string and graphics changes). Items would be imported into and exported from the blob store, and the database would track which items changed, along with their deltas.

So say, for instance, you have a bunch of operating system ISO images: different languages, different system builds, but a lot of the same stuff in all of them. You would just put them all in RetroCat, and it would splay each ISO filesystem out into a set of independently catalogued objects. Then, if you had, say, Windows XP in English and in Portuguese, it would pick one 'dominant' file and store every 'sub' file as a set of differences applied to the dominant one.
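The dominant/sub relationship can be sketched as a simple copy-or-literal delta. This is only an illustration under stated assumptions: `make_delta` and `apply_delta` are hypothetical names, and Python's `difflib.SequenceMatcher` is swapped in as a stand-in for a real binary delta encoder, which would be far more efficient on large files.

```python
from difflib import SequenceMatcher


def make_delta(dominant: bytes, sub: bytes):
    """Describe `sub` as copies from `dominant` plus literal bytes (hypothetical format)."""
    ops = []
    matcher = SequenceMatcher(None, dominant, sub, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # This range is identical in both files: reference the dominant copy.
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # Replaced or inserted bytes must be stored literally.
            ops.append(("data", sub[j1:j2]))
        # Pure deletions need no op: the deleted range is simply never copied.
    return ops


def apply_delta(dominant: bytes, ops) -> bytes:
    """Rebuild the sub file from the dominant file and its delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += dominant[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Made-up stand-ins for an English and a Portuguese build of the same file.
english = b"[ui] Start Menu | My Computer | Recycle Bin [end]"
portuguese = b"[ui] Menu Iniciar | Meu Computador | Lixeira [end]"
delta = make_delta(english, portuguese)
rebuilt = apply_delta(english, delta)
```

Here `rebuilt` equals `portuguese`, and only the `("data", ...)` entries carry bytes that are not already present in the dominant file. How RetroCat would actually choose the dominant file is not covered by this sketch.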

RetroCat will have an API where, say, a video game emulator could simply request “what block where” over some IPC, and RetroCat would do the heavy lifting of decompressing and reconstructing the filesystem on-the-fly.
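As a rough sketch of what answering a "what block where" request could involve, the example below serves byte ranges from a virtual image whose contents are spread across several stored blobs. The `VirtualImage` class, its `read` method, and the layout format are all hypothetical, and the IPC transport is left out entirely; this shows only the lookup and reassembly step.

```python
from bisect import bisect_right


class VirtualImage:
    """Serve byte ranges from an image laid out as an ordered list of blobs (hypothetical)."""

    def __init__(self, layout, blobs):
        # layout: ordered blob keys; blobs: mapping of key -> bytes.
        self._blobs = blobs
        self._keys = list(layout)
        self._starts = []  # starting offset of each blob within the image
        pos = 0
        for key in self._keys:
            self._starts.append(pos)
            pos += len(blobs[key])
        self.size = pos

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        end = min(offset + length, self.size)  # clamp reads at end of image
        if offset >= end:
            return b""
        out = bytearray()
        # Find the last blob whose start offset is <= the requested offset.
        i = bisect_right(self._starts, offset) - 1
        while offset < end:
            blob = self._blobs[self._keys[i]]
            local = offset - self._starts[i]
            chunk = blob[local:local + (end - offset)]
            out += chunk
            offset += len(chunk)
            i += 1
        return bytes(out)


# Made-up blobs standing in for deduplicated pieces of one image.
blobs = {"hdr": b"HEAD", "sdk": b"SDKLIB", "game": b"GAMEDATA"}
image = VirtualImage(["hdr", "sdk", "game"], blobs)
block = image.read(2, 6)  # spans the end of "hdr" and the start of "sdk"
```

An emulator-facing service would wrap something like `read` behind whatever IPC mechanism is chosen, and would fetch and decompress blobs lazily instead of holding them all in memory as this sketch does.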