
File system layout

dataset provides a way to organize your JSON objects on disk. The original layout was “buckets” oriented. The newer, current layout is a pairtree. The layout is managed and described by the collection.json document located in the root folder of the collection. The pairtree layout supports “attachments” by creating a subdirectory next to the JSON document. The subdirectory is named _ (because the name lacks specific meaning, should be visible on most file systems, and is short). For example, storing the document “hello-world.json” with the attachment “smiles.png” in a collection named “C” would result in paths like C/pairtree/he/ll/o-/wo/rl/d/hello-world.json and C/pairtree/he/ll/o-/wo/rl/d/_/smiles.png. Attachments are experimental, and how they are handled will likely change in the future.
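The path mapping in the example above can be sketched in a few lines of Python. This is an illustration only, not dataset's actual implementation: the function names are hypothetical, the two-character split is inferred from the “hello-world” example, and the sketch assumes the key contains no characters that would need escaping in a file path.

```python
from pathlib import PurePosixPath


def pairtree_segments(key: str, width: int = 2) -> list[str]:
    """Split a key into consecutive chunks of `width` characters.

    The last chunk may be shorter (e.g. the trailing "d" in "hello-world").
    """
    return [key[i:i + width] for i in range(0, len(key), width)]


def document_path(collection: str, key: str) -> PurePosixPath:
    """Hypothetical helper: where the JSON document for `key` would live."""
    return PurePosixPath(
        collection, "pairtree", *pairtree_segments(key), f"{key}.json"
    )


def attachment_path(collection: str, key: str, filename: str) -> PurePosixPath:
    """Hypothetical helper: attachments sit in a "_" directory beside the document."""
    return document_path(collection, key).parent / "_" / filename


print(document_path("C", "hello-world"))
# C/pairtree/he/ll/o-/wo/rl/d/hello-world.json
print(attachment_path("C", "hello-world", "smiles.png"))
# C/pairtree/he/ll/o-/wo/rl/d/_/smiles.png
```

Because the directory path is derived entirely from the key, a document can be located without consulting an index, and no single directory accumulates every document in the collection.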

Pairtree

The directory layout looks like: