Storage Engines

With the introduction of v2 of dataset, you now have a choice of storage engines. Currently offered in the 2.0 release are the pairtree storage engine and SQL storage.

The pairtree storage engine is stable. Its primary limitations are the letter-case handling of the file system where it stores the JSON documents and the lack of record or field locking. Pairtree works well for batch operations and for single-user, single-process operations with fewer than 100k documents.
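The file system case limitation can be illustrated with a minimal sketch. It assumes the generic pairtree idea of splitting a key into two-character directory segments; the function names, the `pairtree` root directory, and the absence of any key escaping are illustrative assumptions, not dataset's actual on-disk layout. The sketch shows how two keys that differ only by case would land on the same path on a case-insensitive file system.

```python
from pathlib import PurePosixPath


def pairtree_path(key: str, root: str = "pairtree") -> PurePosixPath:
    """Map a key to a nested path of two-character segments.

    Hypothetical helper: a real pairtree implementation may also
    escape special characters before splitting the key.
    """
    segments = [key[i:i + 2] for i in range(0, len(key), 2)]
    return PurePosixPath(root, *segments)


def case_collisions(keys):
    """Return groups of keys whose paths differ only by letter case.

    On a case-insensitive file system each group would share a
    single directory, so one document could overwrite another.
    """
    seen = {}
    for key in keys:
        folded = str(pairtree_path(key)).casefold()
        seen.setdefault(folded, []).append(key)
    return [group for group in seen.values() if len(group) > 1]


if __name__ == "__main__":
    print(pairtree_path("abcde"))
    print(case_collisions(["Miller", "miller", "smith"]))
```

Running a check like `case_collisions` over a batch of keys before importing them is one way to spot keys that are distinct to dataset but not to the underlying file system.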

With the introduction of SQL storage, dataset can be used in a multi-process, multi-user mode via a RESTful API. SQL storage is experimental, and as it gets used more, various considerations are coming to the surface.

Cautions