Storage Engines

With the introduction of v2 of dataset, you now have a choice of storage engines. In v2.2, SQLite3 became the default storage engine. Currently supported storage engines are.

With the introduction of SQL storage, dataset can be used in a multi-process/multi-user mode via a RESTful API. The SQL storage is experimental, and as it gets used more, various considerations are coming to the surface.

Cautions

The pairtree storage engine is stable. Its primary limitations are the case limitations of the file system (where it stores the JSON documents) and the lack of record/field locking. Pairtree works fine for batch operations and for single-user/single-process operations with fewer than 100k documents. It is not appropriate for concurrent access involving writes. It remains in v2.2 as a historical artifact and will be removed in v3.
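
To illustrate the file system case limitation, here is a minimal Python sketch of how a pairtree-style layout maps a key to a path. It assumes the general pairtree idea of splitting a key into two-character directory segments. The function names, the `pairtree` root directory, and the `<key>.json` file name are hypothetical and are not taken from dataset's own implementation, which may differ in detail (for example in how it encodes special characters).

```python
from pathlib import PurePosixPath


def pairtree_path(key: str, root: str = "pairtree") -> PurePosixPath:
    """Map a key to a path by splitting it into two-character segments.

    Hypothetical layout for illustration only; not dataset's actual code.
    """
    segments = [key[i:i + 2] for i in range(0, len(key), 2)]
    return PurePosixPath(root, *segments, f"{key}.json")


def collides_on_case_insensitive_fs(key_a: str, key_b: str) -> bool:
    """Return True when two distinct keys map to paths that differ only by case.

    On a case-insensitive file system, such paths refer to the same file.
    """
    if key_a == key_b:
        return False
    return str(pairtree_path(key_a)).lower() == str(pairtree_path(key_b)).lower()


if __name__ == "__main__":
    print(pairtree_path("frieda"))
    print(collides_on_case_insensitive_fs("Frieda", "frieda"))
```

Under these assumptions, the keys `Frieda` and `frieda` map to paths that differ only in letter case, so on a case-insensitive file system one JSON document could silently overwrite the other. Nothing in this layout provides locking either, which is why concurrent writes are unsafe.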

MySQL is used less and less in the software I work with. MySQL support may be dropped in version 3 of dataset.