The documentation is organized around the command line options and a series of “how to” style examples.
The basic operations supported by dataset are listed below, organized by collection level and JSON document level.
dataset is built around the concept of key/value pairs, where the key is the unique identifier for an object (i.e. the value) stored in the collection. Each storage option supported by dataset has its own constraints on what things can be called. Keys should be lower case alphanumeric or underscore only. E.g. the pairtree storage relies on the file system to store the JSON objects; some file systems are not case sensitive and others have trouble with non-alphanumeric filenames.
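For example, creating and reading an object with a well formed key might look like the following sketch; the collection name and key shown here are only illustrative.

```shell
# initialize a new collection (pairtree storage by default)
dataset init mydata.ds

# create an object under a lower case alphanumeric key
dataset create mydata.ds casad_0001 '{"title": "An example record"}'

# read the object back by its key
dataset read mydata.ds casad_0001
```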
New as of version v2 is a web service, datasetd, providing access to dataset collections. datasetd and the end points it supports are described in the datasetd documentation page.
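As a rough illustration, querying a collection through datasetd from the command line might look like the sketch below. The port and the `/api/<collection>/...` route shapes are assumptions here, so check the datasetd documentation for the actual end points.

```shell
# Assuming datasetd is running locally on port 8485 and serving "mydata.ds"
# (the port and routes below are assumptions, see the datasetd docs)

# list the keys in the collection
curl http://localhost:8485/api/mydata.ds/keys

# retrieve a single JSON object by key
curl http://localhost:8485/api/mydata.ds/object/casad_0001
```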
In v2 dataset is starting to support storing your JSON documents in a SQL database. Currently three SQL databases can be used to store the JSON documents: SQLite 3 (the default engine, used in dataset’s test suites), MySQL 8 (minimally tested) and Postgres >= 12 (well tested). See storage engines for more details.
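As a sketch, choosing a SQL storage engine when creating a collection might look like this; the DSN strings below are illustrative assumptions, so consult the storage engines documentation for the exact form your setup needs.

```shell
# SQLite 3 storage (DSN shown is an assumption, see the storage engines docs)
dataset init mydata.ds "sqlite://collection.db"

# MySQL 8 storage (credentials and database name are placeholders)
dataset init mydata.ds "mysql://DB_USER:DB_PASSWORD@/DB_NAME"

# Postgres storage (connection string is a placeholder)
dataset init mydata.ds "postgres://DB_USER:DB_PASSWORD@localhost/DB_NAME?sslmode=disable"
```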
Migrating dataset collections between major versions, or just between different collections, can be done using the “dump” and “load” feature. This replaces the old process from early v2 that required you to run a “repair” operation to convert a collection to the current version of dataset.
Example: migrating the v2 collection “data_v2.ds” to v3 as “data_v3.ds”.
```shell
dataset3 init data_v3.ds
dataset dump data_v2.ds | dataset3 load data_v3.ds
```