dsquery
dsquery OPTIONS C_NAME SQL_STATEMENT [PARAMS]
dsquery is a tool to support SQL queries of dataset collections. Pairtree based collections should be index before trying to query them (see ‘-index’ option below). Pairtree collections use the SQLite 3 dialect of SQL for querying them. For collections using a SQL storage engine (e.g. SQLite3, Postgres and MySQL), the SQL dialect used is that of the SQL storage engine chosen.
The schema is the same for all storage engines. The scheme for the JSON stored documents have a four column scheme. The columns are “_key”, “created”, “updated” and “src”. “_key” is a string (aka VARCHAR), “created” and “updated” are timestamps while “src” is a JSON column holding the JSON document. The table name reflects the collection name without the “.ds” extension (e.g. data.ds is stored in a database called data having a table also called data).
The output of dsquery is a JSON arrary of objects. The order of the objects is determined by the your SQL statement and SQL engine. There is an option to generate a 2D grid of values and CSV format are also supported as options (see ‘-grid’ and ‘-csv’ below).
Generate a list of JSON objects with the _key
value
merged with the object stored as the ._Key
attribute. The
colllection name “data.ds” which is implemented using Postgres as the
JSON store. (note: in Postgres the ||
is very helpful).
dsquery data.ds "SELECT jsonb_build_object('_Key', _key)::jsonb || src::jsonb FROM data"
In this example we’re returning the “src” in our collection by querying for a “id” attribute in the “src” column. The id is passed in as an attribute using the Postgres positional notatation in the statement.
dsquery data.ds "SELECT src FROM data WHERE src->>'id' = $1 LIMIT 1" "xx103-3stt9"