py_dataset
is the python version of dataset. dataset is
a way of storing and organazing JSON documents on disc. To use
py_dataset
we usually import it using Python’s “from”
syntax.
from py_dataset import dataset
This provides a dataset
object to work with dataset
collections.
This is an example of creating a dataset called fiends.ds, saving a record called “littlefreda.json” and reading it back.
import sys
import json
from py_dataset import dataset
# Creating our friends.ds dataset collection for the first time.
= 'friends.ds'
c_name if not dataset.init(c_name, dsn = ""):
print(dataset.error_message())
1)
sys.exit(
# Now let's add something to our collection.
= 'littlefreda'
key = {"name":"Freda","email":"little.freda@inverness.example.org"}
record if not dataset.create(c_name, key, record):
print(dataset.error_message())
1)
sys.exit(
# We should have at least one record in our collection.
# This is the idiom for iterating and working with our collection
# objects.
= dataset.keys(c_name)
keys for key in keys:
= dataset.path(c_name, key)
p print(p)
# NOTE: the "read" method returns a touple!
:= dataset.read(c_name, key)
record, err if err != '':
print(f"read error, {err}")
1)
sys.exit(print(f"Doc: {record}")
The command dataset.init(c_name, dsn = "")
,
dataset.keys(c_name)
,
dataset.read(c_name, key)
dataset.create(c_name, key)
are the main actors here. Most
dataset methods require the collection name as the first parameter.
Likewise many return some sort of value. If it is a boolean value than
True means success and False means failure. If the method returns data
then often it will be returned as a touple like with
read()
. If an error has occurred (e.g. permissions on disc
raising a problem) you can retrieve the dataset error message by using
the error_message()
function. If you’re done with the error
you can use error_clear()
to reset the error message
queue.
Now check to see if the key, littlefreda, is in the collection
'littlefreda') dataset.has_key(c_name,
You can also read your JSON formatted data from a file but you need to convert it first to a Python dict. In theses examples we are creating for Mojo Sam and Capt. Jack then reading back all the keys and displaying their paths and the JSON document created.
with open("mojosam.json") as f:
= f.read().encoding('utf-8')
src "mojosam", json.loads(src))
dataset.create(c_name,
with open("capt-jack.json") as f:
= f.read()
src "capt-jack", json.loads(src))
dataset.create(
for key in dataset.keys(c_name):
print(f"Path: {dataset.path(c_name, key)}")
print(f"Doc: {dataset.read(c_name, key)}")
print("")
NOTE: In v2 of dataset there is no internal mechansim for filting and sorta keys. If you need that you should create a data frame, read the data frame out and manipulate it. Internal sorting and filtering just proved too slow.