Creating a Database
Most users should already have access to a pre-made database. Someone has to build that database, though, and this guide explains how.
First, you’ll need a bunch of data. You can get alert archive tarballs from the
UW ZTF public archive. Or, if you’re at
UW, you can access the tarballs directly from epyc at
/epyc/data/ztf/alerts/public/.
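Once you have a tarball, it can be worth checking its contents before committing to a long upload. The sketch below counts the alert files in a gzipped tarball without extracting it, using only the standard library. The helper name count_alert_files and the ".avro" filename suffix are assumptions made for illustration; adjust the suffix to match the files in your archive.

```python
import tarfile


def count_alert_files(tarfile_path, suffix=".avro"):
    """Count regular files in a gzipped tarball whose names end in `suffix`.

    The ".avro" default is an assumption about how alert files are named.
    """
    count = 0
    # mode="r:gz" reads the gzip-compressed archive without unpacking to disk.
    with tarfile.open(tarfile_path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(suffix):
                count += 1
    return count
```

Iterating over the archive still has to decompress the whole file, so this can take a while on a large tarball, but nothing is written to disk.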
Next, you’ll need AWS credentials so that you can upload alerts. Put them somewhere that boto can find them. If you’re creating a new database from scratch, you’ll also need to create the S3 bucket where alert blobs will be stored.
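As a rough sketch of the bucket step, assuming you use boto3 (this guide only says "boto", so the choice of client library is an assumption): boto3 looks for credentials in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables or in the shared ~/.aws/credentials file. The helper names below are hypothetical, and the bucket name and region are placeholders.

```python
def bucket_create_kwargs(bucket_name, region):
    """Build the keyword arguments for S3 create_bucket.

    S3 rejects an explicit LocationConstraint of "us-east-1", so it is
    only set for other regions.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_alert_bucket(bucket_name, region):
    """Create the bucket (hypothetical helper; needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the helper above has no dependencies

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**bucket_create_kwargs(bucket_name, region))
```

Calling create_alert_bucket("bucket-name", "us-west-2") would then create the bucket in that region, provided the credentials have permission to do so.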
You’ll also want to make a directory on disk where the index will be stored.
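Creating the index directory can be done from Python as well as from the shell. This is a minimal sketch; make_index_dir is a hypothetical helper, and the path is a placeholder.

```python
import pathlib


def make_index_dir(path):
    """Create the index directory (and any missing parents) if it is absent."""
    index_dir = pathlib.Path(path)
    # exist_ok=True makes the call safe to repeat.
    index_dir.mkdir(parents=True, exist_ok=True)
    return index_dir
```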
Use the alertbase.Database.create method to create a new
database. Pass it the AWS region, the S3 bucket name, and the index directory
you just set up. Then, upload your tarfile by calling
the upload_tarfile() method of alertbase.Database:
async Database.upload_tarfile(tarfile_path, n_worker=8, limit=None, skip_existing=False)
Upload a ZTF-style tarfile of alert data using a pool of workers to concurrently upload alerts.
- Parameters
  tarfile_path (Path) – a local path on disk to a gzipped tarfile containing individual avro-serialized alert files.
  n_worker (int, default: 8) – the number of concurrent S3 sessions to open for uploading.
  limit (Optional[int], default: None) – maximum number of alerts to upload.
  skip_existing (bool, default: False) – if true, don’t upload alerts which are already present in the local index.
- Return type
  None
This is an async method, so you call it in a slightly unusual way. This should do the trick:
import asyncio
import pathlib

import alertbase

with alertbase.Database.create("us-west-2", "bucket-name", "./path/to/alertdb") as db:
    # upload_tarfile is a coroutine, so asyncio.run drives it to completion.
    asyncio.run(db.upload_tarfile(pathlib.Path("./path/to/tar")))
You call this directly on the .tar.gz file without untarring or unzipping it.
Expect this to take a long time! A single tarfile can easily take over an hour.
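To give a feel for what "a pool of workers" means in asyncio terms, here is a generic sketch of the pattern: a queue of items drained by n_worker concurrent tasks. This is not alertbase's actual implementation. fake_upload is a stand-in for a real S3 request, and every name here is hypothetical.

```python
import asyncio


async def fake_upload(item, uploaded):
    # Stand-in for a network call; yields control like a real request would.
    await asyncio.sleep(0)
    uploaded.append(item)


async def worker(queue, uploaded):
    # Each worker pulls items off the shared queue until it is cancelled.
    while True:
        item = await queue.get()
        try:
            await fake_upload(item, uploaded)
        finally:
            queue.task_done()


async def upload_all(items, n_worker=8):
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    uploaded = []
    workers = [asyncio.create_task(worker(queue, uploaded)) for _ in range(n_worker)]
    await queue.join()  # wait until every queued item has been processed
    for task in workers:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return uploaded


result = asyncio.run(upload_all(range(20), n_worker=4))
```

With this pattern, a larger n_worker allows more requests in flight at once, which is why the worker count matters for total upload time.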
If you want logging and debugging output, you can do:
import logging
logging.basicConfig()
logging.getLogger("alertbase").setLevel(logging.DEBUG)