Asset Uploader
Getting Started
This tool runs two kinds of long-running jobs: scans, which walk your storage (local or remote) for asset entries and register them with Vermillio, and upload batches, which upload the asset files for those entries. Both kinds of job are resumable if the process dies.
For your first run, start by scanning a storage source to register its assets with Vermillio. The argument is a protocol prefix (gs://, s3://, az://, or / for a local path) followed directly by the directory path:
uv run vermillio_uploader scan_folder <PROTOCOL (gs://, s3://, az://, /)><PATH_TO_DIRECTORY>
This logs a scan_id to your terminal and begins a resumable storage scan. To resume the scan later, pass an empty path together with the scan_id:
uv run vermillio_uploader scan_folder "" --scan_id <SCAN_ID>
To upload media files for these registered assets, create and run batches, specifying the number of files you want to upload in a single batch:
uv run vermillio_uploader run_new_batch <BATCH_SIZE e.g. 500>
The batch is created from entries that you have registered but not yet uploaded, up to the size you specified, and its files are uploaded one at a time. You can run multiple batches concurrently to speed up processing, but monitor the resources on your system (especially network bandwidth) to determine the right number.
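As an illustration of running concurrent batches, here is a minimal shell sketch. The function name `run_concurrent_batches`, the `UPLOADER` variable, and the log file names are hypothetical and not part of vermillio_uploader. It also assumes that several `run_new_batch` processes can safely be started at the same time, which the text above suggests but does not spell out.

```shell
#!/bin/sh
# Hypothetical helper, not part of vermillio_uploader.
# UPLOADER is an assumed variable holding the command prefix used in this guide.
UPLOADER="${UPLOADER:-uv run vermillio_uploader}"

# Start WORKERS run_new_batch processes in parallel, each with its own log.
# Usage: run_concurrent_batches <WORKERS> <BATCH_SIZE>
run_concurrent_batches() {
  workers="$1"
  batch_size="$2"
  i=1
  while [ "$i" -le "$workers" ]; do
    # Each log keeps that worker's terminal output, including its batch_id.
    $UPLOADER run_new_batch "$batch_size" > "batch_worker_$i.log" 2>&1 &
    i=$((i + 1))
  done
  # Block until every background batch has exited.
  wait
}
```

For example, `run_concurrent_batches 3 500` would start three batches of up to 500 files each. Start with a small worker count and increase it only while network bandwidth allows.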
As with scans, a batch can be resumed using the batch_id it outputs:
uv run vermillio_uploader upload_batch <BATCH_ID>
Note that for both scans and batches, you should run only one process per scan_id or batch_id at a time.
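One way to guard against accidentally starting a second process for the same batch_id on a single machine is a lock directory. The sketch below is illustrative only: `resume_batch_once`, `UPLOADER`, and `LOCK_ROOT` are hypothetical names, and the lock is local, so it does not protect against processes started on other machines.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of vermillio_uploader.
UPLOADER="${UPLOADER:-uv run vermillio_uploader}"
LOCK_ROOT="${LOCK_ROOT:-${TMPDIR:-/tmp}/vermillio_locks}"

# Resume a batch only if no other local process holds its lock.
# Usage: resume_batch_once <BATCH_ID>
resume_batch_once() {
  batch_id="$1"
  lock_dir="$LOCK_ROOT/batch_$batch_id"
  mkdir -p "$LOCK_ROOT"
  # mkdir without -p fails if the directory already exists, so it acts as a lock.
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "batch $batch_id appears to be running already; skipping" >&2
    return 1
  fi
  $UPLOADER upload_batch "$batch_id"
  status=$?
  # Release the lock once the upload process has exited.
  rmdir "$lock_dir"
  return "$status"
}
```

If the process is killed before it releases the lock, the stale lock directory has to be removed by hand before that batch can be resumed through this wrapper.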
Advanced Usage
If you need to check the status of, or resume, several processes without knowing their ids, you can list all open scans and uploads:
uv run vermillio_uploader data_transfer get_unfinished_scans
uv run vermillio_uploader data_transfer get_unfinished_batches
Using the results of these commands, you can start scan_folder and upload_batch operations as above with the appropriate ids.
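Resuming several unfinished batches could be sketched as follows. The output format of `get_unfinished_batches` is not described here, so this sketch does not parse it and instead takes the batch ids as arguments. `resume_batches`, `UPLOADER`, and the log file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of vermillio_uploader.
UPLOADER="${UPLOADER:-uv run vermillio_uploader}"

# Resume each given batch in its own background process.
# Usage: resume_batches <BATCH_ID> [<BATCH_ID> ...]
resume_batches() {
  seen=""
  for batch_id in "$@"; do
    # Skip repeated ids so each batch_id gets only one process.
    case " $seen " in
      *" $batch_id "*) continue ;;
    esac
    seen="$seen $batch_id"
    $UPLOADER upload_batch "$batch_id" > "resume_$batch_id.log" 2>&1 &
  done
  # Block until every resumed batch has exited.
  wait
}
```

As with new batches, the number of ids passed at once determines how many uploads run concurrently, so keep an eye on network bandwidth.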