Command line tool - run option
The run command actually runs the pipeline.
Pass it the names of one or more sources to run:
python ocdskingfisher-cli run taiwan
python ocdskingfisher-cli run taiwan canada_buyandsell
You can run all sources with the --all flag.
python ocdskingfisher-cli run --all
This is not recommended, because some of the sources take a very long time to download.
There is also a sample mode, which fetches only a small amount of data for each source. If you use the --all flag, we strongly recommend using the --sample flag too.
python ocdskingfisher-cli run --sample ...
The run command looks for existing collections with the same source and sample flag as you specify and, by default, resumes the latest one.
To make sure you start a new collection, pass the --newversion flag.
python ocdskingfisher-cli run --newversion ...
To select a specific existing collection, pass the --dataversion flag with that collection's data version.
python ocdskingfisher-cli run --dataversion 2018-07-31-16-03-50 ...
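The selection rules above can be sketched in Python. This is only an illustration of the described behaviour, not the tool's actual implementation: the function name choose_data_version is hypothetical, and the sketch assumes that data versions are timestamps in the format shown above (so the latest one sorts last), that a new collection is started when no matching collection exists, that passing both flags is an error, and that an unknown data version is rejected.

```python
from datetime import datetime
from typing import Iterable, Optional


def choose_data_version(
    existing: Iterable[str],
    now: datetime,
    newversion: bool = False,
    dataversion: Optional[str] = None,
) -> str:
    """Pick the data version a run would use (hypothetical helper).

    `existing` holds the data versions of collections that already match
    the requested source and sample flag.
    """
    versions = sorted(existing)

    # Assumption: the two flags are mutually exclusive.
    if newversion and dataversion is not None:
        raise ValueError("pass either newversion or dataversion, not both")

    # --dataversion: select one specific existing collection.
    if dataversion is not None:
        # Assumption: an unknown data version is an error.
        if dataversion not in versions:
            raise ValueError(f"no existing collection with data version {dataversion}")
        return dataversion

    # --newversion, or nothing to resume (assumption): start a new collection.
    if newversion or not versions:
        return now.strftime("%Y-%m-%d-%H-%M-%S")

    # Default: resume the latest matching collection. Timestamps in this
    # format sort chronologically when compared as strings.
    return versions[-1]
```

For example, with existing collections 2018-07-30-09-00-00 and 2018-07-31-16-03-50, the default choice is 2018-07-31-16-03-50, while newversion=True yields a fresh timestamp taken from now.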
By default, it will run all stages of the pipeline.
You can specify that only one stage should be run with the following flags:
You can specify that stages should be skipped with the following flags:
python ocdskingfisher-cli run --skipstore --skipcheck ...
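If you call the run command from a script, it can help to build the argument list in one place. The following is a minimal sketch, assuming the flags behave as described on this page. The function build_run_command and its parameters are hypothetical and are not part of ocdskingfisher. The sketch also assumes that flags go before source names, as in the examples above, and that source names and --all are not combined.

```python
from typing import List, Optional, Sequence


def build_run_command(
    sources: Sequence[str] = (),
    run_all: bool = False,
    sample: bool = False,
    newversion: bool = False,
    dataversion: Optional[str] = None,
    skipstore: bool = False,
    skipcheck: bool = False,
) -> List[str]:
    """Build an argument list for the run command (hypothetical helper)."""
    # Assumption: a run names its sources or uses --all, never both.
    if run_all == bool(sources):
        raise ValueError("pass source names or run_all=True, but not both")

    cmd = ["python", "ocdskingfisher-cli", "run"]
    if run_all:
        cmd.append("--all")
    if sample:
        cmd.append("--sample")
    if newversion:
        cmd.append("--newversion")
    if dataversion is not None:
        cmd.extend(["--dataversion", dataversion])
    if skipstore:
        cmd.append("--skipstore")
    if skipcheck:
        cmd.append("--skipcheck")

    # Source names follow the flags.
    cmd.extend(sources)
    return cmd


# Example: a sample run of one source that skips the store and check stages.
command = build_run_command(["taiwan"], sample=True, skipstore=True, skipcheck=True)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from the directory that contains ocdskingfisher-cli.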