Frequently Asked Questions (FAQ)#
Pipeline and Configuration version mismatch#
I can’t run the pipeline because it says my TOML file has a version mismatch.
To help manage compatibility, your configuration file contains a version key. This key must be compatible (within SemVer) with the installed version of vampires_dpp. There are two ways to fix a mismatch:
(Recommended) Call dpp upgrade to try to automatically upgrade your configuration
Downgrade vampires_dpp to match the version in your configuration
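If you want a feel for what "compatible within SemVer" can mean, here is a minimal sketch of a caret-style compatibility check. The function names parse_version and is_compatible are hypothetical, and the rule itself is an assumption for illustration; it is not taken from vampires_dpp, which may apply a different check.

```python
def parse_version(text: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into integers, ignoring pre-release/build suffixes."""
    core = text.strip().lstrip("v").split("+")[0].split("-")[0]
    parts = [int(p) for p in core.split(".")]
    parts += [0] * (3 - len(parts))  # pad '0.7' to (0, 7, 0)
    return tuple(parts[:3])


def is_compatible(config_version: str, installed_version: str) -> bool:
    """Caret-style rule (an assumption, not the pipeline's actual logic):
    the config must not be newer than the installed package, and both must
    share the same leftmost non-zero version component."""
    cfg = parse_version(config_version)
    inst = parse_version(installed_version)
    if cfg > inst:
        return False
    if cfg[0] != 0:
        return cfg[0] == inst[0]
    if cfg[1] != 0:
        return inst[0] == 0 and cfg[1] == inst[1]
    return cfg == inst


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only
    print(is_compatible("0.7.0", "0.7.3"))
    print(is_compatible("0.6.0", "0.7.0"))
```

Under this rule a configuration written for an older minor release of a 0.x package is treated as incompatible, which is the situation where upgrading the configuration or downgrading the package would be needed.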
I’m getting warnings about centroid files, help!#
The blah blah explain it.
TODO
Performance#
It’s slow. It’s so, so slow. Help.
It’s hard to process data in the volumes that VAMPIRES produces, but there are some tips for speeding it up.
Use an SSD (over USB 3 or Thunderbolt)
Faster storage media reduces slowdowns from opening and closing files, which happens a lot throughout the pipeline.
Important
If using a portable SSD or HDD, make sure to use a high-speed cable plugged into a high-speed port on your computer. Tools like dd, lsusb, cyme or CrystalDiskMark can be used to verify your connection and read/write speeds to the drive.
Don’t save intermediate files
The time it takes to open a file, write it to disk, and close it adds a lot of overhead, on top of the huge increase in data volume.
Use multi-processing
Using more processes should speed up some parts of the pipeline, but don’t expect the speedup to scale with the number of processes, since most operations are limited by storage I/O speed.
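Since storage speed is usually the bottleneck, it can help to check what your drive actually delivers before tuning anything else. Below is a rough sketch that times a sequential write to a directory using only the Python standard library. The function name measure_write_speed and its parameters are hypothetical and not part of vampires_dpp; dedicated tools such as dd or CrystalDiskMark will give more reliable numbers.

```python
import os
import tempfile
import time


def measure_write_speed(directory: str, size_mb: int = 256, chunk_mb: int = 4) -> float:
    """Return the approximate sequential write throughput of `directory` in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # incompressible test data
    n_chunks = max(1, size_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as fh:
            for _ in range(n_chunks):
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the data reaches the drive
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return n_chunks * chunk_mb / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point this at the drive that holds your data; the temp dir is a placeholder
    speed = measure_write_speed(tempfile.gettempdir(), size_mb=64)
    print(f"~{speed:.0f} MB/s sequential write")
```

A result far below the drive’s rated speed often points to a slow cable or port. Read speeds are harder to measure this way because the operating system may serve a freshly written file from its cache.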
Semaphore Warnings#
If you run the pipeline and you see warnings like this:
UserWarning: resource_tracker: There appear to be 5 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
that is okay. This can happen during multiprocessing and will clear up after your computer restarts. The pipeline and the rest of your computer will run fine even if you see this.