Contributing

For bug fixes or new features, please file an issue before submitting a pull request. If the change is not trivial, it may be best to wait for feedback.

Generating data

The Makefile in the project repository handles the generation of the training and benchmarking datasets. Running make regenerates the mibig-2.0 and mibig-3.1 datasets, including the features.hdf5 and classes.hdf5 files needed to train CHAMOIS.

Managing versions

We follow semantic versioning as good project management practice, so keep the following rules in mind:

Retraining or updating the model is a non-breaking change as long as the model still predicts the same thing, so bump the MINOR version of the program.
Upgrading the internal HMMs may change the output but will not break the program, so treat it as a non-breaking change and bump the MINOR version of the program.
If the model predictions change (e.g. the predicted classes change), bump the MAJOR version of the program, as it is a breaking change.
Changes in the code should follow semver in the usual way.
Changes in the CLI should be treated like changes in the API (e.g. a new CLI option or a new command bumps the MINOR version, while removing an option bumps the MAJOR version).
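
The rules above can be sketched as a small lookup table. This is only an illustration and not part of CHAMOIS: the Change enum, the BUMP table and the bump function are hypothetical names introduced here, and the sketch assumes plain MAJOR.MINOR.PATCH version strings.

```python
from enum import Enum


class Change(Enum):
    """Kinds of change covered by the rules above (hypothetical names)."""

    MODEL_RETRAINED_SAME_PREDICTIONS = "model retrained, same predictions"
    HMMS_UPGRADED = "internal HMMs upgraded"
    MODEL_PREDICTIONS_CHANGED = "model predictions changed"
    CLI_OPTION_OR_COMMAND_ADDED = "CLI option or command added"
    CLI_OPTION_REMOVED = "CLI option removed"


# Which part of the version each kind of change bumps, per the rules above.
BUMP = {
    Change.MODEL_RETRAINED_SAME_PREDICTIONS: "MINOR",
    Change.HMMS_UPGRADED: "MINOR",
    Change.MODEL_PREDICTIONS_CHANGED: "MAJOR",
    Change.CLI_OPTION_OR_COMMAND_ADDED: "MINOR",
    Change.CLI_OPTION_REMOVED: "MAJOR",
}


def bump(version: str, change: Change) -> str:
    """Return the next version string for a given kind of change."""
    major, minor, _patch = (int(part) for part in version.split("."))
    if BUMP[change] == "MAJOR":
        # A breaking change resets MINOR and PATCH.
        return f"{major + 1}.0.0"
    # Every other change listed here is a MINOR bump, which resets PATCH.
    return f"{major}.{minor + 1}.0"
```

For instance, bump("1.2.3", Change.HMMS_UPGRADED) gives "1.3.0", while bump("1.2.3", Change.MODEL_PREDICTIONS_CHANGED) gives "2.0.0".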

Upgrading the internal model

After training a new version of the model, copy the resulting JSON file to chamois/predictor/predictor.json.
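
If you prefer to script the copy, a minimal sketch could look like the following. The install_model helper is hypothetical and not part of CHAMOIS. It only checks that the file parses as JSON before copying it, and does not validate the model contents.

```python
import json
import shutil
from pathlib import Path


def install_model(new_model: Path, destination: Path) -> None:
    """Copy a trained model file into place after checking it is valid JSON."""
    with new_model.open() as handle:
        # Raises json.JSONDecodeError if the file is not well-formed JSON,
        # so a truncated or corrupt file never overwrites the current model.
        json.load(handle)
    shutil.copyfile(new_model, destination)
```

From the repository root, this could be called as install_model(Path("new_model.json"), Path("chamois/predictor/predictor.json")), where new_model.json stands in for wherever your training run wrote its output.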

Upgrading the internal HMMs

To download the subset of Pfam HMMs required by the model, simply run:

$ python setup.py clean download_pfam --inplace
