Use of CAT in nextflow nf-core pipeline #54
Comments
Hi @skrakau,
Thanks, it's great that you use CAT in the nf-core mag pipeline! :)
We don't have a long-term solution for this, but as said before it's on our list. This is where things stand: we currently cannot make a DIAMOND database that is compatible with every version of DIAMOND. Some versions of DIAMOND work with some DB formats, so it doesn't have to be a one-to-one match, but some don't. What we have been doing lately instead is to supply, in the preconstructed DB folder, the DIAMOND binaries with which the database files were made. It's in a folder called […].
Alternatively, if that doesn't work for you because you want to supply your own version of DIAMOND, I can put the version in the name of the tarball or put an extra text file on the server with the version number. Let me know your thoughts on this!
Regarding older DB versions: we do have the old databases on our own system, so a user can always ask for them, but currently I only put the latest one on our download server (the storage there is limited). We also don't want to keep TBs of data, so we're thinking of a more robust solution of saving a single DB per timeframe...
Best wishes, Bastiaan
Hi @bastiaanvonmeijenfeldt, thanks a lot for your answer and the updates! Yes, the version in the name of the tarball or in an extra file would help; then one could see directly whether it is still compatible. I will try to keep myself up to date. Thanks again!
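To illustrate how a pipeline could use such a version tag before downloading the full database, here is a minimal sketch. It assumes a hypothetical naming convention in which the tarball name contains `diamond-<version>` (no such name exists on the server at the time of this thread), and it assumes that `diamond version` prints a line of the form `diamond version X.Y.Z`. All function names are made up for this example.

```python
import re
from typing import Optional

# ASSUMPTION: hypothetical naming scheme, e.g.
# "CAT_prepare_20200304_diamond-0.9.21.tar.gz". Not an existing file name.
TARBALL_VERSION_RE = re.compile(r"diamond-(\d+(?:\.\d+)+)")

# ASSUMPTION: `diamond version` prints a line like "diamond version 0.9.24".
CLI_VERSION_RE = re.compile(r"diamond version (\d+(?:\.\d+)+)")


def version_from_tarball_name(name: str) -> Optional[str]:
    """Return the DIAMOND version embedded in a tarball name, if any."""
    match = TARBALL_VERSION_RE.search(name)
    return match.group(1) if match else None


def version_from_cli_output(output: str) -> Optional[str]:
    """Return the version reported by the local DIAMOND binary, if any."""
    match = CLI_VERSION_RE.search(output)
    return match.group(1) if match else None


def same_version(db_version: Optional[str], local_version: Optional[str]) -> bool:
    """Conservative check: only an exact, known match counts as compatible."""
    return db_version is not None and db_version == local_version


if __name__ == "__main__":
    db = version_from_tarball_name("CAT_prepare_20200304_diamond-0.9.21.tar.gz")
    local = version_from_cli_output("diamond version 0.9.21")
    print(same_version(db, local))
```

The exact-match rule is deliberately strict. As noted above, some DIAMOND versions do work with databases built by other versions, so a mismatch here means "unverified" rather than "certainly incompatible".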
Hi, I am having this problem (nf-core/mag#188). Even if I use --path_to_diamond, it does not override the DIAMOND that is included in Nextflow MAG. Can you make tbb.bio.uu.nl/bastiaan/CAT_prepare/CAT_prepare_20200304.tar.gz available on your website? Thanks,
Hi,
we would like to use, or more precisely continue using, CAT in the nf-core mag pipeline; however, there are a few issues we are currently facing.
If I see it correctly, one still needs to use the DIAMOND version that was used to create the database, as already discussed in #45 ("Database was built with a different version of Diamond and is incompatible"). Is there any plan for this to change in future releases?
This is a problem for nextflow pipelines, since they use one specific container, i.e. with one specific diamond version. Thus only certain db versions could be used with the pipeline, and one would need to update the pipeline immediately after each new db release, if necessary. (And then it would not work with older db versions anymore.)
To find out whether a database version is compatible with the current pipeline version, one would first need to download 180 GB.
Accessibility of older database versions: are older versions accessible somewhere? I know the databases are quite large, but for the sake of reproducibility it would be good if older database versions also remained accessible. Furthermore, if they do not, older pipeline versions would no longer work either.
I just thought I'd double-check whether anything is planned for the future or whether we missed something here.
Thanks in advance!
Best, Sabrina