Use of CAT in nextflow nf-core pipeline #54

Open
skrakau opened this issue Mar 11, 2021 · 3 comments

@skrakau

skrakau commented Mar 11, 2021

Hi,
we would like to use, or more precisely continue using, CAT in the nf-core mag pipeline. However, we are currently facing a few issues.

  1. If I understand correctly, one still needs to use the DIAMOND version that was used to create the database, as already discussed in Database was built with a different version of Diamond and is incompatible #45. Is there any plan to change this in future releases?
    This is a problem for Nextflow pipelines, since they use one specific container, i.e. one specific DIAMOND version. Thus only certain db versions can be used with a given pipeline version, and one would need to update the pipeline immediately after each new db release, if necessary. (And then it would no longer work with older db versions.)

  2. To find out whether a database version is compatible with the current pipeline version, one first needs to download 180 GB.

  3. Accessibility of older database versions: are older versions accessible somewhere? I know the databases are quite large, but for the sake of reproducibility, it would be good if older database versions also remained accessible. Otherwise, older pipeline versions would stop working as well.

I just thought I would double-check whether anything is planned for the future, or whether we missed something here.

Thanks in advance!
Best, Sabrina

@bastiaanvonmeijenfeldt
Collaborator

bastiaanvonmeijenfeldt commented Mar 16, 2021

Hi @skrakau,

Thanks, it's great that you use CAT in the nf-core mag pipeline! :)

We don't have a long-term solution for this, but as said before, it's on our list.

This is where things stand: we currently cannot make a DIAMOND database that is compatible with every version of DIAMOND. Some versions of DIAMOND work with some DB formats, so it doesn't have to be a one-to-one match, but some don't. What we have been doing lately instead is to supply, in the preconstructed DB folder, the DIAMOND binary with which the database files were made. It's in a folder called DIAMOND_X.X.X. A user (or pipeline) can manually set --path_to_diamond to this binary, and everything should run as intended.
If that helps, I could make this binary the default DIAMOND whenever it exists and no --path_to_diamond is supplied. The only thing you would then have to do is download the latest DB folder, and all should run smoothly.
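
For illustration, the manual workaround described above could be sketched in Python roughly as follows. This is a minimal sketch, not CAT's or nf-core/mag's actual code: the helper names `find_bundled_diamond` and `build_cat_command` are hypothetical, and it assumes the binary is a file named `diamond` directly inside the `DIAMOND_X.X.X` folder, which this thread does not confirm.

```python
from pathlib import Path
from typing import List, Optional


def find_bundled_diamond(db_root: Path) -> Optional[Path]:
    """Return a DIAMOND binary shipped inside db_root, or None if there is none.

    Assumption: the binary is a file called "diamond" inside a folder whose
    name starts with "DIAMOND_" (e.g. DIAMOND_X.X.X).
    """
    for folder in sorted(db_root.glob("DIAMOND_*")):
        candidate = folder / "diamond"
        if candidate.is_file():
            return candidate
    return None


def build_cat_command(contigs: str, database_dir: str, taxonomy_dir: str,
                      diamond: Optional[Path] = None) -> List[str]:
    """Assemble a `CAT contigs` call, adding --path_to_diamond only when given."""
    cmd = ["CAT", "contigs", "-c", contigs, "-d", database_dir, "-t", taxonomy_dir]
    if diamond is not None:
        cmd += ["--path_to_diamond", str(diamond)]
    return cmd
```

A caller would pass the resulting list to `subprocess.run`. When no bundled binary is found, the command is left without `--path_to_diamond`, so whatever DIAMOND is on the `PATH` (for example the one in the pipeline container) would be used.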

Alternatively, if that doesn't work for you because you want to supply your own version of DIAMOND, I can put the version in the name of the tarball or put an extra text file on the server with the version number. Let me know your thoughts on this!

Regarding older DB versions: we do have the old databases on our own system, so a user can always ask for them, but currently I only put the latest one on our download server (the storage there is limited). We also don't want to keep TBs of data, so we're thinking about a more robust solution that saves a single DB per timeframe...

Best wishes,

Bastiaan

@skrakau
Author

skrakau commented Mar 23, 2021

Hi @bastiaanvonmeijenfeldt, thanks a lot for your answer and the updates!

Yes, having the version in the tarball name or in an extra file would help; then one could see directly whether the database is still compatible.
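
Such a check, done before downloading anything, might look roughly like the sketch below. The naming scheme is only a proposal in this thread, so the pattern `..._DIAMOND_X.Y.Z.tar.gz` is a hypothetical convention, as are the function names. The sketch also assumes that the local DIAMOND reports itself with a line of the form `diamond version X.Y.Z`.

```python
import re
from typing import Optional


def version_from_tarball_name(name: str) -> Optional[str]:
    """Extract X.Y.Z from a (hypothetical) name such as '..._DIAMOND_X.Y.Z.tar.gz'."""
    match = re.search(r"DIAMOND_(\d+\.\d+\.\d+)", name)
    return match.group(1) if match else None


def version_from_diamond_output(output: str) -> Optional[str]:
    """Extract X.Y.Z from text assumed to look like 'diamond version X.Y.Z'."""
    match = re.search(r"version\s+(\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None


def versions_match(tarball_name: str, diamond_output: str) -> bool:
    """True only if both versions can be parsed and are identical."""
    db_version = version_from_tarball_name(tarball_name)
    local_version = version_from_diamond_output(diamond_output)
    return db_version is not None and db_version == local_version
```

Requiring an exact match is deliberately conservative: as noted above, some DIAMOND versions do work with DB formats built by other versions, so a mismatch here means "not guaranteed", not "certainly incompatible".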

I will try to keep myself up to date. Thanks again!

@gailrosen

Hi, I am having this problem (nf-core/mag#188). Even if I set --path_to_diamond, it does not override the DIAMOND that is included in Nextflow MAG. Can you make tbb.bio.uu.nl/bastiaan/CAT_prepare/CAT_prepare_20200304.tar.gz available on your website?

Thanks,
Gail
