Add DBs needed for MAGs workflow to CVMFS #945
Comments
|
|
They need to go in here: |
checkm2: Table: checkm2
semibin: Table: gtdb
GTDB-tk: Table: gtdbtk_database_versioned
bakta: Table: amrfinderplus_versioned_database
and
ncbi taxonomy: Table: ncbi_taxonomy
gtdbtk metadata: Table: gtdbtk_database_metadata_versioned
|
CheckM2
current DB: https://zenodo.org/records/5571251/files/checkm2_database.tar.gz
SemiBin
current DB: https://zenodo.org/record/4751564/files/GTDB_v95.tar.gz
=> This data is used by MMseqs2 to build the DB that SemiBin uses. The current DM downloads an old version, which is a finished MMseqs2 build.
GTDB-Tk
GTDB-Tk Metadata
=> These are the latest versions, which are used in the DM. Older versions can also be downloaded with it.
Bakta
https://zenodo.org/records/10522951/files/db.tar.gz => Latest version, also used in the DM.
Amrfinderplus
Data will be downloaded using this command
NCBI
https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz => It has to be this link. NOTE: NCBI updated this file this morning, which means that the current DM runs with the old DB. |
nate@alchemist% cat /cvmfs/data.galaxyproject.org/byhand/location/checkm2.loc
1.0.2 1.0.2 CheckM2 diamond DB downloaded with version 1.0.2 1.0.2 /cvmfs/data.galaxyproject.org/byhand/checkm2/1.0.2/uniref100.KO.1.dmnd
nate@alchemist% grep 220 /cvmfs/data.galaxyproject.org/managed/location/gtdbtk_database_versioned.loc
full_database_release_220_downloaded_2024-10-19 Full Database - release 220 (2024-10-19) 220 /cvmfs/data.galaxyproject.org/managed/gtdbtk_database_versioned/full_database_release_220_downloaded_2024-10-19
nate@alchemist% grep ^amr /cvmfs/data.galaxyproject.org/byhand/location/amrfinderplus_versioned.loc
amrfinderplus_V3.12_2024-05-02.2 V3.12-2024-05-02.2 3.12 /cvmfs/data.galaxyproject.org/byhand/amrfinderplus-db/amrfinderplus_V3.12_2024-05-02.2
nate@alchemist% grep 5.1 /cvmfs/data.galaxyproject.org/byhand/location/bakta_database.loc
V5.1_2024-01-19 10522951 1.7 /cvmfs/data.galaxyproject.org/byhand/bakta_database/10522951
nate@alchemist% grep 2024-06-05 /cvmfs/data.galaxyproject.org/byhand/location/ncbi_taxonomy.loc
2024-06-05 2024-06-05 /cvmfs/data.galaxyproject.org/byhand/ncbi_taxonomy/2024-06-05
As for the rest, I'll work on these ASAP. |
Since one needs to know the entry of the loc files and DBs in /data.galaxyproject.org/byhand/ to write the IWC workflow, is this information somehow publicly available? |
Why is this needed? You should only need the dbkey. |
Well, how do I know that the dbkey is related to the DB I need without knowing the content of the loc file? Some DMs allow manual entry, and where do I find the dbkey if I cannot see the loc file? |
If you need one specific database, select it in the editor. If you want to have the user select it, connect a text param to a select, like in https://usegalaxy.org/u/marius/w/checkm2-example. As an author you do not need access to loc files, and unless you know an admin, you will never get it. It's not part of what we consider available to users. |
For the record, |
Thanks @natefoo, now I get it, I was looking for the loc files! |
The last two are done:
nate@alchemist% cat /cvmfs/data.galaxyproject.org/byhand/location/gtdb.loc
17102022 GTDB reference genome generated by MMseqs2 used in SemiBin gtdb /cvmfs/data.galaxyproject.org/byhand/gtdb
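For anyone who does have the loc files at hand and wants to check entries like the ones above, here is a minimal Python sketch. It assumes the usual Galaxy convention that .loc files are tab-separated with `#` comment lines (the tabs appear as spaces in the pasted output). The column names `value`, `name`, `path` and the helper `parse_loc` are illustrative assumptions, not taken from this thread; the real column order comes from the matching tool data table definition.

```python
def parse_loc(text, columns):
    """Parse tab-separated .loc content into a list of dicts.

    `columns` is the assumed column order for the table; it is supplied
    by the caller because the .loc file itself does not name its columns.
    """
    entries = []
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(columns):
            raise ValueError(
                f"expected {len(columns)} fields, got {len(fields)}: {line!r}"
            )
        entries.append(dict(zip(columns, fields)))
    return entries


# Sample content modelled on the ncbi_taxonomy.loc entry shown earlier
# (tabs restored by hand; the column names are an assumption).
SAMPLE = (
    "# ncbi_taxonomy.loc\n"
    "2024-06-05\t2024-06-05\t"
    "/cvmfs/data.galaxyproject.org/byhand/ncbi_taxonomy/2024-06-05\n"
)

entries = parse_loc(SAMPLE, ["value", "name", "path"])
for entry in entries:
    print(entry["value"], "->", entry["path"])
```

Passing the column list explicitly keeps the sketch usable for the other tables in this issue, which may have a different number of columns.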
|
thanks @natefoo ! |
We want to add a MAGs workflow to IWC and it needs some DBs in CVMFS. The DBs are available on .eu.
They can all be installed via DMs. Can we provide a list of what should be copied to CVMFS?
What exactly do you need for that: the DMs with parameters, or the .loc entries on .eu?