Seems VariantsToTable not properly handle AD greater than 100 #6115
Comments
Huh. Something dumb is happening here. Definitely a bug on our end.
@bshifaw I couldn't recreate this with the latest version or 4.1.2.0 when I made a VCF with that line in it. Can you please attach the actual input VCF that had the problem?
The files are in the user-liaison channel. I can take care of this bug.
I am the OP from the GATK forum. This bug affects only SNPs with a three-digit ALT AD value, where the two AD values of a SNP are separated by a comma. If VariantsToTable, LibreOffice, or Excel treats the comma as a thousands separator, the AD values would be concatenated. Is this possible?
Yes, it turns out this is not strictly a GATK bug. I opened the TSV file with Visual Studio Code and found that the comma was still there between the two AD values for the affected SNPs. So it is LibreOffice or Excel, not VariantsToTable, that treats the comma as a thousands separator. IMHO, using something other than a comma to separate the two AD values would solve the problem on the GATK end.
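For anyone hitting the same thing, one way to sidestep the spreadsheet's number guessing is to read the table as plain text and split the AD cells yourself. The following is only a minimal sketch: the function name `split_ad_columns`, the `sample1.AD` column name, and the example rows are made up for illustration and are not taken from the user's data. It assumes AD columns end in `.AD`.

```python
import csv
import io


def split_ad_columns(tsv_text, ad_suffix=".AD"):
    """Parse a tab-separated table and split comma-joined AD cells into lists.

    Every cell is read as text, so a value such as "12,345" stays two
    depths (12 and 345) instead of being read as the number 12345.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        parsed = dict(row)
        for name, value in row.items():
            # Only touch non-empty cells in columns that look like AD columns.
            if name and name.endswith(ad_suffix) and value:
                # Missing depths (".") are kept as None rather than coerced.
                parsed[name] = [
                    None if v == "." else int(v) for v in value.split(",")
                ]
        rows.append(parsed)
    return rows


# Hypothetical table contents (assumed layout, not the user's file).
example = "CHROM\tPOS\tsample1.AD\nchr1\t1000\t12,345\nchr1\t2000\t7,9\n"

for row in split_ad_columns(example):
    print(row["POS"], row["sample1.AD"])
```

In LibreOffice or Excel, the equivalent is to import the AD columns as text rather than letting the program detect the type automatically.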
In the VCF 4.3 spec (http://samtools.github.io/hts-specs/VCFv4.3.pdf) AD is now a reserved key for the FORMAT field giving a list of values with length equal to the number of alleles including the reference. Given that this is included in the spec now, we can't change the delimiter while still using the AD key. You may find some of the changes I introduced in #5697 to be helpful. If you split multi-allelics (
The user provided input files, which I tested: one of the AD values did get concatenated, but not all AD values greater than 100 were.
User post
I am trying to extract info from a vcf file using the following command and encountered a problem: