Skip to content

Jc validate variants fixes issue5862 #5984

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2019

Conversation

jonathoncohen98
Copy link
Contributor

In validateVariants tool, made the default case behave so that it does the validations that can be done, and issues warning messages for the validations that cannot be done (ie, required external files)

Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start. You may need to take a look at all of the options for the user and make sure something sane is done in each of these cases:

  • Default Arguments with no reference or dbsnp provided
  • Default Arguments with reference but not dbsnp provided
  • Default Arguments with no reference but dbsnp provided
  • Default Arguments with both reference and dbsnp provided

Then I would write some tests for each of these cases. For example when you have default arguments can it still fail reference validation even if you don't have a dbsnp provided and vice versa.

@@ -348,6 +358,9 @@ private void applyValidationType(VariantContext vc, Allele reportedRefAllele, Al
if (!rsIDs.isEmpty()) {
vc.extraStrictValidation(reportedRefAllele, observedRefAllele, rsIDs);
}
//default case - do as many validations as possible given no command line args
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if rsIDs is not empty? Then it looks like you will call vc.validateAlternatieAlleles() in two places. What if rsIDs is empty the the reference isn't empty? Where does that get called in the default case?

@@ -209,6 +209,16 @@ public void onTraversalStart() {
genomeLocSortedSet = new GenomeLocSortedSet(new GenomeLocParser(seqDictionary));
}
validationTypes = calculateValidationTypesToApply(excludeTypes);

//warn user if certain requested validations cannot be done due to lack of arguments
if(dbsnp.dbsnp == null && validationTypes.contains(ValidationType.ALL))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... it seems like the conditions for these would want to be validationTypes.contains(All) || validationTypes.contains(IDs). It seems like if the user disables REF validation types but forgets to do the same for IDs then it should still warn in this case.

@codecov
Copy link

codecov bot commented Jun 4, 2019

Codecov Report

Merging #5984 into master will increase coverage by 6.783%.
The diff coverage is 80%.

@@               Coverage Diff               @@
##              master     #5984       +/-   ##
===============================================
+ Coverage     80.142%   86.924%   +6.783%     
- Complexity     31040     32741     +1701     
===============================================
  Files           2014      2014               
  Lines         151333    151367       +34     
  Branches       16612     16617        +5     
===============================================
+ Hits          121281    131575    +10294     
+ Misses         24197     13728    -10469     
- Partials        5855      6064      +209
Impacted Files Coverage Δ Complexity Δ
...tools/walkers/ValidateVariantsIntegrationTest.java 98.701% <100%> (+97.252%) 44 <4> (+42) ⬆️
...r/tools/walkers/variantutils/ValidateVariants.java 80.62% <66.667%> (-2.263%) 42 <7> (+6)
...nder/utils/runtime/StreamingProcessController.java 67.299% <0%> (-0.474%) 33% <0%> (ø)
.../walkers/vqsr/CNNScoreVariantsIntegrationTest.java 96.629% <0%> (+0.562%) 16% <0%> (+1%) ⬆️
...va/org/broadinstitute/hellbender/GATKBaseTest.java 98.214% <0%> (+1.786%) 8% <0%> (+1%) ⬆️
...python/StreamingPythonExecutorIntegrationTest.java 22.222% <0%> (+11.111%) 2% <0%> (+1%) ⬆️
...llbender/tools/GatherVcfsCloudIntegrationTest.java 36.607% <0%> (+33.929%) 9% <0%> (+7%) ⬆️
...ls/spark/BaseRecalibratorSparkIntegrationTest.java 40.541% <0%> (+38.739%) 5% <0%> (+3%) ⬆️
...tools/FixCallSetSampleOrderingIntegrationTest.java 48.81% <0%> (+47.619%) 8% <0%> (+7%) ⬆️
...plePostTraversalPythonExecutorIntegrationTest.java 100% <0%> (+57.143%) 3% <0%> (+1%) ⬆️
... and 159 more

@@ -209,6 +209,16 @@ public void onTraversalStart() {
genomeLocSortedSet = new GenomeLocSortedSet(new GenomeLocParser(seqDictionary));
}
validationTypes = calculateValidationTypesToApply(excludeTypes);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also calculateValidationTypesToApply() should be changed to create a new list of the output valiation types it works on.

Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, some very minor comments and one question then I think this will be good.

//warn user if certain requested validations cannot be done due to lack of arguments
if(dbsnp.dbsnp == null && (validationTypes.contains(ValidationType.ALL) || validationTypes.contains(ValidationType.IDS)))
{
logger.warn("IDS validation cannot be done because no DBSNP file was provided");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a line to these warning statements mentioning that other validations will still be performed.

@@ -345,9 +358,26 @@ private void applyValidationType(VariantContext vc, Allele reportedRefAllele, Al
// The workaround is to not pass an empty list.
switch( t ) {
case ALL:
if (!rsIDs.isEmpty()) {
if (hasReference() && !rsIDs.isEmpty())
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fold these into a hierarchy just for ease or reading.

spec.executeTest("test that validation occurs w/ ref, w/ dbsnp", this);
}

//proves that validationExampleRSID has more problems than just ID
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this tests proves about the tool itself in this case...

…for all possibilities

added warning log message for validations that cannot be done

in default case, do all possible validations

Edited default validation

edited for James' comments

Added test cases for all possible default cases

calculateValidationTypesToApply uses its own temporary list

edited for James' final, minor revisions
@jonathoncohen98 jonathoncohen98 force-pushed the jc_validateVariants_fixes_issue5862 branch from db5ac30 to 403dc85 Compare June 14, 2019 14:43
Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving so this can be merged

@jamesemery jamesemery merged commit 083aac8 into master Jun 17, 2019
@jonathoncohen98 jonathoncohen98 deleted the jc_validateVariants_fixes_issue5862 branch June 19, 2019 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants