-
Notifications
You must be signed in to change notification settings - Fork 603
Issue #5910: Extract engine from GenotypeGVCFs #6004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6004 +/- ##
==============================================
- Coverage 80.142% 72.222% -7.92%
+ Complexity 31040 27019 -4021
==============================================
Files 2014 2017 +3
Lines 151333 151440 +107
Branches 16612 16623 +11
==============================================
- Hits 121281 109373 -11908
- Misses 24197 36760 +12563
+ Partials 5855 5307 -548
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I would say that you should clean up the code a little bit so its not full of commented out methods/notes etc. I would also say you should write some javadoc comments for the methods in the new class that don't have them.
@@ -98,8 +93,11 @@ | |||
public static final String ALL_SITES_SHORT_NAME = "all-sites"; | |||
public static final String KEEP_COMBINED_LONG_NAME = "keep-combined-raw-annotations"; | |||
public static final String KEEP_COMBINED_SHORT_NAME = "keep-combined"; | |||
private static final String GVCF_BLOCK = "GVCFBlock"; | |||
private VCFHeader outputHeader; | |||
//move to engine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to clean up these notes to self
if (somaticInput) { | ||
logger.warn("Note that the Mutect2 reference confidence mode is in BETA -- the likelihoods model and output format are subject to change in subsequent versions."); | ||
} | ||
|
||
//leave |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also get rid of the leave comments
import htsjdk.variant.vcf.VCFHeaderLine; | ||
import htsjdk.variant.vcf.VCFStandardHeaderLines; | ||
|
||
public class GenotypeGVCFsEngine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add a javadoc comment here explaining what the engine does and generally how it should be used
|
||
|
||
//constructor | ||
public GenotypeGVCFsEngine(VariantAnnotatorEngine annotationEngine, GenotypeCalculationArgumentCollection genotypeArgs, boolean includeNonVariants) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have a propper javadoc explaining ~ what it does.
} | ||
|
||
//methods | ||
public VariantContextWriter initialize(VCFHeader inputVCFHeader, Set<VCFHeaderLine> defaultToolVCFHeaderLines, boolean keepCombined, DbsnpArgumentCollection dbsnp, VariantContextWriter vcfWriter) //do work for onTraversalStart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with this method.
* @param vc the variant context | ||
* @return variant context with the NON-REF alleles removed if multiallelic or replaced with NO-CALL alleles if biallelic | ||
*/ | ||
//move to engine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More //move comments that could go
} | ||
|
||
/* | ||
//dont move | ||
private static int parseInt(Object attribute){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this from the main class.
intervals = hasUserSuppliedIntervals() ? intervalArgumentCollection.getIntervals(getBestAvailableSequenceDictionary()) : | ||
Collections.emptyList(); | ||
|
||
final SampleList samples = new IndexedSampleList(inputVCFHeader.getGenotypeSamples()); //todo should this be getSampleNamesInOrder? | ||
//move | ||
//final SampleList samples = new IndexedSampleList(inputVCFHeader.getGenotypeSamples()); //todo should this be getSampleNamesInOrder? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the commented out code
|
||
//setupVCFWriter(inputVCFHeader, samples); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, just delete the code lines you commmented out.
//could leave error checking here, or put error checking in engine - if not null, return VariantContext. if null, return null. | ||
|
||
final VariantContext regenotypedVC = gvcfEngine.callRegion(intervals, loc, variants, ref, features, merger, somaticInput, tlodThreshold, afTolerance); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think leaving the below logic in the main tool was probably the right call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quick comment about the javadocs and this should be good to go.
@@ -25,15 +25,24 @@ | |||
import java.util.*; | |||
import java.util.stream.Collectors; | |||
|
|||
/** | |||
* the core engine for the GenotypeGVCF that does all of the actual work of the tool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core engine for the GenotypeGVCF that does all of the actual work of the tool
-> 'Engine class to allow for other classes to replicate the behavior of GenotypeGVCFs. See {@link GenotypeGVCFs} for details.
private VariantContextWriter vcfWriter; | ||
|
||
/** these are used when {@link #onlyOutputCallsStartingInIntervals) is true */ | ||
//pass in to callRegion() in engine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spurious note
Codecov Report
@@ Coverage Diff @@
## master #6004 +/- ##
==============================================
- Coverage 80.142% 72.222% -7.92%
+ Complexity 31040 27019 -4021
==============================================
Files 2014 2017 +3
Lines 151333 151440 +107
Branches 16612 16623 +11
==============================================
- Hits 121281 109373 -11908
- Misses 24197 36760 +12563
+ Partials 5855 5307 -548
|
Codecov Report
@@ Coverage Diff @@
## master #6004 +/- ##
==========================================
Coverage ? 72.222%
Complexity ? 27019
==========================================
Files ? 2017
Lines ? 151440
Branches ? 16623
==========================================
Hits ? 109373
Misses ? 36760
Partials ? 5307
|
65635a1
to
c4f084d
Compare
Extracted GenotypeGVCFs engine from GenotypeGVCFs