Skip to content

Close a FeatureReader after use #9078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 15, 2025
Merged

Conversation

lbergelson
Copy link
Member

I missed this when it was merged, but we should really close out this FeatureReader.

Copy link
Member

@KevinCLydon KevinCLydon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@lbergelson lbergelson merged commit 373df45 into master Jan 15, 2025
20 checks passed
@lbergelson lbergelson deleted the lb_miss_try_with_resources branch January 15, 2025 22:23
gbggrant added a commit that referenced this pull request Apr 23, 2025
* Funcotator Update for Datasource Release V1.8 (#8512)

* Fixed Funcotator VCF output renderer to correctly preserve B37 contig names on output for B37 aligned files (#8539)

* Fix for events not in minimal representation (#8567)

* Ultima.flow annotations.fix (#8442)

* hmer ondel must have mon length

* Revert "hmer ondel must have mon length"

This reverts commit 7852871.

* remove superfluous variant type condition

* fix error message to actually reflect missing argument

* fixed unittest to include variant type

* Remove conflict

* Removes unnecessary and buggy validation check (#8580)

* Additional fix + logging fixes
* Added missing initialization

* Update our HTSJDK dependency to 4.0.2 (#8584)

* Update picard to 3.1.1 (#8585)

* Add option to AnalyzeSaturationMutagenesis to keep disjoint mates (#8557)

* Add option for keeping disjoint mates in ASM

* Better name and fixing reports

* Finish fixing report

* Fix report name

* New/Updated Flow Based Read tools (#8579)

New Tool: GroundTruthScorer
Update: FlowFeatureMapper

* GroundTruthScorer doc update (#8597)

* Add a native GATK implementation for 2bit references, and remove the dependency on the ADAM library (#8606)

* Add a native GATK implementation for 2bit references, with comprehensive unit tests

* For now, this is only hooked up to the Spark codepath, but it could easily be hooked up to ReferenceDataSource and the Walker codepath as well

* Remove the dependency on the ADAM library, to resolve conflicts with future dependency upgrades

* Update dependencies to address security vulnerabilities, and add a security scanner to build.gradle (#8607)

* Updated many GATK dependencies to address known security vulnerabilities

* Added a security scanner to build.gradle

* There are still some remaining vulnerabilities in GATK dependencies, but this eliminates most of them

* Update http-nio and wire its new settings (#8611)

* Update http-nio and wire it so it's configured at startup along with GCS setttings.

* PrintFileDiagnostics for cram, crai and bai. (#8577)

* New experimental tool to print out human readable file diagnostics for cram/crai/bai files.

* Allow GenomicsDBImport to connect to az:// files without interference (#8438)

* GATK's lack of support for az:// URIs means that although GenomicsDB can
  natively read them, parts of the java code crash when interacting with them
* Adding --avoid-nio and --header arguments
  These allow disabling all of the java interaction with the az:// links
  and simply passing  them through to genomicsdb
  This disables some safeguards but allows operating on files in azur
* Update GenomicsDB version to 1.5.1 for azure improved support

* There are no direct tests on azure since we do not yet have any infrastructure
  to generate the necessary tokens, there is a disabled test which requires
  #8612 before we can enable it.

---------

Co-authored-by: Nalini Ganapati <[email protected]>
Co-authored-by: Nalini Ganapati <[email protected]>

* Disable line-by-line codecov comments (#8613)

* Support for custom ploidy regions in HaplotypeCaller (#8609)

For having variable ploidy in different regions, like making haploid calls outside the PAR on chrX or chrY, 
there is now a --ploidy-regions flag. The -ploidy flag sets the default ploidy to use everywhere, and --ploidy-regions
should be a .bed or .interval_list with "name" column containing the desired ploidy to use in that region
when genotyping. Note that variants near the boundary may not have the matching ploidy since the ploidy used will be determined using the following precedence:

* ploidy given in --ploidy-regions for all intervals overlapping the active region when calling your variant
  with ties broken by using largest ploidy); note ploidy interval may only overlap the active region and determine 
  the ploidy of your variant even if the end coordinate written for your variant lies outside the given region
* ploidy given via global -ploidy flag
* ploidy determined by the default global built-in constant for humans (2).

---------

Co-authored-by: Ty Kay <[email protected]>
Co-authored-by: rickymagner <[email protected]>

* Update the GATK base image to a newer LTS ubuntu release (#8610)

* Update the GATK base image to the latest Ubuntu LTS release (22.04)

* Add some additional useful utilities to the base image

* Switch to a newer conda version with a much faster solver

* Update the scripts and documentation for building the base image

* Update the VETS integration tests to allow for a small epsilon during numeric comparisons, and include the full diff output in exception messages when a mismatch is detected

* build_docker_remote: add ability to specify the RELEASE arg to the cloud-based docker build, and add a release script (#8247)

* Added a -r argument to build_docker_remote.sh to toggle the RELEASE flag during
  docker builds

* Added a release_prebuilt_docker_image.sh to release a prebuilt docker image to the
  official repos

* Update to htsjdk 4.1.0 (#8620)

* update to htsjdk 4.1.0 which enables http-nio in more cases
* remove several test cases handling genomicsdb path parsing which were testing nonsensical paths that are now illegal

* Fix the Spark version in the GATK jar manifest, and used the right constant in build.gradle (#8625)

* Update http-nio to 1.1.0 which implements Path.resolve() methods (#8626)

* This should make http access seamless in many places

* The way this handles query parameters is not ideal for signed url cases so we'll need to revisit that

* Fix GT header in PostprocessGermlineCNVCalls's --output-genotyped-intervals output (#8621)

* Write gCNV interval output ID=GT header as Type=String

Incorrectly writing this as Type=Integer causes bcftools to misparse
the genotype field.

* Use correct header types and numbers in test VCF file

* Output the new image name at the end of a successful cloud docker build (#8627)

* Reduce SVConcordance memory footprint (#8623)

* Rewrite complex SV functional annotation in SVAnnotate (#8516)

* Improvements to Mutect2's Permutect training data mode (#8663)

* include normal seq error log likelihood in Permutect dataset

* handle different alelle representations in multiallelic / indel variants for Permutect training data mode

* set the default artifact to non-artifact ratio to 1 in Permutect training data mode

* normal artifact lod is now defined without the extra minus sign (#8668)

* Parameterize the logging frequency for ProgressLogger in GatherVcfsCloud. (#8662)

* Handle CTX_INV subtype in SVAnnotate (#8693)

* Move to GenomicsDB 1.5.2 which supports M1 macs (#8710)

* Support for MacOS universal builds (intel AND M1)
* Catch JNI importer exceptions and propagate them as java IOExceptions
* Turn off HDFS support by default

Co-authored-by: Nalini Ganapati <[email protected]>

* Standardize test results directory between normal/docker tests (#8718)

Normalize the test results location between docker and non docker tests

* fix no data hom refs (#8715)

Output GQ0 genotypes from reference blocks as no-call rather than hom-ref

* Update the setup_cloud github action (#8651)

* update setup-gcloud@v0 -> v2 since v0 is deprecated

* Improve failure message in VariantContextTestUtils (#8725)

* added --inverted-read-filter argument to allow for selecting reads that fail read filters from the command line easily (#8724)

* GatherVcfsCloud is no longer beta (#8680)

* Fix to long deletions that overhang into the assembly window causing exceptions in HaplotypeCaller (#8731)

* dragstr model in Mutect2 WDL (#8716)

* Update README to include list of popular software included in docker image (#8745)

* Update README to include list of popular software included in docker image

* Now we should be excluding the test folder from code coverage calculations (#8744)

* Make M2 haplotype and clustered events filters smarter about germline events (#8717)

* M2 bad haplotype filter does not filter variants that share a haplotype with a germline event

* two ECNT annotations -- haplotype and region -- and clustered events filter looks at both

* Funcotator: suppress a log message about b37 contigs when not doing b37/hg19 conversion (#8758)

Don't print the very long and misleading "The following contigs are present in b37 and
missing in the input VCF sequence dictionary" log message when we're not even doing b37/hg19
conversion.

* SNVQ recalibration tool added for flow based reads (#8697)

Co-authored-by: Dror Kessler <[email protected]>

* Several GQ0 cleanup changes: (#8741)

Set GGVCFs --all-sites GQ0 hom-refs to no-calls
Set regular GGVCFs GQ0 hom-refs to no-calls (any DP, PL) for better AF/AN annotations
Remove PLs in "no data" case where DP=0 for more accurate QUAL score

* Re-commit large files as lfs stubs (#8769)

Several files tracked by git lfs were accidentally reimported as normal files.
This makes them stubs again.

* Enable ReblockGVCF to subset AS annotations that aren't "raw" (pipe-delimited) (#8771)

* Enable ReblockGVCF to subset AS annotations that aren't "raw" (i.e. pipe-delimited)

* Fix tests by removing AssemblyComplexity from default annotations

* Gc getpipeupsummaries use mappingqualityreadfilter (#8781)

* Add MappingQualityReadFilter

* Added additional warnings for mmq

* Fixed doc typo

* Add malaria spanning deletion exception regression test with fix (#8802)

* Add malaria spanning deletion exception regression test with fix

* Disabling codecov.

---------

Co-authored-by: Jonn Smith <[email protected]>

* Bug fix in flow allele filtering  (#8775)

* Fixed a bug that prevented filtering by SOR in many cases

* Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB (#8759)

* Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB

* Move to 1.5.3 release from snapshot

---------

Co-authored-by: Nalini Ganapati <[email protected]>
Co-authored-by: Nalini Ganapati <[email protected]>

* Reduced docker layers in GATK image from 44 to 16 (#8808)

* Reduced total layers in the GATK docker image from 44 down to 16.

* Reduced GATK base image layers from 20 to 3.

* This might be a better solution than a full squash down to a single layer, because: 

If we are hosting this in a premium ACR, the limit is 10,000 readOps per minute. So with 16 layers, you get around 625 pulls per minute. Also, this will be able to still take advantage of parallel pulls (default is 3, but at most 16 threads in this case, I believe) as opposed to one big layer which will not download in parallel. There's the potential of that being a lot slower and subsequent jobs falling into the same "minute" because others are not done, making it easier to hit that 10k readOps limit. Lastly, people using GATK outside data pipelines will not be able to take advantage of layer caching too.

Resolves #8684

* VariantFiltration: added arg to write custom mask filter description in VCF header (#8831)

Added a --mask-description argument to VariantFiltration to write a custom description for the mask filter in the VCF header

* Bigger Permutect tensors and Permutect test datasets can be annotated with truth VCF (#8836)

* added 20 more Permutect read features

* Permutect test data can, like training data, be annotated with a truth VCF

* [BIOIN-1570] Fixed edge case in variant annotation  (#8810)

* [BIOIN-1570] Fixed edge case in variant annotation when the variant is close to the edge of the reference

* Mutect2 germline resource can have split multiallelic format (#8837)

* Mutect2 WDL and GetSampleName can handle multiple sample names in BAM headers (#8859)

* Permutect dataset engine outputs contig and read group indices, not names (#8860)

* Fixed a bug in AlleleFiltering that ignored more than a single sample (#8841)

* Fixing bug in ReblockGVCFs when removing annotations (#8870)

* fix for gnarly when PLs are null (#8878)

* Fix GenotypeGVCFs with mixed ploidy sites (#8862)

* Remove deprecated genomes in the cloud docker image that was causing CNN WDL test failures (#8891)

Replace usage of a deprecated Genomes-in-the-Cloud docker image that was causing the CNN WDL tests to fail with the GATK base image.

* Restore gnarly tests (#8893)

* Update no-calls in Gnarly haploid test
* Put back turned off tests

* Remove header lines in ReblockGVCFs when we remove FORMAT annotations (#8895)

* remove header lines when removing annotation in Reblocking

* clean up

* Update several dependencies to fix vulnerabilities (#8898)

* Update http-nio to 1.1.1 (#8889)

This release fixes several bugs when using HttpPath resolve methods that have been affecting users. In particular, they were causing issues when trying to construct HTTP URIs for companion files such as fasta/bam indices.

* Inverted SoftClippedReadFilter to conform to the standard filtering logic (#8888)

* Tool to detect CRAM base corruption caused by GATK issue 8768 (#8819)

A new diagnostic tool, CRAMIssue8768Detector, that analyzes a CRAM file to look for possible base corruption caused by #8768

This issue affects GATK versions 4.3.0.0 through 4.5.0.0, and is fixed in GATK 4.6.0.0.
This issue also affects Picard versions 2.27.3 through 3.1.1, and is fixed in Picard 3.2.0.

The bug is triggered when writing a CRAM file using one of the affected GATK/Picard versions,
and both of the following conditions are met:

    * At least one read is mapped to the very first base of a reference contig
    * The file contains more than one CRAM container (10,000 reads) with reads mapped to that same reference contig

When both of these conditions are met, the resulting CRAM file may have corrupt containers containing reads with an incorrect sequence.

This tool writes a report to an output text file indicating whether the CRAM file appears to have read base corruption caused by issue 8768, and listing the affected containers. By default, the output report will have a summary of the average mismatch rate for all suspected bad containers and a few presumed good containers in order to determine if there is a large difference in the base mismatch rate.

Optionally, a TSV file with the same information as the textual report, but in tabular form, can be written using the "--output-tsv" argument.

---------

Co-authored-by: David Roazen <[email protected]>

* Update HTSJDK to 4.1.1 and Picard to 3.2.0 (#8900)

Includes a unit test to check for the presence of the fix in HTSJDK 4.1.1 for the
CRAM base corruption bug reported in #8768

Resolves #8768

* Handle CTX_PP/QQ and CTX_PQ/QP CPX_TYPE values in SVConcordance (#8885)

* Updated Python and PyMC, removed TensorFlow, and added PyTorch in conda environment. (#8561)

* Updated Python and PyMC, removed TensorFlow, and added PyTorch in Docker and conda environments. Notable environment changes: python 3.6.10 -> 3.10.13, pymc 3.1 -> 5.10.0, theano 1.0.4 -> pytensor 2.18.1, added pytorch 2.1.0, removed tensorflow 1.15.0 and other CNN dependencies, added libblas-dev to the base Docker.

* Updated gCNV code to account for changes from PyMC3/Theano to PyMC/PyTensor.

* Updated gCNV integration tests.

* Updated gCNV WDL tests.

* Updated other tests and tools affected by environment changes.

* Reverted posterior sampling to online implementation.

* Updated localDevCondaEnv task in build.gradle.

* Addressed review comments and cleaned up TODOs.

* Released gatkbase-3.3.0 and updated Dockerfile.

* Added DeprecatedFeature tags to CNNScoreVariants, CNNVariantTrain, and CNNVariantWriteTensors.

* Complex SV intervals support (#8521)

* Add support for CPX variants in SVCluster and SVConcordance

* Update hdf5-java-bindings to version 1.2.0-hdf5_2.11.0, which removes log4j 1.x (#8908)

* Clarify in the README which git lfs files are required to build GATK (#8914)

Resolves #8912

* Modified HaplotypeBasedVariantRecaller to support non-flow reads (#8896)

* X_FILTERED_COUNT semantics adjusted in FlowFeatureMapper (#8894)

* adding test to match WARP tests edge case (#8928)

* Adds VcfComparator tool (#8933)

* Systematically compare the outputs of two pipeline versions, especially
post-reblocking when many deletion "cleanup" changes occur

* Subset annotations to account for different alleles

* Refactor and clean up

* Check ref blocks

* Be explicit about AS_RAW annotation differences

* Update changed star attributes from NaNs to missing; add option to
ignore

* Ignoring star attributes still needs work

* Lookin' pretty good...

* Add new mode to log all differences before failing

* Misplaced brace prevented PL checks

* Check on */* genotypes was failing DRAGEN outputs -- removed

* clean up after rebase

* adding test

* style

* not sure why we need to cap values for the comparator so removing it

* addressing comments

* more comments

---------

Co-authored-by: Laura Gauthier <[email protected]>

* Add docs about citing GATK (#8947)

* Issue #7159 Create tool for producing genomic regions (as a BED file) (#8942)

* Initial commit and basic code to read gtf

* add: code to write to bed & integration test

* fix: make getAllFeatures public and use the nesting of features to get to transcripts

* add: filtering transcripts by basic tag

* add: sorts by contig and start (need to fix - sorting lexicographically)

* fix: now sorts by contig then start & output is correct

* fix: make dictionary an arg

* add: comments + simplified CompareGtfInfo

* refactor: apply method
test: add separate tests for gene and transcript

* refactor: onTraversalSuccess and writeToBed

* add: more tests

* fix: test files in correct dir pt1. (files are too large)

* fix: test files in correct dir pt2.

* add: compareFiles and ground truth bed files

* fix: runGtfToBed assert

* add: comments to GtfToBed

* fix: error handling for different versions of gtf and dictionary

* fix: edited some bad conventions

* fix: remove spaces from input file fullName

* add: gtf file with MYT1L and MAPK1

* add: many transcripts unit test and refactoring

* add: tiebreaker sorting by id

* add: make sort by basic optional

* add: html doc comment

* fix: dictionary arg

* fix: add "Gencode" to description

* add: sample mouse gencode testing

* fix: Remove arg shortnames

* fix: rename and move CompareGtfInfo

* fix: kebab-case args

* fix: update html doc

* fix: use IntegrationTestSpec.assertEqualTextFiles()

* fix: remove unnecessary test of pik3ca

* fix: remove set functions in GtfInfo

* fix: style of comparator

* fix: style of comparator

* fix: use Files.newOutputStream() to write and logger for errors

* fix: use getBestAvailableSequenceDictionary()

* fix: use dataProvider for integration tests

* fix: better encapsulation

* fix: move mapk1.gtf to large dir

* fix: arg names

* fix: rename reference dict.

* fix: sequence-dictionary arg javadoc

* add: javadoc to GtfInfo

* add: dictionary exception and corresponding test

* add: test with fasta file as reference arg

* add: javadoc for fasta file

* fix: javadoc and onTraversalStart exception

* Update the large test CRAM files to CRAM v3.0. (#8832)

* Update detector output files. (#8971)

Repair the CRAM detector/diagnostics test output to reflect the CRAM file name change that was introduced by updating the large CRAM files to v3.0.

* Require both overlap and breakend proximity for depth-only SV clustering (#8962)

* Adds QD and AS_QD emission from VariantAnnotator on GVS input (#8978)

* Updating VariantAnnotator to emit QD and AS_QD on GVS input

* clean up

* addressing comments

* Added new argument "--variant-output-filtering" to variant walkers for customized filtering of output variants to intervals. (#6388)

Co-authored-by: James <[email protected]>

* Swapped mito mode in Mutect to use the mode argument utils, providing better override behavior and documentation (#8986)

* Warn instead of throwing when querying intervals that were not in GenomicsDBImport (#8987)

* Switch to logging a warning instead of an exception for intervals in a query that were not part of GenomicsDBImport
* upgrade GenomicsDB 1.5.3 -> 1.5.4
* fixes #8415

* Updates to VcfComparator (#8973)

* updates

changes for gq0 comparison

more changes

some vcfs have no gqs

clean up

change for NPE

clean up

whitespace

* addressing comments

* Update Mutect2.java Documentation (#8999)

This commit changes the default --mitochondria-mode parameters to what they are supposed to be in the source code based on the raised question in the forum 

https://gatk.broadinstitute.org/hc/en-us/community/posts/29969305493147-Incorrect-documentation-for-mitochondria-mode-in-Mutect2

This commit also fixes the mitochondria mode flag to its correct naming in the documentation.

* Add dependency submission workflow to monitor vulnerabilities (#9002)

* Adds a new github actions workflow to submit our dependencies for analysis

* Add more detailed conda setup instructions to the GATK README (#9001)

Our README was missing some important details on how to properly setup the GATK conda environment locally.

* Port of nvscorevariants into GATK, with a basic tool frontend (#8004)

* Port of the NVIDIA-authored nvscorevariants tool into GATK, with a basic tool frontend

* This is a direct replacement for the legacy tool CNNScoreVariants. It produces results that are almost identical to that tool, but is implemented on top of a more modern ML library, Pytorch.

* The Python code is taken from https://github.com/NVIDIA-Genomics-Research/nvscorevariants, with a few minor modifications necessary to get the tool working on newer versions of the Python libraries.

* Added pytorch-lightning to the GATK conda environment, as it's required by this tool

* Disabled jacoco in build.gradle, as it was causing strange errors related to jacoco trying to parse the new Pytorch model files in the resources directory

---------

Co-authored-by: Louis Bergelson <[email protected]>

* Re-added `--only-output-calls-starting-in-intervals` as a deprecated argument. (#9000)

* Adding small warning messages to not to feed any GVCF files to these tools (#9008)

* Adding small warning messages to not to feed any GVCF files to these tools.

* Update src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibrator.java

Co-authored-by: Louis Bergelson <[email protected]>

* Update src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator.java

Co-authored-by: Louis Bergelson <[email protected]>

* Update src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator.java

Co-authored-by: Louis Bergelson <[email protected]>

---------

Co-authored-by: Louis Bergelson <[email protected]>

* Update docker base (#9005)

* Rebuild base image to incorporate recent patches
* gatkbase-3.3.0 -> 3.3.1

* Added a check for whether files can be created and executed within the configured tmp-dir (#8951)

* Update gradle and build.gradle (#8998)

* Update gradle and build.gradle

* update gradle wrapper 8.2.1 ->  8.10.2
* remove 'versions' plugin because we don't use it
* update gradle plugins to new versions
  * shadow plugin changed maintainers and coordinates
	com.github.johnrengelman.shadow:8.1.1 -> com.gradleup.shadow:8.3.3
  * git-version 0.5.1 -> 3.1.0
  * sonatype scan 2.6.1 -> 2.8.3
  * download 5.4.0 -> 5.6.0
* use tasks.register() which is the newer style
* add gradle.properties to configure daemon memory

* update several vulnerable dependencies to safer versions
* update htsjdk and picard
* replace deprecated override of ThresholdStream.getStream() -> getOutputStream()

* Remove CNNScoreVariants, CNNVariantTrain, and CNNVariantWriteTensors (#9009)

Removed the code for these tools, now that NVScoreVariants is merged
and we've moved to a Python environment that is incompatible with these
legacy tools.

Also added the tools to the DeprecatedToolsRegistry.

* Mark NVScoreVariants as a beta feature (#9010)

* Additional Dependency updates (#9006)

* Updating dependency management and vulnerable dependencies

* Update dependencies to fix vulnerabilities as reported in #8950
* Update our dependency management to make use of some newish gradle features
  * Add dependency constraints to update transitive dependencies, this allows us to specify versions without making them
    direct dependencies
  * Remove most force expressions and replace them where necessary with version strict requirements
  * Make use of several published bom's to configure consistent dependency versions for platforms like netty and log4j2
  * Remove exclude statements that are now handled by variant dependency resolution (like guava android vs jdk)
* Exclude the org.bouncycastle:bcprov-jdk15on dependency and replace it with bcprov-jdk18onA
  This adds an unnecessary testUtilImplementation level dependency on what is really a transitive, but I couldn't get gradle's explicit version replacement to work.
  replacement logic to work so this is a workaround

* Added a '--prefer-mane-transcripts' mode that enforces MANE_Select tagged Gencode transcripts where possible (#9012)

* Added a '--prefer-mane-transcripts' mode that enforces MANE_Select tagged Gencode transcripts where possible

* Use jetty bom to enforce uniform jetty versions (#9016)

* fix dependabot alert for jetty-xml
* bom version -> 9.4.56.v2024082

* Change the kryo version specification to use maven style [,) instead of + (#9018)

* maven central doesn't like + in version ranges and gradle doesn't translate it

* Port of `CallableLoci` from GATK3 (#9031)

* This is a port of a tool from GATK which is being used in some active pipelines.  
* CallableLoci is a a tool which calculates coverage based stats and identifies regions of the genome which should be "callable" or not based on several filters.

* Retain all source IDs on VariantContext (#9032)

* Retain all source IDs on VariantContext merge

In GATK3, when merging variants, the IDs of all the source VCFs were retained. This code path seems like it intended that, since the variantSources set is generated, but it doesnt get used for anything. This PR will use that set to set the source of the resulting merged VC.

* Add overloaded method to GATKVariantContextUtils.simpleMerge to preserve existing behavior when storeAllVcfSources=false


Co-authored-by: Louis Bergelson <[email protected]>

* Add SVStratify and GroupedSVCluster tools (#8990)

* VariantRecalibrator R script fix for recent R versions. (#9046)

* change space=rgb to space=lab

* change space=rgb to space=lab

* VQSR Rscript fix

* Update README.md (#9047)

* Update README to match our R and Python requirements. 
* Update brew installation instructions.


Co-authored-by: Louis Bergelson <[email protected]>

* Prioritize het calls when merging clustered SVs (#9058)

* Updating upload_artifact in github actions (#9061)

Update upload_artifact from 3 -> 4

* Close a FeatureReader after use in HaplotypeCallerEngine (#9078)

* Bump ch.qos.logback:logback-core from 1.4.14 to 1.5.13 (#9079)

* Bump ch.qos.logback:logback-core from 1.4.14 to 1.5.13

Bumps [ch.qos.logback:logback-core](https://github.com/qos-ch/logback) from 1.4.14 to 1.5.13.
- [Commits](qos-ch/logback@v_1.4.14...v_1.5.13)

---
updated-dependencies:
- dependency-name: ch.qos.logback:logback-core
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* also update logback-classic to keep it in sync

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Louis Bergelson <[email protected]>

* Remove unecessary calculation and methods in FlowBasedRead (#9077)

* remove the field FlowBaseRead.forwardSequence was calculated at non-trivial cost and then only used for it's length
* removed the method seqlength because it was a more convoluted of getLength
* fixed some typos and finals because I couldn't help myself

* Lots of little changes to Permutect dataset generation (#9094)

* alt downsampling is drawn uniformly at random, not matching artifact distribution
* fixed the variant distance from end of read to account for clipping
* feature range only goes up to 20 since reads can be clipped that small
* took out total read mismatches in favor of ranged mismatches
* removed STR INFO features now computed in Python; added TLOD / alt count INFO feature
* renamed mutect3 and m3 to permutect everywhere
* M2 can create test and training datasets at the same time (also in WDL), with distinct truth VCFs
* masked intervals, option to skip filltering in M2 WDL
* argument for considering only a fraction of germline sites active i.e. partial genotypeGermlineSites mode

* Bring in some performance improvements from genomicsdb (#9059)

* Bring in some performance improvements from genomicsdb which fix a problematic memory leak
* Move to 1.5.5 version of genomicsdb

* BQSR: avoid throwing an error when read group is missing in the recal table, and some refactoring. (#9020)

* Addressed an edge case in ScoreVariantAnnotations that can occur when one variant type is not present in the input VCF. (#9112)

* Update netty version (#9120)

* netty 4.1.114.Final -> 4.1.118.Final
* includes security patches

* Uncomment line excluding bad version of bouncycastle library (#9129)

* This seems to have been accidentally commented out for testing and never un-commented

* Exclude logback-classic to fix logger configuration warning (#9128)

* Updated references to the funcotator datasets bucket to point to the new google bucket (#9131)

* Update gradle sonatype plugin (#9133)

* 2.8.3 -> 3.0.0

* Bump org.apache.commons:commons-vfs2 from 2.9.0 to 2.10.0 (#9130)

* Bump org.apache.commons:commons-vfs2 from 2.9.0 to 2.10.0
* Remove use of newly deprecated methods in test code: 
 RandomStringUtils.randomAlphanumeric() -> RandomStringUtils.insecure().nextAlphanumeric()

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Louis Bergelson <[email protected]>

* Update to org.apache.commons.lang3 from org.apache.commons.lang

* Fixed the the ExtractCohortToVcfTest and ExtractCohortToPgenTest

* Do not put QD into extracted VCFs

* Added the --validation-type-to-exclude ALLELES flag to ValidateVariants to get around issue where GTs with GQ=0 now being no-called in extracted VCF and thus causing some alt alleles to not be present

* Update the VDS Creation to transmute LGT so that it is a no-call if the GQ is 0

* Fix for code scanning alert no. 5: Arbitrary file access during archive extraction ("Zip Slip")

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: jamesemery <[email protected]>
Co-authored-by: meganshand <[email protected]>
Co-authored-by: Dror Kessler <[email protected]>
Co-authored-by: Ilya Soifer <[email protected]>
Co-authored-by: droazen <[email protected]>
Co-authored-by: Louis Bergelson <[email protected]>
Co-authored-by: Chris Macdonald <[email protected]>
Co-authored-by: Chris Norman <[email protected]>
Co-authored-by: Nalini Ganapati <[email protected]>
Co-authored-by: Nalini Ganapati <[email protected]>
Co-authored-by: Ty Kay <[email protected]>
Co-authored-by: rickymagner <[email protected]>
Co-authored-by: John Marshall <[email protected]>
Co-authored-by: Mark Walker <[email protected]>
Co-authored-by: epiercehoffman <[email protected]>
Co-authored-by: David Benjamin <[email protected]>
Co-authored-by: Nalini Ganapati <[email protected]>
Co-authored-by: Laura Gauthier <[email protected]>
Co-authored-by: Kevin Lydon <[email protected]>
Co-authored-by: Dror Kessler <[email protected]>
Co-authored-by: Gökalp Çelik <[email protected]>
Co-authored-by: Jonn Smith <[email protected]>
Co-authored-by: Kevin Palis <[email protected]>
Co-authored-by: samuelklee <[email protected]>
Co-authored-by: Sana Shah <[email protected]>
Co-authored-by: Jonn Smith <[email protected]>
Co-authored-by: bbimber <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Takuto Sato <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants