Skip to content

wrote getPairOrientation method for GATKRead #6420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 25, 2020
Merged

Conversation

davidbenjamin
Copy link
Contributor

Closes #1550. @SHuang-Broad do you still want this?

@SHuang-Broad
Copy link
Contributor

sorry, David, just noticed this.

@davidbenjamin
Copy link
Contributor Author

@SHuang-Broad getFragmentLength() is supposed to return the mate's 5' position minus the read's 5' position, so getStart() + getFragmentLength() should equal the mate end for a forward strand read. Therefore, it seems to me that regardless of overlap the method is comparing 5' coordinates correctly, provided that we trust getFragmentLength().

Copy link
Contributor

@SHuang-Broad SHuang-Broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a bug, @davidbenjamin , (and a late one) but have you considered the following scenario?
It may be worth it to check for "improperly paired" flag, then.

PairOrientation

reverseStrandTandem.setMateIsReverseStrand(true);
reverseStrandTandem.setPosition(new SimpleInterval("1", 100, 200));
reverseStrandTandem.setMatePosition(new SimpleInterval("1", 200, 300));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I also suggest adding test cases:

  • mates mapped to different chromosomes
    Of course if you believe getting pair orientation from such mate pairs to be non-sense, then it should be move to the test cases expecting exceptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SHuang-Broad What's your opinion? It doesn't seem like any of the existing values of SamPairUtil.PairOrientation make sense. If it weren't in htsjdk I would be tempted to add DIFFERENT_CHROMOSOMES to the enum. I could imagine throwing an error and leaving it to the calling code to check. I could also imagine having this method return an Optional<SamPairUtil.PairOrientation> and reserving an empty result for this case.

@davidbenjamin
Copy link
Contributor Author

@SHuang-Broad Could you explain your disperse duplicated / improperly paired scenario? It seems to me that if a read and its mate map to the same strand then they have to be oriented in the same (5-prime to 3-prime) direction.

I quickly looked into the improperly paired flag and the word online is that there's no fixed definition. Sometimes it just means an innie with larger-than-expected fragment size.

I will do whatever you want me to do, because I'm just trying to close old GATK issues!

@SHuang-Broad
Copy link
Contributor

Yes, we need to talk and draw on a board when we are both in office. It's a bit hard to explain via pure text here.

@davidbenjamin
Copy link
Contributor Author

Sounds good!

Copy link
Contributor

@SHuang-Broad SHuang-Broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, David.
Good to merge, once you make a decision on whether to check for same chromosome-ness (I don't know if that's a word...).

@davidbenjamin davidbenjamin merged commit cecf85c into master Feb 25, 2020
@davidbenjamin davidbenjamin deleted the db_pair_orientation branch February 25, 2020 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add new functionality in GATK4 to directly extract SamPairUtil.PairOrientation from GATKRead
2 participants