Do virtually continuous matches of DNA segments provide stronger evidence of a match?

Question

Do virtually continuous matches of DNA segments provide stronger evidence of a match?

5 Answers

Answer 1 · 2019-10-17T09:35:57+0000

I'm not an expert, but everything I have read and heard says that small segments are not particularly valid.

You really want longer unbroken segments.

I have always been told that any segments smaller than 7 cM, or even 10 cM, are just "noise" or random coincidence.

Answer 2 · 2019-10-17T11:06:12+0000

Small segments of that size are significantly below the 50% confidence interval. See the table here.

Answer 3 · 2019-10-17T12:10:09+0000

As Robynne pointed out in her answer, segments smaller than 7 cM tend to be unreliable. When the DNA of two people split and recombine to make a new person, those splits, or crossovers, tend to occur at certain locations, and smaller segments are more likely to stick with the segments on either side of them. We can see evidence of this when comparing the segments of close relatives (siblings, parent/child, grandparent/grandchild, etc.). If the tiny segments did not have a tendency to stick to their adjacent segments, then we would expect that siblings would have hundreds of tiny segments in common, but in reality they tend to have fewer, larger chunks (see: https://isogg.org/wiki/Chromosome_browser_examples).

If the tool you're using to compare chromosomes allows you to set a minimum segment size, start with a larger threshold (25 - 30 cM) and work your way down, but keep in mind that matching segment sizes of less than 7 cM don't guarantee a genetic relationship.
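The "start high and work your way down" approach can be sketched in a few lines of Python. This is only an illustration: the `Segment` type, the `segments_at_threshold` helper, and the sample data are all made up here and are not the export format of any particular testing company or tool.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One reported matching segment (hypothetical layout)."""
    chromosome: int
    start_bp: int
    end_bp: int
    length_cm: float


def segments_at_threshold(segments, min_cm):
    """Return only the segments that are at least min_cm long."""
    return [s for s in segments if s.length_cm >= min_cm]


# Made-up example data, not a real comparison.
segments = [
    Segment(1, 10_000_000, 45_000_000, 32.5),
    Segment(4, 80_000_000, 95_000_000, 12.1),
    Segment(9, 5_000_000, 9_000_000, 6.2),
    Segment(15, 30_000_000, 32_000_000, 3.4),
]

# Start with a large minimum and step down toward 7 cM.
for min_cm in (30, 20, 10, 7):
    kept = segments_at_threshold(segments, min_cm)
    print(f"minimum {min_cm} cM: {len(kept)} segment(s) kept")
```

Segments that only show up once the threshold drops below 7 cM are the ones to treat with the most suspicion.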

answered Oct 17, 2019 by Erik Oosterwal G2G6 Mach 5 (54.3k points)

Howdy from the American Society of Human Genetics 2019 Annual Meeting. I never asked on G2G if any other WikiTreers would be here; if you are at the George R. Brown, give me a shout. I plan to be back for Friday afternoon and will likely skip Saturday altogether. Been a long week. And it's like taking a month to travel mainland China after doing a few Rosetta Stone lessons in Mandarin. You thought you understood most of the lingo, and in some instances you can follow along. But elsewhere, there's definitely a foreign language being spoken.

Taking a break before "Haplotype-level Interrogation of the Genome" at 4:15, and just wanted to make a quick comment. First up, so that there's no confusion, the DNA of two people doesn't split and then recombine. Recombination, technically crossing over, happens during the first of two iterations of prophase during meiosis. No recombination happens when the zygote forms. That's when the haploid chromosomes from each parent join to make our 23 pairs of diploid chromosomes. They don't mix and match at that point (if you arguably discount the tiny PAR regions on the Y). Your mother's ovum has, in fact, been recombined and waiting for you since she was still a fetus herself.

And I'm afraid Erik's explanation of tiny segments sticking to adjacent segments may throw some folks for a curve. Recombination is not a totally random thing; it isn't like throwing chromosomes into a Waring blender for 60 seconds and then just piecing together what comes out. If it were, for one thing we'd never be able to calculate a centiMorgan at all...because it's a mathematical estimation of the probability that a crossover has occurred between two specific loci on a chromosome: one cM equals a 0.01 probability of a crossover in a single generation.
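For short distances the "1 cM is about a 0.01 probability" rule can be used directly. Over longer distances more than one crossover can fall between the two loci, so genetic distance is usually converted to a recombination probability through a map function. A minimal sketch, assuming Haldane's map function (which ignores crossover interference; choosing it is an assumption made here, not something stated in this thread):

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction between two loci under Haldane's map function.

    distance_cm is the genetic distance in centiMorgans. The result is the
    probability of seeing a recombination (an odd number of crossovers)
    between the loci in one meiosis, assuming no interference.
    """
    distance_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


for cm in (1, 7, 10, 50):
    r = haldane_recombination_fraction(cm)
    print(f"{cm} cM -> recombination fraction {r:.4f}")
```

At 1 cM the result is very close to 0.01, and it approaches but never reaches 0.5 as the distance grows, which is why loci far apart on a chromosome behave almost as if they were unlinked.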

We have a pretty good handle, thanks to numerous peer-reviewed studies, on the average number of actual crossover events per generation per gender. There are a lot more crossovers that occur in the female genome than the male; far more than the X-chromosome alone can account for. About 70% more, in fact. But even then the per-generation numbers aren't huge.

I tend to use the round numbers from Harvard geneticist David Reich: about 45 crossovers for the ovum, and 26 for the spermatozoa, for a sex-averaged 35.5 per parent...call it 35. All the centiMorgan calculations we see use only sex-averaged values, so we might as well average the crossover discussion, as well.
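The arithmetic behind those round numbers is easy to check. The sketch below uses only the figures quoted above (45 and 26); the variable names are just for illustration.

```python
# Round per-generation crossover counts quoted above.
female_crossovers = 45  # ovum
male_crossovers = 26    # spermatozoa

# Sex-averaged crossovers per parent.
sex_averaged = (female_crossovers + male_crossovers) / 2

# How many more crossovers occur in the female genome than in the male.
female_excess = female_crossovers / male_crossovers - 1

print(f"sex-averaged crossovers per parent: {sex_averaged}")
print(f"female excess over male: {female_excess:.0%}")
```

The excess comes out at a little over 70%, in line with the "about 70% more" figure.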

Most studies I've seen are right in that same area. IIRC ranging from a sex-averaged low of about 31 crossovers to a high at about Reich's 35.5. There are a couple of studies that have been done using the DNA mismatch repair protein, MLH1, as the indicator for crossover, but these produced outlier-level results from the others that were all pretty much in that low-30s range.

So there simply aren't hundreds of segments created during meiosis in a single generation. You're basically working with about 70 segments from your maternal grandparents, and 70 from your paternal grandparents. When comparing sibling to sibling, the number of HIR segments will likely come in at around that figure, about 70. I can only speak experientially here, but in families I've dealt with that included multiple siblings I'd hazard that the typical average is slightly under that; but it wouldn't surprise me to see it climb into the 80s.

The actual number of expected shared segments is almost impossible to predict due in large part to our current atDNA tests. Our microarray tests only examine about 670,000 reference sequences and, depending on the test you take, from 8% to 18% of those are going to be in protein-coding genes...meaning they likely aren't going to be as significant to genealogy and population studies as SNPs that are not in protein-coding genes. In effect we're testing ~600K SNPs out of about 4 to 5 million that are relevant in distinguishing your genome from mine.
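As a rough check of the "~600K out of 4 to 5 million" figure, the sketch below works only from the numbers in the paragraph above. The 670,000, 8%, 18%, and 4 to 5 million values are taken from the comment, not from any vendor documentation.

```python
TESTED_SNPS = 670_000

# Percentage of tested SNPs that fall in protein-coding genes (low, high).
CODING_PERCENT = (8, 18)

# Approximate count of SNPs relevant for telling two genomes apart.
RELEVANT_SNPS = (4_000_000, 5_000_000)

# SNPs left over after setting aside the protein-coding ones.
non_coding = [TESTED_SNPS * (100 - pct) // 100 for pct in CODING_PERCENT]

best_case = max(non_coding) / min(RELEVANT_SNPS)
worst_case = min(non_coding) / max(RELEVANT_SNPS)

print(f"non-coding SNPs tested: {min(non_coding):,} to {max(non_coding):,}")
print(f"share of relevant SNPs covered: {worst_case:.1%} to {best_case:.1%}")
```

Either way, the tests sample only a small fraction of the informative positions, which is part of why segment boundaries are estimates.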

The whole segment issue is fuzzy math. Yet another reason to discount very small reported segments. We can't tell, with inexpensive microarray testing, where segments actually begin or where they end. We're simply guessing, even when applying genotyped imputation. But segments can't "drag" adjacent segments around with them. A segment is either created via crossing over during meiosis or it isn't. And two siblings can't have hundreds of different shared segments. Simply not enough crossover events within their two parents to allow that.

There are arguments on both the pros and cons of small segments. At the end of the day, though, there have been no scientific, peer-reviewed studies to indicate what may and may not be genealogically useful segment sizes using our microarray tests. Dr. Tim Janzen and others have posited pretty good information that very small segments are likely to be just noise.

A corollary--not to start a debate here; just stating a fact--is that there is also no scientific, peer-reviewed evidence that autosomal DNA triangulation is a genealogically valid method. We think it probably is because there's a whole lot of anecdotal and individual-case evidence. But we don't actually know. There have been no published studies.

The number of crossovers during meiosis isn't large per-generation, but as we move back in time each set of grandparents contributes a potential 70 crossovers. Theoretically, there's no real reason to think that, for example, the same segment(s) regularly passes along during the course of several generations. A valuable study by Brenna Henn demonstrated that we'd likely have only a 15% chance of sharing any detectable DNA at all with a 5th cousin. AncestryDNA published some info a while back that indicated the odds of finding the same matching segment shared by a group of three or more 5th cousins was essentially zippo (here's a blog post by Debbie Kennett referencing it).

I use autosomal triangulation myself. But sitting here at the ASHG conference--where I have the smallest brain in this and all contiguous buildings--I'm reminded the fact remains that we have no scientific evidence that autosomal DNA triangulation is valid. Assuming that small in-common segments shared among multiple distant cousins means anything in relation to a hypothetical MRCA is just speculation.

commented Oct 17, 2019 by Edison Williams G2G6 Pilot (452k points)

Erik's answer was starred by the OP, John, as Best Answer when I added my, er, very brief <cough> comment. I hope nothing I said led to the removal of that star.

I'm operating on caffeine, so won't reply with any meat-and-potatoes right now. Tonight; about Erik's follow-up and triangulation.

I'm not a huge baseball fan but, you know, we're in the final stretch toward the World Series. A group of us decided to go to Biggio's (as in former Houston Astro Craig Biggio) sports bar last night to watch the postponed game four between the Yankees and the Astros. Hey; the bar is only a couple of blocks from the convention center, so a no-brainer, right? It was absolutely packed, and loud (especially starting in the 3rd inning), and we really couldn't leave until the ribbon had been tied around the Astros' 8-3 victory. Made for a long night.

But I owe Pip and Kerry (that's Dr. Larson, BTW) for making me seem smarter than I am. Checks are in the mail, guys.

The National Society of Genetic Counselors held a joint forum at the ASHG conference last Wednesday, and some of those folks have an "MS" after their names. There's also an inexpensive category of ASHG membership for undergrads heading into med school and related fields. But I think you could comfortably fit all of us at the conference who aren't MDs or STEM PhDs in one of the smaller meeting rooms.

The ASHG is heavily tilted toward the clinical side of things, and I've only bumped into a few population geneticists, much less attendees with primarily genealogical interests. One of ASHG's themes this year is greater inclusion--for example advocating that primary care physicians shouldn't feel excluded from genetics and the pace of new development--and I've added my comment that more work should be done to attract participation from outside the fields and research that are specifically clinically-focused.

But the exhibit hall alone is worth it. There are 292 exhibitors ranging from small start-ups I'd never heard of to 23andMe, BGI, Illumina, Oxford Nanopore, PacBio (their Shawn Levy gave a good presentation on long-read sequencing), Paragon, Quest Diagnostics (who does the AncestryDNA lab processing), and Thermo Fisher Scientific.

Morning rush-hour is over. Another cup of coffee, and I'm heading back to the conference...

commented Oct 18, 2019 by Edison Williams G2G6 Pilot (452k points)

Answer 4 · 2019-10-17T15:05:46+0000

I think that some of the testing companies sometimes end up with artificial breaks within a legitimate segment, so it's a possibility. (Someone more knowledgeable than me may know more details.) Having said that, even your combined segment listed above is very small and, as others have already said, it's likely to be noise. There are a couple of good blog posts out there about working with very small segments, so you might want to look for those to help determine whether this one might be worth pursuing.
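To make the idea of an artificial break concrete, here is a minimal sketch of how two reported segments on the same chromosome could be treated as one when the gap between them is tiny. This is not any testing company's or tool's actual algorithm; the `merge_close_segments` function, the gap limit, and the positions are all hypothetical.

```python
def merge_close_segments(segments, max_gap_bp):
    """Merge (start_bp, end_bp) segments on one chromosome.

    Two neighbouring segments are joined when the gap between the end of
    one and the start of the next is at most max_gap_bp base pairs.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap_bp:
            # Small gap: extend the previous segment instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up positions: the first two segments are separated by a small gap.
reported = [
    (1_000_000, 5_000_000),
    (5_200_000, 9_000_000),
    (20_000_000, 25_000_000),
]

print(merge_close_segments(reported, max_gap_bp=500_000))
```

Even after merging, the combined length still has to clear whatever minimum cM threshold you trust before the match is worth pursuing.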

Answer 5 · 2019-10-18T21:32:46+0000

I assume there's a typo in your message and you meant that the next segment begins at 58,979,510? Where are you seeing this? If GEDmatch, turn on "Prevent Hard Breaks" in the 1:1 tool. If elsewhere, this could be an example of a microdeletion. See https://jogg.info/pages/vol8/sc/generation-gaps.pdf
