How far back can autosomal triangulation confirm?

+18 votes
2.1k views
What is the hypothetical limit to autosomal triangulation?

So I found a Larry W. Burford who shares DNA on the 22nd chromosome with me and has a genealogy back to my 5th great grandfather James Irvine Wilson.  Assuming I find another distant cousin to match with, is this even possible?

What is the earliest relative you have been able to match with autosomal triangulation?
WikiTree profile: James Wilson
in The Tree House by Jonathan Wilson G2G6 Mach 1 (17.6k points)
It is not the burden of proof fallacy. You made the claim that MRCAs are highly likely to be more recent when there are significant gaps in the trees. I say it is not. If you were then to tell me to prove my point, you would be making the fallacy. "It is up to the person making the claim to prove the claim" -- you made the initial claim, so the proof is on you.

You talk about odds again, but there are no odds here, because there is no probability model. There is not even data.

The incorrect, unsourced trees at Ancestry are a different problem. If you can confirm the lines to the same distant ancestors with good sourcing, then even with significant holes in the trees I think it could still be quite common in this situation that the researcher has found the MRCA (excepting when there is known significant endogamy).
In this case this is unlikely; my branch of this particular line has been in Kentucky since the early 1800's and these matches' ancestors never left Maryland. This match is very definitely through my maternal grandfather; a significant percentage of his ancestry comes from early Maryland Catholic colonists whose descendants exhibit some characteristics of an endogamous population (Catholic immigration to the colonies virtually ceased after 1700, and the Catholic families already present intermarried for the next 200 years); the GEDmatch users used for triangulation show no matches to people who share this endogamous ancestry. Therefore it can be eliminated as the source of the shared DNA; all of those shared matches with significant trees are descendants of either Henry Childs and Jemima Pottenger, Samuel Pottenger and Elizabeth Tyler, or Robert Tyler and Susannah Duvall (and some of those who are known 5th cousins share over 30cM; the paper-trail 7th cousins share less, typically around 18-20cM).
C, when you mentioned Maryland Catholics I wondered if we were related.  Checking the relationship finder, I see that we have 382 common ancestors within 30 generations!  If we did share DNA, I wonder how we would ever sort that out?

(Although my first impression when seeing that list was that probably a lot of those lines are wrong, and not my doing!)

We are apparently vaguely related by marriage in a more recent timeframe (my 4th great-grandfather John H. Burch's first wife was a Cissell).
The Maryland Catholics!
And Maryland Catholic endogamy; your ancestor James Cissell's first wife Margaret Vowles is my 7th great-aunt (her parents John Vowles and Elizabeth Cooke are my 7th great-grandparents), and looking at his descendants and their marriages, Edelen, Shircliffe, Hagan, Willett, Mudd, Nevitt, Mattingly, Greenwell, Spalding, Brewer and Miles are all in my direct line and Wathen, Hayden, Boone, Wimsatt and Riney are all related by marriage.
Wow. What's with all this hair-trigger freaking out about 8Cs? Nobody even SAID that "8x cousin matches are common"! I think I only have ONE myself. The only reason I consider it seriously is that I have 7C matches on the same line that look pretty credible.

How do those 7C people "look credible"? Basically, two of them are second cousins to each other, PLUS the lineage from the MRCA to them is almost completely MALE. Probably, they have a large "sticky segment" from that surname, and I happen to have gotten a piece of it. There are a handful of people who also match them, so if the surname showed up in their trees too, that would strengthen the case.

This idea that you have to rule out all your other 4th-gt grandparents seems like unsophisticated bunk to me. Who's been selling that stuff? (I have a pretty good guess.) Maybe if you have some sort of ethnically-pure tree you might have that kind of issue, but for just 5C or 6C, it seems like I generally get a number of shared matches that make it all too clear which part of my ancestry a given match is.
Agreed, Barry. There seem to be a lot of people on wiki who are quick to be naysayers. If I have solid documentation back to an ancestor (like my French Canadian side, where there are incredible records back to the first immigrants), then what's the need for triangulation? Most likely, as in my case, I have a brick wall ancestor who I believe is related to a particular family, but the census records in the 1700s quite often don't list names, just heads of house. I've found no living male descendants to date, so I turn to auto in hopes of getting a match. The common ancestor would be my 6th g grandfather. I have gotten some great matches (in my opinion) through several different testing companies, typically 9 to 12 cM. The connection seems to be through one son of Elias Eliot, Oliver Eliot. His daughter Elizabeth married Abijah Eaton. It's the Eatons that I keep matching. I was immediately told 7th cousins is too far away and that I needed other family lines for a good triangulation. I continue to find matches, but all from this marriage. So far I have two brothers, their g aunt, a 3rd cousin of theirs, another cousin family (different descendants of Abijah & Elizabeth), and another cousin of the brothers whom I can't contact but who is a cousin of the Eatons per MyHeritage. All match me on the same spot on chr. 12. At what point does it become a preponderance of evidence?
The Maryland Catholics who became the Kentucky Catholics! My MD bunch were mostly from St. Mary’s and immigrated to Marion, KY. Goodrums, Lees, Thompsons and Downs.

8 Answers

+12 votes
Definitely!

Just off the top of my head, I can tell you that I have a segment that is definitely from my 5th-gt grandfather Heckathorn. Something to keep in mind, though, is that usually the descent is from a PAIR of ancestors, so you can't really tell which one in the pair a segment is from.

In my case, I found a half-6C who also had that segment. He's descended from the 2nd wife, while I'm descended from the 1st, so I know it's from the husband in that pair.

I have a match to somebody who is an 8C on paper, but it hasn't been verified that our common DNA is from those ancestors. That being said, I have several matches to 7Cs on that same side, so it really might be "legit". In fact, two of those 7Cs are 2nd cousins to each other - their MRCA is along the line that goes back to our common ancestor - so I'm fairly confident that I really DO have a DNA segment from those 6th-gt grandparents in that corner of the tree.

Really, I have a number of 6Cs that I have identified within my matches, on a number of sides of my tree, and their shared matches indicate that it's not just some fluke.

I'd go so far as to say that it's not even unusual.
by Living Stanley G2G6 Mach 9 (92.4k points)
Thank you, that is encouraging.
But the matches you describe don’t sound triangulated. Are they?
+8 votes

I agree with the OP. Usually we can say this because not all our 7th great grandparents are from one close region or ethnicity. Remember that back in those days people's radius was quite small. They didn’t take airplanes to go on holidays, or even boats. They usually married within the village or from neighboring villages.

Also, triangulation helps to sort things out. If 3 or more otherwise independent DNA matches branch off at different times from different descendants of their common ancestor, then yes, these cases are quite robust in their evidence that the person who is identified is indeed the common ancestor who contributed that DNA.

Lastly, the whole argument of “you don’t know if it went down the branch that you show” is relatively irrelevant here. What counts is that with triangulation you identify an IBD segment, and with overlapping family trees you identify a common ancestor (and usually several more recent common ancestors) through which the DNA must have come down. It’s not important if this ancestor shows up in a 100% perfectly filled out family tree once, twice, three or even more times (through pedigree collapse, which is normal at that level).

The DNA is still from that person or usually from the MRCA couple.
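
For anyone who wants to see what "identifying a triangulated segment" means mechanically, here is a minimal Python sketch. It assumes you already have the three pairwise matching segments (A with B, A with C, B with C) written down as chromosome, start and end; the `Segment` type, the `common_overlap` helper and all coordinates are made up for illustration and do not come from any testing company's export format.

```python
from __future__ import annotations

from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chrom: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def common_overlap(segments: list[Segment]) -> Optional[Segment]:
    """Return the region shared by all given segments, or None if there is none."""
    if not segments:
        return None
    chrom = segments[0].chrom
    # Segments on different chromosomes can never triangulate.
    if any(s.chrom != chrom for s in segments):
        return None
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return Segment(chrom, start, end) if start < end else None


# Hypothetical pairwise matches among three testers A, B and C.
a_b = Segment(12, 10_000_000, 25_000_000)
a_c = Segment(12, 14_000_000, 30_000_000)
b_c = Segment(12, 12_000_000, 24_000_000)

shared = common_overlap([a_b, a_c, b_c])
print(shared)
```

The point of requiring all three pairwise segments is that A matching B and A matching C on the same spot is not enough on its own; B and C also have to match each other there. A real check would also apply a minimum size in cM, which this sketch leaves out.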

I personally have an example of 3 of my DNA matches that I triangulate with. Two of them (DK and SM) go back to a common Great-Great-Great-Grandparent, and the third (JO) matches with DK at the 6x Great-Grandparent level (the MRCA couple was born about 1668 and 1670), whilst JO matches with SM at the Great-Great-Grandparent level.

The last MRCA is the father of the famous Nicholas Adams family who founded Johnsburg and has thousands of descendants. The wonderful WikiTree'er Sandie Schwarz has written this special page about the Founders of Johnsburg in McHenry County: Founders and Early Settlers of Johnsburg, McHenry County, Illinois

There is unfortunately no software that can show these relationships in one picture but I have descendants graphs for all of them.

Unfortunately it's still unclear where my connection is to that group :-(

by Andreas West G2G6 Mach 7 (76.4k points)
edited by Andreas West
+7 votes
Like C. Handy, I work solely off of my parents' DNA tests.  I have identified triangulated groups for several of my dad's 5th great-grandparents.  For one ggf, my dad has 154 DNA matches at Ancestry ThruLines; another has 47, and another has 42.  While I would consider the ThruLines evidence to be fairly 'sufficient' to confirm my connection, the TG certainly does that.

I have found ThruLines to be helpful in identifying potential cousins that you can contact and ask to upload to GEDmatch (or see if they are on other sites such as FTDNA, 23andMe, or MyHeritage).  With that information, it is easier to locate triangulated groups (TGs) and see if the paper trail is there.

Be aware that when you look at your ThruLine connections, their trees may be inaccurate.  I generally, however, am able to find the correct line which does indeed trace back to the identified common ancestor.
by Darlene Athey-Hill G2G6 Pilot (547k points)
I have found that many ThruLines are simply wrong, for various reasons including inaccurate trees but also because the way Ancestry patches together the information can even override accurate information on individual trees.
Julie, I agree.  As I stated, the lineages shown on ThruLines are many times wrong.  However, DNA doesn't lie.  So you know that the people in that ThruLine are sharing DNA with you.  Lots of times I am able to figure out the correct lineage for the DNA match.  Sometimes the ThruLines will point you to a potential 'brick wall' ancestor who isn't actually your ancestor.  If you have a good tree and do the research, you may find out how everyone in that particular ThruLine group is connected to you through a totally different ancestor.  I just located this today on a potential 5th great-grandfather.  I have a fairly extensive tree, and by searching for the surname of this potential grandfather, I was able to determine that the people were actually matching me through a different line (that two of the daughters of this person had married into).  There is still the possibility that one of my female brick walls could trace to this man, but that's going to take me 'awhile' (!!) to try and determine that.  And all of this is why I said that I combine the ThruLine 'clues' with triangulated groups.  I use DNAPainter to map my parents' chromosomes.  I've got them each over 50% mapped.  So I contact the DNA cousins shown on ThruLines; if they don't respond, I search for them on the other sites already mentioned.  I have been fortunate to locate numerous of them on one of the other sites and therefore can view the shared segments.

I was responding to your first paragraph in which you said "I would consider the ThruLines evidence to be fairly 'sufficient' to confirm my connection."

You're right.  DNA doesn't lie.  Surprising how many people don't get that.

Gotcha!  I should have worded it differently.  It needs to be understood that when you have a large number of DNA cousins for a ThruLine ancestor such as I mentioned, it is sufficient to confirm my connection.  The question is if it's to that particular ancestor.  If several people have strong trees tracing back, then there is a good chance. For the examples I mentioned, there are strong paper trails to the common ancestor. I still believe you need to triangulate.  

As I mentioned, I saw today a purported 5th ggf on ThruLines with lots of DNA cousins.  The difference between him and the other 5th ggfs is that this one is a 'dotted line' ancestor, meaning that Ancestry has tried to figure out the common ancestor for my dad's shared matches.  People with undeveloped trees might automatically add this person.  I was able to figure out the connection.  Bottom line is you need a fairly extensive tree to be able to use DNA to try and confirm relationships, and for the distant ones, you always need to triangulate.
+9 votes

This thread has become so long and complicated it's hard to find the right spot to insert comments.

It seems to me, reading this after you all have discussed the issue for hours, that several terms and ideas should be clarified.

First, there is a difference between WikiTree policy and reality.  By that I mean WikiTree has rules about what you can consider confirmed with DNA and what you can't, generally because it is considered too far back to be reliable.  At least I think that's the reason.

Jim Bartlett writes a blog called Segment-ology (easy to find by Googling) that I've found very helpful.  I recall that somewhere he said he has managed to find triangulated groups of matches for nearly every segment of his chromosomes.  Of course, that does not mean he has identified the source.  He also did an analysis of recombinations (crossovers) by generation, posted Feb. 2, 2016 (https://segmentology.org/2016/02/02/crossovers-by-generation/) that suggests that identifiable segments can persist for many generations.  The way I look at it is, they have to come from someone.

Second, there seems to be some confusion between total matches and identified matches. Sorry, SJ, but I think you're wrong and Barry is right.  Our number of cousins increases with every generation back you go, just as our number of grandparents does.  And as others have said, it doesn't seem reasonable to say that a person must know every single one of his ancestors in a particular generation or the ones that came after it in order to identify the source of a match.

It would help tremendously if people would make chromosome maps.  We do not usually find eighth cousins out of the blue.  For anyone who has been working with DNA a while, they may have third, fourth cousin etc. matches already from the same line. The first thing I do when I want to identify a match is to get chromosome detail, then place that match on my map.  It has to be compatible with the branch of my family I have already identified as the source of that particular segment.  All my maps go back to my grandparents, which immediately eliminates 75% of my ancestors as candidates, and some go back to my great grandparents.
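
The 75% figure can be checked with a little arithmetic. The Python sketch below is only an illustration; the function names are made up here, and it ignores pedigree collapse, so it counts ancestor slots in a fully filled-out tree rather than distinct people.

```python
def ancestor_slots(generation: int) -> int:
    """Ancestor slots a given number of generations back (parents = 1, grandparents = 2)."""
    return 2 ** generation


def slots_behind_one_grandparent(generation: int) -> int:
    """Slots that remain candidates once a segment is mapped to a single grandparent."""
    if generation < 2:
        raise ValueError("generation must be 2 (grandparents) or more")
    return 2 ** (generation - 2)


# 5th great-grandparents are 7 generations back, 7th great-grandparents are 9.
for label, g in [("5th great-grandparents", 7), ("7th great-grandparents", 9)]:
    total = ancestor_slots(g)
    remaining = slots_behind_one_grandparent(g)
    eliminated = 1 - remaining / total
    print(f"{label}: {total} slots, {remaining} still possible ({eliminated:.0%} eliminated)")
```

Whatever generation you look at, knowing the grandparent leaves one quarter of the slots, and a map that goes back to a great grandparent would leave one eighth.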

Jonathan, somewhere above you asked whether a particular DNA segment corresponds to a family name.  In the sense that the segment comes from a certain ancestor, yes, but even that  seems like a complicated question.  For example, I can look at my chromosome 1 map and see that I got a particular segment from Mary Hoisington.  She is the common ancestor for me and the four people that my siblings and I share the segment with.  But I didn't really get it from Mary.  I got it from my father, William David Kelts.  He got it from his father, etc.  And on the other end, Mary got it from one of her parents, and if it was her mother, she probably wasn't named Hoisington.  Also, in that particular example, none of my matches have the surname Hoisington.  If you see a common surname among your matches on a segment that you think goes way back, my guess is that they are probably related to each other more closely than they are related to you.

Edit:  Some earlier posts to this thread have been hidden, so it is not clear what some of those remaining are responding to.

by Living Kelts G2G6 Pilot (554k points)
edited by Living Kelts
P.S.  Sorry the type above is so small.  It didn't start out that way, but I made a few edits, and for some reason, every edit frustratingly changed the type size and font and the paragraph spacing.  I tried to fix it but couldn't.
I've taken to doing a lot of re-typing because of that problem (ugh!) Well, it was worth a little squinting on this one.

I really want to know where all the shrill, pious nay-saying comes from (not you, obviously). It seems like various people are absolutely committed to coming up with some obscure rationale whereby they can claim that your DNA evidence is not perfect, must be rejected, and then want to jump all over you for it.

The "real deal" can be tricky, and it can be easy to jump to conclusions too quickly, not realizing the "gotchas" that can be at play. But in a real application, there can be a lot of complex information involved, that works in concert with the DNA. Universally applying simplified Puritanical rules is just throwing the baby out with the bath water.
+5 votes
The farthest I've been able to triangulate au DNA so far is my 6th great grandfather Augustine Leftwich-3. We have isolated 2 segments, and represent 5 of Augustine's children: Frances (Leftwich-146) Carter; Thomas Leftwich-7 (through 2 separate wives); Augustine Leftwich-2, Jr; Uriah Leftwich-138; and Jabez Leftwich-125. Fun stuff!

I'm currently working on John Woodson-13 who is my 10th great grandfather. Very fun stuff! I'm finding the farther back you are able to go, the segments are usually smaller,  and there's less likelihood of other shared lines to eliminate.
by Sherrie Mitchell G2G6 Mach 5 (53.0k points)
Why is there less likelihood of other lines to eliminate?
Because the farther back you go the more removed you are from current day. So if you can zone in on a potential match, and look at that person's tree, and see that there are zero lines in common, then it may be valid. But of course that depends on other matches coming into play who have also been examined.

So my experience is that the farther back you go the possible effect of shared lines becomes less, and the shared values become validated.
I think that takes us back to the case in which SJ Baty is right after all.  For example, I have matches that appear to be sixth cousin matches through an ancestor named Benjamin Barnes.  But I only know Benjamin Barnes was my ancestor due to an old published genealogy (and my subsequent research that seems to validate that).  And my matches no doubt also have access to that old genealogy.  But we have plenty of other unknown ancestors in those remote branches of our trees who could just as easily be the source of the DNA.
+6 votes
Yes. I have 58 DNA matches on my colonial Rhode Island line, from 5th to 8th removed. Why I think these are accurate: well-researched lines going back to the 1600s with multiple sources, and loads of DAR/SAR, even though that does not go back as far. It was a small place; there were not a lot of families that colonized Block Island originally and then moved on to RI. Pedigree collapse: this doubles your chances of a segment being carried down, even from many generations out, because, as mentioned previously, there were not so many choices. I have three families that intermarry several times over a hundred-year span. My fav is dad's second wife's sister is also wife of dad's oldest son from 1st wife, so her sister is also her mother-in-law, and their kids are cousins and aunt/uncle - niece/nephew. 2 brothers marry 2 sisters of another family, and second cousins marry. You get condensed genetics. 23andMe generally places them as 4-6 cousins even though they can be 7-8.
by
Pedigree collapse makes your matches stronger, but doesn't it also make it harder to identify the source?
It could make a specific pair of ancestors harder to ID if the trees are not accurate or don't go far enough back, or if there is more than one connection to those lines. In many cases it just reinforces the DNA segments downstream of the originating ancestral pair. I do have a few people where we are related on more than one line, so ID'ing where the specific segment came from is harder, though our relatives in common can often tease that out. (In addition to colonial RI, both my parents have lines that go back to small towns in Scotland that are only about 25 miles apart. According to 23andMe, my parents are not related, but I have a few people who show up on both sides via Scottish lines.)
Doesn't pedigree collapse mean, by definition, that "there is more than one connection to those lines"?
Yes, Julie, endogamy has to be accounted for, so you need to know the other participant's tree and lineage. And you have to run tests against others that you know are more recent cousins. That becomes important especially when you have cousins who connect through many surnames. Those are the cousins that we really cannot depend upon for AU matches, but can use to determine the more distant cousins who do.

"In genealogy, pedigree collapse describes how reproduction between two individuals who share an ancestor causes the number of distinct ancestors in the family tree of their offspring to be smaller than it could otherwise be. Robert C. Gunderson coined the term." So there are fewer ancestors because some ancestors are doing double duty. If cousins marry they now have only 3 sets of grandparents instead of 4, and their children get a double dose of DNA from those grandparents. Chart from https://www.yourdnaguide.com/ydgblog/2019/7/26/calculating-the-pedigree-collapse-effect-in-your-dna-matches

An example might help.  So in this case (and I know it is a hypothetical), who are you, or who are you the equivalent to?  Fiona?  So you know, or believe, that your DNA came from Adam and Anna (actually it was one or the other), but isn't it important to you to know who  it came through?  

Where are you, Edison, Frank, SJ, Barry?  How about jumping in here?
OK My ggg grandfather Samuel M marries ggg grandmother Elizabeth B. His brother Jacob M marries Elizabeth's sister, Nancy B. The children of these 2 couples are first cousins, but have only 2 pairs of grandparents, the M's and the B's instead of 3 if one of each set of parents were unrelated. These cousins have  received the same DNA from those grandparents and  will test  like siblings or half siblings rather than cousins. Their descendants will test as being more closely related by generation than they actually are and are more likely to carry segments from the original ancestors.
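
The grandparent count in that example can be checked with a toy pedigree. This Python sketch is only an illustration: the dictionary layout, the `grandparents` helper and the placeholder labels ("Mr M", "Mrs X" and so on) are made up, and only the two couples Samuel M / Elizabeth B and Jacob M / Nancy B come from the example.

```python
# child -> (father, mother); grandparent labels are placeholders
double_cousins = {
    "child of Samuel": ("Samuel M", "Elizabeth B"),
    "child of Jacob": ("Jacob M", "Nancy B"),
    "Samuel M": ("Mr M", "Mrs M"),
    "Jacob M": ("Mr M", "Mrs M"),      # brother of Samuel
    "Elizabeth B": ("Mr B", "Mrs B"),
    "Nancy B": ("Mr B", "Mrs B"),      # sister of Elizabeth
}

# Same tree, except Jacob's wife comes from an unrelated family.
ordinary_cousins = dict(double_cousins)
ordinary_cousins["Nancy B"] = ("Mr X", "Mrs X")


def grandparents(person, tree):
    """Collect the distinct grandparents of a person from a child -> parents mapping."""
    found = set()
    for parent in tree.get(person, ()):
        found.update(tree.get(parent, ()))
    return found


for name, tree in [("double first cousins", double_cousins),
                   ("ordinary first cousins", ordinary_cousins)]:
    g1 = grandparents("child of Samuel", tree)
    g2 = grandparents("child of Jacob", tree)
    pairs = len(g1 | g2) // 2
    print(f"{name}: {pairs} grandparent pairs in total, {len(g1 & g2)} grandparents shared")
```

With the brothers marrying sisters, the two cousins share all four grandparents (2 pairs in total); with an unrelated spouse they share only two of six (3 pairs in total), which is the ordinary first cousin situation.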
OK, I understand what you're saying.  I see that your second great grandfather Amos was in Samuel's household in 1850 so that appears to show you've identified your path to Samuel correctly (which I am not sure could be determined by DNA).  

However, the situation I had in mind was not when the pedigree collapse had occurred in your (or my) own tree, but rather when it had occurred in someone else's ancestry.  In the cases I've worked with personally, when I've seen a match's pedigree with the same names repeated far back in the tree, it has been impossible for me to determine exactly how I am related to my matches.  (The problems could well have been compounded by poorly documented trees, and possibly because in some cases the match himself was confused by all the intermarriages.)
If the trees aren't correct, then at best you can take a calculated guess. The careful genealogy has to be done and not everyone is willing to put in the time, especially to follow sibling lines up to the present.
This sounds like something I have. I have two gt-gt-gt-gt grandmothers who were sisters, Elizabeth and Barbara.

As I look through my matches on AncestryDNA, I see nobody more distant than a 3C until I hit 69cM, a descendant of my Barbara (who is therefore also related to me thru Elizabeth). The closest relation is 4C1R but there's even more intermarrying than that.

Moving along further down the list, we run into another at 64cM. This one is related to both Elizabeth and Barbara too - she's descended from my gt-gt grandfather's sister, whose husband was a 1C of that gt-gt.grandfather's wife. A 3C1R, plus other relations.

Then we hit another at 56cM. This one is simpler - a 4C descended from my Elizabeth (and therefore also related to me thru my Barbara).

At 55cM, there's a 5C thru my Elizabeth.

At 51cM I run into my 1st 3C1R that doesn't involve any intermarrying - before this it was all 3C or closer, or intermarrying cases.

At 50cM there are two. The first is like the 69cM case, above, but the second is where it gets interesting. She's a 6C! Her deal is that she's 7 generations down from Elizabeth and Barbara's parents, thru E & B's brother, John, and she and her paternal grandmother are the only females in that line.

Between my test and my brother's, I've uncovered 10 descendants of this 6C's gt-gt grandfather, so far, who one or the other of us match (two 5C1Rs, five 6Cs, and three 6C1Rs).

It seems clear that THEY have a "sticky segment" from that mostly-male lineage. My brother and I have TWICE the opportunity to match them, because we have those ancestors in our tree twice. That "top 6C" matches my brother on 4 segments (65cM), but all the others only match us on one or two.

If I can get some of those folks on GEDmatch I'll be able to see what part of my DNA matches to those 5th-gt grandparents, but the "pedigree collapse" makes it hard to pick out DNA from some of the families Elizabeth and Barbara married into.
If they're not on GEDmatch, are you relying on their trees plus their shared matches?
Indeed. In fact, many of them are picked up by ThruLines. Just one match with ThruLines isn't very convincing at all - in fact I have at least a half a dozen that I have a pretty good idea are pure bunk (if they're related at all, it's not necessarily how they say).

But it turns out that my way-back ancestors in the US happen to be especially prolific ones. These matches can generally be associated with clusters of one to five dozen matches, and generally there's a match or two that are identifiable to tell which side of my tree they're on.

I have a fair number of 3Cs in my matches for 7 out of 8 of my gt-grandparents. For about 90% of my matches, I can tell which 1/8 of my tree they're on, within a few seconds of looking at them.

Sometimes I can translate SOME of it to GEDmatch. I have a huge segment for the aforementioned Elizabeth's father-in-law, for example, and so the cluster has dozens of matches in it. Several of these people actually ARE on GEDmatch, and so I can see how they all fall on the same segment on Chromosome 10. THAT'S a mostly-male line segment, since Elizabeth is the only female between her father-in-law and me. A lot of the trees for my matches on AncestryDNA have the specific unusual surname on their tree, but can't trace far enough back to connect. I contacted a 4C in that group, and told him I could guarantee we match on Chromosome 10, and sure enough, when he uploaded to GEDmatch, we did. If I could get a third person on there with a solid paper trail back to 5th-gt grandpa, I'd have a nice triangulation for WikiTree, for sure. As it is, I'd have to track down a book a distant relative wrote some years ago to get my own paper trail back before 4th-gt grandpa. I was in contact with the researchers back when it was being written, and they really knew what they were doing.

So I don't have ANY "official" triangulations yet, but am quite confident that the DNA is surely there for this and other lines.

My research for my paternal line is out of control! I have tracked down many thousands of distant relations - out to 6C (plus "removeds"). So it's been easy to identify certain 6C people on that side, and they show up matching each other - even ones on different major branches. One of them, in the generation after me, finally showed up on GEDmatch, so I just need one more. That 6C1R also matches my nephew, so if he got on GEDmatch undoubtedly he would match her on the same segment - she's his 7C, so nobody try to tell me they don't exist!
+4 votes
For me farthest back for au DNA is my Augustine Leftwich, b 1712 line: https://www.wikitree.com/wiki/Leftwich-3

We have identified several SNPs of Augustine auDNA based on our auDNA matching on 2 Chrs and have confirmation DNA statements posted. He is my 6th Great Grandfather, about the same range for the others who are matching. We represent a number of his sons, and are able to sort out and eliminate the various wives to be able to say this is Augustine's auDNA. Y DNA is also tested.
by Sherrie Mitchell G2G6 Mach 5 (53.0k points)
+5 votes
I have used autosomal triangulation to identify ancestors as far back as 12 generations (getting me to the early to mid 1600s).   My approach was to compare my DNA with that of a 2nd cousin with whom I shared a known ancestor whose parents were a genealogical brick wall.  I specifically chose a 2nd cousin whose family had been geographically separated from mine since the time of our common known ancestor.  By triangulating our common matches on GedMatch and looking for those who had published GedComs, I could compare their pedigrees using the 2-GedCom comparison tool.  I, of course, only compared the GedComs of individuals who both triangulated on the same segment that I shared with my 2nd cousin.   The first time I did this it took me a year to connect the ancestors I found to the ancestors that I already knew about (who were separated by an interval of almost 200 years). Since my first success I have located other distant ancestral couples (some of whom I have not yet successfully connected to my tree).

Since my original post has been challenged by a very credible source I thought it would be valuable to provide details so that others may judge for themselves.  First I want to thank Edison for his comment and I want to encourage others to read his very carefully thought out criticism of triangulation using distant cousins.  I do not disagree with any of his mathematical and biological arguments, yet I dispute his claim that my result is a fallacy.  I'll first present the details of my result and then speculate on why we are likely both  partly right.

The details are that I desired to identify the parents of my great grandmother Ellen Barnes-19617 whose origins had escaped the repeated searches of online paper records by both myself and a 2nd cousin.  My 2nd cousin and I shared 5 large segments of DNA and I used the GedMatch utility to identify our common matches (with shared segments > 7cM).  There were hundreds. I did triangulations among all our shared matches and grouped them according to which of the 5 DNA segments they were associated.  I also knew the genealogy of Ellen's husband William Millar-2384, and with that knowledge I used family surnames to eliminate three of the five segments as likely associated with William.  I then took the two remaining segments and did triangulations with myself and the members of the associated group (dropping my 2nd cousin for this phase of the investigation).  For each of these segments I sorted the triangulation partners into those with published GedComs on GedMatch.  One segment did not yield any result but the other gave several possibilities.  I then used the GedMatch 2-GedCom comparison tool and found two of these cousins whose pedigrees intersected at the couple Jacob Bartlett-6242, and Sarah Albee-523.  Jacob was born in 1673 and Sarah was born in 1683 while Ellen Barnes was likely born circa 1838, so there is about a 175 year gap between them.  These pedigrees also showed Jacob and Sarah's parents so I included them in my estimate of 12 generations and Mid-1600s for birth years.
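
For anyone who wants to try the grouping step described above, here is a minimal Python sketch. It is not how GEDmatch works internally: the `Segment` class, the `group_matches` function, and the positions in the example are all hypothetical, and it only sorts matches by which of the anchor segments (the ones shared with the 2nd cousin) they overlap. A real triangulation would also need to confirm that the matches share that segment with each other.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One shared segment (hypothetical structure, positions in base pairs)."""
    chrom: int
    start: int
    end: int
    cm: float


def overlaps(a: Segment, b: Segment) -> bool:
    """True when both segments lie on the same chromosome and overlap."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def group_matches(anchors, matches, min_cm=7.0):
    """Group match names by the anchor segment they overlap.

    anchors: dict of label -> Segment shared with the known 2nd cousin
    matches: dict of match name -> list of Segments that match shares with me
    Segments at or below min_cm are ignored (the post uses "> 7cM").
    """
    groups = defaultdict(set)
    for name, segments in matches.items():
        for seg in segments:
            if seg.cm <= min_cm:
                continue
            for label, anchor in anchors.items():
                if overlaps(anchor, seg):
                    groups[label].add(name)
    return {label: sorted(names) for label, names in groups.items()}


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    anchors = {"seg1": Segment(1, 1_000_000, 5_000_000, 20.0)}
    matches = {"cousin_x": [Segment(1, 4_000_000, 8_000_000, 9.0)]}
    print(group_matches(anchors, matches))
```

Each resulting group can then be narrowed to the matches with published GedComs, which is the manual step described above.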

As I stated in my first post it took me a year to find the paper trail connection between Jacob and Sarah and their 6xg granddaughter Ellen (I got hung up on several blind alleys).  My first reaction was "can I actually believe this result" so I spent some time looking for genetic evidence that Ellen's parents really were who the triangulation suggested (the paper records here were very sparse and uncertain as I mentioned at the beginning).  I soon found it in a 4th cousin (identified by another triangulation involving a much smaller number of generations) who descends from one of Ellen's grandparents.  After talking to her and comparing our trees I concluded that the result was real.  Connecting Ellen's parents and one grandparent to other already existing WikiTree profiles proceeded smoothly after that point.

Now given that I accept Edison's arguments, the question is how could we both be right?  Could it just be a fluke of highly unlikely possibilities conspiring to yield a correct result (a blind squirrel finding a nut) or is there something else at work here?  The fact that this is not an isolated success (I have another triangulation that yielded ancestors back 10 generations)  makes me believe the latter.  

Here is my speculation on what is going on.  I suspect that the two triangulation partners that I found were not as remote cousins for me as the triangulation suggests, but that the true MRCA of the triangulation was missing from one or both of their GedComs.  If, however, I am related to those two cousins in more than one way (multiple denomination cousins) and if their trees contain another set of more remote common ancestors then the GedMatch tool would spit out that remote couple as the intersection point of the two pedigrees.  In other words endogamy to the rescue.

This explanation is not at all far fetched.  We all know that statistically speaking we (or those of us sharing European heritage) all probably descend from Charlemagne (about 36 generations back).  I myself have discovered one path back to him while playing around with WikiTree and we all probably have multiple paths back to him if our entire genealogy could be laid out before us. It is not unreasonable that our "web" (a better metaphor than a tree) also loops back multiple times during more recent genealogical times.

Here are some tricks that I have used when triangulating with remote cousins.  First, make sure that your two partners are about as remote from each other as they are from you.  Otherwise they are likely genetic proxies for each other rather than true triangulation partners for you.  This is usually addressed in terms of the "lengths of the legs of the stool".  There are times, however, when genetic proxies are useful.  I have a list of six 2nd cousins who all descend from one set of great grandparents.  When I find a remote cousin of triangulation interest whose shared segments with me do not exceed the 7cM threshold, I check him/her against my cousins and often find one of them who does share a large enough segment.  They then become proxies for me for triangulation purposes.  This opens the possibility that any found ancestors may be associated with a part of my cousin's tree that I do not share, but if the ancestors occur where I expected them to be in my tree then the proxy relationship is likely valid.  A group of 1st cousin proxies could be used the same way but I don't have enough of them to do so.  

In conclusion, I agree with Edison about the mathematical probabilities and biological uncertainties of triangulation with remote cousins.  If, however, you do get results suggesting ancestors so far back that genetic relationships are highly improbable do not immediately discard them as fallacious.  Check them out and you may discover that they are still genealogically valid.
by Dudley Miller G2G2 (2.1k points)
edited by Dudley Miller

Hi, Lewis. I'm giving your answer an upvote to offset the -1 I found.

But I do disagree with the content of the answer, and caution readers that autosomal triangulation back that far in the past is almost certainly a fallacy. Don't try that at home, folks.

The fact is, we don't have any evidence that the method we call "autosomal triangulation" or "segment triangulation" among distant cousins has any scientific validity at all. And there are a lot of biological reasons why it shouldn't be valid beyond several generations.

If you'd like some of my thoughts on the subject, I wrote about it here on G2G earlier this month.

Many thanks, Lewis, for expanding your initial post; a lot of good info there...and more food for thought for me if I ever get around to writing that two-parter "The Trouble with Triangulations." And thanks for contacting me directly because--you're correct--I didn't receive a notification since it wasn't a reply to my comment. Also, I hadn't connected that you were the same person I'd seen answering questions at physics.stackexchange.com. Makes me a little hesitant to ever bring up anything with math in it.

And that comment was one of my more...um...concise G2G posts. And I hope the brevity didn't get me into trouble because I want to be clear I wasn't stating that your conclusions were fallacious. We all have to apply the Genealogical Proof Standard (and I'll bet the term "proof" is kind of itchy for a theoretical physicist) on a case-by-case basis. How we choose to weight specific pieces of evidence, including DNA, is a personal decision. My comment was only a cautionary statement that the casual genealogist is safest, for now at least, in treating the concept and methodology of autosomal DNA triangulation to very distant cousins as potentially fallacious. As I said, the method has never been tested in a scientific environment, and there are numerous factors which imply it shouldn't be valid. Too, the WikiTree policy for triangulation was constructed in a vacuum--it is not a genetic genealogy consensus--and, perforce, it has to try to straddle the very difficult (and sometimes mutually exclusive) line between simplicity and accuracy.

And...I've once again run afoul of the G2G 12,000 character limit, so Part 2 will follow.

I'll try not to ramble, but I tend to "think out loud" as I type, and there are several thoughts I want to jot down before I sleep on it and lose them. One crucial observation you made was "We all know that statistically speaking we (or those of us sharing European heritage) all probably descend from Charlemagne (about 36 generations back)."

Our genetic family trees and our on-paper versions will almost never look the same, even if we believe we have accounted for pedigree collapse. Our genealogical version might better be termed a "web," as you said, but the genetic version will be closer to a bowl of spaghetti.
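
A quick bit of arithmetic shows why collapse is unavoidable at that depth. This is only a sketch: the function name is made up for illustration, and the 7.8 billion figure is an assumed present-day global population used purely as a yardstick.

```python
GENERATIONS_TO_CHARLEMAGNE = 36          # "about 36 generations back"
WORLD_POPULATION_TODAY = 7_800_000_000   # assumption: rough present-day figure


def pedigree_slots(generations: int) -> int:
    """Number of ancestor positions in a full pedigree at a given generation."""
    return 2 ** generations


slots = pedigree_slots(GENERATIONS_TO_CHARLEMAGNE)
print(f"Ancestor slots at generation 36: {slots:,}")
print(f"Slots per person alive today:    {slots / WORLD_POPULATION_TODAY:.1f}")
```

With far more slots than there have ever been people alive at one time, the same individuals have to fill many positions, which is the looping "web" in action.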

It's tossed around frequently that all humans are over 99% genetically identical. While that doesn't tell the whole story, it does do a bit of level-setting regarding the spaghetti bowl. Homo sapiens has no sub-speciation and, biologically speaking, no races (which taxonomically would require a sub-species). Neanderthal DNA makes up roughly 1.7% and 1.8% of European and Asian genomes, respectively, and up until a year ago we thought that modern sub-Saharan African populations contained only about 0.02% Neanderthal DNA. But a study published in Cell (Chen et al., February 2020) showed that people with African ancestry actually have closer to 0.5% Neanderthal contribution, due not to direct contact but to return migrations of modern humans who went to Europe, interbred with Neanderthals, and then came back to Africa. So in that thumbnail summary, we are all pretty much as genetically similar to Neanderthals as we are different from each other.

But that 99% figure is derived in much the same way as is the fact that I'm 41% genetically the same as a banana. All multicellular life on earth has genetic similarities...for instance, that we all have cells and that cellular structure and function is coded in our DNA. We still don't know how much--or exactly which--of our DNA is free to mutate without odd phenotypic or dangerous health results, but the current thinking is that somewhere around 20 to 25 million autosomal SNPs can distinguish one human from another, and around 5 million from within the same continental-level population. To qualify as a SNP, it's generally accepted that the polymorphism needs to be common enough to occur in at least 1% of the population. A few of the problems with our microarray testing and interpretation in a moment, but since--with the somewhat rare exceptions of ancient remains testing, a la our friends the Neanderthals--we can evaluate our genetic connectedness for genealogy only with living (or recently living) test takers, we have to frame those 20-25 million SNPs in light of a global population of 7.8 billion.

I believe one of the difficulties inherent in genetic genealogy is that we have many years of genealogical data to consider, and all of it is packaged for our perception as distinct, e.g., "And unto Enoch was born Irad; and Irad begat Mehujael; and Mehujael begat Methusael; and Methusael begat Lamech." A representation of singular, threadlike lineages that remain discrete and insular as we march across time.

Jokes about jumping into the gene pool for a swim, though, actually have merit. Stan Lee's imagination and the X-Men comic books aside, humans don't fabricate blocks of new DNA. Among the roughly 3.1 billion base pairs in a single copy of our genome are not infrequent point mutations, but the mutation options are quaternary: the choices are only adenine, cytosine, guanine, thymine, and they can pair up only as adenine/thymine and cytosine/guanine, so the flexibility is pretty limited. And to go to the next generation these can't be the result of any of the many thousands of DNA repair operations or millions of DNA duplications our body performs daily during mitosis...they have to be mutations that enter the germ cells.

A summary that population and evolutionary geneticist Graham Coop offers is that "nearly everyone in Europe is related to nearly everyone else over the past 1000 years, and likely everyone in the world is related over the past ~3000 years." When we move beyond recent relationships, our tests fairly quickly begin to show only a single shared segment of autosomal DNA...at least a segment of meaningful size and I, personally, don't consider 7cM meaningful as a single, standalone segment. More in a sec, but thanks to work by Tim Janzen and John Walden we know that SNP chip-reported 7cM segments survive traditional phasing only 42% of the time, meaning that 58% of the time they will be false because they appear in neither parent.

Multiple segments of a nominal size are indicative of genealogical relatedness, but single small segments should be taken with a few grains of salt. Amy Williams's research at Cornell shows, when considering segments equal to or greater than 7cM, that we seldom see more than one shared segment among 5th cousins, and almost never among 6th cousins or greater.

Coop writes, "A single example of a block of around this length [10cM] is not a particularly meaningful statement about genealogical relationship between two people..." From his published work with Peter Ralph, he concluded that "the typical age of a 10cM block shared by two individuals from the United Kingdom is between 32 and 52 generations (depending on the inferred distribution used)." And his simulations suggest that, by 12 generations back, there is a better than 80% chance that we inherit no measurable DNA from any specific ancestor.
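
To get a feel for where a number like that 80% comes from, here is a back-of-envelope sketch. It is not Coop's simulation. It assumes 22 autosomes and roughly 33 crossovers per meiosis (both values are my assumptions, as are the function names), and it treats the number of blocks inherited from one ancestor as Poisson-distributed. The approximation is poor for the first few generations and only becomes reasonable further back.

```python
import math

AUTOSOMES = 22                 # assumption: autosomes transmitted per parent
CROSSOVERS_PER_MEIOSIS = 33    # assumption: rough sex-averaged count


def expected_blocks(k: int) -> float:
    """Approximate number of blocks inherited from ONE ancestor k generations back.

    Along one parental side, the transmitted genome is cut into about
    22 + 33 * (k - 1) blocks spread over 2 ** (k - 1) ancestors.
    """
    return (AUTOSOMES + CROSSOVERS_PER_MEIOSIS * (k - 1)) / 2 ** (k - 1)


def prob_no_dna(k: int) -> float:
    """Poisson approximation of inheriting zero blocks from that ancestor."""
    return math.exp(-expected_blocks(k))


for k in (6, 8, 10, 12):
    print(f"{k:2d} generations back: P(no DNA) is about {prob_no_dna(k):.2f}")
```

Under these assumptions the value for 12 generations comes out a little above 0.8, in line with the figure quoted above.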

Part 2...

As the timeline deepens, the homogeneity of the gene pool increases and the pathway of inheritance of a small, single segment could be from any of several--or many--sources...most of them unavailable for us to analyze because it's rare to find genealogies with thorough accuracy even encompassing all 5th great-grandparents. That two distant cousins share the same small segment of DNA is actually no evidence that they inherited it from the same source, even if the paper trail indicates that should be the case. In populations that have expanded rapidly over the past 1,000 years--like that in Western Europe--Coop's numbers show that segments 5-10cM in size reach their maximum density of less than 0.002 probability at around 30 generations, and then continue at the same density before a sharp drop-off just after 100 generations. Put another way when associated with our SNP-chip test results, correlation does not constitute causality.

The mistaken impression among some genealogists is that three distant cousins sharing a segment on the same chromosome, a segment that overlaps on common start and end loci, is evidence the segment was inherited from the same ancestor. At least a portion of that uncertainty might be mitigated by mapping one's own--and that of close family, say through 2nd cousins--regions of potential haplotypic pile-up, or excess sharing, before starting down the triangulation road. But I know of very few who do that. Or that evaluate the tested SNP density of a segmental area in ratio to the purported matching SNPs, or that determine the areas of exonic overlap with the segment to minimize the approximately 19% of the SNPs in our current tests that focus on areas of clinical rather than genealogy/population interest.

Then there's the matter of the highly useful--but deceptive--centiMorgan. I believe most know that it isn't a physical measurement of anything, but merely an estimate of how often crossover will occur between two points during meiosis. The base calculation we still use today was formulated by a man named Damodar Kosambi in 1944, extended from the work of John Haldane. Deceptively simple, it is: x = 0.25 ln[(1 + 2y)/(1 − 2y)], where x is the map distance in Morgans (multiply by 100 for centiMorgans), ln is the natural logarithm, and y is the observed recombination fraction. But it, among other things:

  • Uses only a single averaged model for inter-crossover distances
  • Is only as accurate as the genome assembly and mapping function used
  • Accounts for crossover interference only through one fixed, built-in approximation rather than modeling it from observed data
  • Does not take into weighted account research over the past decade that's revealed over 50,000 crossover "hotspots"
  • Does not take into account, in males, the aging effects of DNA deamination from methylation
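
For those who like to see a formula run, here is a small Python sketch of the Kosambi calculation quoted above. It reads x as a map distance in Morgans (so 100x is centiMorgans), which is my interpretation of the formula, and the function names are made up for illustration. The inverse function uses the algebraically equivalent form y = 0.5 tanh(2x).

```python
import math


def kosambi_morgans(y: float) -> float:
    """Map distance x in Morgans from recombination fraction y (0 <= y < 0.5)."""
    if not 0 <= y < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 0.25 * math.log((1 + 2 * y) / (1 - 2 * y))


def kosambi_centimorgans(y: float) -> float:
    """Same distance expressed in centiMorgans."""
    return 100 * kosambi_morgans(y)


def recombination_fraction(x_morgans: float) -> float:
    """Inverse of the formula: y = 0.5 * tanh(2x)."""
    return 0.5 * math.tanh(2 * x_morgans)


for y in (0.01, 0.07, 0.20, 0.40):
    print(f"y = {y:.2f}  ->  {kosambi_centimorgans(y):6.2f} cM")
```

For small recombination fractions the centiMorgan value is almost the same as the percentage, and the two drift apart as y approaches 0.5.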

Just over a year ago, Caballero and Williams, et al., found that modeling crossover interference decreased the standard deviation of the proportion of DNA shared between close relatives by 10.4% on average, and that applying sex-specific genome maps rather than sex-averaged models (which all our testing and reporting companies use) increased that standard deviation by 4.2%. The latter reflects the fact that the female genome undergoes crossover at a rate about 70% greater than the male's.

All the testing companies still report based on the GRCh37 human genome assembly...which has known errors and was superseded by GRCh38 more than seven years ago, in December 2013. Which is a good segue to note that, with our microarray test results, we never receive precise segment start and stop positions for triangulation. We can't because we're testing only one in about every 5,000 base pairs. Segment start and stop points are estimates...and there's a whole lot of segment contiguity assumptions and imputation going on under the hood. (A quick aside here is that GEDmatch offers the option to view segment start and end points as mapped to both GRCh37 and GRCh38, but it uses only GRCh37 for centiMorgan calculation.)

The need for some of the assumptions and, frankly, guesswork stems from the changing microarray chips in use. For example, a 23andMe v5 test and an FTDNA v1 test share only about 23% of their SNPs. When we're starting out examining only about 0.02% of the genome to begin with, a disparity of roughly 77% in markers tested ain't exactly a boon to accuracy. But it's a sliding scale: if we're dealing with large segments, assumptive errors don't have as much impact; but when dealing with small, singleton segments, errors with assumptions and estimates and imprecise centiMorgan calculations can render the supposed segment invalid.
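
The coverage figures are easy to check with a few lines of Python. This sketch assumes a genome of about 3.1 billion base pairs and a chip of about 650,000 SNPs (the "usual ~650K" mentioned elsewhere in this thread). It also treats both chips as the same size, which is a simplification, and the variable names are my own.

```python
GENOME_BP = 3_100_000_000   # assumption: approximate size of one genome copy
SNPS_ON_CHIP = 650_000      # assumption: typical microarray SNP count
SHARED_FRACTION = 0.23      # overlap between the two chips, as stated above

bp_per_snp = GENOME_BP / SNPS_ON_CHIP          # spacing between tested SNPs
percent_tested = 100 * SNPS_ON_CHIP / GENOME_BP
percent_not_shared = 100 * (1 - SHARED_FRACTION)
shared_snps = SNPS_ON_CHIP * SHARED_FRACTION

print(f"About one SNP tested per {bp_per_snp:,.0f} base pairs")
print(f"About {percent_tested:.3f}% of the genome examined")
print(f"About {percent_not_shared:.0f}% of markers not in common")
print(f"About {shared_snps:,.0f} SNPs actually comparable between the chips")
```

So a cross-company comparison on these assumptions rests on well under a quarter of the markers either test reports.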

Building on that, and as a final comment, I have some numbers from GEDmatch. These are my own, so no broad verification, just examples; but I've seen similar percentages from others.

Last year I extracted results from my 30X whole genome sequencing and uploaded that to GEDmatch. I don't know the entire catalog of SNPs and loci that GEDmatch will accept into their database, so the only available alternative is to look at all the different microarray versions in use and concatenate all the distinct SNPs tested, then extract those from the WGS results.

I think the same general impression holds true for autosomal DNA as for yDNA: the more markers tested, the more matches you'll get. In fact, it's the inverse: the more markers tested, the fewer matches you should see because some false or weak matches can be eliminated and the greater the resultant accuracy of the matches reported.

The WGS extracted "kit" I uploaded to GEDmatch contained 2,080,567 SNPs rather than the usual ~650K. Using their "DNA File Diagnostic Utility" nets you the total number of matches--not matching individuals, but total matches--that the kit has in the database. My WGS-extraction kit currently has 56,802. Since that kit contained about 220% more SNPs than any of the standard microarray tests, those matches should be more accurate, and fewer, than the data from individual tests I took and uploaded. I still have those tests on GEDmatch as "research kits."

The other tests are: 23andMe v5, 161,680 matches; MyHeritage v2, 117,136 matches; and FTDNA v3, 109,359 matches. So on average, the increased number of reported SNPs reduced my number of matches by about 56%, leaving roughly 44% of the microarray average. With the potential implication--and this would no doubt apply to small, single-segment matches only--that by using small segments for triangulation with one of those microarray tests rather than the 2 million SNP WGS results, a significant number of the segment matches would have been invalid at the outset, but I would never have known it.
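
Here is the arithmetic behind those comparisons, as a short Python sketch using only the counts given above. The ~650K figure is treated as exactly 650,000 for the calculation, and the variable names are my own.

```python
WGS_MATCHES = 56_802
WGS_SNPS = 2_080_567
TYPICAL_CHIP_SNPS = 650_000   # the "usual ~650K", taken as exact here

microarray_matches = {
    "23andMe v5": 161_680,
    "MyHeritage v2": 117_136,
    "FTDNA v3": 109_359,
}

mean_microarray = sum(microarray_matches.values()) / len(microarray_matches)
remaining = WGS_MATCHES / mean_microarray        # share of matches that remain
reduction = 1 - remaining                        # share of matches that drop out
extra_snps = WGS_SNPS / TYPICAL_CHIP_SNPS - 1    # additional SNPs in the WGS kit

print(f"Mean microarray matches: {mean_microarray:,.0f}")
print(f"WGS kit keeps {remaining:.0%} of that, a drop of {reduction:.0%}")
print(f"WGS kit has {extra_snps:.0%} more SNPs than a typical chip")

for kit, count in microarray_matches.items():
    print(f"{kit}: WGS kit has {1 - WGS_MATCHES / count:.0%} fewer matches")
```

The WGS kit returns about 44% as many matches as the microarray average, which works out to a drop of about 56%.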

Still and all, I obviously consider DNA for genealogy immensely fun and useful stuff. A lotta times on G2G I feel like a devil's advocate, that I'm trying to point out all the ways DNA shouldn't be used. But it really isn't that at all; only that we (meaning the scientific community) still don't know everything there is to know about genetics. New stuff is published every month. So we always have to remain critical of information and methods we think are simply a given. Heck, we still haven't yet done telomere-to-telomere sequencing of all our chromosomes.


...