I would like to canvass opinions on what should be included in a BigY-700 DNA confirmation statement

+8 votes
305 views
I am in the process of writing a BigY-700 Y DNA confirmation statement for a couple of paternal line 5C1R fellows with paper records that already indicated their relationship was solid, but the two BigY test results confirm this relationship with an average of 2 private SNPs each.

There are no guidelines for BigY DNA confirmation statements for a male line relationship chain that I can find, so I am putting it out there: what should be included?

I note that Greg Clarke in a recent post is intending to include this feature in a future update of his confirmation app, so maybe now is the time to start thinking about exactly how WikiTree should be utilising the level of detailed data that the BigY-700 test offers for blokes that ARE related on their paternal lines, but at a much greater generational distance than the current and outdated STR based system allows for.
in Policy and Style by Matt Kitching G2G2 (2.9k points)

5 Answers

+5 votes
One thing missing from all the current confirmation statements is verifiability, except for cases where the testers use GEDmatch and their kit numbers are specified.

With yDNA, we have the opportunity to verify at least some of the data if the testers are included in a public project at FamilyTreeDNA. I don't suppose we would want to require membership in a project, but we could encourage it and allow/recommend a statement such as, "Both testers are shown in Lineage 5 - Col. John George of the [https://www.familytreedna.com/public/george?iframe=ycolorized George DNA Project]."

We might also have a link to the Group Time Tree, such as, "They are also shown in the project's [https://discover.familytreedna.com/groups/george/tree?subgroups=19984 Group Time Tree]."

Note that all that information is already public but testers are identified only by kit number and surname, so there should be no privacy problems if they are not WikiTree members.
by Bennet George G2G6 Mach 2 (23.4k points)
+4 votes

This is what I have on my dad's profile:

by Darlene Athey-Hill G2G6 Pilot (547k points)
This is nice, but only gets at the STR matching. The post asked about Big-Y confirmation, which should also include SNP information.

A 47/48 marker match is wonderful to have, but IMO marking relationships as confirmed based on this alone would not be consistent with the standards we use for autosomal DNA. Those standards require the DNA measurement to conform to the proposed cousin relationship by having the measured cM amount be in an appropriate range. There is no reasonable way to set such a standard using just STR matching (or so I understand). But there is a way using Big-Y testing, since that provides estimates for the year of birth of the most recent common ancestor.

As you know, the use of Y-DNA to confirm relationships is much different than using autosomal. So the measured cMs don't play a part with Y-DNA. I was not the person who came up with the statement. Peter, one of the team leaders of WikiTree's DNA Project, created it. My dad and Whit Athey (Athey-170) have the same Y haplogroup (G-FGC52664), an extremely rare haplogroup. My dad has done the Y-700; Whit did the Y-111.

 I'm fortunate in that Whit is a VIP in the DNA field and manages our Y-DNA project.  Whit created the Y-haplogroup predictor still in use today.

Anyway, sorry my information isn't of any help.  Good luck!

+5 votes

IMO, there is too much information to try to distill into just a "citation" style confirmation statement, as is commonly done with autosomal statements. The statement would need two clearly separated parts for STR information and then SNP information. As Bennet said, the STR part should include reference to the appropriate surname project and lineage grouping. It should also include the access date, since this information may change over time. But ideally, it would also indicate the number of markers used in each of the different tests, and at minimum the genetic distance between them. Better yet would be to include the specific markers where differences occur.
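As a rough illustration of the STR part described above, here is a minimal Python sketch that reports the number of shared markers, a genetic distance, and the specific markers that differ. The marker values and the `compare_str` helper are hypothetical, and the distance is a simple stepwise sum of differences; FTDNA's own calculation may treat some markers differently.

```python
from typing import Dict, Tuple


def compare_str(a: Dict[str, int], b: Dict[str, int]) -> Tuple[int, int, Dict[str, Tuple[int, int]]]:
    """Compare two STR result sets on the markers both testers share.

    Returns (number of shared markers, stepwise genetic distance,
    {marker: (value_a, value_b)} for markers that differ).
    The stepwise sum is a simplifying assumption, not FTDNA's exact method.
    """
    shared = sorted(set(a) & set(b))
    diffs = {m: (a[m], b[m]) for m in shared if a[m] != b[m]}
    distance = sum(abs(x - y) for x, y in diffs.values())
    return len(shared), distance, diffs


# Hypothetical marker values for two testers who took tests of different sizes
tester_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS439": 12}
tester_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

markers, gd, differing = compare_str(tester_a, tester_b)
print(f"GD {gd} on {markers} shared markers; differences at: {differing}")
```

Only markers present in both results are compared, which is why the number of markers in each test matters for the statement.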

The SNP part would include terminal haplogroup and access date, since those are subject to change. It should also include the estimate of the time to most recent common ancestor, since that is the Big-Y equivalent of the required check for autosomal DNA confirmation that the estimated relationship distance is consistent with the proposed relationship between the matches. (If the proposed paper-trail common ancestor was born outside of the estimated window given by FTDNA, then the confirmation statement should not be made without further investigation.)
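The window check in the parenthetical above can be sketched in a few lines of Python. This is only an illustration: the `TmrcaEstimate` class, the function name, and the example years are assumptions, not anything WikiTree or FTDNA publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TmrcaEstimate:
    """A TMRCA estimate expressed as a birth-year window (CE years)."""
    earliest: int  # lower bound of the confidence interval
    latest: int    # upper bound of the confidence interval


def ancestor_fits_window(birth_year: int, estimate: TmrcaEstimate) -> bool:
    """Return True if the paper-trail ancestor's birth year lies inside the window."""
    return estimate.earliest <= birth_year <= estimate.latest


# Illustrative numbers only
window = TmrcaEstimate(earliest=1500, latest=1850)
print(ancestor_fits_window(1628, window))  # True: consistent with the estimate
print(ancestor_fits_window(1450, window))  # False: investigate before confirming
```

A failed check would not disprove the relationship; it would only flag that the statement should wait for further investigation.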

One issue with creating such a statement and putting it on a lot of profiles is that as soon as the terminal haplogroup changes, you would have to revise a whole lot of statements across a lot of profiles. Similarly, as new STR kits come in, you'd like to be able to update the statements with the added information. And since the matches used in the STR part versus the SNP part can be different, that would add additional complication to writing the confirmation statement.

So it seems to me a better solution is to create a freespace page for Y-DNA from a specific ancestor or group of ancestors. A citation statement could then be much shorter, with a link to the freespace page, and it could be written in a way that you wouldn't have to find and revise statements across a bunch of profiles every time, say, a terminal haplogroup changes. 

The page would then allow room for deeper analysis, say of lineages with matching DNA but as yet no identified common ancestor, or about issues where a paper trail identifies one of several brothers as an ancestor to a kit, but where the particular brother is not known. You could indicate proposed STR or SNP mutations that are probable indicators for certain lineages, or else state explicitly when targeted testing has led to the unfortunate situation where it seems one brother had no additional tested mutations compared to the father. This sort of analysis could be put on a single proposed ancestor's profile, with DNA confirmation statements linking to that profile. But a freespace page would handle also the case of several related lineages with no known common ancestor, or the case where there is a more recent lineage that clearly descends from the common ancestor of a group, but with no known paper trail to connect them.

by Barry Smith G2G6 Pilot (297k points)
I agree with Barry that there likely isn't going to be a practical one-size-fits-all way to properly document DNA evidence derived via Big Y testing. A hallmark of WikiTree's "Confirmed with DNA" practices from the beginning has been to have something that is simple enough for everyone to use and easy to explain without in-depth training or experience.

While I believe it would be possible to create a halfway decent set of work instructions to satisfactorily approach evidence used in a conclusion that meets the Genealogical Proof Standard, I think that set of instructions would perforce be a complex and lengthy series of interconnected if/then/else steps.

The only--admittedly minor--reason I'm not always a fan of using FreeSpace pages as repositories for explanation of evidence analysis is that they are completely disconnected from the genealogy profiles when it comes to pulling GEDCOM backups. Yep, everything in the biography box, including all source citations, comes down as GEDCOM notes fields, but unlike a fully-qualified citation to an external source that has a hope of being located for evaluation if WikiTree is unavailable, that extensive evidentiary write-up on a Space page may someday go the way of a web 404 error.

If you're spending the time to do a bang-up job on a Space page, it might make sense to also save that as a PDF file, upload it elsewhere for additional storage, and then possibly reference it as an "also see" in the profile citations.
+4 votes

How about giving the testing company and test name, then simply stating the man's measured haplogroup along with its estimated year of origin?  For example, FTDNA, Y-111:  I-M253 (4300 BCE), or FTDNA, Big Y-700: I-FT26227 (1750 CE).  This information would provide a basis for further examination of the man's haplotree and/or comparison with other men's results. 
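For anyone wanting to generate that line consistently across profiles, a small Python sketch follows. The function names are hypothetical, and negative years are used here as a convention for BCE.

```python
def format_year(year: int) -> str:
    """Render a signed year as 'N CE' or 'N BCE' (negative input means BCE)."""
    return f"{year} CE" if year > 0 else f"{-year} BCE"


def haplogroup_line(company: str, test: str, haplogroup: str, origin_year: int) -> str:
    """Build 'Company, Test: Haplogroup (year)' in the style proposed above."""
    return f"{company}, {test}: {haplogroup} ({format_year(origin_year)})"


print(haplogroup_line("FTDNA", "Y-111", "I-M253", -4300))
print(haplogroup_line("FTDNA", "Big Y-700", "I-FT26227", 1750))
```

The two calls reproduce the two examples given in the answer.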

by Robert Petty G2G4 (4.4k points)
I like it.  Hopefully I can update my deceased father's DNA info.
Thanks Laurie.
If it's a Y-DNA test through FTDNA, they can upgrade the test using the original sample - you just have to pay the upgrade price.
Understood. I have upgraded numerous DNA tests in FTDNA.  I was referring to updating my father's DNA information in Wikitree with more information.
Got it.
This is useful information to place on a profile, but it is not enough to mark specific relationships “confirmed with DNA” IMO, which is what the post asked about.

I propose the following (without all the extra spacing between the lines, which showed up when I pasted from Word):

Y-SNP (Big Y) DNA Confirmation of:

Robert Lee Petty (Petty-2644): FT26227

With:

James Winter Petty (Petty-3070): Z39481

MRCA:

Thomas Petty (Petty-8): I-BY34474  

SNP Sequences from MRCA Haplogroup:

I-BY34474 > BY120617 > FT14616 > FT26227

I-BY34474 > FT404340 > Z39481

Hi, Robert. I can follow along with that and there's no disagreement per FTDNA's TMRCA predictions with the date to the common ancestor. But a couple of minor things:

Petty-8 shows to be Thomas Petty, born c. 1673; this is also who the WikiTree Relationship Finder shows as the MRCA. Hubert Petty is Petty-7, Thomas's father.

There may have been recent changes to the haplotree for these subclades, but FT122807 currently shows as not being applicable to either Petty-2644 or Petty-3070. The way I read the branchings from BY34474 is:

I-BY34474 > BY120617 > FT14616 > FT26227

I-BY34474 > FT404340 > Z39481
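As a quick cross-check of the two branchings, here is a minimal Python sketch that finds the deepest haplogroup the two paths share and counts the branch steps below it. The helper names are hypothetical; the paths are the two listed above.

```python
from typing import List, Optional


def parse_path(text: str) -> List[str]:
    """Split a 'A > B > C' haplogroup path into its branch names."""
    return [part.strip() for part in text.split(">")]


def deepest_shared(path_a: List[str], path_b: List[str]) -> Optional[str]:
    """Return the last branch the two paths have in common, walking from the top."""
    shared = None
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared = a
    return shared


path_2644 = parse_path("I-BY34474 > BY120617 > FT14616 > FT26227")
path_3070 = parse_path("I-BY34474 > FT404340 > Z39481")

print(deepest_shared(path_2644, path_3070))      # I-BY34474
print(len(path_2644) - 1, len(path_3070) - 1)    # 3 and 2 branch steps below it
```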

Of course, WT has no standard for use of Big Y SNP data and, while STRs can be...variable, I always like to include mention of STR differences when I chart yDNA relationships. You may already do this, and I suspect that WikiTree privacy standards--given that in the "confirmation" statement we don't mention actual STRs that differ, and can't mention autosomal segment start/stop points--won't allow it to be shown on profiles.

Just noting that for our own research, additional STR information can help distinguish specific lines, particularly for basal haplogroups that display much longer SNP chronological intervals than do common ones like "R" and "I". And I always log the data about individual SNPs that are shown as mismatches, as well as all private variants.

Thanks for the example.
Thanks very much Edison, for your comments as well as for catching the errors - I guess I was in too much of a hurry to get it posted, and did a very poor job of entering and/or proofing my entries.  I do tend not to pay much attention to STR results, as the SNP results are so much more straightforward (I feel) - but I will do so in the future!
+3 votes

This is the statement I have on my profile for a match.
 

Roy Hudson: YDNA Test Big-Y 700 (FTDNA)
Relationship to me: 7th Cousin Twice Removed

1. Roy Hudson (Living)
2. Isaac Newton Earle III (1917-1967)
3. Isaac Newton Earle Jr (1886-1938)
4. Isaac Newton Earle (1853-1926)
5. Alfred Weston Earle MD (1813-1881)
6. Marmaduke Sidney Earle I (1769-1856)
7. Morris Earle (abt.1734-1780)
8. Marmaduke Earle Sr (1696-bef.1765)
9. Edward Earle Jr. (abt.1668-abt.1713)
10. Edward Earle Sr. (abt.1628-1711) England
  • Paternal relationship is confirmed through Y-chromosome DNA testing at Family Tree DNA. Glenn Earls, FTDNA kit #885704, and his 7th Cousin Twice Removed, Roy Hudson, FTDNA kit #AM13992, match at a Genetic Distance of 4 on 111 markers, thereby confirming their direct paternal lines back to their MRCA Edward Earle Sr. (abt.1628-1711). Based on a Genetic Distance of 4 at the Y-111 test level, Glenn Earls and Roy Hudson are estimated to share a common paternal line ancestor who was, with a 95% probability, born between 1500 and 1850 CE. The most likely year is rounded to 1700 CE. This date is an estimate based on genetic information only.
by Glenn Earls G2G6 (8.2k points)

To my understanding, that "Confirmed with DNA" citation absolutely conforms to existing WikiTree policies and standards.

And it's also an example of why the yDNA policy, being grounded in technology that's almost a decade and a half old (the 111-marker test was introduced in 2011), doesn't provide much substance in terms of positive evidence for genealogy.

The date ranges noted are from the current static table at FTDNA that doesn't take into consideration the vastly different STR average mutation rates (the most variable markers show rates over 220 times faster than the slowest). The end result is a 95% CI that carries an approximate date range of 1500 to 1850; 350 years is a big gap in genealogical terms which, for a patrilineal line with an average generational interval of 32 years, means 11 generations.

Plus, the two men referenced have different surnames and there is no reconciling explanation for the surname break (one of the WikiTree profiles is private so there's no opportunity for third-party investigation).

As an example of the lack of TMRCA reliability in STRs alone, in a fairly mature yDNA project we have almost 30 men who have taken a Big Y test. The actual haplogroup project is R-BY3332, and our work is a subproject within that. In it we have a couple of GD4 at 111-marker individuals who, according to the STR chart, should fit in the 1500-1850 TMRCA range with a 95% CI.

In fact, their Big Y tested haplogroups are R-BY51514 and R-BY22196. Their common haplogroup parent is R-BY22166, one level below R-BY3332.

The FTDNA Discover tool places the emergence of BY22166 at 1050 CE: 800-1230 CE at a 95% CI, and 919-1136 CE at 68%. Our own analysis of the specific individuals in the project matches the Discover tool's estimates: before the tool was available, we had placed the coalescence date at circa 900 CE.

So having detailed SNP data moves, with solid evidence, the median TMRCA date from 1700 CE per STR information only, to 1036 CE when the Big Y results are used. This is a massive difference that exceeds even the lower end of the STR-only 95% CI range back another 450 years. In total, the discrepancy between the median TMRCA estimates represents about 21 patrilineal generations.
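The generation counts quoted in this comment follow from dividing a span of years by the 32-year patrilineal interval. A small Python sketch of that arithmetic, using only the figures given above (the function name is mine):

```python
GENERATION_YEARS = 32  # average patrilineal generational interval used above


def years_to_generations(years: float, interval: float = GENERATION_YEARS) -> float:
    """Convert a span of years into an approximate number of generations."""
    return years / interval


str_ci_width = 1850 - 1500   # width of the STR-only 95% CI: 350 years
median_shift = 1700 - 1036   # STR median vs. Big Y median: 664 years

print(round(years_to_generations(str_ci_width)))   # 11
print(round(years_to_generations(median_shift)))   # 21
```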

The original question was about the "format" of a DNA confirmation statement. I provided the statement I have put on "my profile" to show the lineage and the DNA connection. 

I used the FTDNA information because that is the information that FTDNA has. If you don't like the FTDNA information, you should take it up with them.

The change in name is explained in Roy Hudson's father's profile. 

Roy Hudson's profile is private because he is living and does not want to share his information. 

Roy Hudson has a documented and sourced lineage back to Edward Earle b. 1628

The matching MRCA could go back further than that but it is not known who is the father of Edward Earle b. 1628.

"The original question was about the "format" of a DNA confirmation statement."

In fact, your use of the word "format" is the first time the word had been written either in the original question or anywhere in the answers or comments. The question was about "what should be included in a BigY-700 DNA confirmation statement."

I wasn't picking on the statement you presented. Honest. I actually said it "absolutely conforms to existing WikiTree policies and standards."

But the first iteration of the Big Y test premiered in late 2013 and, in the intervening years, our knowledge about the testing and the results has grown by leaps and bounds. In 2014, FTDNA and the Genographic Project jointly released their then-new, combined yDNA haplotree; it contained just over 1,200 branches. This morning that figure was 79,149 branches derived from approximately 359,000 Y-SNP test-takers.

I have zero problems with the "Confirmed with DNA" citation statement as you wrote it. A central issue, though, is just how meaningful are the WikiTree parameters for yDNA "confirmation" now a decade after the Big Y test was launched and we have mounting evidence that STR data alone can't do much of a job in providing reasonably solid TMRCA estimations.

If there's going to be consideration for a citation statement that uses information from the Big Y (or whole genome sequencing) tests, I think it would be worthwhile revisiting the existing WT standards.

For example, it was brought up several months ago that the instructions for yDNA "confirmation" tell members to use "the red TiP icon for a TiP report." That red icon and the corresponding report went away as of February 2023. Can't follow the instructions because the instructions are no longer valid. The example citation can't be followed, either: there's no longer a TiP report that shows a generation-by-generation probability...because FTDNA did away with it since newer methods of TMRCA evaluation indicated that the old TiP report--as all of us FTDNA Group Project admins knew--was often way off the mark even as broad and non-specific as it was.

Anyone who tries to follow the current WT guidelines will, at best, arrive at the decision that a GD of 3 at 37 markers is perfectly fine for a genealogical "confirmation" statement, even though the 95% CI for that is now 650 CE to 1750 CE. That puts us sometime from around 150 years into Anglo-Saxon rule in England--prior to the arrival of the Vikings and over 400 years before William the Conqueror--to a couple of decades before the signing of the U.S. Declaration of Independence. Not as bad as mtDNA at trying to determine timeframes, but still not terribly accurate given that today, in the R yDNA haplogroup, it isn't uncommon for us to be able to tighten the estimation to as few as 80 or 90 years.
