Need help understanding some issues on DNA, please.

+7 votes
874 views
Hi,

I have taken two tests and know what each says my ethnicity estimates are, but that is pretty close to all I know about these tests.

I am curious first about the estimates. One ethnicity is about 9 percent off between the two, one is exactly the same, and the rest differ, but not by much. One, however, shows up on one test and not on the other at all. How can this be?

My other question is how far back do I need to look in my tree, with regard to the results, to find where these ethnicities came from? For example, I have Scandinavian and Norwegian DNA. How many generations back should these ancestors be?
in The Tree House by Lisa Murphy G2G6 Pilot (344k points)

5 Answers

+12 votes
 
Best answer

Hi Lisa,

I agree with the other answers regarding the accuracy of ethnicity estimates. I take them with a healthy pinch of salt as a grain is not quite sufficient. And I don't pay much attention to ethnicity estimates of less than 5%. Another important thing to remember is that the old adage "absence of evidence is not evidence of absence" applies very well to ethnicity estimates. If you have more than 10% of something it's probably meaningful; it came from somewhere. But if something doesn't show up at all, it does not mean that you don't have ancestors of that ethnicity, especially if they are more than five or six generations back.

As far as how far back you have to go, it depends on how much of that ethnicity you have. And again, there are no absolutes here. You can get a very rough estimate by halving the percentage for every generation: 50% for your parents, 25% for your grandparents, etc. While this is very accurate for your parents it goes downhill pretty quickly beyond them.

I'd guess your Scandinavian comes from your Norwegian 2nd GGF Cyrus John Hagen. The rough estimate for a 2nd GGP would be 6.25%, but in reality you may have inherited much less or significantly more DNA from him than that.
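
To make the halving rule of thumb concrete, here is a minimal Python sketch. The function names are made up for illustration, and the numbers are long-run averages only: as noted above, the share you actually inherited from any one ancestor beyond your parents can be noticeably higher or lower.

```python
import math


def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA expected from ONE ancestor
    who is `generations_back` generations above you (1 = parent)."""
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or more")
    return 0.5 ** generations_back


def generations_for_share(percent: float) -> float:
    """Rough inverse of the rule: how many generations back a single
    ancestor would sit if they alone accounted for `percent` of your DNA."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return math.log2(100 / percent)


if __name__ == "__main__":
    # 1 = parents, 2 = grandparents, 4 = 2nd great-grandparents, and so on.
    for n in range(1, 7):
        print(f"{n} generation(s) back: {expected_share(n) * 100:.2f}% per ancestor")
    # A 9% estimate falls between 3 and 4 generations back under this rule.
    print(f"9% is about {generations_for_share(9):.1f} generations back")
```

Four generations back (a 2nd great-grandparent) gives the 6.25% mentioned above. A percentage can also come from several ancestors on different lines, so treat the inverse as a starting point for where to look in the tree, not as an answer.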

All that said, the real value in your DNA tests is in the matches when it comes to genealogy. I'd be willing to bet that you have some 3rd or 4th cousin matches who are descendants of Cyrus' siblings (if he had any) or his aunts or uncles. DNA testing is pretty popular in Norway. I have tons of matches who have nothing but Scandinavian ancestors who are related to me through my great grandmother who was born in Norway. If you're lucky, you might even be able to extend your tree by researching them, especially since I've found that many of my Scandinavian matches have well researched and quite extensive trees. Just be careful, the patronymic naming thing gives me a headache every time I research that branch.

by Paul Chisarik G2G6 Mach 3 (34.6k points)
selected by Lisa Murphy
Thanks Lisa, let me know if you want some help getting started at looking into those matches.
I found your commentary about Scandinavian roots interesting. If you were to look at my tree, you would see that 11 of my 16 gt-gt grandparents were either born in German-speaking areas, or Ireland, or their parents were, while the other 5 were born in America, with only a few lines traceable back to Europe. So, no recent Scandinavian roots.

Yet, when I look at my MyHeritage matches I see lots of people who are Finns or Swedes, with completely Scandinavian roots. I attribute this to ancestry going back to New Sweden, and I have at least one ancestral line that is of a Finn who immigrated to New Sweden in 1653. Undoubtedly, this is telling me that there are undiscovered lines back to New Sweden.

What gets me about my distant Finnish/Swedish cousins is that it seems like even when I see that some of them match each other with hundreds of cM, I still often can't see how they're related, even though they give trees going reasonably far back.

I get the impression from this that these Scandinavian countries are endogamous populations - everybody is distantly-related to everybody in so many ways that cM numbers don't mean very much. I wonder if that's something that's well known about Scandinavia, or whether what I think I'm seeing is just some sort of anomaly. These are not huge countries - they generally have about 10 million people or less, if I'm not mistaken.
+18 votes

The ethnicity estimates are just that -- estimates. Depending on the test, the underlying data used to determine ethnicity varies from company to company. You will also find that the estimates change over time as the datasets are improved. My own tests range pretty widely from being entirely British Isles and France to having some Eastern European and Jewish to another saying there is Chinese, Southeast Asia, Native American (both North and South American). I discount that last company entirely.

At present, the tests are mostly a fun marketing gimmick but can provide some hints. How far back to go is hard to estimate, and the smaller the percentage, the further back you might have to go. Following the paper trail is really all you can do. If you have good DNA matches with someone who doesn't tie back to your paper trail, there might be some hints there.

by Doug McCallum G2G6 Pilot (542k points)
Hi, Doug.  I'm going to guess that the company you're discounting has made this mistake:  They are saying that you are descended from people when it's probably the case that you and that person are both descended from a common ancestor.  

I have my DNA with one company (not one of the major genealogy companies) that says that I'm descended from someone in Peru, from some old DNA they got from some bones.  Well, my family is very consistently NOT from the New World!  They also say I have some Dai Chinese in me.  Nope.  Actually, this Peruvian and this Chinese Dai and I are all probably descended from the same common ancestor, from way back, or maybe from two different ancestors, one with the Chinese Dai and the other with the Peruvian.  

Of all things, that Peruvian was a child who never had any children (since she was dead as a child)!  She most certainly was not my ancestor!  This means that she and I share a common ancestor from centuries ago.

That fact, though, is kind of cool in and of itself.  Same with the Chinese Dai.  We share some common ancestor many, many, many generations ago!  

For the Peruvian, many settlers there came from Spain. Well, I DO have "Western Mediterranean" in my DNA, per more than one DNA company!  She and I probably had a common ancestor from Spain!  Don't know how many generations back, but it's intriguing to think about this!

It sounds like a similar thing has happened with that one DNA company you're discounting.  Rather than looking at it as ancestors, consider that those "far off places" actually have someone who is descended from a common ancestor with you--many generations back!  

If you could trace migration patterns to those areas, you might have clues to your deep ancestry!
The company tends to describe it as having an ancestor who... Anyway, they are such an outlier, with their breakdown being quite far away from those of the other 5 companies I have data from. Their "advanced" ethnicity is the furthest from reality. For example, they say that Tamil ancestry entered in the past 5 generations (1840-1905) and Southern Han Chinese in the past 6 generations (1820-1880), and yes, they give those date ranges. Since I have paper trails that go back further than those timeframes, and those ancestors lived mostly where there wouldn't have been an opportunity to bring those genetics into the picture, I have to discount them at this point.

It sounds as if we might be talking about MyTrueAncestry. If so, I've expressed my views about the company in the past (for example, 29 December 2020 and 31 January 2020, among others). In my informed opinion, I'll refrain from using the "S" word but this is one DNA-related company I advise folks to steer clear of: there's no science of any genealogical value going on there...and even application of the term "science" is very much debatable.

Hi, Edison,

I've never heard of MyTrueAncestry.  The company I was referring to was CRI Genetics, which gives information about health-related things, like "fat" genes, which is what I was interested in.  They also provide recipes for cooking!

There was some information there that was connected to a medical issue of mine.  It indicated that my genes didn't work one way but worked another way.  (I don't remember the details in order to share this more clearly.)  Of all things...  I took that information to my doctor, and it turns out that the medicine he had me on that DID work (whereas others didn't) was related to my genes!  Now I understand better why some medicines work and others don't.  I need them to act in a certain way because I'm genetically predisposed to that!  Cool!  (Maybe information such as this can be used to help doctors prescribe medications better.)

In addition, CRI Genetics provides some ancestral information, including ancient ancestors and famous people of today.  For instance, according to them, the actress Eva Longoria and I share the same mitochondrial DNA, which means that she and I have some common female ancestor along our maternal lines.

She has Spanish roots, and several companies with which I've uploaded my DNA agree that I have some Spanish (that is, southwestern Mediterranean) background, farther back than I've researched, so things are confirmed here through this company saying that Eva Longoria and I share some DNA.  Also, now this tells me that it's through my maternal line that goes back to Spain, since it's the mitochondrial DNA.  

While CRI Genetics worded things wrong concerning my ancestry (about the Peruvian and Chinese Dai mentioned in a previous message), they are useful to me in other ways.  

Each company has its strengths and weaknesses.  Each one can add pieces to the DNA/ethnicity/ancestry puzzle, when we're willing to find it!  

I find it fascinating!

Thanks for your comment.

Hi, Kathy. I suppose I had a 50/50 chance of guessing correctly when I chose MyTrueAncestry. CRI Genetics is the other company I could never in good conscience direct anyone to for testing or evaluation.

I don't remember ending up as "anonymous" in the thread (gasp; that means my total G2G word count is even higher than I've estimated...over 2 million words), but here is a G2G question from 2019 about CRI Genetics and its services.

I just did a quick check, and at least CRI has been able to boost their Better Business Bureau rating up to a "B" now by improving their responses in resolving issues. That said--and the science, or semblance thereof, aside--they've had 685 BBB complaints logged in the last three years. For a less-than-large player in the industry, that's an awful lot.

On the science front, the truth is that inexpensive microarray tests have limited capability to inform us about medical/clinical conditions or risks, and extremely little, if anything, about phenotypical issues like fat metabolism, or response to exercise or a specific diet.

The federal Food and Drug Administration has, in the now 21 years that we've had any form of direct-to-consumer DNA testing, approved a grand total of only four tests, all from 23andMe. And even at that, the BRCA analysis remains controversial because, while valid, it can't take into account the other various factors involved in breast cancer, so interpreted false negatives are not uncommon. The latter is also partly due to the fact that the majority of physicians--unless their specialties call for it--are still not well-informed regarding the current state of genetics research; that remains a primary focus of the American Society of Human Genetics.

The reason the microarray tests are so limited is simple: for DTC testing, the maximum number of single nucleotide polymorphisms examined numbers about 700,000, or about 0.023% of the genome. These are not protein coding genes that are examined, but merely individual DNA "letters" among the ~6.12 billion (~3.06 billion base pairs) that we have. The average size of a protein coding gene is somewhere around 2,100 base pairs; the largest one known codes for a protein called dystrophin and spans about 2.4 million base pairs.
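
As a quick check of that percentage, here is a small Python sketch using only the round figures quoted above (about 700,000 SNPs tested and about 3.06 billion base pairs). The function name is hypothetical and the inputs are approximations, not exact counts for any particular company's chip.

```python
SNPS_TESTED = 700_000               # approximate upper bound quoted for DTC microarrays
GENOME_BASE_PAIRS = 3_060_000_000   # approximate haploid genome length in base pairs


def coverage_percent(snps: int, base_pairs: int) -> float:
    """Percentage of genome positions that a set of tested SNPs represents."""
    if snps < 0 or base_pairs <= 0:
        raise ValueError("snps must be >= 0 and base_pairs must be > 0")
    return snps / base_pairs * 100


if __name__ == "__main__":
    pct = coverage_percent(SNPS_TESTED, GENOME_BASE_PAIRS)
    print(f"Coverage: {pct:.3f}% of positions")
    # Equivalent view: roughly one tested position in every few thousand.
    print(f"About 1 position in every {GENOME_BASE_PAIRS / SNPS_TESTED:,.0f}")
```

With these inputs the result rounds to 0.023%, which works out to roughly one tested position in every 4,400.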

The approved BRCA analysis from 23andMe that I mentioned consists of looking only at three specific variants: 185delAG and 5382insC in the BRCA1 gene on Chromosome 17, and 6174delT in the BRCA2 gene on Chromosome 13. But very seldom can the value of single positions like this tell us much because the more we learn, the more we've come to realize that, when it comes to genetics, it takes a village.

We're starting to look at DNA not as an intrinsic, fundamental blueprint, per se--as we did at the start of the Human Genome Project--but more as a vast collection of interacting building blocks...as being raw material rather than solely the architectural diagram. 

In fact, today being DNA Day and all, before the Human Genome Project began, scientists were encouraged to place their bets on the total number of genes the project would find. The guesses ranged from more than 312,000 to just under 26,000, with an average of around 40,000. At the "completion" of the Human Genome Project in 2003 the estimate arrived at was about 35,000. Right now the current consensus, depending upon where you find it, is around 20,500. Much lower than we thought only a couple of decades ago...because DNA isn't a static blueprint, after all.

We've discovered amazing things in just the past few years, things like genome transfer via cell-to-cell travel of whole organelles being a reality; that simple activities like resistance exercise result in long-term changes through epigenetic activation and/or suppression of over 150 different genes; that our germline genomes change as we age through methylation and deamination; that determinants of transcription factor binding affinity can be identified as being structured like neural networks.

This is just a cautionary tale for the casual G2G reader. There are definite insights to be gained from the 700,000 point samples in an at-home autosomal DNA test. Things like 23andMe's BRCA analysis. However, the more any person or organization tries to tout that such point-datum information is definitive about clinical, pharmacological, or (particularly) wellness and lifestyle matters, the more I believe we should, with gusto, heave a pound of salt at the assertions and demand, "Show me the peer-reviewed research studies," before we believe a milligram of it.

The actual, peer-reviewed science is telling us to greater and greater extents that our genes work not only collectively with each other in complex ways, but that even more complex are epigenetic factors, both heritable and environmental, that result in variable histone and methylation changes, changes in RNA production, and transcriptional gene activation and suppression.

With the first, full, telomere-to-telomere sequencing of the human genome last year, there are great prospects that the hybrid sequencing methodology can start to allow better insight into heritable epigenetics and the role they may play. But we're not there yet, and we'll never get that kind of data from a microarray test: those simply can't look at the chromatin or heterochromatic regions of our genomes. Even our current commercial whole genome sequencing tests like the 30X coverage I took can't do that.

Hi, Edison (and everyone else!),

I appreciate your detailed response.  While in high school, I considered majoring in genetics in order to research the inheritance of horse colors.  (This was back in the 1970's.)  Had a visit at Cornell to discuss this and decided that I wasn't interested in the actual biology of it (the cellular level) but was more interested in basic Mendelian genetics.  

With this genetic genealogy stuff, I'm learning more and more as I go along.  Have to look things up on the internet as I progress, finding out that what I was doing either wasn't quite right or is more complex than I had imagined!  And so I make adjustments to what I'm looking at and how I'm going about it.

As far as CRI Genetics is concerned, they do need to "beef up" their program and make it higher quality.  For me anyway, I've gotten from them what I purchased it for.  (Okay, for what my sister purchased, as it was a Christmas present when it went on sale one year for the low price of $59.)

I understand that these companies--as with commercial companies in general--are out to make a profit.  Some companies offer more genetic or DNA match information for a price, after you've uploaded your DNA with them.  This way, they can continue to make money off the people after their DNA has been uploaded.  

We each get out of this what we want (hopefully!) and for whatever price we're willing to pay for that.  I'm just sorry that many people aren't aware of the lack of scientific accuracy involved (yet) in ethnicity determination.  As my cousin has said to me, it would be nice if all of these companies would pool their sampling data so that we could all get more accurate results!

When two or more companies come up with the same regions, then I figure that I should probably consider it to be correct.  (Interestingly, my original ethnicity results from Ancestry.com included "South Europe", which includes Italy and the Balkans.  In later updates, that has disappeared.  However, MyHeritage has me at 11% Balkan, and Gedmatch.com says I have ancient Italian.  I figure that there must actually be something there.)

Interestingly, FamilyTreeDNA says I have some Baltic in my DNA (not "Balkan" but "Baltic").  They were the only company to indicate that region until this recent Ancestry.com ethnicity update, where "Eastern Europe & Russia" has now been added to my genetic ethnicity, and this includes the Baltic region.  Okay, I guess I'll accept that as a historical region for some of my ancestors!  And because I've been mapping my chromosomes and DNA matches, I can see that these genes came to me through my Danish ancestors.  Cool!  This actually gives me a hint of a historical migration pattern!

Best wishes to all of you!

Kathy
+9 votes

Lisa,

What Doug has said is correct: these are just estimates, and they are generally more accurate for the higher percentages. Each company uses its own reference dataset based on samples it has acquired, which results in the differences. If you want to really compare results between the different companies, you need to do so against the same dataset, such as those found on GEDMatch. There you will see that the differences are very minor between the different tests, usually only a percent or so different.  I have tested at five different companies and on GEDMatch they are all about the same. Otherwise, those ethnicities that show up on every company's test are probably correct, while those that show with only one are probably not correct.

As far as how far back you can go, without actually seeing your results and your tree, it's almost impossible to determine when your Scandinavian or Norwegian DNA originated.  I also have a small amount of Scandinavian DNA which may actually be carried forward from my English ancestors. The best advice I could give you is to do the paper trail back as far as you can for each line and see where they originate.  Autosomal DNA (AuDNA) only goes back about five to eight generations at most, with anything 3rd cousin or beyond having the possibility of not sharing any usable DNA segments. You can see this online at the DNA Painter Tool, which comes in handy. The further back you go, the smaller these shared segments get as a general rule.

Hope this helps

by Ken Parman G2G6 Pilot (122k points)
+8 votes

I just watched this YouTube video a couple of days ago.  Identical twins take five autosomal DNA tests to find their ancestry.  Very interesting.  https://www.youtube.com/watch?v=Isa5c1p6aC0  It does explain that the differences are due to different program algorithms for each testing company.

by Kitty Smith G2G6 Pilot (650k points)
+2 votes
The short answer is that ethnicity estimates are such junk that it's borderline fraud. It's used to sell DNA kits to people who don't want to do any real work to find out their real ancestry.

If you think about it, you don't even get exactly the same amount of DNA from all your gt-gt-gt grandparents, so even if you could magically tell what country every little bit of DNA came from, it STILL wouldn't tell you the right answer (depending on how you define "the right answer").

It can tell you if you're 1/4 Chinese, or something like that - it's OK as far as what continents your people came from - but that's about it.

Getting such people to do their DNA can actually help real researchers, though, so it's not all bad.

That being said, I've also found that when AncestryDNA gives you REGIONS in your DNA - it's mixed in with the results that give percentages, but there's no percentages associated with them - that that stuff is AMAZINGLY accurate.
by Living Stanley G2G6 Mach 9 (92.3k points)
