I've ordered a Dante Labs Whole Genome kit. How do I extract files usable in genealogy?

+9 votes
3.5k views
The kit comes with mtDNA analysed, but the stated format is VCF, with BAM and FASTQ raw data files optionally available on HDD. The VCF file alone is 20MB, which sounds wa-a-ay bigger than the average Ancestry or 23andMe file.

Also, GEDmatch requires a "raw file" but I assume it won't be in the realm of 100GB, which is what I'd get if I order the raw files.

So, firstly, does anyone have a list of standards for these files? And secondly, how do I convert my VCF file into something usable on genealogy sites? Ideally I'd like to end up with an mtDNA+xDNA file, a yDNA file and an auDNA file in minimal sizes.
in Genealogy Help by Robert Judd G2G6 Pilot (138k points)
edited by Robert Judd

Since nobody else knows the answer to this I'll reply to my own thread.

Firstly, the new version of GEDmatch can use VCF files directly, which I didn't know when I posted the question. Secondly, I asked Dante Labs as well, and they suggested using the EvE Free application at sequencing.com.

More data: This excellent article covers the exact ground I was enquiring about. Further, the author references a free tool that can be used to convert VCF files into various formats used by the common genealogy testing companies, such as 23andMe, AncestryDNA and FTDNA/MyHeritage.

Nice find, great article!

For me, the big takeaway from the article is that while GEDmatch accepts VCF uploads, they are useless for matching because they don't include your reference values, i.e. the SNPs where you are the same as the reference (modal) value. The uploaded kit contains only the differences from modal, which is obviously useless for segment matching.
Sure. It also gives an alternative (create a 23andMe compatible file) and states that GEDmatch are working on the issue. I found it really useful.
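
To make the point above concrete, here is a rough Python sketch of the "create a 23andMe compatible file" idea. It is not the article's tool or anyone's actual method. The function names and the template list (rsid, chromosome, position, reference allele) are made up for illustration, and it assumes a single-sample VCF. The big assumption, flagged in the code as well, is that any template SNP missing from the VCF is homozygous reference, which can be wrong where the sequencer simply made no call.

```python
BASES = {"A", "C", "G", "T"}


def parse_vcf_genotypes(vcf_text):
    """Return {(chrom, pos): genotype} for the SNVs in single-sample VCF text."""
    calls = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns to read a genotype from
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        alleles = [ref] + alt.split(",")
        if any(a not in BASES for a in alleles):
            continue  # skip indels and symbolic alleles
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt = fields[9].split(":")[fmt.index("GT")]
        indexes = gt.replace("|", "/").split("/")
        if "." in indexes:
            continue  # no-call
        # Normalise "chr1" -> "1" and "M" -> "MT" for a 23andMe-style layout.
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]
        if chrom == "M":
            chrom = "MT"
        calls[(chrom, pos)] = "".join(alleles[int(i)] for i in indexes)
    return calls


def to_23andme_lines(template, calls):
    """Build tab-separated lines: rsid, chromosome, position, genotype.

    template is an iterable of (rsid, chrom, pos, ref_allele) tuples
    (a hypothetical SNP list, not supplied by any site mentioned here).
    ASSUMPTION: a template SNP absent from the VCF is treated as
    homozygous reference. Haploid chromosomes are not special-cased.
    """
    lines = ["# rsid\tchromosome\tposition\tgenotype"]
    for rsid, chrom, pos, ref in template:
        genotype = calls.get((chrom, pos), ref + ref)
        lines.append(f"{rsid}\t{chrom}\t{pos}\t{genotype}")
    return lines
```

The second function is where the missing reference values get filled back in: every SNP in the template gets a row, whether or not the VCF mentions it.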

Now if I could just find a way to extract yDNA and mtDNA files from the Dante Labs results I'd be a happy camper.
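
For what it's worth, splitting the VCF itself by chromosome is simple enough to sketch in Python. This is only an illustration with made-up function names, and the group labels just mirror the three files asked for in the question. Whether any particular site accepts a per-chromosome VCF slice is not something this thread establishes.

```python
def chrom_group(chrom):
    """Map a VCF contig name to 'yDNA', 'mtDNA+xDNA', 'auDNA', or None."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    name = name.upper()
    if name == "Y":
        return "yDNA"
    if name in ("M", "MT", "X"):
        return "mtDNA+xDNA"
    if name.isdigit() and 1 <= int(name) <= 22:
        return "auDNA"
    return None  # unplaced or alternate contigs are dropped


def split_vcf(vcf_lines):
    """Split VCF lines into three groups, each keeping the full header."""
    header = []
    groups = {"yDNA": [], "mtDNA+xDNA": [], "auDNA": []}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header.append(line)
            continue
        group = chrom_group(line.split("\t", 1)[0])
        if group is not None:
            groups[group].append(line)
    return {name: header + records for name, records in groups.items()}
```

Passing an open file handle works, e.g. `split_vcf(open("my_genome.vcf"))`, where the file name is a placeholder. Each returned list can be written out with one line per entry.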

Hi Rob! This FB group has the answer to that and several other questions I've been navigating as well. There are about 500 Dante Labs customers there, as well as some staff. Check the files area and feel free to contribute what you've figured out so far.

https://www.facebook.com/groups/373644229897409/

2 Answers

+3 votes
From the author of DNA Kit Studio:

Hi Rob,

Thanks for your email. Currently I don't have any tool that is able to extract mtDNA or Y-STR markers. I will research it, and if I find a way to extract them, I will develop it in DNA Kit Studio.

If you have any questions, please let me know. I will be happy to help you.

Regards,

Wilhelm
by Robert Judd G2G6 Pilot (138k points)
+3 votes
I've done all the extraction to get yDNA, mtDNA, and GEDmatch autosomal information. It is pretty technical, but definitely doable. It includes converting to hg38 at ySeq.net and then uploading to yFull.com for the yDNA and mtDNA portion. The autosomal part is done with samtools.

Contact me privately and I can assist further.
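
The exact commands are not given above, so the following is only a guess at the general shape of one step: pulling selected chromosomes out of a BAM with `samtools view`. The Python helper name is hypothetical and it only builds the command line. It assumes samtools is installed, the BAM is coordinate-sorted and indexed (`samtools index`), and the contig names match the BAM header (some references use `Y` and `MT` rather than `chrY` and `chrM`). It does not produce a GEDmatch-ready file on its own.

```python
import shlex

# Autosome names for a "chr"-prefixed reference (an assumption about the BAM).
AUTOSOMES = [f"chr{n}" for n in range(1, 23)]


def samtools_view_cmd(bam_path, regions, out_path):
    """Build a 'samtools view' command that writes the given regions as BAM."""
    # -b: output BAM; -o: output file; regions follow the input file.
    return ["samtools", "view", "-b", "-o", out_path, bam_path, *regions]


if __name__ == "__main__":
    # Placeholder file names; print the commands instead of running them.
    print(shlex.join(samtools_view_cmd("sample.bam", ["chrY", "chrM"], "chrY_chrM.bam")))
    print(shlex.join(samtools_view_cmd("sample.bam", AUTOSOMES, "autosomes.bam")))
```

To actually run a command, the returned list can be passed to `subprocess.run(cmd, check=True)`.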
by William Foster G2G6 Pilot (124k points)
William, thanks for the follow-up on this (rather ancient) posting. I eventually got the job done using WGSExtract, which generates all common formats (23andMe, Ancestry, FTDNA, LDNA, MyHeritage and Combined Kit) in all known versions, as well as chrY-only and chrY+chrM files.

Having used them at GEDmatch, YFull and yourDNAportal I can guarantee they work.
