GEDCOM issues - rejected matches / large files (over 5000 people)

+6 votes
239 views

Hello again!

Two questions - 

1) Has anyone had a chance to investigate the request "Function request: Match page for GEDCOMs" (WikiTree G2G), posted about 7 weeks ago?

I'm working my way through a batch of GEDCOMs, and would REALLY appreciate anything that saved time. Jumping back to the main list after every rejected match is rather tedious. Please check how difficult the change might be and offer some feedback. If it isn't feasible, that's ok. But I'd like to be sure someone took a look....

2) I'm also trying to figure out a way to split Ancestry family trees with over 5000 people in them, so they can be processed. From available info, it looks like they need to be duplicated and then each person must be deleted from one or the other copy. If I figure about 2 manual deletions per minute for 8 hours each workday, splitting the three largest files (about 20000 records) would take about a month - before the whole GEDCOMpare process could even begin. There has to be a better way! Are there any good tools for modifying large GEDCOMs after export and before uploading them to GEDCOMpare?
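
As a rough check on the estimate above, here is a minimal Python sketch. It counts the people in a GEDCOM export (assuming each person is a level-0 record ending in INDI, as in standard GEDCOM) and converts a record count into workdays at 2 deletions per minute for 8 hours a day. The function names and the sample text are made up for illustration.

```python
def count_individuals(gedcom_text):
    """Count level-0 INDI records, e.g. '0 @I1@ INDI'."""
    count = 0
    for line in gedcom_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "0" and parts[2] == "INDI":
            count += 1
    return count


def manual_workdays(records, per_minute=2, hours_per_day=8):
    """Workdays needed to handle `records` deletions by hand."""
    return records / (per_minute * 60 * hours_per_day)


# Tiny made-up GEDCOM fragment: two people and one family.
sample = "0 HEAD\n0 @I1@ INDI\n1 NAME John /Doe/\n0 @I2@ INDI\n0 @F1@ FAM\n0 TRLR"

print(count_individuals(sample))         # 2
print(round(manual_workdays(20000), 1))  # 20.8, i.e. about a month of workdays
```

At 960 deletions per workday, 20000 records come to roughly 21 workdays, which matches the "about a month" figure.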

in WikiTree Tech by GM Garrettson G2G6 Mach 3 (34.7k points)
edited by GM Garrettson
Thanks, Jim! I appreciate the feedback.

1 Answer

+7 votes
 
Best answer
Hello GM,

Regarding your question 2, may I suggest that you take a look at the following:

https://www.wikitree.com/wiki/Space:Find_my_Past#MyHeritage_Family_Tree_Builder

Re: your question 1, in your original post you say the following:

'At the moment, as soon as a suggested match is rejected, this window closes and you go back to the original list.'

and you say

'because clicking on "match" does NOT automatically close the window'

so...

the trick is (and keep this between the two of us) to ONLY click on 'matches' where that is applicable for all the people in the comparison list.

then either click on a 'reject' if one is available or click to close the window.

As an addendum, my understanding is that the GEDCOMpare process is not part of one of the WikiTree Apps - it is part of the wider WikiTree infrastructure,

and again, as far as I am aware, this code is not open source, so I am not sure how anyone would be able to sort out the task that you wish.
by Allan Entwistle G2G6 Mach 3 (38.1k points)
selected by GM Garrettson

Thanks, Allan!

this was very helpful, and I learned a lot by reading the page you (and the FindMyPast team) created Find my Past (wikitree.com)

Discovering golden nuggets of information about WikiTree is always an adventure - thanks again!

Searching G2G, I found a reference to a program called GRAMPS, which I had just installed when I read your post. It apparently offers a similar feature for creating smaller GEDCOMs. FindMyPast also looks quite promising - maybe I'll have a chance to explore that as well. I would be interested in hearing how you and others using FindMyPast or similar programs use them in combination with your work on WikiTree. (perhaps via PM?)

I've been a GRAMPS user since 2019. I also use Ancestry and have over 13,000 people on it. I have downloaded the Ancestry data into GRAMPS and have taken a look at what it produces. Not too good.

For a clean upload to WT, each profile would need to be cleaned up a lot or the junk will be uploaded and need to be cleaned up.

My suggestion: download to the program of your choice, clean up each record, tag it, then when you have 100, upload them to WT.

Do a trial of a very few first to make sure you have done everything in the program to create a good upload process.

For example, the upload needs to contain Birth and Death records even when you only have Baptism and Burial events. Also, the place names need to be correct and complete. Dates need to be in the 15 Feb 1830 or 1830-02-15 format. I now know how to enter everything needed in GRAMPS, which produces a very clean upload to WT. It took about 6 changes to my process before I got it right. Good luck.
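
To illustrate the date clean-up mentioned above, here is a minimal Python sketch that rewrites a date into the "15 Feb 1830" form. It is only an assumption about how such a step might look outside GRAMPS: the function name is hypothetical, and it accepts just two input shapes (1830-02-15 and day, English month name, year). Anything else is rejected so it can be fixed by hand.

```python
import re
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def to_day_month_year(text):
    """Return a date as 'D Mon YYYY'; raise ValueError if not recognised."""
    s = text.strip()
    m = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", s)
    if m:
        # ISO style: 1830-02-15
        year, month, day = (int(g) for g in m.groups())
    else:
        # Day, month name, year: 15 Feb 1830, 15 FEBRUARY 1830, 15 Feb. 1830
        m = re.fullmatch(r"(\d{1,2})\s+([A-Za-z]{3,9})\.?\s+(\d{4})", s)
        if not m:
            raise ValueError(f"unrecognised date: {text!r}")
        abbrev = m.group(2)[:3].title()
        if abbrev not in MONTHS:
            raise ValueError(f"unrecognised month in: {text!r}")
        day, month, year = int(m.group(1)), MONTHS.index(abbrev) + 1, int(m.group(3))
    date(year, month, day)  # raises ValueError for impossible dates such as 30 Feb
    return f"{day} {MONTHS[month - 1]} {year}"


print(to_day_month_year("1830-02-15"))        # 15 Feb 1830
print(to_day_month_year("15 FEBRUARY 1830"))  # 15 Feb 1830
```

Partial dates ("Feb 1830", "abt 1830") are deliberately not handled here and would need their own rules.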
Hi Dave,

thanks for sharing your experience!

I also appreciate you wishing me luck. I'll admit that I am a bit overwhelmed by the prospect of checking each record, "cleaning up" all of the date and location information before beginning the upload and compare process in WikiTree. Since so few of the "suggested match" profiles in WikiTree have anything near the data quality you seem to consider a prerequisite for upload, and since each one must be examined before any new profile can be added (regardless of how well the data has been cleansed before uploading), I guess I don't really even understand the benefit of spending time in an external program to "pre-clean" the data. What difference does it really make?

Maybe it is time for me to face the real possibility that my parents' research cannot be "saved" to WikiTree with any reasonable amount of time or effort. Their Ancestry account will be terminated in May 2024, and I had been trying to convince my siblings that it made sense to support the WikiTree vision by "transferring" as much of their info as possible. When I started, I had no idea how difficult and time-consuming that project would actually be.

I may try to "download to the program of my choice" (probably GRAMPS, at this point) and hope that will preserve enough of their work for future generations to enjoy. Maybe one day, someone else will be willing and better able to invest the necessary additional time and energy to add it to WikiTree (or some more modern implementation of the WikiTree vision).
I was not able to work on our family tree until I retired, although my wife had been collecting data since 1973. I have to limit myself to 3 hours per day on the computer so I can get other chores done. I'm slowly filling out the tree. My average upload to WT is about 100 per month. Even with the help of Ancestry, Find My Past and FamilySearch, it can sometimes take days to fully document a family. It gets complicated depending on the era, multiple marriages and name changes.

Using GRAMPS, you create a copy of the tree for future generations and work on it as time permits. GRAMPS is feature-rich; however, it will take some learning to use it to its full potential. Read as many help files as you can and ask questions on the blog if necessary. I think adding the people to WT is an admirable goal; however, it should be secondary to improving the records first.
Hi Dave,

that makes a lot of sense. Thanks again for sharing!

Given what I had considered to be some of the greatest "selling points" for WikiTree - the collaboration with others in the community and the helpful apps and tools for finding and checking sources - I had been looking at the process from a somewhat different perspective. My approach had been "check to be sure there isn't a match, then add the profile with the best information (and SOURCES) you have/ can find right now. Because then you (or others) can come back and improve the profile later - in WikiTree".

I just don't have the time or ambition to re-check all of my parents' work. Keeping it all in a local program until "days" can be devoted to getting each record "ready for WikiTree" just isn't a very promising option. At the rate of 100 profiles a month, it would take me at least twenty years.

I do understand the importance of having WikiTree profiles as nearly "perfect" as is possible. There are some great profiles on WikiTree - and I hope one day to develop the skills (and have the time) to contribute a couple of exemplary profiles myself.

Unfortunately, WikiTree has a lot of rather poor profiles, as well. When I find a good profile which matches a person in my parents' database, I'm really grateful that I DIDN'T spend too much time trying to perfect all the details before uploading and comparing. When I find an empty or abandoned profile left over from the early days of GEDCOM imports, I try to improve that profile (on WikiTree) as much as I can. But it seems somewhat self-defeating to spend days getting everything right before uploading and comparing the basic information to determine whether or not a matching profile already exists.

Or am I missing something?
As I work primarily in GRAMPS, when I start checking a family, I do a "find" in WT to see if they are already there. If I find that they are, I add a tag "managed by others". Normally I will not update these unless the profile is very poor. If I find a match, it also lets me verify some facts and might give me a clue to others that I might not have. When I have a profile completed that does not exist in WT, I tag it "WTupdate"; the tag can be batch removed after a GEDCOM has been created and uploaded.

GRAMPS has many features. You could decide to create a filter to only display four generations and when ready, upload those. Or just your immediate paternal or maternal connections. You don't need to do all the cousins etc. Pick something that is within your capability.
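
To show what a "four generations" selection amounts to, here is a minimal Python sketch that works directly on GEDCOM text rather than inside GRAMPS. It assumes standard GEDCOM links (FAMC on a person pointing to the family they were born into, HUSB and WIFE on the family pointing to the parents). The function name and the sample IDs are made up, and the starting person counts as generation 1.

```python
def ancestors_within(gedcom_text, root_id, generations):
    """Return the IDs of root_id and their ancestors, up to `generations` deep."""
    famc = {}     # person ID -> family IDs in which they are a child
    parents = {}  # family ID -> parent IDs (HUSB / WIFE)
    current, kind = None, None
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            # Record header such as "0 @I1@ INDI" or "0 @F1@ FAM"
            if len(parts) == 3 and tag.startswith("@"):
                current, kind = tag, parts[2].split()[0]
            else:
                current, kind = None, None
        elif level == "1" and current and len(parts) == 3:
            value = parts[2].strip()
            if kind == "INDI" and tag == "FAMC":
                famc.setdefault(current, []).append(value)
            elif kind == "FAM" and tag in ("HUSB", "WIFE"):
                parents.setdefault(current, []).append(value)

    selected = {root_id}
    frontier = [root_id]
    for _ in range(generations - 1):  # one step up the tree per extra generation
        next_frontier = []
        for person in frontier:
            for fam in famc.get(person, []):
                for parent in parents.get(fam, []):
                    if parent not in selected:
                        selected.add(parent)
                        next_frontier.append(parent)
        frontier = next_frontier
    return selected


# Made-up example: I1's parents are I2 and I3; I2's parents are I4 and I5.
sample = """0 @I1@ INDI
1 FAMC @F1@
0 @I2@ INDI
1 FAMC @F2@
0 @I3@ INDI
0 @I4@ INDI
0 @I5@ INDI
0 @F1@ FAM
1 HUSB @I2@
1 WIFE @I3@
1 CHIL @I1@
0 @F2@ FAM
1 HUSB @I4@
1 WIFE @I5@
1 CHIL @I2@
"""

print(sorted(ancestors_within(sample, "@I1@", 2)))  # ['@I1@', '@I2@', '@I3@']
```

This only picks out who belongs in the smaller set; writing those people back out as a valid GEDCOM (with their families and sources) is a separate step that a program like GRAMPS handles for you.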

I like working on it because it is like doing a million piece puzzle, but some days I have to walk away because it gets really boring. It keeps my mind sharp (I hope). I currently have about five years of checking ahead of me right now. Our youngest son has said he will take it over when I can't do it any longer, so the data will be protected for another generation.
Hi Dave,

You've given me considerable food for thought. Thanks again!

GRAMPS certainly sounds like a wonderful tool, and definitely worth investing some time to learn more of its features. I do enjoy hunting down the missing puzzle pieces and resolving roadblocks, and can quite happily spend days doing so. From the little I've already seen of GRAMPS, it provides a much more user-friendly environment for that sort of activity.

It sounds to me like you only upload profiles after you have manually determined that they will need to be added - basically doing all of the searching / comparing / matching process "outside" of GEDcompare, which would clearly speed up the process once you decide to upload. Again, that makes sense if you primarily work in GRAMPS.

But if everything is pre-cleansed and edited in GRAMPS, why go to the trouble of uploading profiles to WikiTree? Don't get me wrong - I applaud and appreciate you and others who generously share your (completed) work and build the common family tree! But I thought uploading was just the first step in the collaborative process I considered a major advantage of WikiTree.

I guess my hope had been that WikiTree could be a place where I (or others in my more-or-less immediate family) could "primarily work" on preserving and eventually improving on the genealogical research mom and dad have already done. Perhaps that is simply not realistic.
I contribute to WT because it is a good objective to have a world tree. It also lets family see most of the tree any time they like. I do see others making connections and making minor corrections. I do make mistakes. I have been contacted by members from a few countries and I hope we have helped each other. I estimate it only takes me about 4 minutes to add a profile in the manner I do it, so I have decided that the time invested is worthwhile. I have only asked the group for help a couple of times, without success, so I guess my research is comparable to others.

Thanks again - I will definitely try the GRAMPS approach. Four minutes sounds great!
