Challenge of the Week: Will you help us clean up after the Connect-a-Thon?

+23 votes
479 views

Hi WikiTreers,

Will you join our "Data Doctor" Challenge of the week?

Last week we had our April Connect-a-Thon where we added over 88,000 profiles to WikiTree! 

This week we want to do a mini Clean-a-Thon and work on clearing out the suggestions for those profiles. 

Here is the list of profiles that could use some TLC.

Will you join us?

If you're participating, please post here to let us know. It's nice to cheer each other on. Or post if you have any questions about how to participate.

Thanks for helping!

in The Tree House by Eowyn Walker G2G Astronaut (2.5m points)
reshown by Aleš Trtnik
Challenge is active.
Selecting the 'new' column for many of the Find a Grave suggestions is not bringing up the same number as shown. It is bringing up fewer than 5 in many instances when the count is over 100.
This is now fixed.

21 Answers

+14 votes
I’m in. The thons give us plenty to do.
by Eric Perkins G2G6 Mach 3 (30.5k points)
+14 votes
I'm going to work on some of these.  I didn't get to participate in the Thon as much as I wanted so here's an opportunity to make up for that.  :)
by Kirby Drake G2G6 Mach 2 (23.9k points)
+13 votes
I did some after cleaning up some "Research Notes" I left during the Thon.  Will do more tomorrow.
by Ray Sarlin G2G6 Pilot (103k points)
edited by Ray Sarlin
+13 votes
I'm definitely in :)
by Azure Robinson G2G6 Pilot (565k points)
+12 votes
Will work on some of these now that I have all my goofs fixed.
by Patricia Roche G2G6 Pilot (822k points)
+14 votes
I'll do a few as well as finishing clearing my own errors
by Anon Sharkey G2G6 Pilot (124k points)
+14 votes
I’ll work on a few.
by Rhonda Schneringer G2G6 Mach 2 (25.6k points)
+12 votes
Of course I will help! I've been waiting for this challenge since the Thon.
by Star Kline G2G6 Pilot (724k points)
+15 votes
The largest section of suggestions/errors is Find a Grave. That is not surprising, since so many people created profiles with only a Find a Grave citation, which, as has been stated in multiple places, is not a reliable source without other sources.

If people who created profiles with a Find a Grave suggestion had created the profiles with the dates and locations on the Find a Grave page, since they are 'reliable', the majority of the suggestions in that section would not exist. Instead we have a large number where the date or location is not specific or is different.

I don't understand how there could be suggestions for merged Find a Grave pages when those profiles and Find a Grave citations were just added. Were the pages merged after the Thon was done? Doubtful. Possibly they just used a link from another site instead of the actual site, which might now have a different page, date, or location than it had initially.

Hundreds of profiles have a Find a Grave link that already exists on another page or doesn't match the profile person. From the ones that I have seen and already proposed merges on, possible duplicates are probably more likely than the link being on a relative's page without sameas=no.
by Linda Peterson G2G6 Pilot (786k points)
Not hundreds but Thousands. 3044 - Find a Grave suggestions from this thon...I thought the objective was Accuracy over Quantity...

Looking at the histograms for these suggestions indicates:

  • ERR 586 FindAGrave - Link to Merged Grave ID had about the same number of new profiles as any other average week.
  • ERR 587 FindAGrave - Link to Nonexisting Grave ID had by far its largest week (~200 new errors) of all time. However, I reviewed about 20 of the new hits at random and none of those profiles were edited within the last month. It would seem the staff over at FindAGrave have been busy processing deletion requests or one of their users is on a war path.
  • ERR 591 FindAGrave - Possible Father is slightly elevated over last week, while ERR 592/593 FindAGrave - Possible Mother/Spouse are down significantly from last week.
  • ERR 585 - FindAGrave - Multiple Profiles Link to Same Grave ID had about an average week.
To your statement:

"I don't understand how there could be suggestions for merged Find a grave when those profiles and Find a grave citations were just added."

That's because they weren't just added. I just opened 20 of those, and the only ones I see edits on since the Thon started are those being fixed by Johansen-1608 and a handful of others.

These suggestions are only on profiles added during the Thon. All of the profiles that I looked at were created during the Thon, which is what these suggestions are about.

I suspect the Histograms that you looked at are for the entire DD reports, not just the Thon profiles, which is why you are seeing profiles that have not been edited.

Compare the full DD report from 3 weeks ago, an average week before the Thon profiles were included, to the full DD report for this week. The New column shows 20,496 new suggestions this week in the main suggestion group, and last week it was more than 20,000. On Apr 7, a normal week before the Thon, it was 11,325, which is typical for a week. There are a lot of different suggestion types included in those totals.

Find a Grave was 17,665 last week and 14,415 this week, and on Apr 7 it was 8,166. There are only 16 different suggestion types in Find a Grave.

Merged Find a Grave was 149 on Apr 7, 215 on Apr 14, and 289 this week. My point about the merged and nonexistent pages for the profiles just created is: were those grave pages all merged or removed since the Thon? Doubtful.

Over 16,000 of the 87,700 profiles have already been orphaned.  The orphaned profiles have over 1660 suggestions on them this week.

"I suspect the Histograms that you looked at are for the entire DD reports, not just the Thon profiles,"

Correct, the histograms only count full weeks. The Thon report is a subset of the weekly report, so anything listed in the Thon report is already in the weekly.

They account for the number of errors corrected from last week. So if there were 5000 new errors, and 4000 were fixed from last week's report, the histogram will show a delta of +1000. Negative values are also possible.
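To make that arithmetic concrete, here is a minimal Python sketch of the net change described above. The function name `weekly_delta` is hypothetical and not part of any WikiTree tool, and the second call uses made-up numbers purely to show a negative result.

```python
def weekly_delta(new_errors: int, fixed_errors: int) -> int:
    """Net change in open errors for one week: new errors minus fixed errors."""
    return new_errors - fixed_errors


# The example from the post: 5000 new errors, 4000 fixed from last week's report.
print(weekly_delta(5000, 4000))

# Hypothetical numbers: more errors fixed than created gives a negative delta.
print(weekly_delta(3000, 4500))
```

The first call gives +1000, matching the example above; the second gives -1500.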

"which is why you are seeing profiles that have not been edited."

Not quite. The histograms don't show individual profiles, so I can't be seeing it there. I'm seeing profiles which have not been edited during the week in the weekly reports. See here, this week's 586 New report. Looking at a random selection of these profiles:

I suspect that Ales is mostly scraping FindAGrave profiles for changes on an as-needed basis, both to conserve processing power and to prevent hammering FindAGrave with traffic. (After all, most memorials don't change super frequently.) It appears these merges happened some time before the Thon and are just now popping up on the report.

What about the Thon caused these to pop up this week is unclear. Perhaps someone edited a nuclear relative, which triggered a refresh of the FindAGrave data? However, we can discern that the above folks are not on the report because of being edited or added during the Thon.

The same holds true for the 857 New report. Note that there's a huge spike in this error this week, the first time since stats have been recorded. Again, these don't look like edits made during the week of the Thon. Some external factors with FG must be at play.

I hear you on seeing an increased number of new FG errors. But is that unusual when considered beside the large increase in profile creation and increased site activity?

The FG error rate for new profiles has varied between 8.98% and 17.76% in 2024, with a mean of 13.53%. For the two weeks of the Thon, the figures are 15.20% and 13.87%, or "about average." The math is available on this 2024 FindAGrave Error Metrics GSheet.
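As a rough check of the "about average" reading, the short Python sketch below compares the two Thon-week rates against the 2024 range and mean quoted above. It uses only the percentages stated in this post; the helper name `compare_to_baseline` is hypothetical, and the underlying counts from the GSheet are not reproduced here.

```python
# Figures quoted in the post (percent of new profiles with an FG error).
LOW_2024, HIGH_2024, MEAN_2024 = 8.98, 17.76, 13.53
THON_WEEKS = [15.20, 13.87]


def compare_to_baseline(rate, low=LOW_2024, high=HIGH_2024, mean=MEAN_2024):
    """Return (is the rate within the observed range?, difference from the mean in points)."""
    return low <= rate <= high, round(rate - mean, 2)


for rate in THON_WEEKS:
    in_range, diff = compare_to_baseline(rate)
    print(f"{rate:.2f}%: within 2024 range = {in_range}, {diff:+.2f} points vs mean")
```

Both Thon weeks fall inside the 2024 range, at 1.67 and 0.34 percentage points above the mean.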

tl;dr: It's likely any other event that increased site activity to a similar amount would have generated a similar number of FG errors.

[Edit: line breaks for readability]

You have 857 above, but I think you mean 587.

To see total Suggestions generated from the Thon profiles, you have to look at the Total column in this week's challenge, because some were new last week and the rest this week.  Many suggestions from last week were also fixed, so the total is higher than what is shown.  

Not all suggestions are shown every week for every profile. It is mainly the 'active' profiles that have suggestions shown. Some of the older profile suggestions show up as new each week, but the majority of profiles are not actually being checked for suggestions weekly.

The huge increase in Find a Grave suggestions after the last 2 Thons is because so many profiles are being generated with only a Find a Grave source. Many profiles are being created with a generic year for birth and death, so suggestions are generated because the date from Find a Grave does not match the profile. People use Family Search citations, not Find a Grave, and those frequently do not have the Find a Grave ID in the citation, so 571 Link without Grave ID suggestions are generated. Many people also use one person's Find a Grave citation on the entire family being created, which generates the 572 Linked Grave not matching profile suggestion.

Compare Total Suggestions in 1st suggestion grouping to the one in Find a Grave for this challenge.  Find a Grave is a lot larger, which is not the case in weekly reports.
+13 votes
I will help as I am already working on my own suggestions
by Kathy Nava G2G6 Pilot (310k points)
+13 votes
I'm in and happy to help.
by Sandy Patak G2G6 Pilot (235k points)
+14 votes
I will work on the Wikidata suggestions.
by Paul Gierszewski G2G6 Mach 8 (89.9k points)
+12 votes
In and happy to help.
by Erin Robertson G2G6 Pilot (156k points)
+13 votes
I’d like to help. We’re expecting a week of rain after today, so I’ll be able to stay out of the garden. This will be just the thing.
by Katrina Lawson G2G6 Mach 4 (49.1k points)
+12 votes
I’ll help with this one!
by P Whittington G2G6 Mach 2 (21.1k points)
+11 votes
I'm cleaning up my suggestions first, making good progress.  The 831 error is slowing me down though, I have about 18 that are making me drool on myself!
by Terri Smith G2G6 Mach 1 (11.7k points)
+11 votes
Well this is a different type of project, at least for me.  I'm in!
by Laura Nixon G2G6 Mach 3 (32.2k points)
+11 votes
I'm in, and have already cleaned up some of the suggestions.

Like it when all the numbers are so small, feels like we can clean out the square. Would like to see the table again in a week to see how it looks, hopefully with more empty squares.
by Larry Klaasen G2G6 (8.2k points)
+7 votes
I worked on some yesterday, I'll see what I can do today.
by Judith Fry G2G6 Mach 7 (78.8k points)
+7 votes
I’ve been working on this one!
by Karen Haney G2G6 Mach 1 (15.2k points)
