Suggestions to lessen outdated links

+9 votes
297 views

There are two resources some may not be aware of that can help avoid links to sources whose addresses change (a constant problem) or, worse yet, that are removed from the web altogether. One is the cached version. I find it through a Google search, either by adding "cache" to the name of the page or by simply peeking inside the search result I'm interested in: in the desired search result, click the three dots on the right, then click the down arrow in the pop-up box. If the page has been captured, you will see the word "cache", and that link takes you to the page as Google saved it on a given date (shown at the top of the page). Another method, if a cached version isn't found, is to create a snapshot yourself through the Wayback Machine, run by the Internet Archive. As long as the site allows crawlers, this tool, shown on the right, will create a snapshot and give it a permanent address.

For our clever developers, on the left is a link to the code for the Availability JSON API, which, when queried, returns the latest archived snapshot, if one is available. Links from the Internet Archive, FamilySearch, and other large entities likely won't need an archived page, but any personal website, and any page in danger of being swallowed by Ancestry.com, would be protected by the cache or an archived page.
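
For anyone who wants to try the Availability API, here is a minimal Python sketch, not a definitive implementation. It assumes the endpoint is `https://archive.org/wayback/available` with a `url` query parameter, and that the reply nests the result under `archived_snapshots` and `closest` with a 14-digit `timestamp`. The function names are my own, made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime

# Assumed endpoint of the Wayback Machine Availability JSON API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url):
    """Return the API request URL for the page we want to look up."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": page_url})


def closest_snapshot(response):
    """Pull (snapshot_url, capture_time) out of a decoded API response.

    Returns None when no snapshot is reported as available.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Timestamps are assumed to be 14 digits: YYYYMMDDhhmmss.
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return closest["url"], captured


def lookup(page_url):
    """Ask the live API for the closest snapshot (needs network access)."""
    with urllib.request.urlopen(build_query(page_url), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling `lookup("https://www.wikitree.com/")` should return either `None` or the archived address together with its capture date, which is handy for an "as of this date" note next to the link.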

WayBack Machine Chrome Extension

ETA: Correction: the Google cache is temporary. The cached copy is replaced when Google crawls the page again.

in WikiTree Tech by Connie Mack G2G6 Mach 2 (23.1k points)
edited by Connie Mack

These are good ideas, thank you Connie, but is there evidence that Google cache links are permanent? For example, here is one such link:

https://webcache.googleusercontent.com/search?q=cache:yjtlUsKch7UJ:https://www.wikitree.com/&hl=en

It's for the WikiTree home page as of 18 Nov 2023 13:00:30 GMT. Will that link work forever, or will the cached copy be replaced and the yjtlUsKch7UJ part of the link become out of date?

I have no idea if it is accurate, but there is a site that says "Google keeps webpages in their cache for about 90 days, or until the page is crawled again." (https://neilpatel.com/blog/google-cache/#:~:text=Webpages%20are%20cached%20for%20approximately,the%20page%20is%20crawled%20again.)
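
Whatever the answer, the original address is embedded in the cache link, so it can be recovered and looked up elsewhere if the cached copy goes away. A rough Python sketch, assuming cache links always follow the `q=cache:<id>:<original URL>` pattern seen in the example above (the function name is made up for illustration):

```python
from urllib.parse import parse_qs, urlsplit


def original_from_cache_link(cache_link):
    """Return (cache_id, original_url) from a Google cache link, or None.

    Assumes the q parameter looks like "cache:<id>:<original URL>".
    An original URL with its own unencoded "&" parameters would be cut short.
    """
    query = parse_qs(urlsplit(cache_link).query)
    q_values = query.get("q", [])
    if not q_values or not q_values[0].startswith("cache:"):
        return None
    # Split only twice so the colon in "https://" stays in the original URL.
    parts = q_values[0].split(":", 2)
    if len(parts) != 3:
        return None
    _, cache_id, original_url = parts
    return cache_id, original_url


link = ("https://webcache.googleusercontent.com/search"
        "?q=cache:yjtlUsKch7UJ:https://www.wikitree.com/&hl=en")
print(original_from_cache_link(link))
```

For the link above this prints the id `yjtlUsKch7UJ` and the address `https://www.wikitree.com/`.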

I did a bit of research and it appears that the Google cache will be replaced when they crawl the page again. I'll edit my original post.
I don't believe links will always get you to the record you are quoting; therefore, I like to leave a copy of important transcripts under Notes in the profile. That way the reader does not need to search for the record if they want more info.

However, there are some who don't like this format, and I have found numerous profiles that have been edited to remove this information. Sad, because the actual transcription gives a lot more information than what is usually articulated in the text of the bio.

I'm not going to take issue with it, just move on with my research. However, I think relying on a URL to find a document is short-sighted. In fact, I have not been able to find a document a second time even when I had the URL and the document info. I don't believe images need to be retained, though.
I'm not suggesting we should ignore the need to create a full citation, but the link saves a lot of time. If we're taking the time to create the citation, it doesn't take much more to add the link to it. Everything I've read about the Internet Archive's Wayback Machine links says they are permanent. Of course, if the nonprofit went out of business and no one else took it over, or if the internet was completely destroyed, then I suppose that permanency is destroyed too. I would suggest our Federal government would want to preserve all that the Internet Archive holds.

Transcripts are a good idea for those who have the time, and I'm sorry someone deleted them. You can restore your hard work by looking through the changes. It would probably be best to copy that section and paste it back in, though, so you don't lose any additions that came after.

3 Answers

+8 votes
FamilySearch has been the subject of broken links in the past (for example, when they transferred a large portion of their records to an Italian site, Antenati). I understand what you're trying to do, but I've given up on worrying about link permanence. I just do my best to transcribe all the relevant data and read the source links like this: "as of XYZ date, this is where the info came from."
by Nick Andreola G2G6 Mach 8 (89.4k points)
I can't give up trying to find a solution because I've spent a lot of time helping data doctors fix broken links. I see your point, though. But I like having links to the original source so that I can check the data for myself and see if there is anything more shown. It saves a lot of time tracking the source down.
+9 votes
A lot of dead links are to Ancestry.com family trees that were never good sources of evidence to begin with. Archive.org helps maintain resources of all kinds.
by Amy Garber G2G6 Mach 1 (18.1k points)
Agreed, and Ancestry is one place I'd rather not see links coming from. Even the shared links aren't very usable; the images are quite small. The website is intended to attract paying users and is not helpful to others.

Just FYI you can always see the full size image from an Ancestry sharing link. It just takes a couple of steps. I documented how to do it on an FSP here.

But Connie, sometimes Ancestry is the only place for certain sources. For example, a set of Alabama marriage records for many of my ancestors and relatives is not available on FamilySearch. So, it's use the Ancestry collection or provide no source at all. I agree with you that whenever possible it is best to provide a source citation to a freely available source.

Thank you for your advice on how to provide links which do not expire.

Thank you for that info, Rob.

Nelda, I know I haven't found many Quaker documents at other locations, but are you sure about Alabama marriages not being found at FamilySearch? Here is the state index and here are Alabama County Marriages. The latter are my favorite type because they often have more info than the state version.

Yes, I'm sure. There are sets of Alabama records on FamilySearch, but one set in particular I use is not on FamilySearch.

By the way, with the Alabama marriage records on FamilySearch, some are indexed incorrectly as far as location. One set has EVERY couple married in Alabama City, Etowah County, Alabama when they were not all married there. Another set has EVERY couple married in Montgomery, Montgomery County, Alabama but sometimes includes the correct alternate location. Sometimes I can find the corresponding record on Ancestry to get the correct location, so I usually include the Ancestry record in the WikiTree profile also. Or I find a clipping on Newspapers.com (another subscription site) with the correct location. Have to be careful of transcription errors with indexed records which do not provide an image.

I see. I prefer when the image is there also and I've noticed sometimes location isn't quite right on some records. Thanks
+9 votes
Those are some good tips that may extend the life of a link, but even a cached copy will eventually disappear. I take the approach that the citation should have enough information to find the source again if the link does not work. For example, you can include the name of the website, the title of the page, the author, the date of publication, and any other relevant details that can help you locate the source. This way, you can also verify the credibility and accuracy of the source, and avoid relying on links that may be broken or outdated.
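
As a rough illustration of that idea, here is a small Python sketch of a citation record that still reads sensibly when the link is missing. The `Citation` class, its field names, and the output layout are all made up for this example and are not a WikiTree or standard citation format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    """Hypothetical citation record; only website and title are required."""
    website: str
    title: str
    author: Optional[str] = None
    published: Optional[str] = None
    accessed: Optional[str] = None
    url: Optional[str] = None

    def format(self) -> str:
        """Join whichever details are present into one line of text."""
        parts = []
        if self.author:
            parts.append(self.author)
        parts.append(f'"{self.title}"')
        parts.append(self.website)
        if self.published:
            parts.append(f"published {self.published}")
        if self.url:
            parts.append(self.url)
        if self.accessed:
            parts.append(f"accessed {self.accessed}")
        return ", ".join(parts) + "."


# Placeholder values, for illustration only.
example = Citation(website="WikiTree", title="Home page",
                   accessed="18 Nov 2023", url="https://www.wikitree.com/")
print(example.format())
```

The link is just one optional field among several, so a reader can still search for the page by website and title if the address stops working.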
by Jimmy Honey G2G6 Pilot (163k points)
