Topic: Data, Biography, Categories, WikiTree+, and Their Impact on Search Capabilities (aka "IT")
Finally, a topic where I feel I have some expertise … but that doesn't mean I'm going to try to say "xxxx is how this is best done". All I want to do is clear up what may be miscommunications among those who have addressed this topic and explain how all of the above does (or does not) work. I have my blinders on, and my only focus is how to make the Holocaust category hierarchy maximally useful to everyone: family members of the people profiled, researchers, WikiTree members, and other projects.
What Jan wrote about data redundancy is 100% correct, although non-techies may not have fully absorbed his explanation, so I will attempt to explain it more simply:
When the same data is stored in more than 1 place, that is termed "data redundancy". There are two ways that redundant data may be kept up to date, plus one complication that affects both:
- When you make a change to a value in 1 place, the software that processes that change could check all other places where the same data item is stored and change them to match the item that was updated. This is a very good thing because it ensures that data is always consistent (i.e., you won't ever see birth and death locations of United States on a profile that is in Category: South Pole but not in Category: United States - please indulge me - I know those would be high level categories, where profiles don't belong, but this is only an example). The problem is that it would take a lot of time and effort to program, it would add significant processing time to every saved change, and it may not even be possible to do in all cases.
- In the absence of software ensuring data consistency, when you make a change to - say - a location field, you also have to check the biography and, if that location is mentioned there, change it there too. You also need to remove any category for the previous location value and probably add a category for the new one. This is asking a lot of the member who edits a profile, but if it is not done then there will be contradictory information in the profile and nobody will know which is correct.
- No matter which way data is managed, a monkey wrench is thrown into the mix by the different languages in which location names are entered, as well as spelling variants that creep in no matter how hard you try to standardize place names.
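For the techies, here is a minimal sketch of the two update approaches above. It is only an illustration: the `Profile` class, its fields, and both functions are made up for this example and are not how WikiTree actually stores or updates anything.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile: the same location is stored in three places."""
    birth_location: str
    biography: str
    categories: set = field(default_factory=set)


def update_location(profile: Profile, new_location: str) -> None:
    """Approach 1: the software propagates one change to every copy."""
    old = profile.birth_location
    profile.birth_location = new_location
    profile.biography = profile.biography.replace(old, new_location)
    profile.categories.discard(old)
    profile.categories.add(new_location)


def find_inconsistencies(profile: Profile) -> list:
    """Approach 2: nothing is propagated, so someone has to check by hand."""
    problems = []
    if profile.birth_location not in profile.biography:
        problems.append("biography does not mention the birth location")
    if profile.birth_location not in profile.categories:
        problems.append("no category matches the birth location")
    return problems


# Manual edit of the data field only; a spelling variant makes it worse.
manual = Profile("Vienna", "Born in Vienna in 1901.", {"Vienna"})
manual.birth_location = "Wien"
problems = find_inconsistencies(manual)

# Propagated edit: every copy is changed together.
synced = Profile("Vienna", "Born in Vienna in 1901.", {"Vienna"})
update_location(synced, "Prague")
```

Note that the naive text replacement in `update_location` is exactly the part that is hard to get right in real life (language variants, partial matches), which is why the first approach is so expensive to program.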
We all agree that an important purpose of data is to offer the ability to run searches. I will put forth the statement that the most important thing about doing searches is knowing how to create a search statement that is narrow enough to exclude results that are not of interest yet includes ALL results that are of interest. That said, the capabilities of WikiTree to search data are - I hate to be so harsh - downright Neanderthal. WikiTree+, on the other hand, has pretty good search capability. I'm sure you're going to be surprised when I say that the best of all available search engines is Google. This is because Google allows entry of a pretty good group of Boolean constructs (I think there should be an umlaut on one of the o's, but not sure which one. Maybe I can't spell so good, but I promise you that I can sling the and's, or's, if-then's, and other Boolean terms around with the best - I once prepared material and presented training to government intelligence analysts on search skills). That allows you to create search algorithms that are much more precise. Ah, but I've digressed - sorry 'bout that.
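To show what "narrow enough to exclude, yet includes ALL results of interest" means, here is a small sketch of a Boolean filter. The `matches` function and the sample records are invented for illustration; the idea is roughly what you get in Google by typing something like `born Vienna OR Wien -Virginia`.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Return True if text satisfies AND / OR / NOT term lists (case-insensitive)."""
    lowered = text.lower()
    has_all = all(term.lower() in lowered for term in all_of)
    has_any = not any_of or any(term.lower() in lowered for term in any_of)
    has_none = not any(term.lower() in lowered for term in none_of)
    return has_all and has_any and has_none


# Hypothetical records, not real profiles.
records = [
    "Born in Vienna in 1901",
    "Born in Wien, emigrated in 1938",
    "Born in Vienna, Virginia, in 1950",
]

hits = [
    r for r in records
    if matches(r, all_of=["born"], any_of=["Vienna", "Wien"], none_of=["Virginia"])
]
```

The OR list catches both spellings of the city, and the NOT term drops the unrelated town of the same name. Leave out either one and the search is either too narrow or too broad.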
Categories are a way that WikiTree kind of mitigates what it lacks in search sophistication. You can find a list of profiles that have something in common by looking at a category page … however, that is provided that the category is in a well-designed hierarchy and that all profiles that share the common characteristic have been put in the appropriate category, which is a very tall order.
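Here is a rough sketch of why both conditions matter. The category names, profile IDs, and the `profiles_under` function are all hypothetical; this is not WikiTree's actual category structure or code.

```python
# Hypothetical hierarchy: parent category -> its subcategories.
subcategories = {
    "Places": ["Austria", "Czechia"],
    "Austria": ["Vienna"],
}

# Hypothetical membership: category -> profiles placed directly in it.
members = {
    "Vienna": ["Example-1", "Example-2"],
    "Czechia": ["Example-3"],
}


def profiles_under(category):
    """Collect profiles in a category and, recursively, in all its subcategories."""
    found = set(members.get(category, []))
    for child in subcategories.get(category, []):
        found |= profiles_under(child)
    return found


all_profiles = profiles_under("Places")
```

A profile that belongs in "Vienna" but was never given that category (call it Example-4) simply does not show up, no matter how well the hierarchy is designed. The list is only as complete as the categorizing.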
WikiTree+ is probably not familiar to many members, but Jan's examples of the kinds of searches you can do there are good ones, so I won't try to expand on it. I do think, though, that we need to highlight the WikiTree+ capabilities better - maybe a series of G2G posts about what they are and how to harness them to meet your needs.
If anyone wants to do more advanced Google searching, feel free to contact me and I'll help - I can either construct your search algorithm or 'splain you how to do it yourself.
I would like to see everyone stop arguing about which way is best to search for information and whether or not all the rest of WikiTree's categories are good to have. My only concern is to make the Holocaust categories maximally useful.
Please … OK? … Thank you!
OK - My rant is now officially over. THANX to everyone for allowing me my soapbox and, although I hope you don't, you can all get back to the free-for-all now!