Autocorrect for misspelled state names

+15 votes
1.1k views

I am preparing an autocorrect function for 608 and 638 errors. First I will do it for American states; later I will add other countries, so that all those misspellings can be quickly corrected.

Question for you.

In what form should I write the name if the date is after 1777-06-04?

Whatever, South xCarolina (misspelled location)

1.) Whatever, South Carolina, USA

2.) Whatever, South Carolina, United states

3.) Whatever, South Carolina, United states of America

Also, is the above date OK for all states, or did some join later?

These are the misspellings I will work on.

  • United States
  • United States of America
  • Alabama
  • Alabama USA
  • Alaska
  • Alaska USA
  • Arizona
  • Arizona USA
  • Arkansas
  • Arkansas USA
  • California
  • California USA
  • Colorado
  • Colorado USA
  • Connecticut
  • Connecticut USA
  • Delaware
  • Delaware USA
  • Florida
  • Florida USA
  • Georgia
  • Georgia USA
  • Hawaii
  • Hawaii USA
  • Idaho
  • Idaho USA
  • Illinois
  • Illinois USA
  • Indiana
  • Indiana USA
  • Iowa
  • Iowa USA
  • Kansas
  • Kansas USA
  • Kentucky
  • Kentucky USA
  • Louisiana
  • Louisiana USA
  • Maine
  • Maine USA
  • Maryland
  • Maryland USA
  • Mass
  • Massachusetts
  • Massachusetts USA
  • Michigan
  • Michigan USA
  • Minnesota
  • Minnesota USA
  • Mississippi
  • Mississippi USA
  • Missouri
  • Missouri USA
  • Montana
  • Montana USA
  • Nebraska
  • Nebraska USA
  • Nevada
  • Nevada USA
  • New Hampshire
  • New Hampshire USA
  • New Jersey
  • New Jersey USA
  • New Mexico
  • New Mexico USA
  • New York
  • New York USA
  • North Carolina
  • North Carolina USA
  • North Dakota
  • North Dakota USA
  • Ohio
  • Ohio USA
  • Oklahoma
  • Oklahoma USA
  • Oregon
  • Oregon USA
  • Pennsylvania
  • Pennsylvania USA
  • Rhode Island
  • Rhode Island USA
  • South Carolina
  • South Carolina USA
  • South Dakota
  • South Dakota USA
  • Tennessee
  • Tennessee USA
  • Texas
  • Texas USA
  • Utah
  • Utah USA
  • Vermont
  • Vermont USA
  • Virginia
  • Virginia USA
  • Washington
  • Washington USA
  • West Virginia
  • West Virginia USA
  • Wisconsin
  • Wisconsin USA
  • Wyoming
  • Wyoming USA

 

 

in The Tree House by Aleš Trtnik G2G6 Pilot (818k points)
retagged by Dorothy Barry
That's my concern. If Washington is autocorrected in the manner proposed, Wikitree could be creating new errors.
There is a Washington in England, the Philippines, Canada, Guyana, and a couple of islands.
Ellen, remember that nothing is actually being autocorrected. The place is just being identified as missing information (state or country), and a possible improvement is being suggested. A person will still have to make the actual correction, and hopefully will give some small thought as to which Washington (or any other place) is correct. Certainly, I can see a few errors being introduced by people not being careful in a correction. However, vastly more errors will be corrected than new errors made.
I will take a few errors over a bunch of errors any day! Thanks for the upgrade!!!

sorry I'm late to the party...

[00:52:50][00:00]Error  mIss -> Mass 216

Miss would be for Mississippi, not Massachusetts

edit: and likely, Moss is a typo for Miss given the proximity of the letter 'o' to the letter 'i' on the keyboard.

Denis, I agree with you. Mass will not be autocorrected.

Also don't forget about Washington, District of Columbia which is unique in the US. 

Article One, Section 8, of the United States Constitution places the District (which is not a state) under the exclusive legislation of Congress.

I would prefer using USA. It's simpler & everyone knows where the USA is
Just go with USA.

9 Answers

+8 votes
Many states weren't formed by 1777.  So, auto-correcting states really needs to be matched with statehood dates.  I vote for #2.  It is definitely not ok to use 4 June 1777 (and where did that date come from?).
by Kathy Zipperer G2G6 Pilot (480k points)
+10 votes

You've opened the proverbial "can of worms" Aleš.<grin>

I know that the Wikitree standard is not to use any abbreviations, but I can't imagine there is a single user of Wikitree anywhere in the world that would not understand that "USA" means the same as the official name of the United States of America. Using this one abbreviation would save database space if nothing else.

"United States" is used on Family Search. But there is also the official name of Mexico (English translation) which is the "United Mexican States". I would avoid the shortened version (United States) as it is named in the Constitution and recognized by the United Nations by the full name. 

I'd vote for USA.

As for a date, pick one! The resolution for independence was adopted on 02 July 1776, but the Declaration, with the edits to Jefferson's draft, was adopted on 04 July 1776. Congress voted 09 September 1776 to name it officially. Each state ratified on different dates. Good luck with choosing a date.

And, yes, delete "Mass", it's wrong. And, officially, Massachusetts is actually the Commonwealth of Massachusetts, but that name is used only in official documents. Commonwealth also applies to Virginia, Kentucky, & Pennsylvania to be anal. But I doubt anyone wants to use that.

Perhaps for ease of use, 04 July 1776 should be used for the cutoff date?

Thanks for all you do! I, for one, thoroughly appreciate it.

[Edit] There is a list of dates here, though I won't vouch for its accuracy:

https://simple.wikipedia.org/wiki/List_of_U.S._states_by_date_of_statehood 

 

by Bobbie Hall G2G6 Pilot (355k points)

I will use the dates on this list. If someone has a problem, let me know. Since none of the states has 04 July 1776, it doesn't really matter.

Sounds very reasonable to me.

Can someone check if I got all the dates correct?

 

So for errors 608 and 638, if the location ends with the text in the first column and the date is on or after the date in the third column, the ending will be corrected to the text in the second column.

Location ends with | New ending | After date
Delaware | Delaware, USA | 17871207
Delaware USA | Delaware, USA | 17871207
Pennsylvania | Pennsylvania, USA | 17871212
Pennsylvania USA | Pennsylvania, USA | 17871212
New Jersey | New Jersey, USA | 17871218
New Jersey USA | New Jersey, USA | 17871218
Georgia | Georgia, USA | 17880102
Georgia USA | Georgia, USA | 17880102
Connecticut | Connecticut, USA | 17880109
Connecticut USA | Connecticut, USA | 17880109
Mass | Massachusetts, USA | 17880206
Massachusetts | Massachusetts, USA | 17880206
Massachusetts USA | Massachusetts, USA | 17880206
Maryland | Maryland, USA | 17880428
Maryland USA | Maryland, USA | 17880428
South Carolina | South Carolina, USA | 17880523
South Carolina USA | South Carolina, USA | 17880523
New Hampshire | New Hampshire, USA | 17880621
New Hampshire USA | New Hampshire, USA | 17880621
Virginia | Virginia, USA | 17880625
Virginia USA | Virginia, USA | 17880625
New York | New York, USA | 17880626
New York USA | New York, USA | 17880626
North Carolina | North Carolina, USA | 17891121
North Carolina USA | North Carolina, USA | 17891121
Rhode Island | Rhode Island, USA | 17900529
Rhode Island USA | Rhode Island, USA | 17900529
Vermont | Vermont, USA | 17910304
Vermont USA | Vermont, USA | 17910304
Kentucky | Kentucky, USA | 17920601
Kentucky USA | Kentucky, USA | 17920601
Tennessee | Tennessee, USA | 17960601
Tennessee USA | Tennessee, USA | 17960601
Ohio | Ohio, USA | 18030301
Ohio USA | Ohio, USA | 18030301
Louisiana | Louisiana, USA | 18120430
Louisiana USA | Louisiana, USA | 18120430
Indiana | Indiana, USA | 18161211
Indiana USA | Indiana, USA | 18161211
Mississippi | Mississippi, USA | 18171210
Mississippi USA | Mississippi, USA | 18171210
Illinois | Illinois, USA | 18181203
Illinois USA | Illinois, USA | 18181203
Alabama | Alabama, USA | 18191214
Alabama USA | Alabama, USA | 18191214
Maine | Maine, USA | 18200315
Maine USA | Maine, USA | 18200315
Missouri | Missouri, USA | 18210810
Missouri USA | Missouri, USA | 18210810
Arkansas | Arkansas, USA | 18360615
Arkansas USA | Arkansas, USA | 18360615
Michigan | Michigan, USA | 18370126
Michigan USA | Michigan, USA | 18370126
Florida | Florida, USA | 18450303
Florida USA | Florida, USA | 18450303
Texas | Texas, USA | 18451229
Texas USA | Texas, USA | 18451229
Iowa | Iowa, USA | 18461228
Iowa USA | Iowa, USA | 18461228
Wisconsin | Wisconsin, USA | 18480529
Wisconsin USA | Wisconsin, USA | 18480529
California | California, USA | 18500909
California USA | California, USA | 18500909
Minnesota | Minnesota, USA | 18580511
Minnesota USA | Minnesota, USA | 18580511
Oregon | Oregon, USA | 18590214
Oregon USA | Oregon, USA | 18590214
Kansas | Kansas, USA | 18610129
Kansas USA | Kansas, USA | 18610129
West Virginia | West Virginia, USA | 18630620
West Virginia USA | West Virginia, USA | 18630620
Nevada | Nevada, USA | 18641031
Nevada USA | Nevada, USA | 18641031
Nebraska | Nebraska, USA | 18670301
Nebraska USA | Nebraska, USA | 18670301
Colorado | Colorado, USA | 18760801
Colorado USA | Colorado, USA | 18760801
North Dakota | North Dakota, USA | 18891102
North Dakota USA | North Dakota, USA | 18891102
South Dakota | South Dakota, USA | 18891102
South Dakota USA | South Dakota, USA | 18891102
Montana | Montana, USA | 18891108
Montana USA | Montana, USA | 18891108
Washington | Washington, USA | 18891111
Washington USA | Washington, USA | 18891111
Idaho | Idaho, USA | 18900703
Idaho USA | Idaho, USA | 18900703
Wyoming | Wyoming, USA | 18900710
Wyoming USA | Wyoming, USA | 18900710
Utah | Utah, USA | 18960104
Utah USA | Utah, USA | 18960104
Oklahoma | Oklahoma, USA | 19071116
Oklahoma USA | Oklahoma, USA | 19071116
New Mexico | New Mexico, USA | 19120106
New Mexico USA | New Mexico, USA | 19120106
Arizona | Arizona, USA | 19120214
Arizona USA | Arizona, USA | 19120214
Alaska | Alaska, USA | 19590103
Alaska USA | Alaska, USA | 19590103
Hawaii | Hawaii, USA | 19590821
Hawaii USA | Hawaii, USA | 19590821
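
For readers who want to see the rule in code, here is a minimal sketch of how the table lookup might work. It is not the actual DB_Errors implementation: the function name `suggest_ending`, the `RULES` list (only a few rows copied from the table), and the choice to test longer endings first are all assumptions made for illustration. Dates are kept as YYYYMMDD integers, as in the table.

```python
# Hypothetical subset of the table: (location ends with, new ending, after date).
RULES = [
    ("Delaware", "Delaware, USA", 17871207),
    ("Delaware USA", "Delaware, USA", 17871207),
    ("Virginia", "Virginia, USA", 17880625),
    ("Virginia USA", "Virginia, USA", 17880625),
    ("West Virginia", "West Virginia, USA", 18630620),
    ("West Virginia USA", "West Virginia, USA", 18630620),
]


def suggest_ending(location, event_date):
    """Return the suggested location text, or None when no rule applies.

    event_date is a YYYYMMDD integer, matching the third column of the table.
    """
    # Assumption: longer endings are tried first so that "West Virginia"
    # is matched before the shorter "Virginia".
    for ending, new_ending, start in sorted(RULES, key=lambda r: -len(r[0])):
        if not location.endswith(ending):
            continue
        prefix = location[: len(location) - len(ending)]
        # Only accept the match at a word boundary (start, space, or comma).
        if prefix and not prefix.endswith((" ", ",")):
            continue
        if event_date < start:
            return None  # date is before the "after date": suggest nothing
        return prefix + new_ending
    return None


print(suggest_ending("Dover, Delaware", 18000101))
print(suggest_ending("Wheeling, West Virginia", 18000101))
```

The function only prepares a suggestion; a person would still have to review and save it, as discussed earlier in the thread.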
New York should be 17880726.

I find no other errors.
We've had this conversation before also. When do "we" become the USA? We all know that the 13 colonies ratified the constitution on different dates. That would require us to know 13 different dates, when we clearly have one date: 4 Jul 1776.

IN CONGRESS, JULY 4, 1776. The unanimous Declaration of the thirteen united States of America. Source: Declaration of Independence

This to me is a clear statement of intent by the Delegates sent to the Continental Congress. It is a statement of this is who we are now.
I think, Aleš, that this is a place where you really can't say one date. There should be some leeway between 1776 and 1788 (in Connecticut).

Anne, I set the date to 1788-01-09 for Connecticut. I will enable autocorrect only for profiles after this date.

     

Thanks Bobby. I will correct it. I had a feeling I had made a mistake. Entering 50 numbers is bound to produce one.

One error in 50, that's pretty good odds! Well done Aleš.

And, I completely agree with Anne B -- the original 13 states should be allowed leeway between 4 July 1776 and whenever they officially adopted the articles of confederation. The original 13 in your list are Delaware to Rhode Island.

And, Aleš, THANK YOU.  You are a true database hero.
The "leeway" probably makes a lot of sense, since many people would use the 04 Jul 1776 date as a cutoff, even if not correct.
That sounds ok Aleš
Wait, what are we doing here?  We have been through this before.  The dates for the ratification of the United States Constitution are not the dates of statehood for the original 13 colonies.

The 13 original colonies declared their independence on 4 July 1776. That is the birth date for the United States of America (the name used in the declaration). It is the date we celebrate every year. No other date is acceptable for New Hampshire, Massachusetts, Connecticut, Rhode Island, New York, New Jersey, Pennsylvania, Delaware, Maryland, Virginia, North Carolina, South Carolina, and Georgia.

COPY-PASTE: I don’t want to get into a spat over the nuances of history, but it’s not true. Jefferson used the phrase United States of America in the Declaration of Independence. The new government existed in the form of the Continental Congress. In 1777, the Articles of Confederation were drafted and state directly that the name of the country shall be The United States of America. Even in its unratified state from 1777-1781 the Articles of Confederation allowed the Continental Congress to conduct war versus Great Britain, conduct diplomacy with European powers as a single nation, deal with issues of borders, land expansion, Native American relations, etc. The articles were not completely ratified until 1781, but they certainly formed a formal system of government for The United States of America, before it was later replaced by the Constitution of the United States in 1789. In 1777, Morocco became the first foreign country to recognize the USA as a separate nation, and most importantly France also did so later that year. In 1976, we celebrated the two hundredth anniversary of the birth of our country with tremendous fanfare – we did nothing in 1989. Certainly, we did not have the same form of Constitutional democracy which we have today before 1789, but that does not mean Constitution Day is the birth of the United States of America. I doubt most Americans have any idea that there is such a thing as Constitution Day.

From the perspective of the United States, we became a separate and sovereign nation on 4 July 1776.  Great Britain might disagree and say they did not give up governmental control of their colonies until 1783 when the Treaty of Paris formally acknowledged the United States to be free, sovereign, and independent states; but then again, they lost.

There is absolutely no reason to choose any date other than July 4, 1776 for the birth of the United States.

As I have noticed, Wikitree makes no distinctions between states and territories; thus "Michigan" applies not only to the state but also to the territory formed out of what had been Indiana Territory in 1805.
+8 votes
United States and USA and United States of America have *ALL* been advocated as correct, with choice among these left to individual preference until this moment.  If you want to change the long standing policy then there should first be a G2G discussion of whether it should be changed and, if so, what to change it to.
by Gaile Connolly G2G Astronaut (1.2m points)
It is not implying what is correct, I am just asking what to use, where there is none used (which is wrong) and to automatically add it. And I have the answer which I agree with. It is USA.
Apparently I did not understand the original post.  My impression was that you plan to automatically change all instances of any location in this country to uniformly name the state (or commonwealth or district) and country.

As long as you do not plan to change all instances of "United States of America" or "United States" to "USA" (when correct according to the date), I have no objection.
I used to prefer USA, as simple, short and obvious. Because of the FamilySearch database's preference for United States, I have gotten used to that and probably 1000s of locations have been corrected to United States.

If you use USA, you are going to be in a situation where Database Errors is adding USA to every state location, and then the wikitree suggestions will be saying change it to United States - it will be very annoying.

Is it possible to say either USA or United States are both correct without generating database errors?
Previous discussions did not decide on one absolutely correct form. USA is a standard country abbreviation and is acceptable. United States of America is always the most correct, but is long. United States has been deemed OK (by virtue of its presence on the FamilySearch list) and common usage.
Maybe we need to apply the "as they did then" rule.  Does anybody know what people living in their little log cabins called their country back in the early 1800's?  ... how about in the late 1800's?  ... and other periods up to and including today?

I bet that in different states - maybe even in different counties or towns - the local folk used different words to name their country.  Please understand - I'm not talking about official country name here, but trying to get to the "what they did then" level.  I know some people near me call it "the U S of A" right now.
All three forms are and will always be acceptable for DB_Errors.

Joe, you have a good point. But for now, there is no FS hint for location in autocorrect. If one is added, I can change it then, but for now USA is used on millions of profiles and I think we should stick with that.
Well, I know my Kentucky relatives used "merca" but I don't think we want to adopt that. I'm fine with using a reasonable standard and have been adding United States to any profile I come across without it based on the Family Search location list. I also like USA but thought that wasn't really correct. I'm back to USA if I'm not going to create errors for anyone.
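
As an illustration of "all three forms are acceptable", a check like the following could decide whether a location already ends with a country. This is only a sketch under assumptions: the function name `has_accepted_country` is made up, the comparison is case-sensitive, and the country is assumed to be the last comma-separated part of the location.

```python
# The three forms named in this thread as acceptable.
ACCEPTED_COUNTRY_FORMS = ("USA", "United States", "United States of America")


def has_accepted_country(location):
    """True when the last comma-separated part is one of the accepted forms."""
    last_part = location.split(",")[-1].strip()
    return last_part in ACCEPTED_COUNTRY_FORMS


print(has_accepted_country("Charleston, South Carolina, United States"))
print(has_accepted_country("Charleston, South Carolina"))
```

With a check like this, a suggestion to add "USA" would only be prepared when none of the three forms is present, so existing "United States" entries would be left alone.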
+4 votes
Personally, I'd prefer 1789, when the constitution went into effect.

Here is the Wikipedia list of states by date of admission and what each was called previously:

https://simple.wikipedia.org/wiki/List_of_U.S._states_by_date_of_statehood

Pat
by Living Prickett G2G6 Mach 9 (97.7k points)
I have been using this accurate list (Link) here for 3 years. :)
I don't understand this phrase: "using this accurate list (Link)"

Pat
That list is misleading and dead wrong in assuming that the political entity created by the Declaration of Independence was anything other than the United States of America (as it was named by the Declaration itself).
The colonies collectively declared independence in 1776. Continental troops and the British Army did not hesitate to cross territorial boundaries. There were Articles of Confederation applicable before 1789. The colonies did some things characteristic of independent republics.

I guess all the difficult work Stephen Tibbits and I undertook to create the Colonial North American Place Names - Google Sheets has been forgotten as was this page North American Place Names (wikitree.com). It was meant to curtail all of these types of discussions.

+5 votes
I would like to know exactly what will happen when you "autocorrect" the data, Aleš.

Are you being literal, meaning that your program will automatically correct any misspelled data that is currently entered in the location fields? Or are you just creating a drop-down menu within WikiTree, similar to the current external menu (from FamilySearch?) that gives us standardized choices?

What action, if any, will we, as profile managers, need to take?
by Lindy Jones G2G6 Pilot (260k points)
No automation on corrections. You will still have to review and save.

Here is a working example. You can do those corrections. Please verify that the changes are OK.

http://www.softdata.si/osebe_staro/ales/wikitree/608_New_example.htm
Thanks for the reply, Aleš!

Use of the word "autocorrect" may be causing some confusion. You should delete the "auto" prefix for clarification (my opinion, anyway!).

In the error report it is just a Correct button.

I meant it as an automatically prepared correction. But it really can be misunderstood or confusing. Before, we used it for UTF errors, where it was clear what it means.

Any suggestions on how to name it? We will use this expression a few more times with other errors. In the error report, I am satisfied with the Correct button.

Name suggestion "Apply this Change"

Or on the button add what will happen

"Press to change georiga to Georgia, USA"

The bad thing with a solution like this is that the lesson learned is that speed is more important than doing good work and checking sources.... a one click solution ==> will take 100 seconds doing 200 changes....

  1. What happens with odd places, like Sydney in Canada?
  2. I would prefer more checks that it can be a relevant change like checking parents etc...  and a bot only solution ;-) 

Or maybe restrict a solution like this to Profile Managers only? Or add a control question and a log in which the person clicking the button confirms: yes, I have checked the profile and .....

 

But these are really trivial changes. For complex ones, there will be no automatically prepared corrections.

+4 votes

July 4, 1776, is the date recognized that the states were independent from Great Britain, not the date they became a new nation. Then in 1777 with the adoption of the Articles, they became a loose confederation. As Wikipedia says, "A guiding principle of the Articles was to preserve the independence and sovereignty of the states." Not until the constitution was ratified in 1788 did it go into effect and it was 1789 before there was a president and Congress. So I think it's inaccurate to use 1776 as the date for USA and prefer 1789.

Pat

by Living Prickett G2G6 Mach 9 (97.7k points)
My point of view is that 1776-1789 is gray area and I will allow both forms. Otherwise there will be constant complaints about that.
It's not a gray area at all Aleš.  The views expressed by Patricia and Bobbie are an incorrect understanding of the formation of the United States and are a view held by almost no one in the United States.  The United States came into existence on July 4, 1776.  There were disagreements regarding the powers of the federal government vs the powers of states, and the rights to be granted under the Constitution.  The states were not independent countries or territories agreeing to join the Union by ratifying the Constitution, they were agreeing that this would be the form of government of the Union which they were already a part of.  What was the United States in 1776?  A country with absolutely no one in it?  Who signed the Treaty of Paris, four guys representing no one?  

There is no wiggle room at all here.  July 4, 1776 is the only acceptable date for the 13 original states.  The other 37 states were admitted to the union under acts of congress as prescribed by Article IV, Section 3, Clause 1 of the United States Constitution, a very different process.
So 13 states started using USA in 1776 and the others as stated in the table in the other post. I will update autocorrect on Monday.

How very patronizing and insulting of you Joe: "The views expressed by Patricia and Bobbie are an incorrect understanding of the formation of the United States and are a view held by almost no one in the United States."

I do not have an incorrect understanding of the formation of the United States. In my, obviously far too humble, opinion, the question revolves around when a given STATE attained statehood, what form the name should take for a given STATE, not when the US was formed. 

My discussion of this subject is now ended.

You are correct about my tone Bobbie. My apologies to you and Patricia. Aleš is not American so his suggestion that there was gray area on a point where I see only black and white caused me to use some forceful language to make my point.

This is also a tiresome argument which we have had before with the conclusion being 4 July 1776 is the correct date for the 13 original states. It is frustrating to have it again, and it was frustrating that people defaulted to an incorrect position. I still have no understanding why anyone thinks they were not states when they called themselves states, and why anyone thinks they were not part of the United States when they called themselves the United States of America beginning on 4 July 1776. Again, ratification of the Constitution had absolutely nothing to do with statehood. What would have happened if say Rhode Island (the last one) had not ratified the Constitution in 1790? Absolutely nothing. Only 9 states were required to ratify the Constitution for it to become the law of the land which happened in 1788. This again shows that Rhode Island was a state of the Union of the United States of America both before and after the Constitution went into effect.

You keep trying to make some distinction about when a “given STATE attained statehood, what form the name should take for a given STATE, not when the US was formed.”  This is not really correct.  Aleš just wants to know when it is correct to put USA or United States or United States of America after a state name.  This is clearly 4 July 1776 and no other.
Actually after what date to use USA is already resolved. I will correct start date for Delaware, Pennsylvania, New Jersey, Georgia, Connecticut, Massachusetts, Maryland, South Carolina, New Hampshire, Virginia, New York, North Carolina and Rhode Island to 17760604

But in another thread I am asking what to use before USA for each state.
Well, it seems to me you need a constitution to have a nation and there was no constitution on 4 July 1776. As for the term as used in the Declaration, they were just saying we are the "States of America" and we are "united" (no cap) in that we are declaring that each of us is independent from Britain -- it says nothing about becoming a new nation.

Pat

July Aleš - 17760704

Well, that would be 17760704 not 17760604, i.e., it was July, not June 1776 that we signed the declaration.
Oops, sorry about that. I will correct it.
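
To make the agreed outcome concrete, here is a rough sketch of how the start dates could be stored: the original 13 states share 17760704, and every other state keeps its date from the table. The names `ORIGINAL_13`, `STATEHOOD` and `usa_start_date` are hypothetical, and only a few table rows are included.

```python
# The 13 states listed in the post above, all starting on 4 July 1776.
ORIGINAL_13 = {
    "Delaware", "Pennsylvania", "New Jersey", "Georgia", "Connecticut",
    "Massachusetts", "Maryland", "South Carolina", "New Hampshire",
    "Virginia", "New York", "North Carolina", "Rhode Island",
}
INDEPENDENCE = 17760704  # YYYYMMDD, July (not June) 1776

# A few later states with the dates from the table in this thread.
STATEHOOD = {
    "Vermont": 17910304,
    "Kentucky": 17920601,
    "Ohio": 18030301,
    "Hawaii": 19590821,
}


def usa_start_date(state):
    """Return the first date on which 'USA' may follow the state name, or None."""
    if state in ORIGINAL_13:
        return INDEPENDENCE
    return STATEHOOD.get(state)


print(usa_start_date("Rhode Island"))
print(usa_start_date("Vermont"))
```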
+4 votes
Most states joined the union later than the original 13 states.  You can get a list from this wikipedia page:  https://en.wikipedia.org/wiki/List_of_U.S._states_by_date_of_admission_to_the_Union
by Peggy McMath G2G6 Mach 6 (68.0k points)
+4 votes
Maybe some of the more obvious and frequent misspellings could be autocorrected ("Connetticut" for Connecticut, "Massachussetts" for Massachusetts) as well as blatant typos (let us say "Misssissippi").

Instead of "USA" we get "United States".
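
One possible way to catch misspellings like the ones above is fuzzy matching against the list of state names. The sketch below uses `difflib.get_close_matches` from the Python standard library; the function name `suggest_state`, the short `STATE_NAMES` list, and the 0.8 cutoff are assumptions chosen for illustration, not anything the error report is known to use.

```python
import difflib

# Hypothetical short list; a real check would hold all state names.
STATE_NAMES = ["Connecticut", "Massachusetts", "Mississippi", "Missouri", "Georgia"]


def suggest_state(text, cutoff=0.8):
    """Return the closest state name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(text, STATE_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest_state("Connetticut"))
print(suggest_state("Massachussetts"))
print(suggest_state("Misssissippi"))
```

A fairly high cutoff keeps short, ambiguous strings such as "Mass" from being matched to anything, which fits the earlier decision not to autocorrect it.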
by Paul Brower G2G6 Mach 1 (11.4k points)
+4 votes
In the example, I would go with #3 -- United States (which is the form currently used in the drop down lists) but use a capital S.

USA also stands for Union of South Africa
by Walt Steesy G2G6 Mach 5 (50.8k points)


...