Standardising Place Names


66 replies to this topic

#1 Robert Fletcher

Robert Fletcher

    Member

  • Members
  • PipPip
  • 14 posts

Posted 12 March 2016 - 01:42 PM

I am a bit confused about standardising place names. I imported my database as a GEDCOM and would like nothing better than to get all the place names standardised.

Am I correct in thinking that this is done in the Dataclean>Placeclean utility? Am I also correct in assuming that this will not work correctly until the Place List has been edited to reflect the standardised form?
 
If I am on the right track, is it safe to remove duplicates? E.g. I have 11 entries in my place list for Albany, Western Australia. If I standardise one to Albany, Western Australia, Australia and delete the others, will it remove anything from my database?

If I am doing this all wrong, can you tell me the correct way or point me to a tutorial? So far I have only found this alluded to.
 
Thanks
 
Robert…..


#2 mleroux

mleroux

    Advanced Member

  • Members
  • PipPipPip
  • 68 posts

Posted 12 March 2016 - 07:10 PM

If you go to list/place names, you can select one of the entries and merge it with the others. You will end up with a single entry, which you can then geotag and/or standardize.

Marc
Always learning and loving the discovery process. Focusing on the Huntingdon and Soulanges areas of Quebec - O'Connor/Leroux/Walsh/McCann/Savage/Lalonde/Lauzon


#3 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 12 March 2016 - 11:44 PM

 

If I am on the right track, is it safe to remove duplicates? E.g. I have 11 entries in my place list for Albany, Western Australia. If I standardise one to Albany, Western Australia, Australia and delete the others, will it remove anything from my database?

 

 

It's only safe to remove duplicate place names that are unused. But after a GEDCOM import into a new, empty database, all the place names would be used. The only safe way to deal with this situation is to go into Lists->Places and merge all the duplicates.

 

Jerry



#4 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 13 March 2016 - 12:05 AM

The bigger picture here is that RM does not standardize place names. Rather, it helps you to standardize place names if you choose to avail yourself of that help. You can do so by using Gazetteer to look up place names and by using County Check to check place names as they are entered. Both of these activities take place from keyboard input and are not in effect during GEDCOM import.

 

RM actually stores place names in up to three forms: the place name, the standardized place name, and the short (or abbreviated) place name.

  • The place name is the place name you enter or the place name imported from GEDCOM. It is the name that is normally used in reports. As previously described, RM does not standardize these place names but it helps you to standardize these place names if you wish to avail yourself of that help;
  • As far as I know, the standardized place name field is filled in automatically by RM only during geocoding. And as far as I know, it's used only during RM's mapping functions and is not available in RM's printed reports. It is certainly not available to RM's sentence template language, which allows you to customize a lot of things in your reports. So if you want your reports to use standardized place names, you will have to standardize them yourself in the place name field.
  • The short or abbreviated place name is never filled in automatically by RM, but you can fill it in yourself. Having filled it in yourself, you can use it in reports by using the [Place:Short] option in RM's sentence template language. But be aware that not all RM reports use sentence templates. And even for reports that do use sentence templates, there are sentences that are generated by RM without reference to sentence templates.

Finally, for things like GEDCOM export and RM's interface with FamilySearch Family Tree, it is the "place name" that is used, not the "standardized place name" and not the "short place name".
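
To make the three forms concrete, here is a minimal Python sketch of the idea. It is not RM's actual data model: the class name, the field names, and the fallback to the full name when no short name exists are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Place:
    """Hypothetical record mirroring the three forms described above."""
    name: str                # as typed or imported; used by reports and export
    standardized: str = ""   # assumed to be filled in only by geocoding
    short: str = ""          # filled in by the user, if at all


def report_text(place: Place, use_short: bool = False) -> str:
    """Return the text a sentence template would print.

    Falling back to the full name when the short name is empty is an
    assumption of this sketch, not documented RM behaviour.
    """
    if use_short and place.short:
        return place.short
    return place.name


def export_text(place: Place) -> str:
    """GEDCOM export uses the plain place name, never the other two forms."""
    return place.name


albany = Place(
    name="Albany, Western Australia",
    standardized="Albany, Western Australia, Australia",
)
print(report_text(albany))                  # the name as entered
print(report_text(albany, use_short=True))  # still the name: no short form yet
```

The point of the sketch is that the standardized form never reaches a report or an export on its own, so the name field itself has to be edited.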

 

Jerry

 



#5 Don Newcomb

Don Newcomb

    Advanced Member

  • Members
  • PipPipPip
  • 1045 posts

Posted 13 March 2016 - 08:28 AM

One should be careful when "standardizing" place names. Often databases have place names that are not specific enough to "zero in" on one location. One that I've run into is "Greyfriars". This name could apply to any of the dozens of former Franciscan houses and churches in England. Even if you know which abbey it applies to, the church or abbey is seldom an actual standardized place, but rather it's a place detail (location) within a town or city, which would be the standardized place. 

 

Genealogy programs have (properly) allowed users to enter anything in a place field. The results in many cases have been both frustrating and at the same time humorous, when the genealogist is not familiar with geography. (e.g. Dublin, Co Cork, Ireland) They may also enter towns in districts whose boundaries have changed, misspell place names, etc.

 

What I have done is to try to go through place names one-by-one. Not an easy task as my main database has over 5000 place names. In cases where the name is confused enough that it does not properly geocode to a standard place, I prepend a dot "." to the place name so that it sorts to the top of the place list. As I get time, I try to go through these places and resolve what and where they actually are. In many cases I have to print a list of all the events that occurred at that place and edit them each to move part of the name to the details, such as in the case of "Greyfriars" above.
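
The dot trick relies on "." sorting ahead of letters and digits. A small Python sketch, assuming plain character-code ordering (RM's own list may collate differently) and using made-up list contents:

```python
def flag_unresolved(name: str) -> str:
    """Prepend a dot so the name sorts ahead of ordinary place names."""
    return name if name.startswith(".") else "." + name


places = [
    "Nottingham, Nottinghamshire, England",
    "Greyfriars",
    "Albany, Western Australia, Australia",
    "Dublin, Co Cork, Ireland",
]
# Names that did not geocode to a standard place (chosen by hand here).
unresolved = {"Greyfriars", "Dublin, Co Cork, Ireland"}

place_list = sorted(
    flag_unresolved(p) if p in unresolved else p for p in places
)
for p in place_list:
    print(p)
```

The two flagged names come out first, so the places still needing work stay together at the top of the list.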

 

When you use RootsMagic, you become much more circumspect and careful about importing other people's GEDCOMs. In general you begin importing them into a separate database, searching for problems, checking place names, etc. and once cleaned, dragging the  new data into your own database and connecting the lines. Life is much easier this way.  



#6 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 13 March 2016 - 11:22 AM

One should be careful when "standardizing" place names. 

 

You should also make an informed decision if you want to conform to RM's place name standard or not. The standard always includes the name of the country and never includes descriptors such as "County", "Township", etc. For example, with the standard you would have Dandridge, Jefferson, Tennessee, United States whereas I prefer entering the same place as Dandridge, Jefferson County, Tennessee. I'm not real sure what the standard wants you to do when a city or town spans multiple counties and you know that an event took place in the city or town but not in which county. Plus, there are numerous cases where there are more political subdivisions than this, or where political subdivisions at different levels of the hierarchy have the same name and it's hard to tell which subdivision you are talking about without including descriptors such as County and Township.

 

You should also make an informed decision if you want to use RM's Place Details feature or not. For example, cemetery names are not a part of RM's place name standard and RM expects you to enter cemetery names into the Place Details field instead of into the Place field. But having done so, you will discover that your Place Details most typically are not exported via GEDCOM to third party software. RM can export Place Details to itself, but the GEDCOM extensions that it uses to do so are usually not recognized by other software. And of course it's not just cemetery names where this issue arises. It's also an issue with any other data that you might enter into RM's Place Details.

 

Jerry



#7 Robert Fletcher

Robert Fletcher

    Member

  • Members
  • PipPip
  • 14 posts

Posted 13 March 2016 - 11:52 AM

Hi Everyone

 

I would like to thank you for the wonderful comments and suggestions. The problem I have is a database I started 8 years ago, which was merged with part of my cousin's database, and neither of us handled things then like we do now. I previously used FTM, and that does not seem to have the tools to clean up like RM does; it is a matter of going one by one, if I remember correctly. RM is fantastic in that regard, but it is a learning curve. I seem to have worked out what I am doing, and in most cases the geocode works. Some places are not there. E.g. I came across the suburb of Chester Park, Western Australia. I found that it changed its name in the early 20th century to Bassendean, so I have shown this as “Chester Park (Bassendean), Western Australia, Australia”. There are a few more oddities I have to work around.

 

I do like Don's suggestion of putting a dot at the beginning of those place names that cannot be immediately resolved. I have some now that I will only find by going through the list, so I will adopt this.

 

Thanks again for your help.



#8 KFN

KFN

    Advanced Member

  • Members
  • PipPipPip
  • 212 posts

Posted 13 March 2016 - 12:11 PM

Jerry, does RM prevent you from putting a cemetery name in the place field? I always put the cemetery name in the place or street address if known or valuable.

#9 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 13 March 2016 - 12:27 PM

Jerry, does RM prevent you from putting a cemetery name in the place field? I always put the cemetery name in the place or street address if known or valuable.

 

No, RM does not prevent you from putting a cemetery name in the place field. You can put anything you wish in the place field. But if you put a cemetery name in the place field, then the place field is "non-standard". RM really wants you to put the cemetery name in place details and to leave the place field as "standard".

 

For a lot of reasons, I do not handle place names the way RM wants me to. All my place names are "non-standard" and I do put cemetery names in the place field rather than the place details field. I do not use RM's place details field at all because of the problems of exchanging such data with third party software.

 

If you use place fields that are "non-standard", then RM's County Check feature will nag you to death, so I turn off County Check. I do use RM's Gazetteer extensively. When I copy and paste place names from Gazetteer, I add descriptors such as "County" to the place name as needed and I remove the country code from the place name if the country code is United States.
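
The copy-and-adjust step can be sketched in Python. This is only an illustration of the rule of thumb above, not anything RM does: the function name is made up, and it assumes a four-part city, county, state, country name, so anything else still needs judgment by hand.

```python
def to_preferred(gazetteer_name: str) -> str:
    """Turn a standard-style name into the 'County'-labelled US form.

    Assumes the name is laid out as city, county, state, country.
    Non-US names are returned unchanged.
    """
    parts = [p.strip() for p in gazetteer_name.split(",")]
    if parts[-1] != "United States":
        return ", ".join(parts)
    parts = parts[:-1]  # drop the country for US places
    if len(parts) == 3 and not parts[1].endswith(" County"):
        parts[1] += " County"  # label the county level
    return ", ".join(parts)


print(to_preferred("Dandridge, Jefferson, Tennessee, United States"))
print(to_preferred("Albany, Western Australia, Australia"))
```

A county-level or state-level US name only has its country removed here, since the sketch cannot tell which part, if any, is a county.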

 

Jerry



#10 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 13 March 2016 - 12:47 PM

 

I do like Don's suggestion of putting a dot at the beginning of those place names that cannot be immediately resolved. I have some now that I will only find by going through the list, so I will adopt this.

 

 

Even with RM's DataClean facility, cleaning up your places for the most part still reduces down to cleaning one place at a time. For most cleaning activities, it's actually easier to use Lists->Place List to clean up places rather than using the PlaceClean feature within DataClean. But it's well worth running the PlaceClean feature just to see what, if anything, it reports.

 

It may or may not be obvious that "one place at a time" does mean "one place at a time" and not "the place for each event, one event at a time". So do it from Lists->Place List rather than from the Edit Person screen for each person. The Merge facility inside of Lists->Place List works extremely well, and cleaning a place inside of Lists->Place List will fix the place for all of the events in which it appears. You are much better off merging places than trying to delete the ones that are wrong, even for places that are no longer in use. When you merge, keep the one that is right (or better) and don't keep the one that is wrong (or not as good). You can merge several "not as good" ones with one "better" one all at the same time.
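
Why merging is safe where deleting is not can be shown with a small Python sketch. The dictionaries below are invented stand-ins for a place list and an event list, not RM's tables: a merge repoints every event at the kept place before the duplicates disappear, so nothing is left pointing at a missing place.

```python
def merge_places(places: dict, events: list, keep_id: int, duplicate_ids) -> None:
    """Repoint events from the duplicates to the kept place, then drop them."""
    duplicates = set(duplicate_ids) - {keep_id}
    for event in events:
        if event["place_id"] in duplicates:
            event["place_id"] = keep_id
    for place_id in duplicates:
        places.pop(place_id, None)


places = {
    1: "Albany, Western Australia, Australia",  # the "better" one
    2: "Albany, Western Australia",
    3: "Albany, WA",
}
events = [
    {"fact": "Birth", "place_id": 2},
    {"fact": "Death", "place_id": 3},
    {"fact": "Burial", "place_id": 1},
]

merge_places(places, events, keep_id=1, duplicate_ids=[2, 3])
print(places)
print(events)
```

After the call there is one Albany entry and all three events use it, which is the "fix the place for all of the events" effect described above.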

 

Jerry



#11 Vyger

Vyger

    Advanced Member

  • Members
  • PipPipPip
  • 3407 posts

Posted 13 March 2016 - 04:55 PM

Hopefully for the benefit of Robert I will give my own views on place management.

 

Firstly, there is no hard and fast rule. You do want your places either to be recognized by online mapping utilities or to be geocoded by yourself to satisfy that need.

 

A Place with a Standardized Place Name and geocoding will not be flagged by further geocoding operations; in other words, if you enter this information Rootsmagic will be happy. There have been past wishes for the ability to build a custom Gazetteer add-on, like a custom spell check file, but so far these have not been implemented in the program. I work at Parish level as a place in Ireland, create my own Standardized Place Name and geocode it with the geographic centre of the parish; other countries will undoubtedly warrant different approaches.

 

I do use Place Details and I do use CountyCheck, but I understand the objections other users have to these features and really do wish Rootsmagic would work to overcome them. The issues that Jerry and Don have outlined are very genuine ones but are also easily overcome. The cemetery name certainly belongs in the Place Details field in my opinion, but as stated this would then be lost in a GEDCOM export to another genealogy program. This is just WRONG of any programmer who might claim to have an interest in genealogy research. FTM had an in-program designation of Place Details, but on GEDCOM export the information in this "field" was prefixed to the information in the Place field. That is what Rootsmagic needs to do to allow its users freedom to migrate to other platforms, and to have enough confidence in the quality of the Rootsmagic program that nobody would want to migrate.
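
The prefix-on-export behaviour attributed to FTM above can be sketched in a few lines of Python. The function name and the sample cemetery are hypothetical; only the PLAC tag itself comes from GEDCOM.

```python
def plac_line(place: str, detail: str = "", level: int = 2) -> str:
    """Build a GEDCOM PLAC line with the place detail prefixed to the place."""
    text = f"{detail}, {place}" if detail else place
    return f"{level} PLAC {text}"


print(plac_line("Dandridge, Jefferson, Tennessee, United States",
                detail="Shady Grove Cemetery"))
print(plac_line("Dandridge, Jefferson, Tennessee, United States"))
```

Because the detail ends up inside the ordinary PLAC value, any program that reads standard GEDCOM keeps it, at the cost of the place no longer being in "standard" form on the receiving side.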

 

In case you are not aware yet this same argument applies to Shared Events within Rootsmagic.

 

I posted a video some years ago regarding Places Management. Little has changed, so it is still pretty valid today; the link is below. I am sure Rootsmagic have something in their Webinar library, so that would also be worth a look, and after that simply make your mind up as to what works best for you and is internationally recognized.

Rootsmagic mapping your genealogy and family tree

We are all limited by our visions and abilities

Whilst we can borrow from the visions of others we cannot always deliver.

 

User of Family Historian 6.2.7, Rootsmagic 7.6.0, Family Tree Maker 2014 & Legacy 7.5

 

Excel to Gedcom conversion - simple getting started tutorials here

 

Root


#12 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6254 posts

Posted 13 March 2016 - 08:30 PM

I wonder if the Place and Place Detail architecture is simply the wrong solution. Maybe there should be only the one Place or Location field for an event, containing as many levels of detail as needed. That is compatible with GEDCOM but complicates things for standardization and county check and sentences and report indexes and... But maybe those are manageable.

Looking at the Place hierarchy for a burial as country, state/province, county or other subdivision, city, cemetery, plot, there need be just one geocode at whatever resolution is known, county check need only examine the first three levels, standardization the first four or +/- depending on the administrative divisions determined by the higher level, the sentence template language could have modifiers to pick out the first n levels or any one level, report indexes could be constrained to the first n levels, ...
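
The "first n levels" idea is easy to try out in Python. This sketch assumes the single Place field is written smallest unit first and separated by commas, as place names usually are; the plot and cemetery in the sample are invented.

```python
def top_levels(place: str, n: int) -> str:
    """Keep only the n highest levels (country, state, ...) of a place."""
    parts = [p.strip() for p in place.split(",")]
    return ", ".join(parts[-n:]) if n > 0 else ""


burial = ("Plot 12, Shady Grove Cemetery, Oliver Springs, "
          "Anderson, Tennessee, United States")

print(top_levels(burial, 3))  # what a county check would look at
print(top_levels(burial, 4))  # what standardization would look at
```

The weakness noted in the post shows up immediately: the count only works if every place puts the same kind of unit at the same level.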

Just as exported Place Details is problematic, so too is importing this all-levels Place into other systems; both require user controls over what goes to PLAC and where the rest would go. And both Place architectures need user controls for report outputs over the suppression of selected country names, e.g., "United States".

Perhaps it is no better a solution given the variations in administrative divisions. Can all cemetery names be on the 5th level or will some be necessarily on the 4th? That's a similar issue to putting the name in the Place or the Details. They need to be all the same for Place Indexes to handle them consistently.

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#13 Vyger

Vyger

    Advanced Member

  • Members
  • PipPipPip
  • 3407 posts

Posted 14 March 2016 - 06:14 AM

I find Rootsmagic's use of Place Details logical, albeit there are still a few problems in managing them to be ironed out. The main objection I have always read is that the option to export Place Details prefixed to the Place field for GEDCOM compatibility does not exist; Rootsmagic should overcome this.

 

Family Tree Maker facilitates the in-program designation of Place Details and a nice hierarchical Place view, which is a perfectly logical way to view Places. I hope the next version of Rootsmagic incorporates a similar view.

 

Should the inboard Place holder in Rootsmagic facilitate the entering of other administrative boundaries in separate fields useful for searching and reporting? This would appear to be another need users are trying to meet in their own Place notations.

 

I record Parish, County, Poor Law Union, Country for Irish parishes, for the specific reason of record location. The Parish is the most local record source, the County is the more modern likely record location and the Poor Law Union was an administrative body during Victorian Workhouse times. The string is still a specific place notation where the place is the Parish and can be geocoded.

 

And yes, I fully understand that things do change over time: new parishes have come into being, taking a chunk out of larger parishes, and things will continue to change. As County Check tries to address, this is a challenge we will continue to face and try to understand in our genealogy; the inclusion of Place and Place Details in reports should help explain things in our reporting.




#14 KFN

KFN

    Advanced Member

  • Members
  • PipPipPip
  • 212 posts

Posted 14 March 2016 - 07:58 AM

The GEDCOM solution to this is the FORMat subtag. It is not the best solution, and is discouraged, but has some value in this design. Obviously a more robust and normalized GEDCOM record type for place is required for data transfer. For RM and other programs, a user-defined and integrated place record type is required, which is then used to build an export. Import would be harder, since most people don't do a good job of normalizing their structure in every PLACe field they enter.
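
For anyone who has not met it, FORM sits one level under PLAC and names the comma-separated slots of the place value. A rough Python sketch of how a reader could use it follows; the helper names and the sample lines are made up for illustration.

```python
def gedcom_value(line: str) -> str:
    """Return the value part of a 'LEVEL TAG value' GEDCOM line."""
    _level, _tag, value = line.split(" ", 2)
    return value


def label_place(plac_line: str, form_line: str) -> dict:
    """Pair each FORM label with the matching slot of the PLAC value.

    zip() stops at the shorter list, so slots without a label (or labels
    without a slot) are silently dropped in this sketch.
    """
    labels = [s.strip() for s in gedcom_value(form_line).split(",")]
    values = [s.strip() for s in gedcom_value(plac_line).split(",")]
    return dict(zip(labels, values))


plac = "2 PLAC Oliver Springs, Anderson, Tennessee, United States"
form = "3 FORM City, County, State, Country"
print(label_place(plac, form))
```

This only works when every place in the file really follows the declared form, which is exactly the normalization problem mentioned above.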

#15 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 14 March 2016 - 08:10 AM

I wonder if the Place and Place Detail architecture is simply the wrong solution. 

 

What really needs to happen is so radical that it will never happen. But let me describe some ideas anyway. These ideas will posit that the horrible, horrible, horrible place name standard will not be fixed, so that both users and vendors of genealogy software will be on their own in coming up with a rational way to deal with place names.

 

The single most important thing that should happen is that the storage of place names should be separated from the reporting of place names, the data entry of place names, the printing of place names, the import and export of place names, and the data interchange of place names. For example, a place name might be stored as country = "United States of America", state = "Tennessee", county = "Anderson", city = "Oliver Springs". The political subdivisions such as country and state should not be pre-defined. They should be able to be anything at all that the user and the software need them to be, and in any language. Multiple languages and translation between them could be handled, but I will not get into the minutiae of how that could be done. The important thing is that in the storage of the place names, the names of all political subdivisions should be explicitly stated with no exceptions and no defaults, and with no limits or pre-definitions of what those political subdivisions should be. Also, the storage should establish no hierarchies, such that a state is a subdivision of a country. There are too many cases that are not hierarchical. For example, a city is not a subdivision of a county when the city spans multiple counties or when the city is independent of any county.

 

Most of the time, the place name keywords such as city and county would represent political subdivisions, but there would be no restrictions on what the keywords represent. For example, there would be no Place Details field for things like cemetery names. Rather, cemetery names would be handled with place name keywords such as cemetery = "Shady Grove", etc.

 

Having separated the storage of place names from other uses of place names in this manner, the mapping of place names for other purposes should be handled by templates that have defaults and that the user can control. For example, I might have a template for printing a birth place named something like [BirthPlaceShort] that looks something like <[BirthCity]><, [BirthCounty] County><, [BirthState]> that would yield Oliver Springs, Anderson County, Tennessee. Somebody else could have a template for printing birth places named something like [BirthPlaceStandard] that looks something like <[BirthCity]><, [BirthCounty]><, [BirthState]><, [BirthCountry]> that would yield Oliver Springs, Anderson, Tennessee, United States of America. Having set things up in this manner, facts/events could have templates that looked something like [Person] was born <[Date]><[BirthPlaceShort]> or [Person] was born <[Date]><[BirthPlaceStandard]>. In other words, the templates would actually be nested. A sentence template for a fact/event would reference a place template rather than referencing a place.
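
A toy Python renderer shows that nested templates of this kind are not hard to build. Everything here is an assumption for the sake of the sketch: the rule that a <...> group is dropped when a field inside it is empty, the unprefixed field names (City rather than BirthCity), the slightly reworded sentence template, and the sample person.

```python
import re

FIELD = re.compile(r"\[(\w+)\]")   # [Name]
GROUP = re.compile(r"<([^<>]*)>")  # <optional text with [Name] inside>


def render(template: str, fields: dict) -> str:
    """Fill [Name] fields; drop any <...> group whose field is missing."""
    def fill_group(match):
        body = match.group(1)
        if any(not fields.get(name) for name in FIELD.findall(body)):
            return ""
        return FIELD.sub(lambda m: fields[m.group(1)], body)

    text = GROUP.sub(fill_group, template)
    return FIELD.sub(lambda m: fields.get(m.group(1), ""), text)


def render_sentence(sentence: str, fields: dict, place: dict,
                    place_templates: dict) -> str:
    """Render the place templates first, then the sentence that names them."""
    rendered = {name: render(t, place) for name, t in place_templates.items()}
    return render(sentence, {**fields, **rendered})


place = {"City": "Oliver Springs", "County": "Anderson",
         "State": "Tennessee", "Country": "United States of America"}
place_templates = {
    "PlaceShort": "<[City]><, [County] County><, [State]>",
    "PlaceStandard": "<[City]><, [County]><, [State]><, [Country]>",
}

print(render_sentence("[Person] was born< in [PlaceShort]>.",
                      {"Person": "John Doe"}, place, place_templates))
print(render_sentence("[Person] was born< in [PlaceStandard]>.",
                      {"Person": "John Doe"}, place, place_templates))
```

The stored place never changes; only the choice of place template decides whether "County" and the country appear in the output.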

 

We wouldn't stop with reports. We would also have place templates for GEDCOM export, place templates for interfacing with the FSFT API, etc. for any other external interfaces.

 

Place indexes in reports would be driven simply by the way the names look after all the levels of templates have been applied and the place name has been rendered.

 

For keyboard input of place names, the user interface would have to support the user specifying both a place name keyword and the value of that keyword. There could obviously be predefined defaults, but the user would not be limited to using only the defaults.

 

The trickiest part of this for me would be for things like GEDCOM import of place names and input of place names from API's such as the FSFT API. Some very flexible and very sophisticated processing would have to take place to map the place names that were input to the internal place name structure. Maybe someday the scheme I've outlined could be used for data interchange as well as for storage, export, and reporting. Using this structure for data interchange would solve the problem.

 

I'm by no means hung up on the paradigm of place_keyword = "value" for data storage. The data storage scheme could just as well be JSON format or XML format or something like that. For storage in a relational database, I would prefer a place keyword table with two columns - one for the keyword and one for the value - to a true keyword format or a JSON format or an XML format. That's because software shouldn't have to parse these keyword/value pairs over and over again. But whatever the exact scheme, the data would have to be stored as pairs of the form (place_keyword,value) where place_keyword was a political subdivision or else something like cemetery, hospital, etc.
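
The keyword table can be tried out with Python's built-in sqlite3 module. The table and column names below are invented, and a place_id column is added on the assumption that the pairs need something to tie them to one place; the cemetery row is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE place_part (place_id INTEGER, keyword TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO place_part VALUES (?, ?, ?)",
    [
        (1, "country", "United States of America"),
        (1, "state", "Tennessee"),
        (1, "county", "Anderson"),
        (1, "city", "Oliver Springs"),
        (1, "cemetery", "Shady Grove"),  # no separate Place Details field
    ],
)


def place_fields(conn: sqlite3.Connection, place_id: int) -> dict:
    """Return every keyword/value pair stored for one place."""
    rows = conn.execute(
        "SELECT keyword, value FROM place_part WHERE place_id = ?",
        (place_id,),
    )
    return dict(rows)


print(place_fields(conn, 1))
```

Nothing has to be parsed at read time: a query returns the pairs ready for whatever place template is applied next, and a new keyword such as hospital needs no schema change.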

 

As I said at the beginning, what I'm proposing is a pipe dream that will never happen because it's so radical. But I do think that what I'm suggesting or something very similar to it is what is needed. Place names are never going to work very well with what we now have to work with.

 

Jerry



#16 Robert Fletcher

Robert Fletcher

    Member

  • Members
  • PipPip
  • 14 posts

Posted 14 March 2016 - 08:35 AM

Thank you for a very stimulating and thought-provoking discussion. I have in the process made a big mistake. I was thinking that the Place List was a list of places used in the database and that one place could be used multiple times. So in my case I had 17 places for Nottingham, Nottinghamshire, England, so I merged the lot. The problem is that most of these records had a street address or cemetery in the 'Place Details', and all these have been merged into one = a mess. I have come to the conclusion that I can only merge places that are identical in every way, and go through them one at a time. It just means I have to reload the GEDCOM and start over. Currently I am not doing any active research.

 

Robert….



#17 Robert Fletcher

Robert Fletcher

    Member

  • Members
  • PipPip
  • 14 posts

Posted 14 March 2016 - 08:46 AM

I have just realised that if I enter a new person and, say, record the birth, I can choose the birth place from the list but I cannot add “Place Details”. This is not like FTM. If this is correct and I have not got confused again, it would be better to use non-standardised place names and leave the “Place Details” blank. Is this what Jerry was getting at?



#18 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6254 posts

Posted 14 March 2016 - 08:53 AM

I have just realised that if I enter a new person and, say, record the birth, I can choose the birth place from the list but I cannot add “Place Details”. This is not like FTM. If this is correct and I have not got confused again, it would be better to use non-standardised place names and leave the “Place Details” blank. Is this what Jerry was getting at?

You can, but you need to Save the fact, having chosen the Place, before proceeding to select or add the Place Detail. Just a little procedural gotcha... and not what Jerry was getting at. He described the problem of preserving Place Details when exporting to other systems, which has led him to store the value that would logically go in Place Detail with the higher levels in Place, because the full Place value is safely transported through GEDCOM to all (or almost all) other systems.




#19 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6254 posts

Posted 14 March 2016 - 08:58 AM

As I said at the beginning, what I'm proposing is a pipe dream that will never happen because it's so radical. But I do think that what I'm suggesting or something very similar to it is what is needed. Place names are never going to work very well with what we now have to work with.

 

TMG probably comes closest to what you describe but is still a long way off. IIRC, it separated each level of a place into a separate field in the place record and provided built-in and user-defined input labelling and output templates. Thus one could have a default template but optionally select an alternate template for any given place, suited to its hierarchy.




#20 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3570 posts

Posted 14 March 2016 - 08:59 AM

Thank you for a very stimulating and thought-provoking discussion. I have in the process made a big mistake. I was thinking that the Place List was a list of places used in the database and that one place could be used multiple times. So in my case I had 17 places for Nottingham, Nottinghamshire, England, so I merged the lot. The problem is that most of these records had a street address or cemetery in the 'Place Details', and all these have been merged into one = a mess. I have come to the conclusion that I can only merge places that are identical in every way, and go through them one at a time. It just means I have to reload the GEDCOM and start over. Currently I am not doing any active research.

 

Robert….

 

I'm not quite sure I understand the problem. If you merge places which have place details, the place details lists will be merged together into a single place details list. Things should be just fine. The only possible glitch is that the combined place details list might have duplicates that you may also wish to merge, but that's not really a very big deal.
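
As a rough Python picture of what the merge does to place details: the lists are simply combined, and near-duplicates can then be spotted and merged by hand. The sample details and the case-and-spacing comparison are assumptions of this sketch, not how RM compares them.

```python
def merge_detail_lists(detail_lists) -> list:
    """Combine the place details lists of the merged places into one list."""
    merged = []
    for details in detail_lists:
        merged.extend(details)
    return merged


def possible_duplicates(details) -> list:
    """Group details that differ only in letter case or spacing."""
    groups = {}
    for detail in details:
        key = " ".join(detail.lower().split())
        groups.setdefault(key, []).append(detail)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical details attached to two of the Nottingham entries.
combined = merge_detail_lists([
    ["St Mary's Church", "12 High Street"],
    ["St Mary's church", "General Cemetery"],
])
print(combined)
print(possible_duplicates(combined))
```

Each event keeps its own detail through the merge; the only cleanup left is deciding which of the look-alike details to merge together.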

 

Jerry