
Place names standard


70 replies to this topic

#61 SomebodySmart

SomebodySmart

    Advanced Member

  • Members
  • 112 posts

Posted 13 January 2017 - 07:49 AM

As far as place name matching goes, let me add that I have experience programming a computer to interpret the user's input, and although I have no idea what I am doing, technically, let me observe the 45 different ways to write North Saint Mary's:

 

Five different ways to spell North:

 

North

No.

No

N.

N

 

Times three different ways to spell Saint:

 

Saint

St.

St

 

Times three different ways to spell Mary's:

 

Mary’s

Mary's

Marys

 

Yes, the machine distinguishes between the curly and straight apostrophes. The straight ones should be used for now because they penetrate charset barriers better than curly ones.
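A minimal sketch of that count, assuming nothing beyond the three word lists above. The function names are made up for illustration. It builds every combination and shows what happens once the curly apostrophe is turned into a straight one.

```python
from itertools import product

NORTH = ["North", "No.", "No", "N.", "N"]
SAINT = ["Saint", "St.", "St"]
MARYS = ["Mary\u2019s", "Mary's", "Marys"]  # \u2019 is the curly apostrophe


def all_variants():
    """Return every combination of the three word lists as one place name."""
    return [" ".join(parts) for parts in product(NORTH, SAINT, MARYS)]


def straighten_apostrophes(text):
    """Replace the curly apostrophe (U+2019) with the straight one."""
    return text.replace("\u2019", "'")


variants = all_variants()
print(len(variants))  # 5 * 3 * 3 combinations

# After straightening, the curly and straight spellings of Mary's collapse together.
print(len(set(straighten_apostrophes(v) for v in variants)))
```

Straightening the apostrophe first removes one of the three spellings of Mary's before any other matching is attempted.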

 

I was programming in QBASIC until about three years ago, when I began to worry about what I'd do once there were no computers available that would run that language. There I could use UCASE$, as in IF UCASE$(A$) = UCASE$(B$), so that matches did not have to be case-sensitive; now that I'm using Python 3.2 I use .upper() for the same purpose. I still have no idea what I'm doing.
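For what it's worth, here is a rough sketch of how the .upper() comparison might be extended into a single comparison key. The ABBREVIATIONS table and the place_key name are assumptions made up for this example, not anything from a real program, and the table is risky in general because "St" can also mean Street.

```python
# Hypothetical lookup table: upper-cased token -> canonical token.
ABBREVIATIONS = {
    "NO": "NORTH",
    "N": "NORTH",
    "ST": "SAINT",
    "MARYS": "MARY'S",
}


def place_key(name):
    """Build a comparison key from a place name.

    Steps: straighten curly apostrophes, upper-case the text, drop
    trailing periods from each word, then expand known abbreviations.
    """
    name = name.replace("\u2019", "'").upper()
    tokens = [token.rstrip(".") for token in name.split()]
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


# Two spellings match when their keys are equal.
print(place_key("No. St. Mary\u2019s") == place_key("north saint marys"))
```

Comparing keys rather than raw strings keeps the original spelling untouched for display.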



#62 Vyger

Vyger

    Advanced Member

  • Members
  • 3387 posts

Posted 13 January 2017 - 08:11 AM

So what is wrong with this Gedcom output that Rootsmagic already provides?

 

0 _PLAC Sault Sainte Marie, Chippewa County, Michigan, United States of America
1 NOTE According to SomebodySmart this straddles the Canadian border although C
2 CONC ountyCheck Explorer does not recognize the Canadian variation.
1 MAP
2 LATI N46.3000000
2 LONG W84.4833306
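
As a small aside, the LATI and LONG values above carry a hemisphere letter rather than a sign. A minimal sketch of turning them into signed decimal degrees follows. The function name is made up for illustration, and it assumes the value is always one letter followed by a decimal number.

```python
def gedcom_coord_to_float(value):
    """Convert a value such as 'N46.3000000' or 'W84.4833306' to a signed float.

    North and East are positive; South and West are negative.
    """
    hemisphere = value[0].upper()
    number = float(value[1:])
    if hemisphere in ("N", "E"):
        return number
    if hemisphere in ("S", "W"):
        return -number
    raise ValueError("unexpected hemisphere prefix: %r" % value)


latitude = gedcom_coord_to_float("N46.3000000")
longitude = gedcom_coord_to_float("W84.4833306")
print(latitude, longitude)
```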

 

In all your straddling examples the actual site must be in one jurisdiction or the other. I haven't researched your place example, but CountyCheck does not recognize the Canadian variation you provided. Edit: it does, but suggests the following.

 

 

unrecognized.png


“Your most unhappy customers are your greatest source of learning.” -Bill Gates

It's now time for discretion, trust, patience and support

 

User of Rootsmagic 7.5.9, Family Historian 6.2.7, Family Tree Maker 2014 & Legacy 7.5

 

Excel to Gedcom conversion - simple getting started tutorials here

 

Root


#63 KFN

KFN

    Advanced Member

  • Members
  • 207 posts

Posted 13 January 2017 - 09:23 AM

Vyger,

 

In theory your concept has merit.

 

I would take the concept a lot further, however, and make it more GEDCOM-like.

 

I would include:

0 @P001@ _PLAC

 

Since the name of the place is actually "Sault Sainte Marie" within Chippewa County, within Michigan, within United States of America, my proposal would not have "Places" include the entire long list of "within" entries but pointers to the next higher entity in the linked list. So the place name would be

 

0 @P001@ _PLAC

1 NAME Sault Sainte Marie

2 TYPE official

1 NAME Sault St. Marie

2 TYPE aka

1 PLACWITHIN @P002@  <= pointer to Chippewa County

1 PLACCONT @P003@ <= pointer to places within this place (place contains place)

1 MAP_POLY <= Points of a polygon to encompass the entire place with a from/to date of when it is valid

2 POINT 

3 LONG

3 LATI

2 POINT

3 LONG

3 LATI

2 DATE

1 MAP <= standard map point locator

2 LONG

2 LATI

1 TEXT

2 CONT
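
To illustrate the linked-list idea, here is a minimal sketch of how a program might rebuild the long place name by following the "within" pointers. The Place class, the full_name function and the @P004@/@P005@ records for Michigan and the United States are assumptions made up for this example; only @P001@ and @P002@ come from the proposal above.

```python
class Place:
    """One place record that points at its parent instead of repeating it."""

    def __init__(self, xref, names, within=None):
        self.xref = xref      # record identifier, e.g. "@P001@"
        self.names = names    # list of (name, type) pairs; the first is preferred
        self.within = within  # xref of the enclosing place, or None at the top


def full_name(xref, places):
    """Follow the 'within' pointers and join the preferred names, smallest first."""
    parts = []
    seen = set()
    while xref is not None:
        if xref in seen:
            raise ValueError("cycle in place hierarchy at " + xref)
        seen.add(xref)
        place = places[xref]
        parts.append(place.names[0][0])
        xref = place.within
    return ", ".join(parts)


places = {
    "@P001@": Place("@P001@", [("Sault Sainte Marie", "official"),
                               ("Sault St. Marie", "aka")], "@P002@"),
    "@P002@": Place("@P002@", [("Chippewa County", "official")], "@P004@"),
    "@P004@": Place("@P004@", [("Michigan", "official")], "@P005@"),  # hypothetical id
    "@P005@": Place("@P005@", [("United States of America", "official")]),  # hypothetical id
}

print(full_name("@P001@", places))
```

The county, state and country are stored once each, so a correction to any of them shows up in every place that points to it.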



#64 Vyger

Vyger

    Advanced Member

  • Members
  • 3387 posts

Posted 13 January 2017 - 10:14 AM

I believe this discussion has some elements of confusion and I do hope there is some focus on Place designation and mapping in RM8. One thing for sure is we have no influence over what other software providers do, import or provide within their Gedcom variations but Rootsmagic does have an opportunity to stand above the competition here.

 

The arguments over how you record a place and how I do, and all the variations of whether to enter County, Twps, Country and the numerous styles within those variations, are overcome by the Standardized Place Name, in my opinion. If you share your research with Ancestry, Family Search or myself, that is the information we should be compatible on, so there is NO duplication of Place information.

 

That covers the personal preferences of reporting, but from a researching point of view what happened when and where is an important indicator, and I believe the coordinates should be used in this respect; this is simple mathematics. It's like the old "where were you on the night of ...?": if, in the same year or year span, another individual had facts recorded within a short distance, then there is some possibility of that warranting further research. I would like to see Duplicate Search Merge using a mathematical system rather than comparing freehand text fields as it does now. I would also like to see reporting expanded to allow search variables as now, but also to allow for the input of a specified proximity to certain coordinates; I believe this would be best facilitated through an embedded map UI.
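
The "simple mathematics" could look something like the sketch below, which uses the haversine great-circle formula. This is only one way to do it and is not how RootsMagic works internally; the function names and the 6371 km mean Earth radius are assumptions for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def within_proximity(point_a, point_b, max_km):
    """True if two (latitude, longitude) pairs lie within max_km of each other."""
    return haversine_km(point_a[0], point_a[1], point_b[0], point_b[1]) <= max_km


# The geocoded point from the GEDCOM example earlier in the thread,
# and a second point 0.01 degrees further north chosen only for illustration.
sault = (46.3, -84.4833306)
nearby = (46.31, -84.4833306)
print(within_proximity(sault, nearby, 2.0))
```

A duplicate search could then compare two facts by date span and distance instead of by the text of the place field.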

 

It would never be possible to force users to enter a certain format; RM could program in nag pop-ups to encourage users, but that would annoy people and still not make them enter a particular format. I believe one of the main problems here is ignorance of the possible ramifications outside of their own locality, country and database. I have been sharing data and working on all continents for 25 years and I research each place as far as I can. One important thing is that I want to geocode the Place, for the reasons I stated before but also for the benefits that will bring me in the future as genealogy programs develop.

 

What I was trying to illustrate to SomebodySmart was that these instances of places straddling counties and countries could be handled in Notes. The question then is whether Rootsmagic has enough reporting options to output this information again to satisfy everyone's needs; currently it does not. I use Place Details and generally explain and enhance them further through Notes and Media. Currently Rootsmagic does not give me the option to report that information out again, but I believe it should, and I am not going to adopt another data entry method to overcome this current shortcoming. I would prefer to create a quality database and wish for those enhancements.




#65 TomH

TomH

    Advanced Member

  • Members
  • 6177 posts

Posted 13 January 2017 - 11:05 AM

We are probably not being read by anyone who can make a difference to what RootsMagic does and, even if we were, this discussion is all over the map!  :P

 

Geo-coordinates are wonderful for pinpointed locations but they convey a precision that is inappropriate for all others. Representing a town, county,... by a single point is misleading. So, yes, a closed polyline as KFN proposes would be an improvement, but at what expense? The more accurately a boundary is followed, the more points are required (with few exceptions, such as the 49th parallel that part of the US-Canada boundary follows). That has enormous implications for its successor to the Gazetteer, for storage and for computations. A compromise might be to restrict the complexity to a rectangle or ellipse; the simplest would be the addition of a radius parameter to the existing point. RM can already find points within a radius from a point; this enhancement would require that it find the points whose radiuses fall within that of the reference.
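
A rough sketch of the radius idea, assuming the distance between the two centre points has already been worked out in the same units as the radiuses. The function names are made up for illustration and this is not a description of anything RM does today.

```python
def circle_within(center_distance, radius, ref_radius):
    """True if a place's circle lies entirely inside the reference circle.

    center_distance is the distance between the two centre points.
    """
    return center_distance + radius <= ref_radius


def circles_overlap(center_distance, radius, ref_radius):
    """True if the two circles share at least one point."""
    return center_distance <= radius + ref_radius


# A place with a 1 km radius whose centre is 3 km from a 5 km reference circle.
print(circle_within(3.0, 1.0, 5.0))
print(circles_overlap(5.5, 1.0, 5.0))
```

Whether "fall within" should mean fully contained or merely overlapping is a design choice; the second function shows the looser reading.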

 

And RM needs to fully support what it already has by

exporting all the place fields to GEDCOM,

enhancing its sentence template language to support the Standardized name,

enhancing all its reports and place indexes to support multiple name types, notes and media

and...


Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#66 Vyger

Vyger

    Advanced Member

  • Members
  • 3387 posts

Posted 13 January 2017 - 12:11 PM

RM can already find points within a radius from a point; this enhancement would require that it find the points whose radiuses fall within that of the reference.

 

This does not take into account Place Details, which may be geocoded in a suburb of a City (neighbourhood) like Thorntons Ferry, and when I do search for proximity of 1 mile I seem to get results well beyond 1 mile from the main geocoded Place.

 

And RM needs to fully support what it already has by

exporting all the place fields to GEDCOM,

enhancing its sentence template language to support the Standardized name,

enhancing all its reports and place indexes to support multiple name types, notes and media

and...

 

Amen to that...




#67 SomebodySmart

SomebodySmart

    Advanced Member

  • Members
  • 112 posts

Posted 13 January 2017 - 01:40 PM

Google Maps shows Sault Sainte Marie to be in Algoma District

 

https://www.google.c...5!4d-84.7951524

 

http://www.discoverthesault.ca

http://www.saultstemarie.ca

 

Cool to see the longitude and latitude embedded in that URL.



#68 SomebodySmart

SomebodySmart

    Advanced Member

  • Members
  • 112 posts

Posted 13 January 2017 - 01:43 PM

Vyger,

 

In theory your concept has merit.

 

I would take the concept a lot further, however, and make it more GEDCOM-like.

 

I would include:

0 @P001@ _PLAC

 

Since the name of the place is actually "Sault Sainte Marie" within Chippewa County, within Michigan, within United States of America, my proposal would not have "Places" include the entire long list of "within" entries but pointers to the next higher entity in the linked list. So the place name would be

 

0 @P001@ _PLAC

1 NAME Sault Sainte Marie

2 TYPE official

1 NAME Sault St. Marie

2 TYPE aka

1 PLACWITHIN @P002@  <= pointer to Chippewa County

1 PLACCONT @P003@ <= pointer to places within this place (place contains place)

1 MAP_POLY <= Points of a polygon to encompass the entire place with a from/to date of when it is valid

2 POINT 

3 LONG

3 LATI

2 POINT

3 LONG

3 LATI

2 DATE

1 MAP <= standard map point locator

2 LONG

2 LATI

1 TEXT

2 CONT

 

For reference, the page at http://wiki-en.genea...et/Gedcom_5.5EL suggests the _LOC tag. If anybody starts one way, we all gotta go that way.



#69 SomebodySmart

SomebodySmart

    Advanced Member

  • Members
  • 112 posts

Posted 13 January 2017 - 03:47 PM

In all your straddling examples the actual site must be in one jurisdiction or the other,

 

 

The genealogist may have a source of information that does not specify which side of the line the event happened on. It's better to use both than to leave it blank.



#70 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3487 posts

Posted 13 January 2017 - 05:53 PM

 

The genealogist may have a source of information that does not specify which side of the line the event happened on. It's better to use both than to leave it blank.

 

I don't understand this suggestion. Suppose (real example) that I have a death in Aurora, Colorado and that I don't know the county. Aurora spans Arapahoe County, Adams County, and Douglas County. Until and unless I can identify the county, I have entered the place of death as simply Aurora, Colorado. What is the alternate suggestion? For example, I'm extremely unhappy with Aurora, , Colorado. So are you suggesting, Aurora, Arapahoe County or Adams County or Douglas County, Colorado?

 

To me, this is one of several reasons why the word "County" is so important in place names. Absent the word "County", you are stuck with counting commas to determine what is a city and what is a county. And counting commas suggests that with the place name Aurora, Colorado that Aurora is a county when in fact it isn't. Aurora is a city. To get the counting of commas to work right when the word "County" is omitted, you have to go with Aurora, , Colorado or Aurora, unknown, Colorado or Aurora, Arapahoe or Adams or Douglas, Colorado.

 

To me, the least bad solution is always to include the word "County" when it's a county and always to omit the word "County" when it is not. That way, the syntax tells you that with Aurora, Colorado that Aurora is a city/town/municipality rather than a county. You might argue instead that you "know" that Aurora, Colorado means a city rather than a county because you "know" that Aurora is a city and is not a county. But your reader might not know that, especially if your reader is not familiar with Colorado. But more importantly, this example is mixing syntax with semantics. The problem with mixing syntax with semantics is displayed more starkly when a city and a county have the same name, and the problem is especially stark when a city with a particular name is not even in the same named county. The exact same problem exists with townships, and for this reason including the word "township" or the abbreviation "TWP" is the least bad solution for entering townships.
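
To make the syntax point concrete, here is a minimal sketch that labels each part of a place name by the word it ends with rather than by its position between commas. The function name and the label strings are assumptions for this example.

```python
def classify_parts(place):
    """Label each comma-separated part by its trailing keyword, not its position."""
    labelled = []
    for part in (piece.strip() for piece in place.split(",")):
        words = part.upper().split()
        if not words:
            kind = "blank"
        elif words[-1] == "COUNTY":
            kind = "county"
        elif words[-1] in ("TOWNSHIP", "TWP", "TWP."):
            kind = "township"
        else:
            # A city, state or country: the syntax alone cannot say which.
            kind = "unlabelled"
        labelled.append((part, kind))
    return labelled


print(classify_parts("Aurora, Colorado"))
print(classify_parts("Aurora, Arapahoe County, Colorado"))
print(classify_parts("Aurora, , Colorado"))
```

With the keyword present, Aurora, Colorado never has a part mistaken for a county, and no empty placeholder between commas is needed.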

 

I have tried to stay out of this thread because I have written so much about the place name standard in the past. In physics, things are right or wrong or not even wrong. "Not even wrong" is way worse than wrong. With truth, there is true and false when it comes to things like math and Boolean logic, but in the real world there is true and false and pants on fire. The place name standard is so bad it's in the "not even wrong" or "pants on fire" category. I feel strongly that standards should be followed as much as possible, but this is a standard that is so bad that it really shouldn't be followed.

 

But I have considered the following question: what if the place name standard actually was a pretty good standard? Would I follow it under those circumstances? The answer is that it depends. It seems to me that even a really good standard for storing place names and exchanging place names would not be a very good way for displaying place names in reports and Web pages, etc. So if the pretty good place name standard required that I slavishly follow it for reports and Web sites and such, then I probably wouldn't follow it. What I think needs to happen instead is that standards for storing and exchanging place names on the one hand need to be separated from the reporting of place names on the other hand. RM actually contains a very primitive implementation of this idea, since it includes a Place Name, a Standard Place Name, and an Abbreviated Place Name. But RM's very primitive implementation of this idea is not well developed enough to be of much value to its users. Perhaps this is an area of RM that will be improved in RM8. One can only hope.

 

Jerry



#71 SomebodySmart

SomebodySmart

    Advanced Member

  • Members
  • 112 posts

Posted 14 January 2017 - 11:51 AM

If an event happened in a namable place such as Aurora, Colorado, then go with that. However, sometimes the overlapping area does not sit directly beneath the parent area. For example, 2 PLAC Saint Mary's Cemetery, Essex County, Massachusetts, United States of America is not a proper PLAC tag: some graves are in Peabody and some in Salem. Limiting the place to the two towns is much better than giving simply the county.

 

Consider that "2 PLAC" means: take your green crayon and colour in all the area that is within the following. If 2 PLAC starts another line before you encounter a line starting with 1, then continue colouring all the area that is within the following.

 

Now, the event happened somewhere in the green area.
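
A minimal sketch of that reading, assuming a program simply gathers every 2 PLAC line that follows a level-1 line until the next level-1 (or level-0) line. The function name and the sample lines are made up for illustration, and repeated PLAC lines under one event are this proposal's interpretation rather than something every GEDCOM reader will accept.

```python
def places_per_event(lines):
    """Group '2 PLAC' values under the level-1 line that precedes them.

    Returns a list of (level-1 text, [place, ...]) pairs, one per level-1 line.
    """
    events = []
    current = None
    for line in lines:
        level, _, rest = line.strip().partition(" ")
        if level == "1":
            current = (rest, [])
            events.append(current)
        elif level == "0":
            current = None  # a new record starts; no event is open
        elif level == "2" and rest.startswith("PLAC ") and current is not None:
            current[1].append(rest[len("PLAC "):])
    return events


# Hypothetical sample: one burial coloured in with two towns.
sample = [
    "0 @I1@ INDI",
    "1 BURI",
    "2 PLAC Peabody, Essex County, Massachusetts, United States of America",
    "2 PLAC Salem, Essex County, Massachusetts, United States of America",
    "1 DEAT",
    "2 PLAC Aurora, Colorado",
]
for event, places in places_per_event(sample):
    print(event, places)
```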

 

This allows for the 3 _LOC tag proposed in the GEDCOM 5.5EL proposal.

 

Moving on, I took the liberty of uploading the file http://gedcomindex.c...ixie_sorted.txt in which you can see all manner of junk found in 2 PLAC lines.

Some material will be captured as I improve the system. I discovered the cool trick in Python 3.2

 

place = place.replace(" Co. "," Co., ")

 

so that 

 

2 PLAC Nashua, Hillsborough Co. New Hampshire

 

will be treated as if there was a comma after Co.
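
A slightly more forgiving variant of the same trick, sketched with a regular expression so that tabs or runs of spaces after Co. are handled too. The function name is made up for illustration, and the sketch assumes that "Co." in a PLAC line always means County, which will be wrong wherever it means Company.

```python
import re

# "Co." as a whole word, not already followed by a comma, then whitespace.
_CO_WITHOUT_COMMA = re.compile(r"\bCo\.(?!,)\s+")


def add_comma_after_co(place):
    """Insert the missing comma after 'Co.'; leave correct lines unchanged."""
    return _CO_WITHOUT_COMMA.sub("Co., ", place)


print(add_comma_after_co("2 PLAC Nashua, Hillsborough Co. New Hampshire"))
```

Running it a second time on its own output changes nothing, so it is safe to apply to a whole file more than once.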