
Character set compatibility RM7 & TNG

GEDCOM Character Set

8 replies to this topic

#1 Tony Harris


Posted 24 October 2018 - 06:18 AM

I have been exporting a GEDCOM file from my version of RM7 to upload into a TNG website, and some of the characters are displayed incorrectly on the website. I think this may be due to the character set used in the "research notes" section associated with the "sources" for an individual's "Facts". I believe the character set used in Notes was "Tahoma" with the "Basic Latin" character set, which may have been the default option when I first bought RM7. Is this set compatible with English - UTF-8, which is what the website was expecting, or would an alternative character set be more appropriate?

 



#2 zhangrau


Posted 24 October 2018 - 09:07 AM

Have you tried experimenting with different fonts in RM at Tools > Program Options > Display? Can you change the font in TNG?

 

My first suggestion would be setting RM and TNG to use the same font.



#3 TomH


Posted 24 October 2018 - 11:30 AM

I wouldn't expect any font setting in RootsMagic would have any bearing on TNG. The transfer is via GEDCOM. RM claims to export using UTF-8. Maybe RM or TNG does not fully adhere to UTF-8.
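
One way to test whether an exported file really is valid UTF-8 is to try decoding its raw bytes. This is only a rough sketch, not anything RM or TNG does internally; the function name `find_invalid_utf8` and the sample names are made up for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid UTF-8
    sequence, or None if the whole input decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the bytes the decoder rejected.
        return err.start, data[err.start:err.end]
    return None


# Hypothetical sample lines: the same text stored in two encodings.
good = "1 NAME José /Müller/\n".encode("utf-8")
bad = "1 NAME José /Müller/\n".encode("cp1252")

print(find_invalid_utf8(good))
print(find_invalid_utf8(bad))
```

For a real export you would pass in the bytes of the file, for example `open("export.ged", "rb").read()`, where `export.ged` is a placeholder name.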

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix app, a bundle of RootsMagic utilities.


#4 robertjacobs0


Posted 24 October 2018 - 02:31 PM

A GEDCOM can be edited to change the character encoding. RM7 seems to export in UTF-8. If TNG wants something else, open the GEDCOM in Notepad and click File > Save As. At the bottom of the save dialog you'll see a drop-down which offers UTF-8, ANSI, Unicode and Unicode big endian. Choose your poison and save, either with the existing file name or a new one. Your data will be unaffected.

 

If I were guessing, I'd try Unicode first.
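
The same re-save can be sketched outside Notepad. This assumes the input is UTF-8 and that "ANSI" means Windows-1252, which is the usual case on Western European Windows; Notepad's "Unicode" option corresponds to UTF-16 little endian. The function name `reencode` is hypothetical.

```python
def reencode(data: bytes, source: str = "utf-8", target: str = "cp1252") -> bytes:
    """Decode bytes from the source encoding and encode them in the target.

    Characters the target encoding cannot represent become '?', so the
    conversion can lose information.
    """
    text = data.decode(source)
    return text.encode(target, errors="replace")


print(reencode("é".encode("utf-8")))  # one byte in Windows-1252
print(reencode("Ł".encode("utf-8")))  # not in Windows-1252, so replaced
```

Note that the second call shows a conversion to a narrower character set is not always harmless: anything outside the target set is replaced.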



#5 TomH


Posted 25 October 2018 - 01:48 PM

I'm not sure that the Notepad save option is going to do anything useful. Does it even recognise that the GEDCOM is UTF-8? It is in plain text with a tag:

1 CHAR UTF-8

That is to tell the GEDCOM importer the character set but Notepad won't recognise the tag and it won't modify it whatever its Save setting is. 

 

It's possible to edit this tag to tell the importer to treat it differently. I have a GEDCOM from Heredis which says:

1 CHAR MACINTOSH
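
To see what a given GEDCOM declares, the header can be scanned for that tag. A minimal sketch, assuming the header itself is ASCII-compatible (it would not work on a UTF-16 file); `declared_charset` and the sample header are invented for illustration.

```python
import re


def declared_charset(gedcom_bytes: bytes):
    """Return the value of the header's '1 CHAR' line, or None if absent."""
    # latin-1 maps every byte to a character, so this decode never fails;
    # it is used only to scan the ASCII tags near the top of the file.
    head = gedcom_bytes[:4096].decode("latin-1")
    match = re.search(r"^\s*1\s+CHAR\s+(\S+)", head, flags=re.MULTILINE)
    return match.group(1) if match else None


# Made-up header, not copied from a real export.
sample = b"0 HEAD\n1 SOUR EXAMPLE\n1 CHAR UTF-8\n0 @I1@ INDI\n"
print(declared_charset(sample))
```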



#6 KFN


Posted 25 October 2018 - 08:50 PM

A UTF-8 file is actually saved differently from other files. If you save a file after editing it, you can select UTF-8 as the encoding. This is best illustrated when you “save as” a .txt file in Notepad: there is a drop-down just below the file name where you can select UTF-8. IIRC there is also a marker at the beginning of the file (the byte order mark, the three bytes EF BB BF) that denotes UTF-8 as well.
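
That leading marker is easy to check for. A small sketch using Python's standard `codecs.BOM_UTF8` constant; the helper names are made up.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 byte order mark, if any."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


with_bom = codecs.BOM_UTF8 + b"0 HEAD\n"
print(has_utf8_bom(with_bom))
print(strip_utf8_bom(with_bom))
```

The marker is optional in UTF-8, so its absence does not mean the file is in some other encoding.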

#7 keithcstone


Posted 26 October 2018 - 06:22 AM

Font and character set are two different things. While there are some characters that can only be stored in a Unicode encoding such as UTF-8, a font is how you DISPLAY a character, not how you store it.
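
To illustrate the storage side, here is a sketch that shows the bytes one character becomes under a few encodings. No font is involved at any point; `encoded_forms` is a made-up helper.

```python
def encoded_forms(ch: str, encodings=("utf-8", "cp1252", "utf-16-le")):
    """Map each encoding name to the hex bytes it stores for the character."""
    return {name: ch.encode(name).hex() for name in encodings}


for name, hex_bytes in encoded_forms("é").items():
    print(name, hex_bytes)
```

The same "é" is stored as different bytes in each encoding, and any font that has a glyph for it can display it.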



#8 robertjacobs0


Posted 26 October 2018 - 08:52 AM

 

I'm not sure that the Notepad save option is going to do anything useful. Does it even recognise that the GEDCOM is UTF-8? It is in plain text with a tag:

1 CHAR UTF-8

The Notepad trick worked with TMG -- I no longer recollect whether it was an import or export which was giving the problem. It's true that the Notepad save won't change the CHAR designation in the GEDCOM, but that can be done manually if necessary.

 

I'm pretty sure that Notepad will save in the designated character set, whatever the state of the input file.



#9 kbens0n


Posted 26 October 2018 - 09:39 AM

File encoding matters most when the GEDCOM file is first created, because that is where character encodings first occur. Any non-UTF-8 characters generated in the source application will not be displayed as originally intended when viewed later, and any subsequent edits to the file must always be saved as UTF-8 to avoid characters being changed by a switch to another encoding in which they may not translate. Typically, this affects the extended or special characters that differ between encodings, not the most common text from the average user (i.e. the ASCII alphabet, punctuation, etc.).
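
A quick sketch of that effect, assuming the mismatch is UTF-8 bytes being read as Windows-1252 (one common case, though not necessarily what happened in this thread); `misread` is a made-up name.

```python
def misread(text: str, written_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Encode text one way and decode it another, as a mismatched importer would."""
    return text.encode(written_as).decode(read_as, errors="replace")


print(misread("Smith"))  # plain ASCII survives unchanged
print(misread("José"))   # the accented letter turns into two wrong characters
```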

---
--- "GENEALOGY, n. An account of one's descent from an ancestor who did not particularly care to trace his own." - Ambrose Bierce
--- "The trouble ain't what people don't know, it's what they know that ain't so." - Josh Billings
---Ô¿Ô---
K e V i N