Does using free-format citations really improve GEDCOM exchange


6 replies to this topic

#1 History Hunter

History Hunter

    Advanced Member

  • Members
  • 79 posts

Posted 31 October 2020 - 04:10 PM

Does using free-format citations really improve GEDCOM exchange?

Having looked at a test RM7 GEDCOM, it seems there is very little one can enter that would improve porting of information between apps via GEDCOM. Even the Free-Form template just puts the whole first reference note into a TITL tag followed by multiple CONC lines. This still doesn't reflect things properly when ported. The second and subsequent notes don't seem to be ported properly either.

Can one do better? If so, how?



#2 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3978 posts

Posted 31 October 2020 - 06:23 PM

Even using the Free-Form template just assigns the whole 1st reference note into a TITL tag followed by multiple CONC fields. This still doesn't actually reflect things properly when ported. The 2nd and subsequent notes also don't seem to be ported properly.

 

This is a question out of curiosity because I think that someday I need to spend some serious time studying some of the fine points of GEDCOM that I have never really looked into. But exactly how should notes for sources appear in GEDCOM?

 

Jerry



#3 KFN

KFN

    Advanced Member

  • Members
  • 338 posts

Posted 31 October 2020 - 10:26 PM

The 5 main data points of a GEDCOM SOURce record are:

1) DATA
2) AUTH
3) PUBL
4) TITL
5) TEXT

There are several other tags that are part of the record (NOTE, REPO, REFN, ABBR), but they have little to do with the catalog information about the source. The following definitions of tags 2, 3, 4 and 5 above are straight from the GEDCOM documentation, and all of them allow CONC/CONT subtags so you can have multiple lines of information.

AUTH: The person, agency, or entity who created the record. For a published work, this could be the author, compiler, transcriber, abstractor, or editor. For an unpublished source, this may be an individual, a government agency, church organization, or private organization, etc.

TITL: The title of the work, record, or item and, when appropriate, the title of the larger work or series of which it is a part. For a published work, a book for example, might have a title plus the title of the series of which the book is a part. A magazine article would have a title plus the title of the magazine that published the article. For an unpublished work, such as: A letter might include the date, the sender, and the receiver. A transaction between a buyer and seller might have their names and the transaction date. A family Bible containing genealogical information might have past and present owners and a physical description of the book. A personal interview would cite the informant and interviewer.

PUBL: When and where the record was created. For published works, this includes information such as the city of publication, name of the publisher, and year of publication. For an unpublished work, it includes the date the record was created and the place where it was created. For example, the county and state of residence of a person making a declaration for a pension or the city and state of residence of the writer of a letter.

TEXT: A verbatim copy of any description contained within the source. This indicates notes or text that are actually contained in the source document, not the submitter's opinion about the source. This should be, from the evidence point of view, "what the original record keeper said" as opposed to the researcher's interpretation. The word TEXT, in this case, means from the text which appeared in the source record including labels.
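
To make the definitions above a little more concrete, here is a minimal Python sketch of how an exporter might write AUTH, TITL, PUBL and TEXT as separate tags, using CONT for line breaks and CONC for over-long values. The function names (`gedcom_field`, `source_record`), the 60-character width and the sample values are assumptions made up for illustration; none of this is taken from any particular genealogy program.

```python
def gedcom_field(level, tag, text, width=60):
    """Return GEDCOM lines for one value, using CONT/CONC for continuation.

    CONT starts a new line of the value; CONC continues the same line.
    Assumption: a plain fixed-width split. A real exporter should also avoid
    splitting next to a space, since some readers trim whitespace.
    """
    lines = []
    for p, para in enumerate(text.split("\n")):
        # Break each paragraph into width-sized chunks (at least one, even if empty).
        chunks = [para[i:i + width] for i in range(0, len(para), width)] or [""]
        for c, chunk in enumerate(chunks):
            if p == 0 and c == 0:
                lvl, t = level, tag          # first piece goes on the tag line itself
            elif c == 0:
                lvl, t = level + 1, "CONT"   # new paragraph: a line break in the value
            else:
                lvl, t = level + 1, "CONC"   # same paragraph, continued
            lines.append(f"{lvl} {t} {chunk}" if chunk else f"{lvl} {t}")
    return lines


def source_record(xref, auth="", titl="", publ="", text=""):
    """Build a level-0 SOUR record, keeping each catalog field in its own tag."""
    out = [f"0 {xref} SOUR"]
    for tag, value in (("AUTH", auth), ("TITL", titl), ("PUBL", publ), ("TEXT", text)):
        if value:
            out.extend(gedcom_field(1, tag, value))
    return out


# Hypothetical sample values, for illustration only.
for row in source_record(
    "@S1@",
    auth="Example Parish",
    titl="Example church book, 1845-1865",
    text="First line as written in the record\nSecond line as written",
):
    print(row)
```

The point of the sketch is only that each piece of catalog information gets its own tag, rather than everything being concatenated into TITL.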

The REPO and NOTE tags do have some significance if used. REPO represents a link to the repository of the source. This would be libraries, personal files, or online websites (Ancestry and others). In general, the NOTE tag is used for personal notes or researcher's notes about the source.

The DATA tag has several subtags of importance:

1) EVEN.DATE (The date, or date period, in which the events were recorded in the source.)
2) EVEN.PLAC (The name of the lowest jurisdiction for the events named in this source.)
3) AGNC (The person, agency, or entity who created the record. For a published work, this could be the author, compiler, transcriber, abstractor, or editor. For an unpublished source, this may be an individual, a government agency, church organization, or private organization, etc.)
4) NOTE (Notes about the data)
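
As a rough illustration of how these subtags nest, here is a short Python sketch that writes out a DATA block. The function name `data_block` and all of the sample values are hypothetical; the nesting (DATE and PLAC under EVEN, AGNC directly under DATA) follows my reading of the GEDCOM 5.5.1 source record layout.

```python
def data_block(events, date_period, place, agency, level=1):
    """Return the DATA substructure of a SOUR record as GEDCOM lines."""
    return [
        f"{level} DATA",
        f"{level + 1} EVEN {events}",        # event types recorded in the source
        f"{level + 2} DATE {date_period}",   # period the source covers
        f"{level + 2} PLAC {place}",         # jurisdiction the source covers
        f"{level + 1} AGNC {agency}",        # responsible agency
    ]


# Hypothetical values, for illustration only.
for row in data_block("BIRT, DEAT, MARR", "FROM 1845 TO 1865",
                      "Example Parish", "Example Church Office"):
    print(row)
```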

My interpretation of all of this to follow.



#4 KFN

KFN

    Advanced Member

  • Members
  • 338 posts

Posted 31 October 2020 - 10:46 PM

A lot of the tags above are geared directly toward books and official documents. When I pull material from websites like Ancestry, The Norwegian Archive, and other archives and libraries around the world, these always become REPOsitory records, because they are never the author, agency or other entity that created the records. They are just a library or holder of the information. Ancestry in particular tries to generate a citation that includes itself as if it were the creating authority for the source; it is not. Normally the authority is the church, the federal or local governmental agency, or the book author/data compiler/transcriber/editor.

 

I use the GEDCOM SOURce record as a general information holder for a yearly census (e.g. the 1801 Norwegian Census) or for church books that record births/deaths/weddings (e.g. 1845-1865 for Naustdal Parish), then let the source citation handle the exact page and text detail.
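
A minimal Python sketch of that split, assuming the GEDCOM 5.5.1 citation layout (PAGE and DATA.TEXT under a SOUR pointer): the general source record is referenced by its id, and the page and quoted text live on the citation attached to a fact. The function name `citation` and the sample values are hypothetical, and multi-line text is not handled here.

```python
def citation(source_xref, page, text="", level=2):
    """Return a source citation that points at a SOUR record."""
    out = [
        f"{level} SOUR {source_xref}",   # pointer to the general source record
        f"{level + 1} PAGE {page}",      # where within the source
    ]
    if text:
        out += [f"{level + 1} DATA", f"{level + 2} TEXT {text}"]  # quoted entry
    return out


# Hypothetical birth event citing source @S1@, for illustration only.
event = ["1 BIRT"] + citation("@S1@", "Page 12, entry 7", "Text as written in the book")
for row in event:
    print(row)
```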

 

The information in the DATA tag is a little obscure. It mostly covers how the data in the source was recorded, which may involve a different party than the agency responsible for the source. This will not come up very often, but take book authors who are compilers, transcribers, or editors of the actual data: they may have sources of their own, and if you don't actually look up the source they cited, you could place the information they provided here, along with the date of the book or the date you looked up the information on the web. The AGNC tag has the most value, as it will be the "Responsible Agency" for the data you are sourcing, which could be different from the author/publisher of the source. Think local parish vs. national church compilations.

 

It seems to me that, apart from TITL, the tags noted above are never used when building a GEDCOM, for any of the source records created.

 

AND on import of a GEDCOM that uses these tags, they are lost / merged into the TITL tag.  <= This is bad for me!!
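
For contrast, here is a minimal Python sketch of what I would hope an import does: read each level-1 tag of a SOUR record into its own field, folding CONC/CONT lines back into the value, instead of merging everything into TITL. The function name `parse_source` is hypothetical, and the sketch deliberately ignores deeper structures (such as the subtags of DATA) and repeated tags.

```python
def parse_source(lines):
    """Parse the lines of one SOUR record into a dict of level-1 tag -> value."""
    fields = {}
    current = None
    for raw in lines:
        parts = raw.split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == 1:
            current = tag
            fields[current] = value          # note: a repeated tag overwrites
        elif level == 2 and current and tag == "CONC":
            fields[current] += value         # same line continued, no separator
        elif level == 2 and current and tag == "CONT":
            fields[current] += "\n" + value  # new line within the value
        # level 0 and any other substructure lines are skipped in this sketch
    return fields


# Hypothetical record, for illustration only.
record = [
    "0 @S1@ SOUR",
    "1 AUTH Example Parish",
    "1 TITL Example church bo",
    "2 CONC ok, 1845-1865",
    "1 TEXT First line",
    "2 CONT Second line",
]
print(parse_source(record))
```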



#5 KFN

KFN

    Advanced Member

  • Members
  • 338 posts

Posted 31 October 2020 - 10:59 PM

Jerry,

 

NOTEs in a SOURce record are not special or different from any other note in any other record. I use them as they are defined in GEDCOM: "Comments or opinions from the submitter". I substitute "researcher" for "submitter" in this case.

 

By this I mean: if the source document has writing in the margin from an unknown writer, concludes something that is not obvious in the text, or is hard to read or understand (interviews and stories can be hard to follow), I will state this in a note. I also use a note when I need to follow up on information found in the source or go to another source cited in this source. NOTEs are generally directed to the researcher: myself, a colleague, or a family member.



#6 History Hunter

History Hunter

    Advanced Member

  • Members
  • 79 posts

Posted 01 November 2020 - 07:57 AM

KFN,

 

Really appreciate your providing the info on the purpose and content of the GEDCOM tags. It shows that the commonly available programs likely cannot convey the information they contain using the tags in the way they were designed to be used. In addition, their use of the ability to define new tags has aggravated the issue of "miscommunication". So it seems that no amount of kludging the input in the programs will really help make one's information more transportable.

What is really needed is an up-to-date GEDCOM format with more current content capability, but coded as an API so that the data exchange interface is defined and invariant. That way, developers could not claim to have GEDCOM capability unless their program could actually exchange information with another program without loss or corruption. I think this is what the new GEDCOM v7 has in mind. Unfortunately, I have no idea when it will be released. The real question will then become how much of a battle they will face from developers who are very happy with a loosely defined GEDCOM, which allows them to prevent users from migrating to a competitor.

I'm seriously considering keeping as much data outside the programs as possible and only entering the bare minimum needed to create the Pedigree Charts and Family Group Sheets, which I'll save externally. That way, when the inevitable happens, I can re-enter my data from a storage format that facilitates that process. Besides, I can put together a much more "readable" narrative report using Scrivener or a word processor than I can with the automated capability of most programs. I don't do full write-ups often, but I do them to actually be read. So having the graphical portion (Pedigree Charts and Family Group Sheets) done by the program would make it fairly simple to turn out a polished report.
 



#7 KFN

KFN

    Advanced Member

  • Members
  • 338 posts

Posted 01 November 2020 - 08:53 AM

I have a lot of opinions on the various schemes to move GEDCOM forward. Some of them are based on your correct thoughts (in my opinion) that genealogy programs are not GEDCOM compliant, and that they really don’t want you to use multiple platforms to store and report on your family history.

Personally, I think programs are not using GEDCOM to its fullest to begin with. Use it completely first; then let’s talk about upgrading the current GEDCOM Standard to something newer. I’ve participated in several different committees with the goal of enhancing GEDCOM, and at present they have all failed!
 

I too have moved away from any thought of running an automated book generation program. I beta tested a third-party one for FTM and, through no fault of the program, had my issues. The real problem was not the report/book generator but the data model used by FTM, which lost a lot of my GEDCOM-compliant data when I imported it from my primary database. To use the sourcing model I would have had to spend days adding information that was already in the GEDCOM!