Posted 13 June 2017 - 06:49 PM
Your Book and Page understanding of sourcing in GEDCOM is somewhat accurate, in that it is very much directed toward “old school” hard copy objects such as audio and files, books, fiche and film, magazines, manuscripts, maps, newspapers, photos, tombstones.
GEDCOM states that a PAGE (as part of a “source_citation”) is defined as: “Specific location with in the information referenced.” Most people and software programs assume this refers only to books, since a location within a book is best described with a page number, while other media require additional information beyond a page to describe a unique location.
However, the field/tag title of “PAGE” does not fully describe what GEDCOM actually expects to find in the PAGE tag. GEDCOM goes on to state:
“For a published work, this could include the volume of a multi-volume work and the page number(s). For a periodical, it could include volume, issue, and page numbers. For a newspaper, it could include a column number and page number. For an unpublished source or microfilmed works, this could be a film or sheet number, page number, frame number, etc. A census record might have an enumerating district, page number, line number, dwelling number, and family number. The data in this field should be in the form of a label and value pair, such as Label1: value, Label2: value, with each pair being separated by a comma. For example, Film: 1234567, Frame: 344, Line: 28.”
This statement therefore incorporates a whole lot of data points that many programs fail to include (or create their own tags for) in a well-formed GEDCOM file.
However, this is not to say that I endorse this construct in my discussions on the topic for a successor version to v5.5.1.
As it pertains to web-based data, I recommend two different approaches to sourcing these data points, with the following caveat.
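As a rough illustration of the label and value format quoted above, here is a minimal Python sketch that splits a PAGE value into pairs and joins them back. The function names parse_page and format_page are my own and not part of GEDCOM or any library, and the sketch assumes that values never contain a comma, which the standard does not guarantee.

```python
def parse_page(page_value):
    """Split 'Label1: value, Label2: value' into a list of (label, value) pairs.

    Assumption: pairs are separated by commas and values contain no commas.
    A chunk without a colon is kept with a label of None.
    """
    pairs = []
    for chunk in page_value.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the first colon only, so a value may itself contain colons.
        label, sep, value = chunk.partition(":")
        if sep:
            pairs.append((label.strip(), value.strip()))
        else:
            pairs.append((None, chunk))
    return pairs


def format_page(pairs):
    """Join (label, value) pairs back into the 'Label: value, ...' form."""
    return ", ".join(
        value if label is None else f"{label}: {value}"
        for label, value in pairs
    )


example = "Film: 1234567, Frame: 344, Line: 28"
pairs = parse_page(example)
```

A program that reads PAGE this way can still fall back to showing the raw text when a file uses a plain page number with no labels.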
“Is the website REALLY a source or just a repository for the actual information?” I look at sites like Ancestry.com as repositories and treat them as such in my sourcing, footnotes/endnotes.
A source is the originator or creator of the data. They are generally responsible for the accuracy and collection of the original data. Ancestry did not go out and capture the data for a census or immigrant list. They are acting like a library that has collected books on a subject, not the author of that book. The only cases where websites in general are “Sources” are those where they take other individuals’ work and author their own manuscript. These manuscripts generally contain the website’s conclusions and should source where they collected the data that led to their conclusions.
The two places I recommend placing website information are: (1) In the PAGE tag/field, used for specific pieces of information that corroborate your fact’s conclusion. (2) In the Source_Repository_Citation, directing readers to look at a website for additional data.
Since “PAGE” is already noted in the GEDCOM standard as more than just a single page number, and can direct you to a fully qualified location using sheet numbers, frame numbers, volume, issue, etc., a full URI (aka URL) to the source on the web you are citing would be appropriate. In cases where the site is not the actual source but the “Repository” of the data, placing the site URI in the Source_Repository_Citation structure is the appropriate location for this address. The Source_Repository_Citation structure contains a place to put notes about the data location, a CALN tag for the actual Source Call Number at the Repository, and a media type.
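To make the two placements concrete, here is a small Python sketch that writes out the GEDCOM lines for each one. Everything specific in it is assumed for illustration: the @S1@ and @R1@ identifiers, the example.org addresses, the call number, the “URL” label inside PAGE (GEDCOM does not define label names), and the choice to carry the site address in the NOTE under REPO. The helper names are mine, not from any GEDCOM library.

```python
def citation_lines(level, source_xref, page_pairs):
    """Lines for a source citation whose PAGE holds label/value pairs."""
    page = ", ".join(f"{label}: {value}" for label, value in page_pairs)
    return [
        f"{level} SOUR {source_xref}",
        f"{level + 1} PAGE {page}",
    ]


def repository_citation_lines(level, repo_xref, note, call_number, media_type):
    """Lines for a Source_Repository_Citation: REPO with NOTE, CALN and MEDI."""
    return [
        f"{level} REPO {repo_xref}",
        f"{level + 1} NOTE {note}",
        f"{level + 1} CALN {call_number}",
        f"{level + 2} MEDI {media_type}",
    ]


# (1) The website page itself backs up the fact: full URI goes in PAGE.
cite = citation_lines(2, "@S1@", [("URL", "https://example.org/record/123")])

# (2) The website is only the repository: its address goes with the REPO link.
repo = repository_citation_lines(
    1, "@R1@",
    note="Images viewed online at https://example.org/",  # hypothetical address
    call_number="Collection 7602",                        # hypothetical call number
    media_type="electronic",
)

print("\n".join(cite + repo))
```

In a real file the citation lines sit under the fact they support, while the REPO lines sit inside the source record; they are printed together here only to show the shape of each.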