
Master Source List mess

11 replies to this topic

#1 GlenB


Posted 01 March 2019 - 03:39 PM

I have been working with the Family Tree capability of adding people and citations for events to my database and now my master source list is a MESS!

 

It is not really a Family Tree interface problem. Whenever you import citations from other people, they are going to be a mess, since I don't think most people know how to use citations.

 

Here's one simple example:

Master Source: Leah Olsen in household of Stephen Olsen, "United States Census, 1910"

and the footnote, short footnote and bibliography ALL say the same thing: "United States Census, 1910," database with images, FamilySearch (https://familysearch...03/1:1:M5XS-BC7 : accessed 12 February 2019), Leah Olsen in household of Stephen Olsen, Castle Dale, Emery, Utah, United States; citing enumeration district (ED) ED 42, sheet 15A, family 251, NARA microfilm publication T624 (Washington D.C.: National Archives and Records Administration, 1982), roll 1603; FHL microfilm 1,375,616.

 

I can live with the 3 long entries as the details of the citation, but the Master Source seriously needs work! Maybe it is just me, but I would rather that such a Master Source said only: "Census, United States, 1910".

It is attached to Leah Olsen and likely also attached to Stephen Olsen and others in the family group. They all have the same source and it is not necessary or proper to put their names in the Master Source.

 

Doing it as they have done results in one Master Source record per person per event and kind of removes the ability to know what all your sources are. The source in this case is a Census and it is from the United States (not Canada or the UK or ...) and it is from 1910 (not any other year). That is a unique source and the specific citation of that source identifies it in terms of the person / people and the exact place within that source where the information may be found. Can you imagine how nice your Master Source List would be:

Census, UK and Wales, 1881

Census, UK and Wales, 1901

Census, United States, 1910

Census, United States, 1920

 

Maybe some of you don't agree with my sense of Sources and Citations. It is open enough in RM7 that you can do it almost any way you want.

 

Given that, what may be needed is some way to massage thousands of Master Source records to conform to the style that the user wants. Consistency has value - but my list is now a MESS!



#2 kbens0n


Posted 01 March 2019 - 05:15 PM

I have been working with the Family Tree capability of adding people and citations for events to my database and now my master source list is a MESS!


Ideally, this should all be done in a database separate from your master or working database, massaged, and then dragged and dropped across (after backing up both).

---
--- "GENEALOGY, n. An account of one's descent from an ancestor who did not particularly care to trace his own." - Ambrose Bierce
--- "The trouble ain't what people don't know, it's what they know that ain't so." - Josh Billings
---Ô¿Ô---
K e V i N


#3 GlenB


Posted 01 March 2019 - 06:00 PM

Sure kbens0n, always good advice. But the problem remains - lots of entries in my Master Source list that do not conform to what I think constitutes good use of the capability. How to clean up a large mess?



#4 TomH


Posted 01 March 2019 - 08:58 PM

I've not explored sources from FamilySearch very much but do recall that they are what we have called "extremely split". That means that the data held in the Master Source fields (yellow zone in the Citation Editor) vary from source to source for the same census year and the fields in the Source Details area (green zone) are empty. 
 
What you are desiring is called "lumping", i.e., one Master Source for the census year and the varying data is stored in the fields in the Source Details (green zone).
 
A transformation from split to lumped is technically conceivable, especially when the original split sources have been systematically (computer) generated. The opposite transformation has already been developed: see "Sources – Adventures in Extreme Splitting", which also suggests why you might not want to "lump". Search this Forum for posts containing words like "split" and "lump" and authored by Jerry Bryan for more extensive explanations.


Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix app, a bundle of RootsMagic utilities.


#5 Nettie


Posted 02 March 2019 - 09:54 AM

I would suggest that you read the past forum threads about all the issues with Census Templates. There are many different opinions. The 2015 Tips and Hints has a lot of information. Jerry and Tom H have worked hard to help, but others have their opinions also. I am a lumper, especially for census records, and I have templates for each census year. But I would read what has been done in the past; there are many suggestions.


Genealogy:
"I work on genealogy only on days that end in "Y"." [Grin!!!]
from www.GenealogyDaily.com.
"Documentation....The hardest part of genealogy"
"Genealogy is like Hide & Seek: They Hide & I Seek!"
" Genealogists: People helping people.....that's what it's all about!"
from http://www.rootsweb....nry/gentags.htm
Using FO and RM since FO2.0 


#6 GlenB


Posted 02 March 2019 - 11:50 AM

I will try to read through some lumping and splitting discussions and census template discussions, but it feels like you're all going deeper into the problem than I was aiming.

 

In the sample I provided above:

Master Source: Leah Olsen in household of Stephen Olsen, "United States Census, 1910"

and the footnote, short footnote and bibliography ALL say the same thing: "United States Census, 1910," database with images, FamilySearch (https://familysearch...03/1:1:M5XS-BC7 : accessed 12 February 2019), Leah Olsen in household of Stephen Olsen, Castle Dale, Emery, Utah, United States; citing enumeration district (ED) ED 42, sheet 15A, family 251, NARA microfilm publication T624 (Washington D.C.: National Archives and Records Administration, 1982), roll 1603; FHL microfilm 1,375,616.

I just want to revise the Master Source record and leave everything else alone. Change it from

Leah Olsen in household of Stephen Olsen, "United States Census, 1910"

to

Census, United States, 1910

 

As Tom notes where "the original split sources have been systematically (computer) generated" it should be possible to do this with a little parsing and rearranging.

 

It sounds like nothing already exists to do that, so I can go code up some SQL and see what I can manage. It may be, though, that the easiest way is not to actually parse but only to identify records with particular strings in them and replace the whole record, i.e., search for all records containing "United States Census, 1910" and replace that Master Source record with "Census, United States, 1910", and do that exhaustively for each of the 30-50 or so Census types I have in my database. Having done that, I will have hundreds of identical Master Source records, and I'll need to understand the database structure better in order to collapse a same-named group of them into one Master Source record. Hopefully the lumping discussion will help me do that.
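A minimal sketch of that search-and-replace idea, using Python's built-in sqlite3 module against an in-memory stand-in table rather than a real RootsMagic file. The table name `SourceTable` and the columns `SourceID` and `Name` are assumptions for illustration, as are the second household member and the `RENAMES` map; a real database may need extra care (collations, backups) that is not shown here.

```python
import sqlite3

# In-memory stand-in for the Master Source table.
# Table and column names are assumptions, not a confirmed RootsMagic schema.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SourceTable (SourceID INTEGER PRIMARY KEY, Name TEXT)")
con.executemany(
    "INSERT INTO SourceTable (Name) VALUES (?)",
    [
        ('Leah Olsen in household of Stephen Olsen, "United States Census, 1910"',),
        # Invented second household member, for illustration only.
        ('William Olsen in household of Stephen Olsen, "United States Census, 1910"',),
        ("Some unrelated source",),
    ],
)

# Hypothetical map: substring to look for -> replacement Master Source name.
RENAMES = {
    '"United States Census, 1910"': "Census, United States, 1910",
}


def rename_sources(con, renames):
    """Replace the whole Name of every source whose Name contains a key."""
    changed = 0
    for needle, new_name in renames.items():
        cur = con.execute(
            "UPDATE SourceTable SET Name = ? WHERE instr(Name, ?) > 0",
            (new_name, needle),
        )
        changed += cur.rowcount
    return changed


changed = rename_sources(con, RENAMES)
names = [row[0] for row in con.execute(
    "SELECT Name FROM SourceTable ORDER BY SourceID")]
print(changed, names)
```

One entry per census type in `RENAMES` would cover the exhaustive approach; the renamed records still remain separate Master Sources until they are merged.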

 

I'll drop the resulting script on you, Tom, for your consideration but this exhaustive approach probably won't be generally useful. I still have some work to do on the database before I get to that phase of my cleanup, so it will be a while. Thanks to Tom and Nettie for your thoughts on this.



#7 Jerry Bryan


Posted 02 March 2019 - 12:10 PM

Here is an experiment you could do - maybe in a copy of your database - to see if the approach you are looking at will work. I suspect it will, but the real answer is "it depends" - hence, the little test I'm proposing.

 

Take two of your troublesome Master Sources - maybe Leah Olsen in household of Stephen Olsen, "United States Census, 1910" and William Olsen in household of Stephen Olsen, "United States Census, 1910". I just invented the name "William Olsen" since I don't know for sure who else you have recorded in the census for the same household. Then rename both Master Sources to just Census, United States, 1910. There is nothing in RM that will prevent you from naming two different Master Sources with the same name. Then, see if you can merge the two Master Sources. You should be able to if the Master Source info is identical between the two.

 

And for that matter, you don't even have to rename both of the Master Sources. For example, rename the Leah Olsen one to the Master Source name you want and then try to merge it with the William Olsen one (or whatever real name you have from the same household). Doing it this way, be sure that as a part of the merge the name you keep is the Census, United States, 1910 name for the Master Source.

 

If this works, then you could proceed manually without any renaming after the first one to solve the problem: just a lot of merging. If this doesn't work, then you need to identify what is different in the Master Sources to prevent their merging.
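The same test can be sketched in code: merge only those Master Sources whose name and stored field data are identical, and leave the rest alone so the differences can be inspected. This is a rough illustration on in-memory stand-in tables; the table names, the columns `Name`, `Fields` and `SourceID`, and the sample rows are all assumptions, and a real database may link sources from other tables that this sketch ignores.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Stand-in tables; names and columns are assumptions for illustration.
con.executescript("""
CREATE TABLE SourceTable (SourceID INTEGER PRIMARY KEY, Name TEXT, Fields TEXT);
CREATE TABLE CitationTable (CitationID INTEGER PRIMARY KEY, SourceID INTEGER);
INSERT INTO SourceTable VALUES
  (1, 'Census, United States, 1910', 'same'),
  (2, 'Census, United States, 1910', 'same'),
  (3, 'Census, United States, 1910', 'different');
INSERT INTO CitationTable VALUES (10, 1), (11, 2), (12, 3);
""")


def merge_identical_sources(con):
    """Fold sources with equal Name and Fields into the lowest SourceID."""
    groups = con.execute(
        "SELECT Name, Fields, MIN(SourceID) FROM SourceTable "
        "GROUP BY Name, Fields HAVING COUNT(*) > 1"
    ).fetchall()
    merged = 0
    for name, fields, keep_id in groups:
        dup_ids = [row[0] for row in con.execute(
            "SELECT SourceID FROM SourceTable "
            "WHERE Name = ? AND Fields = ? AND SourceID <> ?",
            (name, fields, keep_id),
        )]
        for dup_id in dup_ids:
            # Repoint citations to the surviving source, then drop the duplicate.
            con.execute(
                "UPDATE CitationTable SET SourceID = ? WHERE SourceID = ?",
                (keep_id, dup_id),
            )
            con.execute("DELETE FROM SourceTable WHERE SourceID = ?", (dup_id,))
            merged += 1
    return merged


merged = merge_identical_sources(con)
remaining = [row[0] for row in con.execute(
    "SELECT SourceID FROM SourceTable ORDER BY SourceID")]
links = con.execute(
    "SELECT CitationID, SourceID FROM CitationTable ORDER BY CitationID"
).fetchall()
print(merged, remaining, links)
```

Source 3 survives because its field data differs, which is the case where you would need to find out what is preventing the merge.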

 

No matter what you find in this little test, the scope of the problem may be so large that you will find that you could benefit from a little assistance from SQLite. But I would start by playing around with it manually to be sure the nature of the problem is well understood.

 

Jerry



#8 GlenB


Posted 02 March 2019 - 01:06 PM

Great advice Jerry. That's kind of how I was hoping it would work. But doing it a few times by hand is a key test before coding.

 

It will be easy enough to use case-specific renames for the 30-50 censuses I have, and I can do that in SQL.

 

For merging Sources through SQL I'll piggyback on some existing scripts, I think, I hope -- needs some looking around.



#9 TomH


Posted 02 March 2019 - 01:47 PM

I've had a brief look at a couple of Census sources from FamilySearch Family Tree to get a better understanding of what you are talking about.

  • They are Free Form sources
  • They are extremely split: the Master Source field named Footnote contains the entire citation which is replicated (an RM default until overwritten) in Short Footnote and Bibliography fields. The Source Detail field named Page is empty.
  • The Master Source Name varies from person to person, a result of all the source data being in the Master Source.

If what you are wanting to do is to end up with one Master Source for all citations of the US Census for a given year, you have to move the variable data from Footnote to Page and remove it from Short Footnote and Bibliography. And rename the Master Sources to "US Census - yyyy" and then merge them. Here's an example:
 
Master Source: James E Normington in household of Bertie Normington, "United States Census, 1910"
Revised: United States Census, 1910
 
Footnote: "United States Census, 1910," database with images, FamilySearch (https://familysearch...03/1:1:M5ZY-183 : accessed 2 March 2019), James E Normington in household of Bertie Normington, Brooklyn Ward 26, Kings, New York, United States; citing enumeration district (ED) ED 767, sheet 12A, family , NARA microfilm publication T624 (Washington D.C.: National Archives and Records Administration, 1982), roll 977; FHL microfilm 1,374,990.
Revised: "United States Census, 1910," database with images, FamilySearch
 
Page:
Revised: (https://familysearch...03/1:1:M5ZY-183 : accessed 2 March 2019), James E Normington in household of Bertie Normington, Brooklyn Ward 26, Kings, New York, United States; citing enumeration district (ED) ED 767, sheet 12A, family , NARA microfilm publication T624 (Washington D.C.: National Archives and Records Administration, 1982), roll 977; FHL microfilm 1,374,990.
 
If you also get Census records from other sources and want to lump them in, then "database with images, FamilySearch" should be in Page, not Footnote.
 
The Bibliography can be changed simply to "United States Census, 1910".
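The split shown in the example above could be sketched as a small string function. It assumes every FamilySearch footnote follows the same machine-generated pattern, with the variable data starting at the parenthesis after "FamilySearch"; the function name `split_footnote` and the `MARKER` constant are hypothetical, and the sample footnote is trimmed for brevity.

```python
import re

# Assumed fixed text that ends the part kept in the Master Source Footnote.
MARKER = "FamilySearch"


def split_footnote(footnote):
    """Return (source_name, master_footnote, page), or None if no match."""
    cut = footnote.find(MARKER + " (")
    if cut == -1:
        return None  # not in the expected computer-generated form
    cut += len(MARKER)
    master_footnote = footnote[:cut]
    page = footnote[cut:].strip()
    # The quoted collection title at the start becomes the new source name.
    title = re.match(r'"(.+?),?"', master_footnote)
    source_name = title.group(1) if title else master_footnote
    return source_name, master_footnote, page


# Trimmed version of the footnote from the example above.
footnote = (
    '"United States Census, 1910," database with images, FamilySearch '
    "(https://familysearch...03/1:1:M5ZY-183 : accessed 2 March 2019), "
    "James E Normington in household of Bertie Normington, "
    "Brooklyn Ward 26, Kings, New York, United States"
)

source_name, master_footnote, page = split_footnote(footnote)
print(source_name)
print(master_footnote)
print(page)
```

Returning None for anything that does not match keeps hand-entered or differently formatted footnotes out of the automated pass.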
 
The Short Footnote field is problematic as it does not offer much to shorten, e.g., "US Census, 1910", but when output it still has the entire Page value appended. I tried with a custom Free Form template to have reduced content from the Source Details for the Short Footnote by splitting the Source Details into two fields, both being used in the Footnote, one only in the Short Footnote. But the programming involved in parsing that may be difficult. See Source Template, A Better Free Form.




#10 GlenB


Posted 02 March 2019 - 01:57 PM

Thanks for the additional research Tom. I likely would have stumbled over such things later when I tried to figure it out. Your understanding of how these tables are structured and used by RM is a big help.

 

As mentioned, all this discussion now is in anticipation of one phase of my database cleanup. None of that starts for another few months. As I've been watching the mess I'm importing I've been thinking about such cleanups and how to automate them. I have a few phases to do in cleaning up places and citations/sources and date formats and simple naming conventions ... all standardizing a mash-up of stuff from others to meet my personal standards!

 

Some of your scripts that I've run into already have been very helpful and they have supported my belief that once something is in a database you can manipulate the HECK out of it to make it look the way you want. Fortunately, SQL is not a new language for me so ....



#11 Jerry Bryan


Posted 02 March 2019 - 08:19 PM

As usual, Tom's analysis is correct, and you would have discovered fairly quickly, just by playing around, that the test I was proposing would not have been successful. Or maybe we should say that it would have been successful in showing that the approach I was suggesting wouldn't work.

 

As Tom indicated, the problem is that the Footnote field in the Free Form Source Template is a Master Source field rather than being a Source Detail field. No amount of "simple merging" will solve this problem. To lump your sources that have come in from ancestry, you will have to move some of the data from a Master Source field to a Source Detail field. This problem is aggravated from the point of view of an SQLite solution in that the data of interest in both the Master Source (RM's SourceTable) and the Source Detail (RM's CitationTable) is encoded as XML within an SQLite Text field - not so easy to decode and encode from within SQL itself.
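Outside SQL the XML part is more manageable. Here is a rough sketch using Python's standard xml.etree.ElementTree to move the variable tail of a Footnote value into a Page value. The XML layout below (Root, Fields, Field, Name, Value) is a hypothetical stand-in, so check what your own database actually stores before relying on it; the helper names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layouts for a Master Source and a Source Detail record.
MASTER_XML = (
    "<Root><Fields><Field><Name>Footnote</Name>"
    "<Value>database with images, FamilySearch "
    "(accessed 2 March 2019), roll 977</Value>"
    "</Field></Fields></Root>"
)
DETAIL_XML = (
    "<Root><Fields><Field><Name>Page</Name><Value></Value></Field>"
    "</Fields></Root>"
)


def get_field(root, name):
    """Return the Value element of the named field, or None if absent."""
    for field in root.iter("Field"):
        if field.findtext("Name") == name:
            return field.find("Value")
    return None


def move_variable_part(master_xml, detail_xml, split_at):
    """Move Footnote text from split_at onward into the detail's Page."""
    master = ET.fromstring(master_xml)
    detail = ET.fromstring(detail_xml)
    footnote = get_field(master, "Footnote")
    page = get_field(detail, "Page")
    if footnote is None or page is None:
        return master_xml, detail_xml  # unexpected layout, leave untouched
    text = footnote.text or ""
    cut = text.find(split_at)
    if cut == -1:
        return master_xml, detail_xml  # nothing to move
    footnote.text = text[:cut].rstrip()
    page.text = text[cut:]
    return (
        ET.tostring(master, encoding="unicode"),
        ET.tostring(detail, encoding="unicode"),
    )


new_master, new_detail = move_variable_part(MASTER_XML, DETAIL_XML, "(")
print(new_master)
print(new_detail)
```

A script would read each XML string from the database, run it through a function like this, and write both results back inside one transaction, on a copy of the database.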

 

By the way, I think that the same basic incompatibility exists between RM sources and sources from many other online genealogy software sites. Several sites to which I subscribe (e.g., FamilySearch, genealogybank.com, fold3.com, myheritage.com, and several others) will give you what RM would call a Footnote Sentence that you can copy from the site and paste into RM. But paste where into RM? I guess the best you could do would be to paste the string into the Footnote field of the Free Form source template, and you would be back into the same problem that you are experiencing now with ancestry.com, even when doing a manual copy and paste for each Footnote Sentence.

 

Jerry

 



#12 GlenB


Posted 02 March 2019 - 11:07 PM

Sounds like step 1 still has to be replacing the Master Source title with non-person specific versions such as "Census, United States, 1910". I can see how to do that in SQL.

 

We're now deep into Step 2, where we try to collapse the same-named Master Source records AFTER moving info from the Master Source record to the Source Detail record. I might need to write some non-SQL code and attach it as an executable that I can call to do the nitty-gritty. Yup, getting more complex all the time...

 

These problems seem to be founded on RM having a proper relational model between sources and citations while everyone else seems to use the inelegant "bag of bytes" design.