Jump to content


Photo

Research Management System


  • Please log in to reply
75 replies to this topic

#41 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 16 December 2012 - 06:05 PM

Good attitude, Theresa! I don't think I have a book on SQL but I cut my teeth on dBase II - maybe before your time ;). There's plenty of on-line stuff and nothing better than messing around with it.


Well, my first experience on a computer was as a kid using the DARPA system for a math program (ca. 1970). My first personal computer was Aug 1977 - the TRS80 (model 1) from RadioShack and ran on Basic I (before DOS). The 5 1/2 in floppy disks hadn't been invented yet and we used a special cassette player that ran a magnet tape for about 45 minutes before the system booted up and the cursor would come up on the blank screen. My dad bought a simple chess game for the computer - it was a book about 150 pages long of nothing but the lines of computer code. There were no hard-drives, so if you turned off the computer you needed to re-key the entire program again before you could run it. I think ours had 4k RAM. http://oldcomputers.net/trs80i.html

So... not before my time :)

I have had a home computer ever since and seen the evolution of the computer industry. My first experience with BASIC I is probably what kept me out of the programming field. I stopped by the bookstore today - after seeing "The Hobbit" (great movie!) - and bought a book on SQL and have been reading through how the queries are written. It looks straightforward enough, but will have to play with it. I do have a much better understanding of what I am looking at on the SQLite wiki site now. Enough to be able to jump in with some comments at the very least.

Theresa

#42 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 16 December 2012 - 07:22 PM

Tom and Alfred,

Sorry I wasn't more clear. I was trying to create a "group" using the create group icon in the index field. I wanted to query the database for everyone associated with the master source "Washta - 1870 US Federal Census". I was trying to mark the group using the "Select people by data fields" and then selected "Source (general)" from the list. I have played with the query several different ways, but can't get it to pull the correct number of individuals (there are 277 people with citations for that source document). I can't run it by event because the tag is CENS for some and RESI for others. I was hoping to be able to do the groups by source record. I am still cleaning up the tag names and place names that are all over the place after the FTM2012 gedcom fiasco. I may just re-key my entire tree in from scratch so that I can clean up the data as I go.

Doing the genealogy for a town is a bit different than doing the genealogy of a single family tree -- the starting point is at the top of the tree and works down through descendant lines as opposed to a single family tree where you start at the bottom of the tree and work up through ancestors. All of my research for the town starts with the earliest source documents for the town and adding each person found in that document to the tree, finding all possible records for their lifetime. With each new census or source document, more families are added to the tree, more descendant lines are traced, and many linkages between families are established. I am tracing as many of the descendant lines for everyone who lived in the town of Washta Iowa from 1868 to present day.

I plan on making heavy use of the "Groups" feature in RM6 to create sub-sets of the tree based on source document, contributor, family line, and tasks still to be completed. Needless to say, I will probably be VERY good at using the search functions in RM6 by the time this project is completed in 2018.

#43 Alfred

Alfred

    Advanced Member

  • Members
  • PipPipPip
  • 5734 posts

Posted 16 December 2012 - 07:53 PM

Try:
Any Fact -- source -- name contains -- Washta - 1870 US Federal Census
Alfred

#44 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3442 posts

Posted 16 December 2012 - 08:03 PM

Try:
Any Fact -- source -- name contains -- Washta - 1870 US Federal Census


Yes, and to be even more inclusive, you could use

Source (general) -- name contains -- Washta - 1870 US Federal Census
OR
Any Fact -- source -- name contains -- Washta - 1870 US Federal Census

Depending on how you have done your citations, you may not need the test for Source (general) if each person with a Washta - 1870 US Federal Census citation has it associated with at least one fact. And vice versa, if each such person has such a citation associated with the general person, then you wouldn't need to test Any Fact. But it won't hurt anything to do both.

Jerry

#45 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6161 posts

Posted 16 December 2012 - 08:20 PM

Neat! Four different questions. Four different answers and, happily, RM can answer the real one.

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#46 Vyger

Vyger

    Advanced Member

  • Members
  • PipPipPip
  • 3373 posts

Posted 17 December 2012 - 11:01 AM

From a programming perspective, here are some of the enhancements needed to integrate the system:

  • A good example of the necessary functionality is the typical business contact management systems that link contact history to a person in the address book. In the genealogy program, would be a link between an individual to a correspondence or research log entry. Would be a many-to-many relationship (a person could have many correspondence items and a correspondence item could be linked to many people). A junction table with link between person and correspondence tables would do the job.
  • Create a similar many-to-many relationship between the research log and individuals in the tree.
  • Create many-to-many relationship between correspondence log and research log.


These are all valid wishes and I also believe they have all been voiced before, maybe Renee can confirm.

In the Correspondence List I would see this as tagging similar to media items where the piece of correspondence is tagged with any individual it refers to, my head just said "why not also places" so I am adding that without really thinking it through. :unsure:

“Your most unhappy customers are your greatest source of learning.” -Bill Gates

It's now time for discretion, trust, patience and support

 

User of Rootsmagic 7.5.9, Family Historian 6.2.7, Family Tree Maker 2014 & Legacy 7.5

 

Excel to Gedcom conversion - simple getting started tutorials here

 

Root


#47 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 18 December 2012 - 04:28 PM

These are all valid wishes and I also believe they have all been voiced before, maybe Renee can confirm.

In the Correspondence List I would see this as tagging similar to media items where the piece of correspondence is tagged with any individual it refers to, my head just said "why not also places" so I am adding that without really thinking it through. :unsure:


I was actually thinking about this more after my initial post and thought that the Research Manager, Tasks, and Correspondence logs would be to have as additional tabs on the main screen rather than hidden in the menu. Then you could have research items associated with a person/group; tasks associated with person/group. Correspondence would be good to have as a tab, but also the ability to use a correspondence item act as a source document. I have several emails that I have saved for years because they provided key information related to my genealogy. My paternal grandfather who has been estranged from the family since before I was born is still living and is in his late 90s. I tracked him down and he responded to an email where I asked him how he met my grandmother. He told me they went to school together, ran away from home together, and were married at 14 years of age. There is no other place I would find the story behind the early marriage. The email can be printed, but what about an interview or telephone conversation. If the correspondence log was a tab on the main screen it would be easier to document the conversation and have it attached to the person providing the information and to the family members who they provided information about.

All the conversations on the message boards and wiki sites have really got my head spinning about the programming side of genealogy software. I even spent several hours this morning trying to map out all the different tables/relationships between tables this morning before I left for work - when you start listing all the different tables and required in a relational database, it gets complicated really fast. One new feature and it adds multiple tables. I also started looking at XML this morning after reading a post on the BetterGedcom wiki. What is the advantage of using SQL vs using XML?

Theresa

#48 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6161 posts

Posted 18 December 2012 - 05:04 PM

XML is not a database engine or a query language; it is a way of wrapping data in a lingua franca that allows it to be transferred between different database systems using different engines and query languages, or to be interpreted and displayed in a structure. It is a Markup Language (as is HTML) but for data.

SQL is Structured Query Language, used with a variety of different database engines which store data in different ways, to manage and interrogate the data. SQLite is a database engine which has but one language, a flavour of SQL; yes, there are a variety of syntaxes.

XML adds a great deal of overhead to the storage of data so it is ill-suited as a means of storing large amounts of heavily accessed data. That's why there are a great many kinds of database engines tailored to different kinds of applications. SQLite, used within RM, is designed for multiple read, single write usage of a local datafile in contrast with, say, MySQL which is perhaps the major backend data server on the Internet.

And, I should add, RM stores things like sentences for facts and citations in XML in the SQLite database, writes data to XML files for the Publish Online website, that gets interpreted and displayed by Javascripts downloaded to the web browser.

Edited by TomH, 18 December 2012 - 05:14 PM.

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#49 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 18 December 2012 - 06:27 PM

Ok.... now my head is really spinning. So let me see if I got the gist of what you were saying:
  • XML isn't a database/query language - it acts more as a foreign language dictionary or a translator between databases .
  • There are different kinds of SQL (as opposed to the variation on the term SQL being different upgraded versions of the same thing).
  • XML would suck the life out of your computer storage and RAM if it were applied to genealogy software because of the heavy data manipulation done between tables in a genealogy database
  • Some functions within RM use XML for moving data between the tables to reports or from the database to the web (javascript being a common web design language that defines the GUI interface).

What I didn't quite grasp were the distinctions between SQL, MySQL, and SQLite, other than perhaps a different tool for a different task?

Overall, your response did answer my question regarding why SQL for genealogy databases over XML. But did I get the meaning of the rest of your answer? I am curious to know if any of the stuff that I have been reading for the past week is starting to sink and actually beginning to make sense.

#50 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6161 posts

Posted 18 December 2012 - 08:55 PM

SQL is a language used by humans and by applications to interact with database engines through which we build and maintain databases. SQLite and MySQL are database engines. If both had identically defined databases with the same data, then this SQL query would return the same results from each: SELECT Given, Surname FROM NameTable;. They share a common query language that has an ISO standard but each different database engine has some variation, either a constraint or an extension.

RootsMagic interacts with its SQLite database using SQL to interact with the SQLite 3 database engine that is compiled along with the user interface and other input and output interfaces that comprise the application. MySQL cannot read or write a SQLite database and vice versa. But one could write SQL queries or, more likely, a set of procedures in a high level language such as C++ or Delphi (used by RootsMagic) that would generate the needed SQL queries and output all the table definitions and data from, say, the SQLite database and write it out in XML. On the other end, another database engine could use a complementary procedure to import the XML, translating it into its query language to build the database in its format. SQLite and MySQL could skip XML because they have SQL in common and can backup to and restore from SQL files.

BetterGEDCOM favours XML over the old GEDCOM standard for a variety of good reasons but a disadvantage of xml is that the file size for the same data transport is very much larger

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#51 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3442 posts

Posted 18 December 2012 - 09:13 PM

XML isn't a database/query language - it acts more as a foreign language dictionary or a translator between databases .


I'm trying to decide if the description of XML as a foreign language dictionary or a translator between databases captures the essence of XML. I think it's sort of yes and no. Let me try to explain further by examples.

I first make note of the fact that HTML really is a specific instance of XML. HTML is the primary language of Web pages, and it works with tags. For example, a paragraph in HTML starts with a <p> tag and ends with a </p> tag, and the text for the paragraph goes between the <p> and the </p>. Another example is the bold tag which is <b> and the end of bold tag is </b>. As you can see, most tags have a begin and end, and the one with the end contains a /. Tags frequently are nested so that a paragraph in a Web page might be coded as something like

<p> unbolded text for the paragraph goes here <b> bolded text for the paragraph goes here </b> more unbolded text for the paragraph goes here </p>

If a protocol were being set up to exchange medical information between medical providers and insurance companies, it might be done with XML and one example of a tag might be <prescription> to begin the information about a prescription and </prescription> to end the information about the same prescription, with the actual prescription data between the two tags.

In the genealogy world, GEDCOM might some day be replaced with XML where a birth date might be transmitted in an XML file as <birth> <date> birth date goes here </date> <place> birth place goes here </place> </birth>

Maybe that will help a little.

Jerry

#52 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3442 posts

Posted 18 December 2012 - 09:27 PM

.

  • There are different kinds of SQL (as opposed to the variation on the term SQL being different upgraded versions of the same thing).


SQL is a full fledged programming language. The different kinds of SQL don't differ very much, but it's enough difference to drive you crazy. It's not like you have the English kind of SQL and the Russian kind of SQL and the Chinese kind of SQL. It's more like you have the American kind and the kinds from different regions of the U.S. and the kind from Scotland and the kind from England. For example, the U.S. spells a word "color" that is spelled "colour" in England. That's a very minor difference in the kinds of English. But what would be called a "small child" in much of the English speaking world might be called a "wee bairn" in the highlands of Scotland. That's a pretty major difference. Most native English speakers would recognize "wee" as a synonym for "small", but many native English speakers would not recognize that "bairn" is an English word that is a synonym for "child". And I can't finish without mentioning the "you all" that we Southerners in the U.S. typically say where the rest of the English speaking world would just say "you".

It's unfortunate that SQL is not more standard, and indeed there are published SQL standards. It's just that the various vendors and developers of SQL systems tend not to follow the standards very accurately. So SQL programmers have to learn to speak in several different dialects of SQL. As indicated above, it's not like having to learn English and Russian and Chinese. It's more like learning Southern and American and Scottish and Australian, etc.

Jerry

#53 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3442 posts

Posted 18 December 2012 - 09:45 PM

Some functions within RM use XML for moving data between the tables to reports or from the database to the web (javascript being a common web design language that defines the GUI interface).


The SQL programming language is used to create, update, maintain, and query relational databases. Relational databases always follow a row and column model that looks a lot like an Excel spreadsheet or like the People view in RM.

As has already been discussed, XML is very useful for transmitting data between different system. But sometimes it's also useful for storing data within a system in certain specialized instances. You wouldn't normally want to store all your data as XML, but sometimes storing some of your data as XML is the best choice.

In a relational database, XML might be used when the normal row/column model is not flexible enough. A really good of example within RM is the storage of source templates. Source templates allow you as the user to define your own record formats and field names and that sort of thing that really can't be reflected in RM's relational model of rows and columns. So what RM does is to store a source template as XML with the tags reflecting your record format and field names. A single source template, including all its tags and data, is stored as a single column in a single row within the relational database. The relational database doesn't know that it's storing XML. It thinks it's just storing a string of text. It's the RM program itself that understands that the string of text is XML and needs to be treated as such. And you as the user never see the XML for source templates because by the time you see a source template RM will have formatted the data for you very nicely on the source template screens.

Jerry

#54 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • PipPipPip
  • 3442 posts

Posted 18 December 2012 - 09:53 PM

What I didn't quite grasp were the distinctions between SQL, MySQL, and SQLite, other than perhaps a different tool for a different task?


SQL is a programming language, the language of relational databases. MySQL and SQLite and Oracle and Access and SQL/Server and DB2 and many other such systems are relational databases. MySQL and SQLite happen to be free relational databases, and most of the rest of the relational databases are not free. Every relational database uses SQL as its language, but each relational database seems to have its own dialect of SQL. The different dialects are 90% to 95% to 98% the same, but they are not totally the same. RM uses SQLite as its database so RM uses the SQLite dialect of SQL.

Jerry

#55 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 19 December 2012 - 07:45 PM

Ok... head is starting to clear now. The light bulb is going on. :D

Jerry -- your comparison of SQLite and MySQL to Access and Oracle really helped - I am familiar with both (hate Oracle, but have worked with creating a database in Access, so it helped me get my head around it.

The comparison of different kinds of SQL being similar to regional variations of the same language also helped. It is sort of like dealing with GEDCOM files created by different software applications. All GEDCOMS may have the same file extension. And all the genealogy programs may understand how the gedcom file is supposted to import. However, due to the shortcomings of the 16 year old GEDCOM standard, different software companies have come up with custom tags to try to fill in the gaps. As a result, over time gedcoms have developed there own "regional dialect" depending on which software creates the gedcom file. Some companies do crazy things like merge the description and place names into a single PLAC tag when exporting to gedcom and then the receiving software doesn't know that they are supposed to be different fields. Some companies allow multiple NAME tags while other companies only allow one NAME tag and mulitple alternate name tags.

Your example of an XML code for the source template being inserted as a string of code in a single field in the table helped too. I could get the mental picture of how the computer would build the string of code then insert it in the table or extract it from the table and put it back into separate fields. I could also understand how doing this could cause some of the problems when transfering data between different software applications that may not be using the same variation of XML or the same variation of SQL.

I can see by the sample string of XML code how it could be very bulky trying to use XML exclusively for large database transfers. I took a class in HTML once and remember the challenge of making sure all my code had the closing tags. When the strings get really long, it is easy to lose track of which ones need to be closed and where.

Thank you for condensing what was about six chapters in the book I am reading into a few short paragraphs.

#56 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 19 December 2012 - 08:14 PM

I have been reading the debate about Gedcom vs XML on the BetterGedcom wiki. This may be completely out in left field, but if databases are basically a series of tables, why do the data transter in either?

Wouldn't a csv format be a better mode of transfer? If I am understanding all the discussion on updating the standard, it is basically coming to agreement on what the column headings are in the standard tables (i.e. Person, Place, Event, Repository, Source, Citation, etc) and associated junction tables.

It seems to me, if the standard is set for the basic tables and junction tables, then a batch transfer of csv files would be a better way to ensure that the data stayed in the correct fields and would more effectively handle larger files.

Am I completely off base with the potential use of csv format for data transfer?

Theresa

#57 TomH

TomH

    Advanced Member

  • Members
  • PipPipPip
  • 6161 posts

Posted 19 December 2012 - 08:32 PM

CSV does not carry any data typing whereas GEDCOM does by virtue of a specification and XML inherently does. CSV inherently follows a specific table design. Each proprietary database typically has many different tables and there is no 1:1 relationship between software A's tables and software B's tables. I suppose it would be possible to conceive of an interchange database design that could be the basis for writing a series of export tables in CSV and proprietary rules for the import of same but I think it might suffer the same difficulty as GEDCOM in the interpretation of the specification. Maybe the biggest question is the handling of binary stuff, but now I think I'm just blowing smoke!

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#58 Renee Zamora

Renee Zamora

    Advanced Member

  • Support
  • PipPipPip
  • 8335 posts

Posted 28 December 2012 - 02:12 PM

Confirming enhancement requests are in our tracking system.
Renee
RootsMagic

#59 BTBailey

BTBailey

    Member

  • Members
  • PipPip
  • 26 posts

Posted 28 December 2012 - 07:12 PM

Bygones was a hard concept to grasp, but once understood it was a decent 'tool.' Clooz 3 has made some progress, but still has a long way to go. I am beginning to focus on a budding product that shows a lot of promise to be on the vanguard of genealogical research tools.


I can see I'm a little late to the party, as this thread has meandered in a different direction, but I just wanted to throw in a HALLELUJAH! I have been waiting YEARS to hear news like this. I can't believe it's taken this long for someone to take Bygones and "put it on Steroids" as you describe.

The goal in managing genealogical research, as I see it, is to enter information ONCE then LINK it to people, places, events, etc, until there is a research reason to refine that information. And secondly, to NOT enter information that is already available in your scans, copies, etc. unless there is a way to sort/analyze/retrieve it. There are too many fields in RM that are flat/dummy fields that you can't do anything with except view in a certain screen... reference numbers, event proof and source quality are three that immediately come to mind. I hope this application you describe works harder toward those goals.

But most importantly, I hope it utilizes the Lists of people and places that we already maintain in RM rather than having to maintain duplicate lists in separate applications. I can't wait to migrate my decades of research out of Bygones, but I've yet to find anything that improves upon it without losing functionality.

#60 Serenity20061

Serenity20061

    Advanced Member

  • Members
  • PipPipPip
  • 34 posts

Posted 29 December 2012 - 08:49 AM

To be announced - - - - soon. It is in the final stages of beta testing and version 1 should be available soon. Yes, I am teasing, because I am looking forward to sharing this information with many people, but have been asked by the developer to hold off just a bit so they are not bombarded with new users until they are ready. I promise I will reveal all as soon as is practicable because I would really like to see some of you power users ring this thing out. We would all benefit greatly.

Paul


Paul,

Do you know when this new product is likely to hit the market? It sounds like the solution to many of my problems with current software. If it is still several months (or years) out, is there any way to get in on the beta testing?

Theresa