DataClean Beta Feature

dataclean

73 replies to this topic

#41 versutus

versutus

    Member

  • Members
  • 5 posts

Posted 24 July 2013 - 09:36 AM

I served with many guys in the Army that had the family name Major.

Versutus

#42 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 30 July 2013 - 04:35 PM

I don't have the time to check if this has been reported before but I was using NameClean tonight and noticed a little quirk when more than one initial appears in the Given name.

For "Agnes M J" RootsMagic suggests "Agnes M .J." and not "Agnes M. J." (note the position of the first period: M space . rather than M. space).
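
For illustration only, here is a minimal Python sketch of the behaviour expected in this report. It is not RootsMagic's code; the function name `punctuate_initials` is hypothetical, and it assumes an "initial" is any single-letter token in the Given name field.

```python
def punctuate_initials(given_name: str) -> str:
    """Append a period to every single-letter token (assumed to be an initial)."""
    tokens = given_name.split()
    fixed = [
        token + "." if len(token) == 1 and token.isalpha() else token
        for token in tokens
    ]
    # Re-join with single spaces so the period stays attached to its initial.
    return " ".join(fixed)


print(punctuate_initials("Agnes M J"))  # Agnes M. J.
```

Working token by token keeps each period attached to its own initial, which is the placement the report asks for.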

Keeping one's customers and their important views at a distance is never a good approach

 

User of Family Historian 7.0, Rootsmagic 7.6.3

 

Excel to Gedcom conversion - simple getting started tutorials here

 

Root


#43 Renee Zamora

Renee Zamora

    Advanced Member

  • Admin
  • 8768 posts

Posted 05 August 2013 - 11:50 AM

Confirming issue is noted in our tracking system.
Renee
RootsMagic

#44 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 05 August 2013 - 12:53 PM

Renee,

Whilst not part of the DataClean Beta feature I was reminded of something related which I believe is in the enhancement requests but maybe you could confirm.

As you know there are many very useful options under Lists > Fact Type > Print under the "Create a List of:" dropdown which are related to data cleaning, like "People Missing a fact", "People with more than one of this fact type" and "Facts with text dates" amongst others.

It would be a very useful enhancement if a Create Group option existed here to complement the existing Generate Report option. Users could then use People View to work through these anomalies and clean them from the database.



#45 Don Newcomb

Don Newcomb

    Advanced Member

  • Members
  • 1075 posts

Posted 19 August 2013 - 01:31 PM

Jerry, do you know anyone who actually has the forename Doctor or Major? They would be new to me. Yes, RM would suggest these for the prefix field, and if they are valid forenames they should also have an exclusion clause.

I've heard of one person named "General Blackjack Pershing Jones". As the story was told, he was a private in the Army, so he'd be Private General Blackjack Pershing Jones.

#46 Renee Zamora

Renee Zamora

    Advanced Member

  • Admin
  • 8768 posts

Posted 20 August 2013 - 02:13 PM

Renee,

Whilst not part of the DataClean Beta feature I was reminded of something related which I believe is in the enhancement requests but maybe you could confirm.

As you know there are many very useful options under Lists > Fact Type > Print under the "Create a List of:" dropdown which are related to data cleaning, like "People Missing a fact", "People with more than one of this fact type" and "Facts with text dates" amongst others.

It would be a very useful enhancement if a Create Group option existed here to complement the existing Generate Report option. Users could then use People View to work through these anomalies and clean them from the database.


Just noticed I somehow missed this enhancement request. Confirming it is in our tracking system.
Renee
RootsMagic

#47 Kyle

Kyle

    Member

  • Members
  • 16 posts

Posted 05 December 2013 - 07:45 PM

Just found this thread and now I'm busy cleaning the places in my inherited data. Love this feature!


Is there a way to add/edit the abbreviation list? I'm finding many cases of "Cem" or "Cem." that I want to expand to "Cemetery" everywhere. And "Gen Hosp" that I want to expand to "General Hospital"

Additionally, it would be nice to be able to click a button and find which people will be affected by the change (a button next to both BEFORE and AFTER, since the AFTER version may match others that were originally entered correctly or fixed previously).


Edit: Now I'm trying out the name cleaner (didn't finish all the places, but wanted to try the other piece too)
I was doing a "fix everything" pass and stopped after a while; I am only fixing the "all uppercase" names now. It would be really nice to have a couple of buttons: "Ignore all like this" (which, on an exact string match for the field, would skip all matching ones) and "Replace all like this".

Love this feature too!


Oh, found a bug in the misplaced nickname search.
"Sally"Sarah -> correctly finds "Sally" as the nickname, but it eats the leading S on Sarah and offers 'arah' as the name. (I am using version 6.3.0.6)

#48 Renee Zamora

Renee Zamora

    Advanced Member

  • Admin
  • 8768 posts

Posted 06 December 2013 - 11:03 AM

Confirming enhancement requests are in our tracking system.

"Sally"Sarah -> correctly finds "Sally" as the nickname, but it eats the leading S on Sarah and offers 'arah' as the name. (I am using version 6.3.0.6)

At first I couldn't see this issue; then I realized you need to NOT have a space between "Sally" and Sarah. If you have no space then you will see the "S" missing in Sarah. I have added this issue to our tracking system.
Renee
RootsMagic
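
As a rough illustration of the expected result, the sketch below pulls a quoted nickname out of a given-name field whether or not a space follows the closing quote. It is only an assumption about how such a check could be written, not RootsMagic's implementation, and `split_nickname` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# A nickname is assumed to be any text between a pair of double quotes.
_QUOTED = re.compile(r'"([^"]+)"')


def split_nickname(given: str) -> Tuple[str, Optional[str]]:
    """Return (given name without the nickname, nickname or None)."""
    match = _QUOTED.search(given)
    if match is None:
        return given.strip(), None
    nickname = match.group(1).strip()
    # Cut exactly at the quote positions so no neighbouring letter is lost,
    # then normalise whitespace in what remains.
    remainder = given[:match.start()] + " " + given[match.end():]
    return " ".join(remainder.split()), nickname


print(split_nickname('"Sally"Sarah'))   # ('Sarah', 'Sally')
print(split_nickname('"Sally" Sarah'))  # ('Sarah', 'Sally')
```

Slicing at the match boundaries, rather than assuming a separator character after the closing quote, is what keeps the leading "S" of Sarah intact in the no-space case.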

#49 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 08 December 2013 - 01:52 AM

Is there a way to add/edit the abbreviation list? I'm finding many cases of "Cem" or "Cem." that I want to expand to "Cemetery" everywhere. And "Gen Hosp" that I want to expand to "General Hospital"


Use the Search & Replace feature on Places (assuming they are Places and not Place Details yet!); you will get a chance to confirm each one. With "Cem" I would search for "Cem ", that is, Cem with a trailing space.
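
Outside RootsMagic, for example on an exported place list, the same trailing-space idea can be expressed with a word-boundary pattern. The sketch below is only an illustration: the `ABBREVIATIONS` table and the `expand_abbreviations` function are hypothetical and do not reflect any built-in RootsMagic abbreviation list.

```python
import re

# Hypothetical user-maintained table: abbreviation -> full form.
ABBREVIATIONS = {
    "Gen Hosp": "General Hospital",
    "Cem": "Cemetery",
}


def expand_abbreviations(place: str, table=ABBREVIATIONS) -> str:
    """Expand whole-word abbreviations, with or without a trailing period."""
    # Longest abbreviations first, so multi-word entries are handled before
    # any shorter entry that might overlap with them.
    for abbr in sorted(table, key=len, reverse=True):
        words = [re.escape(word) + r"\.?" for word in abbr.split()]
        # \b and (?!\w) stop "Cem" from matching inside "Cement" or "Cemetery".
        pattern = r"\b" + r"\s+".join(words) + r"(?!\w)"
        place = re.sub(pattern, table[abbr], place)
    return place


print(expand_abbreviations("Oak Hill Cem., Richmond"))  # Oak Hill Cemetery, Richmond
print(expand_abbreviations("Cement Works, Richmond"))   # Cement Works, Richmond
```

The look-ahead plays the same role as the trailing space in the search string: it prevents a match in the middle of a longer word.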



#50 Guy

Guy

    Advanced Member

  • Members
  • 37 posts

Posted 23 January 2014 - 08:04 PM

I am a little behind in trying this beta feature. However, I must say this is the best thing since take-out pizza! I have been manually doing cleanup for a while, and in just about two hours using this feature I have cleaned up many of the errors in my db (11.5K persons and 1388 places).

I have also run into the situation of NameClean trying to put Earl and Dean as a suffix, but I can live with this. (By the way, my wife's side of the family has two persons with the given name of Major, so this is another one I skip.)

As for PlaceClean, it did not always suggest the full state name: it gave Pa for PA instead of Pennsylvania; for Loudon County, VA it suggested Loudon County, Va; and it suggested Ar for AR instead of Arkansas. For some reason it also wanted to capitalize the word MILL in Mill Creek, Randolph, West Virginia and other places where the word Mill was used. Also, many times it would not remove the word County when I had a city, county, state. It seemed to be inconsistent.

However, in spite of all of these issues, the feature made the fixes easier than they would otherwise have been. I would like to have the option to use the county check in conjunction with this feature so as to kill two birds with one stone.
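
To make the state-name expectation concrete, here is a small Python sketch that looks up the last comma-separated part of a place against a table of postal abbreviations before any case conversion is applied. It is an assumption about one possible approach, not a description of PlaceClean; `US_STATES` (deliberately partial) and `expand_state` are hypothetical names.

```python
# Partial, illustrative table of postal abbreviations; a real list would be complete.
US_STATES = {
    "PA": "Pennsylvania",
    "VA": "Virginia",
    "WV": "West Virginia",
}


def expand_state(place: str) -> str:
    """Replace a trailing state abbreviation with the full state name."""
    parts = [part.strip() for part in place.split(",")]
    # Compare case-insensitively so "Pa", "PA" and "pa" are all recognised.
    key = parts[-1].upper()
    if key in US_STATES:
        parts[-1] = US_STATES[key]
    return ", ".join(parts)


print(expand_state("Loudon County, VA"))  # Loudon County, Virginia
```

Doing the lookup before title-casing avoids results such as "Va" or "Pa", because a recognised abbreviation never reaches the case-conversion step.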

#51 Renee Zamora

Renee Zamora

    Advanced Member

  • Admin
  • 8768 posts

Posted 24 January 2014 - 11:01 AM

Confirming issues and enhancement requests are in our tracking system.
Renee
RootsMagic

#52 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3974 posts

Posted 02 April 2014 - 12:40 PM

In a certain sense, this message is more about RM in general than it is about the DataClean Beta. Indeed, this message is really more about genealogy in general than it is about RM. Nevertheless, the issue came up in the context of the DataClean Beta, so I'll post it here.

I was running the "Names in all uppercase" test with the PlaceClean part of the beta. It identified "France, WWII" as a potential problem. I'm well aware that "WWII" is not a place, but let me continue.

In this particular case, the death note for the individual in question said "US Army Air Corp. Pilot killed in action over France during World War II. Made night drops of supplies to the French Resistance.". So in this particular case, the best solution seems to be simply to change the death place from "France, WWII" to "France" and all is well. The minor glitch from PlaceClean's point of view is that I had to exit from PlaceClean and do a Find in RM Explorer to find the individual in question. So far, so good.

But without even running PlaceClean any further prior to posting this note, I already know that I have some cases where an individual's place of death simply says WWI or WWII because I don't actually know where the individual died. PlaceClean is going to identify a problem. I can't really think of a better place of death to enter for these individuals than WWI or WWII, and I have no way to identify to PlaceClean that those places are not a problem.

I also have individuals whose place of death is entered as Civil War because that's all I know. But PlaceClean doesn't identify Civil War as a problem because it's not all caps.

PlaceClean would obviously be well served by having a "not a problem" feature. But in the meantime, I would be curious what others of you all might do about entering the place of death when all you know is that the individual died in a particular war and you know nothing else about the place.
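
A "not a problem" list could look something like the Python sketch below. It is purely illustrative: the uppercase rule is a guess at the kind of test involved, not PlaceClean's actual logic, and `flag_uppercase_parts` and its `accepted` parameter are hypothetical.

```python
def flag_uppercase_parts(place: str, accepted=frozenset()) -> list:
    """Return the comma-separated parts of a place that contain an all-caps word.

    Parts listed in `accepted` (the user's "not a problem" list) are skipped.
    """
    flagged = []
    for part in place.split(","):
        part = part.strip()
        if part in accepted:
            continue  # the user has marked this exact text as acceptable
        words = part.split()
        # Assumed rule: any alphabetic word of two or more letters in all caps.
        if any(len(w) > 1 and w.isalpha() and w.isupper() for w in words):
            flagged.append(part)
    return flagged


print(flag_uppercase_parts("France, WWII"))                     # ['WWII']
print(flag_uppercase_parts("France, WWII", accepted={"WWII"}))  # []
print(flag_uppercase_parts("Civil War"))                        # []
```

Matching on the exact text of a part keeps the exception narrow: accepting "WWII" would not silence other all-caps entries.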

Jerry

#53 TomH

TomH

    Advanced Member

  • Members
  • 6444 posts

Posted 02 April 2014 - 01:14 PM

Coincidentally, I struggled with the same question (WW1 as a place) yesterday while also struggling to use RM on an 8" tablet. I could not use the Data Clean feature; it is simply inaccessible through touch. WW1 is an era but not a valid date, so the only option I could think of was to put it in the Death fact Description and use a date range or estimated date in the Date field. So Place was left blank unless more was known, such as whether it was Europe or at home.

Tom user of RM7630 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#54 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 02 April 2014 - 01:43 PM

I already know that I have some cases where an individual's place of death simply says WWI or WWII because I don't actually know where the individual died.


I am going to make an assumption here and say "Europe"

I have worked to eradicate all non-places from my Place List, with two exceptions: "a place unknown", chosen so that reports read well and easily removed to leave the place blank, and "at sea". Apart from those, I either have a country noted at the very least or "a place unknown" to prompt further research.

Is there any other reason for entering WWI or WWII? You could always enter "World War I", which I presume DataClean would not have a problem with.

I do agree that both Name and Place Clean yearn for a list where users can enter acceptable data. I see this as something the Rootsmagician should consider carefully in his quest for compatibility with other countries, where Place Details names can vary widely.



#55 Renee Zamora

Renee Zamora

    Advanced Member

  • Admin
  • 8768 posts

Posted 03 April 2014 - 08:42 AM

Confirming enhancement requests are in our tracking system.
Renee
RootsMagic

#56 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3974 posts

Posted 03 April 2014 - 02:29 PM

Is there any other reason for entering WWI or WWII? You could always enter "World War I", which I presume DataClean would not have a problem with.


The reason I have chosen to enter things like WWI and WWII when no other good option seems available is simply to make the death fact sentence read well in a report - "He died in WWII". For example, I have one individual where the death fact sentence says "He died in WWII" and where the death fact note says something to the effect - "He was a pilot, and he never returned from his last mission. It is not known if his plane went down over France or over Germany or at sea".

Also, it's not always Europe. For example, a death fact sentence might say something like "He died in WWII in the Pacific Theater" if I don't know where he died in the Pacific Theater.

I really don't like my solution, but it's hard to find a better one. I do realize that you can clean things up a great deal in reports by customizing sentence templates for a particular fact. But I have begun to shy away from that a bit because it means I'm putting what I consider important data into what in effect is RM metadata, and RM metadata doesn't always transfer well to third party software.

I can obviously solve the problem for the DataClean beta by using World War I and World War II (well, World War II is still a problem because of the double capital I). But that still leaves me with the greater problem of how best to enter this data. The greater problem of how best to enter this data is sort of off topic for the DataClean beta except for my desire for a "not a problem" feature in DataClean.

Jerry

#57 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 03 April 2014 - 04:14 PM

Thanks for the reply. Trust me, I can see your dilemma, and I especially identify with the "at sea" case, where no specific land mass can be identified.

I also agree that a "not a problem" facility or a user-generated "acceptance" list of entries is warranted, probably in both Name and PlaceClean.



#58 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 03 April 2014 - 04:51 PM

Jerry,

I was wondering if the place/place details "appendix" report enhancement might go some way towards fulfilling your reporting desires.

http://forums.rootsm...nt-place-notes/



#59 Vyger

Vyger

    Advanced Member

  • Members
  • 3546 posts

Posted 06 April 2014 - 04:06 AM

Also, it's not always Europe. For example, a death fact sentence might say something like "He died in WWII in the Pacific Theater" if I don't know where he died in the Pacific Theater.


Jerry, did you see the tip on Handling Incomplete Places in the current Newsletter? It may work for your quandary, and possibly for others with similar situations.

http://us1.campaign-...9&id=daa3eb3317



#60 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3974 posts

Posted 06 April 2014 - 07:46 AM

Jerry, did you see the tip on Handling Incomplete Places in the current Newsletter? It may work for your quandary, and possibly for others with similar situations.

http://us1.campaign-...9&id=daa3eb3317


Yes, I saw it. It's quite excellent (and timely!). I wonder if it's in response to this thread, or if it was already in the works.

Jerry