Inaccurate Sorting of Estimated Dates - Sort is skewed

dates sort

7 replies to this topic

#1 Vyger

Vyger

    Advanced Member

  • Members
  • 3373 posts

Posted 31 May 2019 - 07:47 AM

I'm travelling with patchy connectivity at the moment but couldn't find this acknowledged as an issue before.

 
The sorting of approximated dates came up on Facebook this morning. At present I don't find these sort dates an accurate estimate, as they are usually skewed to the begin date.
 
For example, a Bet date takes the earliest date as the sort date, not the midpoint. So for an entered date of Bet 1 Jan 1900 and 31 Dec 1900, Rootsmagic will apply 1 Jan 1900 as the sort date and not 2 Jul 1900. The same is true where quarters are used: Q1 1900 will effectively sort as 1 Jan 1900. This is not the displayed sort date, it is the effective sort date. All my estimated dates are personally refined and reviewed often in line with available information.
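To make the two candidate sort dates concrete, here is a minimal Python sketch. It is only an illustration of the arithmetic: the function names are hypothetical and nothing here reflects how Rootsmagic is actually coded.

```python
from datetime import date

def begin_sort_date(start: date, end: date) -> date:
    """Sort on the first date of the range (the behaviour described above)."""
    return start

def midpoint_sort_date(start: date, end: date) -> date:
    """Sort on the midpoint of the range, rounded down to a whole day."""
    return start + (end - start) // 2

start, end = date(1900, 1, 1), date(1900, 12, 31)
print(begin_sort_date(start, end))     # 1900-01-01
print(midpoint_sort_date(start, end))  # 1900-07-02
```

The range spans 364 days, so half of it is 182 days, which lands on 2 Jul 1900.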
 
The skewing of sort dates for Bet dates is not terribly noticeable in small databases but it can be very deceptive in larger databases. I have a large database and presently work my own solutions to this but shouldn't need to. Within Rootsmagic I would suggest providing a user option regarding how to calculate and apply sort dates for approximated dates.
 
I know RM7 has been programmed this way so I would assume the programmers know how these sort dates are calculated, however it is not an accurate estimation to sort on and becomes very evident within a large database.

“Your most unhappy customers are your greatest source of learning.” -Bill Gates

It's now time for discretion, trust, patience and support

 

User of Rootsmagic 7.5.9, Family Historian 6.2.7, Family Tree Maker 2014 & Legacy 7.5

 

Excel to Gedcom conversion - simple getting started tutorials here

 

Root


#2 Renee Zamora

Renee Zamora

    Advanced Member

  • Support
  • 8335 posts

Posted 31 May 2019 - 09:16 AM

It does sound like our expectations of where the date should begin are different. I would always look for someone with a Bet 1 Jan 1900 and 31 Dec 1900 date at 1 Jan 1900. It wouldn't occur to me to find them in the mid-range in a sort. I wonder if this expectation is a geographical thing?


Renee
RootsMagic

#3 Vyger

Vyger

    Advanced Member

  • Members
  • 3373 posts

Posted 31 May 2019 - 01:28 PM

It does sound like our expectations of where the date should begin are different. I would always look for someone with a Bet 1 Jan 1900 and 31 Dec 1900 date at 1 Jan 1900. It wouldn't occur to me to find them in the mid-range in a sort. I wonder if this expectation is a geographical thing?

 

I am well aware of how users' expectations can adapt to how things are without asking whether it is correct or not. Users can easily adapt their thinking to how things are or were, and that is mainly why I suggested an option rather than forcing correctness on anyone.

 

By definition, using the example Bet 1 Jan 1900 and 31 Dec 1900, a sort date of 2 Jul 1900 has a maximum margin of error of 182 days, whereas the way Rootsmagic presently applies sort dates allows a maximum margin of error of 364 days in this one-year example. It doesn't need opinions or debate, this is mathematics: Rootsmagic is applying skewed data and, it would seem, is happy to accept it.
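The 182 versus 364 day figures can be checked with a short Python sketch. The helper name is hypothetical, and it assumes the true date could fall on any day in the range, both ends included.

```python
from datetime import date

def worst_case_error_days(start: date, end: date, sort_date: date) -> int:
    """Largest possible gap, in days, between sort_date and a true date in [start, end]."""
    return max(abs((sort_date - start).days), abs((end - sort_date).days))

start, end = date(1900, 1, 1), date(1900, 12, 31)
midpoint = start + (end - start) // 2  # 2 Jul 1900

print(worst_case_error_days(start, end, start))     # 364
print(worst_case_error_days(start, end, midpoint))  # 182
```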

 

It is purely wrong imo to base any estimate on an extremity of the data. Any estimate should be based on its midpoint, whether it's a pin in the map regarding geography or a date span. There is no other factual way, so the way Rootsmagic is programmed to apply sort dates is wrong imo. If you expand our simple one-year example to only knowing someone had a BMD somewhere within a certain decade, that is one large skew of sorting.

 

I was reporting this because I believe it is a false calculation, hoping it may have been overlooked, but it would seem Rootsmagic would prefer to brush over this skewed application of sort dates, and so be it. I can easily correct these sort dates but I would imagine many users cannot. Regardless of that fact, should I be correcting flawed and skewed data?

 

Regardless, I have noted Rootsmagic's position and will move on and not waste my time further. You can lead a horse to water but you can't make him drink.

 

Reference;

Min and Max show you the lowest (Min) and the highest (Max) value. Mean, also known as "average", is found by summing all the values and dividing by the number of values. Median or Mid gives you the value that would be in the middle of a list of more than one value.




#4 TomH

TomH

    Advanced Member

  • Members
  • 6161 posts

Posted 01 June 2019 - 11:51 AM

"Rootsmagic would prefer to brush over this skewed application of sort dates"

I've not seen any explicit indication of that but I understand your frustration generally with the lack of action on many other issues.

I would suggest that the SortDate algorithm was kept simpler by ignoring the second date of any dual date event while allowing the user to override the default SortDate with a custom value.
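A rough Python sketch of that design, purely as an illustration: the `Event` class and its field names are hypothetical and are not taken from the RootsMagic database or code. The default sort date is the first date of the event, and a user-supplied value overrides it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Event:
    label: str
    start: date                        # first (or only) date of the event
    end: Optional[date] = None         # second date of a range, ignored by default
    custom_sort: Optional[date] = None # user override, if any

    def sort_date(self) -> date:
        """Use the user's override when present, otherwise the first date."""
        return self.custom_sort if self.custom_sort is not None else self.start

events = [
    Event("C", date(1900, 1, 1), date(1900, 12, 31), custom_sort=date(1900, 7, 2)),
    Event("B", date(1900, 3, 1)),
    Event("A", date(1900, 1, 1), date(1900, 12, 31)),
]
ordered = [e.label for e in sorted(events, key=Event.sort_date)]
print(ordered)  # ['A', 'B', 'C']
```

Events A and C have the same entered range, but C sorts after the 1 Mar 1900 event only because its sort date was overridden to the midpoint.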

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix-tiny.png app, a bundle of RootsMagic utilities.


#5 Vyger

Vyger

    Advanced Member

  • Members
  • 3373 posts

Posted 01 June 2019 - 04:59 PM

I would suggest that the SortDate algorithm was kept simpler by ignoring the second date of any dual date event while allowing the user to override the default SortDate with a custom value.

 

I have many John Doe birth events of very similar time frames, and the present skewing of the sort date to the begin extremity is very counterproductive in a larger database such as mine. I was therefore suggesting an option to allow the user to define how the sort date is calculated and applied to various dual date entries, which I believe is a productive suggestion.

 

Regardless of what others believe, a date BETWEEN X & Y is naturally some point in between and is therefore best approximated and sorted as the midpoint between those two points. Rootsmagic appear to disagree with logic, don't seem interested in improving accuracy and I have to accept that deliberate lack of attention to detail.

 

The lack of logic in this discussion is not something I want to labour. Reducing a maximum margin of error from 364 days to 182 days is a good and desirable gain in my opinion and I welcome any counter-arguments, but I do respect and honour the views of others regardless of logic. I certainly don't want to mention the horse again.

 

I will continue to maintain my sort dates within Rootsmagic to my standard for as long as I decide to persevere with the Rootsmagic program and its current limitations.




#6 TomH

TomH

    Advanced Member

  • Members
  • 6161 posts

Posted 01 June 2019 - 07:36 PM

"Rootsmagic appear to disagree with logic, don't seem interested in improving accuracy"

I agree with the logic of your proposal to enhance the SortDate algorithm for range dates. However, I am troubled by your conclusion.



#7 robertjacobs0

robertjacobs0

    Advanced Member

  • Members
  • 263 posts

Posted 01 June 2019 - 09:16 PM

I too agree with Vyger's logic, but given that the user can set the sort date to the midpoint of the two end dates if desired, I wouldn't delay the release of RM8 by one microsecond for the proposed change.

 

Wherever the sort date is set, either by RM or by the user, the date itself will be displayed as entered.

 

In my own records (between 5800 and 5900 people, n=5851 :) ), a quick and dirty survey shows fewer than five "between x and y" sort dates which would materially affect the order of events in print reports or on my GedSite web site.



#8 Vyger

Vyger

    Advanced Member

  • Members
  • 3373 posts

Posted 02 June 2019 - 09:20 AM

I too agree with Vyger's logic, but given that the user can set the sort date to the midpoint of the two end dates if desired, I wouldn't delay the release of RM8 by one microsecond for the proposed change.

 

This was just an observation I believed should have had merit. It's not an issue for me other than having to batch update sort dates very occasionally. The sort date will be populated as Bet 1 Jan 1900 and 31 Dec 1900 but will sort to 1 Jan 1900; an Abt 1900 sort date will be populated as such but again sort to 1 Jan 1900; Q2 1900 will sort to 1 April 1900. I noticed this as an issue affecting my expectations on People View and moved to correct it some years ago, but I don't think I posted it then, so I did now. On large data sets of common names this skew can be quite counterproductive, and I find my sort dates much more revealing in ordered lists.
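For the quarter case, a small Python sketch of what a midpoint sort date would look like. It assumes quarters are calendar quarters (Q1 = Jan to Mar, Q2 = Apr to Jun, and so on), which matches the 1 Jan and 1 April begin dates mentioned in this thread; the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple

def quarter_range(year: int, quarter: int) -> Tuple[date, date]:
    """First and last day of a calendar quarter (assumed Q1 = Jan to Mar)."""
    first_month = 3 * (quarter - 1) + 1
    start = date(year, first_month, 1)
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, first_month + 3, 1)
    return start, next_start - timedelta(days=1)

def quarter_midpoint(year: int, quarter: int) -> date:
    """Midpoint of the quarter, rounded down to a whole day."""
    start, end = quarter_range(year, quarter)
    return start + (end - start) // 2

print(quarter_range(1900, 2))     # (datetime.date(1900, 4, 1), datetime.date(1900, 6, 30))
print(quarter_midpoint(1900, 2))  # 1900-05-16
```

So Q2 1900 would sort at 16 May 1900 under a midpoint rule, instead of 1 April 1900.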

 

I agree with the logic of your proposal to enhance the SortDate algorithm for range dates. However, I am troubled by your conclusion.

 

Thank you, I am starting to worry about myself. My conclusion came from a less than constructive reply; in fact I have found a number of recent replies almost dismissive. I think I need to go back into my cave before I become a more cynical, disillusioned grumpy old man than I already am <_<

 

I believe Rootsmagic need to appreciate a lot of people are trying to contribute in a constructive way towards making a better and more successful program and they are not on the Rootsmagic payroll, in fact on that topic I'm not coming in next week :D

