Merging two databases on different platforms has long interested me too, but so far I have found the RootsMagic tools insufficient. While you can upload an RM database to Ancestry, trying to merge the trees there is unworkable: you have to pull over one individual at a time, one fact at a time, one source at a time. RM does have tools, but several of them are not ready for use. Duplicate checking and the Compare Files tool seem to have been done in a hurry, with only an exact string match to locate matches and very poor, incomplete merging functions. Drag and Drop, however, works well for what it does.
My current thinking is to download the Ancestry tree into a new RM tree and check each record for issues, then decide which database is more important and shelve the other for now. When I have time (I'm overloaded with little projects currently), I'll use Drag and Drop to move whole branches of the shelved tree into my new primary tree. Afterward I'll compare the two, perhaps manually, and merge or copy over whatever is missing.
As to Compare Files and Duplicate Checking, there are two parts to merging records: first identifying the records that match, then merging those two records. The RM developer does good work normally, but was clearly in a hurry here and only provided a quick-and-dirty, rudimentary dup-finding function: basically nothing more than a little normalization of the person's name followed by an exact string match. That can only work if the names were entered exactly the same way, with the same tool. As I'm sure you have found out, different tools handle things like nicknames, maiden names, prefixes, and suffixes differently, so even when you think you've entered the same persons the same way, they will often not match here. I often enter nicknames when possible, and that's my main reason for so many identical persons not being identified as matches; I think only about 20% to 30% of my identical persons matched. I suspect the RM developer only did a quick job to help merge records that came from the same database back together, which *is* an important function, then hoped to get back to this when he had more time. Given the current crunch, that's probably very low priority just now.
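To illustrate why this fails so often, here is a hypothetical sketch of the kind of "normalize a little, then exact-match" check described above. The data layout and normalization rules are my own invention, not RootsMagic's actual code, but the failure mode is the same: a nickname in one tree breaks the match.

```python
def normalize_name(name):
    # Strip case, punctuation, and extra whitespace -- and nothing else.
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def naive_duplicates(tree_a, tree_b):
    # Return pairs of persons whose normalized names are string-identical.
    index = {}
    for person in tree_a:
        index.setdefault(normalize_name(person["name"]), []).append(person)
    matches = []
    for person in tree_b:
        for candidate in index.get(normalize_name(person["name"]), []):
            matches.append((candidate, person))
    return matches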
A quick improvement that would help immensely would be to also match on birth date, wherever month and day exist. Matching on that alone would have raised my match percentage to better than 80%, perhaps better than 90%.
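The birth-date idea above amounts to using the date as a candidate key, independent of how the name was entered. A minimal sketch, assuming a made-up record layout where birth is stored as a (year, month, day) tuple:

```python
def birthdate_key(person):
    # Only use the birth date as a key when month and day are both present,
    # as suggested above; partial dates are too common to trust.
    bd = person.get("birth")  # expected as (year, month, day) or None
    if bd and bd[1] and bd[2]:
        return bd
    return None

def match_on_birthdate(tree_a, tree_b):
    # Index one tree by birth date, then probe with the other.
    index = {}
    for p in tree_a:
        key = birthdate_key(p)
        if key:
            index.setdefault(key, []).append(p)
    pairs = []
    for p in tree_b:
        key = birthdate_key(p)
        if key:
            for candidate in index.get(key, []):
                pairs.append((candidate, p))
    return pairs
```

Note this finds candidates that an exact name match misses entirely, such as a nickname form of the same person born the same day.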
But the right way to do it, when he has time, is to use true fuzzy matching algorithms, something I spent quite a bit of time on, way back. For each data type you develop normalization routines that ensure you are always comparing apples to apples, no matter how someone entered the data. Then you develop strategies for identifying potential matches, along with fuzzy comparison routines for each data type that score the degree of matching, and you add the scores together to produce a record-matching score. You set a low score, below which it's "not a match," and a high score, above which it's a "clear match"; everything in between goes to the user to review and confirm or reject. Developing this kind of strategy takes time and requires lots of testing, feedback, and tweaking of the scoring and algorithms. But I was able to get records to match and merge that had absolutely no exactly matching fields, in the tougher field of library cataloging: merging poor-quality, locally entered records into high-quality OCLC, UTLAS, MARCive, LC, etc. records, with hundreds of field types, many ways to enter things, and very different quality levels.
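The scoring scheme described above can be sketched in a few lines. This is only an illustration, not the algorithm I actually used: it borrows Python's standard-library SequenceMatcher as a stand-in for real per-field fuzzy comparators, and the weights and thresholds are invented; a real system needs the testing and tweaking the paragraph describes.

```python
from difflib import SequenceMatcher

def field_score(a, b):
    # Per-field fuzzy similarity in [0, 1]; missing data scores zero.
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Illustrative weights and thresholds -- these would be tuned with real data.
WEIGHTS = {"name": 3.0, "birth": 2.0, "place": 1.0}
LOW, HIGH = 0.40, 0.85

def record_score(rec_a, rec_b):
    # Weighted sum of field scores, normalized back to [0, 1].
    total = sum(WEIGHTS[f] * field_score(rec_a.get(f, ""), rec_b.get(f, ""))
                for f in WEIGHTS)
    return total / sum(WEIGHTS.values())

def classify(rec_a, rec_b):
    # Three-way outcome: reject, accept, or queue for human review.
    s = record_score(rec_a, rec_b)
    if s < LOW:
        return "not a match"
    if s > HIGH:
        return "clear match"
    return "review"
```

The three-way split is the key design point: the automatic thresholds handle the easy bulk, and only the genuinely ambiguous middle band costs user attention.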