Google & Publish online?


22 replies to this topic

#1 Don Newcomb

Don Newcomb

    Advanced Member

  • Members
  • 1045 posts

Posted 07 February 2016 - 08:37 AM

I'm curious whether anyone has found a way to make an RM6 Publish Online website searchable by Google and other web search engines. I was thinking of a list of all the names in the database embedded in some sort of HTML keyword field, with a redirect to the index.htm file. Google could search and index the keyword list. When users clicked the link, they'd get a "redirecting" message and be dropped onto your home page. Unfortunately, I don't know enough about HTML to make that work. Embedding the keyword table in index.html would probably make it too slow to load, particularly for large databases. Thoughts?



#2 mleroux

mleroux

    Advanced Member

  • Members
  • 68 posts

Posted 27 February 2016 - 10:43 AM

Don, I am a new RM7 user, so I hope my response relates to RM6.

I'm presuming that "Publish Online" is the same as the RM7 "MyRootsMagic Publish Online", where the site is hosted on an RM-provided server and the pages are generated from a copy of your database.

In my environment I don't see any potential to modify the home page other than adding links.

If you have access to another web server that can serve HTML, you could pretty easily create a list of names that hyperlink to the detail pages on your RM server.

 

An easy way is to export the names and ID numbers into Excel. I don't know how to do it from RM, but it's very straightforward to do from SQL.

 

In Excel, build a string that creates the hyperlinks. If the surname is in column A, the first name in column B and the ID in column C, it would look like:

="<p><a href=http://sites.rootsma...ividual.php?p="& C1 &  " target='_blank'>" & A1 & ", " & B1 & "</a></p>"

 

This will create an html string that will list the name and open up a link to your RM page in a new window on the browser. It's in a new window so that the indexing robot can fork a branch without having a return path since there are no "back" links on the RM pages.

 

If you are going to get the names via SQL, you could skip the Excel step and create the links directly from SQL:

 

-- one <p><a ...> line per name, sorted by surname and then given name
select '<p><a href=http://sites.rootsma...ividual.php?p=' || ownerid
        || ' target="_blank">' || surname || ', ' || given || '</a></p>' as [List of names in my Database]
from nameTable
order by surname, given;
 

You should be able to paste the result into any HTML editor (or save it in a text file with an .html extension and open it in Word - I think that should work), edit what you want and then publish.
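
For anyone who would rather script the whole step, here is a minimal Python sketch of the same idea. It assumes the table and column names used in the query above (NameTable, OwnerID, Surname, Given). BASE_URL is a hypothetical placeholder, not the real site address, and the function names are made up for illustration. Sorting is done in Python so the sketch does not depend on any custom collation the database file may define.

```python
import html
import sqlite3

# Hypothetical placeholder: substitute the real address of your individual pages.
BASE_URL = "http://example.com/individual.php?p="


def build_name_links(conn, base_url=BASE_URL):
    """Return one '<p><a ...>Surname, Given</a></p>' line per row of NameTable."""
    rows = conn.execute("SELECT OwnerID, Surname, Given FROM NameTable").fetchall()
    # Sort here rather than in SQL; missing names are treated as empty strings.
    rows.sort(key=lambda r: ((r[1] or "").casefold(), (r[2] or "").casefold()))
    lines = []
    for owner_id, surname, given in rows:
        # Escape the text so characters such as & or < cannot break the markup.
        label = html.escape(f"{surname or ''}, {given or ''}")
        href = html.escape(f"{base_url}{owner_id}", quote=True)
        lines.append(f'<p><a href="{href}" target="_blank">{label}</a></p>')
    return lines


def write_name_page(lines, path="myNames.html"):
    """Wrap the link lines in a minimal HTML page and save it to disk."""
    body = "\n".join(lines)
    page = (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>List of names in my database</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )
    with open(path, "w", encoding="utf-8") as f:
        f.write(page)
```

Usage would be along the lines of `write_name_page(build_name_links(sqlite3.connect("copy_of_my_database.rmgc")))`, run against a copy of the database rather than the live file.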

 

It sounds a lot more complicated than it is.

 

The key thing is that you need a place to publish this list of names, and it would have to be some place where Google can find it - meaning something has to link to this page.

 

To the other part of your question: I only have 3,500+ names, and the first load of the page took under a second, so I wouldn't be too worried about performance. You could also split the page into several (e.g. A-C, D-F) and have links between them.
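
If the list does get long enough to split, the A-C / D-F grouping could be sketched like this in Python. The function names and the three-letter bucket size are only assumptions for illustration. Surnames that do not start with A-Z (including empty ones) go into an "Other" group.

```python
import string


def letter_buckets(size=3):
    """Return alphabet ranges such as ('A', 'C'), ('D', 'F'), ... of the given size."""
    letters = string.ascii_uppercase
    return [
        (letters[i], letters[min(i + size, len(letters)) - 1])
        for i in range(0, len(letters), size)
    ]


def split_names(names, size=3):
    """Group (surname, given) pairs by the letter range of the surname's first letter."""
    buckets = letter_buckets(size)
    groups = {f"{lo}-{hi}": [] for lo, hi in buckets}
    groups["Other"] = []
    for surname, given in sorted(names):
        first = surname[:1].upper()
        for lo, hi in buckets:
            if lo <= first <= hi:
                groups[f"{lo}-{hi}"].append((surname, given))
                break
        else:
            # No range matched: empty surname or a first letter outside A-Z.
            groups["Other"].append((surname, given))
    return groups
```

Each non-empty group would then become its own page, with a row of links to the other groups at the top.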

 

I hope this helps

 


Marc
Always learning and loving the discovery process. Focusing on the Huntingdon and Soulanges areas of Quebec - O'Connor/Leroux/Walsh/McCann/Savage/Lalonde/Lauzon


#3 Don Newcomb

Don Newcomb

    Advanced Member

  • Members
  • 1045 posts

Posted 09 March 2016 - 07:46 AM

Thanks mleroux. When you create a "Publish Online" site, RM creates a set of files in the target directory. When you "publish" it, RM bundles up the files and ships them off to their server. I believe that the server somehow optimizes the thousands of tiny data files so that they don't eat up massive amounts of storage with unused sectors. I don't know whether the "publish" process will also upload any .html files in your directory. You can also upload all the files to your own server and it seems to work just as well. I thought about just creating a file with an HTML "keywords" meta:

meta name="keywords" content=

and a huge list of names followed by a redirect back to the index.html page. This file would be referenced somewhere in the index.html so that search engines could find it but if accessed would just redirect back to the main page. The user could then use the name index to find the individual.

 

I presently have about 15K names in my database. 



#4 Ludlow Bay

Ludlow Bay

    Advanced Member

  • Members
  • 868 posts

Posted 09 March 2016 - 09:23 AM

Webmaster Central Blog

 

https://webmasters.g...s-meta-tag.html



#5 mleroux

mleroux

    Advanced Member

  • Members
  • 68 posts

Posted 09 March 2016 - 09:36 AM

Don, it looks like the process is different with RM6. In RM7, if you publish online there is no local interim step; it just publishes directly, so there is no opportunity to add anything. In your case a simple test would be to add a new HTML file to the root of your directory and try to access it from a browser. Create a small HTML file that just says "Hello", save it as myNames.html and publish your site. If you access the site and go to <path>/myNames.html it will either work or not. Either way gives you a new data point.

 

If it works, then create a file as I described earlier with all the names linking back to either the individual or the index, if you prefer. I'd suggest this over a long list in a content keyword because the search engine will stop reading after a fairly small number of characters - somewhere between 156 and 256. Also, search engines do not like redirect pages; links have a much better chance of being indexed. Keep your content keyword to a descriptive sentence (the content keyword is what will show up in the Google text) that contains your major family branches, but keep it relatively short.

 

Just my $0.02


Marc


#6 Trebor22

Trebor22

    Advanced Member

  • Members
  • 190 posts

Posted 10 March 2016 - 06:00 AM

Hope mleroux's suggestion works OK, but if not, a possible fix would be to host your name index elsewhere (plenty of free webspace out there) and have it point to the home page of your tree - a bit long-winded, but perhaps an 'interim fix'?



#7 mleroux

mleroux

    Advanced Member

  • Members
  • 68 posts

Posted 10 March 2016 - 01:08 PM

Hope mleroux's suggestion works OK, but if not, a possible fix would be to host your name index elsewhere (plenty of free webspace out there) and have it point to the home page of your tree - a bit long-winded, but perhaps an 'interim fix'?

 

Agreed - that was my initial suggestion as well


Marc


#8 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3567 posts

Posted 10 March 2016 - 04:19 PM

I only have an iPad this week and no access to my RM, so I'm going by memory. RM7 web pages upload your real RM database (all of it, unless you make a copy that is a subset), can be hosted only at MyRootsmagic.com, and are searchable by Google. RM6 web pages can be made by RM6 or RM7, use a unique and proprietary data format and PHP or something similar to display the data (I can't remember the details), can be hosted anywhere, and cannot be searched by Google. HTML web pages are static HTML, can be hosted anywhere, and can be searched by Google.

The crux of the problem is that there is no RM6/RM7 solution that is both searchable by Google and can be hosted anywhere. There really isn't. MyRootsmagic is very limited in size, which forces people to the RM6 solution, even from RM7. Hence the desire for a way to make RM6 websites searchable by Google. I really don't think there is a way.

The only other solutions are HTML websites or uploading GEDCOM to third party web sites. I think RM's HTML web sites are great, but I think I'm the only person in the world who thinks so.

Jerry



#9 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 11 March 2016 - 08:12 AM

The RM6 Published website uses XHTML pages. It is my (faulty?) recollection that Google started indexing this type a year or so ago.

Tom user of RM7550 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite_Tools_For_Roots_Magic_in_PR_Celti wiki, exploiting the database in special ways >>> RMtrix app, a bundle of RootsMagic utilities.


#10 kbens0n

kbens0n

    Advanced Member

  • Members
  • 3455 posts

Posted 11 March 2016 - 08:54 AM

The RM6 Published website uses XHTML pages. It is my (faulty?) recollection that Google started indexing this type a year or so ago.


Yes, RootsMagic 6 style websites (using XML and JavaScript) are somewhat crawled by Google and appear in Google's search results for some "term" searches.
For example:
This random site and page I chose http://www.mcgregor-...dual.html#13009
Then searched for a couple terms on that page https://www.google.c...xander" keeling
And a linked page came up in the search results http://www.mcgregor-mills.ca/b143.htm

Quick and unscientific, but shows some indexing occurs.

---
--- "GENEALOGY, n. An account of one's descent from an ancestor who did not particularly care to trace his own." - Ambrose Bierce
--- "The trouble ain't what people don't know, it's what they know that ain't so." - Josh Billings
---Ô¿Ô---
K e V i N


#11 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 11 March 2016 - 09:06 AM

Isn't that an HTML website, not the RM6 Published type?




#12 kbens0n

kbens0n

    Advanced Member

  • Members
  • 3455 posts

Posted 11 March 2016 - 09:49 AM

Isn't that an HTML website, not the RM6 Published type?


It's definitely not a pre-RootsMagic 6 style site because it's not a static HTML website (View Source shows JavaScript in use), and it's definitely not a MyRootsMagic (Publish Online) site because it's not at MyRootsMagic.com ...so I'm guessing it's just as I stated.



#13 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3567 posts

Posted 11 March 2016 - 03:33 PM

It's definitely an RM6 site, so I need to stand at least partially corrected about Google search. I didn't realize that Google had started crawling such pages. But I don't know how complete the crawling process is.

 

Jerry

 



#14 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3567 posts

Posted 11 March 2016 - 03:38 PM

My standing corrected needs to stand corrected. The example page was definitely RM6. But the results page that came up from searching text on the example page was static HTML from pre-RM6. When I searched for David Arthur James Alexander 1904 1977 I got no hits. So I still haven't seen an example of Google finding an RM6 page.

 

Jerry



#15 kbens0n

kbens0n

    Advanced Member

  • Members
  • 3455 posts

Posted 11 March 2016 - 05:22 PM

My standing corrected needs to stand corrected. The example page was definitely RM6. But the results page that came up from searching text on the example page was static HTML from pre-RM6. When I searched for David Arthur James Alexander 1904 1977 I got no hits. So I still haven't seen an example of Google finding an RM6 page.


Excellent observation! I concur completely. I only chose and checked one site in a very cursory way, and that site's owner coincidentally had the older style separately uploaded. Back to Google not crawling JavaScript/XML (NOT the same as XHTML, BTW). Apologies for injecting misinformation into this thread.



#16 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 11 March 2016 - 08:59 PM

I've crossed the Atlantic since my previous post but am still on iPad so a little constrained in what I can do, apart from being whacked by the change in time zone. I had forgotten that the RM6 Publish Online was JavaScript and XML and incorrectly described it as XHTML. And as we were about to board in Lisbon, I did not look at Kevin's random sample page, only at the result of his search. Perhaps my recollection about Google being able to index the RM6 JS/XML website is completely wrong.



#17 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 11 March 2016 - 09:31 PM

In 2013, the author of the RM6 JS/XML website generator said that indexing by search engines would be supported by a future version... See http://forums.rootsm...ine/#entry57007. I may have conflated that statement with Google Search having the capability to crawl these sites.



#18 Jerry Bryan

Jerry Bryan

    Advanced Member

  • Members
  • 3567 posts

Posted 12 March 2016 - 04:23 AM

In 2013, the author of the RM6 JS/XML website generator said that indexing by search engines would be supported by a future version... See http://forums.rootsm...ine/#entry57007. I may have conflated that statement with Google Search having the capability to crawl these sites.

 

I think the future version being referred to where the site can be indexed by search engines is the RM7 version where the data is in an rmgc file. So I doubt that a future version of the RM6 style Web pages will ever be indexable. It remains to be seen how RM will make indexable Web pages that can be hosted at sites other than MyRootsmagic.

 

Jerry



#19 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 12 March 2016 - 04:45 PM

Searching the Forum using the keywords "Google Analytics" (sans ") is a great memory refresher. The reason I thought that Google had advanced in its potential to crawl RM6 JavaScript/XML websites is that kbens0n posted a link to a 2014 Google blog post about how they were working on crawling JavaScript and CSS and were developing a tool to help webmasters evaluate whether their sites were crawler-friendly.

 

I searched those keywords today because I had come across Google's Search Console and had a look at the experimental RM6 website to which I had applied Google Analytics back in 2014. It was forgotten because RM7 came out and it and other things distracted me from that dead-end website. I'm not proficient with these Google tools and the Search Console seems completely new to me, but it probably existed back then. What has likely been enhanced is the "Fetch and Render as Google" feature, which shows how Google sees a page and compares that to how a browser sees it. What it shows for the Name Index and Pedigree pages is not encouraging - the crawler may not get much beyond the home page.
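
A crude way to approximate the same check offline is to look only at the text present in the raw HTML of a page, without running any JavaScript. The Python sketch below is just that: an approximation, not a description of how Google's crawler actually works. The function name is made up for illustration. Names that are only filled in by script after the page loads will show up as missing.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text from the markup, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def names_in_static_html(raw_html, names):
    """Map each name to True if it appears in the page's static text (case-insensitive)."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    # Collapse runs of whitespace so names split across line breaks still match.
    text = " ".join(" ".join(parser.parts).split()).casefold()
    return {name: name.casefold() in text for name in names}
```

Feeding it the saved source of the Name Index page and a handful of names from the database gives a quick idea of how much a non-rendering crawler could see.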

 

The Forum search results include a discussion about supplementing the website with a page having links to each individual, which would be crawled.




#20 TomH

TomH

    Advanced Member

  • Members
  • 6250 posts

Posted 14 March 2016 - 03:16 PM

I've added a Google Analytics report screenshot to the Jan 2014 discussion about adding Google Analytics tracking code to an RM6 Publish Online website. Having just recently enabled the Google Search Console for this website, it does not yet show any subsequent indexing activity, but I do see results from the following Google Search terms (sans quotes):

 

"rootsmagic analytics"

"James HOLDEN 29 Feb 1828"

 

Now that may be because Google knows so much about me that it offers them up. I'm curious if you see them in your results.

 

There is no evidence that Google has crawled below the top level pages.

