
What to do when citation URLs change?


9 replies to this topic

#1 kidhazy


Posted 21 September 2020 - 12:43 AM

I know the Internet is not static or permanent, and over time URLs we store in sources or citations may become invalid as websites change and disappear.  That's why we specify a date as to when we referenced a website, and I grab as much info as possible - because you never know when it will go.

 

However, if in your Internet travels you notice that a website (in my case an Online Newspaper Obituary site) has changed the URLs to the entries that you had created citations for - what do you do?

 

Do you spend the time changing them to the new URL?

 

Do you just leave it with a broken URL?

 

I've only got about 130 citations from this source with various specific URLs (some are duplicates as the citations have been reused) so I could possibly spend the time to correct them.

 

But is there a find/replace option within citations? Or out in SQLite land?

In my case, there's no pattern to the URL changes, so they would need to be done individually, but in some cases one URL is used in multiple citations, so the total may be around 60 or fewer.
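
When the changes follow no pattern, one possible approach is an explicit old-to-new lookup applied in a single transaction, so a URL reused across several citations is corrected once. The sketch below is illustrative only: it uses Python's standard `sqlite3` module against an in-memory database with a hypothetical `citation(id, url)` table, which is an assumption and not the RootsMagic schema, and the example.com URLs are placeholders.

```python
import sqlite3


def apply_url_map(conn, url_map):
    """Replace each old URL with its new URL; return rows changed per old URL."""
    changed = {}
    with conn:  # commits on success, rolls back if any statement fails
        for old, new in url_map.items():
            cur = conn.execute(
                "UPDATE citation SET url = ? WHERE url = ?", (new, old)
            )
            changed[old] = cur.rowcount
    return changed


# Hypothetical table and placeholder data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citation (id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany(
    "INSERT INTO citation (url) VALUES (?)",
    [
        ("https://example.com/obit/old-1",),
        ("https://example.com/obit/old-1",),  # same URL reused by a second citation
        ("https://example.com/obit/old-2",),
    ],
)

url_map = {
    "https://example.com/obit/old-1": "https://example.com/notices/a1",
    "https://example.com/obit/old-2": "https://example.com/notices/b2",
}
result = apply_url_map(conn, url_map)
print(result)
```

The per-URL row counts make it easy to spot a mapping entry that matched nothing. Anything like this would be safest to try on a copy of the database file first.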

 

Anyone gone through this process?

 

Thanks.



#2 JimDavis79


Posted 21 September 2020 - 05:58 AM

My practice is two-fold: leave the URL as it was when I accessed it (just like leaving place names as they were on a particular date), and also save a copy of the site information as a PDF that I use as a media exhibit supporting the source. The first practice is like citing a book that is out of print and for which a copy cannot be found.


Best regards, Jim

"When you shake my family tree, nuts fall out."


#3 Jerry Bryan


Posted 21 September 2020 - 06:48 AM

I'm very suspicious of URLs because in general they are not very persistent. Well, I guess it depends in part on how specific the URL is. For example, I'm dubious that something like www.ancestry.com will still be around 100 years from now if somebody finds a copy of one of my family reunion reports in an old box in an attic. But I'm even more dubious that a "long" URL pointing to a specific record will still be around 100 years from now, a URL such as https://www.ancestry...cf89&pId=987671

 

Therefore I don't even record URLs in general, except that I treat something like www.ancestry.com as a repository. For a site such as Ancestry with multiple databases, I record the specific database as part of recording the repository. In this case the specific database is Tennessee, Death Records, 1908-1965. But you can't tell the exact database just from looking at the URL; you would have to plug the "long" URL into a browser to find out what it points to.

 

And no matter what, I always download the images and link them into RM. For text-only databases, I do a screen capture to make my image. Screens can be hard to capture if they require scrolling, so I use Snagit as my screen capture tool of choice. It's not free (and there are many free screen capture tools), but it's powerful enough to capture all of a screen that requires scrolling.

 

Jerry 



#4 KFN


Posted 21 September 2020 - 11:22 AM

In general I don’t save URLs for my source information, for the reasons Jerry points out.

 

I always find a way to capture an image or copy of the source material from the site and add it to my database. Anything other than that runs the risk of not being supported in the future.



#5 J P


Posted 21 September 2020 - 02:51 PM

KFN wrote:-

In general I don’t save URLs for my source information, for the reasons Jerry points out.

I always find a way to capture an image or copy of the source material from the site and add it to my database. Anything other than that runs the risk of not being supported in the future.

The other issue I have come across relates to sites like Ancestry that have country-specific variations (US, UK, Australia, etc.) to which individual accounts are tied. In my case (UK-based), it’s no use coming across a UK census source, say, on FamilySearch with a URL tied to Ancestry.com when I can only access it via Ancestry.co.uk. Yes, I can edit the “com” to “co.uk” and then see the source, but that soon gets very tedious. So URLs are a problem from several angles.
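
The manual “com” to “co.uk” edit could be scripted. The following is a minimal sketch using Python's standard `urllib.parse`; the function name `swap_host` and the example path are hypothetical, and it assumes the rest of the URL is valid unchanged on the other country site, which may not hold for every record.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host(url, old_host, new_host):
    """Return url with old_host replaced by new_host; other URLs pass through unchanged."""
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url
    # Only the host changes; scheme, path, query and fragment are kept.
    # Note: any port or user info in the original netloc is dropped.
    return urlunsplit(parts._replace(netloc=new_host))


# Hypothetical path and query, for illustration only.
example = "https://www.ancestry.com/example/path?id=123"
print(swap_host(example, "www.ancestry.com", "www.ancestry.co.uk"))
```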



#6 Trebor22


Posted 22 September 2020 - 02:24 AM

JP wrote:-

The other issue I have come across relates to sites like Ancestry that have country-specific variations (US, UK, Australia, etc.) to which individual accounts are tied. In my case (UK-based), it’s no use coming across a UK census source, say, on FamilySearch with a URL tied to Ancestry.com when I can only access it via Ancestry.co.uk. Yes, I can edit the “com” to “co.uk” and then see the source, but that soon gets very tedious. So URLs are a problem from several angles.

 

I am subscribed to Ancestry.co.uk but can sign in and use the .com site with my 'uk' sign-in. I have a worldwide subscription, so I wonder whether access might not be the same at all subscription levels?

I also share the view that specific URLs are too transient to be of long-term use. As others have said, I think a copy of the document linked to the database is the best option, but, mindful of copyright, I will be pleased to see an option in RM8 to select whether an image is exported or private. I will be very disappointed if it's missing!



#7 J P


Posted 22 September 2020 - 02:34 AM

Trebor22 wrote:-

I am subscribed to Ancestry.co.uk but can sign in and use the .com site with my 'uk' sign-in. I have a worldwide subscription, so I wonder whether access might not be the same at all subscription levels?

Well, you learn something every day. I don’t have worldwide access, only the middle level (for a UK user), and it turns out that I can also log in to ancestry.com with my “UK” user ID and password and access at least the same things I can access on the UK site. How many years have I been working at this?
 

MANY THANKS for the clarification.



#8 kidhazy


Posted 22 September 2020 - 08:56 PM

Thanks all for the tips/suggestions/thoughts.  Like JP we're always learning new things, and ways of improving our research and recording.

 

I too use Snagit like Jerry, and I only discovered a couple of days ago that it can also do OCR to convert images to text. That definitely helps with transcribing scanned documents.



#9 Jerry Bryan


Posted 23 September 2020 - 06:31 AM

kidhazy wrote:-

I too use Snagit like Jerry, and I only discovered a couple of days ago that it can also do OCR to convert images to text. That definitely helps with transcribing scanned documents.

 

There is lots of software that will do OCR. Some does it better than others. Some OCR software is free and some is not. I didn't purchase Snagit because it does OCR. I purchased it because of its extreme power in doing screen captures, much better than what is free with Windows. But I recently discovered that it does OCR better than most other software I have tried. Even so, it sometimes helps to clean up an image before doing OCR. By cleanup, I mean removing non-text such as photos and other graphics. OCR software tries to do this for itself, but if you help it along a bit, it can greatly improve the quality of the OCR. Even with the best OCR, I always treat the results as a first draft. OCR can be a great saver of time and aggravation, but it can nearly always use some human improvement.

 

Just in general (and to keep this a little about RM), my sense is that a lot of RM users try to transcribe into RM's note windows. I think that's a very difficult way to work for all kinds of reasons. I usually transcribe into a Notepad window and then copy and paste from Notepad into RM. Notepad is a very simple editor. I can easily make its window any size I wish, keep it open on top of the document I am transcribing, and move it around to keep the text I'm transcribing visible. In the case of text that I grab with Snagit ("grab" is what Snagit calls its OCR process), I first paste the grabbed text into Notepad for further cleanup before copying and pasting into RM.

 

Jerry

 

P.S. And for those Notepad++ fans out there, I am a big fan of Notepad++ as well and I use it a great deal. But for something like transcribing a document into text, I think Notepad++ is overkill, and I actually think Notepad is the better tool for this particular task.



#10 TomH


Posted 24 September 2020 - 08:34 AM

kidhazy wrote:-

But is there a find/replace option within citations? Or out in SQLite land?

Within RM, the Find Everywhere (sic) tool might give you results with the search term "http" but "Everywhere" is an overstatement. 

 

SQLite can search more broadly than RM has seen fit to implement, see 

Search – Find Almost Everywhere
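
As a rough illustration of what a broader SQLite search can look like (this is not the query from the linked page), the sketch below scans every text value in every table for a substring. It uses Python's standard `sqlite3` module; the function name `find_everywhere`, the two demo tables, and their contents are hypothetical and are not the RootsMagic schema.

```python
import sqlite3


def quote(ident):
    """Quote an SQL identifier so unusual table or column names are safe to embed."""
    return '"' + ident.replace('"', '""') + '"'


def find_everywhere(conn, term):
    """Return (table, column, rowid, value) for each text value containing term."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]
    for table in tables:
        t = quote(table)
        # Column 1 of each PRAGMA table_info row is the column name.
        columns = [r[1] for r in conn.execute(f"PRAGMA table_info({t})")]
        for col in columns:
            c = quote(col)
            # instr() is case-sensitive; tables declared WITHOUT ROWID are not handled.
            query = (
                f"SELECT rowid, {c} FROM {t} "
                f"WHERE typeof({c}) = 'text' AND instr({c}, ?) > 0"
            )
            for rowid, value in conn.execute(query, (term,)):
                hits.append((table, col, rowid, value))
    return hits


# Hypothetical demo tables and placeholder data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT, note TEXT)")
conn.execute("CREATE TABLE citation (id INTEGER PRIMARY KEY, comment TEXT)")
conn.execute(
    "INSERT INTO source (name, note) VALUES (?, ?)",
    ("Obituary site", "See https://example.com/obit/1"),
)
conn.executemany(
    "INSERT INTO citation (comment) VALUES (?)",
    [("no link here",), ("http://example.com/x",)],
)

hits = find_everywhere(conn, "http")
for hit in hits:
    print(hit)
```

A listing like this only locates candidate values; deciding which ones to change is still a manual step.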

Tom user of RM7630 FTM2017 Ancestry.ca FamilySearch.org FindMyPast.com
SQLite Tools for RootsMagic wiki, exploiting the database in special ways >>> RMtrix app, a bundle of RootsMagic utilities.