If you are here, you probably saw the blog post announcing that Ancestry users can now find relevant records about their ancestors, even when those records are on other websites.
Our users keep telling us they want more content to help them understand their family history and build and source their family trees. At Ancestry.com, we are continually digitizing and adding more records to our service to help our users with their family history. At the same time, we realize that Ancestry.com will never hold all of the content out there, and that many organizations and individuals are publishing a lot of complementary genealogical and historical content. We have also heard from a number of these publishers that they want more people to find their websites and this information.
This information lives on thousands of websites but, as a family historian, it isn’t always easy to find and it can take a lot of effort to figure out which websites contain information about your ancestors. We wanted a way to help Ancestry.com users find this great content. At the same time, we wanted it to be a win for the website publishers.
Last fall, we launched this idea in Ancestry.com Labs. We provided the first glimpse of how we might approach this. We got feedback from many users and publishers. Some of it was very glowing and some of it definitely wasn’t. We listened to all of it to try to develop an experience that was helpful to users and, at the same time, respectful to publishers. It’s not easy trying to balance all of the many ideas and needs but we hope we are doing it right.
PRINCIPLES OF WEB SEARCH
From the start, many publishers told us they want both credit and traffic from any such service, so they want us to make it easy for users to find their records and get to their sites.
To accomplish this, and in the spirit of openness, we want to state up front the key principles we are following as we build this service. For simplicity, I will refer to the system that searches this information as “web search” and the content itself as “web records.”
• Free access to web records – Users do not have to subscribe or even register with Ancestry.com to search and view these records.
• We will always strive to follow web standards for web crawling permissions. For example, some websites have a robots.txt file that instructs search engines (like Google) to not crawl the site, or to only crawl certain areas.
• Proper attribution of web records to content publishers – We will link prominently to the original site within the search experience.
• We have processes in place to remove content from the index if a website owner asks us to, and we will publish how to contact our team to do this. Website owners can also contact us to ask questions or to request that their site be indexed – see this page to learn how to contact us: http://www.ancestry.com/websearch
• Ancestry.com users will be able to save key information to their trees, but the saved information will list the website as the source and will include an easy way to link back to the original site.
We really want to maintain trust and openness with our users and with website publishers. This is part of an ongoing dialog that started last year and continues with this launch.
If you have questions, concerns, or suggestions, please feel free to comment here on these boards or email us at firstname.lastname@example.org. Either way, we will respond.
We are initially launching search for only a few websites because we want to keep letting people know what we are doing and to learn as we go from our users and from you, the people and organizations that publish these websites. As we continue to learn, you will see this service expand to include more websites.
HOW DO I FIND OUT WHAT INFORMATION OR WEBSITES YOU HAVE SEARCHED?
This information will show up in search results on Ancestry. To see an example of this, you can do a general (global) search on Louise M Chrisman in Indiana, USA. This link will show the search results: http://search.ancestry.com/cgi-bin/sse.dll?gl=ROOT_CATEGORY&...
Looking at the search results, you will see a record from “Web: Allen County, Indiana Deaths 1870-1920.” We use the “Web:” prefix to denote records found via web search and to let users know that they are not from Ancestry.com. If you find these records through global search, you will also find a link that goes straight to the website.
You can also go to the card catalog and search on the word “web” in the title to see a list of what has been indexed. If you search just on one collection, you will find a link to the website on the search form and on the index page.
Here is an example of a way to search just one web collection: http://search.ancestry.com/search/db.aspx?dbid=70001
HOW DO I PREVENT ANCESTRY.COM FROM INDEXING MY SITE?
The easiest way to prevent us, or other search engines, from crawling your site is to add instructions to the website’s robots.txt file. Ancestry.com’s crawler is called “ancestrybot”. Through this file you can tell Ancestry and other search engines which areas of your site you want searched and which you do not. You can also restrict all search engines, or just specific ones, like ours. Because we follow web standards, we will honor the instructions in the robots.txt file.
To learn more about how to use a robots.txt file, there are many sites with tutorials or instructions. Here is one such site: http://www.robotstxt.org/robotstxt.html
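As a rough illustration of the kind of instructions described above, the sketch below uses Python's standard urllib.robotparser module to show how a standards-following crawler would read two sample robots.txt files. The crawler name ancestrybot comes from this post; the example.org URLs, the /members/ path, the examplebot name, and the parse_robots helper are made up for illustration, and this is not Ancestry.com's actual crawler code.

```python
from urllib.robotparser import RobotFileParser

# Sample 1: keep ancestrybot out of the whole site, allow everyone else.
BLOCK_ANCESTRY_ONLY = """\
User-agent: ancestrybot
Disallow: /

User-agent: *
Disallow:
"""

# Sample 2: keep ancestrybot out of one (hypothetical) area only.
BLOCK_ONE_AREA = """\
User-agent: ancestrybot
Disallow: /members/
"""


def parse_robots(text):
    """Parse robots.txt text and return a parser that can be queried."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    whole_site = parse_robots(BLOCK_ANCESTRY_ONLY)
    # ancestrybot is blocked everywhere; a different crawler is not.
    print(whole_site.can_fetch("ancestrybot", "http://example.org/records/"))  # False
    print(whole_site.can_fetch("examplebot", "http://example.org/records/"))   # True

    one_area = parse_robots(BLOCK_ONE_AREA)
    # Only the /members/ area is off limits to ancestrybot.
    print(one_area.can_fetch("ancestrybot", "http://example.org/members/list.html"))       # False
    print(one_area.can_fetch("ancestrybot", "http://example.org/cemeteries/index.html"))   # True
```

The text inside each sample string is what would go in the robots.txt file at the root of your site; the Python around it only checks how those rules are interpreted.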
If your site is included in the search but you don’t want it there, you may contact us at email@example.com and we will remove it from our search index.
HOW DO I ASK ANCESTRY.COM TO INCLUDE MY WEBSITE IN THE SEARCH?
You can contact us at the same email address, firstname.lastname@example.org, to ask us to include your site in the index. Please also make sure that your robots.txt doesn’t prevent crawling of your site.
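Before asking to be included, one way to check that your robots.txt is not blocking the pages you care about is a small script like the sketch below. It again relies on Python's standard urllib.robotparser; the blocked_paths helper, the sample rules, and the page paths are hypothetical, and the check only approximates how any particular crawler applies the rules.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="ancestrybot"):
    """Return the paths that the given robots.txt text forbids for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


# Hypothetical robots.txt that applies the same rules to every crawler.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /records/
"""

if __name__ == "__main__":
    pages = ["/", "/about.html", "/records/deaths.html"]
    # Any path listed here would need its Disallow rule relaxed before
    # a crawler named ancestrybot could index it.
    print(blocked_paths(SAMPLE_ROBOTS, pages))
```

To check a live site instead of pasted text, RobotFileParser also offers set_url() and read(), which download the robots.txt file before you call can_fetch().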
As I stated at the start of this long post, we hope this is a great service for family historians and for web publishers as well. We expect it will help many more people find your website and the great content you are publishing and hope that we have done this in a way that is win-win for everyone.
We are happy to answer questions, either here on this message board or via the email address listed above.
Please try out the service, kick the tires, and let us know what you think.
Director of Search Product Management
Search Product Manager