Request: wider date range for searches

Replies: 12

Re: Request: wider date range for searches

Posted: 27 Apr 2013 11:16AM GMT
Classification: Query
Edited: 27 Apr 2013 11:18AM GMT
The server overload messages you see are more likely caused by the number of people accessing Ancestry, the number of family trees, and the number of record databases than by the complexity of search expressions in general, and especially not by the complexity of your specific search expression. What the servers are doing at that moment for everyone else matters FAR, FAR more than what they are being asked to do for you. That is why you get the overload errors on even very simple requests, like showing the next post in a thread.

But assuming you want to be a thoughtful citizen and not add too much to the background activity that affects other users, this change is still unlikely to have much impact on it.

I imagine that all of Ancestry's databases are preindexed by all possible search terms, and that boolean set-intersection operations are then done on the results of the per-field matches. (That is, e.g., why they require 3 characters in every wildcard expression: 1- or 2-character substrings would create result sets too large to fit in their prebuilt indexes.)

Such databases generally have sortable record keys, presort both the indexes (by search terms) and the index result sets (by record keys), and use binary search in both the index lookup and the set intersection tests.
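To make that concrete, here is a minimal sketch of intersecting two presorted result sets by looking each key up with repeated halving (binary search). It is only an illustration of the general technique, not Ancestry's actual code; the names surname_hits and year_hits and the record keys in them are made up for the example.

```python
from bisect import bisect_left

def contains(sorted_keys, key):
    """Binary search: True if key is present in the sorted list."""
    i = bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key

def intersect(small, large):
    """Keep the keys of `small` that also appear in `large` (both sorted)."""
    return [k for k in small if contains(large, k)]

# Hypothetical per-field match lists, already sorted by record key.
surname_hits = [3, 15, 42, 57, 90]
year_hits = [1, 3, 4, 8, 15, 16, 23, 42, 50, 60]

matches = intersect(surname_hits, year_hits)
print(matches)
```

Each lookup only has to halve year_hits a handful of times, so the cost of a lookup grows very slowly as the date-range result set gets bigger.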

Therefore doubling the size of any set adds only one extra test, quadrupling the size of the set adds only two extra tests, etc. (E.g., if I am searching for key Z in set S, set S will already be sorted so that all the keys less than or equal to the midpoint are in one half, and all the keys greater than the midpoint are in the other half, in a data structure set up so that it only takes one test to get the appropriate half, with each half similarly sorted.)
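A quick way to check that arithmetic is to count how many times a sorted set can be halved before one key is left. This is a sketch under the assumption that each test discards half of the remaining keys; halving_steps is a name invented here, and the set sizes are arbitrary.

```python
def halving_steps(n):
    """Worst-case number of halvings needed to narrow n sorted keys to one."""
    steps = 0
    while n > 1:
        n = (n + 1) // 2  # keep the larger half
        steps += 1
    return steps

base = halving_steps(10_000)
doubled = halving_steps(20_000)
quadrupled = halving_steps(40_000)
print(base, doubled, quadrupled)
```

With these sizes the doubled set needs one more step than the base set, and the quadrupled set needs two more.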

So this change is unlikely to significantly hurt performance compared with using smaller date ranges. A 40-year range will typically take 2 more comparisons than a 10-year range. And it will improve performance in multi-database searches compared with using no date range at all (which is what I most often do when the actual desired date range is too wide), since more databases can be eliminated.

(I have been a professional software engineer for 35 years, and an Ancestry.com user since the first month they came online.)
