What the Google Mini will spider

Posted on January 10th, 2006 in Google Mini, Spidering by Paul

Google have a helpful list of the file formats the Google Mini will spider.

It’s worth checking what you want to spider before you consider the other factors behind buying a search appliance. Checking through the various file formats I have, I was surprised to see the Mini supports .wps files written by Microsoft Works for DOS. It’s not a difficult format to read, but it is an old format now: Works v2 is copyright 1988, if my memory serves me correctly. Personally I have a ton of old Works files and it’s nice to know something will still understand them. I told Google Desktop they were txt files with an odd extension, but that can have dubious results as some of the file is binary.
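
As a rough way to do that check, a short Python sketch along these lines could tally the file extensions under a directory, so the result can be compared by hand against Google's list. The function name `count_extensions` and the starting path are purely illustrative assumptions on my part; nothing here comes from Google or the Mini itself.

```python
import os
from collections import Counter


def count_extensions(root):
    """Tally file extensions (lower-cased) for every file under root."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns (stem, ".ext"); files with no extension give ""
            ext = os.path.splitext(name)[1].lower()
            counts[ext or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # "." is a placeholder; point it at the content you plan to spider.
    for ext, n in count_extensions(".").most_common():
        print(f"{n:6d}  {ext}")
```

Extensions are only a hint, of course. As with my Works files, what is actually inside the file matters more than what it is called.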

2 Responses to 'What the Google Mini will spider'

  1. C.H. Van said,

    on January 21st, 2006 at 6:52 am

    It’s my understanding that Google licenses a third-party filter for its appliance (as almost all search indexing companies do). I don’t know which one Google licenses, but both Verity Keyview and Stellent OutsideIn support this format, along with lots of others.


  2. on January 16th, 2007 at 2:06 pm

    [...] A question I’ve seen come up a lot which isn’t answered directly by my earlier post is whether the Google Mini or Search Appliance can spider raw XML. Unfortunately, no, it cannot. [...]
