Archive for January, 2007

You can’t spider XML with a Google Mini (so far)

January 16, 2007 in Google Mini,GSA | Comments (1)

A question I’ve seen come up a lot which isn’t answered directly by my earlier post is whether the Google Mini or Search Appliance can spider raw XML. Unfortunately, no, it cannot.

The Mini / Search Appliance can read the XML, but it takes it in as straight text, so any searching you do will look at node names, attributes and content, rather than just content.

The best I can suggest is to script an XSL transform over your XML that turns it into a small (or indeed large) site of web pages, then spider those with the appliance.
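As a rough illustration of the idea, here is a minimal sketch that turns XML records into standalone HTML pages for the appliance to crawl. It uses Python's standard library rather than an actual XSL transform. The element names (`item`, `title`) and the function names are assumptions made up for the example, so adjust them to match your own XML.

```python
import html
import xml.etree.ElementTree as ET


def record_to_html(record):
    """Render one XML record as an HTML page containing only its text content."""
    # "title" is a hypothetical child element; fall back if it is missing.
    title = record.findtext("title", default="Untitled")
    # Every other child with text becomes a paragraph; node names are dropped.
    paragraphs = [
        "<p>{}</p>".format(html.escape(child.text.strip()))
        for child in record
        if child.tag != "title" and child.text and child.text.strip()
    ]
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1>{1}</body></html>"
    ).format(html.escape(title), "".join(paragraphs))


def xml_to_pages(xml_text, record_tag="item"):
    """Return one HTML page per record element found in the XML."""
    root = ET.fromstring(xml_text)
    return [record_to_html(record) for record in root.iter(record_tag)]
```

Each returned string can be written to its own .html file on a web server, and the appliance then indexes the content without the node names and attributes getting in the way.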

Pages without titles get blank <T> nodes in XML

January 8, 2007 in Google Mini,GSA,XML API | Comments (1)

When using the XML API to your Google Mini or Search Appliance, if a page in the search results does not have a <title> in its HTML, then it does not have a ‘T’ node (in GSP/RES/R/T) in the XML returned for the search.

The XSLT controlling the look of the web frontend on the box automatically shows the URL of the page in place of the missing title (with the http:// taken off the start). With the XML API you can replace it with anything you like, but the URL fallback is certainly what the clients I’ve set it up for have preferred. Best of all would be for all pages to have a title, but there could well be some that slip through testing (when there is testing), so it’s best to be prepared for it.
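If you are consuming the XML API yourself, the same fallback can be sketched in a few lines. This is a minimal example using Python's standard library, assuming the results have already been fetched as a string. The function names are made up for the example, and the sample document used to try it out is heavily cut down from what the box really returns.

```python
import xml.etree.ElementTree as ET


def display_title(result):
    """Return the title for one R node, falling back to its URL."""
    title = result.findtext("T")
    # Covers both a missing T node and one that is present but blank.
    if title and title.strip():
        return title
    url = result.findtext("U", default="")
    # Mirror the frontend behaviour: take the http:// off the start.
    if url.startswith("http://"):
        url = url[len("http://"):]
    return url


def result_titles(xml_text):
    """Return a display title for every result under GSP/RES/R."""
    root = ET.fromstring(xml_text)  # root element is GSP
    return [display_title(result) for result in root.findall("RES/R")]
```

Checking for a blank node as well as a missing one costs nothing and means the code copes with either form of response.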

Google Mini: Searching Subcollections from the frontend

in Google Mini | Comments (3)

If you are creating new subcollections on your Google Mini (v1), and you want to search them from a drop-down menu on the frontend, you will need to refresh the frontend to do it. Sound confusing? Just follow these steps (the same goes for turning on the menu for searching subcollections in the first place):

  1. Log in to the admin area and click on the collection name, then ‘Edit’
  2. Click ‘Configure Serving’ then ‘Output Format’
  3. Click the little arrow next to ‘Search Box’
  4. Click the tickbox next to ‘include a menu to search by subcollection’
  5. Click ‘Save Page Layout Code’

Note: This can take a few minutes to have an effect on the frontend, so if you don’t see a change to your search page immediately, just hang on and try again in a few minutes.

If you were just trying to turn on the menu of subcollections, that’s all you need to do. However, if you’re trying to get a new subcollection to show in a menu that was already switched on, the steps above will have unticked the box and switched the menu off, so you need to do these two steps again:

  1. Click the tickbox next to ‘include a menu to search by subcollection’
  2. Click ‘Save Page Layout Code’

That will cause the menu to update. You don’t need to wait for the frontend to refresh in between; just go through the first steps to switch the menu off, save it, then re-tick the box and save again.