By Peter Clemons

Posted: 12 Jun 2003

Most of you probably already own NetWare Web Search Server. It's been shipping for free with NetWare since 5.1. And it's a pretty capable product.

Since the initial release, Web Search has included a little-known ability to "filter" the search result set based on the filename of the individual hits. Normally, a user performs a search using the &query=search+words parameter. But what if you wanted to see only Word or PDF documents? What about limiting search results to a particular path or server on your web site?

To "limit" or "filter" the search results to a particular filename, just include the &filefilter=filename query parameter in addition to the normal &query= parameter. For example, a search for Windows Auto Login Utility brings up 66 hits on Novell's web site when searching in the Cool Solutions index. If you include the additional &filefilter=tools query parameter, you limit the result set to only those documents that have the word tools somewhere in their filename (or path or extension). Most of them come from the directory.
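As a rough illustration, here is a minimal Python sketch that assembles such a search URL. The endpoint http://www.example.com/search and the helper name build_search_url are placeholders invented for this example; only the query and filefilter parameter names come from the text above, so substitute the search URL of your own server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the search URL of your own server.
SEARCH_URL = "http://www.example.com/search"


def build_search_url(query, filefilter=None):
    """Return a search URL, optionally narrowed by a filename filter."""
    params = [("query", query)]
    if filefilter:
        # Restrict hits to documents whose filename, path, or
        # extension contains this word.
        params.append(("filefilter", filefilter))
    # urlencode turns spaces into '+', matching the
    # &query=search+words form shown above.
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url("Windows Auto Login Utility", "tools"))
```

Leaving out the second argument produces an ordinary, unfiltered search URL.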

What if you wanted to skip the original query and just find all of the XML documents? Prior to NetWare 6.5, you still had to set the query operator to something, anything, that's in your files. You could use a high-frequency word such as "the", which appears in just about all of your files. But a better "query" might be the asterisk character (*) all by itself. It's one of Web Search's wildcard characters (the other one is the question mark, ?), and it tells Web Search to look for any word. So, to do a filename-only search, use &query=*&filefilter=XML.

The only problem with this is that the asterisk is VERY slow compared to normal words, but it's still acceptably fast if you have 50,000 or fewer documents. In NetWare 6.5, the behavior of the &filefilter= query operator has been modified so that if the normal &query= operator is missing, Web Search performs only a filename search, and it's very, very fast.
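The two forms of a filename-only search can be sketched in Python as follows. This is an illustration under assumptions: the endpoint, the function name filename_only_url, and the query_optional flag are all made up for the example, and it is up to you to know which NetWare release your server runs.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the search URL of your own server.
SEARCH_URL = "http://www.example.com/search"


def filename_only_url(filefilter, query_optional):
    """Build a filename-only search URL.

    query_optional should be True for NetWare 6.5, where the query
    parameter may be omitted, and False for earlier releases, where
    the '*' wildcard stands in as a match-anything query.
    """
    params = []
    if not query_optional:
        # Slow on large indexes, but required before NetWare 6.5.
        params.append(("query", "*"))
    params.append(("filefilter", filefilter))
    # safe="*" keeps the asterisk readable instead of encoding it as %2A.
    return SEARCH_URL + "?" + urlencode(params, safe="*")


print(filename_only_url("XML", query_optional=False))
print(filename_only_url("XML", query_optional=True))
```

The first call yields the &query=*&filefilter=XML form; the second drops the query parameter entirely.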

Note that Web Search finds "words", not symbols. So, all punctuation, symbols, and control characters are ignored when doing a search. In Web Search, the asterisk means "find any word". When you enter a filefilter such as *.exe, you're telling Web Search to find any word, followed by the word exe (with a word separator in between). In other words, just drop the asterisk: the filter still means the same thing, but runs 100 times faster.
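To make the point concrete, here is a small Python sketch that rewrites a filter the way the paragraph above suggests doing by hand. It is only a model of the behavior described here, not Web Search's actual parser: the function name simplify_filefilter is invented, and the ASCII-only notion of a "word" is an assumption.

```python
import re


def simplify_filefilter(pattern):
    """Split a filename pattern into words and drop a leading lone '*'.

    Assumption: a word is a run of ASCII letters, digits, or the
    wildcard characters '*' and '?'; everything else is a separator.
    """
    words = [w for w in re.split(r"[^A-Za-z0-9*?]+", pattern) if w]
    # A leading bare '*' only says "any word comes first", which
    # adds nothing to a filename match but makes the search slow.
    while words and words[0] == "*":
        words.pop(0)
    return words


print(simplify_filefilter("*.exe"))
print(simplify_filefilter("setup.exe"))
```

With this model, *.exe reduces to the single word exe, while setup.exe is treated as the two words setup and exe.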

In a post-NetWare 6.5 release, they're going to try to support a more natural filename search syntax. It's already pretty close, but not quite there.

For more information, see the documentation at:

