Novell Doc: QuickFinder Server ide

2.1 Linux

QuickFinder Server is now fully ported to the Linux operating system, using OpenSSL for cryptography and HTTPS support and PAM for authentication.
QuickFinder has the ability to index remote file servers.

Relevance of Search Results

Administrators can now artificially control the relevance of search results. This means that you can decide what hits should and should not show up high in the search results list. A new optional index weight value was also added to any index name passed on a query-by-query basis.
The internal relevance calculation has been improved to include a document’s date/time stamp and the depth of the item within the data store.
Together with other relevance algorithm improvements, all search results should now be even better

Index Weight

QuickFinder has the ability to specify an “index weight” value on a query-by-query basis. The new syntax is:

&index=indexName1;indexName2:###&index=indexName3:###;indexName4

The &index= parameter can be sent multiple times; each occurrence can specify multiple indexes; and each index can include an optional weight value (:###). Weight values can range from 1 to 200.

You can now choose to place emphasis on a particular index by boosting its index weight value without excluding other indexes from the search (as you have to do in the past). Now on a different part of the Web site, you can choose to boost another index from the list, yet always search the same indexes. In this manner, you are no longer forced to eliminate “other” results; you can now simply emphasize particular results as the needs arise. The new index weight values (:###) override the like-named defaults specified in the Index Definition.

The new index weight values can be used with the following parameters:

&index=
&collection=
&bbindex=
&bbcollection=
&expandindex=
&expandcollection=

Document Types and File Readers

QuickFinder now has the ability to index OpenOffice.org Write, Impress, Calc, Draw, and Math documents. It also directly recognizes over 100 OpenOffice.org-specific XML tags.
Improvements were made to the existing QuickFinder file readers, including better document titles and descriptions, protected/encrypted PDF files, XML files, and XHTML files.

Off-site Links

You can now direct the crawler to allow a certain number of offsite links during an index, and you can control both the depth of offsite links and the Web sites that should be specifically excluded from the offsite capability.

Why would you want to index files not on your Web site? The best answer is because your Web site points to them. Presumably, someone has taken the time to create a link from your site to a file on another site. That other file probably relates to something on your site. For examples:

News articles about your company on CNN* and Reuters* Web sites
Links to third-party products that relate to your products and services
Magazine articles about your products, services, and management
Gartner* studies about directories and servers
Links to benefits providers, such as life insurance, health care, and 401K plans

Crawler Additions and Improvements

QuickFinder can now detect URLs embedded within all link types, including image, a href, iFrame, base, and framesets.
While QuickFinder cannot find links within complex Java code, we’ve added the ability to detect URLs embedded anywhere within simple, non-variable javascript links that conform to the following format:

<a href=”javascript:blah blah blah http://www.domain.com/url.html”>

This represents a large percentage of the Javascript links currently used on the Internet.
QuickFinder allows dynamic URLs with multiple question marks (?).

QuickFinder now identifies and corrects a number of malformed URL issues allowing it to find more content than ever before.
The crawler now automatically tries to relogin during a crawl.

This is important when crawling multiple Web sites in a single index. While the crawler spends time on one site, another site might log the crawler out. This new feature automatically logs back in as needed.
The crawler automatically retries any failed URLs up to three times.
All URLs are listed (and indexed) in the order that they were found on your Web site.

The previous crawler logs showed the indexed URLs in a hash order that produced logs that have a widely interspersed set of depth indicators; you could easily have a URL at depth 5 indexed immediately before a URL of depth 2. Now all the depth 2’s are indexed before the files at depth 3, 3 before 4, and so on. Note that the resultant side effect is that the crawler might now find more files (if you’re using the “Max Depth” setting) because a file might previously have been found at depth 5 when the new logic ensures that it is found at depth 2 if it’s a link off the starting page.
The crawler’s User Agent HTTP header ID string has been changed to QuickFinder Crawler.

This string identifies the QuickFinder crawler to a Web site as it reads the files. Any previous entries in a robots.txt file might need to be updated to reflect the new ID.
The QuickFinder crawler is significantly faster than in previous releases, including exponentially faster for very large Web sites (over 1,000,000) files. Linux users should see speed benefits in all aspects of the product.

Indexing Logs

QuickFinder has added the ability to control the amount of detail in the indexing logs, and a cumulative summary of up to 30 different statistics at the end of the log, including total URLs indexed, total URLs redirected, skipped links (and fifteen reasons why), and errors (and fifteen reasons why).
QuickFinder added a new View Log button that lets you see the failed.log when an index fails.

A failed.log is created when an index job fails and an existing “crawler.log” (presumably from a previous successful run) is already in the file system. To see the new button, click the Active Jobs link in the left-hand frame, view a running index job or one that has just ended, then click View Log after the index job has completed.
The View Index Jobs page was modified to include a new Stop capability that tells the crawler to stop crawling and start generating the index for the already crawled files.

This is similar to the Cancel button except that Stop generates an index.
Date/time stamps (once every 60 seconds) were added throughout the indexing log.

This helps you resolve problems when Web sites go down because you will now know when problems occurred.

&FileFilter=words Query Parameter

The &FileFilter=words query parameter now looks in both the original indexed paths as well as the Show URL in Search Results As paths.
The search syntax has been enhanced to allow a new /filefilter=words switch.

Other Additions and Improvements

QuickFinder has a new Novell iManager plug-in module that let you register and administer all QuickFinder and Web Search servers throughout your enterprise.
The QuickFinder invoker URL segment /NSearch has been changed to /qfsearch.

An Apache redirect from /NSearch to /qfsearch is created at install time, but you should update your current template to the new /qfsearch invoker.
QuickFinder Server now provides alternate spelling suggestions if a user’s search terms were misspelled or produced few search results.

The speller dictionaries are the same ones used in GroupWise® and are currently available in 16 languages. The speller also includes new admin-defined Ignore Words and Replace Words lists.
QuickFinder now has the ability to redirect to the actual hit URL if there is only one hit in the search results page.

You can control this in the Search.properties file by setting the Search.Request.GoToSingleHit property to true.
QuickFinder added a Minimum Best Bets Relevance setting which removes low relevance docs from the Best Bets display.
Date-based searches are now much faster.
Synchronization has been improved, and QuickFinder now has the ability to timeout from a hung (mirroring indefinitely) Synchronization request (this feature only works with JVM 1.5 (5.0) and above).
The Query Report templates (ReportTemplate.html and ExportTemplate.xml) are automatically copied into the current Virtual Search Server if they are not otherwise present.
The NetWare® Installation routines now prompt you to upgrade to the latest QuickFinder Server product.
New Require Authorization When Administering QuickFinder Server and Require HTTPS When Administering QuickFinder Server settings were added to the Global General Services Settings page, giving you the option to require authentication and encryption when logging into QuickFinder Server Manager.
You can now highlight both XML and XHTML files in the QuickFinder Highlighter.

Highlighter now detects a greater number of files as highlightable text formats. For example, it is no longer dependent on the filename extension; it now also uses the mime-type setting.
All servlet names now have similar information.

For example, SearchServlet, searchservlet, Search, and search are all valid names for the other servlets.