4.12 Understanding and Configuring Search Functionality

Novell Vibe contains various features that control how items are indexed and how searches are executed on the Vibe site. As the Vibe administrator, you can configure the settings for these features. The default configurations are optimal for most Vibe sites.

These features are supported only with Vibe folder entries that contain English or other Western European languages. (For a list of supported languages, see Section 4.12.5, Supported Languages for Indexing and Searching for the Root Form of Words.)

4.12.1 Removing Frequently Used Words That Have No Inherent Meaning

Vibe removes frequently used words that have no inherent meaning when items are indexed and when users perform a search. Examples of such words are a, an, the, in, on, and so forth.

Vibe also removes these words when users search for a phrase enclosed in quotation marks. For example, a search for “sell the products” returns all of the following: sell their products, sell with products, sell the products, and so forth. However, it does not return sell products, because a word must still occupy the position of the removed word.
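The phrase behavior described above can be modeled with a short Python sketch. This is a simplified illustration, not Vibe's or Lucene's actual implementation: the stop-word list is a small assumed sample (it assumes that their and with are on the list, as the example results imply), and the function names are hypothetical. Removed words leave a gap at their position, so any word can fill that position, but the position itself must exist.

```python
# Illustrative stop-word list only; Vibe's real default list is longer.
STOP_WORDS = {"a", "an", "the", "in", "on", "their", "with"}


def positions(text, stop_words=STOP_WORDS):
    """Map each word position to its word, skipping stop words.

    Stop words are dropped, but the remaining words keep their original
    positions, so a removed word leaves a gap behind.
    """
    words = text.lower().split()
    return {i: w for i, w in enumerate(words) if w not in stop_words}


def phrase_matches(query, document):
    """Return True if the quoted query matches somewhere in the document."""
    q = positions(query)
    d = positions(document)
    if not q:
        return False
    first = min(q)
    for start in d:
        offset = start - first
        # Every surviving query word must appear at the same relative position.
        if all(d.get(pos + offset) == word for pos, word in q.items()):
            return True
    return False


for candidate in ("sell their products", "sell with products",
                  "sell the products", "sell products"):
    print(candidate, "->", phrase_matches("sell the products", candidate))
```

In this model, only sell products fails to match, because products sits one position too early.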

You can customize the words that Vibe considers to have no inherent meaning. For more information, see lucene.indexing.stopwords.file.path and lucene.searching.stopwords.file.path in Table 4-2, Removing Frequently Used Words.

This functionality is enabled by default. For information on how to modify the default settings, see Section 4.12.4, Modifying Configuration Settings.

4.12.2 Searching for Various Forms of the Same Word

When Vibe indexes a word, it indexes the root form of the word. Likewise, when users perform a search for a word, Vibe searches for the root form of the word and returns all matches. For example, performing a search on the word research returns all forms of the word, including researching, researched, and researches. Likewise, searching for the word researching returns results for research, researches, and so forth.
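The idea of reducing words to a root form can be illustrated with a deliberately naive Python sketch. It only strips a few English suffixes and is not the algorithm that Vibe uses; real stemmers apply many more rules. The function name toy_stem and the suffix list are assumptions made for this example.

```python
# A few English suffixes, checked in this order. Illustrative only.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip one common suffix, keeping at least three characters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Both the indexed words and the search term are reduced to the same root,
# so any form of the word finds all of the other forms.
indexed = {toy_stem(w) for w in ("research", "researching", "researched", "researches")}
print(indexed)
print(toy_stem("researching") in indexed)
```

Because the same reduction is applied at index time and at search time, a search for any one form lands on the shared root.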

This functionality is enabled by default. For information on how to modify the default settings, see Section 4.12.4, Modifying Configuration Settings.

4.12.3 Searching for Words That Contain Accents

When Vibe indexes a word, it indexes the word without accents, regardless of whether the word originally contains accents. Likewise, when users perform a search for a word, Vibe searches for the word without accents, regardless of whether the user uses accents during the search. This means that when a user performs a search on the word cliché, Vibe returns results for the word cliché and cliche. Vibe also returns both forms of the word if the user performs a search on the word cliche.
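The following Python sketch approximates accent removal by using Unicode decomposition from the standard library. It is an approximation for illustration only; the folding that Vibe performs may cover characters that this approach does not. The function name fold_accents is an assumption made for this example.

```python
import unicodedata


def fold_accents(word):
    """Return the word with combining accent marks removed."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The indexed word and the search term are folded in the same way,
# so either spelling finds the other.
print(fold_accents("cliché"))
print(fold_accents("cliché") == fold_accents("cliche"))
```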

This functionality is enabled by default. For information on how to modify the default settings, see Section 4.12.4, Modifying Configuration Settings.

4.12.4 Modifying Configuration Settings

The configuration file that you modify depends on how your Lucene Index Server is set up: either you have a single Lucene Index Server that is located on the same server as Vibe, or your Lucene Index Server is located on a remote server or you have multiple Lucene Index Servers.

For more information about the various ways that you can set up the Lucene Index Server, see Changing Your Lucene Index Server Configuration in the Novell Vibe OnPrem 3.1 Installation Guide.

To modify configuration settings for the Search feature:

  1. Change to the following directory:

    Linux:

    /opt/novell/teaming/apache-tomcat/webapps/ssf/WEB-INF/classes/config

    Windows:

    c:\Program Files\Novell\Teaming\apache-tomcat\webapps\ssf\WEB-INF\classes\config

  2. Open the ssf.properties file in a text editor if you have a single Lucene Index Server that is located on the same server as Vibe.

    or

    Open the lucene-server.properties file in a text editor if your Lucene Index Server is located on a remote server or you have multiple Lucene Index Servers.

  3. Scroll down to locate the line for the search functionality that you want to change.

    For information on each configuration setting that you can modify for searching on the Vibe site, see Configuration Settings for Search Features.

  4. Copy that line to the clipboard of your text editor.

  5. Depending on which file you are modifying (as described in Step 2), make a backup copy of the corresponding ssf-ext.properties file or lucene-server-ext.properties file, which are located in the same directory as the ssf.properties file and the lucene-server.properties file.

  6. Open either the ssf-ext.properties file or the lucene-server-ext.properties file.

  7. Scroll to the end of the ssf-ext.properties file or the lucene-server-ext.properties file, then paste the line you copied.

  8. Edit the setting for the appropriate search functionality as needed.

  9. Save and close the ssf-ext.properties file or the lucene-server-ext.properties file.

  10. Close the ssf.properties file or the lucene-server.properties file without saving.

  11. Stop and restart Vibe to put the modified search customizations into effect for your Vibe site.

  12. Re-index the Vibe site, as described in Section 24.4, Rebuilding the Lucene Index.
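Steps 3 through 9 of the procedure above can be sketched as a small Python helper. This is a hedged illustration rather than a supported tool: the function name override_setting, the .bak backup suffix, and the choice of ISO-8859-1 (the traditional encoding of Java properties files) are assumptions made for this example. The helper never writes to the base file, and it does not restart Vibe or re-index the site, so Steps 11 and 12 still apply.

```python
import shutil
from pathlib import Path


def override_setting(config_dir, base_file, key, new_value):
    """Append an override for `key` to the -ext file that matches `base_file`.

    `base_file` is "ssf.properties" or "lucene-server.properties".
    Returns the path of the -ext file that was modified.
    """
    config_dir = Path(config_dir)
    base = config_dir / base_file
    ext = config_dir / base_file.replace(".properties", "-ext.properties")

    # Step 3: confirm that the setting exists in the base file (read-only).
    lines = base.read_text(encoding="iso-8859-1").splitlines()
    if not any(line.split("=", 1)[0].strip() == key for line in lines):
        raise KeyError(f"{key} not found in {base}")

    # Step 5: make a backup copy of the -ext file before changing it.
    if ext.exists():
        shutil.copy2(ext, ext.with_name(ext.name + ".bak"))

    # Steps 7 and 8: add the edited setting at the end of the -ext file.
    with ext.open("a", encoding="iso-8859-1") as handle:
        handle.write(f"\n{key}={new_value}\n")
    return ext
```

For example, calling override_setting with the config directory from Step 1, "ssf.properties", "lucene.indexing.stemming.enable", and "false" would append lucene.indexing.stemming.enable=false to the ssf-ext.properties file. Remember to make the matching lucene.searching change as well.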

Configuration Settings for Search Features

The following tables show the configuration settings that you can modify for the various search features. Each configuration setting has a lucene.indexing setting and a corresponding lucene.searching setting. Both settings must be configured in order to produce the desired functionality.

Table 4-2 Removing Frequently Used Words

Setting

Function

lucene.indexing.stopwords.enable

Enables or disables the functionality that removes frequently used words that have no inherent meaning when items are added to the index. For more information, see Section 4.12.1, Removing Frequently Used Words That Have No Inherent Meaning.

By default, the value is true (enabled).

lucene.indexing.stopwords.file.charset

If you have provided your own file that contains frequently used words that you want to be ignored (as described in lucene.indexing.stopwords.file.path), you can change the default character encoding of the file that contains the new words.

By default, the value is UTF-8.

lucene.indexing.stopwords.file.path

Enables you to point to a file that you create that contains your own list of words that you want Vibe to ignore when items are added to the index. This file should be in a directory where it does not get overwritten or removed during an upgrade. If you are running Vibe in a clustered environment, this should be a directory that is accessible to and shared by all Vibe nodes.

You must specify the full path to the file.

Each line of the file should contain only one word.

All words in the file must be in lowercase.

By default, there is no file path specified, and Vibe defaults to a list of common words that are not normally useful when performing a search, such as a, in, this, and so forth.

lucene.searching.stopwords.enable

Enables or disables the functionality that removes frequently used words that have no inherent meaning when users perform a search. For more information, see Section 4.12.1, Removing Frequently Used Words That Have No Inherent Meaning.

By default, the value is true (enabled).

lucene.searching.stopwords.file.charset

If you have provided your own file that contains frequently used words that you want to be ignored (as described in lucene.searching.stopwords.file.path), you can change the default character encoding of the file that contains the new words.

By default, the value is UTF-8.

lucene.searching.stopwords.file.path

Enables you to point to a file that you create that contains your own list of words that you want Vibe to ignore when performing a search. This file should be in a directory where it does not get overwritten or removed during an upgrade. If you are running Vibe in a clustered environment, this should be a directory that is accessible to and shared by all Vibe nodes.

You must specify the full path to the file.

Each line of the file should contain only one word.

All words in the file must be in lowercase.

By default, there is no file path specified, and Vibe defaults to a list of common words that are not normally useful when performing a search, such as a, in, this, and so forth.

If you leave all three search features enabled (removing frequently used words, searching for various forms of the same word, and searching for words that contain accents), and you want to specify words to ignore that contain accents, you must specify both forms of the word (with and without the accents).
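The file requirements listed for the stopwords.file.path settings (one word per line, all lowercase, and both accented and unaccented forms when accent handling is enabled) can be sketched in Python as follows. The function names and the way the unaccented form is derived are assumptions made for this example. The output path that you pass must be the full path that you then specify in the file.path settings.

```python
import unicodedata
from pathlib import Path


def fold(word):
    """Return the word without combining accent marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_stopword_lines(words):
    """Lowercase each word, add its unaccented form, and drop duplicates."""
    lines = []
    for word in words:
        lowered = word.strip().lower()
        for form in (lowered, fold(lowered)):
            if form and form not in lines:
                lines.append(form)
    return lines


def write_stopwords(path, words, charset="utf-8"):
    """Write one word per line, using the charset named in the charset setting."""
    text = "\n".join(build_stopword_lines(words)) + "\n"
    Path(path).write_text(text, encoding=charset)


print(build_stopword_lines(["The", "à", "the"]))
```

If you write the file with a character encoding other than UTF-8, set the corresponding stopwords.file.charset settings to match.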

Table 4-3 Searching for Various Forms of the Same Word

Setting

Function

lucene.indexing.stemming.enable

Enables or disables the functionality that indexes various forms of the same word. For more information, see Section 4.12.2, Searching for Various Forms of the Same Word.

By default, the value is true (enabled).

lucene.indexing.stemming.stemmer.names

Allows you to specify the language that you want Vibe to use when indexing the root form of words. For more information, see Section 4.12.2, Searching for Various Forms of the Same Word.

By default, the language is English.

For information about which languages are available, see Section 4.12.5, Supported Languages for Indexing and Searching for the Root Form of Words.

lucene.searching.stemming.enable

Enables or disables the functionality that allows users to search for various forms of the same word. For more information, see Section 4.12.2, Searching for Various Forms of the Same Word.

By default, the value is true (enabled).

lucene.searching.stemming.stemmer.names

Allows you to specify the language that you want Vibe to use when searching for the root form of words. For more information, see Section 4.12.2, Searching for Various Forms of the Same Word.

By default, the language is English.

For information about which languages are available, see Section 4.12.5, Supported Languages for Indexing and Searching for the Root Form of Words.

Table 4-4 Searching for Words That Contain Accents

Setting

Function

lucene.indexing.asciifolding.enable

Enables or disables the functionality that indexes words with accents as well as the same word without the accents. For more information, see Section 4.12.3, Searching for Words That Contain Accents.

By default, the value is true (enabled).

lucene.searching.asciifolding.enable

Enables or disables the functionality that allows users to search for words with accents as well as the same word without the accents. For more information, see Section 4.12.3, Searching for Words That Contain Accents.

By default, the value is true (enabled).
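Because each lucene.indexing setting has a corresponding lucene.searching setting, it can help to check that the two agree after you edit a properties file. The following Python sketch does this with a minimal parser. The function names are hypothetical, and the parser handles only simple key=value lines, not the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def mismatched_pairs(settings):
    """List (indexing key, searching key) pairs whose values differ."""
    prefix = "lucene.indexing."
    problems = []
    for key, value in settings.items():
        if key.startswith(prefix):
            twin = "lucene.searching." + key[len(prefix):]
            # A missing twin is reported as a mismatch as well.
            if settings.get(twin) != value:
                problems.append((key, twin))
    return problems


sample = """
lucene.indexing.stemming.enable=false
lucene.searching.stemming.enable=true
lucene.indexing.asciifolding.enable=true
lucene.searching.asciifolding.enable=true
"""
print(mismatched_pairs(parse_properties(sample)))
```

Treat a reported pair as a prompt to review the settings rather than as an error. For example, you might deliberately point the indexing and searching stopwords.file.path settings at different files.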

4.12.5 Supported Languages for Indexing and Searching for the Root Form of Words

By default, when Vibe indexes a word, it indexes the root form of the word. Likewise, when users perform a search for a word, Vibe searches for the root form of the word and returns all matches. (For more information, see Section 4.12.2, Searching for Various Forms of the Same Word.)

The default language of the Vibe site is irrelevant with regard to indexing and searching for the root form of words; Vibe detects the language for each individual entry when it performs the indexing and the search.

You can configure Vibe to use any of the following languages when indexing and searching for the root form of words (the default is English):

  • Danish

  • Dutch

  • English

  • Finnish

  • French

  • German

  • German2 (This is a modified version of German that handles umlaut characters differently: it appends an e to vowels that would otherwise have an umlaut. For example, ä becomes ae, ö becomes oe, and ü becomes ue.)

  • Hungarian

  • Italian

  • Norwegian

  • Porter (This is an alternative option for the English language; it uses the original Porter stemming algorithm, so it derives the root form of words differently than the English option.)

  • Portuguese

  • Romanian

  • Russian

  • Spanish

  • Swedish

  • Turkish
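The umlaut handling described for the German2 option in the list above can be illustrated with a short Python sketch. It shows only the character substitution from the example (ä to ae, ö to oe, ü to ue), not the full stemming that the option performs, and the function name expand_umlauts is an assumption made for this example.

```python
# Substitutions taken from the German2 description; illustrative only.
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def expand_umlauts(word):
    """Lowercase the word and replace each umlaut vowel with vowel + e."""
    return word.lower().translate(UMLAUTS)


# Both spellings reduce to the same text, so either one finds the other.
print(expand_umlauts("Müller"))
print(expand_umlauts("Müller") == expand_umlauts("Mueller"))
```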