3.7 Planning for Content Searching (Content Indexing)

IMPORTANT: We recommend that you review the following information before you plan your file-content searching strategy:

Although the ability to search file content is an attractive feature of Filr, it comes at a significant cost: Filr must download each file before it can extract content for indexing, and the indexed data must be backed up. Weigh the benefits against these costs and ensure that only data that must be searchable is indexed.

  1. On your planning worksheet, identify the directories that need to be indexed for full-text searching.

  2. Group the files in those directories by size and count how many fall into each category:

    • Small (less than 500 MB)

    • Medium (between 500 MB and 2 GB)

    • Large (over 2 GB)

  3. Assess the impacts and costs of content indexing.

    1. Start with a subset (1 to 10 GB) of that data.

    2. Monitor how much time is required to complete the indexing process.

    3. Increase the amount of data and monitor the process again.

    4. Always ask, “How many of these files actually need to be indexed?”

    5. Keep in mind that indexing impacts Filr in the following areas:

      • Time required to synchronize and index a Net Folder

      • Disk space usage in Filr’s filr/filerepository directory

      • Bandwidth usage between the Filr appliance and the target servers where Net Folders are located

      • CPU utilization on the Filr appliances

  4. After you have planned for your organization’s content searching needs, continue with Planning for Filr Email Integration.
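The size inventory in step 2 and the timing assessment in step 3 can be roughed out with a short script. The following is a minimal sketch, not a Filr tool. The function names (size_category, inventory, estimate_hours) are hypothetical, the size units are assumed to be binary (1 MB = 1024 × 1024 bytes), and the time estimate assumes that indexing time scales linearly with data volume, which is exactly what the repeated measurements in step 3 are meant to check.

```python
import os
from collections import Counter

# Assumption: binary units; the size categories in step 2 do not specify.
MB = 1024 ** 2
GB = 1024 ** 3


def size_category(size_bytes):
    """Map a file size to the Small/Medium/Large categories from step 2."""
    if size_bytes < 500 * MB:
        return "small"
    if size_bytes <= 2 * GB:
        return "medium"
    return "large"


def inventory(root):
    """Walk root and return (file count per category, total bytes per category)."""
    counts = Counter()
    total_bytes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that disappear or cannot be read
            category = size_category(size)
            counts[category] += 1
            total_bytes[category] += size
    return counts, total_bytes


def estimate_hours(total_gb, sample_gb, sample_minutes):
    """Extrapolate indexing time from one measured sample.

    Assumes linear scaling, which is only a first approximation.
    """
    return total_gb * (sample_minutes / sample_gb) / 60
```

For example, calling inventory on a directory from your planning worksheet returns the number of files and the total bytes in each category. If a 10 GB subset took 30 minutes to index, estimate_hours(1000, 10, 30) gives a first-pass figure for 1 TB. Treat that figure as a starting point and revise it as you index larger subsets.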