4.7 Filr Search Appliance Planning

4.7.1 Filr Search Disk Space Sizing

To determine the disk space required for /vastorage on Filr Search appliances, do the following:

  1. Allocate 10 GB as a base requirement.

  2. Allocate ~11 KB per file for indexed Net Folders.

  3. Allocate ~5 KB per file for non-indexed Net Folders.

IMPORTANT: The initial synchronization requires more disk space than the index uses afterward, because Lucene optimizes space usage once the initial sync completes.

Because of the initial indexing overhead, Novell recommends waiting until the initial synchronization and optimization complete before enabling user access to the Filr system.

Formula

(F * 11 KB * indexed_NF) + (F * 5 KB * non-indexed_NF) = Size

Key

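  • F=Total number of files

  • indexed_NF=Fraction of the files that are in indexed Net Folders

  • non-indexed_NF=Fraction of the files that are in non-indexed Net Folders

  • Size=Estimated size of the Lucene index (the 10 GB base is allocated in addition)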

Sample Organization Numbers

1,000,000 files (75% indexed; 25% non-indexed)

Sample Filr Search Estimate

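750,000 files * 11 KB = 8,250,000 KB

250,000 files * 5 KB = 1,250,000 KB

The initial estimate of the Lucene index is 9,500,000 KB, or approximately 9.06 GB (at 1,048,576 KB per GB). Adding the 10 GB base requirement gives approximately 19 GB for /vastorage.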
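The sizing steps and the sample estimate can be sketched as a short Python calculation. This is a minimal illustration rather than a supported tool: the function names are hypothetical, and the conversion of 1 GB to 1,048,576 KB is an assumption, because the guide does not state which unit convention it uses.

```python
# Per-file allowances and base requirement taken from the sizing steps.
KB_PER_INDEXED_FILE = 11
KB_PER_NON_INDEXED_FILE = 5
BASE_GB = 10

# Assumption: 1 GB = 1,048,576 KB (binary units).
KB_PER_GB = 1024 * 1024


def lucene_index_kb(indexed_files, non_indexed_files):
    """Return the estimated Lucene index size in KB."""
    return (indexed_files * KB_PER_INDEXED_FILE
            + non_indexed_files * KB_PER_NON_INDEXED_FILE)


def vastorage_gb(indexed_files, non_indexed_files):
    """Return the estimated /vastorage size in GB: base plus index."""
    index_gb = lucene_index_kb(indexed_files, non_indexed_files) / KB_PER_GB
    return BASE_GB + index_gb


# Sample organization: 1,000,000 files, 75% indexed and 25% non-indexed.
print(lucene_index_kb(750_000, 250_000))             # 9500000
print(round(vastorage_gb(750_000, 250_000), 2))      # 19.06
```

Under the binary-unit assumption, 9,500,000 KB is about 9.0599 GB, which truncates to 9.05 and rounds to 9.06. Because the initial synchronization temporarily needs more space than the optimized index, treat the result as a lower bound when provisioning the disk.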

4.7.2 Filr Search Caveats

  • Do not configure /vastorage as an NFS Share for the Filr Search appliance, even though the installation UI shows it as an option. Always set /vastorage to be on the local disk.

  • Make certain that the Filr appliances can resolve each Filr Search appliance's DNS host name to its IP address (forward lookup) and its IP address back to its host name (reverse lookup). This can be accomplished through DNS or by modifying the /etc/hosts file on the Filr appliances.

    If this is not done, the Filr installation will fail.

  • Always deploy two Filr Search appliances.

    With two Filr Search appliances, you can re-index each appliance in turn without causing the Filr clients to re-download all of their data.

    WARNING: If you can run only one Filr Search appliance, be very careful about re-indexing. Re-indexing causes all Windows and Macintosh Desktop Clients to delete all of their locally held Filr data and then re-download it from the server as the index is recreated.

    When this happens, users can easily conclude that their files have been deleted, resulting in angry support calls and other problems.
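The name-resolution caveat above (host name to IP address, and IP address back to host name) can be checked before installation with a short script. The following is a sketch that uses only the Python standard library. The function names are hypothetical, and it assumes that Python is available on a machine that uses the same DNS and /etc/hosts configuration as the Filr appliances.

```python
import socket


def check_resolution(hostname, forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP address and that address back to a name.

    Returns (ip_address, reverse_name). Raises OSError if either lookup fails.
    The forward and reverse parameters allow substituting the resolver calls.
    """
    ip_address = forward(hostname)
    reverse_name, _aliases, _addresses = reverse(ip_address)
    return ip_address, reverse_name


def report(hostnames, forward=socket.gethostbyname,
           reverse=socket.gethostbyaddr):
    """Return one status line per host name, continuing past failures."""
    lines = []
    for name in hostnames:
        try:
            ip_address, reverse_name = check_resolution(name, forward, reverse)
            lines.append(f"OK   {name} -> {ip_address} -> {reverse_name}")
        except OSError as error:
            # socket.gaierror and socket.herror are subclasses of OSError.
            lines.append(f"FAIL {name}: {error}")
    return lines
```

For example, print("\n".join(report(["filrsearch1.example.com"]))) prints one line for a placeholder host name. Substitute the actual Filr Search appliance host names. A FAIL line, or a reverse name that differs from the name you entered, suggests that the DNS records or /etc/hosts entries need attention before you install.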

4.7.3 Content Index Planning

  • Assess which data must be searchable.

  • Start with a subset (1 to 10 GB) of that data.

  • Monitor how much time is required to complete the indexing process.

  • Increase the amount and monitor the process again.

  • Always ask, “How many of these files actually need to be indexed?”

  • Keep in mind that indexing impacts Filr in the following areas:

    • Time required to synchronize and index a Net Folder

    • Disk space usage in Filr’s filr/filerepository directory

    • Bandwidth usage between the Filr appliance and the target servers where Net Folders are located

    • CPU utilization on the Filr appliances
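The incremental approach above (index a subset, time it, increase the amount, time it again) can be supported with a small calculation. The sketch below is illustrative only. The function names and the sample measurements are hypothetical, and the projection assumes that indexing time scales roughly linearly with data volume, which your own pilot runs should confirm or refute.

```python
def throughput_gb_per_hour(runs):
    """Return the indexing rate (GB per hour) of each pilot run.

    runs is a list of (gigabytes_indexed, hours_taken) pairs.
    """
    rates = []
    for gigabytes, hours in runs:
        if gigabytes <= 0 or hours <= 0:
            raise ValueError("sizes and durations must be positive")
        rates.append(gigabytes / hours)
    return rates


def projected_hours(runs, target_gb):
    """Project the indexing time for target_gb from the slowest observed rate.

    Using the slowest rate gives a conservative figure under the
    linear-scaling assumption.
    """
    return target_gb / min(throughput_gb_per_hour(runs))


# Hypothetical measurements: 5 GB took 1 hour, then 10 GB took 2.5 hours.
pilot_runs = [(5, 1.0), (10, 2.5)]
print(throughput_gb_per_hour(pilot_runs))   # [5.0, 4.0]
print(projected_hours(pilot_runs, 500))     # 125.0
```

If the rate drops noticeably as the subset grows, as it does in the made-up numbers here, the linear projection is optimistic, and reducing the amount of data that is indexed is usually the more effective adjustment.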