4.7 Disk Space Requirements

Based on your analysis of files and folders, indexing requirements, and search requirements, estimate the disk space needed for the Filr, MySQL, and Search servers. Check the Novell Filr 1.0.1 Installation and Configuration Guide for the latest data, and start with the recommended minimums. Use the following as a guide:

4.7.1 Sizing /vastorage

/vastorage can be 5-10 GB in size because only Ganglia and the appliance-specific files needed for updates are stored there.

4.7.2 Sizing the Remote NFS Share (/vashare)

/vashare is used by all of the Filr appliances in a Filr Cluster for storing the following:

Personal Storage

The files that users store in their assigned My Files personal storage are stored in /vashare. Home directories, by contrast, are a special type of Net Folder, and their data remains on the file servers that host them.

HTML Renderings

HTML renderings and text extractions have a significant impact on the amount of storage required per file for the Filr appliance.

Each HTML rendering occupies disk space.

All HTML renderings are stored in /vashare.

Only one HTML rendering exists per document.

The disk space used by HTML renderings depends on the type of file that is rendered:

  • Renderings of Microsoft Office and OpenOffice files use about the same amount of disk space as the original file, so a rendered 1 MB file requires an additional 1 MB of disk space.

  • Renderings of PowerPoint files require ~3 times the disk space of the original file, so a rendered 1 MB file requires an additional 3 MB of disk space.

  • Renderings of PDF files require 10 times the disk space of the original file, so a rendered 1 MB file requires an additional 10 MB of disk space.
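The multipliers above can be turned into a rough estimate of the space that renderings consume. The following is a minimal sketch, not part of Filr: the function name, the category keys, and the example file mix are assumptions made for illustration, and only the multipliers come from the list above.

```python
# Approximate rendering overhead as a multiple of the original file size,
# taken from the guideline figures in this section.
RENDER_MULTIPLIER = {
    "office": 1.0,      # Microsoft Office and OpenOffice (PowerPoint listed separately)
    "powerpoint": 3.0,  # approximate (~3x)
    "pdf": 10.0,
}


def rendering_space_mb(rendered_mb_by_type):
    """Return the estimated extra disk space (MB) used by HTML renderings.

    rendered_mb_by_type maps a category key to the total size, in MB, of
    the files in that category that have been rendered.
    """
    return sum(RENDER_MULTIPLIER[kind] * size_mb
               for kind, size_mb in rendered_mb_by_type.items())


# Hypothetical mix of rendered files, for illustration only.
example_mix = {"office": 500, "powerpoint": 200, "pdf": 100}
print(rendering_space_mb(example_mix))  # 2100.0 (MB)
```

Because only files that have actually been rendered consume this space, the input should reflect files that users have viewed as HTML, not the full file count.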

If HTML renderings are consuming more than 10 GB of disk space on your system, you can delete all HTML renderings by restarting the Filr appliance. To restart the Filr appliance, change any configuration option, then click Reconfigure Filr Server, as described in Changing Configuration Options for the Filr Appliance in the Novell Filr 1.0.1 Installation and Configuration Guide.

Indexing Text Extractions

When a file is added to Filr (either to Personal Storage or to a Net Folder), text from the file is extracted and added to the search index to be used for searching. By default, the text extracted from each file is truncated to 1.1 MB.

Uploads to Net Folders

Based on these items and the number of users, /vashare should be at least 50 GB in size and perhaps much larger, depending on usage. For example, if 1,000 users each upload a 2 MB file at the same time, that activity alone would consume up to 2 GB of disk space.

4.7.3 Filr Search Sizing Guidelines

  • Allocate ~5 KB disk space per file on non-indexed Net Folders.

  • Allocate ~11 KB of disk space per file on indexed Net Folders.

    Initial synchronization requires more disk space, but Lucene optimizes space usage after the initial sync, thus reducing the space required. 

    Because of the initial indexing overhead, Novell recommends waiting until the initial synchronization and optimization complete before enabling user access to the Filr system.

Formula

F = Number of Files        NF = Net Folders

Filr Search =
    (F in indexed NF x 11 KB) +
    (F in non-indexed NF x 5 KB)

Organization Numbers

1,000,000 files (75% indexed; 25% non-indexed)

Filr Search Estimate

750,000 files * 11 KB = 8,250,000 KB

250,000 files * 5 KB = 1,250,000 KB

Total = 9,500,000 KB

The initial estimate of the Lucene index size is approximately 9.06 GB (taking 1 GB as 1,048,576 KB).
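The search index arithmetic can be reproduced for other file counts with a short script. This is a minimal sketch under stated assumptions: the function and constant names are invented for illustration, and 1 GB is taken to be 1,048,576 KB. Only the per-file allocations come from the guidelines above.

```python
# Per-file allocations from the Filr Search sizing guidelines.
KB_PER_INDEXED_FILE = 11
KB_PER_NON_INDEXED_FILE = 5
KB_PER_GB = 1024 * 1024  # assumption: binary gigabytes


def search_index_kb(indexed_files, non_indexed_files):
    """Return the estimated Lucene index size in KB."""
    return (indexed_files * KB_PER_INDEXED_FILE
            + non_indexed_files * KB_PER_NON_INDEXED_FILE)


# Organization numbers from this section: 1,000,000 files, 75% indexed.
search_total_kb = search_index_kb(750_000, 250_000)
print(search_total_kb)                         # 9500000
print(round(search_total_kb / KB_PER_GB, 2))   # 9.06
```

The estimate covers the steady state only. The initial synchronization needs more space until Lucene optimizes the index, so leave headroom above the computed figure.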

4.7.4 MySQL Sizing Guidelines

  • Allocate 30 KB per file when indexing is enabled for Net Folders.

  • Allocate 20 KB per file when indexing is not enabled for Net Folders.

  • Allocate 10 KB per user.

Formula

U = Number of Users        F = Number of Files        NF = Net Folders

MySQL =
    (F in indexed NF x 30 KB) +
    (F in non-indexed NF x 20 KB) +
    (U x 10 KB)

Organization Numbers

1,000,000 files (75% indexed; 25% non-indexed) and 1,000 users

MySQL Database Estimate

750,000 files * 30 KB = 22,500,000 KB

250,000 files * 20 KB = 5,000,000 KB

1,000 users * 10 KB = 10,000 KB

Total = 27,510,000 KB

The initial estimate of the database size is approximately 26 GB.
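The database estimate follows the same pattern as the search estimate, with an added per-user term. The sketch below is illustrative only: the function and constant names are assumptions, and 1 GB is taken to be 1,048,576 KB. The per-file and per-user allocations come from the guidelines above.

```python
# Allocations from the MySQL sizing guidelines.
DB_KB_PER_INDEXED_FILE = 30
DB_KB_PER_NON_INDEXED_FILE = 20
DB_KB_PER_USER = 10
DB_KB_PER_GB = 1024 * 1024  # assumption: binary gigabytes


def mysql_db_kb(indexed_files, non_indexed_files, users):
    """Return the estimated MySQL database size in KB."""
    return (indexed_files * DB_KB_PER_INDEXED_FILE
            + non_indexed_files * DB_KB_PER_NON_INDEXED_FILE
            + users * DB_KB_PER_USER)


# Organization numbers from this section: 1,000,000 files (75% indexed)
# and 1,000 users.
db_total_kb = mysql_db_kb(750_000, 250_000, 1_000)
print(db_total_kb)                            # 27510000
print(round(db_total_kb / DB_KB_PER_GB, 1))   # 26.2
```

With these inputs the per-user term contributes only 10,000 KB, so the file count dominates the result.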

4.7.5 A Word About Inodes

Each file written to a disk consumes one inode, and Filr writes multiple files to /vashare for each file or directory that is added. For example, if the disk targeted by the /vashare mount point has 3 million inodes, then after 3 million files have been written to the /vashare directory, no more files can be written to that disk unless the number of inodes is increased, even if free disk space remains. See the section The File System in Reality in The Linux Documentation Project.
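Inode consumption can be checked on the server that hosts the share. The following is a minimal sketch using Python's standard os.statvfs call, which reports roughly the same counts as the df -i command. The function name and the fallback to "/" are assumptions made so that the example runs on systems that have no /vashare mount.

```python
import os


def inode_usage(path):
    """Return (total_inodes, free_inodes) for the file system holding path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


# Assumption: fall back to "/" when /vashare is not mounted on this host.
target = "/vashare" if os.path.isdir("/vashare") else "/"
total_inodes, free_inodes = inode_usage(target)

if total_inodes > 0:
    used_pct = 100.0 * (total_inodes - free_inodes) / total_inodes
    print(f"{target}: {free_inodes} of {total_inodes} inodes free "
          f"({used_pct:.1f}% used)")
else:
    # Some file systems allocate inodes dynamically and report zero here.
    print(f"{target}: file system does not report a fixed inode count")
```

On an NFS mount the counts are those reported by the NFS server, so the check reflects the exported file system rather than the local disk.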