2.1 Retain Planning and Design Best Practices

Use and apply the concepts and principles in the following sections as they apply to your needs.

2.1.1 Retain Architecture

Retain can run on:

  • Stand-alone server hardware

    Or

  • A virtual machine hosted on a hypervisor supported by Windows or SUSE Linux.

    This is the best-practice recommendation for backup purposes and flexibility.

Retain must have these four components:

  • Server

  • Worker

  • Indexer

  • Database

Server

This is where the archive system is configured and maintained. It coordinates and directs the storing, indexing, searching, and reading of archived items.

Worker

Workers interface with the messaging host/mail servers that contain the messages you are archiving. Workers retrieve the messages and hand them to the Retain Server.

IMPORTANT: A worker can handle only one job at a time. You can queue more than one job for a worker, but make sure the worker can complete all of its queued jobs in less than 24 hours.
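The 24-hour constraint comes down to simple arithmetic, because queued jobs run one after another. The following is a minimal sketch, assuming you have your own duration estimates for each job queued on a worker. The function name and the sample durations are hypothetical and are not part of Retain.

```python
from datetime import timedelta

# Hypothetical planning helper; Retain does not provide this function.
DAILY_WINDOW = timedelta(hours=24)


def queue_fits_daily_window(job_durations):
    """Return True if the queued jobs finish in less than 24 hours."""
    # A worker handles one job at a time, so queued jobs run
    # sequentially and their durations add up.
    total = sum(job_durations, timedelta())
    return total < DAILY_WINDOW


# Illustrative estimates only; measure real job durations in your environment.
estimated_jobs = [timedelta(hours=9), timedelta(hours=8), timedelta(hours=6)]
print(queue_fits_daily_window(estimated_jobs))
```

If the check fails, the options implied by the text are to shorten the jobs or to spread them across additional workers.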

Indexer

The Indexer indexes each word (and some phrases) in all archived messages and attachments.

When someone searches in Retain, the indexer returns the list of hits by retrieving metadata from the database. (Some have incorrectly assumed that the database returns the list of hits.)

Retain’s most memory-intensive process is indexing, not running the database.

Keep this in mind when dividing memory between Tomcat/the indexer and the database.

Database

This stores most of the Retain configuration and all the message metadata (subject, sender, recipients, links to attachments, indexed state of messages, folder-context of the message, and so on).

Everything displayed in a Retain mailbox is metadata that is retrieved through the index from the database, not from the message archive.

2.1.2 Retain is Modular and Flexible

Because Retain is modular, component software can be installed on different servers, as the needs of your system dictate.

All-in-One Systems

This is recommended for:

  • Proof-of-concept systems.

  • Small systems that aren’t expected to grow beyond a few hundred users.

Medium to Large Systems

This is recommended if:

  • Your organization already has a dedicated database server.

  • Your users are assigned to several different post offices and/or messaging systems.

For medium to large systems, Micro Focus recommends assigning one worker agent per post office/messaging system (as applicable).

Very Large Systems

This is recommended if you have a very large system that requires high availability for searching the Retain archive.

The high-availability indexer requires a separate license and at least a 3-server cluster.

2.1.3 Worker Locations

Dredging large email systems can take a long time, especially when it involves multiple mail servers.

Because Worker Agents do the heavy lifting when it comes to dredging, their placement is critical to overall system efficiency.

You can install Worker Agents in three locations as your deployment needs dictate.

In order of recommendation, these locations are:

Post Office Servers

Micro Focus recommends installing one worker on each mail server wherever possible because:

  • Item retrieval happens quickly and efficiently within the server itself.

  • Processing happens locally.

  • Only new messages are transmitted over the network to the datastore, conserving considerable bandwidth. See How Archive Jobs Work.

A Separate Worker Server

If installing on post office servers is not an option, you can install up to 10 Worker Agents on a dedicated Worker Server.

Keep in mind that the network must have sufficient bandwidth to carry every item that requires processing from the post office servers to their respective workers. See How Archive Jobs Work.
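As a rough planning aid, the limit of 10 Worker Agents per dedicated Worker Server can be turned into a server count. This is a minimal sketch, assuming one worker per post office (the recommendation given for medium to large systems). The function name is hypothetical, and the sketch does not account for network bandwidth, which must be assessed separately.

```python
import math

# Limit stated in this section for a dedicated Worker Server.
MAX_WORKERS_PER_SERVER = 10


def worker_servers_needed(post_office_count):
    """Minimum number of dedicated Worker Servers.

    Assumes one Worker Agent per post office; this helper is
    illustrative and not part of Retain.
    """
    if post_office_count < 0:
        raise ValueError("post_office_count must not be negative")
    return math.ceil(post_office_count / MAX_WORKERS_PER_SERVER)


# Example: 23 post offices need 23 workers, spread over 3 servers.
print(worker_servers_needed(23))
```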

With the Retain Server

For troubleshooting purposes, Micro Focus recommends always installing one worker on the Retain server.

However, having this Worker Agent function as the system worker is only recommended for small, proof-of-concept, all-in-one deployments. This configuration would rarely, if ever, be an effective solution for actual production workloads.
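The order of recommendation above can be expressed as a simple preference list. The following is a minimal sketch for planning discussions only. The location labels and the function name are hypothetical, and the sketch does not model the caveat that a worker on the Retain server is suitable as the system worker only in small, proof-of-concept, all-in-one deployments.

```python
# Locations in the order of recommendation given in this section.
# The string labels are illustrative, not Retain configuration values.
RECOMMENDED_ORDER = [
    "post office server",
    "dedicated worker server",
    "retain server",
]


def best_available_location(available_locations):
    """Return the most recommended location that is available, or None."""
    for location in RECOMMENDED_ORDER:
        if location in available_locations:
            return location
    return None


# Example: installing on post office servers is not an option.
print(best_available_location({"dedicated worker server", "retain server"}))
```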