1.1 Overview

1.1.1 Intended Audience

This manual is intended for IT administrators who use Retain and for anyone who wants to learn more about it. It includes installation instructions and feature descriptions.

1.1.2 What Retain does

Retain provides long-term storage of data along with search, retrieval, and review services for archived messages. Retain is NOT a backup or emergency restoration system. Retain archives messages and data from messaging systems, phones, and social websites and stores the data for long-term reference. Users may log in to review and search their personal archived data. This not only provides legal compliance and litigation protection, but may also be used to free up space on messaging systems and to improve data management.

1.1.3 How Retain works

The Retain Worker process connects to the appropriate message server and collects data using the messaging system’s defined APIs (for example, SOAP for GroupWise and Exchange). This data is transferred to the Retain Server, which stores the collected data in a defined storage location and indexes the data in the SQL server. Users log in to the Retain Server’s web interface to search through and access their archived messages. User rights are managed by the administrator.
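
The flow described above can be pictured with a small in-memory sketch. It is only an illustration of the roles involved, not Retain's implementation: the names Message, RetainServerSketch, and run_worker_job are hypothetical, a Python list stands in for the storage location, and a simple word index stands in for the real indexing. Real user rights are set by the administrator and are richer than the owner check shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    owner: str
    subject: str
    body: str


@dataclass
class RetainServerSketch:
    """Hypothetical stand-in for the Retain Server: stores and indexes items."""
    storage: list = field(default_factory=list)  # stands in for the storage location
    index: dict = field(default_factory=dict)    # word -> positions in storage

    def ingest(self, msg: Message) -> None:
        # Store the message, then index every distinct word it contains.
        self.storage.append(msg)
        pos = len(self.storage) - 1
        for word in set(f"{msg.subject} {msg.body}".lower().split()):
            self.index.setdefault(word, []).append(pos)

    def search(self, user: str, word: str) -> list:
        # Return only the requesting user's own archived messages.
        hits = self.index.get(word.lower(), [])
        return [self.storage[i] for i in hits if self.storage[i].owner == user]


def run_worker_job(mailboxes: dict, server: RetainServerSketch) -> int:
    """Hypothetical Worker job: collect from each mailbox, hand to the server."""
    count = 0
    for messages in mailboxes.values():
        for msg in messages:
            server.ingest(msg)
            count += 1
    return count
```

In this sketch the Worker only moves data and the Server owns both storage and search, which mirrors the division of labor described above.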

1.1.4 Architecture

Retain consists of several main components. They can be installed on the same server or spread across different servers, which gives flexibility in where data is stored and which servers perform Retain functions.

  • Retain Server: This is the core piece of Retain. All functions are controlled from the Retain Server. The archive is stored here. The server also manages the Retain Worker and stores data in the database server. There is only one Retain Server per system.

  • Retain Worker: There is at least one per system, and often one per messaging server. The Retain Worker performs the data collection and hands the collected data to the server. The server stores the data in a database. The Retain Worker can be installed on the Retain Server, the mail server, or a standalone server.

  • SQL Database Server: This is an SQL server where Retain stores the message header data, user data, and links to the stored messages in the archive. It is not actually part of Retain. Retain was designed to support many different databases. NOTE: Installation, maintenance, tuning, and backup of the database are the customer’s responsibility. The database server can be installed on the Retain Server or on a standalone server.

  • Reporting and Monitoring Server: This component keeps job and server statistics and handles mailbox error monitoring. It can be installed on the Retain Server or on a standalone server.

  • Indexing Engine: This component indexes all the data. The standard High Performance Indexer is installed on the Retain Server; alternatively, the High Availability Indexer is installed on a separate server cluster.

  • Stubbing Server: The stubbing server removes large messages from the GroupWise system and creates a ‘stub’, or link, to the message which is stored in the SQL database. See the Stubbing Server section in the Main Guide to decide if a stubbing server is correct for your system. From a user’s point of view, there is no change to the behavior of their GroupWise mailbox. Currently, stubbing is only supported for GroupWise 8.0.1 or later. The stubbing server must be installed on the Retain Server itself.

  • Retain Router: The Retain Router gathers message data from Android and BlackBerry phones using REST, and is installed and controlled by the local administrator. Phone data is sent to the Retain Router or Server where the device ID has been registered. The data is stored locally until the Retain Router forwards it on to the Retain Server. Afterwards, the data is accessed the same way as all other message data in the Retain Server. The Retain Router is installed in the network DMZ.

1.1.5 Best Practices for Component Placement

Retain components communicate via TCP/IP. Although placing all the components on the same server would yield the best communication speed, such placement is impractical for larger systems. The performance of the components on the physical servers must be balanced against the speed of the network links joining them. Retain Workers may exist on the same physical machine as a messaging server.

1.1.6 How Retain Stores the Archives

Retain uses a hybrid data storage approach. The database contains all the metadata, folder structure, and attachment information but does not hold the actual message text or actual attachments. These are stored on the file system in a single-instance storage scheme using a hash for each individual message. Data is hashed using the SHA-256 algorithm, and the hash can be used to detect tampering.
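
As an illustration of the general idea, the following is a minimal sketch of hash-keyed, single-instance file storage with a tamper check. It is not Retain's actual on-disk layout: the function names store_blob and is_intact and the two-level directory fan-out are assumptions made for the example. Only the use of SHA-256 comes from the description above.

```python
import hashlib
from pathlib import Path


def store_blob(root: Path, payload: bytes) -> Path:
    """Store payload once, keyed by its SHA-256 digest (hypothetical layout)."""
    digest = hashlib.sha256(payload).hexdigest()
    # Fan out into subdirectories so no single directory grows too large.
    target = root / digest[:2] / digest[2:4] / digest
    if not target.exists():  # single instance: identical content is written once
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
    return target


def is_intact(path: Path) -> bool:
    """Re-hash the stored bytes and compare with the file name to detect tampering."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == path.name
```

Because the file name is derived from the content, two identical messages map to the same file, and any later change to the stored bytes no longer matches the name.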

1.1.7 Other components that Retain depends on

The following items must be ready before you install Retain's core components.

  • Supported messaging system. (For social media capture the RSM gateway must also be installed)

  • Apache Web Server.

  • Apache modules mod_proxy and mod_rewrite. (Installed and enabled)

  • SQL Database for storage.

  • Java JDK. (This is installed automatically by the installer for Retain use only.)

1.1.8 Design Considerations

Retain is designed to be as flexible as possible, giving you choices as to where to install its components. Here are some points to keep in mind when deciding where to put everything.

1.1.9 SQL Database Server

Where should the SQL database server be placed in the network? The faster the network connection the better. Local installation gives the best communication speed, but it’s usually unrealistic to do so. In a large system, you might have the database on a server by itself for performance or security reasons. Then, network speed and reliability become key considerations.

  • The network link between the Retain Server and SQL Database Server must be fast and reliable.

  • The Retain database may have to be manually created by the administrator and a user account must be assigned with full rights. See the Database section.

  • Storage requirements: Roughly equivalent to the cumulative size of the message data store. See the ‘Estimating Storage Requirements’ section.

  • NOTE: Installation, maintenance, tuning, and backup of the database are the customer’s responsibility. Tuning an SQL Database Server can result in significant performance gains.

1.1.10 Retain Server

The Retain Server is the heart of the Retain system. All archive processes, search queries, user activities, and auditing are funneled through the Server. When planning the Server, consider the following:

  • CPU requirements are high. The bigger and faster the better.

  • Storage requirements: Storage sizes may change over time. An expandable storage scheme ensures options down the road.

  • Other web applications such as GroupWise WebAccess or iManager should not be installed on the same server. The Retain Server should be a dedicated machine.

  • Do not install Retain components on the same machine as iFolder.

1.1.11 Retain Worker

The Retain Worker is the piece that receives data on a scheduled basis from the messaging systems. It then passes this data to Retain Server. Things to consider when placing a Retain Worker are:

  • A reliable, speedy network connection between Retain Server and Retain Worker.

  • A reliable, speedy network connection to CAS and Mailbox Servers or Post Office Agents being accessed.

  • If desired, one Retain Worker can be placed on the same box as the Server for communication performance reasons.

1.1.12 Retain Router

The Retain Router needs to be visible and accessible to the Internet to receive mobile data.

  • The Retain Router should be installed in the DMZ.

  • A dependable network connection to both the Internet and the Retain Server is required.

  • The Retain Router must have an active and constant connection to a data holding location, called the ‘data path’. The data path is simply the directory where phone data is kept while waiting to be sent to the Retain Server.

  • Because data path storage requirements are low, it is highly recommended that the Retain Router and the data path be located on the same machine.
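
The data path behaves like a store-and-forward spool directory. The sketch below shows that general pattern under stated assumptions: the function names spool and forward_all, the JSON file format, and the send callback are all hypothetical and do not describe how the Retain Router actually writes or transmits phone data.

```python
import json
import uuid
from pathlib import Path
from typing import Callable


def spool(data_path: Path, record: dict) -> Path:
    """Write one incoming record to the data path and return its file."""
    data_path.mkdir(parents=True, exist_ok=True)
    final = data_path / f"{uuid.uuid4().hex}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps(record))
    tmp.replace(final)  # rename so the forwarder never sees a half-written file
    return final


def forward_all(data_path: Path, send: Callable[[dict], bool]) -> int:
    """Send each spooled record; delete it only after a successful send."""
    sent = 0
    for item in sorted(data_path.glob("*.json")):
        if send(json.loads(item.read_text())):
            item.unlink()
            sent += 1
    return sent
```

Records remain in the directory whenever send reports failure, which is why the data path needs to stay available to the router at all times.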

1.1.13 Messaging System Address Book

Retain gathers mail from known users. Users known to the messaging system are stored in the System Address Book. Retain caches this information locally. The address book needs to be updated as users are added. Retain never deletes a user from the cached address book unless there is no mail archived for the user. Over time, Retain will know about all users in the messaging system, both current and past. Because Retain stores past users in its cached address book, it can distinguish between two users of the same name. For example, “John Smith” added today will be recognized as a different user from “John Smith” who worked at the company six months ago.
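
The caching rule above can be sketched as follows. This is an assumption-laden illustration, not Retain's data model: the classes CachedUser and AddressBookCache, the uid field (standing in for whatever unique identifier the messaging system provides), and the active flag are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CachedUser:
    uid: str              # hypothetical unique ID from the messaging system
    display_name: str
    archived_items: int = 0
    active: bool = True


@dataclass
class AddressBookCache:
    users: dict = field(default_factory=dict)  # uid -> CachedUser

    def refresh(self, current: dict) -> None:
        """Merge the current system address book (uid -> display name)."""
        for uid, name in current.items():
            user = self.users.setdefault(uid, CachedUser(uid, name))
            user.display_name, user.active = name, True
        for uid in list(self.users):
            if uid not in current:
                if self.users[uid].archived_items == 0:
                    del self.users[uid]             # nothing archived: safe to forget
                else:
                    self.users[uid].active = False  # keep the past user
```

Keying the cache on a unique ID rather than the display name is what lets a past “John Smith” and a new “John Smith” coexist as separate entries.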