How Forward Proxy Caching Works

This section describes how forward proxy caching works, including caching hierarchies, object types and caching, and proxy server security.


Forward Proxy Caching

Proxy Services acts as an intermediary between hosts on a protected network and the Internet or intranet, or between Internet clients and servers on your network. When a user requests an Internet service, such as HTTP, the client submits the request to the proxy, which then acts on the client's behalf. The proxy checks its local cache for the data and, if the data is available, sends it to the client immediately. If the data is not available, the proxy requests the data from the hierarchical cache servers or the origin server on the Internet, and then returns the data to the client.

The proxy server, then, works as both a client and a server. As a server, it receives requests from intranet clients. As a client, it forwards the requests to the origin Internet server.
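The cache-then-fetch behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the Proxy Services implementation: the class name ForwardProxyCache, the fetch_upstream callback, and the fake_origin function are all hypothetical, and the cache is a plain in-memory dictionary with no aging.

```python
from typing import Callable, Dict


class ForwardProxyCache:
    """Toy forward proxy cache (hypothetical; for illustration only)."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream stands in for "ask a cache hierarchy or the origin server".
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Server role: answer the client from the local cache when possible.
        cached = self._store.get(url)
        if cached is not None:
            self.hits += 1
            return cached
        # Client role: forward the request upstream, then keep a copy.
        self.misses += 1
        body = self._fetch_upstream(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    # Assumed stand-in for a real origin server.
    return f"content of {url}".encode()


proxy = ForwardProxyCache(fake_origin)
first = proxy.get("http://host.com/marketing/doc.html")   # miss: fetched upstream
second = proxy.get("http://host.com/marketing/doc.html")  # hit: served from cache
```

The first call goes upstream and the second is served locally, which is the effect a client sees as a faster response.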

When a client makes a regular HTTP request (without using a proxy), the HTTP server receives only the path and keyword portion of the requested URL. For example, the user enters the following URL:

http://host.com/marketing/doc.html

The browser sends the following command to host.com:

GET /marketing/doc.html

In this example, the protocol specifier http and the hostname are already known to the remote HTTP server. The requested path specifies the object available from that server.
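As a rough illustration of how a browser derives this command from a URL, the following Python sketch uses the standard urllib.parse module. The function name origin_request_line is hypothetical, and the HTTP version and headers that a real browser also sends are omitted to match the example above.

```python
from urllib.parse import urlsplit


def origin_request_line(url: str) -> str:
    """Build the GET command sent directly to an origin server (sketch)."""
    parts = urlsplit(url)
    # Only the path (plus any query string) is sent; the scheme and
    # hostname are used to open the connection, not placed in the command.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path}"


line = origin_request_line("http://host.com/marketing/doc.html")
host = urlsplit("http://host.com/marketing/doc.html").hostname
```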

When a client sends a request through a proxy server, the browser uses HTTP and the GET method to communicate with the proxy, even when the requested object resides on a remote server that uses a different protocol, such as Gopher or FTP. The browser sends the full URL, so the proxy server has all the information necessary to make the request to the remote server. If the URL specifies a different protocol, the proxy identifies that protocol and uses it, together with the full URL, to make the request.

For example, for an FTP request, the browser sends the following command to the proxy:

GET ftp://host.com/marketing/doc.html

The proxy uses all of this information when it contacts the origin server. The proxy requests the document using FTP, the origin server returns the results to the proxy as an FTP reply, and the proxy then sends the information to the user as an HTTP reply.
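The following Python sketch shows how a proxy might split such a command into the pieces it needs before contacting the remote server. It assumes the command carries a full URL, as in the FTP example; the function name parse_proxy_request and the dictionary keys are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def parse_proxy_request(request_line: str) -> Dict[str, Optional[str]]:
    """Split a proxy-style GET command into method, protocol, host, and path."""
    tokens = request_line.split()
    method, url = tokens[0], tokens[1]  # any trailing HTTP version is ignored
    parts = urlsplit(url)
    return {
        "method": method,
        "scheme": parts.scheme,   # tells the proxy which protocol to speak upstream
        "host": parts.hostname,   # the remote server to contact
        "path": parts.path or "/",
    }


request = parse_proxy_request("GET ftp://host.com/marketing/doc.html")
```

A proxy built this way would select its upstream handler from the scheme value (FTP in this case) and still reply to the browser over HTTP.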


Caching Hierarchies

You can set up a hierarchy of proxy cache servers to reduce the WAN load and resolve requests. Whenever a request for an object cannot be resolved, the proxy server contacts its neighbors (peers) and parents using the Internet Cache Protocol (ICP), a simple resolution protocol. The proxies exchange queries and replies to gather information and select the best location from which to retrieve a requested object.

If the URL matches an entry in a configurable list of substrings, the object is retrieved directly from the origin server rather than from other proxy servers. If the request is for a cachable object, the proxy server sends the request to the siblings and parents using UDP broadcast. The object is retrieved from the closest available site. Caching hierarchies reduce the load on origin Web servers and distribute the load across many cache servers. See ICP Hierarchical Caching for more information about hierarchical caching and how it works.
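The selection logic described above can be sketched as follows. This is a simplified model, not an ICP implementation: no UDP packets are sent, the replies are passed in as data, "closest" is assumed to mean the lowest round-trip time, and the names NeighborReply and choose_source are hypothetical. When no neighbor reports the object, this sketch simply falls back to the origin server.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NeighborReply:
    name: str             # hypothetical label for a sibling or parent proxy
    has_object: bool      # True if the neighbor reported the object as cached
    round_trip_ms: float  # assumed measure of how "close" the neighbor is


def choose_source(url: str,
                  bypass_substrings: Sequence[str],
                  replies: Sequence[NeighborReply]) -> str:
    """Return the name of the neighbor to fetch from, or "origin"."""
    # URLs matching the configurable substring list skip the hierarchy.
    if any(substring in url for substring in bypass_substrings):
        return "origin"
    hits = [reply for reply in replies if reply.has_object]
    if not hits:
        return "origin"
    # Pick the closest neighbor that holds the object.
    return min(hits, key=lambda reply: reply.round_trip_ms).name


replies = [
    NeighborReply("sibling-a", True, 12.0),
    NeighborReply("parent-b", True, 4.0),
    NeighborReply("sibling-c", False, 1.0),
]
source = choose_source("http://host.com/marketing/doc.html", ["cgi-bin"], replies)
```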


Object Types and Caching

Not all objects can or should be cached. Some types of objects are of no value when cached because they change too frequently. Other types of objects require authentication before they can be accessed.

HTTP supports the HEAD method, which retrieves only the headers of an object so that the proxy can determine how recent the object is. If the headers show that the object has not been modified since the cached copy was stored, the object is not retrieved and the cached copy is used.

HTTP also supports the If-Modified-Since request header, enabling a conditional GET request. The GET request contains the date and time the object in the proxy cache was last modified. If the object has been modified since the stored date and time, a new copy is retrieved. Otherwise, the server replies that the object is unchanged (status 304 Not Modified) and the cached copy is used.
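A conditional GET can be illustrated with the short Python sketch below. It only models the date comparison: the function names if_modified_since_value and origin_status are hypothetical, the dates are made up for the example, and no network request is made. Status 200 means a new copy is returned; status 304 means the cached copy is still current.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def if_modified_since_value(cached_last_modified: datetime) -> str:
    """Format the cached object's timestamp as an HTTP date (must be UTC)."""
    return format_datetime(cached_last_modified, usegmt=True)


def origin_status(header_value: str, origin_last_modified: datetime) -> int:
    """Model the origin server's decision for a conditional GET."""
    since = parsedate_to_datetime(header_value)
    # 200: object changed, send a new copy. 304: unchanged, proxy reuses its cache.
    return 200 if origin_last_modified > since else 304


# Example timestamps (assumptions for illustration only).
cached_at = datetime(2002, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = if_modified_since_value(cached_at)

unchanged = origin_status(header, datetime(2001, 12, 30, 8, 0, 0, tzinfo=timezone.utc))
changed = origin_status(header, datetime(2002, 1, 2, 9, 30, 0, tzinfo=timezone.utc))
```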

Usually, the proxy server does not cache the following types of objects:

You can specify additional noncachable object types. For more information, refer to the Proxy Services online documentation.

The Novell BorderManager 3.7 proxy uses the cache aging information that Web servers usually provide to browsers. This information specifies how long pages should be cached. The HTML text is typically only a small part of the transmitted data, even for sites that dynamically generate HTML pages. The majority of the data consists of images that are static and cachable. To improve performance, you can fine-tune cache aging policies. For more information, refer to the Proxy Services online documentation.


Proxy Servers and Security

Proxy Services interacts with the following to provide additional proxy server security:

One benefit of establishing proxy servers on your intranet is to increase security through access control and the logging of URL requests. Proxy servers have two types of security:

The proxy server provides tighter security than address filtering alone. The proxy server examines not only the address of a packet but also the entire context of the session in which the packet is being sent, making it easier to identify suspicious packets.

The proxy server can be used as a part of a firewall solution or together with firewall solutions from other vendors. It can be used in front of, within, or behind existing firewalls.


Proxy Services and Access Control

For additional security, access is controlled using access control list rules. You can set up Proxy Services access control to do the following:

For example, you might want to deny access to Web sites that do not fit your company policy or are not essential to completing company work.

With access control lists, the proxy server restricts access based on the source and destination IP addresses, URLs, domains, and NDS or eDirectory usernames. The proxy server also works with third-party site-blocking software, such as SurfControl, to block sites by category.

The access control list rules are stored in the eDirectory database. The access control list is a set of rules that either allow or deny a specific action. The access control list module checks the HTTP request and determines whether any of the access rules apply. If a rule applies, the specified action is performed. Otherwise, the default rule is applied. You can create access control rules at the Country, Organization, Organizational Unit, and Server object levels. Rules can be based on criteria such as users, groups, IP addresses, or services. For more information, refer to the access control online documentation.
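The rule check described above can be sketched as a simple loop. This is an illustrative model only: the Rule fields, the first-match ordering, and the default action of "deny" are assumptions made for the example, not the documented behavior of the access control list module, and rules are held in a Python list rather than in eDirectory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Rule:
    action: str                           # "allow" or "deny"
    source_ip: Optional[str] = None       # None means "any source address"
    url_substring: Optional[str] = None   # None means "any URL"
    username: Optional[str] = None        # None means "any user"

    def matches(self, source_ip: str, url: str, username: str) -> bool:
        # A rule applies only if every criterion it specifies is satisfied.
        if self.source_ip is not None and self.source_ip != source_ip:
            return False
        if self.url_substring is not None and self.url_substring not in url:
            return False
        if self.username is not None and self.username != username:
            return False
        return True


def evaluate(rules: Sequence[Rule], source_ip: str, url: str, username: str,
             default_action: str = "deny") -> str:
    """Return the action of the first applicable rule, else the default."""
    for rule in rules:
        if rule.matches(source_ip, url, username):
            return rule.action
    return default_action


# Hypothetical rule set: block one site for everyone, then allow one user.
rules = [
    Rule("deny", url_substring="games.example"),
    Rule("allow", username="jsmith"),
]
blocked = evaluate(rules, "10.0.0.5", "http://games.example/play", "jsmith")
allowed = evaluate(rules, "10.0.0.5", "http://host.com/marketing/doc.html", "jsmith")
```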

With NDS or eDirectory, access control lists are associated by container, group, user, or server. Access control lists can apply to all proxies in an organization, thereby giving management a global view.

Novell IP Gateway clients also provide eDirectory usernames to log in to the gateway before sending HTTP requests. URL restrictions can also be based on usernames. In addition, HTTP/IPX and HTTP/IP clients can directly access proxy servers using the gateway client transparent proxy feature.


Novell Internet Gateway Clients

The proxy server also supports the HTTP protocol over IP (as well as any WinSock-based program). Novell IPX/IP and IP/IP gateway clients can directly access the proxy server. When you configure your browser on the gateway client to go through a proxy, the gateway client automatically detects the proxy servers using NDS or eDirectory. The gateway client then redirects the requests to the proxy.