16.4 Configuring a Pin List

A pin list contains URL patterns for identifying objects on the Web. The Access Gateway uses the list to prepopulate the cache, before any requests have come in for the content. This accelerates user access to the content because it is retrieved from a local cache rather than from an exchange with the Web server, which would read it from disk.

You can use the pin list to specify which objects are pinned in cache and which objects are never cached.

The pin list is global to the Access Gateway and affects all protected resources. The pinned objects remain in cache indefinitely unless the cache fills up. This ensures that the objects are available from cache and are not bumped out by more recently requested objects. You configure each pinned object with a URL pattern and specific handling instructions.

To configure a pin list:

  1. In the Administration Console, click Access Manager > Access Gateways > Edit > Pin List.

  2. Fill in the following fields:

    Enable Pin List: Select this option to enable the use of pinned objects. If this option is not selected, the pinned objects in the pin list are not used.

    Default Refresh Frequency/Time: (NetWare only) Sets the default interval at which the Access Gateway checks the URL patterns to see whether any new objects need to be cached or any deleted objects need to be removed from cache. You can override this default by selecting a different refresh interval for a specific pinned object. Select one of the following for the default value:

    • Once Immediately: Select this option to refresh the list as soon as the changes to this page are pushed to the server.

    • Day and Hour: Select a day and a time for the refresh.

    • Hourly Interval: Select an interval, specified in hours, for refreshing the pin list.

  3. In the Pin List section, click New.

  4. Fill in the following fields.

    URL Mask: Specifies the URL pattern to match. For more information, see Section 16.4.1, URL Mask.

    Pin Type: Specifies how the URL is to be used to cache objects. Select from Normal, Cache, Memory, and Bypass. The Linux Access Gateway supports only Normal and Bypass. For more information, see Section 16.4.2, Pin Type.

    Follow Links: (NetWare only) Indicates whether the Access Gateway can follow links and limits nested links to the value specified. A value of zero indicates that links should not be followed. For more information, see Section 16.4.3, Follow Links.

    Other Hosts: (NetWare only) Indicates whether the Access Gateway can follow links to other hosts and cache pages from these hosts. This is only available if the Follow Links field is set between 1 and 4.

    Refresh Frequency/Time: (NetWare only) Sets the refresh interval for checking this URL pattern and seeing if any objects have been modified. Select Use Default to use the refresh interval set for all URL patterns, or specify an interval for this object. A value specified here overrides the default setting.

    When the fields are configured, click OK.

  5. To save your changes to browser cache, click OK.

  6. To apply the changes, click the Access Gateways link, then click Update > OK.
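The constraints among these fields can be sketched as a small data structure. The following Python sketch is illustrative only: the PinEntry class, its field names, and the problems method are hypothetical and are not part of the Access Gateway or the Administration Console. It encodes three rules stated in this section: the Linux Access Gateway supports only the Normal and Bypass pin types, Other Hosts requires a Follow Links value between 1 and 4, and a Follow Links value other than 0 requires an absolute URL mask.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

PIN_TYPES = {"normal", "cache", "memory", "bypass"}
LINUX_PIN_TYPES = {"normal", "bypass"}


@dataclass
class PinEntry:
    """Hypothetical model of one row in the pin list."""
    url_mask: str
    pin_type: str = "normal"
    follow_links: int = 0          # 0 means links are not followed
    other_hosts: bool = False
    refresh: Optional[str] = None  # None stands for "Use Default"

    def problems(self, platform: str = "netware") -> list:
        """Return human-readable problems; the list is empty if none are found."""
        found = []
        if self.pin_type not in PIN_TYPES:
            found.append(f"unknown pin type: {self.pin_type}")
        elif platform == "linux" and self.pin_type not in LINUX_PIN_TYPES:
            found.append(f"pin type {self.pin_type} is not supported on Linux")
        if self.other_hosts and not 1 <= self.follow_links <= 4:
            found.append("Other Hosts requires Follow Links between 1 and 4")
        if self.follow_links > 0:
            parts = urlsplit(self.url_mask)
            # An absolute address has a scheme, a host, and a path.
            if not (parts.scheme and parts.netloc and parts.path):
                found.append("Follow Links above 0 requires an absolute URL mask")
        return found
```

For example, PinEntry("www.foo.gov/documents/*", "cache", follow_links=1).problems() reports the missing scheme, while the same entry with the mask http://www.foo.gov/documents/ reports nothing.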

16.4.1 URL Mask

The URL mask can contain complete or partial URL patterns. A single URL mask might apply to a large set of URLs, or it might be so specific that only a single file on the Web matches it.

The Access Gateway processes the masks in the pin list in order of specificity. A mask containing a host name is more specific than a mask that specifies only a file type. The action taken for an object is the action specified for the first mask that the object matches.

The Access Gateway recognizes four levels of specificity, listed here from most specific to least specific:

  • hostname

    Examples:

      http://www.foo.gov/documents/picture.gif
      http://www.foo.gov/documents/*
      http://www.foo.gov
      foo.gov/documents/*
      foo.gov/*

    All of these are classified as hostnames, and they are ordered by specificity. The first item in the list is considered the most specific and would be processed first. The last item is the most general and would be processed last.

  • path

    Examples:

      /documents/picture.gif
      /documents/pictures.gif/*
      /documents/*

    Path entries are processed after hostnames. A leading forward slash must always be used when specifying a path, and the entry that follows must always reference the root directory of the Web server. In these examples, documents is the root directory.

    The /* at the end of the path indicates that the entry is a directory. Its absence indicates that the entry is a file. In these examples, picture.gif is a file and pictures.gif/* and documents/* are directories.

    If you enter a path without the trailing *, the path matches only the directory. With the trailing *, the path matches everything in the directory and its subdirectories.

    These path entry examples are ordered by specificity. The objects in the /documents/pictures.gif directory are processed before the objects in the /documents directory.

  • filename

    Examples:

      /picture.gif
      /widget.js

    Filenames are processed after paths. A leading forward slash must always be used when specifying a filename. If a path is included with a filename, the path must start with the root directory of the Web server, and the entry is processed as a path entry, not as a filename entry.

  • file extension

    Examples:

      /*.gif
      /*.js
      /*.htm

    File extensions are processed last. They consist of a leading forward slash, an asterisk, a period, and a file extension.

Specific rules have precedence over less specific rules. Thus, objects matched by a more specific rule are always processed according to its conditions. If a less specific rule also matches the object, the less specific rule is ignored for the object. For example, assume the following two entries in the pin list:

URL Mask                          Pin Type    Follow Links
http://www.foo.gov/documents/*    cache       1
www.foo*                          bypass      N/A

Because the first entry is the more specific of the two, it is applied first: the Access Gateway caches the pages in the documents directory, follows the links on those pages one level deep, and caches the linked pages. The second entry does not affect what the first entry caches, but it prevents objects from any other host whose DNS name begins with www.foo (regardless of the domain extension, such as .com, .net, or .org) from being cached.
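The four levels of specificity can be illustrated with a short classification routine. This Python sketch is a simplification, not the Access Gateway's actual algorithm: the function names mask_level and order_masks are hypothetical, and the sketch orders masks only by level. It does not attempt to reproduce the ordering of masks within the same level.

```python
def mask_level(mask: str) -> int:
    """Return 0 for hostname, 1 for path, 2 for filename, 3 for file extension."""
    if not mask.startswith("/"):
        return 0          # has a host part, with or without a scheme
    body = mask[1:]
    if body.startswith("*."):
        return 3          # for example, /*.gif
    if "/" in body:
        return 1          # for example, /documents/* or /documents/picture.gif
    return 2              # for example, /picture.gif


def order_masks(masks):
    """Sort masks from most specific level to least specific level.

    sorted() is stable, so masks of the same level keep their input order.
    """
    return sorted(masks, key=mask_level)
```

Calling order_masks(["/*.gif", "/picture.gif", "/documents/*", "http://www.foo.gov/documents/*"]) places the hostname mask first, followed by the path, the filename, and the file extension.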

16.4.2 Pin Type

The pin type specifies how the Access Gateway caches objects that match the URL mask.

  • Normal: The Access Gateway handles objects matching the mask in the same way it handles any other requested objects. In other words, the objects are cached but not pinned.

    Administrators often use this pin type in combination with a broad URL mask that has a bypass pin type. This allows them to insulate specific objects from the effects of the bypass rule.

    For example, you could specify a URL mask of /*.jpg with a pin type of bypass and a second URL mask of www.foo.gov/graphics/* with a pin type of normal. This causes all files, including .jpg files, in the graphics directory on the foo.gov Web site to be cached as requested. They are not, however, pinned in cache because of the normal pin type. Assuming there are no other URL masks in the pin list, all other JPG graphics are not cached because of the /*.jpg mask.

  • Cache: The Access Gateway keeps the pinned objects in cache as long as possible, although they might be written to the hard disk. This option is not supported by the Linux Access Gateway.

  • Memory: The Access Gateway keeps the pinned objects in memory as long as possible, writes them to disk when memory gets too full, and places them back in memory as soon as they are requested by a user of the cache. This option is not supported by the Linux Access Gateway.

  • Bypass: The Access Gateway does not cache the objects. In other words, you can use this option to prevent objects from being cached.
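The interaction between a broad bypass mask and a narrower normal mask can be sketched as a first-match lookup. The following Python sketch is an approximation under stated assumptions: the matches and pin_type_for functions are hypothetical, wildcard matching is approximated with the standard fnmatch module, the pin list is assumed to be already ordered from most specific to least specific, and objects that match no mask are assumed to receive ordinary (normal) handling.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def matches(mask: str, url: str) -> bool:
    """Approximate test of whether a URL mask applies to a full URL."""
    parts = urlsplit(url)
    if mask.startswith("/"):
        # Path, filename, and file extension masks: compare with the URL path.
        return fnmatchcase(parts.path, mask)
    # Hostname masks: ignore the scheme on both sides.
    target = parts.netloc + parts.path
    pattern = mask.split("://", 1)[-1]
    return fnmatchcase(target, pattern)


def pin_type_for(url: str, pin_list) -> str:
    """Return the pin type of the first mask that matches the URL.

    pin_list holds (mask, pin_type) pairs, most specific first.
    """
    for mask, pin_type in pin_list:
        if matches(mask, url):
            return pin_type
    return "normal"  # assumption: unmatched objects are cached as usual


# The example from the Normal pin type description.
PIN_LIST = [
    ("www.foo.gov/graphics/*", "normal"),
    ("/*.jpg", "bypass"),
]
```

With this list, a JPG file under www.foo.gov/graphics/ resolves to normal because the hostname mask matches first, and a JPG file anywhere else resolves to bypass.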

16.4.3 Follow Links

The Follow Links field specifies the number of links the Access Gateway can follow as it caches objects that match the URL pattern. For example, if the requested object is an HTML page and you have specified a Follow Links level of 1, the HTML page is downloaded and cached along with all the items linked from the page. These cached objects are also refreshed at the frequency and time specified. If there are links on the linked pages, these links are not followed and those pages are not cached. To add these objects, you would need to specify 2 for the Follow Links option.

To use a level other than 0, you must specify an absolute address, including the scheme, host, and path for the URL mask, for example:

 http://www.foo.gov/documents/
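The effect of the Follow Links level can be sketched as a depth-limited, breadth-first walk over the links on each page. This Python sketch is illustrative only: the objects_to_cache function and the page names in LINKS are hypothetical, and the link structure is supplied as an in-memory dictionary rather than being read from a Web server.

```python
from collections import deque


def objects_to_cache(start: str, links: dict, follow_links: int) -> set:
    """Collect the start object plus everything within follow_links hops of it."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == follow_links:
            continue  # the limit is reached; links on this page are not followed
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Hypothetical site: the start page links to a.html and b.html,
# and a.html links to c.html.
BASE = "http://www.foo.gov/documents/"
LINKS = {
    BASE: [BASE + "a.html", BASE + "b.html"],
    BASE + "a.html": [BASE + "c.html"],
}
```

At level 0 only the start page is collected. At level 1, a.html and b.html are added. The page c.html is added only at level 2, because it is linked from a linked page.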