Object Pinning


The Pin List

The pin list contains URL patterns for identifying objects on the Web. You configure each URL pattern in the list with specific handling instructions as explained in the following sections.

Pinned objects remain in the cache indefinitely unless the cache fills up. Pinning ensures that these objects stay available from cache and are not bumped out by more recently requested objects.


URL Mask

The URL mask can contain complete or partial URL patterns. A single URL mask might apply to a large set of URLs, or it might be so specific that only a single file on the Web matches it. For more information, see Pin List Examples.

The appliance processes the masks in the pin list in order of specificity. A mask containing a host name is more specific than a mask that specifies only a file type. The action taken for an object is the action specified for the first mask that the object matches. For more information, see Processing URL Masks.

If the mask contains an asterisk, only the pin type can be specified. The Pin Links, Pin Images, and Refresh Frequency/Time options are not available for URLs containing this wildcard. Objects matching a mask with an asterisk are not automatically downloaded, but are pinned in cache only as individually requested.


Pin Type

The pin type specifies whether and how the appliance will cache objects that match the URL mask.


Pin Links

This specifies how many link levels iChain Proxy Services will follow for the pin type rule you've established. Selecting levels 1 or 2 causes all linked objects, including the images on the host, to be downloaded and cached when the pin list is applied to the appliance configuration, and then to be periodically refreshed as specified.

For example, if the requested object is an HTML page and you have specified a pin links level of 1, the HTML page will be downloaded and cached when the pin list is applied along with all the items linked from the page. These cached objects will also be refreshed at the refresh frequency and time specified.
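The effect of the link level can be pictured as a depth-limited traversal of links. The following Python sketch is illustrative only and is not how iChain Proxy Services is implemented: the function name objects_to_pin, the LINKS graph, and the file names under www.foo.gov are all hypothetical, and the sketch assumes that a level of 0 means only the requested object itself.

```python
from collections import deque

def objects_to_pin(start_url, links, level):
    """Return the set of URLs reachable from start_url within `level` link hops."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == level:
            continue  # do not follow links beyond the configured level
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Hypothetical link graph: each page maps to the objects it links to.
LINKS = {
    "http://www.foo.gov/documents/": [
        "http://www.foo.gov/documents/a.html",
        "http://www.foo.gov/documents/logo.gif",
    ],
    "http://www.foo.gov/documents/a.html": [
        "http://www.foo.gov/documents/b.html",
    ],
}

level_1 = objects_to_pin("http://www.foo.gov/documents/", LINKS, 1)
level_2 = objects_to_pin("http://www.foo.gov/documents/", LINKS, 2)
```

In this sketch, level 1 collects the page and the two objects it links to, while level 2 also picks up b.html, which is linked from a.html.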

To use levels 1 or 2 you must specify an absolute address, including the scheme, host, and path for the URL mask, for example, http://www.foo.gov/documents/. The tool will let you insert masks that do not meet this requirement, but the entries are removed when you click Apply.

Attempting to include an asterisk wildcard immediately hides this option.
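The two restrictions on Pin Links (an absolute address and no asterisk) can be expressed as a simple check. The following Python sketch is only an approximation of the rule as described above; the function name allows_link_pinning is hypothetical, and the actual validation performed by the tool might differ.

```python
from urllib.parse import urlsplit

def allows_link_pinning(mask):
    """Return True if the mask is an absolute URL with no asterisk wildcard."""
    if "*" in mask:
        return False  # wildcard masks accept only a pin type
    parts = urlsplit(mask)
    # Scheme, host, and path must all be present for link levels 1 or 2.
    return bool(parts.scheme and parts.netloc and parts.path)
```

With this check, http://www.foo.gov/documents/ qualifies, while foo.gov/documents/ (no scheme) and http://*.foo.gov/documents/ (wildcard) do not.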


Pin Images

This option is used to pin image files that reside on a different host than the page requested. It works in conjunction with the Pin Links option, which specifies how many levels of links iChain Proxy Services will follow when downloading a page.

For example, if the requested HTML page uses images that reside on another host and you have checked this option, the HTML page will be cached along with all the image files associated with the page, including those on the other host. If you have also specified a pin link level, images on the linked pages that reside on another host will also be pinned.

On the other hand, if the Pin Images option is not checked, iChain Proxy Services only pins the images that reside on the same host as the requested page.
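The same-host rule can be summarized in a few lines. This Python sketch is a simplified reading of the behavior described above, not the product's logic; the function name should_pin_image is hypothetical, and hosts are compared by name only.

```python
from urllib.parse import urlsplit

def should_pin_image(page_url, image_url, pin_images):
    """Decide whether an image referenced by a pinned page is pinned with it."""
    same_host = urlsplit(page_url).hostname == urlsplit(image_url).hostname
    # Same-host images are always pinned; other hosts require Pin Images.
    return same_host or pin_images
```

For a page on www.foo.gov, an image on the same host is pinned either way, whereas an image on a different host is pinned only when the Pin Images option is checked.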


Refresh Frequency/Time

This lets you specify a refresh frequency and time for the URL that is different from the default values shown above the pin list.


Processing URL Masks

There are four basic types of URL masks you can enter in the pin list. The following table lists each type, provides a few examples of each, and provides information on how they are processed by iChain Proxy Services.

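The table columns are Type, URL Mask Examples by Specificity, and Notes. Each type below is followed by its example masks, listed by specificity, and then by its notes.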

Hostname

http://www.foo.gov/documents/picture.gif

http://www.foo.gov/documents/

http://www.foo.gov

foo.gov/documents/

foo.gov/

*.foo.gov/

Although these entries can include the protocol or scheme, the DNS name, the path, and the filename, only the DNS or hostname must be present in the mask. All DNS label portions must be indicated, if only by an asterisk wildcard.

iChain Proxy Services processes hostname entries before it processes other mask types. It also processes the most specific URL mask entries first.

When an object match occurs, iChain Proxy Services applies the pin type rule, and processing of the object is finished.

For example, if the first URL mask in the examples column has a pin type rule of bypass, PICTURE.GIF will not be cached regardless of the pin type rules for the other URL masks.

Hostname entries can have a dramatic impact on object pinning and cache bypassing.

For example, if the first two URL masks in the examples column were not present, a pin type of Bypass on the third URL mask would prevent caching of all objects delivered through HTTP on the www.foo.gov Web site.

If no scheme (HTTP, FTP, etc.) is indicated, the mask applies to all schemes. The last three masks would apply to objects delivered through any Web protocol.

Finally, Configure interprets hostnames literally. For example, the sixth entry would cover www.foo.gov, ww1.foo.gov, army.foo.gov, etc., but the fourth and fifth entries would not, because a scheme is assumed to immediately precede the hostname.

Path

/documents/picture.gif

/documents/picture.gif/

/documents/

iChain Proxy Services processes path entries after all hostname entries have been considered. It assumes that the first forward slash immediately follows a hostname.

A leading forward slash must always be used when specifying a directory. The leading slash always references the root directory of the Web server.

For example, the first entry would apply only to a graphics file named PICTURE.GIF that is located in a DOCUMENTS directory at the root of the host.

The forward slash in the second entry causes iChain Proxy Services to assume that PICTURE.GIF is a directory. The pin type rules associated with this entry would apply to any matched objects that have a URL directory path that starts with a documents directory followed by a subdirectory named PICTURE.GIF.

The third entry would apply to any matched objects whose URL path starts with a DOCUMENTS directory at the Web server's root directory.

Filename

/picture.gif

/widget.js

/default.htm

After the path entries have all been processed, iChain Proxy Services looks for specific filenames.

A leading forward slash must be used. Unlike the leading slash in a path-based mask, it does not reference the root directory of the Web server.

For example, if requested objects named PICTURE.GIF, WIDGET.JS, and DEFAULT.HTM have not been covered by one of the hostname or path entries above, the files will have the pin type rule for their respective filename mask applied to them.

If the first entry carries a pin type rule of Bypass, all PICTURE.GIF files that didn't match previously processed hostname or path masks would not be cached.

File Extension

/*.gif

/*.js

/*.htm

File extension entries are processed last.

These are simply filename entries with the root of the filename replaced by an asterisk, which makes them less specific than complete filenames.

A leading forward slash must be used. Unlike the leading slash in a path-based mask, it does not reference the root directory of the Web server.

For example, if the examples shown all had pin types of Bypass, then only those .GIF, .JS, and .HTM files that had been cached and pinned because of hostname, path, or filename masks would be stored in cache. All other files with the named extensions would not be cached.
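The four processing stages described above can be sketched as a classification followed by a stable sort. The Python below is a rough heuristic written for illustration, not the product's algorithm: the names mask_type and processing_order are hypothetical, the classification rules are inferred from the example masks, and the sketch does not attempt to rank masks by specificity within a type.

```python
# Processing stages, in the order described in the table notes.
STAGE = {"hostname": 0, "path": 1, "filename": 2, "extension": 3}

def mask_type(mask):
    """Classify a URL mask using a heuristic based on the example masks."""
    if not mask.startswith("/"):
        return "hostname"   # for example http://www.foo.gov or *.foo.gov/
    rest = mask[1:]
    if "/" in rest:
        return "path"       # for example /documents/ or /documents/picture.gif
    if rest.startswith("*."):
        return "extension"  # for example /*.gif
    return "filename"       # for example /picture.gif

def processing_order(masks):
    """Sort masks by stage; masks of the same type keep their original order."""
    return sorted(masks, key=lambda m: STAGE[mask_type(m)])

masks = ["/*.gif", "/picture.gif", "/documents/", "http://www.foo.gov"]
ordered = processing_order(masks)
```

Here the hostname mask moves to the front and the file extension mask to the end, which mirrors the order in which the table says the mask types are considered.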


Wildcards in Pin Lists

Only the asterisk (*) wildcard is allowed in pin list entries.

iChain Proxy Services interprets everything between an asterisk and the next delimiter to the right (a forward slash [/], a period [.], or a colon [:]) as a wildcard. This effectively allows only one asterisk between delimiters.
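One way to picture this wildcard rule is as a translation into a regular expression in which an asterisk matches any run of characters up to the next delimiter. The following Python sketch is an interpretation of the rule for host names only, not the matching code used by iChain Proxy Services; the function names mask_to_regex and host_matches are hypothetical.

```python
import re

def mask_to_regex(mask):
    """Translate a mask into a regex where * stops at '/', '.', or ':'."""
    # Anything between an asterisk and the next delimiter is part of the
    # wildcard, so collapse it into a single asterisk first.
    normalized = re.sub(r"\*[^/.:]*", "*", mask)
    pieces = [re.escape(piece) for piece in normalized.split("*")]
    return re.compile("[^/.:]*".join(pieces))

def host_matches(mask, host):
    """Return True if the whole host name matches the mask."""
    return mask_to_regex(mask).fullmatch(host) is not None
```

Under this reading, w*.f*.com matches www.foo.com but not www.foo.net, and *.foo.* matches www.foo.gov but not foo.gov, because no label precedes foo. A mask such as w*x*.com collapses to w*.com, which reflects the one-asterisk-between-delimiters behavior.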


Pin List Examples

The following table provides brief examples of sample pin list entries and their effects on appliance caching.

The table columns are URL Mask, Pin Type, Pin Links, Pin Images, and Effect on Cache. Each entry below lists these values in that order.

URL Mask: http://www.foo.gov/documents/
Pin Type: cache
Pin Links: 1
Pin Images: Yes
Effect on Cache:

As a general rule, you should always include fully qualified DNS or hostnames in the pin list. iChain Proxy Services resolves these more quickly than other masks, and you will be able to track the effects on pinning more easily.

For this URL mask, iChain Proxy Services downloads, caches, and pins all objects whose URL starts with the mask. In other words, all objects below the documents directory will be downloaded, cached, and pinned. Also, all objects that are linked from one of the pinned objects will be downloaded, cached, and pinned. And finally, images that reside on other hosts will be downloaded, cached, and pinned as well.

Objects will be refreshed according to the refresh settings (default or specific) as specified in the pin list entry.

URL Mask: www.foo.gov/groups.html
Pin Type: cache
Pin Links: 1
Pin Images: No
Effect on Cache:

iChain Proxy Services downloads, caches, and pins objects (including images) in the GROUPS.HTML page and in pages linked from that page. Any images referenced from other hosts, however, are not included.

URL Mask: www.foo.gov/groups.html/
Pin Type: normal
Pin Links: 1
Pin Images: Yes
Effect on Cache:

iChain Proxy Services downloads and caches objects in the subdirectory named groups.html and in pages linked from any of those objects.

The forward slash at the end of the path tells iChain Proxy Services that this is a directory rather than a file.

Objects are cached but not pinned in cache, meaning they might be bumped by more frequently accessed objects or objects that are pinned.

Images linked from other hosts are downloaded and cached.

URL Mask: www.foo.*
Pin Type: bypass
Pin Links: n/a
Pin Images: n/a
Effect on Cache:

iChain Proxy Services doesn't cache objects from any URLs whose DNS names begin with www.foo.

All domain extensions (.com, .net, .org, etc.) are covered by the asterisk wildcard.

Link and image pinning are not available for bypass pin types.

If this entry appeared in a pin list with either of the previous two entries, it would not prevent caching of objects covered by them because it is less specific than they are.

URL Mask: w*.f*.com
Pin Type: bypass
Pin Links: n/a
Pin Images: n/a
Effect on Cache:

iChain Proxy Services doesn't cache objects for any URLs whose first domain label begins with w and whose second domain label begins with f, provided that the domain extension is .com.

This mask doesn't prevent caching of objects on other domains such as .net, .gov, etc.

URL Mask: w*.f*.*
Pin Type: bypass
Pin Links: n/a
Pin Images: n/a
Effect on Cache:

This mask functions like the previous entry, but the domain is not limited to .com.

URL Mask: *.foo.*
Pin Type: cache
Pin Links: n/a
Pin Images: n/a
Effect on Cache:

This causes all objects on any Web server whose second domain label is foo to be pinned in cache.

Link and image pinning are not available because the mask contains asterisks.

This mask would not cover DNS names that don't have a domain label before foo. For example, foo.gov would not normally be covered. However, if foo.gov happened to resolve in DNS to the same IP address as www.foo.gov, the iChain Proxy Server would apply the pinning rules specified for www.foo.gov to foo.gov. To understand more about IP addresses and URL masks, see Using the Proxy Server to Record IP Addresses When Resolving URL Masks.