13.5 Configuring HTML Rewriting

Access Gateway configurations generally require HTML rewriting because the Web servers are not aware that the Access Gateway machine is obfuscating their DNS names. URLs contained in their pages must be checked to ensure that these references contain the DNS names that the client browser understands. On the other end, the client browsers are not aware that the Access Gateway is obfuscating the DNS names of the resources they are accessing. The URL requests coming from the client browsers that use published DNS names must be rewritten to the DNS names that the Web servers expect. Figure 13-3 illustrates these processes.

Figure 13-3 HTML Rewriting

The following sections describe the HTML rewriting process:

13.5.1 Understanding the Rewriting Process

The Access Gateway needs to rewrite URL references under the following conditions:

  • To ensure that URL references contain the proper scheme (HTTP or HTTPS).

    If your Web servers and Access Gateway machines are behind a secure firewall, you might not require SSL sessions between them, and only require SSL between the client browser and the Access Gateway. For example, an HTML file being accessed through the Access Gateway for the Web site novell.com might have a URL reference to http://novell.com/path/image1.jpg. If the reverse proxy for novell.com/path is using SSL sessions between the browser and Access Gateway, the URL reference http://novell.com/path/image1.jpg must be rewritten to https://novell.com/path/image1.jpg. Otherwise, when the user clicks this link, the browser bounces between HTTP and HTTPS to establish a new SSL session.

  • To ensure that URL references containing private IP addresses or private DNS names are changed to the published DNS name of the Access Gateway or hosts.

    For example, suppose that a company has an internal Web site named data.com, and wants to expose this site to Internet users through the Access Gateway using a published DNS name of novell.com. Many of the HTML pages on this Web site have URL references that contain the private DNS name, such as http://data.com/imagel.jpg. Because Internet users are unable to resolve data.com/imagel.jpg, links using this URL reference would return DNS errors in the browser.

    The HTML rewriter can resolve this issue. The DNS name field in the Access Gateway configuration is set to novell.com, which users can resolve through a public DNS server to the Access Gateway. The rewriter parses the Web page, and any URL references matching the private DNS name or private IP address listed in the Web server address field of the Access Gateway configuration are rewritten to the published DNS name novell.com and the port number of the Access Gateway.

    Rewriting URL references addresses two issues. It makes URL references accessible that were unreachable because they used private DNS names or IP addresses, and it prevents the exposure of private IP addresses and DNS names that might be sensitive information.

  • To ensure that the Host header in incoming HTTP packets contains the name understood by the internal Web server.

    Using the example in Figure 13-3, suppose that the internal Web server expects all HTTP or HTTPS requests to have the Host field set to data.com. When users send requests using the published DNS name novell.com/path, the Host field of the packets in those requests received by the Access Gateway is set to novell.com. The Access Gateway can be configured to rewrite this public name to the private name expected by the Web server by setting the Web Server Host Name option to data.com. Before the Access Gateway forwards packets to the Web server, the Host field is changed (rewritten) from novell.com to data.com. For information about configuring this option, see Configuring the Web Servers of a Proxy Service.
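
The conditions above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Access Gateway's implementation. The function names and the regular expression are assumptions, the host names come from the examples in this section, and port handling is left out.

```python
import re


def rewrite_response_body(body, names_to_rewrite, published_name, published_scheme):
    """Rewrite absolute URL references to the published scheme and DNS name."""
    names = "|".join(re.escape(name) for name in names_to_rewrite)
    # Match http:// or https:// followed by one of the names. The lookahead
    # keeps a name from matching when it is only the prefix of a longer host.
    pattern = re.compile(r"https?://(?:" + names + r")(?![A-Za-z0-9.-])", re.IGNORECASE)
    return pattern.sub(lambda m: published_scheme + "://" + published_name, body)


def rewrite_request_host(headers, published_name, web_server_host_name):
    """Return a copy of the request headers with the Host value rewritten."""
    rewritten = dict(headers)
    if rewritten.get("Host", "").lower() == published_name.lower():
        rewritten["Host"] = web_server_host_name
    return rewritten


# Response direction: private name and HTTP scheme become published name and HTTPS.
page = '<img src="http://data.com/imagel.jpg">'
print(rewrite_response_body(page, ["data.com", "novell.com"], "novell.com", "https"))

# Request direction: the published name in the Host header becomes the private name.
print(rewrite_request_host({"Host": "novell.com"}, "novell.com", "data.com"))
```

Including the published name in the list of names to rewrite lets the same function correct the scheme of references such as http://novell.com/path/image1.jpg.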

The rewriter searches for URLs in the following HTML contexts. They must meet the following criteria to be rewritten:

Context

Criteria

HTTP Headers

Qualified URL references occurring within certain types of HTTP response headers such as Location and Content-Location are rewritten. The Location header is used to redirect the browser to where the resource can be found. The Content-Location header is used to provide an alternate location where the resource can be found.

JavaScript

Within JavaScript, absolute references are always evaluated for rewriting. Relative references (such as index.html) are not rewritten. Absolute paths (such as /docs/file.html) are evaluated if the page is read from a path-based multi-homing Web server and the reference follows an HTML tag. For example, the string href='/docs/file.html' is rewritten if /docs is a multi-homing path that has been configured to be stripped.

HTML Tags

URL references occurring within the following HTML tag attributes are evaluated for rewriting:

action          archive           background
base            borderimage       cite           
code            codebase          data 
dynsrc          href              longdesc
lowsrc          onclick           pluginspage
src             usemap

References

An absolute reference is a reference that has all the information needed to locate a resource, including the hostname, such as http://internal.web.site.com/index.html. The rewriter always attempts to rewrite absolute references.

The rewriter attempts to rewrite an absolute path when it is the multi-homing path of a path-based multi-homing service. For example, /docs/file1.html is rewritten if /docs is a multi-homing path that has been configured to be stripped.

Relative references are not rewritten.

Query Strings

URL references contained within query strings can be configured for rewriting on path-based multi-homing proxy services.

Post Data

URL references specified in Post Data can be configured for rewriting on path-based multi-homing proxy services.
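
The distinction between absolute references, absolute paths, and relative references can be illustrated with a short Python sketch. The function names are hypothetical, and the rule for absolute paths is a simplified reading of the criteria above (the path must start with a multi-homing path that is configured to be stripped).

```python
from urllib.parse import urlsplit


def classify_reference(ref):
    """Return 'absolute', 'absolute path', or 'relative' for a URL reference."""
    parts = urlsplit(ref)
    if parts.scheme and parts.netloc:
        return "absolute"          # has a scheme and a host name
    if ref.startswith("/"):
        return "absolute path"     # starts at the root of the Web server
    return "relative"


def is_evaluated_for_rewriting(ref, stripped_multi_homing_paths):
    """Apply the simplified criteria: absolute always, absolute path sometimes."""
    kind = classify_reference(ref)
    if kind == "absolute":
        return True
    if kind == "absolute path":
        return any(ref == path or ref.startswith(path + "/")
                   for path in stripped_multi_homing_paths)
    return False                   # relative references are not rewritten
```

For example, with /docs configured as a stripped multi-homing path, /docs/file1.html is evaluated, but index.html and /other/file.html are not.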

13.5.2 Specifying the DNS Names to Rewrite

The rewriter parses and searches the Web content that passes through the Access Gateway for URL references that qualify to be rewritten. URL references are rewritten when they meet the following conditions:

  • URL references containing DNS names or IP addresses matching those in the Web server address list are rewritten with the Published DNS Name.

  • URL references matching the Web Server Host Name are rewritten with the Published DNS Name.

  • URL references matching entries in the Additional DNS Name List of the host are rewritten with the Published DNS Name. The Web Server Host Name does not need to be included in this list.

  • The DNS names in the Exclude DNS Name List specify the names that the rewriter should skip and not rewrite.
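
As a rough sketch, the conditions above amount to a membership test against the configured lists. The function and parameter names are hypothetical, and checking the Exclude DNS Name List first is an assumption about precedence, not documented behavior.

```python
def should_rewrite(host, web_server_addresses, web_server_host_name,
                   additional_dns_names, exclude_dns_names):
    """Decide whether a host found in a URL reference qualifies for rewriting."""
    host = host.lower()
    # Assumption: an excluded name is skipped even if another list contains it.
    if host in {name.lower() for name in exclude_dns_names}:
        return False
    candidates = {name.lower() for name in web_server_addresses}
    candidates.add(web_server_host_name.lower())
    candidates.update(name.lower() for name in additional_dns_names)
    return host in candidates
```

A reference that qualifies is then rewritten with the Published DNS Name.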

The following sections describe the conditions to consider when adding DNS names to the lists:

Determining Whether You Need to Specify Additional DNS Names

Sometimes Web pages contain URL references to a host name that does not meet the default criteria for being rewritten. That is, the URL reference does not match the Web Server Host Name or any value (IP address) in the Web Server List. If these names are sent back to the client, they are not resolvable. Figure 13-4 illustrates a scenario that requires an entry in the Additional DNS Name List.

Figure 13-4 Rewriting URLs for Web Servers

The page on the data.com Web server contains two links, one to an image on the data.com server and one to an image on the graphics.com server. When rewriting is enabled, the link to the data.com server is automatically rewritten to novell.com. The link to the image on graphics.com is not rewritten until you add graphics.com to the Additional DNS Name List. When the link is rewritten, the browser knows how to request it, and the Access Gateway knows how to resolve it.

You need to include names in this list if your Web servers have the following configurations:

  • If you have a cluster of Web servers that are not sharing the same DNS name, you need to add their DNS names to this list.

  • If your Web server obtains content from another Web server, the DNS name of this additional Web server needs to be rewritten.

  • If the Web server listens on one port (for example, 80) and redirects requests to a secure port (for example, 443), the response to the user comes back on https://<DNS_name>:443, which does not match the request that was sent to http://<DNS_name>:80. If you add the DNS name to the list, the response can be sent in the format that the user expects.

  • If an application is written to use a private host name, you need to add that host name to the list. For example, if an application URL reference contains the host name home (http://home/index.html), add home to the Additional DNS Name List.

  • If you enable the Forward Received Host Name option on your path-based multi-homing service and your Web server is configured to use a different port, you need to add the DNS name with the port to the Additional DNS Name List.

    For example, if the public DNS name of the proxy service is www.mylag.com, the path for the path-based multi-homing service is /sales, and the Web server port is 801, the following DNS name needs to be added to the Additional DNS Name List of the /sales service:

    http://www.mylag.com:801
    

When you enter a name in the list, it can use any of the following formats:

DNS_name
host_name
IP_address
scheme://DNS_name
scheme://IP_address
scheme://DNS_name:port
scheme://IP_address:port

For example:

HOME
https://www.backend.com
https://10.10.15.206:444

These entries are not case sensitive.
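
The entry formats can be illustrated with a small parser. This is a sketch under stated assumptions: the function names are hypothetical, and the matching rule (a part that the entry omits matches anything) is an assumption made for the example.

```python
from urllib.parse import urlsplit


def parse_entry(entry):
    """Split a list entry into (scheme, host, port); parts not given are None."""
    text = entry.strip()
    if "://" in text:
        parts = urlsplit(text)
        # urlsplit lowercases the scheme, and hostname is returned in lowercase.
        return (parts.scheme, parts.hostname, parts.port)
    return (None, text.lower(), None)


def entry_matches(entry, scheme, host, port):
    """Compare a URL reference's scheme, host, and port with a list entry."""
    entry_scheme, entry_host, entry_port = parse_entry(entry)
    if entry_host != host.lower():
        return False
    if entry_scheme is not None and entry_scheme != scheme.lower():
        return False
    if entry_port is not None and entry_port != port:
        return False
    return True
```

With this sketch, the entry HOME parses to the host home with no scheme or port, and the entry http://www.mylag.com:801 matches only references that use port 801.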

Determining Whether You Need to Exclude DNS Names from Being Rewritten

If you have two reverse proxies protecting the same Web server, the rewriter correctly rewrites the references to the Web server so that the browser always uses the same reverse proxy. In other words, if the browser requests a resource using acme.com.uk, the response is returned with references to acme.com.uk and not acme.com.usa. If you have a third reverse proxy protecting a Web server, the rewriting rules can become ambiguous. For example, consider the configuration illustrated in Figure 13-5.

Figure 13-5 Excluding URLs

A user accesses data.com through the published DNS name of novell.com.mx. The data.com server has references to product.com. The novell.com.mx proxy has two ways to get to the product.com server because this Web server has two published DNS names (novell.com.uk and novell.com.usa). The rewriter could use either of these names to rewrite references to product.com.

  • If you want all users coming through novell.com.mx to use the novell.com.usa proxy, you need to block the rewriting of product.com to novell.com.uk. On the HTML Rewriting page of the reverse proxy for novell.com.uk, add product.com and any aliases to the Exclude DNS Name List.

  • If you do not care which proxy is returned in the reference, you do not need to add anything to the Exclude DNS Name List.
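
The effect of the exclusion can be modeled with a few lines of Python. The data structure below is a hypothetical model of the scenario in Figure 13-5, not an Access Gateway configuration format.

```python
# Hypothetical model: each reverse proxy publishes one DNS name for the Web
# server it protects and keeps its own Exclude DNS Name List.
REVERSE_PROXIES = [
    {"published": "novell.com.uk", "web_server": "product.com", "exclude": {"product.com"}},
    {"published": "novell.com.usa", "web_server": "product.com", "exclude": set()},
]


def candidate_published_names(web_server_name, proxies):
    """List the published names that may replace references to a Web server."""
    return [proxy["published"] for proxy in proxies
            if proxy["web_server"] == web_server_name
            and web_server_name not in proxy["exclude"]]
```

With product.com excluded on the novell.com.uk proxy, only novell.com.usa remains as a candidate, so the ambiguity is removed.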

13.5.3 Defining the Requirements for the Rewriter Profile

An HTML rewriter profile allows you to customize the rewriting process and specify which profile is selected to rewrite content on a page. This section describes the following features of the rewriter profile:

Types of Rewriter Profiles

The Access Gateway allows you to define two types of profiles:

Word Profile

A Word profile searches for matches on words. For example, “get” matches the word “get” and any word that begins with “get” such as “getaway” but it does not match the “get” in “together” or “beget.”

The Access Gateway has a default Word profile. It is not specific to a reverse proxy or its proxy services. When you modify its behavior, remember its scope.

If you enable HTML rewriting, but do not define a Word profile for the proxy service, the default Word profile is used. This profile is preconfigured to rewrite the Web Server Host Name and any other names listed in the Additional DNS Name List. The preconfigured profile matches all URLs with the following content-types:

text/html

text/javascript

text/xml

application/javascript

text/css

application/x-javascript

If this default behavior does not match your requirements for a particular page, create your own Word profile and position it before the default profile in the list of profiles. Only one Word profile is applied per page. The first Word profile that matches the page is applied. Profiles lower in the list are ignored.

For information about how strings are replaced in a Word profile, see String Replacement Rules for Word Profiles.

Character Profile

A Character profile searches for matches on a specified set of characters. For example, “top” matches the word “top” and the “top” in “tabletop,” “stopwatch,” and “topic.”

If you need functionality not provided by the default profile, create a Character profile. If you create multiple Character profiles, order is important. The first Character profile that matches the page is applied. Profiles lower in the list are ignored.

For information on how strings are replaced in a Character profile, see String Replacement Rules for Character Profiles.
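
The difference between the two search boundaries can be approximated with a regular expression. This is a minimal sketch for the plain-word examples above; the function names are hypothetical, and the real rewriter's boundary rules (for example, for search strings that start with a slash) are more involved.

```python
import re


def word_profile_matches(search, text):
    """True if a word in the text is, or begins with, the search string."""
    return re.search(r"\b" + re.escape(search), text) is not None


def character_profile_matches(search, text):
    """True if the search string occurs anywhere in the text."""
    return search in text


for word in ("get", "getaway", "together", "beget"):
    print(word, word_profile_matches("get", word))

for word in ("top", "tabletop", "stopwatch", "topic"):
    print(word, character_profile_matches("top", word))
```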

Page Matching Criteria for Rewriter Profiles

You specify the following matching criteria for selecting the profile:

  • The URLs to match

  • The URLs that cannot match

  • The content types to match

You use the Requested URLs to Search section of the profile to set up the matching policy.

URLs: The URLs specified in the policy should use the following formats:

Sample URL                             Description
http://www.a.com/content               Matches pages only if the request URL does not contain a trailing slash.
http://www.a.com/content/              Matches pages only if the request URL does contain a trailing slash.
http://www.a.com/content/index.html    Matches only this specific file.
http://www.a.com/content/*             Matches the request URL whether or not it has a trailing slash and matches all files in the directory.
http://www.a.com/*                     Matches the proxy service and everything it is protecting.

You can specify two types of URLs. In the If Requested URL Is list, you specify the URLs of the pages you want this profile to match. In the And Requested URL Is Not list, you specify the URLs you don’t want this profile to match. You can use the asterisk wildcard for a URL in the If Requested URL Is list that matches pages you really don’t want this profile to match, then use a URL in the And Requested URL Is Not list to exclude them from matching. If a page matches both a URL in the If Requested URL Is list and in the And Requested URL Is Not list, the profile does not match the page.

For example, you could specify the following URL in the If Requested URL Is list:

http://www.a.com/*

You could then specify the following URL in the And Requested URL Is Not list:

http://www.a.com/content/*

These two entries cause the profile to match all pages on the www.a.com Web server except for the pages in the /content directory and its subdirectories.

IMPORTANT: If nothing is specified in either of the two lists, the profile skips the URL matching requirements and uses the content-type to determine if a page matches.

Content-Type: In the And Document Content-Type Is section, you specify the content-types you want this profile to match. To add a new content-type, click New and specify the name such as text/dns. Search your Web pages for content-types to determine if you need to add new types. To add multiple values, enter each value on a separate line.

Regardless of content-type, the page matches if the file extension is html, htm, shtml, jhtml, asp, or jsp.
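
The matching criteria can be summarized in a short Python sketch. The function names are hypothetical, and the way the three criteria are combined (URL lists first, then file extension or content-type) is an assumption based on the descriptions above.

```python
import posixpath
from urllib.parse import urlsplit

# File extensions that match regardless of content-type, as listed above.
ALWAYS_MATCHED_EXTENSIONS = {"html", "htm", "shtml", "jhtml", "asp", "jsp"}


def url_matches(pattern, url):
    """Match a request URL against a pattern that may end in the /* wildcard."""
    if pattern.endswith("/*"):
        base = pattern[:-2]
        # Matches the directory with or without a trailing slash, and
        # everything below it.
        return url == base or url.startswith(base + "/")
    return url == pattern


def profile_matches(url, content_type, include, exclude, content_types):
    """Check a page against the URL lists and the content-types of a profile."""
    if include and not any(url_matches(p, url) for p in include):
        return False
    if any(url_matches(p, url) for p in exclude):
        return False
    extension = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()
    media_type = content_type.split(";")[0].strip().lower()
    return extension in ALWAYS_MATCHED_EXTENSIONS or media_type in content_types
```

With http://www.a.com/* in the include list and http://www.a.com/content/* in the exclude list, this sketch matches http://www.a.com/index.html but not http://www.a.com/content/page.html.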

Possible Actions for Rewriter Profiles

The rewriter action section of the profile determines the actions the rewriter performs when a page matches the profile. Select from the following:

Strip Path Actions: A profile might require the strip path options if the proxy service has the following characteristics:

  • It is a path-based multi-homing proxy.

  • The Remove Path on Fill option has been enabled.

  • URLs appear in query strings or Post Data.

If your profile needs to match pages from this type of proxy server, you might need to enable the Strip Path from Query String and Strip Path from Post Data options.

The strip path options are not available for a Character profile. If the proxy service is not a path-based multi-homing proxy, the strip path options have no effect.

Enabling or Disabling Rewriting: The Enable Rewriter Actions option determines whether the rewriter performs any actions:

  • Select the option to have the rewriter rewrite the references and data on the page.

  • Leave the option unselected to disable rewriting. This allows you to create a profile for the pages you do not want rewritten.

Replacing URLs in JavaScript Variables and HTML Attributes: The Variable and Attribute Name list allows you to specify the HTML attributes or JavaScript variables that you want searched for DNS names that might need to be rewritten. For the list of HTML attribute names that are automatically searched, see HTML Tags. You might want to add the following attributes:

  • value: This attribute enables the rewriter to search the <param> elements on the HTML page for value attributes and rewrite the value attributes that are URL strings.

    If you need more granular control (some need to be rewritten but others do not) and you can modify the page, see Disabling with Page Modifications.

  • formvalue: This attribute enables the rewriter to search the <form> element on the HTML page for <input>, <button>, and <option> elements and rewrite the value attributes that are URL strings. For example, if your multi-homing path is /test and the form line is <input name="navUrl" type="hidden" value="/IDM/portal/cn/GuestContainerPage/656gwmail">, this line would be rewritten to the following value before sending the response to the client:

    <input name="navUrl" type="hidden" value="/test/IDM/portal/cn/GuestContainerPage/656gwmail">
    

    The formvalue attribute enables the rewriting of all URLs in the <input>, <button>, and <option> elements in the form. If you need more granular control (some need to be rewritten but others do not) and you can modify the form page, see Disabling with Page Modifications.

This option is not available for a Character profile.
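
The effect of the formvalue attribute on the example above can be imitated with regular expressions. This is only a sketch: the function name is hypothetical, it handles double-quoted values only, and it does not check that the elements are inside a <form> element.

```python
import re

# Start tags of the elements whose value attributes are searched.
FORM_ELEMENT = re.compile(r"<(?:input|button|option)\b[^>]*>", re.IGNORECASE)
# A double-quoted value attribute whose value starts with a slash.
VALUE_ATTRIBUTE = re.compile(r'(\bvalue\s*=\s*")(/[^"]*)(")', re.IGNORECASE)


def add_path_to_form_values(html, multi_homing_path):
    """Prepend the multi-homing path to URL paths in value attributes."""

    def rewrite_value(match):
        prefix, value, suffix = match.groups()
        # Leave values alone that already carry the multi-homing path.
        if value == multi_homing_path or value.startswith(multi_homing_path + "/"):
            return match.group(0)
        return prefix + multi_homing_path + value + suffix

    def rewrite_element(match):
        return VALUE_ATTRIBUTE.sub(rewrite_value, match.group(0))

    return FORM_ELEMENT.sub(rewrite_element, html)


line = '<input name="navUrl" type="hidden" value="/IDM/portal/cn/GuestContainerPage/656gwmail">'
print(add_path_to_form_values(line, "/test"))
```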

Replacing URLs in JavaScript Methods: The And JavaScript Method to Search for Is list allows you to specify the JavaScript methods to search to see if their parameters contain a URL string.

This option is not available for a Character profile.

String Replacement: The Additional Strings to Replace list allows you to search for a string and replace it.

When defining a rewriter profile, you should try to put all the string actions into the Word profile. When a Word profile and a Character profile both match the same URL, you need to ensure that they do not contain search and replace actions for overlapping strings. The results of such actions are unpredictable.

For example, if your Word profile has an action to search for Doodle and replace this string with Artwork and your Character profile has an action to search for Doo and replace this string with Zoo, the results are unpredictable. If you place both search and replace actions in the Word profile, the results are predictable.

For the rules and tokens that can be used in the search strings, see String Replacement Rules for Word Profiles and String Replacement Rules for Character Profiles.

For information on how the Additional Strings to Replace list can be used to reduce the number of JavaScript methods you need to list, see Using $path to Rewrite Paths in JavaScript Methods, Parameters, or Variables.

String Replacement Rules for Word Profiles

In a Word profile, a string matches all paths that start with the characters in the specified string. For example:

Search String     Matches This String     Doesn’t Match This String
/path             /path                   /mypath
                  /pathother
                  /path/other
                  /path.html

You can use the following special tokens to modify the default matching rules:

  • [w] to match one white space character

  • [ow] to match 0 or more white space characters

  • [ep] to match a path element in a URL path, excluding words that end in a period

  • [ew] to match a word element in a URL path, including words that end in a period

  • [oa] to match one or more alphanumeric characters

White Space Tokens: You use the [w] and the [ow] tokens to specify where white space might occur in the string. For example:

[ow]my[w]string[w]to[w]replace[ow]

If you don’t know, or don’t care, whether the string has zero or more white space characters at the beginning and at the end, use [ow] to specify this. The [w] token specifies exactly one white space character.

Path Tokens: You use the [ep] and [ew] tokens to match path strings. The [ep] token can be used to match the following types of paths:

Search String     Matches This String     Doesn’t Match This String
/path[ep]         /path                   /path.html
                  /home/path/other        /home/pathother

The [ew] token can be used to match the following types of paths:

Search String     Matches This String     Doesn’t Match This String
/path[ew]         /path.html              /paths
                  /home/path

Name Tokens: You use the [oa] token to match function or parameter names that have a set string to start the name and end the name, but the middle part of the name is a computer-generated alphanumeric string. For example, the [oa] token can be used to match the following types of names:

Search String          Matches This String        Doesn’t Match This String
javaFunction-[oa](     javaFunction-1234a56()     javaFunction()
                       javaFunction-a()
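
One way to picture the white space and name tokens is as a translation into a regular expression. The sketch below covers only [w], [ow], and [oa]. The function name and the regular expression chosen for each token are assumptions made for illustration, and the path tokens [ep] and [ew] are left out because their boundary rules are not fully specified here.

```python
import re

# Assumed regular-expression equivalents of the tokens.
TOKEN_PATTERNS = {
    "[w]": r"\s",             # exactly one white space character
    "[ow]": r"\s*",           # zero or more white space characters
    "[oa]": r"[A-Za-z0-9]+",  # one or more alphanumeric characters
}
TOKEN_SPLIT = re.compile(r"(\[w\]|\[ow\]|\[oa\])")


def compile_search_string(search):
    """Translate a search string with tokens into a compiled regular expression."""
    parts = TOKEN_SPLIT.split(search)
    # Tokens become their pattern; everything else is matched literally.
    regex = "".join(TOKEN_PATTERNS[part] if part in TOKEN_PATTERNS else re.escape(part)
                    for part in parts)
    return re.compile(regex)


names = compile_search_string("javaFunction-[oa](")
for candidate in ("javaFunction-1234a56()", "javaFunction-a()", "javaFunction()"):
    print(candidate, names.match(candidate) is not None)
```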

String Replacement Rules for Character Profiles

When you configure multiple strings for replacement, the rewriter uses the following rules for determining how characters are replaced in strings:

  • String replacement is done as a single pass.

  • String replacement is not performed recursively. Suppose you have listed the following search and replacement strings:

    DOG     to be replaced with     CAT
    A       to be replaced with     O
    

    All occurrences of the string DOG are replaced with CAT, regardless of whether it is the word DOG or the word DOGMA. Only one replacement pass occurs. The rewritten CAT is not replaced with COT.

  • Because string replacement is done in one pass, the string that matches first takes precedence. Suppose you have listed the following search and replacement strings:

    ABC       to be replaced with     XYZ
    BCDEF     to be replaced with     PQRSTUVWXYZ
    

    If the original string is ABCDEFGH, the replaced string is XYZDEFGH.

  • If two specified search strings match the data portion, the search string of longer length is used for the replacement except for the case detailed above. Suppose you have listed the following search and replacement strings:

    ABC        to be replaced with     XYZ
    ABCDEF     to be replaced with     PQRSTUVWXYZ
    

    If the original string is ABCDEFGH, the replaced string is PQRSTUVWXYZGH.
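
These rules can be reproduced with a simple left-to-right scan. The function below is a sketch with a hypothetical name, not the rewriter's actual code. At each position it picks the longest search string that matches there, emits its replacement, and continues after the matched text, so replaced text is never scanned again.

```python
def replace_single_pass(text, rules):
    """Apply search/replace rules in one left-to-right pass over the text."""
    output = []
    position = 0
    while position < len(text):
        # Search strings that match at the current position.
        candidates = [search for search in rules
                      if search and text.startswith(search, position)]
        if candidates:
            best = max(candidates, key=len)   # the longer match wins
            output.append(rules[best])
            position += len(best)             # skip the text that was replaced
        else:
            output.append(text[position])
            position += 1
    return "".join(output)


print(replace_single_pass("ABCDEFGH", {"ABC": "XYZ", "BCDEF": "PQRSTUVWXYZ"}))
print(replace_single_pass("ABCDEFGH", {"ABC": "XYZ", "ABCDEF": "PQRSTUVWXYZ"}))
```

In the first call, ABC matches at the first character, before BCDEF has a chance to match. In the second call, both search strings match at the same position, so the longer one is used.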

Using $path to Rewrite Paths in JavaScript Methods, Parameters, or Variables

You can use the $path token to rewrite paths on a path-based multi-homing service that has the Remove Path on Fill option enabled. This token is useful for Web applications that require a dedicated Web server and are therefore installed in the root directory of the Web server. If you protect this type of application with Access Manager using a path-based multi-homing proxy service, your clients access the application with a URL that contains a /path value. The proxy service uses the path to determine which Web server a request is sent to, and the path must be removed from the URL before sending the request to the Web server.

The application responds to the requests. If it uses JavaScript methods, parameters, or variables to generate paths to resources, these paths are sent to the client without prepending the path for the proxy service. When the client tries to access the resource specified by the Web server path, the proxy service cannot locate the resource because the multi-homing path is missing. The figure below illustrates this flow with the rewriter adding the multi-homing path in the reply.

Figure 13-6 Rewriting with a Multi-homing Path

To make sure all the paths generated by JavaScript are rewritten, you must search the Web pages of the application. You can then either list all the JavaScript methods, parameters, and variables in the Additional Names to Search for URL Strings to Rewrite with Host Name section of the rewriter profile, or you can use the $path token in the Additional Strings to Replace section. This token, which is a shortcut for the multi-homing path, together with the Strip Path from Query String and Strip Path from Post Data actions, usually can find all the paths that need to be rewritten. If nothing else, it reduces the number of JavaScript methods, parameters, and variables that you otherwise need to list individually.

To use the $path token, you add a search string and a replace string that uses the token. For example, if the /prices/pricelist.html page is generated by JavaScript and the multi-homing path for the proxy service is /inner, you would specify the following strings:

Table 13-1 Search and Replace Strings

Search String     Replacement String
/prices           $path/prices

This configuration allows the following paths to be rewritten.

Table 13-2 Rewriting Strings Sent from the Web Server to the Browser

Web Server String          Rewritten String for the Browser
/prices/pricelist.html     /inner/prices/pricelist.html
/prices                    /inner/prices

If the Strip Path from Query String or Strip Path from Post Data option is enabled, the search and replace strings allow the following paths to be rewritten.

Table 13-3 Rewriting Strings Sent from the Browser to the Web Server

Browser String                   Rewritten String for the Web Server
/inner/prices/pricelist.html     /prices/pricelist.html
/inner/prices                    /prices
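
The two directions of the $path example can be sketched as a pair of string substitutions. The function names are hypothetical, and plain substring replacement stands in for the Word profile matching rules, so this sketch would add the path a second time to a string that already contains /inner/prices.

```python
def expand_path_token(replacement, multi_homing_path):
    """Replace the $path token with the multi-homing path of the proxy service."""
    return replacement.replace("$path", multi_homing_path)


def rewrite_for_browser(text, search, replacement, multi_homing_path):
    """Web server to browser: add the multi-homing path."""
    return text.replace(search, expand_path_token(replacement, multi_homing_path))


def rewrite_for_web_server(text, search, replacement, multi_homing_path):
    """Browser to Web server: strip the multi-homing path again."""
    return text.replace(expand_path_token(replacement, multi_homing_path), search)


print(rewrite_for_browser("/prices/pricelist.html", "/prices", "$path/prices", "/inner"))
print(rewrite_for_web_server("/inner/prices/pricelist.html", "/prices", "$path/prices", "/inner"))
```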

13.5.4 Configuring the HTML Rewriter and Profile

You configure the HTML rewriter for a proxy service, and these values are applied to all Web servers that are protected by this proxy service.

To configure the HTML rewriter:

  1. In the Administration Console, click Access Manager > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

    The HTML Rewriting page specifies which DNS names are to be rewritten. The HTML Rewriter Profile specifies which pages to search for DNS names that need to be rewritten.

  2. Select Enable HTML Rewriting.

    This option is enabled by default. When it is disabled, no rewriting occurs. When enabled, this option activates the internal HTML rewriter. This rewriter replaces the name of the Web server with the published DNS name when sending data to the browsers. It replaces the published DNS name with the Web Server Host Name when sending data to the Web server. It also makes sure the proper scheme (HTTP or HTTPS) is included in the URL. This is needed because you can configure the Access Gateway to use HTTPS between itself and client browsers and to use HTTP between itself and the Web servers.

  3. In the Additional DNS Name List section, click New, specify a DNS name that appears on the Web pages of your server (for example, a DNS name other than the Web server’s DNS name), then click OK.

    For more information, see Determining Whether You Need to Specify Additional DNS Names.

  4. In the Exclude DNS Name List section, click New, specify a DNS name that appears on the Web pages of your server that you do not want rewritten, then click OK.

    For more information, see Determining Whether You Need to Exclude DNS Names from Being Rewritten.

  5. Use the HTML Rewriter Profile List to configure a profile. Select one of the following actions:

    • New: To create a profile, click New. Specify a display name for the profile and select either Word or Character for the Search Boundary. Continue with Step 6.

      • Word: A Word profile searches for matches on words. For example, “get” matches the word “get” and any word that begins with “get” such as “getaway” but it does not match the “get” in “together” or “beget.”

        If you create multiple Word profiles, order is important. The first Word profile that matches the page is executed. Profiles lower in the list are ignored.

      • Character: A Character profile searches for matches on a specified set of characters. For example, “top” matches the word “top” and the “top” in “tabletop,” “stopwatch,” and “topic.”

        If you want to add functionality to the default profile, create a Character profile. It has all the functionality of a Word profile, except searching for attribute names and JavaScript variables and methods. If you create multiple Character profiles, order is important. The first Character profile that matches the page is executed. Profiles lower in the list are ignored.

    • Delete: To delete a profile, select the profile, then click Delete. Continue with Step 13.

    • Enable: To enable a profile, select the profile, then click Enable. Continue with Step 13.

    • Disable: To disable a profile, select the profile, then click Disable. Continue with Step 13.

    • Modify: To view or modify the current configuration for a profile, click the name of the profile. Continue with Step 6.

      The default profile is designed to be applied to all pages protected by the Access Gateway. It is not specific to a reverse proxy or its proxy services. If you modify its behavior, remember its scope. Rather than modify the default profile, you should create your own customized Word profile and enable it.

  6. Use the Requested URLs to Search section to set up a policy for specifying the URLs you want this profile to match.

    Fill in the following fields:

    If Requested URL Is: Specify the URLs of the pages you want this profile to match. Click New to add a URL to the text box. To add multiple values, enter each value on a separate line.

    And Requested URL Is Not: Specify the URLs of pages that this profile should not match. If a page matches a URL in both the If Requested URL Is list and the And Requested URL Is Not list, the profile does not match the page. Click New to add a URL to the text box. To add multiple values, enter each value on a separate line.

    And Document Content-Type Is: Select the content-types you want this profile to match. To add a new content-type, click New and specify the name such as text/dns. Search your Web pages for content-types to determine if you need to add new types. To add multiple values, enter each value on a separate line.

    For more information on how to use these options, see Page Matching Criteria for Rewriter Profiles.

  7. Use the Actions section to specify the actions the rewriter should perform if the page matches the criteria in the Requested URLs to Search section.

    Configure the following actions:

    Strip Path from Query String: (Not available for Character profiles) Select this option to remove the path from the query string. To use this option, your proxy service must meet the conditions listed in Possible Actions for Rewriter Profiles.

    Strip Path from Post Data: (Not available for Character profiles) Select this option to remove the path from the Post Data command. To use this option, your proxy service must meet the conditions listed in Possible Actions for Rewriter Profiles.

    Enable Rewriter Actions: Determines whether the rewriter performs any actions:

    • Select it to have the rewriter use the profile to rewrite references and data on the page. If this option is not selected, you cannot configure the action options.

    • Leave it unselected to disable rewriting. This allows you to create a profile for the pages you do not want rewritten.

  8. (Not available for Character profiles) If your pages contain JavaScript, use the Additional Names to Search for URL Strings to Rewrite with Host Name section to specify JavaScript variables or methods. You can also add HTML attribute names. (For the list of attribute names that are automatically searched, see HTML Tags.)

    Fill in the following fields:

    Variable or Attribute Name to Search for Is: Lists the name of an HTML attribute or JavaScript variable to search to see if its value contains a URL string. Click New to add a name to the text box. To add multiple values, enter each value on a separate line.

    JavaScript Method to Search for Is: Lists the names of JavaScript methods to search to see if their parameters contain a URL string. Click New to add a method to the text box. To add multiple values, enter each value on a separate line.

  9. Use the Additional Strings to Replace section to specify a string to search for and specify the text it should be replaced with. The search boundary (word or character) that you specified when creating the profile is used when searching for the string.

    To add a string, click New, then fill in the following:

    Search: Specify the string you want to search for. The profile type controls the matching and replacement rules. For more information, see one of the following:

    Replace With: Specify the string you want to use in place of the search string.

  10. Click OK.

  11. If you have more than one profile in the HTML Rewriter Profile List, use the up-arrow and down-arrow buttons to order the profiles.

    If you create more than one profile, order becomes important. For example, if you want to rewrite all pages with a general rewriter profile (with a URL such as /*) and one specific set of pages with another rewriter profile (with a URL such as /doc/100506/*), list the specific rewriter profile before the general rewriter profile.

    Even if multiple Word or Character profiles are enabled, only a maximum of one Word and one Character profile is executed per page. The first one in the list that matches a page is executed, and the others are ignored.

  12. Enable the profiles you want to use for this protected resource. Select the profile, then click Enable.

    The default profile cannot be disabled. However, it is not executed if you have enabled another Word profile that matches your pages, and this profile comes before the default profile in the list.

  13. To save your changes to browser cache, click OK.

  14. To apply your changes, click the Access Gateways link, then click Update > OK.

  15. The cached pages affected by the rewriter changes must be updated on the Access Gateway. Do one of the following:

    • If the changes affect numerous pages, click Access Gateways, select the name of the server, then click Actions > Purge All Cache.

    • If the changes affect only a few pages, you can update them from a browser. Access the page, then press Ctrl+Shift+Refresh to force a refresh of the page.
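The matching and ordering rules in the steps above (a page matches a profile only if its URL is in the If Requested URL Is list and not in the And Requested URL Is Not list, and the first matching Word profile and first matching Character profile in the list are the ones executed) can be modeled with a short Python sketch. This is an illustration only, not the Access Gateway's implementation. The Profile class, the select_profiles function, and the profile names are hypothetical, content-type matching is left out, and Python's fnmatch wildcards are assumed to approximate the URL patterns.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Profile:
    """Hypothetical stand-in for one entry in the HTML Rewriter Profile List."""
    name: str
    kind: str                                     # "word" or "character"
    include: list                                 # If Requested URL Is
    exclude: list = field(default_factory=list)   # And Requested URL Is Not
    enabled: bool = True

    def matches(self, path):
        # A disabled profile never matches.
        if not self.enabled:
            return False
        included = any(fnmatchcase(path, p) for p in self.include)
        excluded = any(fnmatchcase(path, p) for p in self.exclude)
        # A URL found in both lists does not match the profile.
        return included and not excluded


def select_profiles(profiles, path):
    """Return the first matching profile of each kind, in list order."""
    chosen = {}
    for profile in profiles:
        if profile.kind not in chosen and profile.matches(path):
            chosen[profile.kind] = profile
    return chosen


# The specific profile is listed before the general one.
profiles = [
    Profile("docs", "word", ["/doc/100506/*"], exclude=["/doc/100506/raw/*"]),
    Profile("general", "word", ["/*"]),
    Profile("chars", "character", ["/*"]),
]
```

With this list, a request for /doc/100506/a.html selects the docs Word profile, while /doc/100506/raw/x falls through to the general Word profile because the specific profile excludes it. If the general profile were listed first, it would match every page and the specific profile would never be selected.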

13.5.5 Disabling the Rewriter

There are three methods you can use to disable the internal rewriter:

Disabling per Proxy Service

By default, the rewriter is enabled for all proxy services. The rewriter can slow performance because of the parsing overhead. In some cases, a Web site might not have content with URL references that need to be rewritten. The rewriter can be disabled on the proxy service that protects that Web site.

  1. In the Administration Console, click Access Manager > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

  2. Deselect the Enable HTML Rewriting option, then click OK.

  3. To apply your changes, click the Access Gateways link, then click Update > OK.

  4. Select the Access Gateway, then click Actions > Purge All Cache > OK.

Disabling per URL

You can also specify a list of URLs that are to be excluded from being rewritten for the selected proxy service.

  1. In the Administration Console, click Access Manager > Access Gateways > Edit > [Name of Reverse Proxy] > [Name of Proxy Service] > HTML Rewriting.

  2. Click the name of the Word profile defined for this proxy service.

    If you have not defined a custom Word profile for the proxy service, you might want to create one. If you modify the default profile, those changes are applied to all proxy services.

  3. In the And Requested URL Is Not section, click New, then specify the names of the URLs you do not want rewritten.

    Specify each URL on a separate line.

  4. Click OK twice.

  5. In the HTML Rewriter Profile List, make sure the profile you have modified is enabled and at the top of the list, then click OK.

  6. To apply your changes, click the Access Gateways link, then click Update > OK.

  7. Select the Access Gateway, then click Actions > Purge All Cache > OK.

Disabling with Page Modifications

In some cases, the URLs in only part of a page, or in only some of its JavaScript or forms, should be rewritten, and the rest should be left unchanged. When this is the case, you might need to modify the content on the Web server. Although this deviates from the design behind Access Manager, you might encounter circumstances where it cannot be avoided.

You can add the following types of tags to the pages on the Web server: page tags, param tags, and form tags.

Browsers treat these tags as comments, so they do not show up on the screen (except possibly on older browser versions).

NOTE: If the pages you modify are cached on the Access Gateway, you need to purge the cache before the changes become effective.

Page Tags: In the case where you want only portions of a page rewritten, you can add the following tags to the page.

<!--NOVELL_REWRITER_OFF--> 
.
.
HTML data not to be rewritten
.
.
<!--NOVELL_REWRITER_ON-->

The last tag is optional. If you omit it, the rest of the page after the initial tag is not rewritten.
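The effect of the page tags can be illustrated with a small Python sketch that splits a page into regions and marks each one as rewritable or not. It is only a model of the behavior described above, under the assumption that the tags are matched exactly as written; the split_regions function is hypothetical and is not part of Access Manager.

```python
import re

OFF = "<!--NOVELL_REWRITER_OFF-->"
ON = "<!--NOVELL_REWRITER_ON-->"
# The capturing group makes re.split() return the tags along with the text.
_MARKER = re.compile("(" + re.escape(OFF) + "|" + re.escape(ON) + ")")


def split_regions(page):
    """Return (rewrite_enabled, text) pairs for the page, in document order."""
    regions = []
    enabled = True  # rewriting stays on until the first OFF tag
    for piece in _MARKER.split(page):
        if piece == OFF:
            enabled = False
        elif piece == ON:
            enabled = True
        elif piece:
            regions.append((enabled, piece))
    return regions
```

For a page with the OFF tag but no ON tag, everything after the OFF tag comes back with rewrite_enabled set to False, which mirrors the optional closing tag.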

Param Tags: Sometimes the JavaScript on the page contains <param> elements that contain a value attribute with a URL. You can enable global rewriting of this attribute by adding value to the list of variable and attribute names to search for. If you need more control because some URLs need to be rewritten but others cannot be rewritten, you can turn on and turn off the value rewriting by adding the following tags before and after the <param> element in the JavaScript.

<!--NOVELL_REWRITE_ATTRIBUTE_ON='value'-->
.
.
<param> elements to be rewritten
.
.
<!--NOVELL_REWRITE_ATTRIBUTE_OFF='value'-->
.
.
<param> elements that shouldn’t be rewritten

Form Tags: Some applications have forms in which the <input>, <button>, and <option> elements contain a value attribute with a URL. You can enable global rewriting of these attributes by adding formvalue to the list of variable and attribute names to search for. If you need more control because some URLs need to be rewritten but others cannot be rewritten, you can turn on and turn off the formvalue rewriting by adding the following tags before and after the <input>, <button>, and <option> elements in the form.

<!--NOVELL_REWRITE_ATTRIBUTE_ON='formvalue'-->
.
.
<input>, <button>, and <option> elements to be rewritten
.
.
<!--NOVELL_REWRITE_ATTRIBUTE_OFF='formvalue'-->
.
.
<input>, <button>, and <option> elements that shouldn’t be rewritten
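The param and form tags follow the same on/off pattern, keyed by the attribute name inside the tag. The Python sketch below models which parts of a page have rewriting enabled for one name, such as value or formvalue. It is an illustration under stated assumptions, not the rewriter's actual logic: the attribute_regions function and its initially_on parameter are hypothetical, and rewriting for the name is assumed to be off until an ON tag appears unless the name has been added to the list of names to search for.

```python
import re

# Matches both tag forms and captures the state and the attribute name.
_TAG = re.compile(r"<!--NOVELL_REWRITE_ATTRIBUTE_(ON|OFF)='([^']*)'-->")


def attribute_regions(page, name, initially_on=False):
    """Return (rewrite_enabled, text) pairs for one attribute name."""
    regions = []
    enabled = initially_on
    pos = 0
    for match in _TAG.finditer(page):
        if match.group(2) != name:
            continue  # tag for a different name; it stays in the text
        text = page[pos:match.start()]
        if text:
            regions.append((enabled, text))
        enabled = match.group(1) == "ON"
        pos = match.end()
    if page[pos:]:
        regions.append((enabled, page[pos:]))
    return regions
```

Passing initially_on=True models the case where the name was added globally and the tags are used only to switch rewriting off for selected elements.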