novell-oes-pure-ftpd - mget or mdelete fail on NSS mount if too many matching files are in a single directory

  • 7016833
  • 14-Sep-2015
  • 29-Jun-2017

Environment

Novell Open Enterprise Server 11 (OES 11) Linux Support Pack 2
Novell Open Enterprise Server 2015 (OES 2015)

Situation

An OES FTP server provides FTP users with access to remote NSS volumes in the eDir tree.  An FTP user issues an mget or mdelete command against a remote volume.  If the remote directory contains many files that match the file specification, the operation may time out after 17 seconds, at which point the FTP session ends abruptly.  The number of files needed to trigger this varies with the network speed between the OES FTP server and the remote OES server where the NSS volume resides.  In severe cases, this has been reported with as few as 600 matching files.
 
This can also happen with a simple directory list if wildcards are included (ls *, ls name*, dir *, etc.).
 
When the 17 seconds elapse, the client will typically see the following message from the server:
421 Timeout
 
Both the FTP control connection and the data connection are then closed.

At the FTP server, in the messages file, a message in the following form will be shown:
pure-ftpd:  (user@address) [pidNumber] [INFO] Timeout
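
To check whether this timeout has been logged, the messages file can be scanned for lines of the form shown above.  The following Python sketch is one way to do that.  The regular expression is inferred from the message form in this document rather than from the pure-ftpd source, and the sample lines (host name, user, address, PID, and the "Logout." text) are hypothetical, so the pattern may need adjusting to match real log output.

```python
import re

# Pattern inferred from the message form shown above; this is an assumption,
# not something taken from the pure-ftpd source code.
TIMEOUT_RE = re.compile(
    r"pure-ftpd:\s+\((?P<user>[^@)]*)@(?P<address>[^)]*)\)\s+"
    r"\[(?P<pid>[^\]]+)\]\s+\[INFO\]\s+Timeout\s*$"
)

def find_glob_timeouts(lines):
    """Return (user, address, pid) for each line matching the Timeout form."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            hits.append((m.group("user"), m.group("address"), m.group("pid")))
    return hits

# Hypothetical sample lines, for illustration only.
sample = [
    "Sep 14 10:02:11 ftp1 pure-ftpd:  (jdoe@10.0.0.25) [4321] [INFO] Timeout",
    "Sep 14 10:02:12 ftp1 pure-ftpd:  (jdoe@10.0.0.25) [4321] [INFO] Logout.",
]
print(find_glob_timeouts(sample))
```

To scan a real log, pass an open file object instead of the sample list.  The path to the messages file is system dependent (commonly /var/log/messages, which is an assumption here).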

Resolution

Within the novell-oes-pure-ftpd code, the "GLOB_TIMEOUT" setting, which controls a timer for wildcard operations, has been increased from 17 seconds to 30 seconds so that operations on larger directories are more likely to complete.

This change is first available in OES November 2015 Maintenance, specifically in:
 
OES 11 SP2, novell-oes-pure-ftpd 1.0.22-33.52.56.1
OES 2015, novell-oes-pure-ftpd  1.0.22-33.63.1

Depending on network speeds and directory sizes, this problem may still occur.  Other options for avoiding this timeout are:
 
1) Organize the directory structure to have fewer files in each directory.
2) Move the files to storage that is local to the FTP server.
3) Increase network connection speed between the FTP server and the remote OES server where the NSS volume resides.
4) Change the location of the FTP server relative to the NCP/NSS volume so that slow network links are avoided.
5) Add an additional FTP server on the same LAN as the NCP/NSS volume, and use that FTP server instead.

Cause

Wildcard operations can be expensive, and in theory can be used for denial of service attacks.  Therefore, pure-ftpd limits how long it will allow them to perform work.  The limit in mainstream pure-ftpd is 17 seconds, set in the source code with a "GLOB_TIMEOUT".
 
When dealing with local file systems (i.e. local to the FTP server), this timer would rarely be exceeded.  However, novell-oes-pure-ftpd is a highly customized version of the package, which allows users to reach remote NCP volumes.  This potentially introduces a lot of latency in obtaining remote directory listings.  Depending on the size of the directory and the speed of the network links (especially if a WAN link is involved), a GLOB_TIMEOUT of 17 seconds could easily be exceeded.
 
To provide more time for these operations, without introducing too much potential delay or risk, OES development has raised this timeout within novell-oes-pure-ftpd to 30 seconds.  (The more mainstream pure-ftpd package which comes with certain SUSE distributions has not been modified.)
 
Note that a wildcard operation consists of multiple tasks.  The glob timer does not limit the operation as a whole; it limits certain individual tasks.  If the first wildcard task exceeds the timer, the overall operation stops immediately (i.e. after 17 or 30 seconds, depending on which version of novell-oes-pure-ftpd is in use).  However, if the first glob task completes in time, a second task follows, so the entire process could approach twice the GLOB_TIMEOUT (nearly 34 or 60 seconds) and still succeed, or fail once that time expires.
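
The timing behavior described above can be illustrated with a small Python model.  This is only a sketch of the arithmetic, not pure-ftpd code: the function name, the idea of passing per-task durations as a list, and the sample durations are all assumptions made for illustration.

```python
def wildcard_outcome(task_seconds, glob_timeout):
    """Model a wildcard operation as a sequence of individually timed tasks.

    Returns (succeeded, elapsed_seconds).  A task that would need at least
    glob_timeout seconds is cut off when the timer expires, which ends the
    whole operation.
    """
    elapsed = 0
    for needed in task_seconds:
        if needed >= glob_timeout:
            # The per-task timer fires; the time spent waiting still counts.
            return False, elapsed + glob_timeout
        elapsed += needed
    return True, elapsed

# Hypothetical task durations, in seconds.
print(wildcard_outcome([20, 5], 17))   # (False, 17): first task hits the 17 s timer
print(wildcard_outcome([20, 5], 30))   # (True, 25): same work fits under 30 s
print(wildcard_outcome([16, 16], 17))  # (True, 32): succeeds close to 2 x 17 s
print(wildcard_outcome([16, 18], 17))  # (False, 33): second task times out
```

The third and fourth calls show why the total elapsed time can approach twice the GLOB_TIMEOUT whether the operation finally succeeds or fails.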