XFS metadata corruption and invalid checksum on SAP Hana servers

This document (7022921) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 Service Pack 2 (SLES 12 SP2)
SUSE Linux Enterprise Server for SAP Applications
SAP Hana 2.0
SAN supporting DISCARD / UNMAP Functions

Situation

This issue is only seen on newer SANs that support TRIM / DISCARD through one or both of the WRITE_SAME and UNMAP functions.

The same underlying problem produces several symptoms.
1. XFS metadata corruption.
    kernel: Metadata corruption detected at xfs_agf_read_verify+0x78/0x140 [xfs], xfs_agf block 0x707a601
    kernel: XFS (dm-15): Unmount and run xfs_repair
    kernel: XFS (dm-15): First 64 bytes of corrupted metadata buffer:
    kernel: c000002f73357380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
    kernel: c000002f73357390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

2. SAP checksum errors during backup, caused by an underlying OS I/O issue.
Messages such as the following appear in the indexserver trace files:
  FaultProtectionImpl.cpp(01620) : NOTE: full crash dump will be written to /usr/sap/<SID>/HDB00 /sapqsgdb01/trace/DB_<SID>/indexserver_sapqsgdb01.30003.crashdump.20180428-100239.091877.trc

Numerous corrupt .dmp files appear in the SAP working directories:
 indexserver.perspage.20180428-092540580.0x000000000001b918L.0x0000700000087dc8P.0.corrupt.dmp
 nameserver.perspage.20180404-152641740.0x000000000001b42eL.0x0000800000070050P.0.corrupt.dmp

Hana Studio shows a checksum error when trying to back up SYSTEMDB:
 [447] backup could not be completed, Wrong checksum Calculated xxxxxxxxx with checksum algorithm 3 (CRC32)

3. Multipath and SCSI errors after fstrim.service has run. The trim can overload the SAN with I/O requests, especially when multiple SLES12 SP2 servers connected to the same back-end SAN run fstrim.service at the same time.

 kernel: sd 3:0:3:37: Cancelling outstanding commands.
 kernel: sd 4:0:0:8: [sdac] tag#4 Command (42) failed: transaction cancelled (200:600) flags: 0 fcp_rsp: 0, resid=0, scsi_status: 0
  
 multipathd[44370]: 360050768018000222000000000008efa: sdlf - tur checker timed out
 multipathd[44370]: checker failed path 67:464 in map 360050768018000222000000000008efa
 multipathd[44370]: 360050768018000222000000000008efa: remaining active paths: 6
 kernel: device-mapper: multipath: Failing path 67:464.
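
The log signatures quoted above can be searched for in one pass. The following is a minimal sketch, not an official diagnostic tool: the function names and the category labels are hypothetical, and the match strings are simply taken from the sample messages in this document, so other kernel or SAP versions may word them differently.

```shell
# Classify one log line against the three symptom groups described above.
# The category labels are hypothetical names chosen for this sketch.
classify_line() {
    case "$1" in
        *"Metadata corruption detected"*|*"Unmount and run xfs_repair"*)
            echo "xfs-metadata-corruption" ;;
        *"tur checker timed out"*|*"multipath: Failing path"*|*"Cancelling outstanding commands"*|*"transaction cancelled"*)
            echo "multipath-scsi-error" ;;
        *"Wrong checksum"*|*".corrupt.dmp"*)
            echo "hana-checksum-error" ;;
        *)
            # No known signature in this line.
            return 1 ;;
    esac
}

# Read log lines on stdin and print a count per symptom category.
summarize_symptoms() {
    while IFS= read -r line; do
        classify_line "$line" || true
    done | sort | uniq -c
}
```

For example, to summarize the journal of the current boot:
  journalctl -b | summarize_symptoms
The SAP checksum messages are written to the HANA trace files rather than the system log, so those files would have to be fed to the function separately.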

Resolution

There are three possible solutions, starting with the fastest and easiest.
1. Stop and disable fstrim.timer so that fstrim.service no longer runs. The timer schedules fstrim.service, which runs "fstrim -a", every Sunday night at midnight.
  systemctl stop fstrim.timer
  systemctl disable fstrim.timer
1a. Alternatively, the "UNMAP" functionality can be turned off on the storage array. Contact your storage vendor for details.

2. Apply the kernel mass PTF.
Kernel Version: 4.4.120-92.70.1
Kernel update for x86_64:
Kernel update for ppc64:

Kernel Version: 4.4.121-92.73.1
Kernel update for x86_64:
Kernel update for ppc64:

With the above kernel patch, the provisioning mode should be set correctly, and the kernel should no longer erroneously issue UNMAP/WRITE SAME to the affected devices.

3. Upgrade to SLES12 SP3
This issue should not occur in SLES12 SP3, which already includes significant work to properly support the TRIM/DISCARD functions (WRITE_SAME and UNMAP).

To continue using fstrim.service on SLES12 SP2, apply the kernel PTF listed in item 2 above, which should properly detect the provisioning modes supported by the SAN.
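
Before re-enabling fstrim.timer on SLES12 SP2, it may help to confirm that the running kernel is at or above the first PTF level listed in item 2. The sketch below makes two assumptions that are not stated in this document: that "uname -r" reports a release such as "4.4.120-92.70-default" (the package version without its last component, plus a flavor suffix), and that GNU "sort -V" is available. The function names and the FIXED_LEVEL variable are hypothetical. The comparison is only meaningful for SP2 kernels; SP3 kernels follow their own release numbering and should not be compared against this value.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip a trailing kernel flavor such as "-default" from a release string.
# Assumption: the release looks like "4.4.120-92.70-default".
strip_flavor() {
    printf '%s\n' "$1" | sed 's/-[a-z][a-z0-9_]*$//'
}

# First PTF level from item 2, with the last version component dropped
# (an assumption about how the package version maps onto "uname -r").
FIXED_LEVEL="4.4.120-92.70"

# Report whether a kernel release string is at or above FIXED_LEVEL.
check_kernel() {
    release=$(strip_flavor "$1")
    if version_ge "$release" "$FIXED_LEVEL"; then
        echo "$1: at or above $FIXED_LEVEL"
    else
        echo "$1: older than $FIXED_LEVEL"
    fi
}
```

To check the running kernel:
  check_kernel "$(uname -r)"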

Cause

SLES12 SP2 supports only one flavor of DISCARD, namely discarding specified sectors. The request is translated into 'UNMAP' or 'WRITE SAME', depending on the capabilities of the device. Because WRITE SAME has a different use case (writing an identical pattern onto the specified sectors and only _optionally_ discarding them), the kernel might incorrectly select WRITE SAME instead of UNMAP and overwrite sectors.

The correct fix is a major rework of the DISCARD handling that separates UNMAP from WRITE SAME. This was done in SLES12 SP3 but was deemed too intrusive for SLES12 SP2.

Additional Information

There are several SCSI commands which provide the UNMAP functionality:
- UNMAP: just discard the blocks; no defined change to the content
- WRITE_SAME: write a bit-pattern to the specified blocks, and possibly discard them.

The contents of an LBA after the UNMAP operation completes depend on the 'LBPRZ' (logical block provisioning read zeros) setting of the Logical Block Provisioning page; they can be vendor specific, zeroes, or a predefined initialization pattern.

The 'WRITE SAME' command provides the 'UNMAP' functionality only if the 'UNMAP' bit in the command parameters is set.
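
Which of these commands the kernel has selected for a given disk can be read from sysfs. The mainline Linux SCSI disk driver exposes a "provisioning_mode" attribute per disk, with values such as "unmap", "writesame_16", "writesame_10", "writesame_zero", "full" and "disabled". The sketch below only reads that attribute; the function name and the optional sysfs-root argument are hypothetical conveniences of this example.

```shell
# Print "<H:C:T:L> <provisioning_mode>" for every SCSI disk.
# $1 is the sysfs root; it defaults to /sys and is a parameter only so the
# function can be pointed at a test directory.
list_provisioning_modes() {
    sysfs_root=${1:-/sys}
    for f in "$sysfs_root"/class/scsi_disk/*/provisioning_mode; do
        # Skip the unexpanded glob when no SCSI disks are present.
        [ -r "$f" ] || continue
        disk=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$disk" "$(cat "$f")"
    done
}
```

Running list_provisioning_modes without arguments lists the disks of the local system. The H:C:T:L identifiers correspond to those in kernel messages such as "sd 3:0:3:37".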

If fstrim / UNMAP is NOT supported on a file system / SAN device, fstrim simply returns with "the discard operation is not supported".
Example:
fstrim -v /sapmnt/LM1/data
    fstrim: /sapmnt/LM1/data: the discard operation is not supported

If fstrim / UNMAP is supported on a file system / SAN device, fstrim reports the number of bytes trimmed.
Example:
fstrim -v /sapmnt/
    /sapmnt: 183.8 GiB (197290381312 bytes) trimmed
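
On an affected system, running fstrim just to test for support would itself issue discards. A read-only alternative is to inspect the block queue limits: in the Linux block layer, a "discard_max_bytes" value of 0 means the device does not support discard. The following is a minimal sketch under that assumption; the function name and the optional sysfs-root argument are hypothetical.

```shell
# Print every block device with its discard capability, without
# sending any discard request to the device.
# $1 is the sysfs root; it defaults to /sys and is a parameter only so the
# function can be pointed at a test directory.
list_discard_support() {
    sysfs_root=${1:-/sys}
    for f in "$sysfs_root"/block/*/queue/discard_max_bytes; do
        # Skip the unexpanded glob when no block devices are present.
        [ -r "$f" ] || continue
        dev=$(basename "$(dirname "$(dirname "$f")")")
        max=$(cat "$f")
        if [ "$max" -gt 0 ]; then
            printf '%s discard supported (max %s bytes)\n' "$dev" "$max"
        else
            printf '%s discard not supported\n' "$dev"
        fi
    done
}
```

The same limits are also shown by "lsblk --discard".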

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID:7022921
  • Creation Date:03-MAY-18
  • Modified Date:01-JUN-18
    • NovellSUSE Linux Enterprise Server for SAP Applications
    • SUSESUSE Linux Enterprise Server
