Service Hints & Tips

Document ID: MCGN-3HKK6P

Servers - RAID Array: Data scrubbing, prevent RAID rebuild failure

Applicable to: World-Wide

Read and understand this document before applying any steps or procedures.

Before installing software or data for the first time on an IBM PC Server RAID system, the following must be performed:

1- UPDATE THE RAID ADAPTER FIRMWARE TO THE FOLLOWING MINIMUM FIRMWARE LEVEL OR HIGHER.


   ADAPTER                                                       FIRMWARE   BIOS

a. Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)   2.21
b. PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)             2.43
c. ServeRAID Adapter FRU p/n06H9334 (Opt. p/n70G8489)            2.23s.6*   2.30.04*
d. ServeRAID Adapter FRU p/n76H6875 (Opt. p/n70G8489)            2.23s.6*   2.30.04*
e. ServeRAID II Adapter FRU p/n76H3587 (Opt. p/n76H3584)         2.30.04*   2.30.04*
f. ServeRAID Onboard Controller                                  97239*     2.30.04*

* The Firmware/BIOS diskette 2.30 contains the BIOS flash 2.30.04 as well as the Firmware flashes.

NOTE: These Firmware and BIOS versions were current when this document was released. Firmware and BIOS levels are subject to change over time. Always check for the latest BIOS and Firmware utility on the IBM Website at URL:
http://www.us.pc.ibm.com/files.html

NOTE: Be sure that the latest version of the corresponding RAID utility diskette is used to ensure compatibility with the latest Firmware and BIOS on the corresponding adapter.

2- INITIALIZE RAID LEVEL 0, 1, AND 5 LOGICAL DRIVES (ALL RAID ADAPTERS).

3- SYNCHRONIZE ALL RAID 5 LOGICAL DRIVES AFTER INITIALIZATION (PRIOR TO INSTALLING SOFTWARE AND DATA) ON THE SERVERAID ADAPTER, SERVERAID II ADAPTER, AND SERVERAID ONBOARD CONTROLLER. OTHERWISE, DATA LOSS MAY OCCUR.

NOTE: Synchronization is done automatically when initializing RAID 5 logical drives on the following adapters:


Micro Channel RAID Adapter FRU p/n92F0335 (Opt. p/n none)
Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)
PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)

4- DATA SCRUB ALL RAID 5 LOGICAL DRIVES USING THE SYNCHRONIZE UTILITY WEEKLY (AFTER SOFTWARE AND DATA ARE INSTALLED) TO PROVIDE A HIGH LEVEL OF PROTECTION AGAINST DATA LOSS.

NOTE: "Data Scrubbing" of the drives may be accomplished in one of two ways on the following adapters:

a. Micro Channel RAID Adapter FRU p/n06H3059 (Opt. p/n70G9263)
b. PCI RAID Adapter FRU p/n06H5078 (Opt. p/n94G2764)
c. ServeRAID Adapter FRU p/n06H9334 (Opt. p/n70G8489)
d. ServeRAID Adapter FRU p/n76H6875 (Opt. p/n70G8489)

- The RAID Utility Diskette may be used to apply "Data Scrubbing" to RAID level 1 and 5 logical drives using the "Synchronize" utility. This method requires that you shut down the server.

- Netfinity Manager 5.0 or higher may be used to run "Data Scrubbing" via Synchronization in the background while the server is up. Users can continue to access data on the logical drive while it runs.

NOTE: See the matrix of utilities vs. adapters vs. Network Operating Systems in the White Paper "Using IBM RAID Adapters to Avoid Data Loss".
The Web URL to search for this White Paper is:
www.us.pc.ibm.com/support.html

Click on "Search" at the top of the page and use "White Paper" as Keywords.

NOTE: "Data Scrubbing" runs automatically in the background on the ServeRAID II Adapter. The Firmware of the adapter must be at 2.30.04 or higher to include this feature.
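Steps 1 to 3 above can be summarized as a small checklist. The following Python sketch is illustrative only: the function name preparation_steps and the two sets are hypothetical, and the grouping of adapters simply restates the notes in this document (synchronization is automatic during RAID 5 initialization on the Micro Channel and PCI RAID Adapters, and must be run by the user on the ServeRAID family).

```python
# Adapter names as used in this document; the grouping restates the notes above.
AUTO_SYNC = {"Micro Channel RAID Adapter", "PCI RAID Adapter"}
MANUAL_SYNC = {
    "ServeRAID Adapter",
    "ServeRAID II Adapter",
    "ServeRAID Onboard Controller",
}


def preparation_steps(adapter, raid_level):
    """Return the steps to perform before installing software or data.

    Hypothetical helper: it only encodes steps 1 to 3 of this document.
    """
    if adapter not in AUTO_SYNC | MANUAL_SYNC:
        raise ValueError("adapter not covered by this document: " + adapter)
    if raid_level not in (0, 1, 5):
        raise ValueError("RAID level not covered by this document")

    steps = ["update firmware", "initialize"]
    # Only RAID 5 needs a separate synchronization, and only on adapters
    # that do not synchronize automatically during initialization.
    if raid_level == 5 and adapter in MANUAL_SYNC:
        steps.append("synchronize")
    return steps


print(preparation_steps("ServeRAID II Adapter", 5))
print(preparation_steps("PCI RAID Adapter", 5))
```

The weekly "Data Scrubbing" in step 4 is left out of the sketch because it applies after software and data are installed.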

DETAILS:
When a hard drive fails and is replaced in a RAID-1 or RAID-5 array, data loss may occur if a sector on one of the remaining working drives cannot be read.

RAID-5 logical drives must be synchronized immediately after they are created to ensure that the parity stripe units accurately reflect the data.

The IBM ServeRAID Adapter, the IBM ServeRAID II Adapter and the ServeRAID Onboard Controller require the user to synchronize the RAID 5 logical drives after initialization and before any data is stored on the drives.
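Why synchronization matters can be shown with a short Python sketch. It assumes plain XOR parity, the scheme generally used for RAID 5; the actual layout used by the adapter firmware is not described in this document, and the names xor_blocks, stale_parity and synced_parity are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data stripe units of one stripe (arbitrary example values).
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]

# Parity as it might look before synchronization: leftover, unrelated bytes.
stale_parity = b"\x00\x00"

# Synchronization recomputes parity so that it reflects the data.
synced_parity = xor_blocks(data)

# Rebuild stripe unit 1 from the remaining units plus parity.
rebuilt_ok = xor_blocks([data[0], data[2], synced_parity])
rebuilt_bad = xor_blocks([data[0], data[2], stale_parity])

print(rebuilt_ok == data[1])   # True: synchronized parity recovers the data
print(rebuilt_bad == data[1])  # False: stale parity yields incorrect data
```

With stale parity the rebuild completes without error but returns wrong bytes, which is why an array that was never synchronized may lose data during a later reconstruction.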

"Data Scrubbing" is recommended as a preventative maintenance procedure to reduce the risk of an array rebuild failure, or possible data loss if using the ServeRAID adapter. IBM recommends that "Data Scrubbing" be run weekly to provide a high level of protection. The level of protection increases as more frequent "Data Scrubbing" is performed. To reduce the frequency of "Data Scrubbing" to once or twice a month and still maintain a high level of protection, schedule "Data Scrubbing" along with other preventative maintenance procedures like regular tape backups.

Over time a hard disk may accumulate grown defects. This is normal. Defects are corrected on accessed files by the hardfile ECC or the RAID subsystem: if a grown defect is encountered when a file is accessed, the data is reconstructed using either the ECC on the hardfile or the RAID redundant information. However, if a grown defect appears in an area that is not accessed (because the area is free space, or because the file is served from cache), then "Data Scrubbing" is required to detect it. Once the defect is detected, the hardfile reallocates the sector. When all drives are online, the ECC on the hardfile or the RAID redundant information is used to reconstruct the lost stripe unit. However, if one drive has a grown defect and another drive has failed completely, there is not enough information to reconstruct the data, and data loss may occur after the rebuild.
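The two cases described above can be compared in a minimal Python sketch. It assumes a single stripe with two data units and one XOR parity unit, and it marks an unreadable stripe unit with None; the helper names xor_blocks and reconstruct are hypothetical and do not correspond to any IBM utility.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, missing):
    """Rebuild stripe[missing] from the other units, or return None.

    A unit set to None is unreadable (failed drive or grown defect).
    """
    others = [unit for i, unit in enumerate(stripe) if i != missing]
    if any(unit is None for unit in others):
        return None  # not enough redundant information left
    return xor_blocks(others)


d0, d1 = b"\xAA\x0F", b"\x55\xF0"
stripe = [d0, d1, xor_blocks([d0, d1])]  # drive 0, drive 1, parity drive

# Case 1: scrubbing finds a grown defect on drive 1 while all drives are online.
scrubbed = [stripe[0], None, stripe[2]]
repaired = reconstruct(scrubbed, 1)
print(repaired == d1)  # True: the lost stripe unit is recovered

# Case 2: the defect on drive 1 was never detected, then drive 0 fails.
degraded = [None, None, stripe[2]]
lost = reconstruct(degraded, 0)
print(lost is None)  # True: the rebuild cannot recover this stripe unit
```

The sketch shows only the redundancy argument: one unreadable unit per stripe can be recovered, two cannot, so finding defects while all drives are online is what protects a later rebuild.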

Predictive Failure Analysis (PFA) has been developed to monitor the performance of drives, analyze data from periodic internal measurements, and recommend replacement when specific thresholds are exceeded. The data from periodic internal measurements is collected when actual accesses of the data sectors occur. "Data Scrubbing", which forces all data sectors to be read, provides more data to improve the accuracy of PFA.

IBM recommends that customers read the following White Papers to ensure a thorough understanding of RAID and hardfile technologies:

Document                                                       Faxback Document #

- Using IBM RAID Adapters to Avoid Data Loss                   11202
- Understanding Hard Disk drive Media Defects.                 11205
- Ensuring High Availability of Your Raid Subsystem with:      11204
  > IBM SCSI-2 Fast/Wide PCI-Bus RAID Adapter.
  > IBM Fast/Wide Streaming RAID Adapter.
- Ensuring High Availability Using the PC ServeRAID Adapter.   11203


The IBM Faxback system may be accessed by calling 1-800-IBM-3395

- - - - - - - - - - - - - - - OR - - - - - - - - - - - - - - - -

- The IBM Website at URL: http://www3.pc.ibm.com/support. Choose Servers, then choose Hints and Tips.

NOTE: WITH THE SERVERAID ADAPTER, SERVERAID ONBOARD CONTROLLER AND SERVERAID II ADAPTER, SYNCHRONIZATION IS REQUIRED TO ENSURE THE PARITY ACCURATELY REFLECTS THE DATA. IF SYNCHRONIZATION OR DATA SCRUBBING IS PERFORMED ON AN ARRAY THAT WAS NEVER PREVIOUSLY SYNCHRONIZED, THEN ANY MEDIA DEFECTS FOUND THAT REQUIRE RAID RECONSTRUCTION MAY BE REBUILT USING INCORRECT PARITY WHICH MAY RESULT IN DATA LOSS.

NOTE: Use the "IBM PC ServeRAID Synch Verify Update Diskette" ver 1.10 or higher to determine the status of any RAID arrays on the ServeRAID adapter ONLY. Be sure to read the README file prior to executing any programs on the diskette. The diskette can be located at and downloaded from the IBM Website at URL:
http://www.us.pc.ibm.com/files.html



Search Keywords: PSY2, PSY2ADPT, D/T8640, D/T8642, 320, 06H5078, 06H3059, 92F0335, 06H9334, DDD, DEFUNCT, 520, 720, SERVER, 500, SYNCHRONIZE, RAID, 320, SCRUB, D/T8639, 325, 330, 704, D/T8650, DATA SCRUBBING, REBUILD FAILS, DATA LOSS, HARDFILE, PARITY, D/T8639, D/T8640, D/T8641, D/T8642, D/T8650, D/T8651, RAID BIOS, RAID FIRMWARE, UNCLASSIFIED, NETFINITY 7000, HEALTH

Hint Category: RAID, Retain

Date Created: 30-05-97

Last Updated: 22-02-99

Revision Date: 22-02-2000

Brand: IBM PC Server

Product Family: Netfinity 7000, PC Server 320, PC Server 325, PC Server 330, PC Server 500, PC Server 704, PC Server 720, ServeRAID

Machine Type: 8651, 8640, 8639, 8641, 8650, 8642, Various

Model: TypeModel

Retain Tip (if applicable): H134082

Reverse Doclinks and Admin Purposes: USA=A, EMA=A, AFE=A, Owning B.U.: USA, Date created: O96/09/18, Date last altered: A97/11/14