Category: NetApp

Netapp 7-mode snapvault broken relationship check

Netapp Protection Manager is good at handling snapvault and snapmirror relationships. However, a snapvault relationship becomes “broken” once the source qtree is deleted or removed. Hunting through the output of “snapvault status” is killing me, especially when there are hundreds of lines to review. SSH with a “grep” helps, but it would be great to have a script that (almost) spoon-feeds us!

Here is the Python script to run in a Linux environment:

#!/usr/bin/python
import sys
import os

# Define max tolerated hours; 168 = 7 days
max_tolerate_hour = 168

if len(sys.argv) < 2 or sys.argv[1] not in ["filerA", "filerB", "filerC"]:
    print """This script only works for 3 filers: filerA, filerB, filerC.
Example:
python /home/thinkway/myscript/python/snapvault_relationship_check.py filerA

Exiting...
"""
    sys.exit(1)

filer_input = sys.argv[1]
print "Running..."

# "snapvault status" columns: Source, Destination, State, Lag, Status
f = os.popen("ssh %s snapvault status | grep vol" % (filer_input))
for x in f.readlines():
    output_data = x.split()
    # Skip the header or any incomplete line
    if len(output_data) < 5:
        continue
    # Define variables
    source_path = output_data[0]
    dest_path = output_data[1]
    relationship_state = output_data[2]
    # We are only interested in the hours, split them out!
    buffer_time = output_data[3].split(":", 1)
    relationship_status = output_data[4]
    # Get destination nas hostname
    dest_nas = output_data[1].split(":", 1)
    dest_nas_hostname = dest_nas[0]
    # A lag of "-" means no completed transfer, nothing to measure
    if buffer_time[0] == "-":
        continue
    # Get the exact hour number and convert it into an int
    extracted_hour = int(buffer_time[0])
    if relationship_status == "Idle" and extracted_hour > max_tolerate_hour:
        print "Source path         : ", source_path
        print "Destination path    : ", dest_path
        print "Max threshold(hours): ", max_tolerate_hour
        print "Idle (hours)        : ", extracted_hour
        print "Command             : ssh ", dest_nas_hostname, " snapvault stop ", dest_path
        print "======================================================================"
f.close()
print "Scan completed!"

The output will be:

Running....
 
Source path         :  Source_filer:/vol/testvol/qtreeA
Destination path    :  filerA:/vol/filerX_testvol/qtreeA
Max threshold(hours):  168
Idle (hours)        :  10207
Command             : ssh  filerA  snapvault stop  filerA:/vol/filerX_testvol/qtreeA
======================================================================
Source path         :  Source_filer:/vol/rnd/r&d
Destination path    :  filerA:/vol/filerX_rnd/rxd
Max threshold(hours):  168
Idle (hours)        :  9884
Command             : ssh  filerA  snapvault stop  filerA:/vol/filerX_rnd/rxd
======================================================================
Source path         :  Source_filer:/vol/shared/sample
Destination path    :  filerA:/vol/filerX_shared/sample
Max threshold(hours):  168
Idle (hours)        :  5875
Command             : ssh  filerA  snapvault stop  filerA:/vol/filerX_shared/sample
======================================================================

See! Just copy and paste the “Command” line!
P.S.: If you have a more simplified piece of code to do this job, feel free to share it with me and I will post it here 🙂

Netapp Ontap 7-mode command cheat sheet

 

Netapp Ontap cheat sheet

I have a list of command references that I use for my daily operational tasks. It is not complete, but once you know most of the commands in the list it will speed up your work, and it is useful for Linux shell scripting (checking aggr size, vol size, etc.). One page was the limit for this reference, and I tried my very best to squeeze in as much as I could.
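For example, here is a minimal sketch of wrapping two of the listed commands in a script; it is not part of the cheat sheet itself, it assumes passwordless SSH (see the SSH post further down), and “myfiler” is a placeholder hostname:

#!/usr/bin/python
# Minimal sketch: run a couple of read-only 7-mode commands over SSH and print
# the output. "myfiler" is a placeholder; passwordless SSH is assumed.
import os

filer = "myfiler"

# "df -A" reports aggregate space usage, "df -g" reports volume usage in GB
for cmd in ["df -A", "df -g"]:
    output = os.popen("ssh %s %s" % (filer, cmd))
    print output.read()
    output.close()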

Feel free to let me know your comments. I will share the Excel spreadsheet if you need to translate it into your own language.

 

Netapp smtape from site A to site B – Part 2

Networker way

In Part 2 there are only 4 steps. This option requires fundamental knowledge of Networker. I am using Networker 8.6.

Steps are:

1. smtape directly to tape using Networker.

2. Send the tape to the destination site

3. smtape restore directly to snapmirror destination volume.

4. snapmirror resync!

 

I am not going to cover how to create the client/group/media pool. Here is the command I used for smtape backup.

Once you have the tape at the destination site, load it:

You MUST use option “-S” (SSID) for smtape restoration.
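If you do not have the SSID handy, it can be pulled from the Networker media database with “mminfo”. Here is a minimal sketch (run on the backup server; “filer_A” matches the example client below, adjust the query to your setup):

#!/usr/bin/python
# Minimal sketch: list save set IDs (SSID) for the filer client using
# Networker's mminfo. Run on the backup server; "filer_A" is the example client.
import os

client = "filer_A"
query = os.popen('mminfo -q "client=%s" -r "ssid,name,savetime"' % client)
print query.read()
query.close()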

[root@backup_server]# nsrndmp_recover -c filer_A -S 2604589186 -R filer_B -m /vol/test_restore
42795:nsrndmp_recover: Performing recover from Non-NDMP type of device
Host = backup_server (192.168.1.123) port = 9861
42690:nsrndmp_recover: Performing non-DAR Recovery..
42937:nsrdsa_recover: Performing Immediate recover
42940:nsrdsa_recover: Reading Data...
42942:nsrdsa_recover: Reading data...DONE.
42927:nsrndmp_recover: Successfully done

After the restoration is done, check the snapmirror status and run the resync command. Please refer to Part 1 if you do not know how to check the snapmirror status and run the resync.
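If you prefer to script that check, here is a minimal sketch (it assumes passwordless SSH to filer_B and uses the volume name from the example; the resync source is left as a placeholder):

#!/usr/bin/python
# Minimal sketch: show the snapmirror status of the restored volume so you can
# confirm the smtape baseline is visible before running "snapmirror resync".
import os

dest_filer = "filer_B"
dest_vol = "test_restore"

status = os.popen("ssh %s snapmirror status %s" % (dest_filer, dest_vol)).read()
print status
if "Snapmirrored" in status:
    print "Baseline found - ready for: snapmirror resync -S <source_filer>:<source_vol> %s:%s" % (dest_filer, dest_vol)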

Limitation:
1. SMtape allows backup and recovery of full volumes only.
2. Recovery requires the save set ID (SSID).
3. Do not remove the smtape snapshot at the source until the snapmirror resync has run.

Netapp smtape from site A to site B – Part 1

I needed to run snapmirror for an 8TB volume from site A to site B through the WAN. The initial effort to get the snapmirror initialization done via WAN failed; it took more than 3 weeks to run the baseline and it was still failing. Here comes SnapMirror to tape (smtape) to save me from wasting my time praying for the initialization to get through. I recently had my Oracle StorageTek SL150 installed at site B, and here come the challenges.

My company is using EMC Networker 8. I have experience with Legato Networker (before it was acquired by EMC), and it was not so friendly. I was then introduced to Commvault; it can be very user friendly, but due to its flexibility it is sometimes complicated too. There are 2 ways for me to accomplish my mission: export the data to tape, send it to the other site, restore the tape data, and snapmirror resync!

1. Simple way – could apply to any backup software

2. Networker way

Part 1 covers the “simple way”.

Ugly screenshot

simple way

There are 6 steps involved:

1. smtape to a source temp volume

2. Back up the source temp volume to tape using your backup software

3. Send the tape to the destination site

4. Restore the data from tape to a destination temp volume

5. smtape restore from the temp volume to the exact volume

6. snapmirror resync to establish the connection.

 
Example output:
1. Create a temp volume and smtape to it

filer_A> vol create smtape_test_source -l en_US aggr0 50g
Creation of volume 'smtape_test_source' with size 50g on containing aggregate
'aggr0' has completed.
 
filer_A> qtree create /vol/smtape_test_source/source
filer_A> priv set diag
Warning: These diagnostic commands are for use by NetApp
         personnel only.
filer_A*> smtape backup /vol/VMware_ISO /vol/smtape_test_source/source/dump_file
Job 13 started.
filer_A*> smtape status
Job ID Seq No Type    Status      Path                   Device                 Progress
    13      0 Backup  Active      /vol/VMware_ISO        /vol/smtape_test_source/source/dump_file 230.676 MB
filer_A*> smtape status
job not found

2. [Back up to tape using your backup software]
3. [Send the tape to the destination site]

4. Restore the tape data to the temp volume. I am using the Networker “recover” command to restore the data from tape to the filer.

filer_B> vol create smtape_test_dest -l en_US aggr0 60g
 
[root@backup_server]# recover -R filer_B -c filer_A -d /vol/smtape_test_dest /vol/smtape_test_source
Current working directory is /vol/smtape_test_source/
recover> dir
total 256
03/11/13 17:24            <DIR>         source
recover> pwd
/vol/smtape_test_source/
recover> list
0 file(s) marked for recovery
recover> destination
recover files into /vol/smtape_test_dest
recover> ls
source
recover> cd source
recover> ls
dump_file
recover> cd ..
recover> ls -l
total 256
drwxrwxrwx root             4096 Mar 11 17:24 source
recover> add source
2 file(s) marked for recovery
recover> recover
Recovering 2 files within /vol/smtape_test_source/ into /vol/smtape_test_dest
Volumes needed (all on-line):
        A00035 at /dev/rmt/0cbn
Total estimated disk space needed for recover is 2113 MB.
Requesting 2 file(s), this may take a while...
42795:nsrndmp_recover: Performing recover from Non-NDMP type of device
Host = backup_server.example.com (192.168.1.123) port = 9661
42689:nsrndmp_recover: Performing DAR Recovery..
42617:nsrndmp_recover: NDMP Service Log: DIRECT ACCESS RECOVERY (DAR) requested
 
42937:nsrdsa_recover: Performing Immediate recover
42940:nsrdsa_recover: Reading Data...
42617:nsrndmp_recover: NDMP Service Log: RESTORE: RESTORE IS DONE
 
42942:nsrdsa_recover: Reading data...DONE.
42927:nsrndmp_recover: Successfully done

5. Use smtape to restore the data from the temp volume to the snapmirror destination volume.

filer_B*> vol create test_restore -l en_US aggr0 60g
Creation of volume 'test_restore' with size 60g on containing aggregate
'aggr0' has completed.
 
filer_B*> vol restrict /vol/test_restore
Volume 'test_restore' is now restricted.
 
filer_B*> smtape restore /vol/test_restore /vol/smtape_test_dest/source/dump_file
 
Job 9 started.
filer_B*> smtape status
Job ID Seq No Type    Status      Path                   Device                 Progress
    11      0 Restore Active      /vol/test_restore      /vol/smtape_test_dest/source/dump_file 430.848 MB

6. Once the restoration is completed, check the snapmirror status to see if the connection is ready. Resync once you see it!

filer_B*> snapmirror status
Snapmirror is on.
Source                                                      Destination                                State          Lag        Status
snapshot_for_smtape.8add9a0e-8a2d-11e2-a408-123478563412.0  filer_B:test_restore                      Snapmirrored   21:08:21   Idle
filer_B*> snapmirror resync -S filer_A:VMware_ISO filer_B:test_restore
NOTE: Destination volume test_restore is already a replica.
NOTE: Resync will not need to revert the volume.
The resync base snapshot will be: snapshot_for_smtape.8add9a0e-8a2d-11e2-a408-123478563412.0
Are you sure you want to resync the volume? y
Tue Mar 12 14:35:17 MYT [filer_B:replication.dst.resync.success:notice]: SnapMirror resync of test_restore to filer_A:VMware_ISO was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.

The drawback of this method is that it requires 2x the space at the destination during the restoration. It is simple but space consuming. It is a good option if you have the extra space in your destination filer.

Part II will cover the restoration using Networker 8.6P4.

Netapp 8.1.x Maximum aggregate size

Source

It took me a while to search for the maximum 64-bit aggregate size.

SSH to Netapp without password

Assuming you have

1. Linux box

2. Netapp box

You need to SSH to the Netapp box without a password. Please follow the guide below; it is taken from the Netapp site, apart from the last steps, which they did not include: turning “ssh2.enable” off and on in order to reset ssh2.

Setup SecureAdmin:

  1. Configure SecureAdmin to enable SSH2 to only accept defaults when it comes to selecting key size.
    Example:

    filer> secureadmin setup ssh

    SSH Setup
    ---------
    Determining if SSH Setup has already been done before...no

    SSH server supports both ssh1.x and ssh2.0 protocols.

    SSH server needs two RSA keys to support ssh1.x protocol. The host key is generated and saved to file /etc/sshd/ssh_host_key during setup. The server key is re-generated every hour when SSH server is running.

    SSH server needs a RSA host key and a DSA host key to support ssh2.0 protocol. The host keys are generated and saved to /etc/sshd/ssh_host_rsa_key and /etc/sshd/ssh_host_dsa_key files respectively during setup.

    SSH Setup prompts for the sizes of the host and server keys.
    For ssh1.0 protocol, key sizes must be between 384 and 2048 bits.
    For ssh2.0 protocol, key sizes must be between 768 and 2048 bits.
    The size of the host and server keys must differ by at least 128 bits.

    Please enter the size of host key for ssh1.x protocol [768] :
    Please enter the size of server key for ssh1.x protocol [512] :
    Please enter the size of host keys for ssh2.0 protocol [768] :

    You have specified these parameters:
    host key size = 768 bits
    server key size = 512 bits
    host key size for ssh2.0 protocol = 768 bits
    Is this correct? [yes]

  2. Setup will now generate the host keys in the background. This could take a few minutes to complete. After the setup is complete, start the SSH server using the 'secureadmin enable ssh' command. A syslog message is generated when the setup is complete.
    filer> Wed Oct 25 05:59:56 GMT [rc:info]: SSH Setup: SSH Setup is done. Host keys are stored in /etc/sshd/ssh_host_key, /etc/sshd/ssh_host_rsa_key and /etc/sshd/ssh_host_dsa_key.

 

Linux:

  1. Configure and enable SSH on the Storage Controller as outlined in the SecureAdmin setup section above.
  2. Test SSH access from the Linux client:
    linux> ssh root@filer ?
  3. From the Linux client, Generate the public/private key pair:
    linux> ssh-keygen -t rsa
  4. When asked for a 'passphrase', do not enter one. Just press Enter twice.
  5. Mount the Storage Controller’s root volume to a temporary path on the Linux client:
    linux> mount filer:/vol/vol0 /mnt/filer
  6. Create a folder on the storage controllers root volume: /etc/sshd/<username>/.ssh
    linux> mkdir -p /mnt/filer/etc/sshd/<username>/.ssh
    Note: An error may be generated if this path already exists. This can be safely ignored.
  7. Append the contents of the id_rsa.pub file to the 'authorized_keys' file:
    linux> cat ~/.ssh/id_rsa.pub >> /mnt/filer/etc/sshd/<username>/.ssh/authorized_keys
  8. Set the correct permissions on the .ssh folder and authorized_keys file:
    linux> chmod 700 /mnt/filer/etc/sshd/<username>/.ssh
    linux> chmod 600 /mnt/filer/etc/sshd/<username>/.ssh/authorized_keys
  9. filer> options ssh2.enable off
  10. filer> options ssh2.enable on
  11. Test that SSH to the Storage Controller does not prompt for a password:
    linux> ssh <user>@filer
    filer>
  12. Unmount the Storage Controller’s root volume:
    linux> cd ~
    linux> umount /mnt/filer
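Since the usual reason for passwordless SSH is scripting, a quick way to confirm it really is non-interactive is to force SSH batch mode. A minimal sketch (this is a generic OpenSSH check, not part of the Netapp guide; "root@filer" is a placeholder, use the user whose key you installed):

#!/usr/bin/python
# Minimal sketch: verify key-based SSH to the filer works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
import os

rc = os.system("ssh -o BatchMode=yes root@filer version")
if rc == 0:
    print "Passwordless SSH is working"
else:
    print "SSH is still prompting or failing - recheck authorized_keys and ssh2.enable"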

Netapp smtape to external harddisk (workaround)

smtape is useful if you need to transfer a huge volume over a low-bandwidth network. Let's say I have a 7TB volumeA to transfer over to a remote site; it is going to take forever to transfer. smtape comes in as a solution: you can smtape to a tape drive and send the tape over to your remote site.

But what if your remote site does not have any tape library? I am going to show you the workaround: smtape to a volume (not a tape drive). smtape will create a single dump file and you can just copy it via CIFS/NFS to your external harddisk. Send it over to the remote site, then attach the USB harddisk to one of your Windows/Linux servers and copy it back into the Netapp via CIFS/NFS (temp volume).
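Because the dump file is huge and travels on a USB disk, it is worth verifying the copy. Here is a minimal sketch (the paths are placeholders for your CIFS/NFS mount of the temp volume and the external disk):

#!/usr/bin/python
# Minimal sketch: copy the smtape dump file to the external disk and compare
# MD5 checksums so you know the file survived the trip. Paths are placeholders.
import hashlib
import shutil

src = "/mnt/filer_temp_vol/dump/disk_file"   # NFS/CIFS mount of the temp volume
dst = "/mnt/usb_disk/disk_file"              # external USB harddisk

def md5sum(path):
    # Read in chunks so a multi-TB file does not blow up memory
    h = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(64 * 1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

shutil.copyfile(src, dst)
if md5sum(src) == md5sum(dst):
    print "Copy verified"
else:
    print "Checksum mismatch - copy the file again"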

Steps
Source : VMware_ISO
Destination : test_restore
smtape temp volume : test_sm2t

Prerequisite : priv set diag

1. smtape backup /vol/VMware_ISO /vol/test_sm2t/dump/disk_file
2. vol restrict test_restore
3. smtape restore /vol/test_restore /vol/test_sm2t/dump/disk_file
4. snapmirror status

Snapmirror is on.
Source                                                      Destination                                           State          Lag        Status
snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0  mynas01:test_restore                                 Snapmirrored   00:01:09   Idle
mynas01:VMware_ISO                                         snapmirror_tape_6a1045a6-1050-11e2-bc32-00a098186ab2  Source         00:01:09   Idle

5. snap list VMware_ISO

Volume VMware_ISO
working...
 
  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Oct 05 15:27  snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0 (snapmirror)

6. snap list test_restore

Volume test_restore
working......
 
  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Oct 05 15:27  snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0

7. snapmirror resync -S mynas01:VMware_ISO mynas01:test_restore

mynas01*> snapmirror resync -S mynas01:VMware_ISO mynas01:test_restore
NOTE: Destination volume test_restore is already a replica.
NOTE: Resync will not need to revert the volume.
The resync base snapshot will be: snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0
Are you sure you want to resync the volume? y
Fri Oct  5 15:30:04 MYT [mynas01: replication.dst.resync.success:notice]: SnapMirror resync of test_restore to mynas01:VMware_ISO was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.
 
mynas01*> snapmirror status
Snapmirror is on.
Source                Destination                                           State          Lag        Status
mynas01:VMware_ISO   mynas01:test_restore                                 Snapmirrored   00:00:52   Idle
mynas01:VMware_ISO   snapmirror_tape_6a1045a6-1050-11e2-bc32-00a098186ab2  Source         00:03:41   Idle

8. snapmirror status -l mynas01:test_restore

Source:                 mynas01:VMware_ISO
Destination:            mynas01:test_restore
Status:                 Idle
Progress:               -
State:                  Snapmirrored
Lag:                    00:01:14
Mirror Timestamp:       Fri Oct  5 15:30:05 MYT 2012
Base Snapshot:          mynas01(1574419105)_test_restore.1
Current Transfer Type:  -
Current Transfer Error: -
Contents:               Replica
Last Transfer Type:     Resync
Last Transfer Size:     100 KB
Last Transfer Duration: -
Last Transfer From:     mynas01:VMware_ISO

9. snap list VMware_ISO

Volume VMware_ISO
working...
 
  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Oct 05 15:30  mynas01(1574419105)_test_restore.1 (snapmirror)
  0% ( 0%)    0% ( 0%)  Oct 05 15:27  snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0 (snapmirror)

10. snap list test_restore

Volume test_restore
working...
 
  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Oct 05 15:30  mynas01(1574419105)_test_restore.1
  0% ( 0%)    0% ( 0%)  Oct 05 15:27  snapshot_for_smtape.6a1045a6-1050-11e2-bc32-00a098186ab2.0

 

Netapp 32bit aggregate upgrade to 64bit

Netapp released 8.1GA in May 2012 with quite a number of improvements, and the most eye-catching is the capability to upgrade an aggregate from 32-bit to 64-bit. I had been waiting for 6 months and was eager to try 8.1RC, but an RC is not a good choice for production. Always remember not to risk your production NAS!

Back to the story: 32-bit to 64-bit is a great feature. Why so? Because you no longer need to perform a migration, i.e. doing QSM from a 32-bit aggregate to a 64-bit aggregate. Getting downtime from the client is another pain too. This is a beautiful feature and I must praise Netapp for doing a great job!

Here is my test plan:
1. Get a NAS with Ontap 8.0.2 – 7-Mode
2. Create a 32-bit “newaggr” and add disks until it reaches 15-16TB
3. Upgrade to Ontap 8.1 – 7-Mode
4. Upgrade “newaggr” from 32-bit to 64-bit

 

Useful Commands

aggr add aggrname
    [ -f ]
    [ -n ]
    [ -g {raidgroup | new | all} ]
    [ -c checksum-style ]
    [ -64bit-upgrade {check | normal} ]
    { ndisks[@size] | -d disk1 [ disk2 … ] [ -d diskn [ diskn+1 … ] ] }

## Add an additional disk to aggregate pfvAggr; use “aggr status” to get the group name
aggr status pfvAggr -r
aggr add pfvAggr -g rg0 -64bit-upgrade normal -d v5.25

## Add 4 x 300GB disks to aggregate aggr1
aggr add aggr1 -64bit-upgrade check 4@300

Adds disks to the aggregate named aggrname. Specify the disks in the same way as for the aggr create command. If the aggregate is mirrored, then the -d argument must be used twice (if at all).

If the size option is used, it specifies the disk size in GB. Disks that are within approximately 20% of the specified size will be selected. If the size option is not specified, existing groups are appended with disks that are the best match by size for the largest disk in the group, i.e., equal or smaller disks are selected first, then larger disks. When starting new groups, disks that are the best match by size for the largest disk in the last raidgroup are selected. The size option is ignored if a specific list of disks is specified.

If the -g option is not used, the disks are added to the most recently created RAID group until it is full, and then one or more new RAID groups are created and the remaining disks are added to new groups. Any other existing RAID groups that are not full remain partially filled.

The -g option allows specification of a RAID group (for example, rg0) to which the indicated disks should be added, or a method by which the disks are added to new or existing RAID groups.

If the -g option is used to specify a RAID group, that RAID group must already exist. The disks are added to that RAID group until it is full. Any remaining disks are ignored.

If the -g option is followed by new, Data ONTAP creates one or more new RAID groups and adds the disks to them, even if the disks would fit into an existing RAID group. Any existing RAID groups that are not full remain partially filled. The name of the new RAID groups are selected automatically. It is not possible to specify the names for the new RAID groups.

If the -g option is followed by all, Data ONTAP adds the specified disks to existing RAID groups first. Disks are added to an existing RAID group until it reaches capacity as defined by raidsize. After all existing RAID groups are full, it creates one or more new RAID groups and adds the remaining disks to the new groups. If the disk type or checksum style or both is specified then the command would operate only on the RAID groups with matching disk type or checksum style or both.

The -n option can be used to display the command that the system will execute, without actually making any changes. This is useful for displaying the automatically selected disks, for example.

The -c checksum-style argument specifies the checksum style of disks to use when adding disks to an existing aggregate. Possible values are: block for Block Checksum and advanced_zoned for Advanced Zoned Checksum (azcs).

If the -64bit-upgrade option is followed by check, Data ONTAP displays a summary of the space impact which would result from upgrading the aggregate to 64-bit. This summary includes the space usage of each contained volume after the volume is upgraded to 64-bit and the amount of space that must be added to the volume to successfully complete the 64-bit upgrade. This option does not result in an upgrade to 64-bit or addition of disks.

If the -64bit-upgrade option is followed by normal, Data ONTAP upgrades the aggregate to 64-bit if the total aggregate size after adding the specified disks exceeds 16TB. This option does not allow Data ONTAP to automatically grow volumes if they run out of space due to the 64-bit upgrade.

By default, the filer fills up one RAID group with disks before starting another RAID group. Suppose an aggregate currently has one RAID group of 12 disks and its RAID group size is 14. If you add 5 disks to this aggregate, it will have one RAID group with 14 disks and another RAID group with 3 disks. The filer does not evenly distribute disks among RAID groups.

You cannot add disks to a mirrored aggregate if one of the plexes is offline.

The disks in a plex are not permitted to span disk pools. This behavior can be overridden with the -f flag when used together with the -d argument to list disks to add. The -f flag, in combination with -d, can also be used to force adding disks that have a rotational speed that does not match that of the majority of existing disks in the aggregate. 

Before Upgrade

mynas02> aggr status
           Aggr State           Status            Options
          aggr0 online          raid_dp, aggr     root
                                32-bit
    sata64_bk11 online          raid_dp, aggr     raidsize=20
                                64-bit
        newaggr online          raid_dp, aggr     raidsize=10
                                32-bit

Perform upgrade check

mynas02> aggr add newaggr -64bit-upgrade check -d 1a.38.22 1a.38.21
File system size 16.97 TB exceeds maximum 15.99 TB
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate newaggr (Ctrl-C to interrupt)......
 
Adding the specified disks and upgrading the aggregate to
64-bit will add 1489GB of usable space to the aggregate.
 
To initiate the 64-bit upgrade of aggregate newaggr, run this
command with the "normal" option.

After upgrade

mynas02> aggr add newaggr -64bit-upgrade normal -d 1a.38.22 1a.38.21
File system size 16.97 TB exceeds maximum 15.99 TB
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate newaggr (Ctrl-C to interrupt)......
File system size 16.97 TB exceeds maximum 15.99 TB
Addition of 2 disks to the aggregate has completed.
mynas02> Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate newaggr (Ctrl-C to interrupt)......
Mon May 14 18:12:35 MYT [mynas02:raid.vol.disk.add.done:notice]: Addition of Disk /newaggr/plex0/rg2/1a.38.21 Shelf 38 Bay 21 [NETAPP   X302_HJUPI01TSSM NA02] S/N [N01RBR3L] to aggregate newaggr has completed successfully
Mon May 14 18:12:35 MYT [mynas02:raid.vol.disk.add.done:notice]: Addition of Disk /newaggr/plex0/rg2/1a.38.22 Shelf 38 Bay 22 [NETAPP   X302_HJUPI01TSSM NA02] S/N [N01RGUYL] to aggregate newaggr has completed successfully
Mon May 14 18:12:35 MYT [mynas02:wafl.scan.64bit.upgrade.start:notice]: The 64-bit upgrade scanner has started running on aggregate newaggr.
Mon May 14 18:12:35 MYT [mynas02:wafl.scan.start:info]: Starting 64bit upgrade on aggregate newaggr.
Mon May 14 18:12:37 MYT [mynas02:wafl.scan.64bit.upgrade.completed:notice]: The 64-bit upgrade scanner has completed running on aggregate newaggr.
 
mynas02> aggr show_space -h newaggr
Aggregate 'newaggr'
 
    Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS          Smtape
           16TB          1738GB             0KB            15TB             0KB             0KB             0KB
 
This aggregate does not contain any volumes
 
Aggregate                       Allocated            Used           Avail
Total space                           0KB             0KB            15TB
Snap reserve                          0KB             0KB             0KB
WAFL reserve                       1738GB          1392KB          1738GB
 
mynas02> aggr status newaggr
           Aggr State           Status            Options
        newaggr online          raid_dp, aggr     raidsize=10
                                64-bit
                Volumes: 
 
                Plex /newaggr/plex0: online, normal, active
                    RAID group /newaggr/plex0/rg0: normal, block checksums
                    RAID group /newaggr/plex0/rg1: normal, block checksums
                    RAID group /newaggr/plex0/rg2: normal, block checksums

So we have seen that the upgrade is successful and the “-64bit-upgrade” option is only required once. After you have got the aggregate to 64-bit, you just keep adding disks depending on your budget!

Update 28th May 2012:

WAFL will start to scan the 32-bit volumes once Ontap is upgraded to 8.1. If you have a huge volume, it might take more than 24 hours to scan, and you are not able to run the “check” and “normal” commands until the entire 32-bit aggregate is scanned.

mynas03> aggr add fc_aggr0 -T FCAL -64bit-upgrade check 16@450
Note: preparing to add 12 data disks and 4 parity disks.
Continue? ([y]es, [n]o, or [p]review RAID layout) y
File system size 20.72 TB exceeds maximum 15.99 TB
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate fc_aggr0 (Ctrl-C to interrupt)......
aggr add: This aggregate has volumes that contain space-reserved
files, and the computation of the additional space required to upgrade
these files is not yet available. Retry at a later time.

Sample from /etc/messages:

Sun May 27 01:15:05 GMT [mynas03:wafl.scan.start:info]: Starting 64bit space qualifying on volume vol_test_01.
Sun May 27 17:13:00 MYT [mynas03:wafl.scan.64bit.space.done:notice]: The 64-bit space qualifying scanner has completed running on Volume vol_test_01.
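Since you cannot run the “check”/“normal” options until the qualifying scan has finished, here is a minimal sketch of how you might watch for the completion message (it assumes passwordless SSH, that “rdfile” can read /etc/messages on your Ontap version, and uses the names above as placeholders):

#!/usr/bin/python
# Minimal sketch: look for the 64-bit space qualifying completion message in
# /etc/messages. Filer and volume names are placeholders.
import os

filer = "mynas03"
volume = "vol_test_01"

messages = os.popen("ssh %s rdfile /etc/messages" % filer).read()
done = [line for line in messages.splitlines()
        if "wafl.scan.64bit.space.done" in line and volume in line]
if done:
    print "Qualifying scan completed:", done[-1]
else:
    print "Scan not finished yet (or the message has rotated out of /etc/messages)"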

Update 20th Feb 2013:

The WAFL scan will not start until the snapmirror relationship is broken off. Thanks to locohost & C3 for the info!

I used to turn off snapmirror on the filer before the Ontap upgrade, hence my scan was able to proceed. If you have more findings, feel free to post them here 🙂

Netapp Volume Migration

As a storage admin, volume migration is an active routine that has to be gone through every month or quarter. I have been involved in most of our storage migrations, using many methods over the past few years, and here are my findings.

  1. Always use snapmirror for volume migration! This is the best way to migrate data in the most consistent way.
  2. Avoid using rsync for a huge volume with multiple sessions (when the directory is too huge, you have to split it into multiple sessions).
  3. A detailed plan and downtime are needed.
Let me share with you my own best practice:
  1. Source filer
    • Identify Volume
    • Volume Size
    • Aggregate type (32bit or 64bit)
  2. Destination filer
    • Identify new volume (with new volume name)
    • New volume size (must be greater than the source volume during the transfer to minimize the chance of failure; snapshots might take up additional space during the migration)
    • Aggregate balanced space. Aggregate free space is important for performance; the recommended free space is 15%-20%.
    • Aggregate type (32bit or 64bit)
  3. Getting downtime
    • Engage with the customer to get the downtime. Usually the downtime I request is less than 2 hours, as the final sync is usually less than 10 minutes because I have a scheduled snapmirror running daily for the incremental changes.

  4. Cut over plan
    • Always schedule snapmirror to update daily during off hours (or when the filer is free).
    • Run manual snapmirror updates a few times in the hour before the final sync (see the sketch after this list).
    • Restrict the source volume after the migration to prevent new data from being written.
    • Test the access on the new destination (NFS/CIFS/iSCSI).
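Here is a minimal sketch of the manual-update step mentioned above (it assumes passwordless SSH to the destination filer; "dest_filer" and "migrated_vol" are placeholders):

#!/usr/bin/python
# Minimal sketch: kick off a manual snapmirror update on the destination volume
# and show its status, to shrink the delta before the final cut-over sync.
import os

dest_filer = "dest_filer"
dest_vol = "migrated_vol"

os.system("ssh %s snapmirror update %s" % (dest_filer, dest_vol))
os.system("ssh %s snapmirror status %s" % (dest_filer, dest_vol))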

Snapmirror alert script

Snapmirror is a very cool and fantastic command used for migration and mirroring. You can run snapmirror on a schedule (snapmirror.conf) and let it run.

However, the drawbacks of snapmirror are:

  • It keeps the information for the last transfer only; there is no history beyond the last transfer.
  • NO alert is sent when the job fails because the destination is full (or for other reasons).
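As a starting point, here is a minimal sketch of the idea behind such an alert. It is only an illustration, not the full alert script: it assumes passwordless SSH and a local SMTP relay, the filer name and e-mail addresses are placeholders, and it only looks at the live status output rather than the snapmirror log.

#!/usr/bin/python
# Minimal sketch of a snapmirror alert: flag any relationship whose status
# column is not a healthy value and mail those lines to the admin.
import os
import smtplib
from email.mime.text import MIMEText

filer = "myfiler"
status = os.popen("ssh %s snapmirror status" % filer).read()

# Skip the "Snapmirror is on." line and the column header, then keep any
# relationship whose last column is not "Idle" or "Transferring"
bad = [line for line in status.splitlines()[2:]
       if line.split() and line.split()[-1] not in ("Idle", "Transferring")]

if bad:
    msg = MIMEText("\n".join(bad))
    msg["Subject"] = "Snapmirror alert on %s" % filer
    msg["From"] = "storage@example.com"
    msg["To"] = "admin@example.com"
    s = smtplib.SMTP("localhost")
    s.sendmail(msg["From"], [msg["To"]], msg.as_string())
    s.quit()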

 
