
Netapp 32bit aggregate upgrade to 64bit

Netapp released Ontap 8.1GA in May 2012 with quite a number of improvements, and the most eye-catching is the capability to upgrade an aggregate from 32bit to 64bit. I had been waiting for 6 months and was eager to try it with 8.1RC, but an RC is not a good choice for production. Always remember not to risk your production NAS!

Back to the story: in-place 32bit to 64bit is a great feature. Why so? Because you no longer need to perform a migration, doing QSM from a 32bit aggregate to a 64bit aggregate. Getting downtime from the client is another pain too. This is a beautiful feature and I must praise Netapp for doing a great job!

Here is my test plan (a rough command sketch follows the list):
1. Get a NAS running Ontap 8.0.2 – 7 Mode
2. Create a 32bit aggregate “newaggr” and keep adding disks until it reaches 15-16TB
3. Upgrade to Ontap 8.1 – 7 Mode
4. Upgrade “newaggr” from 32bit to 64bit
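
A rough sketch of the console commands behind steps 2 and 4 (the disk counts and the -B block-format flag here are my assumptions for illustration, not taken from the actual run):

filer> aggr create newaggr -B 32 -t raid_dp 16@1000      ## step 2: create the 32bit aggregate
filer> aggr add newaggr 14@1000                          ## keep adding disks until it is around 15-16TB
filer> aggr add newaggr -64bit-upgrade check -d 1a.38.22 1a.38.21    ## step 4: preview the upgrade
filer> aggr add newaggr -64bit-upgrade normal -d 1a.38.22 1a.38.21   ## step 4: add disks and upgrade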

 

Useful Command 

aggr add aggrname
        [ -f ]
        [ -n ]
        [ -g {raidgroup | new | all} ]
        [ -c checksum-style ]
        [ -64bit-upgrade {check | normal} ]
        { ndisks[@size] | -d disk1 [ disk2 … ] [ -d diskn [ diskn+1 … ] ] }

## add an additional disk to aggregate pfvAggr, use “aggr status” to get the group name
aggr status pfvAggr -r
aggr add pfvAggr -g rg0 -64bit-upgrade normal -d v5.25

## Add 4 300GB disks to aggregate aggr1
aggr add aggr1 -64bit-upgrade check 4@300

Adds disks to the aggregate named aggrname. Specify the disks in the same way as for the aggr create command. If the aggregate is mirrored, then the -d argument must be used twice (if at all).

If the size option is used, it specifies the disk size in GB. Disks that are within approximately 20% of the specified size will be selected. If the size option is not specified, existing groups are appended with disks that are the best match by size for the largest disk in the group, i.e., equal or smaller disks are selected first, then larger disks. When starting new groups, disks that are the best match by size for the largest disk in the last raidgroup are selected. The size option is ignored if a specific list of disks is specified.

If the -g option is not used, the disks are added to the most recently created RAID group until it is full, and then one or more new RAID groups are created and the remaining disks are added to new groups. Any other existing RAID groups that are not full remain partially filled.

The -g option allows specification of a RAID group (for example, rg0) to which the indicated disks should be added, or a method by which the disks are added to new or existing RAID groups.

If the -g option is used to specify a RAID group, that RAID group must already exist. The disks are added to that RAID group until it is full. Any remaining disks are ignored.

If the -g option is followed by new, Data ONTAP creates one or more new RAID groups and adds the disks to them, even if the disks would fit into an existing RAID group. Any existing RAID groups that are not full remain partially filled. The names of the new RAID groups are selected automatically. It is not possible to specify the names for the new RAID groups.

If the -g option is followed by all, Data ONTAP adds the specified disks to existing RAID groups first. Disks are added to an existing RAID group until it reaches capacity as defined by raidsize. After all existing RAID groups are full, it creates one or more new RAID groups and adds the remaining disks to the new groups. If the disk type or checksum style or both is specified then the command would operate only on the RAID groups with matching disk type or checksum style or both.

The -n option can be used to display the command that the system will execute, without actually making any changes. This is useful for displaying the automatically selected disks, for example.

The -c checksum-style argument specifies the checksum style of disks to use when adding disks to an existing aggregate. Possible values are: block for Block Checksum and advanced_zoned for Advanced Zoned Checksum (azcs).

If the -64bit-upgrade option is followed by check, Data ONTAP displays a summary of the space impact which would result from upgrading the aggregate to 64-bit. This summary includes the space usage of each contained volume after the volume is upgraded to 64-bit and the amount of space that must be added to the volume to successfully complete the 64-bit upgrade. This option does not result in an upgrade to 64-bit or addition of disks.

If the -64bit-upgrade option is followed by normal, Data ONTAP upgrades the aggregate to 64-bit if the total aggregate size after adding the specified disks exceeds 16TB. This option does not allow Data ONTAP to automatically grow volumes if they run out of space due to the 64-bit upgrade.

By default, the filer fills up one RAID group with disks before starting another RAID group. Suppose an aggregate currently has one RAID group of 12 disks and its RAID group size is 14. If you add 5 disks to this aggregate, it will have one RAID group with 14 disks and another RAID group with 3 disks. The filer does not evenly distribute disks among RAID groups.

You cannot add disks to a mirrored aggregate if one of the plexes is offline.

The disks in a plex are not permitted to span disk pools. This behavior can be overridden with the -f flag when used together with the -d argument to list disks to add. The -f flag, in combination with -d, can also be used to force adding disks that have a rotational speed that does not match that of the majority of existing disks in the aggregate. 

Before Upgrade

mynas02> aggr status
           Aggr State           Status            Options
          aggr0 online          raid_dp, aggr     root
                                32-bit
    sata64_bk11 online          raid_dp, aggr     raidsize=20
                                64-bit
        newaggr online          raid_dp, aggr     raidsize=10
                                32-bit

Perform upgrade check

mynas02> aggr add newaggr -64bit-upgrade check -d 1a.38.22 1a.38.21
File system size 16.97 TB exceeds maximum 15.99 TB
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate newaggr (Ctrl-C to interrupt)......
 
Adding the specified disks and upgrading the aggregate to
64-bit will add 1489GB of usable space to the aggregate.
 
To initiate the 64-bit upgrade of aggregate newaggr, run this
command with the "normal" option.

Perform the upgrade

mynas02> aggr add newaggr -64bit-upgrade normal -d 1a.38.22 1a.38.21
File system size 16.97 TB exceeds maximum 15.99 TB
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate newaggr (Ctrl-C to interrupt)......
File system size 16.97 TB exceeds maximum 15.99 TB
Addition of 2 disks to the aggregate has completed.
mynas02>
Mon May 14 18:12:35 MYT [mynas02:raid.vol.disk.add.done:notice]: Addition of Disk /newaggr/plex0/rg2/1a.38.21 Shelf 38 Bay 21 [NETAPP   X302_HJUPI01TSSM NA02] S/N [N01RBR3L] to aggregate newaggr has completed successfully
Mon May 14 18:12:35 MYT [mynas02:raid.vol.disk.add.done:notice]: Addition of Disk /newaggr/plex0/rg2/1a.38.22 Shelf 38 Bay 22 [NETAPP   X302_HJUPI01TSSM NA02] S/N [N01RGUYL] to aggregate newaggr has completed successfully
Mon May 14 18:12:35 MYT [mynas02:wafl.scan.64bit.upgrade.start:notice]: The 64-bit upgrade scanner has started running on aggregate newaggr.
Mon May 14 18:12:35 MYT [mynas02:wafl.scan.start:info]: Starting 64bit upgrade on aggregate newaggr.
Mon May 14 18:12:37 MYT [mynas02:wafl.scan.64bit.upgrade.completed:notice]: The 64-bit upgrade scanner has completed running on aggregate newaggr.
 
mynas02> aggr show_space -h newaggr
Aggregate 'newaggr'
 
    Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS          Smtape
           16TB          1738GB             0KB            15TB             0KB             0KB             0KB
 
This aggregate does not contain any volumes
 
Aggregate                       Allocated            Used           Avail
Total space                           0KB             0KB            15TB
Snap reserve                          0KB             0KB             0KB
WAFL reserve                       1738GB          1392KB          1738GB
 
mynas02> aggr status newaggr
           Aggr State           Status            Options
        newaggr online          raid_dp, aggr     raidsize=10
                                64-bit
                Volumes: 
 
                Plex /newaggr/plex0: online, normal, active
                    RAID group /newaggr/plex0/rg0: normal, block checksums
                    RAID group /newaggr/plex0/rg1: normal, block checksums
                    RAID group /newaggr/plex0/rg2: normal, block checksums

So we have seen that the upgrade is very successful, and the “-64bit-upgrade” option is only required once. After you have got the aggregate to 64bit, you just keep adding disks depending on your budget!

Update 28th May 2012:

WAFL will start to scan the 32bit volumes once Ontap is upgraded to 8.1. If you have a huge volume, it might take more than 24 hours to scan, and you are not able to run the “check” and “normal” commands until the entire 32bit aggregate has been scanned.

mynas03> aggr add fc_aggr0 -T FCAL -64bit-upgrade check 16@450
 
Note: preparing to add 12 data disks and 4 parity disks.
 
Continue? ([y]es, [n]o, or [p]review RAID layout) y
 
File system size 20.72 TB exceeds maximum 15.99 TB
 
Checking for additional space required to upgrade all writable 32-bit
volumes in aggregate fc_aggr0 (Ctrl-C to interrupt)......
 
aggr add: This aggregate has volumes that contain space-reserved
files, and the computation of the additional space required to upgrade
these files is not yet available. Retry at a later time.

Sample from /etc/messages:

Sun May 27 01:15:05 GMT [mynas03:wafl.scan.start:info]: Starting 64bit space qualifying on volume vol_test_01.
Sun May 27 17:13:00 MYT [mynas03:wafl.scan.64bit.space.done:notice]: The 64-bit space qualifying scanner has completed running on Volume vol_test_01.
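
If you want to check whether that scanner is still running before retrying the “check”, the advanced-privilege scan status command should show it (my assumption is that this is available on your 8.1 build; be careful while in advanced mode):

mynas03> priv set advanced
mynas03*> wafl scan status
mynas03*> priv set admin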

Update 20th Feb 2013:

The WAFL scan will not start until the snapmirror relationship is broken off. Thanks to locohost & C3 for the info!

I used to turn off snapmirror on the filer before an Ontap upgrade, hence my scan was able to proceed. If you have more findings, feel free to post them here 🙂
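
For reference, this is roughly what I mean by turning snapmirror off, or breaking the individual relationship on the destination filer (volume name assumed):

mynas03> snapmirror off                 ## disable snapmirror before the Ontap upgrade
mynas03> snapmirror status              ## or find the relationship and break it instead
mynas03> snapmirror quiesce dst_vol
mynas03> snapmirror break dst_vol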

Netapp Volume Migration

As a storage admin, volume migration is a recurring routine that comes around every month or quarter. I have been involved in many storage migrations using lots of methods over the past few years, and here are my findings.

  1. Always use snapmirror for volume migration! It is the best way to migrate data in the most consistent way.
  2. Avoid using rsync for huge volumes (when a directory is too huge you have to split the copy into multiple sessions).
  3. A detailed plan and downtime are needed.
Let me share with you my own best practice:
  1. Source filer
    • Identify Volume
    • Volume Size
    • Aggregate type (32bit or 64bit)
  2. Destination filer
    • Identify new volume (with new volume name)
    • New volume size (must be greater than the source volume during the transfer to minimize the chance of failure; snapshots might take up additional space during the migration)
    • Aggregate balanced space. Aggregate free space is important for performance; the recommended free space is 15%-20%.
    • Aggregate type (32bit or 64bit)
  3. Getting downtime
    • Engage with the customer to get the downtime. Usually the downtime I request is less than 2 hours, as the final sync usually takes less than 10 minutes because I have a scheduled snapmirror running daily for the incremental changes.

  4. Cut over plan (see the command sketch after this list)
    • Always schedule the snapmirror to update daily during off hours (or when the filer is free).
    • Run a manual snapmirror update a few times in the hour before the final sync.
    • Restrict the source volume after the migration to prevent new data from being written.
    • Test the access on the new destination (NFS/CIFS/iSCSI).
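
Here is a rough sketch of the command flow described above, with assumed volume and aggregate names (the snapmirror commands run on the destination filer):

filerB> vol create new_vol dest_aggr 550g                  ## destination volume, slightly larger than the source
filerB> vol restrict new_vol                               ## a snapmirror destination must be restricted
filerB> snapmirror initialize -S filerA:old_vol filerB:new_vol
filerB> snapmirror update filerB:new_vol                   ## incremental updates until cutover day
filerB> snapmirror quiesce filerB:new_vol                  ## during the downtime, after the final update
filerB> snapmirror break filerB:new_vol                    ## destination becomes writable
filerA> vol restrict old_vol                               ## stop new writes to the source volume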

Snapmirror alert script

Snapmirror is a very cool and fantastic tool used for migration and mirroring. You can run snapmirror on a schedule (snapmirror.conf) and let it run.

However, the drawbacks of snapmirror are (a minimal alert sketch follows the list):

  • It keeps the information for the last transfer only; there is no history beyond the last transfer.
  • No alert is sent when a job fails because the destination is full (or for other reasons).
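
A minimal sketch of the kind of check I mean, run from an admin host with rsh access to the filer (the mail address, lag threshold and temp file are made up for illustration):

#!/bin/sh
## alert when any snapmirror relationship lags more than MAXLAG seconds
FILER=mynas02
ADMIN=storage-admin@example.com
MAXLAG=90000

rsh $FILER snapmirror status | awk -v max=$MAXLAG 'NR > 2 {
    split($4, t, ":")                        # Lag column, hh:mm:ss
    if (t[1]*3600 + t[2]*60 + t[3] > max) print
}' > /tmp/sm_lag.$$

if [ -s /tmp/sm_lag.$$ ]; then
    mail -s "snapmirror lag alert on $FILER" $ADMIN < /tmp/sm_lag.$$
fi
rm -f /tmp/sm_lag.$$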

 

awk print output to 1 line

awk '{printf("%s ",$1) }' <file name>
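
For example, with a hypothetical file of disk names:

$ cat disks.txt
1a.38.21
1a.38.22
$ awk '{printf("%s ",$1) }' disks.txt
1a.38.21 1a.38.22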

Designsync dword

I found no results on Google regarding the Designsync dword, so I think I should share it after getting the info from Designsync support.

Command:

grep -H ServerMaintenanceMode=dword:1 /<path>/<port_number>/PortRegistry.reg

dword:1 = Normal operation

dword:2 = Set server to read-only mode

dword:3 = block all read and write access to server

Simple Home NAS solution

I have been thinking of having my own home NAS, and I surveyed a few market products: something like QNAP or SYNOLOGY would cost me close to 1000USD if I wanted RAID-5 + hot-swap features. I struggled for a while and decided to use the simplest + cheapest solution for a home NAS to host my photos + movies + mp3.

Step 1 – Identify OS:

1. Freenas – FreeBSD based. I like FreeBSD because it is at the core of Netapp too.

2. Openfiler – CentOS based, customized for NAS.

3. Fedora + Samba

I picked number 3, Fedora + Samba, as I still need my NAS OS to be a multi-purpose server and I love YUM~

Step2 – Hardware

1. 2 years old HP 110 Netbook

2. 2TB Seagate External Harddisk

Finally…. the combination!

2-year-old HP 110 netbook

2TB Seagate External Disk

Fedora 15
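
If you go the Fedora + Samba route, the rough setup is something like this (standard Fedora 15 package and service names; the mount point and device are assumptions, and the share itself is defined in /etc/samba/smb.conf):

$ sudo yum install samba
$ sudo mkdir -p /mnt/seagate2tb
$ sudo mount /dev/sdb1 /mnt/seagate2tb       ## the 2TB external disk, device name will differ
$ sudo systemctl enable smb.service
$ sudo systemctl start smb.service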

How to check if there is packet drop in Netapp filer

Command: netstat -I e4a -d

Output:
Name  Mtu   Network  Address  Ipkts       Ierrs  Opkts       Oerrs  Collis  Drop
e4a   9000  none     none     2007724074  1      2991841358  0      0       0

 

The Drop column (the last one) is the value you are looking for; here it is 0, which means no packets are being dropped.
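
If you want to pull just that column out from an admin host, something like this should work (interface name assumed, and NR==2 assumes the header is the first line of output):

$ rsh mynas02 netstat -I e4a -d | awk 'NR==2 {print $1, "drops:", $NF}'
e4a drops: 0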

Proper way to delete snapvault relationship

I had a problem with my backup in Netapp Protection Manager: the backup, which uses snapvault technology, just hung and was unable to run at all. This is a bug according to them and will be fixed in 8.0.2 (hopefully). I always wanted to know the proper way to remove a snapvault relationship, and I am doing it today.
Note: create a case with Netapp to assist you in case you are uncertain of anything; this guideline is just for information sharing and my own reference.


########################################################
To list out the dfpm dataset members for the dataset id (check from the source snapshot)
########################################################
C:\Documents and Settings\admin1>dfpm dataset list -m 23953
Id         Node Name            Dataset Id Dataset Name         Type            Name
---------- -------------------- ---------- -------------------- --------------- -------------------------------------------------------
20538      filerSource          23953      VOL A Dataset new    volume          filerSource:/volumeA
23955      filerDest            23953      VOL A Dataset new    volume          filerDest:/volume_A_Dataset_new_backup
24400      filerDest            23953      VOL A Dataset new    volume          filerDest:/volume_A_Dataset_new_backup_1

C:\Documents and Settings\admin1>dfpm dataset list -x 23953
Id: 23953
Name: VOL A Dataset new
Policy: Backup Policy For PG t28
Description:
Owner:
Contact:
Volume Qtree Name Prefix:
DR Capable: No
Requires Non Disruptive Restore: No

Node details:

Node Name: filerSource
Resource Pools:
Provisioning Policy:
Time Zone:
DR Capable: No
vFiler:

Node Name: filerDest
Resource Pools: PG DRNAS BACKUP 02
Provisioning Policy: No Dedupe Policy for Secondary
Time Zone:
DR Capable: No
vFiler:

==========================

Then list the snapshots in the dataset:

C:\Documents and Settings\admin1>dfpm dataset snapshot list 23953
Id       Name                                Unique Id  Volume               Timestamp            Versioned Dependencies          % of Total Blocks
-------- ----------------------------------- ---------- -------------------- -------------------- --------- --------------------- -----------------
75019592 dfpm_base(dataset-id-23953)conn1.3  1306675183 filerSource:/volumeA 29 May 2011 21:19:43 No        SnapVault             33% (19%)
83770083 dfpm_base(dataset-id-23953)conn1.0  1310908962 filerSource:/volumeA 17 Jul 2011 21:22:42 No        SnapVault,acs         13% (11%)
86355690 dfpm_base(dataset-id-25559)conn1.1  1311686499 filerSource:/volumeA 26 Jul 2011 21:21:39 No        Busy - SnapVault      3% (2%)
86355689 dfpm_base(dataset-id-25559)conn1.0  1311772921 filerSource:/volumeA 27 Jul 2011 21:22:01 No        Busy - SnapVault,acs  0% (0%)
86377105 nightly.0                           1311782476 filerSource:/volumeA 28 Jul 2011 00:01:16 No        None                  0% (0%)

########################################################
To delete the snapshots on the source from the OM database
########################################################
C:\Documents and Settings\admin1>dfpm dataset snapshot delete 23953 "filerSource:/volumeA" 75019592
Dataset dry run results
----------------------------------
Do: Delete snapshots 'dfpm_base(dataset-id-23953)conn1.3' of volume filerSource:/volumeA (20538).
Effect: Selected snapshots will be deleted.
Following snapshots have applications dependent on them:
dfpm_base(dataset-id-23953)conn1.3 - SnapVault

Started job 13340 to delete snapshot(s) of the volume 'filerSource:/volumeA' of the dataset 'VOL A Dataset new' (23953).

C:\Documents and Settings\admin1>dfpm dataset snapshot delete 23953 "filerSource:/volumeA" 83770083
Dataset dry run results
----------------------------------
Do: Delete snapshots 'dfpm_base(dataset-id-23953)conn1.0' of volume filerSource:/volumeA (20538).
Effect: Selected snapshots will be deleted.
Following snapshots have applications dependent on them:
dfpm_base(dataset-id-23953)conn1.0 - SnapVault, acs

Started job 13341 to delete snapshot(s) of the volume 'filerSource:/volumeA' of the dataset 'VOL A Dataset new' (23953).

########################################################
To delete the snapshots at the destination (this can take a few hours)
########################################################

[root@linux tiwlim]# rsh filerDest snapvault stop filerDest:/vol/volume_A_Dataset_new_backup_1/t28lp_ip
Stopping /vol/volume_A_Dataset_new_backup_1/t28lp_ip is permanent.
The secondary qtree will be deleted.
Further incremental updates will be impossible.
Data already stored in snapshots will not be deleted.
This may take a long time to complete.
Are you sure you want to do this? y
Snapvault configuration for the qtree has been deleted.
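
One extra step that is not part of the transcript above (this is my own assumption about general snapvault cleanup, not something from this particular case): after the snapvault stop on the secondary, a snapvault release on the primary frees the base snapshot it was holding. The primary qtree path here is assumed:

filerSource> snapvault release /vol/volumeA/t28lp_ip filerDest:/vol/volume_A_Dataset_new_backup_1/t28lp_ip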

python script for simple write and delete test

I have been interested in Python scripting for a while and finally got a chance to play with it.

So the mission for today is to create the write and delete test for my environment.

Local machine  —-COPY—>  Mounted filesystem in NAS

Why am I doing so? To get a baseline for my write and delete operations. I will make use of the “time” command to get the total runtime for the script to complete the operation. So here goes the simple script (feel free to provide your code if you have ideas, I am a noob in Python :) )

BTW, Nanako is just an echo.

#!/usr/bin/python
import os
import shutil

#set source path & file name
src = "/usr2/"
file_name = "testfile.txt"
source_file_path = src + file_name

#set destination path
dest = "/data/tingwei/test_copy/"

#set number of copies for the copying progress
copies = 1000
dir_copies = 10

for x in range(dir_copies):
    dirname = dest + str(x)
    if not os.path.isdir(dirname):
        os.mkdir(dirname)
        print "Nanako: mkdir " + dirname
    #start progress: fill this directory with copies of the test file
    #(runs whether or not the directory already existed)
    print "Nanako: Ready to copy " + str(copies) + " files."
    for i in range(copies):
        new_file_name = file_name + str(i)
        dest_file_path = dirname + "/" + new_file_name
        shutil.copyfile(source_file_path, dest_file_path)

print "Nanako: Done copying " + str(copies) + " files."
print "Nanako: Preparing to cleanup " + dest

#remove the files & directories created above
for x in os.listdir(dest):
    shutil.rmtree(os.path.join(dest, x))

#just in case you want to see if there is anything left over in the destination
def listdir_fullpath(d):
    return [os.path.join(d, f) for f in os.listdir(d)]

for i in listdir_fullpath(dest):
    print i
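
To get the baseline mentioned above, I simply wrap the run with the time command (script name assumed):

$ time python write_delete_test.py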

ndmpcopy duration

I have to migrate a volume from filerA to filerB; instead of using snapmirror, I will give ndmpcopy a chance.
Like snapmirror, it is a block-level transfer (a lot faster than rsync). The benefit is the single file/directory granularity: snapmirror transfers the entire volume, while with ndmpcopy we can copy a single directory within a volume and still get a block-level transfer.
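
The command itself is not shown in the log below; when run from the source filer it looks roughly like this (the directory is a placeholder, and you may need the -sa/-da authentication options depending on your setup):

filerA> ndmpcopy /vol/fc_loan/<directory> filerB:/vol/vol_restore/restore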

Below is just the example output I have for reference:
I am copying from a FAS3170 to FAS960
Directory size: 1.5TB
~45MBps!!!!
Time taken: 9 hours 15 minutes
Ndmpcopy: filerA: Log: DUMP: dumping (Pass V) [ACLs]
Ndmpcopy: filerA: Log: DUMP: 1563223738 KB
Ndmpcopy: filerB: Log: RESTORE: RESTORE IS DONE
Ndmpcopy: filerB: Log: RESTORE: The destination path is /vol/vol_restore/restore/
Ndmpcopy: filerB: Notify: restore successful
Ndmpcopy: filerA: Log: DUMP: DUMP IS DONE
Ndmpcopy: filerA: Log: DUMP: Deleting “/vol/fc_loan/../snapshot_for_backup.195” snapshot.
Ndmpcopy: filerA: Notify: dump successful
Ndmpcopy: Transfer successful [ 9 hours 15 minutes 47 seconds ]
Ndmpcopy: Done
