Category Archives: EMC

EMC NetWorker – Delete the contents of the MNT directory

When troubleshooting NetWorker backups, I typically look to see if the failed VM has an older snapshot that was not deleted and remove it.  Thinking I’m awesome and have solved the VM’s backup problem, I’m surprised to see that, more often than not, the backup fails again the next time it runs.  Why?  Because the MNT directory for the failed backup job/VM still exists on the backup proxy.  Why do I always forget to check the MNT directory?!  Hopefully writing it down will help me remember.

1_backupfailed

2_deletemntfolders
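Once the stale snapshot is gone, clear out the leftover folders under the proxy’s MNT directory before the next run.  A minimal sketch, assuming a Windows proxy and a hypothetical mount root of D:\Backup\mnt (substitute the actual MNT path on your proxy):

rem D:\Backup\mnt is a hypothetical path - point this at the MNT directory your proxy actually uses
for /d %%D in ("D:\Backup\mnt\*") do rd /s /q "%%D"

Run it from a BAT file (or swap %%D for %D at an interactive prompt), then re-run the failed backup.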


Filed under EMC

Testing Replicated Data on DataDomain appliances using Fastcopy

Many companies replicate data between sites.  The specific methodology may differ (different vendors, different solutions, etc.), but the end goal is basically the same: data redundancy to ensure as smooth a transition as possible should a DR event occur.

To look at one specific replication scenario, a number of my clients use Veeam to back up their virtual machines to backup repositories located on DataDomain appliances, and that data is then replicated to a DataDomain appliance at another site.  Assuming I have green checks in DD Enterprise Manager indicating my data is in sync, I know the data is replicated and, if need be, usable at the secondary site.  Still, it’s a good idea to test that data to ensure its integrity.

Looking at DataDomain articles, it seems the first step in testing replicated data is to break the replication context with the source system so that you can convert the replicated data from read-only to read-write.  But I really didn’t want to break replication just to test the integrity of my data, so I called DD support, who suggested I use fastcopy to make a read-write copy of the replicated data for DR/data-integrity testing.

I asked, “What happens if I don’t have enough room to make a copy?” and was told that fastcopy uses “pointers”, that it “will not consume any additional space”, and that “the easiest method for r/w access for a DR test would be to fastcopy the replicated data.”  Sounds too good to be true, but I decided to test it as follows:

1. Created a new VM, a new backup repository on the DD, and then replicated the new DD MTree from the source to the destination DD.  (Wasn’t going to try it out the first time with my production data)

2. After replication was successful, I created a new, blank MTree on the destination DD system.  Note: if you do not create a blank MTree, the fastcopy command will fail, as it is unable to create MTrees on its own.

3. Execute the fastcopy command, an example of which is shown below:

  • filesys fastcopy source /data/col1/CAN-DR-Test destination /data/col1/VMTest
  • NOTE: The command is case sensitive; keep that in mind when specifying your source and destination MTrees

1_FC-Command

4. In DD Enterprise Manager, enable a CIFS share for the fastcopy destination MTree (VMTest in this example)

5. Add VMTest as a backup repository in Veeam, importing existing backups, and then perform a restore.
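For reference, the MTree creation, fastcopy, and share steps above can also be strung together from the DD CLI.  This is just a sketch using the MTree names from the example; the cifs share create step is a CLI alternative to the Enterprise Manager step, and the wide-open client list is an assumption you would want to tighten for your environment:

filesys show space
mtree create /data/col1/VMTest
filesys fastcopy source /data/col1/CAN-DR-Test destination /data/col1/VMTest
filesys show space
cifs share create VMTest path /data/col1/VMTest clients "*"

Comparing the two filesys show space outputs is a quick way to verify support’s claim that the fastcopy is pointer-based and consumes essentially no additional post-comp space.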

In this instance, the process worked perfectly and assured the client that their data was indeed replicating and would be accessible if needed.


Filed under BRS/DR, Data Domain, EMC, VMware

Troubleshooting VNXe Replication Error 0x6500019

When setting up replication between VNXe systems, I encountered error 0x6500019 when verifying and updating a replication connection.

Upon troubleshooting, I found the error was due to a time difference between the source and destination VNXe systems; specifically, the source system was 21 minutes behind the destination.

But if you try to advance the time more than 300 seconds on a VNXe, you will be prompted to reboot the SPs, and who wants to do that just to change the time?  I called EMC to confirm that the SPs wouldn’t reboot simultaneously, and this is the response I received from tech support initially:

No, it reboots the entire System (Both SPs) together

REALLY!!??!!  To change the time, the system will reboot both SPs together?  At this point, the thought crossed my mind that perhaps I could increment the time in chunks, so I decided to try forwarding the time 180 seconds at a time.  Fortunately, this worked!  (21 minutes is 1,260 seconds, so it took seven 180-second adjustments to catch up.)  I was thus able to forward my source system’s time to match the destination, and I set the appropriate NTP servers to ensure this wouldn’t happen again.  Once the time was in sync on both devices, my replication connection to the remote system and the replication sessions worked without issue.

3_ReplOK

So if you’re having problems replicating VNXe systems, make sure your time is in sync!  Of course, this is good advice for pretty much anything in the IT realm….

Additionally, as the conversation with EMC support continued, I was told that if the code version on your VNXe is greater than 2.3.x.xxx and you advance the time more than 300 seconds, it will reboot one SP at a time.  But if I had to do it over again, I’d still advance the time manually and save the reboot.


Filed under EMC, VMware

RecoverPoint – Failed to configure splitter credentials

If you have RecoverPoint and use the SAN splitter for an EMC storage system, you may run into an issue if you change the password to the EMC storage account being used by RecoverPoint.  After changing the password in Unisphere, you will likely see the following when attempting to update the account credentials inside the RecoverPoint GUI:

Failed to configure splitter credentials
As of 3/20/2013, this is a known issue.  Here’s an email I received from EMC support:

“This is actually a code bug that we are still trying to resolve. When a user changes the unisphere password or creates a new user, RP does not update it and still uses the original credentials from installation. We can see that the VNX is blocking commands from site control RPA. VNX engineering is investigating this currently.”

Though it’s a known issue, I was able to work around it as follows: even though I was using a “global” Unisphere account, I set the scope of the account to “local” and entered the new password.  To my amazement, the change took.  I then went back into the splitter properties and set the scope back to “global”; that change held as well, and RP did not have any further issues communicating with the SAN splitter.


Filed under BRS/DR, EMC

Deleting old save sets from NetWorker

This is another in the series of “I’m posting this before I forget.”  I had to delete some older NetWorker save sets in order to free some space on a DataDomain DD-160, and if I don’t write these steps down, I’m certain I’ll forget them.  So, to avoid having to search again, here are my steps (using server1.ballfield.local as an example):

1. I didn’t want to remove every old backup job, just those for a given server that were older than 3 months.  To that end, I used MMINFO from the command line of the NetWorker server to get a list of all backups for that client that are 3 months or older and redirect the output to a text file, which I’ll then use to build a BAT file for deleting the save sets (substitute your own three-month cutoff date in the savetime query):

mminfo -avot -c server1.ballfield.local -q "savetime<mm/dd/yyyy" > c:\temp\server1_ss_3MO.txt

This command will give you a text file similar to the following:

MMINFO Output 

2. Next, I opened NetWorker to “spot check” a few of the SSIDs and verify that the save sets I was about to delete really were older backups, as I was initially somewhat confused by the savetime query attribute.  More about savetime can be found here.

NetWorker – Show Save Sets

Use SSIDs to verify age of the save set
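If you’d rather spot check from the command line, mminfo can also query an individual save set by its SSID.  The SSID below is just a hypothetical placeholder; substitute one from your own output file:

mminfo -avot -q "ssid=4126740129"

The savetime column in the output confirms how old that particular save set is.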

 3. After verification, I edited my text file as shown below and saved it as a BAT file:

Use NSRMM to delete old save sets

NSRMM is used to remove save sets.  In this case, the command was:
nsrmm -dy -S SSID/CloneID
-d / deletes the specified save set
y / answers “yes” automatically; if you do not specify the “y” switch, nsrmm will ask for verification before each deletion
-S / specifies the SSID and CloneID (if there is one) to delete
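For illustration, the finished BAT file ends up being nothing more than one nsrmm line per save set.  The SSIDs below are hypothetical placeholders; yours come from the mminfo output file:

nsrmm -dy -S 4126740129
nsrmm -dy -S 4109962872
nsrmm -dy -S 4093185615/1362157492

The last line shows the SSID/CloneID form for a save set that has a clone.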

4. In the command window, execute the BAT file.  Once the BAT file has finished deleting the older save sets, run nsrim -X, which synchronizes the media database and completes the removal of the save sets from NetWorker.  NetWorker support advised that NSRIM should not be executed while backups are running.

5. Finally, I wanted to “clean” the space on the DataDomain manually rather than wait for it to do so automatically on its schedule.  I logged into the DD and, on the Data Management | File System page, clicked Start Cleaning.  In this case, after deleting my older save sets, I had 400GB of space I could recover.

DataDomain – Start Cleaning

DataDomain – Cleaning finished/more space available
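If you prefer the CLI, cleaning can also be started and monitored from an SSH session on the DataDomain; a quick sketch (verify the commands against your DDOS version):

filesys clean start
filesys clean watch

The watch command simply follows the progress of the cleaning job; Ctrl-C exits the watch without stopping the cleaning.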

As you can imagine, the backups function much better when there is free space to write to.  In addition to removing the older save sets, you may need to adjust the NetWorker browse and retention policies (shown below) on your clients to avoid running into the same issue again.

NetWorker Client – Browse and Retention Policies


Filed under BRS/DR, Data Domain, EMC