Page 1
AFF and FAS System Documentation ONTAP Systems NetApp November 23, 2021 This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/index.html on November 23, 2021. Always check docs.netapp.com for the latest.
• Video steps Video step-by-step instructions. Installation and setup PDF poster - AFF A200 You can use the PDF poster to install and set up your new system. The PDF poster provides step-by-step instructions with live links to additional content.
Page 6
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 7
Option 1: Checking NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired node, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 8
▪ Run the key-manager setup wizard: security key-manager setup -node target/impaired node name Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact mysupport.netapp.com ▪ Verify that the Restored column displays yes for all authentication keys: ...
Page 9
Option 2: Checking NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
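The choice between Option 1 (ONTAP 9.5 and earlier) and Option 2 (ONTAP 9.6 and later) depends only on the release number. A minimal sketch of that decision follows; the function name and the assumption that the version string starts with `major.minor` are illustrative and not part of the NetApp procedure.

```python
import re


def encryption_check_option(ontap_version: str) -> int:
    """Return 1 for ONTAP 9.5 and earlier, 2 for ONTAP 9.6 and later.

    Hypothetical helper: assumes the string begins with "major.minor",
    for example "9.5", "9.6", or "9.10.1".
    """
    match = re.match(r"(\d+)\.(\d+)", ontap_version.strip())
    if match is None:
        raise ValueError(f"unrecognized ONTAP version: {ontap_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare as a tuple so that 9.10 sorts after 9.6.
    return 2 if (major, minor) >= (9, 6) else 1


if __name__ == "__main__":
    for version in ("9.5", "9.6", "9.10.1"):
        print(version, "-> Option", encryption_check_option(version))
```

Comparing the version as a tuple of integers avoids the string-comparison trap in which "9.10" would sort before "9.6".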
Page 10
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: ...
Page 11
Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column shows yes for all authentication keys: security key-manager key query c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information.
Page 12
Return to admin mode: set -priv admin h. You can safely shut down the node. Shut down the node - AFF A200 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node. Steps 1.
Page 13
This command may not work if the boot device is corrupted or non-functional. Remove the controller module, replace the boot media and transfer the boot image to the boot media - AFF A200 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
Page 14
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it.
Page 15
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 16
4. Push the controller module all the way into the system, making sure that the cam handle clears the USB flash drive, firmly push the cam handle to finish seating the controller module, push the cam handle to the closed position, and then tighten the thumbscrew. The node begins to boot as soon as it is completely installed into the chassis.
Page 17
Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details. Boot the recovery image - AFF A200 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
Page 18
If your system has… Then… No network connection a. Press when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Select the Update flash from backup config (sync flash) option from the displayed menu. If you are prompted to continue with the update, press y.
Page 19
Restore OKM, NSE, and NVE as needed - AFF A200 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 20
--------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to prompt. Waiting for giveback… 8. Move the console cable to the partner node and login as admin. 9.
Page 21
b. Enter the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, contact Customer Support. c.
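The Restored-column check above amounts to confirming that every authentication key reports yes. The sketch below assumes the command output has already been transcribed into a list of dictionaries; the `key_id` and `restored` field names are hypothetical and do not reflect the actual CLI output layout.

```python
from typing import Dict, List


def keys_not_restored(rows: List[Dict[str, str]]) -> List[str]:
    """Return the key IDs whose Restored value is anything other than yes.

    `rows` is a hypothetical, already-parsed form of the key listing;
    a missing "restored" field is treated as not restored.
    """
    return [
        row["key_id"]
        for row in rows
        if row.get("restored", "").strip().lower() != "yes"
    ]


if __name__ == "__main__":
    sample = [
        {"key_id": "key-1", "restored": "yes"},
        {"key_id": "key-2", "restored": "no"},
    ]
    pending = keys_not_restored(sample)
    if pending:
        print("Not restored:", ", ".join(pending))
    else:
        print("All authentication keys are restored.")
```

An empty result means the procedure can continue; any returned key ID corresponds to the case in which the text directs you to contact support.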
Page 22
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 23
a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, use the...
Page 24
-auto-giveback true Return the failed part to NetApp - AFF A200 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 25
• This procedure is written with the assumption that you are moving all drives and controller module or modules to the new chassis, and that the chassis is a new component from NetApp. • This procedure is disruptive. In a two-node cluster, you will have a complete service outage; in a multi-node cluster, you will have a partial outage.
Page 26
3. Where applicable, halt the second node to avoid a possible quorum error message in an HA pair configuration: system node halt -node second_node_name -ignore-quorum-warnings true Move and replace hardware - AFF A200 Step 1: Move the power supply Move the power supply from the old chassis to the replacement chassis.
Page 27
Only connect the power cable to the power supply. Do not connect the power cable to a power source at this time. Step 2: Remove the controller module Remove the controller module or modules from the old chassis. Steps 1. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
Page 28
Step 3: Move drives to the new chassis Move the drives from each bay opening in the old chassis to the same bay opening in the new chassis. Steps 1. Gently remove the bezel from the front of the system. 2.
Page 29
Step 5: Install the controller After you install the controller module and any other components into the new chassis, boot it to a state where you can run the interconnect diagnostic test. About this task For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
Page 30
From the boot menu, select the option for Maintenance mode. Restoring and verifying the configuration - AFF A200 Step 1: Verifying and setting the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
Page 31
3. If you have not already done so, recable the rest of your system. Step 2: Running system-level diagnostics After installing a new chassis, you should run interconnect diagnostics. What you’ll need Your system must be at the LOADER prompt to start System Level Diagnostics. About this task All commands in the diagnostic procedures are issued from the node where the component is being replaced.
Page 32
If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
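The verification in step b reduces to looking for the default response in whatever sldiag device status prints. A minimal sketch, assuming the console output has been captured as a string (how it is captured is outside the procedure and not shown here):

```python
# Default response quoted in the procedure after the status log is cleared.
CLEARED_MESSAGE = "SLDIAG: No log messages are present."


def status_log_is_cleared(status_output: str) -> bool:
    """Return True if the captured output contains the default response."""
    return CLEARED_MESSAGE in status_output


if __name__ == "__main__":
    captured = "*> sldiag device status\nSLDIAG: No log messages are present.\n"
    print("Log cleared:", status_log_is_cleared(captured))
```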
Page 33
Rerun the system-level diagnostics test. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 34
This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A200 To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 35
Move the controller module hardware - AFF A200 To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
Page 36
6. Turn the controller module over and place it on a flat, stable surface. 7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller module and insert it in the new controller module.
Page 37
Steps 1. Locate the boot media using the following illustration or the FRU map on the controller module: 2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
Page 38
3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module.
Page 39
and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. The number and placement of system DIMMs depends on the model of your system. The following illustration shows the location of system DIMMs: 4.
Page 40
operating system. About this task For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis. The system might update system firmware when it boots.
Page 41
If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Page 42
You can safely respond to these prompts. Restore and verify the system configuration - AFF A200 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
Page 43
It is important that you apply the commands in the steps on the correct systems: • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node Steps 1.
Page 44
After you issue the command, you should wait until the system stops at the LOADER prompt. 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears.
Page 45
If you want to run diagnostic tests on… Then… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
Page 46
If you want to run diagnostic tests on… Then… Multiple components at the same time a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
Page 47
Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Recable the system and reassign disks - AFF A200 Continue the replacement procedure by re-cabling the storage and confirming disk reassignment. Step 1: Re-cable the system After running diagnostics, you must recable the controller module’s storage and network...
Page 48
c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the...
Page 49
You can respond when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
Page 50
(118073209) 5. Boot the node: boot_ontap Complete system restoration - AFF A200 Step 1: Install licenses for the replacement node in ONTAP You must install new licenses for the replacement node if the impaired node was using ONTAP features that require a standard (node-locked) license. For features with standard licenses, each node in the cluster should have its own key for the feature.
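Because each node needs its own key for a feature with a standard (node-locked) license, the check after a controller replacement is whether any node lacks a key for a feature in use. The sketch below works on hypothetical inventory data; the node names, feature names, and dictionary layout are illustrative assumptions and are not read from ONTAP.

```python
from typing import Dict, Iterable, List, Set


def nodes_missing_license(
    nodes: Iterable[str],
    licensed_nodes_by_feature: Dict[str, Set[str]],
    feature: str,
) -> List[str]:
    """Return the nodes that have no node-locked key for `feature`.

    `licensed_nodes_by_feature` is a hypothetical mapping from a feature
    name to the set of nodes that currently hold a key for it.
    """
    licensed = licensed_nodes_by_feature.get(feature, set())
    return [node for node in nodes if node not in licensed]


if __name__ == "__main__":
    cluster_nodes = ["node1", "node2-replacement"]
    inventory = {"example_feature": {"node1"}}
    print(nodes_missing_license(cluster_nodes, inventory, "example_feature"))
```

In this example the replacement node is reported because only the surviving node holds a key, which is the situation the license installation step is meant to resolve.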
Page 51
Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
Page 52
-node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 53
If the impaired node is displaying… Then… The LOADER prompt Go to the next step. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password prompt (enter system password) Take over or halt the impaired node: •...
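The table above is a plain lookup from what the impaired node displays to the next action. A small sketch of that lookup follows; the dictionary keys are shortened labels chosen for illustration, and the third action is cut off in the source, so only its opening sentence is included.

```python
# Lookup built from the table rows; the labels are illustrative shorthand.
NEXT_ACTION = {
    "LOADER prompt": "Go to the next step.",
    "Waiting for giveback…": "Press Ctrl-C, and then respond when prompted.",
    "System prompt or password prompt": "Take over or halt the impaired node.",
}


def next_action(display: str) -> str:
    """Return the action for the given console state, or raise ValueError."""
    try:
        return NEXT_ACTION[display]
    except KeyError:
        raise ValueError(f"no table row for console state: {display!r}") from None


if __name__ == "__main__":
    print(next_action("LOADER prompt"))
```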
Page 54
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps.
Page 55
About this task If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module. Steps 1. If you are not already grounded, properly ground yourself. 2. Check the NVMEM LED on the controller module. You must perform a clean system shutdown before replacing system components to avoid losing unwritten data in the nonvolatile memory (NVMEM).
Page 56
Each system memory DIMM has an LED located on the board next to each DIMM slot. The LED for the faulty DIMM blinks every two seconds. 7. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
Page 57
Make sure that the plug locks down onto the controller module. 13. Close the controller module cover. Step 4: Reinstall the controller module After you replace components in the controller module, reinstall it into the chassis. 1. If you are not already grounded, properly ground yourself. 2.
Page 58
If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Page 59
appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
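The run and status commands in steps 3 and 4 differ only in the device name, so they can be generated as a pair. This is a sketch for assembling the command text only; the helper name is hypothetical, and applying the same status flags to devices other than mem is an assumption that should be checked against the procedure for that device.

```python
from typing import List


def diag_commands(device: str = "mem") -> List[str]:
    """Return the run command and the failed-status command for a device.

    With the default device "mem", the result matches the two commands
    quoted in steps 3 and 4.
    """
    if not device or any(ch.isspace() for ch in device):
        raise ValueError("device name must be a single non-empty token")
    return [
        f"sldiag device run -dev {device}",
        f"sldiag device status -dev {device} -long -state failed",
    ]


if __name__ == "__main__":
    for command in diag_commands():
        print(command)
```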
Page 60
If the system-level diagnostics tests… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 61
If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
Page 62
Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 63
About this task All other components in the system must be functioning properly; if not, you must contact technical support. Step 1: Shut down the impaired controller Shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
Page 64
Step 2: Open the system To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. Steps 1. If you are not already grounded, properly ground yourself. 2.
Page 65
Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery. Steps 1. If you are not already grounded, properly ground yourself. 2.
Page 66
This typically occurs during an uncontrolled shutdown after ONTAP has successfully booted. 3. Locate the NVMEM battery in the controller module. 4. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
Page 67
4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis.
Page 68
If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 69
2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4.
Page 70
Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 71
4. Squeeze the latch on the power supply cam handle, and then open the cam handle to fully release the power supply from the mid plane. If you have an AFF A200 system, a plastic flap within the now empty slot is released to cover the opening and maintain air flow and cooling.
Page 72
The power supply LEDs are lit when the power supply comes online. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 73
If the impaired node is displaying… Then… The LOADER prompt Go to the next step. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password prompt (enter system password) Take over or halt the impaired node: •...
Page 74
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
Page 75
Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 76
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
NetApp video: AFF A220 and FAS2700 Systems: Installation and Setup Instructions Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2...
Page 78
To install your FAS2700 or AFF A220 system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
Page 79
6. Download and complete the Cluster configuration worksheet. Cluster Configuration Worksheet Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed.
Page 80
You need to be aware of the safety concerns associated with the weight of the system. 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network.
Page 81
Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b...
Page 82
Step Perform on each controller Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host • 0c and 0d • or 0e and 0f A 10GbE • e0c and e0d •...
Page 83
2. To cable your storage, see Cabling controllers to drive shelves Option 2: Cable a switched cluster, unified network configuration Management network, UTA2 data network, and management ports on the controllers are connected to switches. The cluster interconnect ports are cabled to the cluster interconnect switches.
Page 84
Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host •...
Page 85
Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To cable your storage, see Cabling controllers to drive shelves Option 3: Cable a two-node switchless cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches.
Page 86
Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
Page 87
Step Perform on each controller Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To cable your storage, see Cabling controllers to drive shelves Option 4: Cable a switched cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches.
Page 88
Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
Page 89
Cabling controllers to drive shelves Step 4: Cable controllers to drive shelves You must cable the controllers to your shelves using the onboard storage ports. NetApp recommends MP-HA cabling for systems with external storage. If you have a SAS tape drive, you can use single-path cabling.
Page 90
Step Perform on each controller Cable the shelf-to-shelf ports. • Port 3 on IOM A to port 1 on the IOM A on the shelf directly below. • Port 3 on IOM B to port 1 on the IOM B on the shelf directly below. mini-SAS HD to mini-SAS HD cables Connect each node to IOM A in the stack.
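The shelf-to-shelf rule (port 3 on each IOM to port 1 on the same IOM of the shelf directly below) can be expanded into a cable list for a stack of any height. The sketch below is illustrative only: numbering shelves from 1 at the top of the stack is an assumption, and the controller-to-stack connections are not covered.

```python
from typing import List, Tuple


def shelf_to_shelf_cables(shelf_count: int) -> List[Tuple[str, str]]:
    """List (from, to) endpoints for the shelf-to-shelf cables in one stack.

    Shelves are numbered from 1 (top) to shelf_count (bottom); this
    numbering is an assumption made for the example.
    """
    if shelf_count < 1:
        raise ValueError("a stack needs at least one shelf")
    cables = []
    for shelf in range(1, shelf_count):
        for iom in ("A", "B"):
            cables.append(
                (
                    f"shelf {shelf} IOM {iom} port 3",
                    f"shelf {shelf + 1} IOM {iom} port 1",
                )
            )
    return cables


if __name__ == "__main__":
    for source, target in shelf_to_shelf_cables(3):
        print(f"{source} -> {target}")
```

A single-shelf stack yields no shelf-to-shelf cables, and each additional shelf adds two, one per IOM.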
Page 91
Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch. Option 1: Complete system setup if network discovery is enabled If you have network discovery enabled on your laptop, you can complete system setup and configuration using automatic cluster discovery.
Page 92
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 8.
Page 93
c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Use the following animation to set one or more drive shelf IDs: Setting drive shelf IDs 3.
Page 94
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 7. Verify the health of your system by running Config Advisor.
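The address format https://x.x.x.x is the node management IPv4 address prefixed with the HTTPS scheme. A minimal sketch that validates the address before building the URL; the function name is hypothetical and the sample address comes from a documentation-only range.

```python
import ipaddress


def node_management_url(node_mgmt_ip: str) -> str:
    """Build the https://x.x.x.x URL for a node management IPv4 address.

    Raises ValueError if the string is not a valid IPv4 address.
    """
    address = ipaddress.IPv4Address(node_mgmt_ip)
    return f"https://{address}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only example address.
    print(node_management_url("192.0.2.10"))
```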
Page 95
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 96
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
Page 97
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
Page 98
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 99
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: ...
Page 100
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c.
Page 101
NetApp Support. mysupport.netapp.com b. Verify that the Restored column shows yes for all authentication keys: security key-manager key query c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information. d. Go to advanced privilege mode and enter...
Page 102
Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. •...
Page 103
1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
Page 104
Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it. Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the boot media using the following illustration or the FRU map on the controller module:...
Page 105
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 106
• If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system. Steps 1. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system.
Page 107
Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details. 8. Although the environment variables and bootargs are retained, you should check that all required boot environment variables and bootargs are properly set for your system type and configuration using the printenv bootarg name command, and correct any errors using the setenv variable-name...
Page 108
If your system has a network connection, then: a. Press y when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
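The restore-backup step takes the impaired node's IP address as its only variable part. As a minimal sketch, assuming you want to validate that address before pasting the command into the console, the helper below builds the command string quoted above. The function name and the sample address are hypothetical and are not part of ONTAP.

```python
import ipaddress


def restore_backup_command(impaired_node_ip: str) -> str:
    """Build the restore-backup command quoted in the step above.

    Raises ValueError if the argument is not a valid IP address.
    """
    # ip_address() rejects malformed input before it reaches the console.
    address = ipaddress.ip_address(impaired_node_ip)
    return f"system node restore-backup -node local -target-address {address}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-range address used only as an example.
    print(restore_backup_command("192.0.2.10"))
```

The helper only formats text; you still run the resulting command on the healthy node at the advanced privilege level.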
Page 109
Restore OKM, NSE, and NVE as needed - AFF A220 and FAS2700 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled.
Page 110
prompt 5. Enter the passphrase for the onboard key manager you obtained from the customer at the beginning of this procedure. 6. When prompted to enter the backup data, paste the backup data you captured at the beginning of this procedure.
Page 111
◦ If the command fails because of an NDMP, SnapMirror, or SnapVault process, disable the process. See the appropriate Documentation Center for more information. 11. Once the giveback completes, check the failover and giveback status with the storage failover show commands.
Page 112
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 113
◦ If the Restored column = anything other than yes, and/or one or more key managers is not available, use the security key-manager restore -address command to retrieve and restore all authentication keys (AKs) and key IDs associated with all nodes from all available key management servers.
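The decision in this step depends on two inputs: the Restored value for each key and whether every key manager is reachable. The sketch below expresses that rule, assuming you have already copied the query output into a list of dictionaries. The field names key_id and restored are hypothetical and do not reflect any ONTAP output format.

```python
def keys_needing_restore(rows):
    """Return the key IDs whose Restored value is anything other than 'yes'."""
    return [
        row["key_id"]
        for row in rows
        if row["restored"].strip().lower() != "yes"
    ]


def restore_required(rows, all_key_managers_available):
    """Apply the rule above: restore if any key is not restored,
    or if one or more key managers is not available."""
    return bool(keys_needing_restore(rows)) or not all_key_managers_available


if __name__ == "__main__":
    sample = [
        {"key_id": "key-1", "restored": "yes"},
        {"key_id": "key-2", "restored": "no"},
    ]
    print(keys_needing_restore(sample))
    print(restore_required(sample, all_key_managers_available=True))
```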
Page 114
-auto-giveback true Return the failed part to NetApp - AFF A220 and FAS2700 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 115
Replace the caching module - AFF A220 and FAS2700 You must replace the caching module in the controller module when your system registers a single AutoSupport (ASUP) message that the module has gone offline; failure to do so results in performance degradation. •...
Page 116
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 117
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 118
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace a caching module To replace a caching module, referred to as the M.2 PCIe card on the label on your controller, locate the slot inside the controller and follow the specific sequence of steps.
Page 119
Your storage system must meet certain criteria depending on your situation: • It must have the appropriate operating system for the caching module you are installing. • It must support the caching capacity. • All other components in the storage system must be functioning properly; if not, you must contact technical support.
Page 120
7. Close the controller module cover, as needed. Step 4: Reinstall the controller module After you replace components in the controller module, reinstall it into the chassis. Steps 1. If you are not already grounded, properly ground yourself. 2. If you have not already done so, replace the cover on the controller module. 3.
Page 121
If your system is in an HA pair, then perform these steps: The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Page 122
If your system is in a stand-alone configuration, then perform these steps: a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 123
3. Run diagnostics on the caching module: sldiag device run -dev fcache 4. Verify that no hardware problems resulted from the replacement of the caching module: sldiag device status -dev fcache -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
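The run and status commands follow the same pattern for each device, with only the -dev value changing. A minimal sketch, assuming you want to generate both command strings from a device name, is shown below. The function names are hypothetical, and the device list contains only the names that appear in this guide (fcache, mem, nvmem), so it is not a complete list of sldiag devices.

```python
# Device names taken from the sldiag steps in this guide; not exhaustive.
DEVICES_IN_THIS_GUIDE = {"fcache", "mem", "nvmem"}


def sldiag_run_command(dev: str) -> str:
    """Command that runs diagnostics on one device."""
    if dev not in DEVICES_IN_THIS_GUIDE:
        raise ValueError(f"device not covered by this sketch: {dev}")
    return f"sldiag device run -dev {dev}"


def sldiag_failed_status_command(dev: str) -> str:
    """Command that lists only failed tests for one device."""
    if dev not in DEVICES_IN_THIS_GUIDE:
        raise ValueError(f"device not covered by this sketch: {dev}")
    return f"sldiag device status -dev {dev} -long -state failed"


if __name__ == "__main__":
    print(sldiag_run_command("fcache"))
    print(sldiag_failed_status_command("fcache"))
```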
Page 124
If the system-level diagnostics tests resulted in some test failures, then determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 125
cluster_B::> metrocluster node show
                             Configuration
Group Cluster Node           State          Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
      cluster_A
              controller_A_1 configured     enabled   heal roots completed
      cluster_B
              controller_B_1 configured     enabled   waiting for switchback recovery
2 entries were displayed.
2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 126
Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 127
If your system is running clustered ONTAP with two nodes in the cluster, then: cluster ha modify -configured false storage failover modify -node node0 -enabled false If your system is running clustered ONTAP with more than two nodes in the cluster, then: storage failover modify -node node0 -enabled false 2.
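The two rows above differ only in whether cluster HA must be unconfigured first. The following sketch captures that choice, assuming you know the number of nodes in the cluster. The function name is hypothetical, and node0 is the placeholder node name used in the step itself.

```python
def disable_failover_commands(node_count: int, node: str = "node0"):
    """Return the commands for the row that matches the cluster size."""
    if node_count < 2:
        raise ValueError("this step covers clusters with two or more nodes")
    commands = []
    if node_count == 2:
        # Only a two-node cluster needs cluster HA unconfigured first.
        commands.append("cluster ha modify -configured false")
    commands.append(f"storage failover modify -node {node} -enabled false")
    return commands


if __name__ == "__main__":
    for count in (2, 4):
        print(count, disable_failover_commands(count))
```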
Page 128
The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3.
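The maintenance-window message takes the number of hours followed by the letter h. As a minimal sketch, assuming you want to generate the message for a different window length, the helper below formats the same command shown in the example. The function name is hypothetical.

```python
def autosupport_maintenance_command(hours: int) -> str:
    """Build the AutoSupport command that suppresses case creation
    for the given number of hours."""
    if hours <= 0:
        raise ValueError("hours must be a positive whole number")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


if __name__ == "__main__":
    # Reproduces the two-hour example from the step above.
    print(autosupport_maintenance_command(2))
```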
Page 129
Do not use excessive force when sliding the power supply into the system. You can damage the connector. 7. Close the cam handle so that the latch clicks into the locked position and the power supply is fully seated. 8. Reconnect the power cable and secure it to the power supply using the power cable locking mechanism. Only connect the power cable to the power supply.
Page 130
4. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis. Step 3: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
Page 131
Step 5: Install the controller After you install the controller module and any other components into the new chassis, boot it to a state where you can run the interconnect diagnostic test. For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
Page 132
6. Boot each node to Maintenance mode: a. As each node starts booting, press Ctrl-C to interrupt the boot process when you see the message Press Ctrl-C for Boot Menu. If you miss the prompt and the controller modules boot to ONTAP, enter halt, and then at the LOADER prompt enter boot_ontap, press Ctrl-C when prompted, and then...
Page 133
Step 2: Run system-level diagnostics After installing a new chassis, you should run interconnect diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics. All commands in the diagnostic procedures are issued from the node where the component is being replaced. 1.
Page 134
If the system-level diagnostics tests were completed without any failures, then: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
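Verifying that the log was cleared comes down to finding the default response in the console output. The sketch below shows that check, assuming you have captured the output of sldiag device status as text. The function name is hypothetical; the expected message is the one quoted in the step.

```python
CLEARED_MESSAGE = "SLDIAG: No log messages are present."


def status_log_is_cleared(console_output: str) -> bool:
    """Return True if the captured output contains the default response
    that sldiag prints when the status log is empty."""
    return CLEARED_MESSAGE in console_output


if __name__ == "__main__":
    captured = "*> sldiag device status\nSLDIAG: No log messages are present.\n"
    print(status_log_is_cleared(captured))
```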
Page 135
If the system-level diagnostics tests resulted in some test failures, then determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
Page 136
6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 137
recovery procedure to determine whether you should use this procedure. If this is the procedure you should use, note that the controller replacement procedure for a node in a four or eight node MetroCluster configuration is the same as that in an HA pair. No MetroCluster-specific steps are required because the failure is restricted to an HA pair and storage failover commands can be used to provide nondisruptive operation during the replacement.
Page 138
system node autosupport invoke -node * -type all -message MAINT=2h 2. If the impaired node is part of an HA pair, disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3. Take the impaired node to the LOADER prompt: If the impaired node is Then…...
Page 139
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 140
6. Turn the controller module over and place it on a flat, stable surface. 7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Move the NVMEM battery To move the NVMEM battery from the old controller module to the new controller module, you must perform a specific sequence of steps.
Page 141
1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon. The NVRAM LED blinks while destaging contents to the flash memory when you halt the system.
Page 142
7. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side wall. 8. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side wall.
Page 143
You must have the new controller module ready so that you can move the DIMMs directly from the impaired controller module to the corresponding slots in the replacement controller module. 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
Page 144
Make sure that the plug locks down onto the controller module. Step 5: Move a caching module, if present If your AFF A220 or FAS2700 system has a caching module, you need to move the caching module from the old controller module to the replacement controller module. The caching module is referred to as the “M.2 PCIe card”...
Page 145
5. Reseat and push the heatsink down to engage the locking button on the caching module housing. 6. Close the controller module cover, as needed. Step 6: Install the controller After you install the components from the old controller module into the new controller module, you must install the new controller module into the system chassis and boot the operating system.
Page 146
If your system is in an HA pair, then perform these steps: The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Page 147
If your system is in a stand-alone configuration, then perform these steps: a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 148
Restore and verify the system configuration - AFF A220 and FAS2700 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary. Step 1: Set and verify system time after replacing the controller You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration.
Page 149
▪ ▪ ▪ mcc-2n ▪ mccip ▪ non-ha b. Confirm that the setting has changed: ha-config show Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and subsystems whenever you replace the controller. All commands in the diagnostic procedures are issued from the node where the component is being replaced.
Page 150
If you want to run diagnostic tests on individual components, then: a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
Page 151
If you want to run diagnostic tests on multiple components at the same time, then: a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
Page 152
If the system-level diagnostics tests were completed without any failures, then: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
Page 153
If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
Page 154
-giveback true Step 4: Switch back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
Page 155
6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 156
MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3.
Page 157
3. If the system has only one controller module in the chassis, turn off the power supplies, and then unplug the impaired node’s power cords from the power source. Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 158
Steps 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
Page 159
Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
Page 160
a. Locate the battery, press the clip on the face of the battery plug to release the lock clip from the plug socket, and then unplug the battery cable from the socket. b. Confirm that the NVMEM LED is no longer lit. c.
Page 161
9. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 10. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
Page 162
Do not completely insert the controller module in the chassis until instructed to do so. 4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5.
Page 163
If your system is in a stand-alone configuration, then perform these steps: a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 164
During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
Page 165
If the system-level diagnostics tests resulted in some test failures, then determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 166
1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
Page 167
Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 168
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 169
State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
Page 170
4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
Page 171
The NVRAM LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. ▪ If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete, and then the LED turns off.
Page 172
1. If you are not already grounded, properly ground yourself. 2. If you have not already done so, replace the cover on the controller module. 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system.
Page 173
If your system is in a stand-alone configuration, then perform these steps: a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 174
During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4. Verify that no hardware problems resulted from the replacement of the NVMEM battery: sldiag device status -dev nvmem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
Page 175
If the system-level diagnostics tests resulted in some test failures, then determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 176
1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
Page 177
Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 178
5. Use the cam handle to slide the power supply out of the system. CAUTION: When removing a power supply, always use two hands to support its weight. 6. Make sure that the on/off switch of the new power supply is in the Off position. 7.
Page 179
11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 180
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 181
If the impaired node is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 182
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
Page 183
1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 184
module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables.
Page 185
1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
Page 186
Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 187
This guide gives graphic instructions for a typical installation of your system from racking and cabling, through initial system bring-up. Use this guide if you are familiar with installing NetApp systems. Access the Installation and Setup Instructions PDF poster: • English: AFF A250 Installation and Setup Instructions •...
Page 188
NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed guide - AFF A250 This guide gives detailed step-by-step instructions for installing an AFF A250 system. Step 1: Prepare for installation To install your AFF A250 system, you need to create an account and register the system.
Page 189
ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
Page 190
3. Identify and manage cables because this system does not have a cable management device. 4. Place the bezel on the front of the system. Step 3: Cable controllers There is required cabling for your platform’s cluster using the two-node switchless cluster method or the cluster interconnect network method.
Page 191
Step Perform on each controller Cable the cluster interconnect ports to each other with the 25GbE cluster interconnect cable • e0c to e0c • e0d to e0d Cable the wrench ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point.
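The switchless cluster interconnect uses like-to-like links between the two controllers: e0c to e0c and e0d to e0d. The sketch below records those links and reports any that are missing from a list of cables you have already connected. The data structure and function name are hypothetical and serve only as a checklist aid.

```python
# Required controller A to controller B links from the step above.
SWITCHLESS_CLUSTER_LINKS = [("e0c", "e0c"), ("e0d", "e0d")]


def missing_links(cabled):
    """Return the required (port_on_A, port_on_B) links not yet cabled.

    `cabled` is any collection of (port_on_A, port_on_B) tuples.
    """
    done = set(cabled)
    return [link for link in SWITCHLESS_CLUSTER_LINKS if link not in done]


if __name__ == "__main__":
    # Only the e0c link is connected so far in this example.
    print(missing_links([("e0c", "e0c")]))
```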
Page 192
Cabling a switched cluster Step Perform on each controller Cable the cluster interconnect ports to the 25 GbE cluster interconnect switches. • e0c • e0d Cable the wrench ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point. 2.
Page 193
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Step Perform on each controller module Cable ports 2a through 2d to the FC host switches.
Page 194
Step Perform on each controller module Cable ports e4a through e4d to the 10GbE host network switches. To perform other optional cabling, choose from: • Option 1: Cable to a Fibre Channel host network • Option 3: Cable the controllers to a single drive shelf To complete setting up your system, see Step4: Completing system setup and...
Page 195
Step Perform on each controller module Cable controller A to the shelf Cable controller B to the shelf: 2. To complete setting up your system, see Step4: Completing system setup and configuration. Step 4: Complete system setup and configuration Complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
Page 196
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 5. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 6.
Page 197
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 5. Verify the health of your system by running Config Advisor.
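The browser address is the node management IP address prefixed with https://. A minimal sketch, assuming an IPv4 management address as in the x.x.x.x format above, is shown below. The function name and the sample address are hypothetical.

```python
import ipaddress


def node_management_url(ip: str) -> str:
    """Return the https://x.x.x.x address for a node management IP.

    Raises ValueError if `ip` is not a valid IPv4 address.
    """
    address = ipaddress.IPv4Address(ip)  # validates dotted-quad input
    return f"https://{address}"


if __name__ == "__main__":
    # 192.0.2.20 is a documentation-range address used only as an example.
    print(node_management_url("192.0.2.20"))
```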
Page 198
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 199
Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 200
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys...
Page 201
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c.
Page 202
Shut down the node - AFF A250 Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node. Steps 1. If the impaired node isn’t at the LOADER prompt: If the impaired node displays… Then… Waiting for giveback...
Page 203
The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3.
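The AutoSupport command above takes the maintenance window as a number of hours followed by h. A minimal sketch, assuming you script your runbooks in Python: the hypothetical helper below only builds the command string shown in this guide for a given number of hours; it does not connect to the cluster or run anything.

```python
def maint_message_command(hours: int) -> str:
    """Build the AutoSupport command that suppresses case creation for `hours` hours.

    Hypothetical helper for illustration; the command text mirrors the example in
    this guide, and the result still has to be entered at the cluster prompt.
    """
    if hours < 1:
        raise ValueError("maintenance window must be at least one hour")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


if __name__ == "__main__":
    print(maint_message_command(2))
```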
Page 204
Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 205
Thumbscrew Controller module cover. 7. Lift out the air duct cover. Step 2: Replace the boot media Before you can replace the boot media, you must locate the failed boot media in the controller module by removing the air duct. You need a #1 magnetic Phillips head screwdriver to remove the screw that holds the boot media in place.
Page 206
• You must have a USB flash drive, formatted to MBR/FAT32, with at least 4GB capacity • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site...
Page 207
◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button. ◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the download button. • If your system is an HA pair, you must have a network connection.
Page 208
3. Close the controller module cover and tighten the thumbscrew.
Page 209
Controller module cover Thumbscrew 4. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 5. Plug the power cable into the power supply and reinstall the power cable retainer. 6.
Page 210
▪ bootarg.storageencryption.support value ▪ bootarg.keymanager.support value ▪ bootarg.onboard_keymanager value d. Save the environment variables you changed with the savenv command. e. Confirm your changes using the printenv variable-name command. Boot the recovery image - AFF A250 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
Page 211
If your system has… Then… No network connection and is in a MetroCluster IP configuration: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
Page 212
Restore OKM, NSE, and NVE as needed - AFF A250 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 213
If the console displays… Then… The LOADER prompt Boot the node to the boot menu: boot_ontap menu Waiting for giveback…. a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]? , enter: c.
Page 214
9. Confirm the target node is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
Page 215
Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target node. 2. Use the boot_ontap command at the LOADER prompt to boot the node. 3. Check the console output: If the console displays… Then…...
Page 216
-auto-giveback true Return the failed part to NetApp - AFF A250 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 217
impaired node; see the Administration overview with the CLI. • If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
Page 218
of the bezel: Replacing the chassis 1. If you are not already grounded, properly ground yourself. 2. Unplug the controller module power supplies from the source. 3. Release the power cable retainers, and then unplug the cables from the power supplies. 4.
Page 219
The drive should disengage from the chassis, allowing it to slide free of the chassis. When removing a drive, always use two hands to support its weight. Drives are fragile. Handle them as little as possible to prevent damage to them. 3.
Page 220
3. Plug the power cables into the power supplies and reinstall the power cable retainers. 4. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
Page 221
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 222
Shut down the impaired controller module - AFF A250 • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the “Returning SEDs to unprotected mode” section of the ONTAP 9 NetApp Encryption Power Guide.
Page 223
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 224
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 225
Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 226
Thumbscrew Controller module cover. 7. Lift out the air duct cover. Step 2: Move the power supply You must move the power supply from the impaired controller module to the replacement controller module when you replace a controller module. 1. Disconnect the power supply: a.
Page 227
Blue power supply locking tab Power supply 3. Move the power supply to the new controller module, and then install it. 4. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
Page 228
Fan module 2. Move the fan module to the replacement controller module, and align the edges of the fan module with the opening in the controller module, and then slide the fan module in. 3. Repeat these steps for the remaining fan modules. Step 4: Move the boot media There is one boot media device in the AFF A250 under the air duct in the controller module.
Page 229
Remove the screw securing the boot media to the motherboard in the impaired controller module. Lift the boot media out of the impaired controller module. a. Using the #1 magnetic screwdriver, remove the screw from the boot media, and set it aside safely on the magnet.
Page 230
Install each DIMM into the same slot it occupied in the impaired controller module. a. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. Hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
Page 231
Remove screws on the face of the controller module. Loosen the screw in the controller module. Move the mezzanine card. a. Unplug any cabling associated with the mezzanine card. Make sure that you label the cables so that you know where they came from. b.
Page 232
h. Insert the SFP or QSFP modules that were removed onto the mezzanine card. Step 7: Move the NV battery When replacing the controller module, you must move the NV battery from the impaired controller module to the replacement controller module 1.
Page 233
c. Locate the corresponding NV battery holder on the replacement controller module and align the NV battery to the battery holder. d. Insert the NV battery plug into the socket. e. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side wall.
Page 234
Controller module cover Thumbscrew 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
Page 235
e. Release your thumbs from the top of the latching mechanisms and continue pushing until the latching mechanisms snap into place. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process.
Page 236
2. If the displayed system state of the controller module does not match your system configuration, set the state for the controller module: ha-config modify controller ha-state The value for HA-state can be one of the following: ◦ ha ◦ mcc ◦...
Page 237
Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections. Steps 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b.
Page 238
4. From the healthy node, verify that any coredumps are saved: a. Change to the advanced privilege level: set -privilege advanced You can respond y when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c.
Page 239
node1> storage disk show -ownership
Disk Aggregate Home Owner DR Home Home ID Owner ID DR Home ID Reserver Pool
----- ------ ----- ------ -------- ------- ------- ------- ---------
1.0.0 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0
1.0.1 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0
7.
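When many disks are listed, comparing the Home and Owner columns by eye is error-prone. A minimal sketch, assuming the output of storage disk show -ownership has been saved to a text log and that the first four whitespace-separated columns of each disk row are Disk, Aggregate, Home, and Owner, as in the header above. The function name is hypothetical, and the check is one possible sanity test rather than a NetApp-provided tool.

```python
import re

# A disk row starts with a disk name such as 1.0.0 (assumption based on the sample above).
DISK_ROW = re.compile(r"^\d+\.\d+\.\d+\s")


def mismatched_owners(output: str) -> list:
    """Return (disk, home, owner) tuples for rows whose Home and Owner differ."""
    mismatches = []
    for line in output.splitlines():
        line = line.strip()
        if not DISK_ROW.match(line):
            continue  # skip the prompt, header, separator, and blank lines
        fields = line.split()
        if len(fields) < 4:
            continue  # row too short to hold Disk, Aggregate, Home, Owner
        disk, _aggregate, home, owner = fields[:4]
        if home != owner:
            mismatches.append((disk, home, owner))
    return mismatches
```

An empty result means every listed disk reports the same node in both columns.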
Page 240
Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
Page 241
-node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 242
increasing number of correctable error correction codes (ECC); failure to do so causes a system panic. About this task All other components in the system must be functioning properly; if not, you must contact technical support. You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
Page 243
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 244
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 245
5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 246
Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct or locating it using the LED next to the DIMM, and then replace it following the specific sequence of steps.
Page 247
a. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation. b. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. c.
Page 248
2. Close the controller module cover and tighten the thumbscrew.
Page 249
Controller module cover Thumbscrew 3. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
Page 250
Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 251
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 252
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 253
5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 254
Fan module 3. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
Page 255
Controller module cover Thumbscrew 2. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
Page 256
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 257
3. Take the impaired node to the LOADER prompt: If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
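The table above is a lookup from what the console shows to the next action. A minimal sketch of that lookup in Python, assuming the console text has already been read by the operator; the dictionary keys and the function name are hypothetical labels chosen for illustration, and the actions are paraphrased from this guide.

```python
# Console state -> next action, paraphrased from the table in this guide.
NEXT_ACTION = {
    "LOADER prompt": "Go to the next step.",
    "Waiting for giveback": "Press Ctrl-C, and then respond y when prompted.",
    "System prompt or password prompt": "Take over or halt the impaired node.",
}


def next_action(console_state: str) -> str:
    """Return the documented action for the state shown on the impaired node's console."""
    lowered = console_state.lower()
    for state, action in NEXT_ACTION.items():
        # Substring match so that trailing punctuation such as an ellipsis is tolerated.
        if state.lower() in lowered:
            return action
    raise KeyError(f"no documented action for console state: {console_state!r}")
```

Raising an error for an unrecognized state is a deliberate choice here: an unexpected console message should stop the procedure rather than fall through to a default action.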
Page 258
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 259
5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 260
Remove screws on the face of the controller module. Loosen the screw in the controller module. Remove the mezzanine card. a. Unplug any cabling associated with the impaired mezzanine card. Make sure that you label the cables so that you know where they came from. b.
Page 261
Do not apply force when tightening the screw on the mezzanine card; you might crack it. i. Insert any SFP or QSFP modules that were removed from the impaired mezzanine card to the replacement mezzanine card. 3. To install a mezzanine card: 4.
Page 262
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 263
Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
Page 264
• If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours:...
Page 265
Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 266
Thumbscrew Controller module cover. Step 3: Replace the NV battery To replace the NV battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. Use the following video or the tabulated steps to replace the NVMEM battery: Replacing the NVMEM battery 1.
Page 267
Grasp the battery and press the blue locking tab marked PUSH. Lift the battery out of the holder and controller module. a. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket.
Page 268
Controller module cover Thumbscrew 2. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
Page 269
◦ If the scan reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 270
Replacing the power supply 1. If you are not already grounded, properly ground yourself. 2. Identify the power supply you want to replace, based on console error messages or through the red Fault LED on the power supply. 3. Disconnect the power supply: a.
Page 271
Once power is restored to the power supply, the status LED should be green. 7. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 272
3. Take the impaired node to the LOADER prompt: If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 273
If the impaired node is displaying… Then… Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
Page 274
Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
Page 275
Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. Use the following video or the tabulated steps to replace the RTC battery: Replacing the RTC battery 1.
Page 276
Gently pull the tab away from the battery housing. Attention: Pulling it away aggressively might displace the tab. Lift the battery up. Note: Make a note of the polarity of the battery. The battery should eject. 2.
Page 277
With positive polarity face up, slide the battery under the tab of the battery housing. Push the battery gently into place and make sure the tab secures it to the housing. Attention: Pushing it in aggressively might cause the battery to eject out again.
Page 278
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
AFF A300 System Documentation Install and setup Cluster configuration worksheet - AFF A300 You can use the worksheet to gather and record your site-specific IP addresses and other information required when configuring an ONTAP cluster. Cluster Configuration Worksheet Start here: Choose your installation and setup experience For most configurations, you can choose from different content formats.
Page 280
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 281
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired node, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 282
▪ Run the key-manager setup wizard: security key-manager setup -node target/impaired node name Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact mysupport.netapp.com ▪ Verify that the Restored column displays yes for all authentication keys:...
Page 283
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 284
Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
Page 285
Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the column shows...
Page 286
If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c. You can safely shut down the node. 4. If the type displays and the column displays anything other than yes:...
Page 287
2. From the LOADER prompt, enter: printenv to capture all boot environmental variables. Save the output to your log file. This command may not work if the boot device is corrupted or non-functional. Option 2: Controller is in a MetroCluster configuration After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node.
Page 288
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 289
controller_A_1::> metrocluster operation show
Operation: heal-aggregates
State: successful
Start Time: 7/25/2016 18:45:55
End Time: 7/25/2016 18:45:56
Errors: -
5. Check the state of the aggregates by using the storage aggregate show command.
controller_A_1::> storage aggregate show
Aggregate Size Available Used% State #Vols...
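The metrocluster operation show output above is a set of "Field: value" pairs, so confirming that the heal operation succeeded can be checked mechanically. A minimal sketch, assuming the console output has been captured to a text log in the layout shown in this guide; the function names are hypothetical and nothing here talks to the cluster.

```python
def parse_operation_show(output: str) -> dict:
    """Parse 'Field: value' lines from captured metrocluster operation show output."""
    fields = {}
    for line in output.splitlines():
        if "::>" in line:
            continue  # skip the prompt line that echoes the command
        key, separator, value = line.partition(":")
        if separator:
            # partition splits on the first colon only, so timestamps stay intact
            fields[key.strip()] = value.strip()
    return fields


def heal_aggregates_succeeded(output: str) -> bool:
    """Return True when the captured output reports a successful heal-aggregates run."""
    fields = parse_operation_show(output)
    return (
        fields.get("Operation") == "heal-aggregates"
        and fields.get("State") == "successful"
    )
```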
Page 290
Step 1: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
Page 291
Step 2: Replace the boot media - AFF A300 You must locate the boot media in the controller and follow the directions to replace it. 1. If you are not already grounded, properly ground yourself. 2. Locate the boot media using the following illustration or the FRU map on the controller module: 3.
Page 292
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 293
▪ bootarg.init.boot_clustered ▪ partner-sysid ▪ bootarg.init.flash_optimized ▪ bootarg.init.switchless_cluster.enable b. If External Key Manager is enabled, check the bootarg values, listed in the ASUP output: kenv ▪ bootarg.storageencryption.support <value> ▪ bootarg.keymanager.support <value> ▪ kmip.init.interface <value> ▪ kmip.init.ipaddr <value> ▪ kmip.init.netmask <value> ▪...
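Checking each boot environment variable against the values captured earlier is a straightforward comparison. A minimal sketch, assuming you have transcribed the captured and current values into two Python dictionaries by hand (the console output format is not parsed here). The variable names come from the list above; the function name and any values you supply are illustrative.

```python
# Variable names taken from the list in this guide.
CHECKED_BOOTARGS = (
    "bootarg.init.boot_clustered",
    "partner-sysid",
    "bootarg.init.flash_optimized",
    "bootarg.init.switchless_cluster.enable",
)


def changed_bootargs(captured: dict, current: dict, names=CHECKED_BOOTARGS) -> dict:
    """Return {name: (captured_value, current_value)} for variables that differ.

    A variable missing from either dictionary is reported with None on that side.
    """
    changed = {}
    for name in names:
        before = captured.get(name)
        after = current.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed
```

An empty result suggests the listed variables match the captured values; any entry returned is a candidate for correction before saving the environment.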
Page 294
configuration: a. Boot to Maintenance mode: boot_ontap maint b. Set the MetroCluster ports as initiators: ucadmin modify -m fc -t initiator adapter_name c. Halt to return to Maintenance mode: halt The changes will be implemented when the system is booted. Boot the recovery image - AFF A300 The procedure for booting the impaired node from the recovery image depends on whether the system is in a two-node MetroCluster configuration.
Page 295
If your system has… Then… No network connection: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Select the Update flash from backup config (sync flash) option from the displayed menu. If you are prompted to continue with the update, press y.
Page 296
Option 2: Controller is in a two-node MetroCluster You must boot the ONTAP image from the USB drive and verify the environmental variables. This procedure applies to systems in a two-node MetroCluster configuration. Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive.
Page 297
cluster_B::> metrocluster node show
Configuration Group Cluster Node State Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
cluster_A controller_A_1 configured enabled heal roots completed
cluster_B controller_B_1 configured enabled waiting for switchback recovery
2 entries were displayed.
2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 298
Restore OKM, NSE, and NVE as needed - AFF A300 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 299
--------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to prompt. Waiting for giveback… 8. Move the console cable to the partner node and login as admin. 9.
Page 300
b. Enter the key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, contact Customer Support. c.
Page 301
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 302
a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, use the...
Page 303
-auto-giveback true Return the failed part to NetApp - AFF A300 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 304
• This procedure is written with the assumption that you are moving the controller module or modules to the new chassis, and that the chassis is a new component from NetApp.
• This procedure is disruptive. For a two-node cluster, you will have a complete service outage; for a multi-node cluster, you will have a partial outage.
Page 305
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 306
4. Verify that the operation has been completed by using the metrocluster operation show command.
controller_A_1::> metrocluster operation show
Operation: heal-aggregates
State: successful
Start Time: 7/25/2016 18:45:55
End Time: 7/25/2016 18:45:56
Errors: -
5. Check the state of the aggregates by using the command.
Page 307
Move and replace hardware - AFF A300
Step 1: Move a power supply
Moving a power supply when replacing a chassis involves turning off, disconnecting, and removing the power supply from the old chassis and installing and connecting it on the replacement chassis.
Page 308
Figure callouts: power supply, cam handle release latch, Power and Fault LEDs, cam handle...
Page 309
Figure callout: power cable locking mechanism
4. Use the cam handle to slide the power supply out of the system.
CAUTION: When removing a power supply, always use two hands to support its weight.
5. Repeat the preceding steps for any remaining power supplies.
6.
Page 310
Figure callouts: cam handle, fan module, cam handle release latch, fan module Attention LED
3. Pull the fan module straight out from the chassis, making sure that you support it with your free hand so that it does not swing out of the chassis.
CAUTION: The fan modules are short.
Page 311
9. Repeat these steps for the remaining fan modules. 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. Step 3: Remove the controller module To replace the chassis, you must remove the controller module or modules from the old chassis.
Page 312
6. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis. Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis.
Page 313
If your system is in… Then perform these steps… An HA pair a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Tighten the thumbscrew on the cam handle on back of the controller module.
Page 314
Then…
A stand-alone configuration:
a. Exit Maintenance mode: halt
b. Go to Step 4: Return the failed part to NetApp.
An HA pair with a second controller module:
Exit Maintenance mode: halt
The LOADER prompt appears.
Step 2: Run system-level diagnostics
After installing a new chassis, you should run interconnect diagnostics.
Page 315
During the boot process, you can safely respond to prompts: 2. Repeat the previous step on the second node if you are in an HA configuration. Both controllers must be in Maintenance mode to run the interconnect test. 3. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond...
Page 316
If the system-level diagnostics tests… Then…
Were completed without any failures:
a. Clear the status logs: sldiag device clearstatus
b. Verify that the log was cleared: sldiag device status
The following default response is displayed: SLDIAG: No log messages are present.
c.
Page 317
If your system is running ONTAP… Then…
Resulted in some test failures: Determine the cause of the problem.
a. Exit Maintenance mode: halt
b. Perform a clean shutdown, and then disconnect the power supplies.
c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
Page 318
6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 319
• Any PCIe cards moved from the old controller module to the new controller module or added from existing customer site inventory must be supported by the replacement controller module. NetApp Hardware Universe • It is important that you apply the commands in these steps on the correct systems: ◦...
Page 320
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 321
If the impaired node… Then…
Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed:
Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support.
...
Page 322
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 323
controller_A_1::> metrocluster heal -phase aggregates
[Job 130] Job succeeded: Heal Aggregates is successful.
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
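When the heal command is run from a script, the job result line in the example output can be checked programmatically. The following Python sketch is illustrative only: it assumes the console output has already been captured as text, and the helper names `parse_job_result` and `heal_succeeded` are hypothetical, not part of ONTAP. It relies only on the `[Job N] Job succeeded: ...` line format shown in the example.

```python
import re

# Matches lines such as "[Job 130] Job succeeded: Heal Aggregates is successful."
# Only the "succeeded" form appears in this guide; any other status word is
# captured as-is and treated as "not succeeded" (an assumption).
JOB_LINE = re.compile(r"\[Job (\d+)\] Job (\w+): (.*)")


def parse_job_result(console_text):
    """Return (job_id, status, message) for the first job line, or None."""
    for line in console_text.splitlines():
        match = JOB_LINE.search(line)
        if match:
            return int(match.group(1)), match.group(2), match.group(3).strip()
    return None


def heal_succeeded(console_text):
    """True only if a job line is present and its status is 'succeeded'."""
    result = parse_job_result(console_text)
    return result is not None and result[1] == "succeeded"
```

If `heal_succeeded` returns False, review the console output for veto messages before deciding whether to reissue the command with the override parameter.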
Page 324
mcc1A::> metrocluster operation show
Operation: heal-root-aggregates
State: successful
Start Time: 7/29/2016 20:54:41
End Time: 7/29/2016 20:54:42
Errors: -
8. On the impaired controller module, disconnect the power supplies.
Move the controller module hardware - AFF A300
To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
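The check of the metrocluster operation show output can also be automated. The sketch below is a minimal Python example, assuming the output has been captured as text with one "Field: value" pair per line, as it appears on a live console; the function names are hypothetical and the field names are taken from the example output above.

```python
def parse_operation_show(console_text):
    """Parse 'Field: value' lines into a dict keyed by field name."""
    fields = {}
    for line in console_text.splitlines():
        # Skip the prompt line and anything without a "Field: value" shape.
        if "::>" in line or ":" not in line:
            continue
        # Split on the first colon only, so time values keep their colons.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def operation_successful(console_text, expected_operation):
    """True if the named operation is reported with State: successful."""
    fields = parse_operation_show(console_text)
    return (fields.get("Operation") == expected_operation
            and fields.get("State") == "successful")
```

For the root aggregate healing shown above, the call would be `operation_successful(output, "heal-root-aggregates")`.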
Page 325
Figure callouts: thumbscrew, cam handle
6. Pull the cam handle downward and begin to slide the controller module out of the chassis.
Make sure that you support the bottom of the controller module as you slide it out of the chassis.
Step 2: Move the boot device
You must locate the boot media and follow the directions to remove it from the old controller and insert it in the new controller.
Page 326
2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
Page 327
◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon. The NVRAM LED blinks while destaging contents to the flash memory when you halt the system.
Page 328
Figure callouts: battery lock tab, NVMEM battery pack
3. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module.
4. Remove the battery from the controller module and set it aside.
Step 4: Move the DIMMs
To move the DIMMs, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps.
Page 329
4. Locate the slot where you are installing the DIMM. 5. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
Page 330
1. Loosen the thumbscrew on the controller module side panel.
2. Swing the side panel off the controller module.
Figure callouts: side panel, PCIe card
3. Remove the PCIe card from the old controller module and set it aside.
Make sure that you keep track of which slot the PCIe card was in.
4.
Page 331
7. Close the side panel and tighten the thumbscrew. Step 6: Install the controller After you install the components from the old controller module into the new controller module, you must install the new controller module into the system chassis and boot the operating system.
Page 332
If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Page 333
If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Tighten the thumbscrew on the cam handle on back of the controller module.
Page 334
It is important that you apply the commands in the steps on the correct systems: • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node Steps 1.
Page 335
Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and subsystems whenever you replace the controller. All commands in the diagnostic procedures are issued from the node where the component is being replaced. 1.
Page 336
If you want to run diagnostic tests on… Then…
Individual components:
a. Clear the status logs: sldiag device clearstatus
b. Display the available tests for the selected devices: sldiag device show -dev dev_name
dev_name can be any one of the ports and devices identified in the preceding step.
Page 337
If you want to run diagnostic tests on… Then…
Multiple components at the same time:
a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently.
b. List the individual tests for the device: sldiag device show -dev dev_name
c.
Page 338
If the system-level diagnostics tests… Then…
Were completed without any failures:
a. Clear the status logs: sldiag device clearstatus
b. Verify that the log was cleared: sldiag device status
The following default response is displayed: SLDIAG: No log messages are present.
c.
Page 339
c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the...
Page 340
b. Save any coredumps: system node run -node local-node-name partner savecore
c. Wait for the savecore command to complete before issuing the giveback.
You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s
d.
Page 341
disks to the new controller’s system ID before you return the system to normal operating condition. About this task This procedure applies only to systems in a two-node MetroCluster configuration running ONTAP. You must be sure to issue the commands in this procedure on the correct node: •...
Page 342
5. Verify that the disks (or FlexArray LUNs) were assigned correctly: disk show -a Verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the following example, the disks owned by system-1 now show the new system ID, 118065481: *>...
Page 343
Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
Page 344
1. Verify that the logical interfaces are reporting to their home server and ports: network interface show -is-home false If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support.
Page 345
◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number. 3. If automatic giveback was disabled, reenable it: storage failover modify -node local -auto...
Page 346
6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 347
impaired node storage. About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI.
Page 348
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 349
If the impaired node… Then…
Has not automatically switched over:
Perform a planned switchover operation from the healthy node: metrocluster switchover
Has not automatically switched over, you attempted switchover...:
Review the veto messages and, if possible, resolve the issue and try again.
Page 350
mcc1A::> metrocluster heal -phase root-aggregates
[Job 137] Job succeeded: Heal Root Aggregates is successful
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 351
Figure callouts: thumbscrew, cam handle
5. Pull the cam handle downward and begin to slide the controller module out of the chassis.
Make sure that you support the bottom of the controller module as you slide it out of the chassis.
Step 3: Replace the DIMMs
To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps.
Page 352
a. Open the CPU air duct and locate the NVMEM battery.
Figure callouts: NVMEM battery lock tab, NVMEM battery
b. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
c.
Page 353
8. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
Page 354
11. Push carefully, but firmly, on the top edge of the DIMM until the ejector tabs snap into place over the notches at the ends of the DIMM. 12. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to insert it into the socket.
Page 355
All commands in the diagnostic procedures are issued from the node where the component is being replaced. 1. If the node to be serviced is not at the LOADER prompt, perform the following steps: a. Select the Maintenance mode option from the displayed menu. b.
Page 356
If your node is in… Then…
An HA pair:
Perform a giveback: storage failover giveback -ofnode replacement_node_name
If you disabled automatic giveback, re-enable it with the storage failover modify command.
A two-node MetroCluster configuration:
Proceed to the next step. The MetroCluster switchback procedure is done in the next task in the replacement process.
Page 357
Step 6 (Two-node MetroCluster only): Switch back aggregates After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
Page 358
6. Reestablish any SnapMirror or SnapVault configurations. Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 359
10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 360
44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure. Replace the NVMEM battery - AFF A300 To replace an NVMEM battery in the system, you must remove the controller module from the system, open it, replace the battery, and close and replace the controller module.
Page 361
If the impaired node is displaying… Then…
System prompt or password prompt (enter system password):
Take over or halt the impaired node:
• For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name
When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 362
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 363
controller_A_1::> metrocluster heal -phase aggregates
[Job 130] Job succeeded: Heal Aggregates is successful.
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 364
mcc1A::> metrocluster operation show
Operation: heal-root-aggregates
State: successful
Start Time: 7/29/2016 20:54:41
End Time: 7/29/2016 20:54:42
Errors: -
8. On the impaired controller module, disconnect the power supplies.
Step 2: Open the controller module
To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module.
Page 365
Figure callouts: thumbscrew, cam handle
5. Pull the cam handle downward and begin to slide the controller module out of the chassis.
Make sure that you support the bottom of the controller module as you slide it out of the chassis.
Step 3: Replace the NVMEM battery
To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery.
Page 366
Figure callouts: battery lock tab, NVMEM battery pack
4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module.
5. Remove the replacement battery from its package.
6. Align the tab or tabs on the battery holder with the notches in the controller module side, and then gently push down on the battery housing until the battery housing clicks into place.
Page 367
diagnostic tests on the replaced component. 1. If you are not already grounded, properly ground yourself. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
Page 368
2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4.
Page 369
If your node is in… Then…
Resulted in some test failures: Determine the cause of the problem:
a. Exit Maintenance mode: halt
After you issue the command, wait until the system stops at the LOADER prompt.
b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis:
◦...
Page 370
1. Verify that all nodes are in the enabled state: metrocluster node show
cluster_B::> metrocluster node show
Configuration Group Cluster Node State Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
cluster_A controller_A_1 configured enabled heal roots completed
cluster_B ...
Page 371
Step 8: Return the failed part to NetApp
After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 372
If the impaired node is displaying… Then…
System prompt or password prompt (enter system password):
Take over or halt the impaired node:
• For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name
When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 373
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 374
controller_A_1::> metrocluster heal -phase aggregates
[Job 130] Job succeeded: Heal Aggregates is successful.
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 375
mcc1A::> metrocluster operation show
Operation: heal-root-aggregates
State: successful
Start Time: 7/29/2016 20:54:41
End Time: 7/29/2016 20:54:42
Errors: -
8. On the impaired controller module, disconnect the power supplies.
Step 2: Open the controller module
To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module.
Page 376
Figure callouts: thumbscrew, cam handle
5. Pull the cam handle downward and begin to slide the controller module out of the chassis.
Make sure that you support the bottom of the controller module as you slide it out of the chassis.
Step 3: Replace a PCIe card
To replace a PCIe card, locate it within the controller and follow the specific sequence of steps.
Page 377
4. Remove the PCIe card from the controller module and set it aside. 5. Install the replacement PCIe card. Be sure that you properly align the card in the slot and exert even pressure on the card when seating it in the socket.
Page 378
If your system is in… Then perform these steps…
A two-node MetroCluster configuration:
a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
Tighten the thumbscrew on the cam handle on back of the controller module.
Page 379
This task only applies to two-node MetroCluster configurations.
Steps
1. Verify that all nodes are in the enabled state: metrocluster node show
cluster_B::> metrocluster node show
Configuration Group Cluster Node State Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
cluster_A ...
Page 380
6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 381
Figure callouts: power supply, cam handle release latch, Power and Fault LEDs, cam handle...
Page 382
The power supply LEDs are lit when the power supply comes online. 2. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 383
Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
Page 384
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 385
If the impaired node… Then…
Has automatically switched over:
Proceed to the next step.
Has not automatically switched over:
Perform a planned switchover operation from the healthy node: metrocluster switchover
Has not automatically switched over, you attempted switchover...:
Review the veto messages and, if possible, resolve the issue and try again.
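The decision table above can be expressed as a small lookup for use in a runbook script. This Python sketch is an illustration, not a NetApp tool: the function name `switchover_action` and its boolean parameters are hypothetical, and the operator is assumed to supply the three observations by hand. The returned strings paraphrase the table rows.

```python
def switchover_action(switched_over, switchover_attempted=False, vetoed=False):
    """Map the observed state of the impaired node to the action in the table."""
    if switched_over:
        # Row 1: the impaired node has automatically switched over.
        return "Proceed to the next step."
    if switchover_attempted and vetoed:
        # Row 3: a manual switchover was attempted and vetoed.
        return ("Review the veto messages, resolve the issue if possible, "
                "and try again.")
    # Row 2: no automatic switchover has occurred.
    return ("Perform a planned switchover from the healthy node: "
            "metrocluster switchover")
```

For example, `switchover_action(False, switchover_attempted=True, vetoed=True)` returns the veto-review instruction.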
Page 386
mcc1A::> metrocluster heal -phase root-aggregates
[Job 137] Job succeeded: Heal Root Aggregates is successful
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 387
Figure callouts: thumbscrew, cam handle
5. Pull the cam handle downward and begin to slide the controller module out of the chassis.
Make sure that you support the bottom of the controller module as you slide it out of the chassis.
Step 3: Replace the RTC Battery
To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
Page 388
3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 389
1. If you have not already done so, close the air duct or controller module cover. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
Page 390
pools.
This task only applies to two-node MetroCluster configurations.
Steps
1. Verify that all nodes are in the enabled state: metrocluster node show
cluster_B::> metrocluster node show
Configuration Group Cluster Node State Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
cluster_A ...
Page 391
6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
NetApp video: AFF A320 Installation and setup Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2...
Page 393
Detailed guide - AFF A320 This guide gives detailed step-by-step instructions for installing a typical NetApp system. Use this guide if you want more detailed installation instructions. Prepare for installation To install your AFF A320 system, you need to create an account, register the system, and get license keys.
Page 394
Columns: Type of cable…, Part number and length, Connector type, For…
100 GbE cable (QSFP28):
X66211A-05 (112-00595), 0.5m
X66211A-1 (112-00573), 1m
X66211A-2 (112-00574), 2m
X66211A-5 (112-00574), 5m
For: Storage, cluster interconnect/HA, and Ethernet data (order-dependent)
40 GbE cable:
For: Storage, cluster interconnect/HA, and Ethernet
X66211A-1 (112-00573), 1m;
X66211A-3 (112-00543), 3m;...
Page 395
Cluster Configuration Worksheet Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
Page 396
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around, and try again.
1. You can use the illustration or the step-by-step instructions to complete the cabling between the controllers and to the switches:
Step Perform on each controller module
Page 397
Step Perform on each controller module
If you are using your onboard ports for a data network connection, connect the 100 GbE or 40 GbE cables to the appropriate data network switches:
• e0g and e0h
If you are using your NIC cards for Ethernet or FC connections, connect the NIC card(s) to the appropriate switches:...
Page 398
Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point. 2. Cable your storage: Cabling controllers to drive shelves Option 2: Cabling a switched cluster The optional data ports, optional NIC cards, and management ports on the controller modules are connected to switches.
Page 399
Step Perform on each controller module
Cable the cluster/HA ports to the cluster/HA switch with the 100 GbE (QSFP28) cable:
• e0a on both controllers to the cluster/HA switch
• e0d on both controllers to the cluster/HA switch
If you are using your onboard ports for a data network connection, connect the 100 GbE or 40 GbE cables to the appropriate data network switches:
•...
Page 400
Step Perform on each controller module If you are using your NIC cards for Ethernet or FC connections, connect the NIC card(s) to the appropriate switches: Cable the e0M ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point. 2.
Page 401
Option 1: Cable the controllers to a single drive shelf You must cable each controller to the NSM modules on the NS224 drive shelf. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
Page 402
Step Perform on each controller module Cable controller A to the shelf Cable controller B to the shelf: 2. To complete setting up your system, see Completing system setup and configuration. Option 2: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves.
Page 403
Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. 1.
Page 404
Step Perform on each controller module Cable controller A to the shelves: Cable controller B to the shelves: 2. To complete setting up your system, see Completing system setup and configuration. Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
Page 405
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 5. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 6.
Page 406
Option 2: Completing system setup and configuration if network discovery is not enabled If network discovery is not enabled on your laptop, you must complete the configuration and setup using this task. 1. Cable and configure your laptop or console: a.
Page 407
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 6. Verify the health of your system by running Config Advisor.
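The address format described above can be checked before you open a browser. The following is a minimal Python sketch, not part of the NetApp tooling; the function name node_management_url and the sample address 192.0.2.10 are assumptions made for illustration only.

```python
import ipaddress


def node_management_url(address: str) -> str:
    """Build the https://x.x.x.x URL for a node management IP address.

    Hypothetical helper: it only validates that the input is a dotted-quad
    IPv4 address and formats the URL; it does not contact the node.
    """
    # IPv4Address raises a ValueError subclass for anything that is not x.x.x.x
    ip = ipaddress.IPv4Address(address)
    return f"https://{ip}"


# Example with a documentation-range address (an assumption, not a real node)
print(node_management_url("192.0.2.10"))
```

Validating the address first catches typing mistakes, such as a missing octet, before the browser reports a less specific connection error.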
Page 408
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 409
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
Page 410
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c....
Page 411
Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows yes for all authentication keys:...
Page 412
Option 2: System is in a MetroCluster After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node. Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 413
Step 1: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. 1. If you are not already grounded, properly ground yourself. 2. Unplug the controller module power supply from the power source. 3.
Page 414
Step 2: Replace the boot media You must locate the boot media in the controller module, and then follow the directions to replace it. 1. Open the air duct and locate the boot media using the following illustration or the FRU map on the controller module: 2.
Page 415
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 416
c. Press down and hold the orange tabs on top of the latching mechanism. d. Gently push the controller module into the chassis bay until it is flush with the edges of the chassis. The latching mechanism arms slide into the chassis. The controller module begins to boot as soon as it is fully seated in the chassis.
Page 417
11. When prompted, either enter the name of the image or accept the default image displayed inside the brackets on your screen. 12. After the image is installed, start the restoration process: a. Record the IP address of the impaired node that is displayed on the screen. b.
Page 418
If your system is in… Then… An HA pair After the impaired node is displaying the Waiting for giveback… message, perform a giveback from the healthy node: a. From the healthy node: storage failover giveback -ofnode partner_node_name The impaired node takes back its storage, finishes booting, and then reboots and is again taken over by the healthy node.
Page 419
If your system has… Then… A network connection a. Press y when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
Page 420
If your system has… Then… No network connection and is in a MetroCluster IP configuration a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
Page 421
Restore OKM, NSE, and NVE as needed - AFF A320 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE), or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE is enabled along with Onboard Key Manager, you must restore the settings you captured at the beginning of this procedure.
Page 422
If the console displays… Then… The LOADER prompt Boot the node to the boot menu: boot_ontap menu Waiting for giveback…. a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: y c....
Page 423
9. Confirm the target node is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
Page 424
Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target node. 2. Use the boot_ontap command at the LOADER prompt to boot the node. 3. Check the console output: If the console displays… Then…...
Page 425
-auto-giveback true Return the failed part to NetApp - AFF A320 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 426
system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h Steps 1. If your system has two controller modules, disable the HA pair. If your system is running Then…...
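The maintenance-window message above follows a fixed pattern, so the command line can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: the helper name maintenance_window_command is hypothetical, and the script only builds the command string shown in this document; it does not connect to a cluster or run anything.

```python
def maintenance_window_command(hours: int, node: str = "*") -> str:
    """Return the AutoSupport command that suppresses automatic case creation.

    Hypothetical helper: `hours` is the expected downtime, and the trailing
    "h" follows the MAINT=number_of_hours_downh pattern from the procedure.
    """
    if hours <= 0:
        raise ValueError("hours must be a positive integer")
    return (
        "system node autosupport invoke "
        f"-node {node} -type all -message MAINT={hours}h"
    )


# Reproduces the two-hour example from the procedure
print(maintenance_window_command(2))
```

Paste the printed line at the cluster prompt; the helper exists only to avoid typing errors in the MAINT value.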
Page 427
3. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected. Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized.
Page 428
the locked position. The fan LED should be green after the fan is seated and has spun up to operational speed. 10. Repeat these steps for the remaining fan modules. Step 3: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis.
Page 429
e. Release the latches to lock the controller module into place. f. Recable the power supply. g. If you have not already done so, reinstall the cable management device. h. Interrupt the normal boot process by pressing Ctrl-C. 5. Repeat the preceding steps to install the second controller into the new chassis. Complete the restoration and replacement process - AFF A320 Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match...
Page 430
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 2: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 431
This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A320 You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
Page 432
Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 433
Step 1: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following images or the written steps to remove the controller module from the chassis. The following image shows removing the cables and cable management arms from the impaired controller module: The following image shows removing the impaired controller module from the chassis:...
Page 434
c. Gently pull the controller module a few inches toward you so that you can grasp the controller module sides. d. Using both hands, gently pull the controller module out of the chassis and set it on a flat, stable surface.
Page 435
1. Locate the NVDIMM battery in the controller module. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 3.
Page 436
1. Open the air duct and locate the boot media using the following illustration or the FRU map on the controller module: 2. Locate and remove the boot media from the controller module: a. Press the blue button at the end of the boot media until the lip on the boot media clears the blue button. b.
Page 437
1. Locate the DIMMs on your controller module.
Page 438
Air duct • System DIMMs slots: 2,4, 7, 9, 13, 15, 18, and • NVDIMM slot: 11 The NVDIMM looks significantly different than system DIMMs. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
Page 439
1. Remove the cover over the PCIe risers by unscrewing the blue thumbscrew on the cover, slide the cover toward you, rotate the cover upward, lift it off the controller module, and then set it aside. 2. Remove the empty risers from the replacement controller module. a.
Page 440
replacement controller module, you must install the replacement controller module into the chassis, and then boot it to Maintenance mode. You can use the following illustration or the written steps to install the replacement controller module in the chassis. 1. If you have not already done so, close the air duct at the rear of the controller module and reinstall the cover over the PCIe cards.
Page 441
the low-level system configuration of the replacement controller and reconfigure system settings as necessary. Step 1: Set and verify the system time after replacing the controller module You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration.
Page 442
◦ mccip ◦ non-ha 3. If the displayed system state of the controller module does not match your system configuration, set the state for the controller module: ha-config modify controller ha-state 4. Confirm that the setting has changed: ha-config show Step 3: Run diagnostics After you have replaced a component in your system, you should run diagnostic tests on that component.
Page 443
Step 1: Shut down the node You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration. Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 444
node so that the healthy node continues to serve data from the impaired node storage. • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node;...
Page 445
Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 4. Remove and set aside the cable management devices from the left and right sides of the controller module. 5.
Page 446
Air duct • System DIMMs slots: 2,4, 7, 9, 13, 15, 18, and • NVDIMM slot: 11 The NVDIMM looks significantly different than system DIMMs. 3. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
Page 447
The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot. 7.
Page 448
-node local -auto-giveback true Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 449
Hot-swap a fan module - AFF A320 To swap out a fan module without interrupting service, you must perform a specific sequence of tasks. 1. If you are not already grounded, properly ground yourself. 2. Remove the bezel (if necessary) with two hands, by grasping the openings on each side of the bezel, and then pulling it toward you until the bezel releases from the ball studs on the chassis frame.
Page 450
10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. Replace an NVDIMM - AFF A320 You must replace the NVDIMM in the controller module when your system registers that the flash lifetime is almost at an end or that the identified NVDIMM is not healthy in general;...
Page 451
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 452
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 453
a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. a.
Page 454
2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
Page 455
The latching mechanism arms slide into the chassis. The controller module begins to boot as soon as it is fully seated in the chassis. e. Release the latches to lock the controller module into place. f. Recable the power supply. g.
Page 456
Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 457
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 458
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 459
a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. a.
Page 460
1. Open the air duct and locate the NVDIMM battery. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 3.
Page 461
-node local -auto-giveback true Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 462
procedure. Replace a PCIe card - AFF A320 To replace a PCIe card, you must disconnect the cables from the cards, remove the SFP and QSFP modules from the cards before removing the riser, reinstall the riser, and then reinstall the SFP and QSFP modules before cabling the cards. •...
Page 463
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 464
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 465
a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. a.
Page 466
1. Remove the cover over the PCIe risers by unscrewing the blue thumbscrew on the cover, slide the cover toward you, rotate the cover upward, lift it off the controller module, and then set it aside. 2. Remove the riser with the failed PCIe card: a.
Page 467
d. Reinstall the PCIe riser cover on the controller module. Step 4: Install the controller module After you have replaced the component in the controller module, you must reinstall the controller module into the chassis, and then boot it to Maintenance mode. 1.
Page 468
-node local -auto-giveback true Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 469
Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 470
Replace the real-time clock battery - AFF A320 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization continue to function. • You can use this procedure with all versions of ONTAP supported by your system •...
Page 471
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 472
If the impaired node is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 473
a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. a.
Page 474
1. Remove the PCIe cover. a. Unscrew the blue thumbscrew located above the onboard ports at the back of the controller module. b. Slide the cover toward you and rotate the cover upward. c. Remove the cover and set it aside. 2.
Page 475
-node local -auto-giveback true Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 476
This guide gives graphic instructions for a typical installation of your system from racking and cabling, through initial system bring-up. Use this guide if you are familiar with installing NetApp systems. Access the Installation and Setup Instructions PDF poster: AFF A400 Installation and Setup Instructions Videos - AFF A400 There are two videos;...
Page 477
NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed guide - AFF A400 This guide gives detailed step-by-step instructions for installing a typical NetApp system. Use this guide if you want more detailed installation instructions.
Page 478
Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
Page 479
1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit. You need to be aware of the safety concerns associated with the weight of the system. 3. Attach cable management devices (as shown). 4.
Page 480
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Two-node switchless cluster cabling 2.
Page 481
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Switched cluster cabling 2.
Page 482
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustration to cable your controllers to a single drive shelf. Cabling the controllers to one NS224 drive shelf 2.
Page 483
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustration to cable your controllers to two drive shelves. Cabling controllers to two NS224 drive shelves 2.
Page 484
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following illustration to cable your controllers to two drive shelves. Cabling the controllers to SAS drive shelves...
Page 485
2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
Page 486
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 6. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 7.
Page 487
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 4. Set up your account and download Active IQ Config Advisor: a.
Page 488
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 489
Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 490
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
Page 491
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c....
Page 492
Shut down the impaired controller - AFF A400 Option 1: Most configurations After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node. Steps 1. If the impaired node isn’t at the LOADER prompt: If the impaired node displays…...
Page 493
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 494
If the impaired node… Then… Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support....
Page 495
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
Page 496
NetApp Support Site. You must log into the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the boot media.
Page 497
Figure callouts: locking tabs; slide air duct toward back of controller; rotate air duct up. a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
Page 498
Figure callouts: press blue button; rotate boot media up and remove from socket. a. Press the blue button at the end of the boot media until the lip on the boot media clears the blue button. b. Rotate the boot media up and gently pull the boot media out of the socket. 3.
Page 499
Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
Page 500
7. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source. b. Firmly push the controller module into the chassis until it meets the midplane and is fully seated. The locking latches rise when the controller module is fully seated.
Page 501
d. Save the environment variables you changed with the savenv command. e. Confirm your changes using the printenv variable-name command. 10. If the controller is in a stretch or fabric-attached MetroCluster, you must restore the FC adapter configuration: a. Boot to Maintenance mode: boot_ontap maint b.
Page 502
If your system has… Then… No network connection a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Select the Update flash from backup config (sync flash) option from the displayed menu. If you are prompted to continue with the update, press y.
Page 503
Option 2: Controller is in a two-node MetroCluster You must boot the ONTAP image from the USB drive and verify the environmental variables. This procedure applies to systems in a two-node MetroCluster configuration. Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive.
Page 504
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 505
Restore OKM, NSE, and NVE as needed - AFF A400 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE), or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE is enabled along with Onboard Key Manager, you must restore the settings you captured at the beginning of this procedure.
Page 506
--------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to Waiting for giveback… prompt. 8. Move the console cable to the partner node and login as "admin". 9.
Page 507
c. Enter the security key-manager key query command to see a detailed view of all keys stored in the onboard key manager, and verify that the Restored column = yes/true for all authentication keys. If the Restored column = anything other than yes/true, contact Customer Support. d.
Page 508
-auto-giveback true Return the failed part to NetApp - AFF A400 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 509
Chassis Replace the chassis - AFF A400 To replace the chassis, you must move the fans and controller modules from the impaired chassis to the new chassis of the same model as the impaired chassis. All other components in the system must be functioning properly; if not, you must contact technical support. •...
Page 510
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 511
command from the surviving cluster. controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 512
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Move and replace hardware - AFF A400 Step 1: Remove the controller modules To replace the chassis, you must remove the controller modules from the old chassis.
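The `metrocluster operation show` output above can also be checked by a script when console output is being captured. The following is a minimal sketch that assumes the live console prints one `Field: value` pair per line (the page extract above flattens them onto one line). The function names are hypothetical, and requiring `Errors` to be `-` is an assumption taken from the sample output.

```python
def parse_operation_show(output: str) -> dict:
    """Split 'Field: value' lines into a dictionary keyed by field name."""
    fields = {}
    for line in output.splitlines():
        # Partition on the first colon only, so time values keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output: str, expected_operation: str = "heal-root-aggregates") -> bool:
    """True when the expected operation finished successfully with no errors listed."""
    fields = parse_operation_show(output)
    return (
        fields.get("Operation") == expected_operation
        and fields.get("State") == "successful"
        and fields.get("Errors") == "-"
    )
```

Checking the operation name as well as the state guards against reading the result of an earlier, unrelated operation.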
Page 513
that it does not swing out of the chassis. The fan modules are short. Always support the bottom of the fan module with your free hand so that it does not suddenly drop free from the chassis and injure you. 5.
Page 514
2. Recable the console to the controller module, and then reconnect the management port. 3. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source.
Page 515
▪ mccip ▪ non-ha b. Confirm that the setting has changed: ha-config show 3. If you have not already done so, recable the rest of your system. 4. Reinstall the bezel on the front of the system. Step 2: Run diagnostics After you have replaced a component in your system, you should run diagnostic tests on that component.
Page 516
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 517
Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 518
necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node;...
Page 519
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 520
If the impaired node… | Then…
Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed | Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
Page 521
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
Page 522
6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface. 8. On the replacement controller module, open the air duct and remove the empty risers from the controller module using the animation, illustration, or the written steps: Removing the empty risers from the replacement controller module...
Page 523
a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
Page 524
a. Rotate the cam handle so that it can be used to pull the power supply out of the chassis. b. Press the blue locking tab to release the power supply from the chassis. c. Using both hands, pull the power supply out of the chassis, and then set it aside. 2.
Page 525
1. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
Page 526
You can use the following animation, illustration, or the written steps to move the boot media from the impaired controller module to the replacement controller module. Moving the boot media 1. Locate and remove the boot media from the controller module: a.
Page 527
Step 5: Move the PCIe risers and mezzanine card As part of the controller replacement process, you must move the PCIe risers and mezzanine card from the impaired controller module to the replacement controller module. You can use the following animations, illustrations, or the written steps to move the PCIe risers and mezzanine card from the impaired controller module to the replacement controller module.
Page 528
2. Remove riser number 3, remove the mezzanine card, and install both into the replacement controller module: a. Remove any SFP or QSFP modules that might be in the PCIe cards. b. Rotate the riser locking latch on the left side of the riser up and toward the air duct. The riser raises up slightly from the controller module.
Page 529
1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 3. Verify that the NVDIMM battery is not plugged into the new controller module. 4.
Page 530
b. Locate the corresponding DIMM slot on the replacement controller module. c. Make sure that the DIMM ejector tabs on the DIMM socket are in the open position, and then insert the DIMM squarely into the socket. The DIMMs fit tightly in the socket, but should go in easily. If not, realign the DIMM with the socket and reinsert it.
Page 531
You will connect the rest of the cables to the controller module later in this procedure. 4. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source.
Page 532
1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt. 2. On the healthy node, check the system time: show date The date and time are given in GMT. 3. At the LOADER prompt, check the date and time on the replacement node: show date The date and time are given in GMT.
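Steps 2 and 3 above collect two GMT readings, presumably so that the replacement node can be corrected if it differs from the healthy node. The following is a minimal sketch of that comparison. The timestamp format, the one-minute tolerance, and the function names are all assumptions made for illustration; this document does not show what `show date` prints, so the readings are assumed to be transcribed by hand.

```python
from datetime import datetime, timedelta, timezone

# Assumption: readings are transcribed by hand into this format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_gmt(text: str) -> datetime:
    """Parse a transcribed reading and mark it as GMT/UTC."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def clock_skew(healthy: str, replacement: str) -> timedelta:
    """Absolute difference between the two nodes' readings."""
    return abs(parse_gmt(healthy) - parse_gmt(replacement))


def needs_reset(healthy: str, replacement: str,
                tolerance: timedelta = timedelta(minutes=1)) -> bool:
    """True when the replacement node differs by more than the tolerance.

    The default tolerance is a hypothetical value, not a NetApp requirement.
    """
    return clock_skew(healthy, replacement) > tolerance
```

Because the two readings are taken a few seconds apart on different consoles, a small tolerance avoids flagging a difference that is only due to the time spent switching consoles.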
Page 533
1. If the node to be serviced is not at the LOADER prompt, reboot the node: system node halt -node node_name After you issue the command, you should wait until the system stops at the LOADER prompt. 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags 3.
Page 534
This procedure applies only to systems running ONTAP in an HA pair. 1. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode and go to the LOADER prompt: halt 2. From the LOADER prompt on the replacement node, boot the node, entering y if you are prompted to override the system ID due to a system ID mismatch: boot_ontap...
Page 535
If the giveback is vetoed, you can consider overriding the vetoes. Find the High-Availability Configuration Guide for your version of ONTAP 9 b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show The output from the storage failover show...
Page 536
Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
Page 537
If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
Page 538
1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
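The check in step 1 can be sketched as a small parser over captured `metrocluster node show` output. This is an illustration under stated assumptions, not a supported tool: it assumes that node rows follow the dashed separator line, that each node row starts with the node name followed by the configuration state and the mirroring state, and that cluster-only rows have fewer than three columns. The function names are hypothetical.

```python
def parse_node_rows(output: str) -> list:
    """Return one dict per node row found after the dashed separator line."""
    rows = []
    in_table = False
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if all(set(token) == {"-"} for token in tokens):
            in_table = True  # dashed separator: data rows follow
            continue
        if not in_table or line.strip().endswith("displayed."):
            continue
        if len(tokens) >= 3:  # shorter rows only name the group or cluster
            node, state, mirroring = tokens[:3]
            rows.append({
                "node": node,
                "state": state,
                "mirroring": mirroring,
                "mode": " ".join(tokens[3:]),
            })
    return rows


def all_nodes_enabled(output: str) -> bool:
    """True when at least one node row exists and every node is configured and enabled."""
    rows = parse_node_rows(output)
    return bool(rows) and all(
        row["state"] == "configured" and row["mirroring"] == "enabled"
        for row in rows
    )
```

Returning False for empty output is deliberate: a capture that contains no node rows should be treated as a failed check, not as a pass.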
Page 539
Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 540
If the impaired node is displaying… | Then…
The LOADER prompt | Go to the next step.
Waiting for giveback… | Press Ctrl-C, and then respond y when prompted.
System prompt or password prompt (enter system password) | Take over or halt the impaired node: •...
Page 541
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 542
controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 543
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis.
Page 544
4. Remove the cable management device from the controller module and set it aside. 5. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis. 6.
Page 545
The DIMMs are located in sockets 2, 4, 13, and 15. The NVDIMM is located in slot 11. 1. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b.
Page 546
5. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 6. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
Page 547
the following sections. You will connect the rest of the cables to the controller module later in this procedure. 4. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source.
Page 548
5. Select an option from the displayed sub-menu and run the test. 6. Proceed based on the result of the preceding step: ◦ If the test failed, correct the failure, and then rerun the test. ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 6: Restore the controller module to operation after running diagnostics After completing diagnostics, you must recable the system, give back the controller module, and then reenable automatic giveback.
Page 549
6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 550
You can use the following animation, illustration, or the written steps to hot-swap a fan module. Replacing a fan 1. If you are not already grounded, properly ground yourself. 2. Remove the bezel (if necessary) with two hands, by grasping the openings on each side of the bezel, and then pulling it toward you until the bezel releases from the ball studs on the chassis frame.
Page 551
10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 552
If the impaired node is displaying… | Then…
The LOADER prompt | Go to the next step.
Waiting for giveback… | Press Ctrl-C, and then respond y when prompted.
System prompt or password prompt (enter system password) | Take over or halt the impaired node: •...
Page 553
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 554
controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 555
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis.
Page 556
4. Remove the cable management device from the controller module and set it aside. 5. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis. 6.
Page 557
1. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
Page 558
3. Cable the management and console ports only, so that you can access the system to perform the tasks in the following sections. You will connect the rest of the cables to the controller module later in this procedure. 4. Complete the installation of the controller module: a.
Page 559
5. Proceed based on the result of the preceding step: ◦ If the scan shows problems, correct the issue, and then rerun the scan. ◦ If the scan reported no failures, select Reboot from the menu to reboot the system. Step 6: Restore the controller module to operation after running diagnostics After completing diagnostics, you must recable the system, give back the controller module, and then reenable automatic giveback.
Page 560
6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 561
Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration. Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 562
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 563
1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2. Depending on whether an automatic switchover has occurred, proceed according to the following table: If the impaired node… Then…...
Page 564
controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
Page 565
LED turns off. • Although the contents of the NVDIMM are encrypted, it is a best practice to erase the contents of the NVDIMM before replacing it. For more information, see the Statement of Volatility on the NetApp Support...
Page 566
Site. You must log in to the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the NVDIMM. The animation shows empty slots for sockets without DIMMs. These empty sockets are populated with blanks.
Page 567
2. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside. Carefully hold the NVDIMM by the edges to avoid pressure on the components on the NVDIMM circuit board.
Page 568
1. If you have not already done so, close the air duct. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
Page 569
After you issue the command, you should wait until the system stops at the LOADER prompt. 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags 3. Select Scan System from the displayed menu to enable running the diagnostics tests. 4.
Page 570
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 571
Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 572
If the impaired node is displaying… | Then…
Waiting for giveback… | Press Ctrl-C, and then respond y when prompted.
System prompt or password prompt (enter system password) | Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
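The table above is a lookup from what the impaired node displays to the next action. The following is a minimal sketch of that lookup for use in a runbook script. The enum and function names are hypothetical. The LOADER row comes from the same table as it appears elsewhere in this document, and the halt command for a non-HA system is left unspecified because it is truncated in the source.

```python
from enum import Enum


class Display(Enum):
    LOADER = "LOADER prompt"
    WAITING_FOR_GIVEBACK = "Waiting for giveback"
    SYSTEM_OR_PASSWORD = "System prompt or password prompt"


def next_action(display: Display, is_ha_pair: bool,
                impaired_node: str = "impaired_node_name") -> str:
    """Return the action that the table lists for the given console state."""
    if display is Display.LOADER:
        return "Go to the next step."
    if display is Display.WAITING_FOR_GIVEBACK:
        return "Press Ctrl-C, and then respond when prompted."
    if is_ha_pair:
        # Run from the healthy node, as the table states.
        return f"storage failover takeover -ofnode {impaired_node}"
    # The command for this case is truncated in the source, so it is not guessed here.
    return "Halt the impaired node (see the full procedure for the command)."
```

The function returns text instead of running anything, so an operator still issues each command by hand on the correct console.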
Page 573
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 574
controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 575
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis.
Page 576
4. Remove the cable management device from the controller module and set it aside. 5. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis. 6.
Page 577
The riser raises up slightly from the controller module. d. Lift the riser straight up and set it aside on a stable, flat surface. 2. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe card. b.
Page 578
1. Remove riser number 3 (slots 4 and 5): a. Open the air duct by pressing the locking tabs on the sides of the air duct, slide it toward the back of the controller module, and then rotate it to its completely open position. b.
Page 579
c. Rotate the latch down flush with the sheet metal on the riser. Step 5: Install the controller module After you have replaced the component in the controller module, you must re-install the controller module into the chassis, and then boot it to Maintenance mode. You can use the following animation, illustration, or the written steps to install the controller module in the chassis.
Page 580
Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. c.
Page 581
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 582
Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 583
Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 584
necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node;...
Page 585
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 586
If the impaired node… | Then…
Has not automatically switched over | Perform a planned switchover operation from the healthy node: metrocluster switchover
Has not automatically switched over, you attempted switchover… | Review the veto messages and, if possible, resolve the issue and try again.
Page 587
mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal Root Aggregates is successful If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
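The job result line printed by `metrocluster heal` (for example, `[Job 137] Job succeeded: ...` above) can be picked out of a console log with a regular expression. The following is a minimal sketch. The function names are hypothetical, and only the `succeeded` wording shown in this document is relied on; the wording of a failed or vetoed job is not shown here and is not assumed.

```python
import re

# Matches lines such as "[Job 137] Job succeeded: Heal Root Aggregates is successful".
JOB_LINE = re.compile(r"\[Job (?P<id>\d+)\] Job (?P<status>\w+): (?P<message>.+)")


def parse_job_line(line: str):
    """Return the job id, status word, and message, or None if the line does not match."""
    match = JOB_LINE.search(line)
    if match is None:
        return None
    return {
        "id": int(match["id"]),
        "status": match["status"],
        "message": match["message"].strip(),
    }


def job_succeeded(line: str) -> bool:
    """True only for a job line whose status word is 'succeeded'."""
    job = parse_job_line(line)
    return job is not None and job["status"] == "succeeded"
```

Any status other than `succeeded`, including a missing job line, is treated as not successful, so the operator is sent back to the console output instead of proceeding on a guess.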
Page 588
1. If you are not already grounded, properly ground yourself. 2. Release the power cable retainers, and then unplug the cables from the power supplies. 3. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
Page 589
1. If you are not already grounded, properly ground yourself. 2. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
Page 590
5. Close the air duct. Step 4: Reinstall the controller module and set the time/date after RTC battery replacement After you replace a component within the controller module, you must reinstall the controller module in the system chassis, reset the time and date on the controller, and then boot it.
Page 591
The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. b. Fully seat the controller module in the chassis by rotating the locking latches upward, tilting them so that they clear the locking pins, gently push the controller all the way in, and then lower the locking latches into the locked position.
Page 592
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 593
Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 594
This guide gives graphic instructions for a typical installation of your system from racking and cabling, through initial system bring-up. Use this guide if you are familiar with installing NetApp systems. Access the Installation and Setup Instructions PDF poster: AFF A700 Installation and Setup Instructions...
Page 595
Use this guide if you want more detailed installation instructions. Step 1: Prepare for installation To install your system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
Page 596
Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
Page 597
Steps 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit. You need to be aware of the safety concerns associated with the weight of the system. The label on the left indicates an empty chassis, while the label on the right indicates a fully populated system.
Page 598
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Cabling a two-node switchless cluster 1.
Page 599
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Switched cluster cabling 1.
Page 600
Steps 1. Use the following animations or illustrations to cable your drive shelves to your controllers. The examples use DS224C shelves. Cabling is similar with other supported SAS drive shelves. ◦ Cabling SAS shelves in FAS9000, AFF A700, and ASA AFF A700, ONTAP 9.7 and earlier: Cabling SAS storage - ONTAP 9.7 and earlier...
Page 601
◦ Cabling SAS shelves in FAS9000, AFF A700, and ASA AFF A700, ONTAP 9.8 and later: Cabling SAS storage - ONTAP 9.8 and later...
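The two SAS cabling guides are selected by ONTAP release: 9.7 and earlier, or 9.8 and later. The following is a minimal sketch of that selection. The function names are hypothetical, and the release string is assumed to be purely numeric and dot-separated (for example `9.8` or `9.10.1`); strings with patch suffixes are rejected instead of guessed at.

```python
def parse_release(version: str) -> tuple:
    """Turn '9.8' into (9, 8) and '9.10.1' into (9, 10, 1).

    Raises ValueError for any non-numeric component.
    """
    return tuple(int(part) for part in version.split("."))


def sas_cabling_guide(version: str) -> str:
    """Return the title of the SAS cabling guide that applies to the release."""
    if parse_release(version) >= (9, 8):
        return "Cabling SAS storage - ONTAP 9.8 and later"
    return "Cabling SAS storage - ONTAP 9.7 and earlier"
```

Comparing integer tuples instead of strings matters here: as text, `9.10.1` sorts before `9.8`, which would select the wrong guide.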
Page 602
If you have more than one drive shelf stack, see the Installation and Cabling Guide for your drive shelf type. Install and cable shelves for a new system installation - shelves with IOM12 modules...
Page 603
2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 2: Cable the controllers to a single NS224 drive shelf in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelf on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
Page 604
• The systems must have at least one X91148A module installed in slots 3 and/or 7 for each controller. The animation or illustrations show this module installed in both slots 3 and 7. • Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the storage modules are up, while the pull-tabs on the shelves are down.
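The module prerequisite above can be checked against a hand-written inventory before cabling begins. The following is a minimal sketch. The inventory structure (controller name mapped to a dictionary of slot number to part number) and the function names are hypothetical; only the part number X91148A and slots 3 and 7 come from this document.

```python
REQUIRED_MODULE = "X91148A"
ELIGIBLE_SLOTS = (3, 7)


def has_required_module(slots: dict) -> bool:
    """True when the required module is installed in slot 3, slot 7, or both.

    `slots` maps a slot number to the part number installed in that slot.
    """
    return any(slots.get(slot) == REQUIRED_MODULE for slot in ELIGIBLE_SLOTS)


def controllers_missing_module(inventory: dict) -> list:
    """Return the names of controllers that do not meet the prerequisite."""
    return sorted(
        name for name, slots in inventory.items()
        if not has_required_module(slots)
    )
```

For example, a controller with the module only in slot 5 would be reported as missing the prerequisite, because the requirement names slots 3 and 7 specifically.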
Page 606
2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 3: Cable the controllers to two NS224 drive shelves in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelves on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
Page 607
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustrations to cable your controllers to two NS224 drive shelves. Cabling two NS224 shelves - ONTAP 9.8 and later...
Page 608
2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
Page 609
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp...
Page 610
Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 9. Verify the health of your system by running Config Advisor. 10. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
Page 611
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 7. Set up your account and download Active IQ Config Advisor: a.
Page 612
◦ If the impaired node is at the LOADER prompt and is part of an HA configuration, log in as admin on the healthy node. ◦ If the impaired node is in a standalone configuration and at the LOADER prompt, contact NetApp Support. mysupport.netapp.com...
Page 613
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired node, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 614
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
Page 615
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
Page 616
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 617
Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c.
Page 618
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c.
Page 619
If the impaired node displays "Waiting for giveback…", press Ctrl-C, and then respond when prompted. If it displays the system prompt or password prompt (enter system password), take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
Page 620
If the impaired node is displaying the LOADER prompt, go to the next step. If it is displaying "Waiting for giveback…", press Ctrl-C, and then respond when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired node: •...
Page 621
If the impaired node is displaying "Waiting for giveback…", press Ctrl-C, and then respond when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
Page 622
Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
Page 623
Controller module cover locking button Step 2: Replace the boot media Locate the boot media using the following illustration or the FRU map on the controller module:...
Page 624
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 625
4. Push the controller module all the way into the system, making sure that the cam handle clears the USB flash drive, firmly push the cam handle to finish seating the controller module, and then push the cam handle to the closed position. The node begins to boot as soon as it is completely installed into the chassis.
Page 626
◦ If you are configuring manual connections: ifconfig e0a -addr=filer_addr -mask=netmask -gw=gateway -dns=dns_addr -domain=dns_domain ▪ filer_addr is the IP address of the storage system. ▪ netmask is the network mask of the management network that is connected to the HA partner. ▪ gateway is the gateway for the network. ▪...
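As a minimal sketch of how the manual-connection line above could be assembled before typing it at the LOADER prompt, the following Python helper fills in the placeholders named in the text (filer_addr, netmask, gateway, dns_addr, dns_domain). The function name and the sample addresses are hypothetical and exist only for illustration; the helper only formats a string and does not contact any system.

```python
import ipaddress


def build_loader_ifconfig(filer_addr, netmask, gateway, dns_addr, dns_domain, port="e0a"):
    """Return the LOADER ifconfig line for a manual network configuration.

    The helper name and defaults are illustrative assumptions; the option
    names (-addr, -mask, -gw, -dns, -domain) come from the procedure text.
    """
    # Reject obviously malformed values early; ip_address raises ValueError.
    for value in (filer_addr, netmask, gateway, dns_addr):
        ipaddress.ip_address(value)
    return (
        f"ifconfig {port} -addr={filer_addr} -mask={netmask} "
        f"-gw={gateway} -dns={dns_addr} -domain={dns_domain}"
    )


if __name__ == "__main__":
    # Sample values from the documentation address range, not from a real system.
    print(build_loader_ifconfig("192.0.2.10", "255.255.255.0", "192.0.2.1",
                                "192.0.2.53", "example.com"))
```

Building the line this way keeps a space between each option, which is easy to lose when the command is copied from a flattened page.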
Page 627
If your system has a network connection: a. Press y when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
Page 628
If your system has no network connection and is in a MetroCluster IP configuration: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
Page 629
◦ If your system does not have onboard keymanager, NSE or NVE configured, complete the steps in this section. 6. From the LOADER prompt, enter the boot_ontap command. If you see the login prompt, go to the next step. If you see Waiting for giveback…
Page 630
b. Check the environment variable settings with the printenv command. c. If an environment variable is not set as expected, modify it with the setenv environment-variable-name changed-value command. d. Save your changes using the saveenv command. e. Reboot the node. Switch back aggregates in a two-node MetroCluster configuration - AFF A700 and FAS9000 After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation.
Page 631
Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 632
If the console displays the LOADER prompt, boot the node to the boot menu: boot_ontap menu If the console displays "Waiting for giveback…": a. Enter Ctrl-C at the prompt. b. At the message "Do you wish to halt this node rather than wait [y/n]?", enter: c.
Page 633
9. Confirm the target node is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
Page 634
18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home node and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
Page 635
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 636
If the console displays "Waiting for giveback…": a. Log into the partner node. b. Confirm the target node is ready for giveback with the storage failover show command. 4. Move the console cable to the partner node and give back the target node storage using the storage command.
Page 637
Return the failed part to NetApp - AFF A700 and FAS9000 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 638
Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2.
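The MAINT message above follows a fixed pattern, so a small helper can produce it for any maintenance window. This is a sketch only: the function name is hypothetical, and it simply formats the command string shown in the text (MAINT=number_of_hours_down followed by h) for pasting into the cluster shell.

```python
def build_maint_command(hours: int) -> str:
    """Return the AutoSupport command that suppresses case creation for `hours` hours.

    Hypothetical helper: it only formats the command from the procedure text.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours < 1:
        raise ValueError("hours must be a positive whole number")
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


if __name__ == "__main__":
    # Reproduces the two-hour example from the step above.
    print(build_maint_command(2))
```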
Page 639
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 640
If the impaired node has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed: Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
Page 641
If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
Page 642
Orange release button. Caching module cam handle. a. Press the orange release button on the front of the caching module. Do not use the numbered and lettered I/O cam latch to eject the caching module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the caching module.
Page 643
front of the NVRAM module in slot 6-1 in the rear of the system. To replace or add the core dump module, locate slot 6-1, and then follow the specific sequence of steps to add or replace it. Before you begin •...
Page 644
Do not use the numbered and lettered I/O cam latch to eject the core dump module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the core dump module. c. Rotate the cam handle until the core dump module begins to slide out of the NVRAM10 module. d.
Page 645
cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
Page 646
Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 647
::> system controller slot module replace -node node1 -slot 6-2 Warning: NVMe module in slot 6-2 of the node node1 will be powered off for replacement. Do you want to continue? (y|n): `y` The module has been successfully powered off. It can now be safely replaced.
Page 648
Orange release button. Caching module cam handle. a. Press the orange release button on the front of the caching module. Do not use the numbered and lettered I/O cam latch to eject the caching module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the caching module.
Page 649
If you replace the caching module with a caching module from a different vendor, the new vendor name is displayed in the command output. 9. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 650
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the...
Page 651
"Returning SEDs to unprotected mode" section of Administration overview with the CLI. • You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node. Steps 1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2.
Page 652
controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
Page 653
chassis. When removing a power supply, always use two hands to support its weight. Locking button 4. Repeat the preceding steps for any remaining power supplies. Step 2: Remove the fans To remove the fan modules when replacing the chassis, you must perform a specific sequence of tasks.
Page 654
Orange release button 3. Set the fan module aside. 4. Repeat the preceding steps for any remaining fan modules. Step 3: Remove the controller module To replace the chassis, you must remove the controller module or modules from the old chassis.
Page 655
Cam handle release button Cam handle 3. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 4.
Page 656
NVRAM module when moving it to a new chassis. 1. Unplug any cabling associated with the target I/O module. Make sure that you label the cables so that you know where they came from. 2. Remove the target I/O module from the chassis: a.
Page 657
Step 5: Remove the De-stage Controller Power Module Steps You must remove the de-stage controller power modules from the old chassis in preparation for installing the replacement chassis. 1. Press the orange locking button on the module handle, and then slide the DCPM module out of the chassis.
Page 658
5. Slide the chassis all the way into the equipment rack or system cabinet. 6. Secure the front of the chassis to the equipment rack or system cabinet, using the screws you removed from the old chassis. 7. Secure the rear of the chassis to the equipment rack or system cabinet. 8.
Page 659
Step 10: Install I/O modules Steps To install I/O modules, including the NVRAM/FlashCache modules from the old chassis, follow the specific sequence of steps. You must have the chassis installed so that you can install the I/O modules into the corresponding slots in the new chassis.
Page 660
2. Recable the console to the controller module, and then reconnect the management port. 3. Connect the power supplies to different power sources, and then turn them on. 4. With the cam handle in the open position, slide the controller module into the chassis and firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle until it clicks into the locked position.
Page 661
3. If you have not already done so, recable the rest of your system. 4. Exit Maintenance mode: halt The LOADER prompt appears. Step 2: Running system-level diagnostics After installing a new chassis, you should run interconnect diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics. All commands in the diagnostic procedures are issued from the node where the component is being replaced.
Page 662
If the system-level diagnostics tests were completed without any failures: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
Page 663
If the system-level diagnostics tests resulted in some test failures, determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
Page 664
6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 665
(referred to in this procedure as the “impaired node”). • If your system is in a MetroCluster configuration, you must review the section Choosing the correct recovery procedure to determine whether you should use this procedure. If this is the procedure you should use, note that the controller replacement procedure for a node in a four or eight node MetroCluster configuration is the same as that in an HA pair.
Page 666
local -auto-giveback false 3. Take the impaired node to the LOADER prompt: If the impaired node is displaying the LOADER prompt, go to the next step. If it is displaying "Waiting for giveback…", press Ctrl-C, and then respond when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired node: •...
Page 667
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 668
controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 669
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace the controller module hardware - AFF A700 and FAS9000 To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
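The metrocluster operation show output quoted at the top of this page is a set of labelled fields, so the heal check can be scripted once the output has been captured as text. The following Python sketch assumes only the five field labels visible in that sample (Operation, State, Start Time, End Time, Errors); the function names are hypothetical, and how the text is captured from the cluster is left out.

```python
import re

# Field labels exactly as they appear in the sample output quoted above.
_LABELS = ("Operation", "State", "Start Time", "End Time", "Errors")
_LABEL_RE = re.compile(r"\b(" + "|".join(re.escape(name) for name in _LABELS) + r"):\s*")


def parse_operation_show(output: str) -> dict:
    """Split captured `metrocluster operation show` text into a label -> value dict."""
    parts = _LABEL_RE.split(output)
    # parts[0] is whatever precedes the first label (for example the prompt line);
    # after that, labels and values alternate.
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}


def heal_succeeded(output: str, operation: str = "heal-root-aggregates") -> bool:
    """Return True when the named operation is reported with State: successful."""
    fields = parse_operation_show(output)
    return fields.get("Operation") == operation and fields.get("State") == "successful"


if __name__ == "__main__":
    sample = (
        "mcc1A::> metrocluster operation show "
        "Operation: heal-root-aggregates State: successful "
        "Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: -"
    )
    print(parse_operation_show(sample))
    print(heal_succeeded(sample))
```

Because the parser splits on the labels rather than on line breaks, it accepts both the flattened form shown here and the multi-line form printed by the cluster shell.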
Page 670
Cam handle release button Cam handle 1. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2.
Page 671
Controller module cover locking button Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller and insert it in the new controller. Steps 1. Lift the black air duct at the back of the controller module and then locate the boot media using the following illustration or the FRU map on the controller module:...
Page 672
Press release tab Boot media 2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
Page 673
2. Locate the DIMMs on your controller module. 3. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 4. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot.
Page 674
The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot. 7.
Page 675
The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. c. Rotate the locking latches upward, tilting them so that they clear the locking pins, and then lower them into the locked position.
Page 676
1. In Maintenance mode from the new controller module, verify that all components display the same state: ha-config show The value for HA-state can be one of the following: ◦ ◦ ◦ mcc-2n ◦ mccip ◦ non-ha a. Confirm that the setting has changed: ha-config show Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and...
Page 677
If you want to run diagnostic tests on individual components: a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
Page 678
If you want to run diagnostic tests on multiple components at the same time: a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
Page 679
If the system-level diagnostics tests were completed without any failures: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
Page 680
If the system-level diagnostics tests resulted in some test failures, determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 681
1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find.
Page 682
appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
Page 683
Complete system restoration - AFF A700 and FAS9000 To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
Page 684
If the node is in a MetroCluster configuration and all nodes at a site have been replaced, license keys must be installed on the replacement node or nodes prior to switchback. 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses.
Page 685
If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
Page 686
6. Reestablish any SnapMirror or SnapVault configurations. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 687
You must dispose of batteries according to the local regulations regarding battery recycling or disposal. If you cannot properly dispose of batteries, you must return the batteries to NetApp, as described in the RMA instructions that are shipped with the kit.
Page 688
Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 689
If the impaired node is displaying "Waiting for giveback…", press Ctrl-C, and then respond when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
Page 690
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 691
controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
Page 692
mcc1A::> metrocluster operation show Operation: heal-root-aggregates State: successful Start Time: 7/29/2016 20:54:41 End Time: 7/29/2016 20:54:42 Errors: - 8. On the impaired controller module, disconnect the power supplies. Step 2: Open the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module.
Page 693
Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
Page 694
Controller module cover locking button Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the DIMMs on your controller module. Each system memory DIMM has an LED located on the board next to each DIMM slot.
Page 695
3. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
Page 696
DIMM ejector tabs DIMM 4. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 5.
Page 697
Step 4: Install the controller After you install the components into the controller module, you must install the controller module back into the system chassis and boot the operating system. For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
Page 698
b. After the node boots to Maintenance mode, halt the node: halt After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: ▪ A prompt warning that when entering Maintenance mode in an HA configuration, you must ensure that the healthy node remains down.
Page 699
If the system-level diagnostics Then… tests… An HA pair Perform a give back: storage failover giveback -ofnode replacement_node_name If you disabled automatic giveback, re-enable it with the storage failover modify command. A two-node MetroCluster Proceed to the next step. configuration The MetroCluster switchback procedure is done in the next task in the replacement process.
Page 700
If the system-level diagnostics tests resulted in some test failures, determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
Page 701
1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
Page 702
Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
Page 703
7. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 704
Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration. Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 705
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 706
1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2. Depending on whether an automatic switchover has occurred, proceed according to the following table: If the impaired node… Then…...
Page 707
controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
Page 708
b. Rotate the cam latch down until it is in a horizontal position. The I/O module disengages from the chassis and moves about 1/2 inch out of the I/O slot. c. Remove the I/O module from the chassis by pulling on the pull tabs on the sides of the module face. Make sure that you keep track of which slot the I/O module was in.
Page 709
2. If your system is configured to support 10 GbE cluster interconnect and data connections on 40 GbE NICs or onboard ports, convert these ports to 10 GbE connections by using the nicadmin convert command from Maintenance mode. Be sure to exit Maintenance mode after completing the conversion. 3.
Page 710
6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 711
There is an audible click when the module is secure and connected to the midplane. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 712
replace a failed NVRAM module or the DIMMs inside the NVRAM module. To replace a failed NVRAM module, you must remove it from the chassis, remove the FlashCache module or modules from the NVRAM module, move the DIMMs to the replacement module, reinstall the FlashCache module or modules, and install the replacement NVRAM module into the chassis.
Page 713
Option 1: Most configurations To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
Page 714
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.
Page 715
Steps 1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2. Depending on whether an automatic switchover has occurred, proceed according to the following table: If the impaired node… Then…...
Page 716
controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command.
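The sample `storage aggregate show` row above can be checked programmatically before healing. The following is a minimal Python sketch, assuming each aggregate is printed on a single line with the eight columns shown in the sample; real output may wrap or differ, and the function name `parse_aggregate_row` is hypothetical, not part of any NetApp tool.

```python
def parse_aggregate_row(row: str) -> dict:
    """Split one single-line aggregate row into named fields.

    Assumes the eight columns from the sample output:
    Aggregate, Size, Available, Used%, State, #Vols, Nodes, RAID Status.
    """
    fields = row.split(None, 7)  # the RAID status column may contain spaces
    if len(fields) != 8:
        raise ValueError(f"unexpected row layout: {row!r}")
    name, size, available, used, state, vols, node, raid = fields
    return {
        "aggregate": name,
        "size": size,
        "available": available,
        "used_percent": used,
        "state": state,
        "volumes": int(vols),
        "node": node,
        "raid_status": [part.strip() for part in raid.split(",")],
    }


row = "aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal"
info = parse_aggregate_row(row)
print(info["state"], info["raid_status"])
```

A caller might treat any row whose state is not `online` as a reason to stop and investigate before continuing.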
Page 717
Orange release button (gray on empty FlashCache modules) FlashCache cam handle a. Press the orange button on the front of the FlashCache module. The release button on empty FlashCache modules is gray. b. Swing the cam handle out until the module begins to slide out of the old NVRAM module. c.
Page 718
Lettered and numbered I/O cam latch I/O latch completely unlocked 4. Set the NVRAM module on a stable surface and remove the cover from the NVRAM module by pushing down on the blue locking button on the cover, and then, while holding down the blue button, slide the lid off the NVRAM module.
Page 719
Cover locking button DIMM and DIMM ejector tabs 5. Remove the DIMMs, one at a time, from the old NVRAM module and install them in the replacement NVRAM module. 6. Close the cover on the module. 7. Install the replacement NVRAM module into the chassis: a.
Page 720
c. Remove the NVRAM module from the chassis by pulling on the pull tabs on the sides of the module face. Lettered and numbered I/O cam latch I/O latch completely unlocked 3. Set the NVRAM module on a stable surface and remove the cover from the NVRAM module by pushing down on the blue locking button on the cover, and then, while holding down the blue button, slide the lid off the NVRAM module.
Page 721
Cover locking button DIMM and DIMM ejector tabs 4. Locate the DIMM to be replaced inside the NVRAM module, and then remove it by pressing down on the DIMM locking tabs and lifting the DIMM out of the socket. Each DIMM has an LED next to it that flashes when the DIMM has failed. 5.
Page 722
reassign the disks. Select one of the following options for instructions on how to reassign disks to the new controller.
Page 723
Option 1: Verify ID (HA pair) Verify the system ID change on an HA system You must confirm the system ID change when you boot the replacement node and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair. Steps 1.
Page 724
node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5. Give back the node: a. From the healthy node, give back the replaced node’s storage: storage failover giveback -ofnode replacement_node_name The replacement node takes back its storage and completes booting. If you are prompted to override the system ID due to a system ID mismatch, you should enter y.
Page 725
8. If the node is in a MetroCluster configuration, depending on the MetroCluster state, verify that the DR home ID field shows the original owner of the disk if the original owner is a node on the disaster site. This is required if both of the following are true: ◦...
Page 726
Ctrl-C, and then select the option to boot to Maintenance mode from the displayed menu. You must enter y when prompted to override the system ID due to a system ID mismatch. 2. View the old system IDs from the healthy node: metrocluster node show -fields node-systemid,dr-partner-systemid In this example, the Node_B_1 is the old node, with the old system ID of 118073209:...
Page 727
*> disk show -a Local System ID: 118065481 DISK OWNER POOL SERIAL NUMBER HOME ------- ------------- ----- ------------- ------------- disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481) disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481) 6. From the healthy node, verify that any coredumps are saved: a.
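One way to review `disk show -a` output like the sample above is to compare every system ID shown in parentheses against the Local System ID. The following is a rough Python sketch, assuming the output keeps the `Local System ID:` header and the `name (id)` owner format from the sample; the function name `foreign_system_ids` is hypothetical.

```python
import re


def foreign_system_ids(disk_show_output: str) -> list[str]:
    """Return system IDs in the output that differ from the Local System ID."""
    local = re.search(r"Local System ID:\s*(\d+)", disk_show_output)
    if local is None:
        raise ValueError("no 'Local System ID' header found")
    # Owner and home columns are printed as "name (id)" in the sample.
    seen = set(re.findall(r"\((\d+)\)", disk_show_output))
    return sorted(seen - {local.group(1)})


sample = (
    "Local System ID: 118065481\n"
    "disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481)\n"
    "disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481)\n"
)
print(foreign_system_ids(sample))
```

An empty list means every disk in the listing is owned by and homed on the local system ID.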
Page 728
Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
Page 729
• “Restoring external key management encryption keys” Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 730
The green power LED lights when the PSU is fully inserted into the chassis and the amber attention LED flashes initially, but turns off after a few moments. 9. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 731
continue to function. • You can use this procedure with all versions of ONTAP supported by your system. • All other components in the system must be functioning properly; if not, you must contact technical support. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
Page 732
About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the...
Page 733
"Returning SEDs to unprotected mode" section of Administration overview with the CLI. • You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node. Steps 1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show 2.
Page 734
controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
Page 735
Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
Page 736
Controller module cover locking button Step 3: Replace the RTC battery To replace the RTC battery, you must locate the failed battery in the controller module, remove it from the holder, and then install the replacement battery in the holder. Steps 1.
Page 737
RTC battery RTC battery housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 738
Steps 1. If you have not already done so, close the air duct or controller module cover. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
Page 739
Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A controller_A_1 configured enabled heal roots completed cluster_B ...
Page 740
6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 741
Option 1: Add an X91148A module as a NIC module in a system with open slots To add an X91148A module as a NIC module in a system with open slots, you must follow the specific sequence of steps. Steps 1.
Page 742
See the Hardware Universe for other slots that can be used by the X91148A module for networking. NetApp Hardware Universe • All other components in the system must be functioning properly; if not, you must contact technical support.
Page 743
install one or more X91148A NIC modules into your fully-populated system. Steps 1. If you are adding an X91148A module into a slot that contains a NIC module with the same number of ports as the X91148A module, the LIFs will automatically migrate when its controller module is shut down. If the NIC module being replaced has more ports than the X91148A module, you must permanently reassign the affected LIFs to a different home port.
Page 744
Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into the target slot: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
Page 745
Option 2: Adding an X91148A module as a storage module in a system with no open slots You must remove one or more existing NIC or storage modules in your system in order to install one or more X91148A storage modules into your fully-populated system. •...
Page 746
Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into slot 3: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
AFF A700s System Documentation Install and setup Cluster configuration worksheet - AFF A700s You can use the worksheet to gather and record your site-specific IP addresses and other information required when configuring an ONTAP cluster. Cluster Configuration Worksheet Start here: Choose your installation and setup experience You can choose from different content formats to guide you through installing and setting up your new storage system.
Page 748
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 749
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired node, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 750
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
Page 751
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
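The choice between the two options depends only on the ONTAP release. The following is a small Python sketch of that decision, assuming the version is supplied as a dotted numeric string such as "9.5" or "9.10.1" (patch suffixes like "P3" are not handled); the function name `encryption_check_option` is hypothetical.

```python
def encryption_check_option(ontap_version: str) -> str:
    """Pick the NVE/NSE check procedure for a dotted ONTAP version string."""
    major, minor = (int(part) for part in ontap_version.split(".")[:2])
    # Option 1 covers ONTAP 9.5 and earlier; Option 2 covers 9.6 and later.
    if (major, minor) <= (9, 5):
        return "Option 1"
    return "Option 2"


for version in ("9.3", "9.5", "9.6", "9.10.1"):
    print(version, encryption_check_option(version))
```

Comparing (major, minor) tuples rather than strings avoids ordering "9.10" before "9.6".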
Page 752
Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
Page 753
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query...
Page 754
Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the column shows...
Page 755
Remove the controller module and replace the boot media - AFF A700s You must remove the controller module from the chassis, open it, and then replace the failed boot media. Step 1: Remove the controller module You must remove the controller module from the chassis when you replace the controller module or replace a component inside the controller module.
Page 756
Locking pin 1. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 757
a. Open the air duct, if needed. b. If needed, remove Riser 2, the middle PCIe module, by unlocking the locking latch and then removing the riser from the controller module. Air duct Riser 2 (middle PCIe module) Boot media screw Boot media 3.
Page 758
7. Rotate the boot media down until it is flush with the motherboard. 8. Secure the boot media in place by using the screw. Do not over-tighten the screw. Doing so might crack the boot media circuit board. 9. Reinstall the riser into the controller module. 10.
Page 759
Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs) if they were removed.
Page 760
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 761
Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs) if they were removed.
Page 762
9. Although the environment variables and bootargs are retained, you should check that all required boot environment variables and bootargs are properly set for your system type and configuration using the printenv bootarg name command, and correct any errors using the setenv variable-name command.
Page 763
15. After the configuration synchronization is complete without errors, press y when prompted to confirm that the backup procedure was successful. 16. Press y when prompted whether to use the restored copy, and then press y when prompted to reboot the node. 17. Verify that the environmental variables are set as expected. a.
Page 764
1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive. 2. When prompted, either enter the name of the image or accept the default image displayed inside the brackets on your screen.
Page 765
Restore OKM, NSE, and NVE as needed - AFF A700s Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 766
If the console displays… Then… The LOADER prompt: Boot the node to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: y c.
Page 767
9. Confirm the target node is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
Page 768
18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home node and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int revert command.
Page 769
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 770
If the console displays… Then… Waiting for giveback…: a. Log into the partner node. b. Confirm the target node is ready for giveback with the storage failover show command. 4. Move the console cable to the partner node and give back the target node storage using the storage command.
Page 771
-auto-giveback true command. Return the failed part to NetApp - AFF A700s After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 772
If your system is running clustered ONTAP with… Then… Two nodes in the cluster: cluster ha modify -configured false storage failover modify -node node0 -enabled false More than two nodes in the cluster: storage failover modify -node node0 -enabled false 2.
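The table above reduces to one rule: a two-node cluster needs cluster HA unconfigured first, and every case needs storage failover disabled. The following is a minimal Python sketch of that rule; the function name `ha_disable_commands` is hypothetical, and `node0` is the placeholder node name used in the table.

```python
def ha_disable_commands(node_count: int, node: str = "node0") -> list[str]:
    """Return the commands from the table for a cluster with `node_count` nodes."""
    if node_count < 2:
        raise ValueError("the table covers clusters with two or more nodes")
    commands = []
    if node_count == 2:
        # Only two-node clusters need cluster HA unconfigured first.
        commands.append("cluster ha modify -configured false")
    commands.append(f"storage failover modify -node {node} -enabled false")
    return commands


for count in (2, 4):
    print(count, ha_disable_commands(count))
```

The sketch only builds command strings; it does not connect to a cluster or run anything.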
Page 773
The controller module moves slightly out of the chassis. Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7.
Page 774
Drives are fragile. Handle them as little as possible to prevent damage to them. 3. Align the drive from the old chassis with the same bay opening in the new chassis. 4. Gently push the drive into the chassis as far as it will go. The cam handle engages and begins to rotate upward.
Page 775
5. Complete the reinstallation of the controller module: a. If you have not already done so, reinstall the cable management device. b. Firmly push the controller module into the chassis until it meets the midplane and is fully seated. The locking latches rise when the controller module is fully seated. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
Page 776
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 777
This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired node - AFF A700s To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 778
module in the chassis, and then boot the system to Maintenance mode. Step 1: Remove the controller module You must remove the controller module from the chassis when you replace the controller module or replace a component inside the controller module. 1.
Page 779
Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a. Press in the locking tabs on the sides of the air duct toward the middle of the controller module. b.
Page 780
stable, flat surface so that you can access the NVRAM card. Air duct Riser 1 locking latch NVRAM battery cable plug connecting to the NVRAM card Card locking bracket NVRAM card 2. Remove the NVRAM card from the riser module: a.
Page 781
c. Connect the battery cable to the socket on the NVRAM card. d. Swing the locking latch into the locked position and make sure that it locks in place. Step 3: Move PCIe cards As part of the controller replacement process, you must remove both PCIe riser modules, Riser 2 (the middle riser) and Riser 3 (riser on the far right) from the impaired controller module, remove the PCIe cards from the riser modules, and install them in the same riser modules in the replacement controller module.
Page 782
Card locking bracket Riser 2 (middle riser) and PCI cards in riser slots 2 and 3. 2. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe card. b. Press the locking bracket on the side of the PCIe riser, and then rotate it to the open position. c.
Page 783
Air duct Riser 2 (middle PCIe module) Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
Page 784
b. Rotate the boot media down toward the motherboard. c. Secure the boot media to the motherboard using the boot media screw. Do not over-tighten the screw or you might damage the boot media. Step 5: Move the fans You must move the fans from the impaired controller module to the replacement module when replacing a failed controller module.
Page 785
1. Locate the DIMMs on your controller module. Air duct Riser 1 and DIMM bank 1-4 Riser 2 and DIMM banks 5-8 and 9-12 Riser 3 and DIMM bank 13-16 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
Page 786
7. Repeat these steps for the remaining DIMMs. Step 7: Install the NVRAM module To install the NVRAM module, you must follow the specific sequence of steps. 1. Install the riser into the controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
Page 787
4. Move the battery pack to the replacement controller module, and then install it in the NVRAM riser: a. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and locks into place. b.
Page 788
Blue power supply locking tab Power supply 3. Move the power supply to the new controller module, and then install it. 4. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
Page 789
Locking tabs Slide plunger 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
Page 790
The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. c. Rotate the locking latches upward, tilting them so that they clear the locking pins, and then lower them into the locked position.
Page 791
to match your system configuration. 1. In Maintenance mode from the new controller module, verify that all components display the same state: ha-config show The HA state should be the same for all components. 2. If the displayed system state of the controller module does not match your system configuration, set the state for the controller module: ha-config modify controller ha-state The value for HA-state can be one of the following:...
Page 792
Recable the system and reassign disks - AFF A700s To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
Page 793
node1> storage failover show Takeover Node Partner Possible State Description ------------ ------------ -------- ------------------------------------- node1 node2 false System ID changed on partner (Old: 151759755, New: 151759706), In takeover node2 node1 Waiting for giveback (HA mailboxes) 4. From the healthy node, verify that any coredumps are saved: a.
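The old and new system IDs can be pulled out of the `storage failover show` state description for record keeping. The following is a short Python sketch, assuming the message wording matches the sample above; the names `SYSTEM_ID_CHANGE` and `parse_system_id_change` are hypothetical.

```python
import re

# Whitespace after "Old:" and "New:" is flexible because console output may wrap.
SYSTEM_ID_CHANGE = re.compile(
    r"System ID changed on partner \(Old:\s*(\d+),\s*New:\s*(\d+)\)"
)


def parse_system_id_change(output: str):
    """Return (old_id, new_id) as integers, or None if no change is reported."""
    match = SYSTEM_ID_CHANGE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "node1 node2 false System ID changed on partner "
    "(Old: 151759755, New: 151759706), In takeover"
)
print(parse_system_id_change(sample))
```

A result of `None` simply means the message was not found in the text supplied; it does not prove that no system ID change occurred.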
Page 794
Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
Page 795
-node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 796
system panic. All other components in the system must be functioning properly; if not, you must contact technical support. You must replace the failed component with a replacement FRU component you received from your provider. Shut down the impaired node To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.
Page 797
4. Remove the cable management device from the controller module and set it aside. 5. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis. Locking latch Locking pin 6.
Page 798
Air duct locking tabs Risers Air duct Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map on the inside of the controller module or locating it using the LED next to the DIMM, and then replace it following the specific sequence of steps.
Page 799
Air duct cover Riser 1 and DIMM bank 1-4 Riser 2 and DIMM banks 5-8 and 9-12 Riser 3 and DIMM bank 13-16 ◦ If you are removing or moving a DIMM in bank 1-4, unplug the NVRAM battery, unlock the locking latch on Riser 1, and then remove the riser.
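The callouts above pair each riser with the DIMM banks beneath it. The following is a small Python lookup sketch, assuming DIMM slot numbers follow the bank ranges in the callouts (1-4, 5-8 and 9-12, 13-16); the names `RISER_DIMM_RANGES` and `riser_for_dimm` are hypothetical.

```python
# Riser number -> inclusive DIMM ranges, taken from the figure callouts.
RISER_DIMM_RANGES = {
    1: [(1, 4)],
    2: [(5, 8), (9, 12)],
    3: [(13, 16)],
}


def riser_for_dimm(dimm_number: int) -> int:
    """Return the riser that sits over the given DIMM slot."""
    for riser, ranges in RISER_DIMM_RANGES.items():
        for low, high in ranges:
            if low <= dimm_number <= high:
                return riser
    raise ValueError(f"DIMM {dimm_number} is outside the range 1-16")


for dimm in (1, 8, 12, 16):
    print(dimm, riser_for_dimm(dimm))
```

The lookup identifies which riser is involved; the removal steps for each riser still come from the procedure itself.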
Page 800
squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot. 7.
Page 801
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 802
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2.
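The maintenance-window command above differs only in the number of hours. The following is a minimal Python sketch that builds the command string, assuming a whole number of hours; the function name `autosupport_maint_command` is hypothetical, and the sketch does not validate any upper limit that ONTAP may enforce.

```python
def autosupport_maint_command(hours: int) -> str:
    """Build the AutoSupport command that suppresses case creation for `hours` hours."""
    if hours < 1:
        raise ValueError("hours must be a positive integer")
    # The MAINT value is the number of hours followed by the letter "h".
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


print(autosupport_maint_command(2))
```

With an argument of 2, the string matches the two-hour example shown in the step.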
Page 803
Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 804
Air duct locking tabs Risers Air duct Replacing a fan - AFF A700s To replace a fan, remove the failed fan module and replace it with a new fan module. 1. If you are not already grounded, properly ground yourself. 2.
Page 805
Fan locking tabs Fan module 4. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module until the locking latches click into place. Reinstall the controller module - AFF A700s After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
Page 806
Locking tabs Slide plunger 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
Page 807
-node local -auto-giveback true Return the failed part to NetApp - AFF A700s After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 808
local -auto-giveback false 3. Take the impaired node to the LOADER prompt: If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
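The table in this step maps what the impaired node's console shows to the next action. The following is a rough Python sketch of that lookup, assuming the console text contains "LOADER" or "Waiting for giveback" as in the table and treating anything else as a system or password prompt; the function name `next_action` is hypothetical.

```python
def next_action(console_text: str) -> str:
    """Map the impaired node's console display to the action from the table."""
    text = console_text.lower()
    if "loader" in text:
        return "Go to the next step."
    if "waiting for giveback" in text:
        return "Press Ctrl-C, and then respond y when prompted."
    # Assumption: any other display is a system prompt or password prompt.
    return "Take over or halt the impaired node."


for display in ("LOADER-A>", "Waiting for giveback...", "cluster1::>"):
    print(display, "->", next_action(display))
```

The fallback branch is the weakest assumption here, so unexpected console output should be checked by hand rather than trusted to this lookup.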
Page 809
Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Set the controller module aside in a safe place. Replace the NVRAM battery To replace the NVRAM battery, you must remove the failed NVRAM battery from the controller module and install the replacement NVRAM battery into the controller module.
Page 810
NVRAM battery plug Blue NVRAM battery locking tab 3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 4.
Page 811
Locking tabs Slide plunger 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
Page 812
-node local -auto-giveback true Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 813
When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y. ◦ If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the “Returning SEDs to unprotected mode” section of the ONTAP 9 NetApp Encryption Power Guide.
Page 814
system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected. Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 3. Unplug the controller module power supply from the source, and then unplug the cable from the power supply.
Page 815
Air duct locking tabs Risers Air duct Remove the NVRAM card Replacing the NVRAM card consists of removing the NVRAM riser, Riser 1, from the controller module, disconnecting the NVRAM battery from the NVRAM card, removing the failed NVRAM card and installing the replacement NVRAM card, and then reinstalling the NVRAM riser into the controller module.
Page 816
Air duct Riser 1 locking latch NVRAM battery cable plug connecting to the NVRAM card Card locking bracket NVRAM card 3. Remove the NVRAM card from the riser module: a. Turn the riser module so that you can access the NVRAM card. b.
Page 817
d. Swing the locking latch into the locked position and make sure that it locks in place. 5. Install the riser into the controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
Page 818
e. Select the option to boot to Maintenance mode from the displayed menu. Verify the system ID change on an HA system You must confirm the system ID change when you boot the replacement node and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair.
Page 819
a. From the healthy node, give back the replaced node’s storage: storage failover giveback -ofnode replacement_node_name The replacement node takes back its storage and completes booting. If you are prompted to override the system ID due to a system ID mismatch, you should enter y. If the giveback is vetoed, you can consider overriding the vetoes.
Page 820
• “Restoring external key management encryption keys” Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 821
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 822
Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 823
Air duct locking tabs Risers Air duct Step 3: Replace a PCIe card To replace a PCIe card, you must remove the cabling and any SFPs from the ports on the PCIe cards in the target riser, remove the riser from the controller module, remove and replace the PCIe card, reinstall the riser, and recable it.
Page 824
Air duct Riser locking latch Card locking bracket Riser 2 (middle riser) and PCI cards in riser slots 2 and 3. 3. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe card. b.
Page 825
module. c. Swing the locking latch down and click it into the locked position. When locked, the locking latch is flush with the top of the riser and the riser sits squarely in the controller module. d. Reinsert any SFP modules that were removed from the PCIe cards. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
Page 826
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 827
• This procedure is written for replacing one power supply at a time. It is a best practice to replace the power supply within two minutes of removing it from the chassis. The system continues to function, but ONTAP sends messages to the console about the degraded power supply until the power supply is replaced.
Page 828
Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 829
local -auto-giveback false 3. Take the impaired node to the LOADER prompt: If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 830
Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 831
Air duct locking tabs Risers Air duct Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery.
Page 832
Air duct RTC battery and housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 833
-node local -auto-giveback true Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Installation and Setup of an AFF A800 Video two of two: Perform end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2...
Page 835
You might also want to have access to the Release Notes for your version of ONTAP for more information about this system. NetApp Hardware Universe Find the Release Notes for your version of ONTAP 9 You need to provide the following at your site: •...
Page 836
5. Download and complete the Cluster configuration worksheet. Cluster Configuration Worksheet Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. Installing SuperRail into a four-post rack...
Page 837
You need to be aware of the safety concerns associated with the weight of the system. 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers There is required cabling for your platform’s cluster using the two-node switchless cluster method or the cluster interconnect network method.
Page 838
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. 1. Use the animation or the step-by-step instructions to complete the cabling between the controllers and to the switches: Cabling a two-node switchless cluster Step...
Page 839
Step Perform on each controller module Cable the management ports to the management network switches DO NOT plug in the power cords at this point. 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦...
Page 840
Cabling a switched cluster Step Perform on each controller module Cable the HA interconnect ports: • e0b to e0b • e1b to e1b Cable the cluster interconnect ports to the 100 GbE cluster interconnect switches.
Page 841
Step Perform on each controller module Cable the management ports to the management network switches DO NOT plug in the power cords at this point. 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦...
Page 842
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Step Perform on each controller module Cable ports 2a through 2d to the FC host switches.
Page 843
Step Perform on each controller module Cable ports e4a through e4d to the 10GbE host network switches. To perform other optional cabling, choose from: • [Option 3: Connect to a single direct-attached NS224 drive shelf] • [Option 4: Connect to two direct-attached NS224 drive shelves] To complete setting up your system, see Step 4: Completing system setup and...
Page 844
Step Perform on each controller module Cable controller A to the shelf Cable controller B to the shelf: 2. To complete setting up your system, see Step 4: Completing system setup and configuration. Option 4: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves.
Page 845
As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. 1. Use the following animation or the written steps to cable your controllers to two drive shelves. Cabling the controllers to two drive shelves Step Perform on each controller module...
Page 846
Step Perform on each controller module Cable controller B to the shelves: 2. To complete setting up your system, see Step 4: Completing system setup and configuration. Step 4: Complete system setup and configuration Complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
Page 847
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 5. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 6.
Page 848
c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Plug the power cords into the controller power supplies, and then connect them to power sources on different circuits.
Page 849
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 850
Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired node, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 851
If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers display available: security key-manager query c. Shut down the impaired node. 3. If you saw the message This command is not supported when onboard key management is enabled,...
Page 852
Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the column displays for all authentication keys and that all key managers...
Page 853
Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired node, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
Page 854
Restored a. Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows...
Page 855
Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c.
Page 856
If the impaired node displays… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 857
If the impaired node is displaying… Then… Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name...
Page 858
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 859
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Replace the boot media You must remove Riser 3 from the controller module to locate the failed boot media before you can replace it. You need a Phillips head screwdriver to remove the screw that holds the boot media in place.
Page 860
Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
Page 861
Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
Page 862
Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs or QSFPs) if they were removed.
Page 863
command. <value> a. Check the boot environment variables: ▪ bootarg.init.boot_clustered ▪ partner-sysid ▪ bootarg.init.flash_optimized ▪ bootarg.init.san_optimized for All SAN Array ▪ bootarg.init.switchless_cluster.enable b. If External Key Manager is enabled, check the bootarg values, listed in the kenv ASUP output: ▪ bootarg.storageencryption.support <value>...
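As a rough illustration of the check in step a, the sketch below compares a captured set of environment variables against the names listed above. The dictionary input, the function name, and the all_san_array flag are assumptions made for this example; the procedure does not define a machine-readable format, and the actual values must be read from the node's console. A name reported as missing is only a prompt to look closer, not proof of a misconfiguration.

```python
# Variable names come from the list above. The dict input, the function
# name, and the all_san_array flag are hypothetical, for illustration only.
BASE_VARS = [
    "bootarg.init.boot_clustered",
    "partner-sysid",
    "bootarg.init.flash_optimized",
    "bootarg.init.switchless_cluster.enable",
]
SAN_VAR = "bootarg.init.san_optimized"  # listed for All SAN Array systems


def missing_boot_vars(env, all_san_array=False):
    """Return the listed variable names that are absent or empty in env."""
    expected = list(BASE_VARS)
    if all_san_array:
        expected.append(SAN_VAR)
    return [name for name in expected if not env.get(name)]


# Example with made-up values typed in from a console capture.
captured = {
    "bootarg.init.boot_clustered": "true",
    "partner-sysid": "0123456789",
}
print(missing_boot_vars(captured))
```

Keeping the expected names in one list makes it easy to review the capture before restoring the backup configuration.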
Page 864
If your system has… Then… A network connection a. Press y when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
Page 865
If your system has… Then… No network connection and is in a MetroCluster IP configuration a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
Page 866
Restore OKM, NSE, and NVE as needed - AFF A800 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 867
3. Check the console output: If the console displays… Then… The LOADER prompt: Boot the node to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: y c.
Page 868
8. Move the console cable to the partner node and log in as admin. 9. Confirm the target node is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command.
Page 869
18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home node and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
Page 870
This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
Page 871
If the console displays… Then… Waiting for giveback…: a. Log in to the partner node. b. Confirm the target node is ready for giveback with the storage failover show command. 4. Move the console cable to the partner node and give back the target node storage using the storage command.
Page 872
-auto-giveback true command. Return the failed part to NetApp - AFF A800 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 873
1. If your system has two controller modules, disable the HA pair. If your system is running clustered ONTAP with… Then… Two nodes in the cluster: cluster ha modify -configured false storage failover modify -node node0 -enabled false More than two nodes in the cluster: storage failover modify -node node0 -enabled false 2.
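The two-row table in step 1 can be read as a small decision rule. The following sketch, which only builds command strings and does not contact any system, shows one way to express it. The function name and the node_count parameter are hypothetical; the command text and the node0 placeholder are taken from the table.

```python
def ha_disable_commands(node_count, node_name="node0"):
    """Return the commands the table lists for disabling the HA pair.

    node_count is the number of nodes in the cluster. The function is an
    illustrative helper, not part of ONTAP.
    """
    if node_count < 2:
        raise ValueError("the table covers clusters with two or more nodes")
    commands = []
    if node_count == 2:
        # Only the two-node row includes the cluster ha command.
        commands.append("cluster ha modify -configured false")
    commands.append(
        f"storage failover modify -node {node_name} -enabled false"
    )
    return commands


for line in ha_disable_commands(2):
    print(line)
```

The only difference between the two rows is the extra cluster ha command on a two-node cluster, which the conditional makes explicit.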
Page 874
The controller module moves slightly out of the chassis. Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7.
Page 875
Drives are fragile. Handle them as little as possible to prevent damage to them. 3. Align the drive from the old chassis with the same bay opening in the new chassis. 4. Gently push the drive into the chassis as far as it will go. The cam handle engages and begins to rotate upward.
Page 876
The locking latches rise when the controller module is fully seated. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process.
Page 877
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 878
• You must always capture the node’s console output to a text file. This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Do not downgrade the BIOS version of the replacement node to match the partner node or the old controller module.
Page 879
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y....
Page 880
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y....
Page 881
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 882
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Move the power supplies You must move the power supplies from the impaired controller module to the replacement controller module when you replace a controller module. 1.
Page 883
Blue power supply locking tab Power supply 2. Move the power supply to the new controller module, and then install it. 3. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
Page 884
Fan locking tabs Fan module 2. Move the fan module to the replacement controller module, and then install the fan module by aligning its edges with the opening in the controller module, and then sliding the fan module into the controller module until the locking latches click into place.
Page 885
Air duct riser NVDIMM battery plug NVDIMM battery pack Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
Page 886
Step 5: Remove the PCIe risers As part of the controller replacement process, you must remove the PCIe modules from the impaired controller module. You must install them into the same location in the replacement controller module once the NVDIMMs and DIMMs have been moved to the replacement controller module.
Page 887
1. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 2. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot.
Page 888
Air duct NVDIMMs 2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
Page 889
Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
Page 890
Step 9: Install the PCIe risers You install the PCIe risers in the replacement controller module after moving the DIMMs, NVDIMMs, and boot media. 1. Install the riser into the replacement controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
Page 891
Locking tabs Slide plunger 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
Page 892
1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt. 2. On the healthy node, check the system time: show date The date and time are given in GMT. 3. At the LOADER prompt, check the date and time on the replacement node: show date The date and time are given in GMT.
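Steps 2 and 3 amount to comparing two GMT timestamps read from two consoles. The sketch below shows that comparison under stated assumptions: the timestamps are entered by hand as timezone-aware values, and the one-minute tolerance is an arbitrary illustrative figure, since the procedure does not specify a threshold. It does not parse the output of show date.

```python
from datetime import datetime, timedelta, timezone


def clock_drift(healthy_time, replacement_time):
    """Absolute difference between the two nodes' reported times."""
    return abs(healthy_time - replacement_time)


def clocks_agree(healthy_time, replacement_time,
                 tolerance=timedelta(minutes=1)):
    """True if the drift is within tolerance (the default is an assumption)."""
    return clock_drift(healthy_time, replacement_time) <= tolerance


# Made-up readings, both in GMT as the procedure states.
healthy = datetime(2021, 11, 23, 14, 5, 0, tzinfo=timezone.utc)
replacement = datetime(2021, 11, 23, 14, 9, 30, tzinfo=timezone.utc)
print(clock_drift(healthy, replacement), clocks_agree(healthy, replacement))
```

Using timezone-aware values avoids accidentally comparing a GMT reading against a local-time reading.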
Page 893
node_name After you issue the command, you should wait until the system stops at the LOADER prompt. 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags 3. Select Scan System from the displayed menu to enable running the diagnostics tests. 4.
Page 894
1. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode and go to the LOADER prompt: halt 2. From the LOADER prompt on the replacement node, boot the node, entering y if you are prompted to override the system ID due to a system ID mismatch: boot_ontap 3.
Page 895
Find the High-Availability Configuration Guide for your version of ONTAP 9 b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show The output from the storage failover show command should not include the System ID changed on partner message.
Page 896
Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
Page 897
-node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 898
increasing number of correctable error correction codes (ECC); failure to do so causes a system panic. All other components in the system must be functioning properly; if not, you must contact technical support. You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
Page 899
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
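The table above maps what the impaired node's console shows to the next operator action. A minimal sketch of that mapping follows. The enum and function names are hypothetical, the takeover command is the one this document gives for an HA pair, and the standalone (halt) branch is not modeled because the table text is truncated here.

```python
from enum import Enum


class ConsoleState(Enum):
    LOADER = "LOADER prompt"
    WAITING_FOR_GIVEBACK = "Waiting for giveback"
    SYSTEM_PROMPT = "System prompt or password prompt"


def next_action(state, impaired_node_name):
    """Return the operator action for an HA pair, per the table above."""
    if state is ConsoleState.LOADER:
        return "Go to the next step."
    if state is ConsoleState.WAITING_FOR_GIVEBACK:
        return "Press Ctrl-C, and then respond y when prompted."
    if state is ConsoleState.SYSTEM_PROMPT:
        # Run from the healthy node.
        return f"storage failover takeover -ofnode {impaired_node_name}"
    raise ValueError(f"unhandled console state: {state!r}")


print(next_action(ConsoleState.SYSTEM_PROMPT, "impaired_node_name"))
```

After a takeover, the impaired node shows Waiting for giveback…, so the second row applies on the next pass.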
Page 900
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 901
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct or locating it using the LED next to the DIMM, and then replace it following the specific sequence of steps.
Page 902
Air duct cover Riser 1 and DIMM bank 1 and 3-6 Riser 2 and DIMM bank 7-10, 12-13, and 15-18 Riser 3 and DIMM bank 19-22 and 24 Note: Slots 2 and 14 are left empty. Do not attempt to install DIMMs into these slots. 2.
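The figure legend above can be captured as a lookup table, which is handy when matching a reported DIMM slot number to the riser that covers it. The sketch below assumes one reading of the legend (Riser 1: slots 1 and 3-6; Riser 2: slots 7-10, 12-13, and 15-18; Riser 3: slots 19-22 and 24) and treats any slot not named there as unknown. All names in the code are hypothetical; always confirm against the DIMM map label on the air duct.

```python
# Slot ranges transcribed from the legend above; this reading is an assumption.
DIMM_SLOTS_BY_RISER = {
    1: {1, 3, 4, 5, 6},
    2: {7, 8, 9, 10, 12, 13, 15, 16, 17, 18},
    3: {19, 20, 21, 22, 24},
}
EMPTY_SLOTS = {2, 14}  # the note says these slots are left empty


def riser_for_dimm_slot(slot):
    """Return the riser number that covers the given DIMM slot."""
    if slot in EMPTY_SLOTS:
        raise ValueError(f"slot {slot} is left empty; do not install a DIMM")
    for riser, slots in DIMM_SLOTS_BY_RISER.items():
        if slot in slots:
            return riser
    raise ValueError(f"slot {slot} is not a DIMM slot named in the legend")


print(riser_for_dimm_slot(12))
```

Raising an error for slots 2 and 14 mirrors the note that DIMMs must not be installed there.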
Page 903
Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. 4. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket.
Page 904
Locking tabs Slide plunger 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
Page 905
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 906
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 907
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y....
Page 908
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Set the controller module aside in a safe place. Step 3: Replace a fan To replace a fan, remove the failed fan module and replace it with a new fan module.
Page 909
Fan locking tabs Fan module 3. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module until the locking latches click into place. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
Page 910
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 911
2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3. Take the impaired node to the LOADER prompt: If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…...
Page 912
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 913
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 914
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace the NVDIMM To replace the NVDIMM, you must locate it in the controller module using the NVDIMM map label on top of the air duct or locating it using the LED next to the NVDIMM, and then replace it following the specific sequence of steps.
Page 915
Air duct cover Riser 2 and NVDIMM 11 2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
Page 916
5. Locate the slot where you are installing the NVDIMM. 6. Insert the NVDIMM squarely into the slot. The NVDIMM fits tightly in the slot, but should go in easily. If not, realign the NVDIMM with the slot and reinsert it. Visually inspect the NVDIMM to verify that it is evenly aligned and fully inserted into the slot.
Page 917
Slide plunger 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
Page 918
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 919
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 920
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
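The console-state table above is effectively a lookup from what the impaired node displays to the next action. A minimal Python sketch of that lookup follows. The dictionary keys, the function names, and the idea of scripting the decision are assumptions for illustration; only the action text and the takeover command come from this procedure, and the sketch does not connect to ONTAP.

```python
# Actions keyed by what the impaired node's console displays (text taken from the table).
ACTIONS = {
    "LOADER prompt": "Go to the next step.",
    "Waiting for giveback": "Press Ctrl-C, and then respond y when prompted.",
    "System prompt or password prompt": "Take over or halt the impaired node.",
}


def next_action(console_state: str) -> str:
    """Return the documented action for a console state, or raise if the table lacks it."""
    try:
        return ACTIONS[console_state]
    except KeyError:
        raise ValueError(f"console state not covered by the table: {console_state!r}") from None


def takeover_command(impaired_node_name: str) -> str:
    """Build the HA-pair takeover command that is entered on the healthy node."""
    name = impaired_node_name.strip()
    if not name:
        raise ValueError("impaired node name must not be empty")
    return f"storage failover takeover -ofnode {name}"


if __name__ == "__main__":
    print(next_action("LOADER prompt"))
    print(takeover_command("node2"))  # "node2" is a placeholder node name
```

Keeping the table as data makes it easy to review against the printed procedure before anyone acts on it.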
Page 921
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Set the controller module aside in a safe place. Step 3: Replace the NVDIMM battery To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module.
Page 922
Air duct riser NVDIMM battery plug NVDIMM battery pack Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
Page 923
b. Plug the battery plug into the riser socket and make sure that the plug locks into place. 6. Close the NVDIMM air duct. Make sure that the plug locks into the socket. Step 4: Reinstall the controller module and boot the system After you replace a FRU in the controller module, you must reinstall the controller module and reboot it.
Page 924
◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 925
The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false 3.
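Both commands on this page follow a fixed pattern, so a maintenance checklist can generate them as plain strings before they are typed at the cluster shell. The Python sketch below only assembles the command text shown in this procedure. The function names are hypothetical and nothing here talks to a storage system.

```python
def autosupport_maint_command(hours: int) -> str:
    """Build the AutoSupport message that suppresses automatic case creation."""
    if hours < 1:
        raise ValueError("maintenance window must be at least one hour")
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


def auto_giveback_command(enabled: bool) -> str:
    """Build the command that turns automatic giveback on or off for the local node."""
    value = "true" if enabled else "false"
    return f"storage failover modify -node local -auto-giveback {value}"


if __name__ == "__main__":
    # Two-hour window, matching the example in the procedure.
    print(autosupport_maint_command(2))
    # Disable automatic giveback before servicing the impaired node.
    print(auto_giveback_command(False))
```

Generating the text once and pasting it avoids character substitutions such as a dash that is not a plain hyphen, which the cluster shell would reject.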
Page 926
If the impaired node is displaying… Then… The LOADER prompt: Go to the next step. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired node: •...
Page 927
Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 928
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace a PCIe card To replace a PCIe card, you must remove the cabling and any QSFPs and SFPs from the ports on the PCIe cards in the target riser, remove the riser from the controller module, remove and replace the PCIe card, reinstall the riser and any QSFPs and SFPs onto the ports, and cable the ports.
Page 929
Air duct Riser locking latch Card locking bracket Riser 1 (left riser) with 100GbE PCIe card in slot 1. 3. Remove the PCIe card from Riser 1: a. Turn the riser so that you can access the PCIe card. b. Press the locking bracket on the side of the PCIe riser, and then rotate it to the open position. c.
Page 930
Air duct Riser 2 (middle riser) or 3 (right riser) locking latch Card locking bracket Side panel on riser 2 or 3 PCIe cards in riser 2 or 3 5. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe cards. b.
Page 931
controller module. d. Reinsert any SFP modules that were removed from the PCIe cards. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it. 1.
Page 932
-node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 933
Once power is restored to the power supply, the status LED should be green. 1. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
Page 934
Replace the real-time clock battery - AFF A800 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization continue to function. • You can use this procedure with all versions of ONTAP supported by your system •...
Page 935
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 936
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 937
Locking latch Locking pin 1. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2. Place the controller module on a stable, flat surface, and then open the air duct: a.
Page 938
Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Remove the PCIe risers You must remove one or more PCIe risers when replacing specific hardware components in the controller module. 1. Remove the PCIe riser from the controller module: a.
Page 939
Air duct Riser 2 (middle riser) locking latch Step 4: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. Locate the RTC battery under Riser 2.
Page 940
Air duct Riser 2 RTC battery and housing 2. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
Page 941
module. c. Swing the locking latch down and click it into the locked position. When locked, the locking latch is flush with the top of the riser and the riser sits squarely in the controller module. d. Reinsert any SFP modules that were removed from the PCIe cards. Step 6: Reinstall the controller module and set the time/date after RTC battery replacement After you replace a component within the controller module, you must reinstall the controller module in the system chassis, reset the time and date on the controller, and...
Page 942
-node local -auto-giveback true Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Installation and Setup of an AFF C190 Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2...
Page 944
NetApp Product Registration 4. Download and install Config Advisor on your laptop. NetApp Downloads: Config Advisor 5. Inventory and make a note of the number and types of cables you received. The following table identifies the types of cables you might receive. If you receive a cable not listed in the table, see the Hardware Universe to locate the cable and identify its use.
Page 945
Cluster Configuration Worksheet Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
Page 946
4. Place the bezel on the front of the system. Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network. Option 1: Cable a two-node switchless cluster, unified configuration UTA2 ports and management ports on the controller modules are connected to switches.
Page 947
Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b Use one of the following cable types to cable the e0c/0c and e0d/0d or e0e/0e and e0f/0f data ports to your host network: Cable the e0M ports to the management network switches with the RJ45 cables:...
Page 948
Step Perform on each controller DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Completing system setup and configuration. Option 2: Cable switched cluster, unified configuration UTA2 ports and management ports on the controller modules are connected to switches. The cluster interconnect ports are cabled to the cluster interconnect switches.
Page 949
Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use one of the following cable types to cable the e0c/0c and e0d/0d or e0e/0e and e0f/0f data ports to your host network: Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point.
Page 950
2. To complete setting up your system, see Step 4: Completing system setup and configuration. Option 3: Cable a two-node switchless cluster, Ethernet configuration RJ45 ports and management ports on the controller modules are connected to switches. The cluster interconnect ports are cabled on both controller modules. Contact your network administrator for information about connecting the system to the switches.
Page 951
Step Perform on each controller Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network: Cable the e0M ports to the management network switches with the RJ45 cables DO NOT plug in the power cords at this point. 2.
Page 952
1. Use the illustration or the step-by-step instructions to complete the cabling between the controllers and the switches: Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
Page 953
Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Completing system setup and configuration.
Page 954
Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 6. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 7.
Page 955
See your laptop or console’s online help for how to configure the console port. b. Connect the console cable to the laptop or console, and connect the console port on the controller using the console cable that came with your system. c.
Page 956
Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration Guide. 6. Verify the health of your system by running Config Advisor.
Page 957
◦ If the impaired node is in a standalone configuration and at LOADER prompt, contact NetApp Support. mysupport.netapp.com 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
Page 958
Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the column equals for all authentication keys:...
Page 959
Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the value that the Restored column shows for all authentication keys: security key-manager key query c. Verify that the type shows onboard, and then manually back up the OKM information.
Page 960
Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the column shows...
Page 961
This command may not work if the boot device is corrupted or non-functional. Remove the controller module, replace the boot media and transfer the boot image to the boot media - AFF C190 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
Page 962
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Replace the boot media You must locate the boot media in the controller module, and then follow the directions to replace it.
Page 963
• A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
Page 964
The node begins to boot as soon as it is completely installed into the chassis. 5. Interrupt the boot process to stop at the LOADER prompt by pressing Ctrl-C when you see Starting AUTOBOOT press Ctrl-C to abort…. If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt the node to boot to LOADER.
Page 965
After image_name.tgz is installed, the system prompts you to restore the backup configuration (the file system) from the healthy node. 9. Restore the file system: If your system has… Then… A network connection a. Press when prompted to restore the backup configuration. b.
Page 966
If your system is in… Then… A stand-alone configuration You can begin using your system after the node reboots. An HA pair After the impaired node is displaying the Waiting for Giveback… message, perform a giveback from the healthy node: a.
Page 967
If your system has… Then… A network connection a. Press when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
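The restore-backup step above takes the impaired node's IP address as its only variable part, so it is worth validating that address before the command is entered on the healthy node. The following Python sketch shows one way to do that, under the assumption that the address is kept as a string in a runbook or script. The function names are hypothetical, and the sketch only builds text; it does not run anything on the cluster.

```python
import ipaddress
from typing import List


def restore_backup_command(impaired_node_ip: str) -> str:
    """Build the restore-backup command after checking that the address parses as an IP."""
    # ip_address() raises ValueError for anything that is not a valid IPv4 or IPv6 address.
    ip = ipaddress.ip_address(impaired_node_ip.strip())
    return f"system node restore-backup -node local -target-address {ip}"


def restore_sequence(impaired_node_ip: str) -> List[str]:
    """Return the commands from steps b and c in the order the procedure lists them."""
    return [
        "set -privilege advanced",
        restore_backup_command(impaired_node_ip),
    ]


if __name__ == "__main__":
    # 192.0.2.20 is a documentation-only example address, not a real node.
    for command in restore_sequence("192.0.2.20"):
        print(command)
```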
Page 968
Restore OKM, NSE, and NVE as needed - AFF C190 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
Page 969
If the console displays… Then… Waiting for giveback….: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: y c. At the LOADER prompt, enter the boot_ontap menu command. 4.
Page 970
command. -only-cfo-aggregates true ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received. ◦ If the command fails because of open CIFS sessions, check with the customer about how to close out CIFS sessions.
Page 971
Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target node. 2. Use the boot_ontap command at the LOADER prompt to boot the node. 3. Check the console output: If the console displays… Then…...
Page 972
-auto-giveback true Return the failed part to NetApp - AFF C190 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 973
• If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h Steps 1.
Page 974
1. If you are not already grounded, properly ground yourself. 2. Turn off the power supply and disconnect the power cables: a. Turn off the power switch on the power supply. b. Open the power cable retainer, and then unplug the power cable from the power supply. c.
Page 975
5. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis. Step 3: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
Page 976
6. Repeat the process for the remaining drives in the system. Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis. 1.
Page 977
message Press Ctrl-C for Boot Menu. If you miss the prompt and the controller modules boot to ONTAP, enter halt, and then at the LOADER prompt enter boot_ontap, press Ctrl-C when prompted, and then repeat this step. b. From the boot menu, select the option for Maintenance mode. Controller Replace the controller module - AFF C190 You must review the prerequisites for the replacement procedure and select the correct...
Page 978
see the Administration overview with the CLI. Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
Page 979
3. Remove and set aside the cable management devices from the left and right sides of the controller module. 4. If you left the SFP modules in the system after removing the cables, move them to the new controller module. 5.
Page 980
Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller module and insert it in the new controller module. 1. Locate the boot media using the following illustration or the FRU map on the controller module:...
Page 981
2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
Page 982
3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module.
Page 983
Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. The number and placement of system DIMMs depends on the model of your system. The following illustration shows the location of system DIMMs: 4.
Page 984
For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis. The system might update system firmware when it boots. Do not abort this process. The procedure requires you to interrupt the boot process, which you can typically do at any time after prompted to do so.
Page 985
▪ A prompt warning of a system ID mismatch and asking to override the system ID. ▪ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy node remains down. You can safely respond to these prompts.
Page 986
The HA state should be the same for all components. 2. If the displayed system state of the controller module does not match your system configuration, set the state for the controller module: ha-config modify controller ha-state The value for HA-state can be one of the following: ◦...
Page 987
◦ is a network interface card. ◦ nvram is nonvolatile RAM. ◦ nvmem is a hybrid of NVRAM and system memory. ◦ is a Serial Attached SCSI device not connected to a disk shelf. 4. Run diagnostics as desired. If you want to run diagnostic Then…...
Page 988
If you want to run diagnostic tests on… Then… Multiple components at the same time: a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
Page 989
If the system-level diagnostics tests… Then… Were completed without any failures: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c....
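Step b above is a simple text check: after sldiag device clearstatus, the output of sldiag device status should contain the default response. A minimal Python sketch of that check follows, assuming the console output has already been captured as a string by some other means (for example, a terminal log). The function name and the capture method are assumptions; the expected line is quoted from the procedure.

```python
# Default response that the procedure says is displayed once the status log is empty.
NO_LOG_MESSAGES = "SLDIAG: No log messages are present."


def status_log_is_clear(console_output: str) -> bool:
    """Return True if any line of the captured output is the default 'no messages' response."""
    return any(line.strip() == NO_LOG_MESSAGES for line in console_output.splitlines())


if __name__ == "__main__":
    captured = "sldiag device status\nSLDIAG: No log messages are present.\n"
    print(status_log_is_clear(captured))
```

Matching the whole line, rather than searching for a substring, keeps a failure message that merely mentions log messages from being read as a clean result.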
Page 990
Steps 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find.
Page 991
a. Change to the advanced privilege level: set -privilege advanced You can respond when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
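The savecore steps above reuse the same node name in two commands, which makes them a natural fit for a small template. The Python sketch below substitutes a node name into the command text from steps a through c. The function name is hypothetical, "node1" is a placeholder, and the sketch produces strings only.

```python
from typing import List


def savecore_commands(local_node_name: str) -> List[str]:
    """Return the privilege, savecore, and savecore-progress commands in procedure order."""
    name = local_node_name.strip()
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("node name must be a single non-empty token")
    savecore = f"system node run -node {name} partner savecore"
    return [
        "set -privilege advanced",  # step a: change to the advanced privilege level
        savecore,                   # step b: save any coredumps
        f"{savecore} -s",           # step c: monitor savecore progress before the giveback
    ]


if __name__ == "__main__":
    for command in savecore_commands("node1"):
        print(command)
```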
Page 992
After a valid license key is installed, you have 24 hours to install all of the keys before the grace period ends. Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses.
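The 24-hour grace period described above is plain date arithmetic, and writing the deadline down when the first key is installed helps avoid missing it. The Python sketch below computes the end of the grace period and the time remaining. The function names and the example timestamps are assumptions for illustration; only the 24-hour figure comes from this page.

```python
from datetime import datetime, timedelta

# Grace period stated in the procedure: 24 hours after a valid license key is installed.
GRACE_PERIOD = timedelta(hours=24)


def grace_period_end(first_key_installed_at: datetime) -> datetime:
    """Return the moment by which all remaining license keys must be installed."""
    return first_key_installed_at + GRACE_PERIOD


def time_remaining(first_key_installed_at: datetime, now: datetime) -> timedelta:
    """Return the time left in the grace period, never less than zero."""
    remaining = grace_period_end(first_key_installed_at) - now
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    installed = datetime(2021, 11, 23, 9, 0)  # example timestamp only
    print(grace_period_end(installed))
    print(time_remaining(installed, datetime(2021, 11, 23, 21, 0)))
```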
Page 993
-node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
Page 994
(Asia/Pacific) if you need the RMA number or additional help with the replacement procedure. Replace a DIMM - AFF C190 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC); failure to do so causes a system panic.
Page 995
If the impaired node is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired node: • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
Page 996
5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the DIMMs To replace the DIMMs, you need to locate them inside the controller module, and then follow the specific sequence of steps.
Page 997
If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module. 1. Check the NVMEM LED on the controller module. You must perform a clean system shutdown before replacing system components to avoid losing unwritten data in the nonvolatile memory (NVMEM).
Page 998
6. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation. 7. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot.
Page 999
12. Close the controller module cover. Step 4: Reinstall the controller module After you replace components in the controller module, you must reinstall it into the chassis. 1. If you have not already done so, replace the cover on the controller module. 2.
Page 1000
b. After the node boots to Maintenance mode, halt the node: halt After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: ▪ A prompt warning that when entering Maintenance mode in an HA configuration, you must ensure that the healthy node remains down.
Page 1001
A stand-alone configuration Proceed to the next step. No action is required. You have completed system-level diagnostics. Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt.