Dynamic LPAR Tips/Checklist for RMC Authentication and Authorization

Version: 0.3
Printed on February 5, 2003
This document was last updated on 11/01/2002
Please direct comments to Truc D. Nguyen [email protected]

The information contained in this document has not been submitted to any formal IBM test and is distributed as is. The use of this information or the implementation of any of these techniques is the implementor's responsibility and depends on the implementor's ability to evaluate and integrate them into the customer's operational environment. The customer must be made aware of all actions that will result in a permanent change to their environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Individuals attempting to adapt these techniques to a specific customer environment do so at their own risk.

This copy of the document may be obsolete. The user should obtain an updated copy from the website: http://w3.austin.ibm.com/:/projects/yxia/

Contents

1.0 Verify the DLPAR Requirements
2.0 Checklist for DLPAR Setup
    2.1 Verify If the Required Software is Functional
    2.2 Verify Your RMC/DLPAR Network/Hostname Setup
3.0 Setting up the HMC/Partition(s) Hostname and Network
    3.1 If DNS is On
        3.1.1 If the HMC and the Partition(s) both use Long name
        3.1.2 Other Hostname Formats
    3.2 If DNS is Off
4.0 Procedures for Renaming a HMC/Partition Hostname, Redefining a Partition, or Switching a Partition between SMP and LPAR Mode
    4.1 Step 1: Removing IBM.ManagedNode Resources on HMC
    4.2 Step 2: Removing IBM.ManagementServer Resource on Partitions
5.0 Appendix 1: Other Useful Tips/Checks
    5.1 To access Linux on the HMC
    5.2 An Alternate Way to Open an Xterm on HMC
    5.3 Use LAN Surveillance to Check Your Network Problem
    5.4 A Quick Way to Verify if the System is DLPAR Ready
    5.5 Check Serial Cable Connection Between HMC and CEC
    5.6 Checking for HMC version from ssh
6.0 Appendix 2: Common Questions for RMC/DLPAR
    6.1 Is There a Relation b/w DLPAR/LparCmdRM and SFP/ServiceRM?
    6.2 Authentication & Authorization Process between HMC & Partitions
    6.3 What Does the Output of lspartition -dlpar Mean?
    6.4 Do IBM.DRM and IBM.HostRM Need to be "Active" on Partition for DLPAR to Work?
    6.5 Does IBM.CSMAgentRM Need to be Active on HMC?
7.0 Appendix 3: Known Defects with HMC GA3 GOLD Level
    7.1 Problem #1: HMC's acls File is Missing the DEFAULT Stanza
    7.2 Lspartition -dlpar Shows Active <1> but the DLPAR is not enabled on the GUI
    7.3 Problem #2: AIX ctcas daemon is 'inoperative' on A
    7.4 Problem #3: AIX ctcas daemon is 'inoperative' on H
8.0 Change History
9.0 References

1.0 Verify the DLPAR Requirements

Please see Appendix 1 (section 5.1) if you need instructions on how to access the Linux command line on the HMC.

Verify -> Command -> Result -> Action

1 HMC level
    Click [Help] and then [About] on the HMC main window. This should be release 3 version 1.1 or higher. Customers are encouraged to get the latest software, which can be obtained at http://techsupport.services.ibm.com/server/hmc?fetch=corrsrv.html

2 AIX Firmware
    > lsmcode
    Platform Firmware level is RKA20905
    System Firmware level is RG02*_GA3_*

3 AIX filesets required for DLPAR (on each LPAR/SMP)
    The AIX level should be 5.2. The following filesets should be on the partitions. A missing fileset can be installed from the AIX installation CD.
    > lslpp -l rsct.core*
    > lslpp -l csm.client

2.0 Checklist for DLPAR Setup

2.1 Verify If the Required Software is Functional

Verify -> Result and Action

1 HMC daemons are running
    > su - root
    > lssrc -a
    Subsystem         Group       PID     Status
    ctrmc             rsct        822     active
    ctcas             rsct        823     active
    IBM.DMSRM         rsct_rm     906     active
    IBM.LparCmdRM     rsct_rm     901     active

    If these daemons are active, go to step 2. If any of the daemons shows as inoperative, start it manually with
    > startsrc -s <subsystem name>
    Example: startsrc -s ctcas
    If a daemon can't be started, please contact IBM service personnel.

2 AIX daemons are running (on each LPAR/SMP with AIX 5.20)
    > su - root
    > lssrc -a | grep rsct
    Subsystem         Group       PID     Status
    ctrmc             rsct        21044   active
    ctcas             rsct        21045   active
    IBM.CSMAgentRM    rsct_rm     21045   active
    IBM.ServiceRM     rsct_rm     11836   active
    IBM.DRM           rsct_rm     20011   active
    IBM.HostRM        rsct_rm     20012   active

    IBM.DRM and IBM.HostRM are "lazy started" resource managers, which means they are only started when they are used. If IBM.DRM and IBM.HostRM are "inoperative", there is a good chance that you have a network/hostname setup problem between the HMC and the partition; to correct this, please refer to section 2.2, Verify Your RMC/DLPAR Network/Hostname Setup.

    If any of the daemons shows as inoperative, start it manually with
    > startsrc -s <subsystem name>
    Example: startsrc -s ctcas
    If ctcas can't be started, please open a PMR before contacting your IBM service personnel.

NOTES:
    ctrmc: the RMC subsystem
    ctcas: security verification
    IBM.DMSRM: tracks the statuses of partitions
    IBM.LparCmdRM: DLPAR operations on the HMC
    IBM.CSMAgentRM: handshaking between the partition and the HMC
    IBM.DRM: executes the DLPAR commands on the partition
    IBM.HostRM: obtains OS information
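The daemon checks in section 2.1 can be scripted against saved lssrc output. The following is a minimal Python sketch, not part of the original checklist. It assumes the output is whitespace-separated with the subsystem name in the first column and the status in the last column, as in the samples above, and that the PID column is simply absent for a subsystem that is not running. The function and list names (parse_lssrc, not_active, HMC_REQUIRED, AIX_REQUIRED, AIX_LAZY) are hypothetical.

```python
# Subsystems shown as active in section 2.1 (names copied from the samples).
HMC_REQUIRED = ["ctrmc", "ctcas", "IBM.DMSRM", "IBM.LparCmdRM"]
AIX_REQUIRED = ["ctrmc", "ctcas", "IBM.CSMAgentRM", "IBM.ServiceRM"]
# Lazy-started resource managers: inoperative here hints at a
# network/hostname problem rather than a daemon to start by hand.
AIX_LAZY = ["IBM.DRM", "IBM.HostRM"]


def parse_lssrc(text):
    """Map subsystem name -> status from `lssrc -a` style output."""
    status = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0] == "Subsystem":
            continue
        # Assumption: first column is the subsystem, last is the status.
        status[fields[0]] = fields[-1]
    return status


def not_active(text, required):
    """Return the required subsystems that are missing or not 'active'."""
    status = parse_lssrc(text)
    return [name for name in required if status.get(name) != "active"]


if __name__ == "__main__":
    sample = """\
Subsystem         Group       PID     Status
 ctrmc            rsct        21044   active
 ctcas            rsct                inoperative
 IBM.CSMAgentRM   rsct_rm     21045   active
 IBM.ServiceRM    rsct_rm     11836   active
"""
    for name in not_active(sample, AIX_REQUIRED):
        print("try: startsrc -s " + name)
    for name in not_active(sample, AIX_LAZY):
        print("lazy RM not active, check section 2.2: " + name)
```

To use it, save the output of lssrc -a on the HMC or partition to a file and pass its contents to not_active with the matching list.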
2.2 Verify Your RMC/DLPAR Network/Hostname Setup

Verify -> Result and Actions

1 HMC: List partitions authenticated by RMC
    > /opt/csm/bin/lsnodes -a Status
    partition01    1
    partition02    0
    partition03    1

    Here "1" means the partition is activated and authenticated for DLPAR; "0" means otherwise. If the partition is activated and still shows a Status of 0, you could have either a network or a hostname setup problem. If you have just rebooted the HMC, wait a few minutes. If nothing changes after that, check your hostname/network setup as in section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

2 HMC: List partitions recognized by DLPAR
    > lspartition -dlpar
    <#0> Partition:<001, partition01.company.com, 9.3.206.300>
         Active:<1>, OS:<AIX, 5.2>, DCaps:<0xf>, CmdCaps:<0x1, 0x0>
    <#1> Partition:<002, partition02.company.com, 9.3.206.300>
         Active:<0>, OS:<AIX, 5.2>, DCaps:<0xf>, CmdCaps:<0x1, 0x0>
    <#2> Partition:<003, partition03.company.com, 9.3.206.300>
         Active:<0>, OS:<, 5.1F>, DCaps:<0x0>, CmdCaps:<0x0, 0x0>

    If all active AIX 5.2 partitions are listed as Active:<1>, ..., DCaps:<0xf>, your system has been set up properly for DLPAR and you can skip the rest of the checklist. (In this example, partition 002 is being shut down, and partition 003 is not active for DLPAR because it runs AIX 5.1.)

    If some active partitions are missing or some partitions are reported as Active:<0>, your system probably still has a network/hostname setup problem. Refer to section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

    (If your partition is Active:<1> but the GUI is still not DLPAR capable, do a rebuild to get around this problem. See Appendix 3, section 7.2, for more information about this problem.)

    If you still can't get partitions recognized by DLPAR after verifying the checklist, contact IBM service personnel.

3 AIX: Ensure the /var directory is not 100% full (on each LPAR/SMP)
    > df
    If /var is 100% full, use smitty to expand it. If there is no more space available, visit subdirectories to remove unnecessary files (e.g. trace.*, core, etc.). After expanding the /var directory, execute these commands to fix possibly corrupted files:
    > rmrsrc -s "Hostname!='t'" IBM.ManagementServer
    > /usr/sbin/rsct/bin/rmcctrl -z
    > rm /var/ct/cfg/ct_has.thl
    > rm /var/ct/cfg/ctrmc.acls
    > /usr/sbin/rsct/bin/rmcctrl -A

4 AIX: Verify whether you have a network problem (from each LPAR/SMP)
    > ping <hmc_hostname>
    If ping fails, check your hostname/network setup. See section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

5 AIX: Verify partition(s) to HMC authentication (from each LPAR/SMP)
    > CT_CONTACT=<HMC name> lsrsrc IBM.ManagedNode
    You should get a list of resource classes on the HMC. If there is any error, you probably have a network/hostname problem; please refer to section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

6 HMC: Verify network setup by telnet-ing into each partition from the HMC
    > telnet <hostname>
    > "Ctrl c" or "exit" to end
    If you can't telnet, you have a network problem; refer to section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

7 HMC: Verify HMC to partition(s) authentication
    > CT_CONTACT=<partition hostname> lsrsrc IBM.ManagementServer
    If nothing is displayed or if there are any errors, you probably have a hostname problem; please refer to section 3.0, Setting up the HMC/Partition(s) Hostname and Network.

3.0 Setting up the HMC/Partition(s) Hostname and Network

Most of the DLPAR problems we have encountered in the labs during testing have been caused by improper network and hostname setup. This section aims to reduce these network setup/configuration problems.

First, find out which hostname format the HMC and its partition(s) are using: short or long name. Your setup depends largely on the format of the hostname. The hostname format can be determined by typing the command "hostname" on the HMC and on the AIX system respectively, then using host <return_from_hostname> to verify it.
Example:
    > hostname
    Partition.company.com
    > host Partition.company.com
    Partition.company.com has address 9.3.14.199

3.1 If DNS is On

3.1.1 If the HMC and the Partition(s) both use Long name

- No hostname entry is needed in /etc/hosts on either AIX or the HMC.
- If /etc/hosts has the hostname entry, the long name must come before the short name for the HMC and all partitions (host names are case sensitive). Example:
    10.10.10.11 mymachine.mycompany.com mymachine
- After updating the /etc/hosts file, refresh RMC by either rebooting or running
    > /usr/sbin/rsct/bin/rmcctrl -z
    > /usr/sbin/rsct/bin/rmcctrl -A

3.1.2 Other Hostname Formats

- If the hostname command returns the short name, put the short name before the long name in /etc/hosts on both the HMC and the partitions. If the hostname command returns the long name, put the long name before the short name in /etc/hosts on the HMC and all partitions (names are case sensitive). Example:
    10.10.10.11 mymachine mymachine.mycompany.com
- Make sure that on the partition(s) (partitions only), the file /etc/netsvc.conf exists with one line:
    hosts=local,bind
- Refresh RMC by either rebooting or running
    > /usr/sbin/rsct/bin/rmcctrl -z
    > /usr/sbin/rsct/bin/rmcctrl -A

3.2 If DNS is Off

- The /etc/hosts files on the HMC and the partition(s) need to be modified to contain the correct entries for the hostnames of the HMC and all partitions. If the hostname command returns the short name, put the short name before the long name in /etc/hosts for the HMC and all partitions. If the hostname command returns the long name, put the long name before the short name in /etc/hosts for the HMC and all partitions (names are case sensitive).
- Make sure that on the partition(s) (partitions only), the file /etc/netsvc.conf exists with one line:
    hosts=local,bind
- Refresh RMC by either rebooting or running
    > /usr/sbin/rsct/bin/rmcctrl -z
    > /usr/sbin/rsct/bin/rmcctrl -A

Customers should add the hostnames of all LPARs to the /etc/hosts file on the HMC.
The HMC hostname must be added to the /etc/hosts file of each LPAR. Because the customer does not have DNS data, there is no domain name, only a short hostname; therefore the "DNS enabled" box will not be enabled.

4.0 Procedures for Renaming a HMC/Partition Hostname, Redefining a Partition, or Switching a Partition between SMP and LPAR Mode

NOTE: This procedure is required only if you have the HMC GA3 GOLD level. If you have PTF2 or later, there is no need to perform it.

The procedure applies in the following situations:

- When a partition is switched between LPAR and SMP (Single Machine Partition) mode.
- When a partition is redefined, i.e. it could be assigned a new partition ID.
- When a partition is upgraded or installed with a new level of AIX.
- When hostnames are swapped between 2 partitions.

4.1 Step 1: Removing IBM.ManagedNode Resources on HMC

On the HMC, execute these commands:
    > /usr/sbin/rsct/bin/rmrsrc -s "Hostname!='a'" IBM.ManagedNode
    > /usr/sbin/rsct/bin/rmcctrl -z
    > /usr/sbin/rsct/bin/rmcctrl -A

4.2 Step 2: Removing IBM.ManagementServer Resource on Partitions

As documented in the IBM Hardware Management Console for pSeries Operations Guide, Service Focal Point, section Getting Started, the following procedure is to be performed on the affected partitions:
    > /usr/sbin/rsct/bin/rmrsrc -s "Name!='N'" IBM.ManagementServer
    > /usr/sbin/rsct/bin/rmcctrl -z
    > /usr/sbin/rsct/bin/rmcctrl -A

Wait a few minutes, then check for the IBM.ManagementServer resource(s) with
    > lsrsrc IBM.ManagementServer
    Resource Persistent Attributes for IBM.ManagementServer
    resource 1:
            Name          = "hmc.yourcompany.com"
            Hostname      = "hmc.yourcompany.com"
            ManagerType   = "HSC"
            LocalHostname = "par_01.company.com"
            NodeNameList  = {"par_01.company.com"}

- Verify all HMC resources on the partition.
- Verify that the HMC's Name and Hostname are correct.
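The final verification in section 4.2 can be automated on saved lsrsrc output. The following is a minimal Python sketch under stated assumptions: the output has a "resource N:" line per resource followed by attribute = value lines, as in the sample above, and "correct" is taken to mean that both Name and Hostname match, case-sensitively, the hostname reported by the HMC. That interpretation and the function names (parse_lsrsrc, check_management_servers) are assumptions, not something the source defines.

```python
import re


def parse_lsrsrc(text):
    """Split lsrsrc output into one dict of attributes per resource."""
    resources = []
    for line in text.splitlines():
        line = line.strip()
        if re.search(r"resource \d+:", line):
            # A new resource block starts here.
            resources.append({})
        elif "=" in line and resources:
            key, _, value = line.partition("=")
            # Drop surrounding quotes and any stray padding inside them.
            resources[-1][key.strip()] = value.strip().strip('"').strip()
    return resources


def check_management_servers(text, expected_hmc):
    """Return a list of problems found; an empty list means the check passed."""
    resources = parse_lsrsrc(text)
    if not resources:
        return ["no IBM.ManagementServer resource found"]
    problems = []
    for index, res in enumerate(resources, start=1):
        for attr in ("Name", "Hostname"):
            # Hostnames are case sensitive, so compare exactly.
            if res.get(attr) != expected_hmc:
                problems.append(
                    "resource %d: %s is %r, expected %r"
                    % (index, attr, res.get(attr), expected_hmc)
                )
    return problems


if __name__ == "__main__":
    sample = """\
Resource Persistent Attributes for IBM.ManagementServer
resource 1:
        Name          = "hmc.yourcompany.com"
        Hostname      = "hmc.yourcompany.com"
        ManagerType   = "HSC"
"""
    for problem in check_management_servers(sample, "hmc.yourcompany.com"):
        print(problem)
```

An empty result only says the names match the expected string; it does not replace the ping, telnet, and CT_CONTACT checks in section 2.2.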
5.0 Appendix 1: Other Useful Tips/Checks

5.1 To access Linux on the HMC

To access the xterm on the HMC (Command Line Entry) you will need a PE passcode, which can be obtained from IBM support. To access the Linux command line:

1. Log on to the HMC as the "hscpe" user (user created by customer)
2. Select "Problem Determination" (in the "Service Applications" folder at Release 2.0 and above)
3. Select "Microcode Maintenance"
4. Enter the serial number of the HMC and the PE password obtained from support
5. Select "Launch xterm shell"

5.2 An Alternate Way to Open an Xterm on HMC

From the upper left corner, click "Console", then select "Open Terminal Session" and enter your HMC hostname.

5.3 Use LAN Surveillance to Check Your Network Problem

In GA3, the LAN Surveillance feature has been added to SFP to alert users if a partition is having a network/hostname setup/RMC authentication problem by reporting a SURVALNC Serviceable Event to the HMC. (Users can have e-mail set up for notification of this type of problem.) You can use "List Serviceable Events" to check for these errors; if there are none, you should not have a problem with DLPAR/SFP. If you do, please go through the checklist to diagnose/correct the problem.

5.4 A Quick Way to Verify if the System Network/Hostname is Set Up Properly

From the HMC console, select "Server Management", then expand it to the partition level. Left click on an AIX 520 partition to get the pop-up menu, then select one of the items under "Dynamic Logical Partition" (e.g. Memory). If you get the error message

    HMCERRV3DLPAR016: The selected logical partition is not enabled for dynamic logical partitioning operations

there is a good chance that the system has a network/hostname setup problem. Please go through the checklist to diagnose/correct the problem. This procedure is best performed right after the HMC has been rebooted.
5.5 Check Serial Cable Connection Between HMC and CEC

On the HMC:
    > query_cecs                  - returns cecname
    > get_cec_mode -m cecname     - verifies connection to service processor

5.6 Checking for HMC version from ssh

Use the command
    > hsc version

6.0 Appendix 2: Common Questions for RMC/DLPAR

Answers to common questions from the field.

6.1 Is There a Relation b/w DLPAR/LparCmdRM and SFP/ServiceRM?

No, there is no relation between DLPAR and SFP. They are 2 independent daemons serving 2 different components. However, they use the same RMC framework and are therefore subject to the same authentication process as well as the same network/hostname setup. Because DLPAR is only supported on AIX 5.2, AIX 5.1x partitions will not be initialized for DLPAR. It is correct to assume that if SFP works, DLPAR would work, but if DLPAR works, SFP might not be fully functional. To save time, it is advisable to verify/get SFP working before verifying/getting DLPAR working.

6.2 Authentication & Authorization Process between HMC & Partitions

1. On HMC: DMSRM pushes down the secret key and HMC hostname to NVRAM when it detects a new CEC; this process is repeated every 5 minutes. Each time a HMC is rebooted or DMSRM is restarted, a new key is used.

2. On AIX: CSMAgentRM, through RTAS, reads the key and HMC hostname out of NVRAM. It will then authenticate the HMC. This process is repeated every 5 minutes on the partition to detect new HMC(s) and key changes. A HMC with a new key is treated as a new HMC and will go through the authentication and authorization processes again.

3. On AIX: After authenticating the HMC, CSMAgentRM will contact the DMSRM on the HMC to create a ManagedNode resource in order to identify itself as a partition of this HMC. (At creation time, the ManagedNode's Status attribute will be set to 127.) CSMAgentRM then creates a compatible ManagementServer resource on AIX.

4. On AIX: After the creation of the ManagedNode and ManagementServer resources on the HMC and AIX respectively, CSMAgentRM grants the HMC permission to access the necessary resource classes on the partition. After granting the HMC permission, CSMAgentRM will change the Status of its ManagedNode on the HMC to 1. (It should be noted that without proper permission on AIX, the HMC would be able to establish a session with the partition but would not be able to query for OS information or DLPAR capabilities, or execute DLPAR commands afterward.)

5. On HMC: After the ManagedNode Status has changed to 1, LparCmdRM establishes a session with the partition, queries for OS information and DLPAR capabilities, notifies CIMOM about the DLPAR capabilities of the partition, then waits for DLPAR commands from users.
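The five steps above can be pictured as a small state model. The following Python sketch is only an illustration: the class and method names are hypothetical, a plain dict stands in for NVRAM, and the real key exchange, RTAS access, RMC sessions, and 5-minute timers are not modeled. The only values taken from the text are the ManagedNode Status codes 127 and 1.

```python
import secrets

STATUS_CREATED = 127    # ManagedNode Status at creation time (step 3)
STATUS_AUTHORIZED = 1   # Status after the partition grants access (step 4)


class Hmc:
    def __init__(self, hostname):
        self.hostname = hostname
        # A new key is generated whenever the HMC or DMSRM restarts.
        self.key = secrets.token_hex(8)
        self.managed_nodes = {}   # partition hostname -> Status

    def push_to_nvram(self, nvram):
        # Step 1: DMSRM writes the key and the HMC hostname to NVRAM.
        nvram[self.hostname] = self.key

    def dlpar_ready(self, partition_hostname):
        # Step 5: LparCmdRM proceeds only once the Status is 1.
        return self.managed_nodes.get(partition_hostname) == STATUS_AUTHORIZED


class Partition:
    def __init__(self, hostname):
        self.hostname = hostname
        self.known_keys = {}           # HMC hostname -> key already handled
        self.management_servers = []   # local ManagementServer resources
        self.authorized_hmcs = set()

    def poll_nvram(self, nvram, hmcs):
        # Step 2: CSMAgentRM reads NVRAM; an unseen key means a new HMC.
        for hmc_name, key in nvram.items():
            if self.known_keys.get(hmc_name) == key:
                continue
            hmc = hmcs[hmc_name]
            self.known_keys[hmc_name] = key
            # Step 3: ManagedNode on the HMC, ManagementServer locally.
            hmc.managed_nodes[self.hostname] = STATUS_CREATED
            if hmc_name not in self.management_servers:
                self.management_servers.append(hmc_name)
            # Step 4: grant access, then set the ManagedNode Status to 1.
            self.authorized_hmcs.add(hmc_name)
            hmc.managed_nodes[self.hostname] = STATUS_AUTHORIZED


if __name__ == "__main__":
    hmc = Hmc("hmc.yourcompany.com")
    par = Partition("par_01.company.com")
    nvram = {}
    hmc.push_to_nvram(nvram)
    par.poll_nvram(nvram, {hmc.hostname: hmc})
    print(hmc.dlpar_ready(par.hostname))
```

The model makes one point from the text visible: a partition whose ManagedNode stays at 127, or never appears, has not completed steps 3 and 4, which is why the HMC never reports it as DLPAR capable.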