Bart Brashers bbrashers at environcorp.com
Mon Jan 23 13:36:27 PST 2012
I think one key tidbit of information would help you here. You only run insert-ethers when you are trying to install a NEW appliance (e.g. a compute node). If you just want to re-install an existing node, you do not use insert-ethers.

Insert-ethers is a tool to intercept PXE boots, detect MAC addresses, and create entries in the rocks database for new nodes. It does have a few (leftover) features like removing database entries, but those have largely been moved to the "/opt/rocks/bin/rocks" commands.

So in your case, to rename compute-0-4 to compute-0-0, you can do one of two things:

Version one:
1. Run "insert-ethers --replace compute-0-4"
2. Pick "compute" from the list.
3. Make that node PXE boot.
4. When the "( )" turns into a "(*)", indicating that the node has received a kickstart file, exit.

Version two:
1. Run "rocks remove host compute-0-4"
2. Run "rocks sync config; rocks sync users"
3. Run "insert-ethers --cabinet 0 --rank 0"
4. Pick "compute" from the list.
5. Make that node PXE boot.
6. When the "( )" turns into a "(*)", indicating that the node has received a kickstart file, exit.

To re-install a node that already has an entry in the rocks database (i.e. a "known" node):
1. Run "rocks set host boot compute-0-0 action=install"
2. Make that node PXE boot.

When a node is either installed or re-installed, the OS is completely new. This includes things like the node's SSH ID (the values you would put in ~/.ssh/known_hosts or /etc/ssh/ssh_known_hosts). But for now you can still do "ssh compute-0-4", and after it has been installed as compute-0-0, you can do "ssh compute-0-0".

I think you can detect a hardware problem by reading `dmesg` and/or looking in the logs. There is no need to do a filesystem check (fsck) as suggested by Luca. If you really want to be sure, you can boot your frontend (FE) from a LiveCD and run "fsck" on each partition of your FE. Keep in mind that if you have some large userdata section, it could take a really, really long time to run.
For now, run fsck only on the /, /var, /boot, etc. partitions. I can't give you an exact list, because I don't know how you partitioned your FE. But you get the idea: check the local OS partitions.

I suspect that most of your problems stem from running insert-ethers too many times when you should have used "rocks set host boot compute-0-0 action=install".

Bart
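
As a recap of the two cases in the message above (re-installing a known node versus renaming one with "version two"), here is a minimal dry-run sketch. The function names reinstall_known and rename_node are hypothetical helpers invented for illustration; they only print the commands quoted in the message and do not run anything against the rocks database, so the sketch can be read or tried on a machine without Rocks installed.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for each case instead of running them.
# reinstall_known and rename_node are hypothetical names, not Rocks commands.

# Known node (already in the rocks database): no insert-ethers involved.
reinstall_known() {
    host="$1"
    echo "rocks set host boot $host action=install"
    echo "# then make $host PXE boot"
}

# Rename a node ("version two"): drop the old entry, sync, then let
# insert-ethers register the node under the chosen cabinet and rank.
rename_node() {
    old="$1"; cabinet="$2"; rank="$3"
    echo "rocks remove host $old"
    echo "rocks sync config; rocks sync users"
    echo "insert-ethers --cabinet $cabinet --rank $rank"
    echo "# pick \"compute\" from the list, then make the node PXE boot"
}
```

For example, "reinstall_known compute-0-0" prints the single rocks command for a re-install, and "rename_node compute-0-4 0 0" prints the three commands for renaming compute-0-4 to compute-0-0.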