'iSCSI Login Failed' Errors Post Reinstall on Eucalyptus Multi-Cluster Environment

Eucalyptus Versions: Upgrading from 3.0.x to 4.0.0

'iSCSI Login Failed' Errors Post Reinstall on Eucalyptus Multi-Cluster Environment (Storage Controllers Configured To Use DAS/Overlay)

Problem

After reinstalling a Eucalyptus 3.0.x multi-cluster cloud with Eucalyptus 4.0.0, where the Storage Controllers (SCs) are configured to use either Overlay or Direct Attached Storage (DAS) as the block storage manager, the following errors appear in the cloud logs on the SCs if the Linux SCSI Target (TGT) state is left untouched:

2014-05-07 18:09:55 DEBUG 000029561 doAttachVolume           | [i-b210ce3c][vol-249b94c3] volume attaching (localDev=/dev/vdc)
2014-05-07 18:09:55 DEBUG 000029561 scClientCall             |  done scOps=ExportVolume clientrc=0 opFail=0
2014-05-07 18:10:02 DEBUG 000029561 connect_iscsi_target     | connect script returned: 1, stdout: '', stderr: 'Before connecting: Failed to run '/sbin/iscsiadm -m node -T iqn.2009-06.com.eucalyptus.two:store6 -p 172.16.55.3 -l': iscsiadm: Could not login to [iface: default, target: iqn.2009-06.com.eucalyptus.two:store6, portal: 172.16.55.3,3260].  iscsiadm: initiator reported error (24 - iSCSI login failed due to authorization failure)  iscsiadm: Could not log into all portals  Logging in to [iface: default, target: iqn.2009-06.com.eucalyptus.two:store6, portal: 172.16.55.3,3260] (multiple) After connecting: Unable to get attached target device.'
2014-05-07 18:10:02 ERROR 000029561 connect_ebs_volume       | Failed to connect to iSCSI target: ,,,yNIKc6ENxgQiW1HRXQkkYVoEpJc/CxEAddq1vMXKrSpHYVD2o+bypQNJg3n2WM6o3oQUhxxRU9fQFl84gQkVz7N1tnL0ss1LC17mrAi2pjqPymIH+QeOw28PO/7pbXrMP2s31b+zaSxxMPIahIf2oIZwIQYpIptk2HaxG/1D6GWjZDU6VZZfj8IrahLF9wUrwjguO4HrddgG9Z07TlJKCHbXvTyXP82vbqVtOveZBJGPrOs0VU2oWf3Uwgw+iqRrFIE9JF50LHT5O7Yy7G3aqIoNGVFvXKOmx8FZvdj+dJlLZr8YwvpI5hczdYhnhBKYI6vDBXl8AkIp4cPw0soIgg==,,172.16.55.3,iqn.2009-06.com.eucalyptus.two:store6
2014-05-07 18:10:02 DEBUG 000029561 cleanup_volume_attachmen | [vol-249b94c3] attempting to disconnect iscsi target
2014-05-07 18:10:02 DEBUG 000029561 disconnect_iscsi_target  | disconnect script returned: 0, stdout: '', stderr: 'Failed to run '/sbin/iscsiadm -m node -p 172.16.55.3 -T iqn.2009-06.com.eucalyptus.two:store6 -u': iscsiadm: No matching sessions found'
2014-05-07 18:10:02 DEBUG 000029561 scClientCall             |  done scOps=UnexportVolume clientrc=0 opFail=0
2014-05-07 18:10:02 DEBUG 000029561 get_iscsi_target         | invoking `//usr/lib/eucalyptus/euca_rootwrap //usr/share/eucalyptus/get_iscsitarget.pl /,,,,yNIKc6ENxgQiW1HRXQkkYVoEpJc/CxEAddq1vMXKrSpHYVD2o+bypQNJg3n2WM6o3oQUhxxRU9fQFl84gQkVz7N1tnL0ss1LC17mrAi2pjqPymIH+QeOw28PO/7pbXrMP2s31b+zaSxxMPIahIf2oIZwIQYpIptk2HaxG/1D6GWjZDU6VZZfj8IrahLF9wUrwjguO4HrddgG9Z07TlJKCHbXvTyXP82vbqVtOveZBJGPrOs0VU2oWf3Uwgw+iqRrFIE9JF50LHT5O7Yy7G3aqIoNGVFvXKOmx8FZvdj+dJlLZr8YwvpI5hczdYhnhBKYI6vDBXl8AkIp4cPw0soIgg==,,172.16.55.3,iqn.2009-06.com.eucalyptus.two:store6`
2014-05-07 18:10:02 DEBUG 000029561 get_iscsi_target         | get storage script returned: 1, stdout: '', stderr: 'Failed to run '/sbin/iscsiadm -m session -R': iscsiadm: No session found.'
2014-05-07 18:10:02 ERROR 000029561 doAttachVolume           | Error connecting ebs volume sc://vol-249b94c3,2Z6ApHFpGaWAkTHvv5jYel7jBScyL0kUshZqItWq1qNmCC7ZeotsoU0NgDmtjD8N/cy+3u0MORkzE4sQhrjRv2C8I2SyNWPom1WnkLosH52ouz+HKZHDSW8L8RBMLwjySdB587dPt9bj1QDdZYZpoiQBhV6Ci5Kl64oeJGHR1nk6QUkHI3woJLB2SW26ryCIgXWlMFLoGDVN8oCvajPVVlqDolFp/2ABtZjrgReEZc9u/pvowsXsQs+8ht9tJOU/vvLCS6n38RSNc3nAkYSaIC3yGGHjp2Hxw93ODL4d4Fx3lwgQPz8dBeaRv+oG28gqPSb3zx/2V81Vo43X2eXNwA==
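
To see which targets and portals are affected, a log like the one above can be filtered for the authorization failure. The following is a minimal sketch, not a Eucalyptus tool: the function name failed_logins is hypothetical, and the log file path is left for you to supply because the location depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: print "<target> <portal>" for every iSCSI login
# that was rejected with the authorization failure shown in the log above.
# $1 is the path of the log file to inspect (supply your own path).
failed_logins() {
    grep 'iSCSI login failed due to authorization failure' "$1" |
        sed -n 's/.*Could not login to \[iface: [^,]*, target: \([^,]*\), portal: \([^]]*\)\].*/\1 \2/p' |
        sort -u
}

# Usage: failed_logins <path-to-log-file>
```

Each output line names one target and portal pair that rejected the CHAP login, which helps confirm whether the failures are confined to particular clusters.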

Resolution

To resolve this issue, run the following command on each Storage Controller (SC) machine after the upgrade:

$ tgtadm --mode account --op delete --user eucalyptus
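
If you prefer to confirm that the account exists before deleting it, the command can be wrapped as in the sketch below. It assumes tgtadm is on the PATH and that the function is run as root on the SC. The function name remove_stale_account is hypothetical, and some tgtadm builds may need `--lld iscsi` added to both calls.

```shell
#!/bin/sh
# Sketch only: list the TGT accounts and delete "eucalyptus" if it is present.
# Assumes tgtadm is installed and this runs as root on the Storage Controller.
remove_stale_account() {
    user="eucalyptus"
    if tgtadm --mode account --op show | grep -qw "$user"; then
        # Same delete command as in the Resolution section.
        tgtadm --mode account --op delete --user "$user"
    else
        echo "no TGT account named $user found"
    fi
}

# Usage: remove_stale_account
```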

Cause

The issue manifests only in a multi-cluster setup using DAS/Overlay, in combination with leftover state from a previous Eucalyptus 3.0.x - 3.4.2 installation [1].

With the Direct Attached Storage (DAS) and Overlay managers, the Storage Controller (SC) configures Challenge-Handshake Authentication Protocol (CHAP) credentials on the exported volume. These credentials are passed to the Node Controllers (NCs), which then present them back to the SC for authentication. CHAP credentials are generated at bootstrap time and stored in the database. When they are generated for the first time, a corresponding user is also added to the Linux SCSI Target (TGT) framework. The username in these credentials is hard-coded to "eucalyptus".

Before generating the credentials, the SC checks whether the TGT account exists. If the account exists but there is no record of it in the database, the account is deleted and re-added, because the SC does not know its password. This works for the first SC that bootstraps. SCs that bootstrap afterwards find the credentials generated by the first SC in the database and assume that is the password to use. The result is a mismatch between the password set on the TGT account and the password sent to the NCs for authentication.
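
The bootstrap sequence described above can be illustrated with a toy model. This is a simplified sketch that makes no real TGT or database calls; every variable and function name in it is hypothetical, and it assumes two SCs whose hosts both carry a leftover TGT account from the earlier installation.

```shell
#!/bin/sh
# Toy model of the bootstrap logic described above (no real TGT or DB access).
# All names and password strings are made up for illustration.

db_password=""              # CHAP password recorded in the shared database
tgt_one="leftover-3x"       # password on the TGT account of the SC in cluster one
tgt_two="leftover-3x"       # password on the TGT account of the SC in cluster two

bootstrap_sc() {            # $1 = "one" or "two"
    if [ -z "$db_password" ]; then
        # No record in the database: delete and re-add the TGT account
        # with a freshly generated password, then store that password.
        db_password="generated-by-$1"
        case "$1" in
            one) tgt_one=$db_password ;;
            two) tgt_two=$db_password ;;
        esac
    fi
    # Otherwise the SC reuses the stored password and leaves its account alone.
}

check_login() {             # $1 = SC name, $2 = password set on its TGT account
    if [ "$2" = "$db_password" ]; then
        echo "SC $1: CHAP login succeeds"
    else
        echo "SC $1: CHAP login fails (account has '$2', NC sends '$db_password')"
    fi
}

bootstrap_sc one
bootstrap_sc two
check_login one "$tgt_one"
check_login two "$tgt_two"
```

In this model only the first SC resets its account, so the second SC keeps the leftover password while the NCs are handed the newly generated one. Deleting the leftover account before bootstrap, as in the Resolution section, removes the stale state that produces the mismatch.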

 

References

[1] Eucalyptus Jira ticket EUCA-9345: Unable to attach volumes on both clusters in multi-cluster
