ELB Service Instance terminates immediately after startup

Summary

This problem occurs after an upgrade to 4.2.0 or 4.2.1 when the default service image is installed more than once. The duplicate installation leaves the image manifest inconsistent, which prevents any new service instances from starting.

Symptoms

During the upgrade you will likely see a step that installs the default service image. If the image is already installed, running this:

esi-install-image --install-default --region localhost

will produce the following error output (scroll horizontally as needed):

Installing Service Image...
Found tarball /usr/share/eucalyptus/service-images/eucalyptus-service-image-1.48.tar.xz
Decompressing tarball: /usr/share/eucalyptus/service-images/eucalyptus-service-image-1.48.tar.xz
Installing image from eucalyptus-service-image.raw
Bundling, uploading and registering image to bucket: eucalyptus-service-image-1.48
Error: Unable to install imaging worker due to:
euca-install-image: error (InvalidAMIName.Duplicate): AMI name eucalyptus-service-image-v1.48 is already in use by EMI emi-e92546f9

Afterwards, any newly created ELB (service instance) terminates immediately. The cloud-tracking.log on the CLC shows the following error:

2016-01-05 16:54:29 ERROR | [c7e928b0-8ce4] com.eucalyptus.compute.common.internal.util.MetadataException: com.eucalyptus.compute.common.internal.util.MetadataException: Instance manifest was changed after registration
2016-01-05 16:54:29 WARN | [c7e928b0-8ce4] Aborting resource token: ResourceToken:i-db7a564c:resources=TypedContext:{com.eucalyptus.util.TypedKey(NetworkResources)=[com.eucalyptus.compute.common.network.PublicIPResource(), com.eucalyptus.compute.common.network.PrivateIPResource(d0:0d:db:7a:56:4c)]}

A similar error message appears in the cloud-debug.log on the CLC:

Tue Jan 5 16:54:29 2016 ERROR [ClusterAllocator:Eucalyptus.cluster:EphemeralConfiguration:arn:euca:eucalyptus::cluster:com.eucalyptus.cloud.run.ClusterAllocator.SubmitAllocation-192.168.1.241-one-cc-1/.class java.util.concurrent.ThreadPoolExecutor$Worker#533] com.eucalyptus.compute.common.internal.util.MetadataException: com.eucalyptus.compute.common.internal.util.MetadataException: Instance manifest was changed after registration

The string to look for in both log excerpts is:

Instance manifest was changed after registration
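
A quick way to confirm the symptom is to search both logs for that string. The following is a minimal sketch, not an official tool: the function name find_manifest_error is hypothetical, and the log directory /var/log/eucalyptus is an assumption that may differ on your installation.

```shell
# Assumption: CLC logs live under /var/log/eucalyptus; override LOG_DIR if not.
LOG_DIR="${LOG_DIR:-/var/log/eucalyptus}"

# Hypothetical helper: print every matching line, prefixed with the file name
# and line number. Returns 0 if the string is found in at least one file.
find_manifest_error() {
    grep -n -H -F 'Instance manifest was changed after registration' "$@"
}

# Usage on the CLC:
#   find_manifest_error "$LOG_DIR"/cloud-tracking.log "$LOG_DIR"/cloud-debug.log
```

The -F option makes grep treat the pattern as a fixed string, so no character in the message is interpreted as a regular expression.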

Solution

Run all commands on the CLC as root. Use the following steps to resolve the problem:

  1. eval `clcadmin-assume-system-credentials`
  2. esi-describe-images
    • Note the ID of the default image used for each service instance type
  3. euca-deregister <image_id>
    • where <image_id> is the ID of the service image noted in step 2
  4. esi-install-image --install-default --region localhost
    • This command should now complete without any error messages
  5. esi-describe-images
    • Note the new ID of the default service image

ELBs and other service instances should now launch successfully.
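
Steps 3 through 5 above can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name reinstall_default_service_image and the RUN dry-run prefix are hypothetical conveniences, not part of Eucalyptus. The image ID must still be read manually from the esi-describe-images output in step 2, because this sketch does not try to parse that output.

```shell
# Hypothetical helper covering steps 3-5. RUN is an optional command prefix:
# set RUN=echo to print the commands instead of executing them (dry run).
reinstall_default_service_image() {
    # Refuse anything that does not look like an EMI ID such as emi-e92546f9.
    case "$1" in
        emi-*) ;;
        *) echo "usage: reinstall_default_service_image emi-XXXXXXXX" >&2
           return 2 ;;
    esac
    # Stop at the first failing command so a failed deregistration
    # is not followed by another install attempt.
    ${RUN:-} euca-deregister "$1" &&
    ${RUN:-} esi-install-image --install-default --region localhost &&
    ${RUN:-} esi-describe-images
}

# Usage on the CLC as root, after steps 1 and 2:
#   eval `clcadmin-assume-system-credentials`
#   esi-describe-images
#   RUN=echo; reinstall_default_service_image emi-e92546f9   # dry run first
#   RUN=;     reinstall_default_service_image emi-e92546f9   # real run
```

The ID emi-e92546f9 is taken from the example error output above; substitute the ID reported on your own cloud.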
