Debugging Unresponsive Eucalyptus Java Components After AWS API Stress Testing

Eucalyptus Versions:  3.3.x to 3.4.x

AWS has mechanisms in place to help ensure that API requests are handled appropriately, especially under heavy load.  With Eucalyptus, the cloud administrator must handle this, either by load balancing or by throttling connections with iptables.  Even so, the Eucalyptus Java components that serve the AWS APIs to clients can become unresponsive under heavy load.  The Eucalyptus Java components responsible for providing AWS service APIs are as follows:

If this load is reproducible (for example, stress testing driven by a script or program), gather debugging information with the steps below.  Steps that are highlighted and italicized are not directly supported and should only be performed by customers who are instructed to do so by Eucalyptus Support.  Non-paying customers can run the highlighted and italicized steps, but at their own risk.

  • Stop the eucalyptus-cloud process (i.e. service eucalyptus-cloud stop)
  • Update the CLOUD_OPTS variable in the /etc/eucalyptus/eucalyptus.conf file by adding the following:
    • CLOUD_OPTS="-Dmule.verbose.exceptions=true"
  • Start the eucalyptus-cloud process (i.e. service eucalyptus-cloud start)
  • Make sure the cloud property cloud.euca_log_level is set to DEBUG
    • euca-modify-property --property cloud.euca_log_level=DEBUG
  • Before starting the stress test script, start tcpdump on the interface that the Eucalyptus Java component uses to provide the AWS API service (e.g. em1).  For example:
    • tcpdump -i em1 -As0 -w api-traffic tcp port 8773
  • Download the jmx-all.groovy file and use it with the euca-modify-property command while the stress test script is running.  If the output completes before the stress test script finishes, rerun the command, saving the output to a different file each time.  For example:
    • euca-modify-property --property-from-file euca=/path/to/jmx-all.groovy | tee jmx-all.log.1
    • euca-modify-property --property-from-file euca=/path/to/jmx-all.groovy | tee jmx-all.log.2
  • Grab a list of all the threads used by the eucalyptus-cloud process:
    • euca-modify-property --property euca='Thread.getAllStackTraces( ).collect{ thread -> "\n"+thread?.getKey()?.getName( ) }' | sort > euca-list-threads.log
  • Create a summary from the list of threads provided from the previous step:
    • egrep ':' euca-list-threads.log | sed 's/:.*//g' | sort | uniq -c | tee euca-list-threads-summary.log
  • Get a list of the files opened by each running eucalyptus-cloud process:
    • pgrep -f eucalyptus-cloud | xargs -I {} -P2 lsof -p {} > pgrep-output.txt
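  • Get a thread dump by sending SIGQUIT to the process with the killall command.  The JVM writes the thread dump to its own standard output rather than to killall's, so the redirected file normally stays empty and the dump itself should appear wherever the standard output of the eucalyptus-cloud process is logged:
    • killall -QUIT eucalyptus-cloud > killall-output.txt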
  • Install eucalyptus-sos-report plugins, and run the sosreport command.
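
To illustrate what the thread summary step above produces, here is a minimal sketch that runs the same filter-and-count pipeline on a few made-up thread names.  The names are hypothetical samples, not real eucalyptus-cloud output, and grep -E is used as the equivalent of egrep.  The sketch assumes that the thread names of interest consist of a prefix followed by a colon.

```shell
# Hypothetical thread names (illustrative only, not real Eucalyptus output).
sample_threads='pool-a:worker-1
pool-a:worker-2
pool-b:worker-1
main'

# Same pipeline as the summary step: keep names containing ':', strip
# everything from the first ':' onward, then count each remaining prefix.
summary=$(printf '%s\n' "$sample_threads" | grep -E ':' | sed 's/:.*//g' | sort | uniq -c)
echo "$summary"
```

Each output line is a count followed by a thread-name prefix, so a prefix whose count keeps growing between runs is a reasonable place to start looking.  Names without a colon, such as main in the sample, are dropped by the first filter.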

Once these steps are complete, paying customers should put all the files created by these steps into a gzip-compressed tar file, collect the compressed sosreport file, and provide both to the Eucalyptus Support Team.  Non-paying customers should send this information to the Eucalyptus Users mailing list, along with a description of the stress test being performed.
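
The bundling step can be sketched as a small shell function.  This is only an illustration: bundle_debug_files is a hypothetical helper, not part of Eucalyptus, and the file names are the ones used in the example commands above, so adjust them if you saved output under other names.  The sosreport archive is not included and still needs to be attached separately.

```shell
# Sketch: collect the debug files named in the steps above into one archive.
# bundle_debug_files is a hypothetical helper, not a Eucalyptus tool.
bundle_debug_files() {
  src_dir=$1   # directory where the debugging commands were run
  bundle=$2    # path of the tar.gz archive to create
  found=""
  for f in api-traffic jmx-all.log.1 jmx-all.log.2 euca-list-threads.log \
           euca-list-threads-summary.log pgrep-output.txt killall-output.txt; do
    # Skip files that a given run did not produce.
    if [ -e "$src_dir/$f" ]; then
      found="$found $f"
    fi
  done
  if [ -z "$found" ]; then
    echo "no debug files found in $src_dir" >&2
    return 1
  fi
  # -C keeps the paths inside the archive relative to src_dir.
  # $found is intentionally unquoted so it splits into separate file names.
  tar -czf "$bundle" -C "$src_dir" $found
}

# Example call: bundle_debug_files . euca-debug-files.tar.gz
```

Running tar -tzf on the resulting archive lists its contents, which is a quick way to confirm what will be sent.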
