Blueprint-ec2-error-codes


 * Launchpad Entry: NovaSpec:ec2-error-codes
 * Created: 19 Nov 2012
 * Contributors:

Summary
This proposal is about managing error responses in foreign API support code, especially for EC2, in a unified way.

Introduction
The EC2 API has well-defined Error Codes, which programs can use to take corrective action. Without correct EC2 Error Codes, the EC2 API is very difficult to use.

Amazon provides documentation: Error Codes

An EC2 error response contains three important pieces of information:
 * HTTP status code: The documentation just distinguishes between Client Errors (4**) and Server Errors (5**).
   * Client Error: The client did something wrong, for example it sent an invalid or malformed attribute, or tried an action that is not possible in the resource's current state.
   * Server Error: The server is unable to carry out the request through its own fault, for example because of insufficient resources or database connection issues.
 * Error Code (text): Usable by scripts/programs to take the right corrective action. The same EC2 Error Code can be a client or a server error, distinguishable by the HTTP status code.
 * Message part: Human-readable text. It can be translated to the end user's native language.
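The three parts can be illustrated with a minimal Python sketch. The function name `render_ec2_error` and the request ID value are hypothetical, and the XML layout (`Response/Errors/Error/Code` and `Message`, plus `RequestID`) is assumed from the commonly documented EC2 Query API error shape rather than taken from this proposal.

```python
from xml.sax.saxutils import escape


def render_ec2_error(status, code, message, request_id):
    """Return (http_status, xml_body) for an EC2-style error response.

    Hypothetical helper: the XML layout is an assumption about the
    EC2 Query API error format, not something this proposal defines.
    """
    body = (
        '<?xml version="1.0"?>\n'
        '<Response><Errors><Error>'
        '<Code>%s</Code><Message>%s</Message>'
        '</Error></Errors>'
        '<RequestID>%s</RequestID></Response>'
    ) % (escape(code), escape(message), escape(request_id))
    return status, body


def is_client_error(status):
    """4** means the client did something wrong; 5** is the server's fault."""
    return 400 <= status < 500


status, body = render_ec2_error(
    400, "InvalidParameterValue", "Bad <value> supplied", "req-1")
print(status, is_client_error(status))
print(body)
```

A client script would branch on the `Code` element, use the HTTP status to tell client errors from server errors, and show the `Message` element to a human.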

The EC2 Error Codes in OpenStack Nowadays
 * nova/api/ec2/cloud.py raises exceptions with EC2APIError Error Code in 38 cases.
 * nova/api/ec2/faults.py renders the HTTP Status Codes (5**) as the EC2 Error Code.
 * nova/api/ec2/__init__.py renders the exception class name as the Error Code 14 times and logs them at different log levels.
 * nova/api/ec2/__init__.py says "Unauthorized" 3 times.

In every other case you get the response "UnknownError": "An unknown error has occurred. Please try your request again."

You can see the UnknownError very frequently. Imagine getting this kind of error when you are a remote customer who cannot see the log files.

Why This Document Was Created

 * Hundreds of wrong EC2 Error Codes could each be reported as a separate bug.
 * To avoid unnecessarily divergent solutions and unnecessary questions about the implementation.

Design

 * Make the exceptions self-describing when an exception is translated to an error response
 * Consider EC2 SOAP transport and other APIs
 * Avoid leaking confidential information

Implementation

 * 1) Use the exception class name instead of the UnknownError
 * 2) Modify the Executor according to the Code Changes part
 * 3) Replace the EC2APIErrors with EC2 exceptions or add parameters to the OS exceptions
 * 4) Add attributes to the existing exceptions and create new EC2 exceptions
 * 5) Modify the unit tests so that they do not allow non-EC2 Error Codes in EC2 error responses
 * 6) Modify the unit tests to verify the correct EC2 Error Code
 * 7) Fix the Server (faults.py) EC2 Error Codes

Code Changes
 * The EC2 API implementation MAY use the exception class name as the EC2 Error Code if it has nothing better to use.
   Rationale:
   * Helps in tracing exceptions which do not have proper coding
   * End users can temporarily use it as a workaround
 * All exceptions MAY contain foreign API data, because it seems very difficult to translate it on the API implementation side. The attributes for a foreign API should be prefixed by the API name, like 'ec2_'.
   EC2 case:
   * ec2_error_code - The Error Code; when it is specified it MUST be used as the EC2 Error Code
   * ec2_status - HTTP status code
   * ec2_message - Message for the EC2 error response
   * ec2_loglevel - The correct log level, if it cannot be known otherwise
   Rationale:
   * Avoid confusion
   * The HTTP Status Code can be different in the EC2 and in the OS API specification
   Note:
   * The HTTP status code will, with high probability, not need to differ from the OS "code" attribute
   * The logging strategy could be the same as with the OS API
   * The final message could be the same as the OS API message
   * When 'ec2_status' is missing, the EC2 error response should use the same 'code' as an OS error would
   * When 'ec2_message' is missing, the EC2 error response should use the same 'message' as an OS error would
 * API-specific exceptions SHOULD be suffixed by the API name, like EC2. In the EC2 case the trailing 'EC2' MUST be removed from the Error Code.
   Rationale:
   * Helps to pick the right exception for the right API
   Note:
   * In EC2-marked exception classes you MAY define attributes without a prefix
 * The EC2 Error Code can contain '.' but does not contain '_'. The '_' MUST be translated to '.' when you create an error code based on the exception name.
   Rationale:
   * Avoids the need to specify an ec2_error_code that is nearly identical to the exception name
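The lookup rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`ec2_error_code_for`, `ec2_status_for`, `ec2_message_for`) and the two example exception classes are hypothetical, and the sketch does not cover the unprefixed attributes that EC2-marked classes may define.

```python
def ec2_error_code_for(exc):
    """Derive the EC2 Error Code for an exception instance."""
    explicit = getattr(exc, "ec2_error_code", None)
    if explicit:
        # When ec2_error_code is specified it MUST be used.
        return explicit
    name = type(exc).__name__
    if name.endswith("EC2"):
        # The trailing 'EC2' marker is not part of the Error Code.
        name = name[:-len("EC2")]
    # EC2 Error Codes may contain '.', but never '_'.
    return name.replace("_", ".")


def ec2_status_for(exc, default=500):
    """Prefer ec2_status, then the OS API 'code', then a default."""
    status = getattr(exc, "ec2_status", None)
    if status is None:
        status = getattr(exc, "code", default)
    return status


def ec2_message_for(exc):
    """Prefer ec2_message, then the OS API 'message', then str(exc)."""
    message = getattr(exc, "ec2_message", None)
    if message is None:
        message = getattr(exc, "message", None) or str(exc)
    return message


# Hypothetical OS exception carrying prefixed EC2 data.
class InstanceNotFound(Exception):
    code = 404
    ec2_error_code = "InvalidInstanceID.NotFound"


# Hypothetical EC2-specific exception; the code comes from the class name.
class InvalidVolume_NotFoundEC2(Exception):
    ec2_status = 400


for exc in (InstanceNotFound("no such instance"),
            InvalidVolume_NotFoundEC2("no such volume")):
    print(ec2_status_for(exc), ec2_error_code_for(exc), ec2_message_for(exc))
```

Keeping the fallbacks in small accessor functions means an OS exception without any 'ec2_' attribute still produces a usable, if imperfect, EC2 error response.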

Security and Internal Server Error Considerations
The message part of Server Errors (5**) is not necessarily designed for end users. For example: "Failed to connect to the DB with user=root, password=S3cr3t"

When you need to send an error response related to an unknown internal error, you MUST replace the original message with something general and MUST log the original exception at critical or error level.

The exception's message is considered viewable by the end user when:
 * The exception has a "code" attribute and it is 4**, or
 * The exception has an "ec2_status" or "ec2_message" attribute

Otherwise the message part MUST be sanitized.

Exception class names are not considered confidential data; however, you SHOULD use 'InternalError' as the server error code unless otherwise specified by the EC2 API specification.
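A minimal Python sketch of these sanitizing rules might look like this. The names `message_is_user_viewable`, `safe_error`, and `GENERIC_MESSAGE`, as well as the generic message text and the example exception classes, are hypothetical and chosen only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical generic text shown instead of an internal message.
GENERIC_MESSAGE = "An internal error has occurred. Please try your request again."


def message_is_user_viewable(exc):
    """Apply the two rules: a 4** 'code', or explicit EC2 attributes."""
    code = getattr(exc, "code", None)
    if isinstance(code, int) and 400 <= code < 500:
        return True
    return hasattr(exc, "ec2_status") or hasattr(exc, "ec2_message")


def safe_error(exc):
    """Return (error_code, message) that is safe to send to the client."""
    if message_is_user_viewable(exc):
        code = getattr(exc, "ec2_error_code", type(exc).__name__)
        return code, str(exc)
    # Unknown internal error: log the original, send only a general message.
    LOG.error("Unexpected error: %s", exc, exc_info=exc)
    return "InternalError", GENERIC_MESSAGE


class DBConnectionError(Exception):
    """Hypothetical internal error with a confidential message."""


class BadRequest(Exception):
    """Hypothetical client error; its message is meant for end users."""
    code = 400


print(safe_error(DBConnectionError("user=root, password=S3cr3t")))
print(safe_error(BadRequest("malformed attribute")))
```

The first call reports 'InternalError' with the generic text while the original message goes only to the log; the second passes the client-facing message through unchanged.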

Additional Notes

 * A possible future EC2 SOAP API MUST prefix the EC2 Error Code with
   * 'Client.' - when the HTTP Status Code is 4**
   * 'Server.' - when the HTTP Status Code is 5**
 * Since EC2 does not specify the exact HTTP Status Code, you could use the same as the EC2 specification author or make your decision based on HTTP RFCs like rfc2616.
 * Cases might exist where EC2 requires a *.NotFound error response, but the OS API uses an empty list
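The SOAP prefix rule from the first note can be sketched as a small Python function. The name `soap_error_code` is hypothetical, and raising `ValueError` for a status outside 4** and 5** is an assumption of this sketch, since the proposal does not cover that case.

```python
def soap_error_code(error_code, http_status):
    """Prefix an EC2 Error Code for a possible SOAP transport."""
    if 400 <= http_status < 500:
        return "Client." + error_code
    if 500 <= http_status < 600:
        return "Server." + error_code
    # Assumption: only error statuses make sense here.
    raise ValueError("not an error status: %r" % (http_status,))


print(soap_error_code("InvalidInstanceID.NotFound", 400))
print(soap_error_code("InternalError", 500))
```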

Test/Demo Plan
Unit tests and tempest.