ElasticRecheck

Notes and FAQs on elastic-recheck (e-r): how to use it and how to contribute to it.

When you hit a failure and there is no e-r query comment on your patch, but you do find a bug to recheck against, you should look at writing an e-r query for it so you don't have to dig through the logs next time. Lots of people check the http://status.openstack.org/rechecks/ page, but not all of the bugs listed there have e-r queries.

So what's the thought process for writing an e-r query (best practices)?

First, either identify or open the bug to recheck against; that's standard operating procedure.
- See here for more info: https://wiki.openstack.org/wiki/GerritJenkinsGit#Test_Failures
Second, check the logs of the failed job for something that uniquely identifies the failure for the bug.
- Avoid general error messages from Tempest in console.html, since those aren't always unique to one bug.
- Look for errors/warnings in the various log files, e.g. logs/screen-n-cpu.txt, and pull information from them.
Third, test your query out in http://logstash.openstack.org:
  - Typically start with a simple message and filename query over the last 7 days.
  - Query is structured like this: message:"<your unique fail here>" AND filename:"<the log that the failure message appears in relative to the root of the job logs>"
    - For example: message:"because vif doesn't exist" AND filename:"logs/screen-n-net.txt"
  - If you have hits, make sure there are no false positives by checking 'build_status' on the left side of the logstash page; it shows the success/failure rate for the builds that the query hits. A good e-r query needs a 100% failure rate, since any hit on a successful build means the message is not unique to the bug.
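
The query structure and the build_status check above can be sketched in a few lines of Python. This is only an illustration, not part of elastic-recheck: the helper names build_query and failure_rate are hypothetical, the hits are made-up sample data rather than real logstash results, and the 'FAILURE'/'SUCCESS' values for build_status are an assumption about how the field is populated.

```python
def build_query(message, filename):
    """Return a 'message AND filename' query string in the form shown above."""
    # Escape backslashes and double quotes so the message stays one quoted phrase.
    escaped = message.replace("\\", "\\\\").replace('"', '\\"')
    return 'message:"%s" AND filename:"%s"' % (escaped, filename)


def failure_rate(hits):
    """Return the fraction of hits whose build_status is FAILURE.

    `hits` is a list of dicts with a 'build_status' key (hypothetical shape).
    Returns None when there are no hits, since no rate can be computed.
    """
    if not hits:
        return None
    failed = sum(1 for hit in hits if hit.get("build_status") == "FAILURE")
    return failed / len(hits)


query = build_query("because vif doesn't exist", "logs/screen-n-net.txt")
print(query)

# Made-up sample data: one hit on a successful build means the query
# is not specific enough to be a good e-r query.
sample_hits = [
    {"build_status": "FAILURE"},
    {"build_status": "FAILURE"},
    {"build_status": "SUCCESS"},
]
rate = failure_rate(sample_hits)
print("good e-r query" if rate == 1.0 else "query also hits passing builds")
```

With the sample data above the rate is below 100%, so the message would need to be narrowed (or a different log line chosen) before the query is worth pushing up.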
TODO steps for writing the e-r query and pushing it up
TODO steps for what to do when a bug is resolved and we can archive the query with the 'resolved_at' field.