PagerDuty Guide
Context
We have various queues which alert on failures in Sirius. These include the Dead Letter Queue and the Client Queue. You can find them under SQS in AWS; there are separate versions for adhoc, preproduction and production.
When there is an issue, the person on call in PagerDuty will be alerted, and the failed messages will be visible on the SQS Queues page (in the Messages available column).
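The same count can be read outside the console. The sketch below is a minimal example, assuming boto3 is available and that you pass in a client created with boto3.client("sqs"). The function name and the queue URL are hypothetical; the Messages available column corresponds to the ApproximateNumberOfMessages queue attribute.

```python
def messages_available(sqs_client, queue_url):
    """Return the approximate number of messages waiting on a queue.

    sqs_client is expected to behave like boto3.client("sqs").
    queue_url is a placeholder supplied by the caller.
    """
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    # SQS returns attribute values as strings, so convert to int.
    return int(response["Attributes"]["ApproximateNumberOfMessages"])
```

A non-zero result on one of the alerting queues means there is something to investigate.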
Get more info on the failure
You can do this for the dead letter queues by clicking on the queue name -> Send and receive messages -> Poll for messages.
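The console steps above could also be scripted. This is a minimal sketch, assuming boto3 and a client from boto3.client("sqs"); the function name and the queue URL are hypothetical. It reads message bodies without deleting anything. Note that each receive still increases a message's receive count.

```python
def peek_messages(sqs_client, queue_url, max_messages=10):
    """Return the bodies of up to max_messages messages, leaving them on the queue.

    sqs_client is expected to behave like boto3.client("sqs").
    """
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
        WaitTimeSeconds=5,                 # long poll briefly for messages
        VisibilityTimeout=0,               # make messages visible again straight away
    )
    # The "Messages" key is absent when the queue has nothing to return.
    return [message["Body"] for message in response.get("Messages", [])]
```

With real credentials this might be called as peek_messages(boto3.client("sqs"), queue_url), where queue_url is the URL of the dead letter queue for the environment you are checking.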
Investigating
Look at the logs from around the time of the alert; you can also query them in Logs Insights.
When you find a relevant entry, open it. The @log field tells you which log group it came from (go into Log groups and search for it).
The @logStream field gives the ID of the specific log stream the entry is in; search for that ID in the box and open the stream for more info.
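A Logs Insights search like the one described above can be sketched in code. This is only an illustration, assuming boto3 and a client from boto3.client("logs"). The function names, the 15 minute window and the search term are all assumptions, not part of the existing process.

```python
from datetime import datetime, timedelta, timezone


def window_around(alert_time, minutes=15):
    """Return (start, end) epoch seconds covering the time around an alert."""
    delta = timedelta(minutes=minutes)
    return int((alert_time - delta).timestamp()), int((alert_time + delta).timestamp())


def build_insights_query(search_term, limit=20):
    """Build a Logs Insights query that shows @log and @logStream for matches."""
    if '"' in search_term:
        raise ValueError("search_term must not contain double quotes")
    return (
        "fields @timestamp, @message, @log, @logStream\n"
        f'| filter @message like "{search_term}"\n'
        "| sort @timestamp desc\n"
        f"| limit {limit}"
    )


def start_insights_query(logs_client, log_group, search_term, alert_time):
    """Start the query and return its ID (results are fetched separately)."""
    start, end = window_around(alert_time)
    response = logs_client.start_query(
        logGroupName=log_group,  # hypothetical log group name from the caller
        startTime=start,
        endTime=end,
        queryString=build_insights_query(search_term),
    )
    return response["queryId"]
```

The returned ID can then be passed to the client's get_query_results call once the query has finished. Each result row includes the @log and @logStream values needed to find the full log stream.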
Resending
If it’s a dead letter queue message, you can resend it from that page.
If it’s a client queue message, you can resend it by sending a new message to the SQS queue. This message will look similar to the message on the DLQ.
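Resending to a queue can be sketched as below. This is a minimal example, assuming boto3, a client from boto3.client("sqs"), and a standard (non-FIFO) queue; a FIFO queue would need extra parameters. The function name and the queue URL are hypothetical, and the body is whatever message you have decided to resend.

```python
def resend_message(sqs_client, queue_url, body):
    """Send a message body to a queue and return the new message ID.

    sqs_client is expected to behave like boto3.client("sqs").
    """
    if not body:
        # SQS rejects empty bodies, so fail early with a clearer error.
        raise ValueError("message body must not be empty")
    response = sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    return response["MessageId"]
```

Check the body against the message on the DLQ before sending, and make sure the queue URL belongs to the right environment (adhoc, preproduction or production).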