I'm working with three Lambdas: the first is triggered by an API Gateway, the second by an SQS FIFO queue (let's call it queue1), and the third by another SQS FIFO queue (let's call it queue2). Each of those FIFO queues also has its own FIFO DLQ (two DLQs in total).
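For reference, the first Lambda does roughly the following (a simplified sketch, assuming Python; QUEUE1_URL is a placeholder for my actual queue URL):

import json
import os
import uuid

import boto3

sqs = boto3.client("sqs")

# Placeholder: the real queue URL is provided via configuration.
QUEUE1_URL = os.environ["QUEUE1_URL"]


def handler(event, context):
    # Triggered by API Gateway; forwards the request body to queue1.
    sqs.send_message(
        QueueUrl=QUEUE1_URL,
        MessageBody=json.dumps(event.get("body", "")),
        # FIFO queues require a MessageGroupId, and a MessageDeduplicationId
        # too unless content-based deduplication is enabled on the queue.
        MessageGroupId="workflow",
        MessageDeduplicationId=str(uuid.uuid4()),
    )
    return {"statusCode": 200, "body": "queued"}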
So, whenever I trigger the first Lambda through the API Gateway, I can see the message being processed in the queues: it appears in flight for queue1, and if the maximum receive count is exceeded (each expired visibility timeout counts as a failed receive), the message goes to queue1's DLQ. Nevertheless, I cannot see any logs in CloudWatch.
At first, one might think this is a permissions problem. I have an IAM role with the permissions shown in the image below, and this same role is used by all three Lambdas.
In the image you can see that I have two custom inline policies, which allow sending and receiving messages on my SQS FIFO queues. There is one policy per queue, and this is what each of them contains:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
                "sqs:GetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:<rest-of-the-arn>.fifo"
        }
    ]
}
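Besides the SQS policies, the role also has the CloudWatch Logs permissions from AWSLambdaBasicExecutionRole (logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents). To double-check that the role can actually write logs, I can simulate it with boto3 (a rough sketch; the role ARN below is a placeholder):

import boto3

iam = boto3.client("iam")

# Placeholder: substitute the shared role's actual ARN.
ROLE_ARN = "arn:aws:iam::123456789012:role/my-lambda-role"

response = iam.simulate_principal_policy(
    PolicySourceArn=ROLE_ARN,
    ActionNames=[
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
    ],
)

# Each action should evaluate to "allowed".
for result in response["EvaluationResults"]:
    print(result["EvalActionName"], "->", result["EvalDecision"])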
I have read this thread and tried what it suggests (checking that AWSLambdaBasicExecutionRole is attached, changing the function timeout by a second, etc.), and it works for the first execution after the change. But if I then trigger my workflow through the API Gateway again, no log is printed, even though everything else works fine.
The only thing I have not tried yet is using one role per Lambda and explicitly specifying the ARNs of the CloudWatch log groups. Nevertheless, I don't think that is the problem, since re-deploying the functions each time I want a log makes everything work fine, but only once.
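One more check I can do, to rule out simply looking in the wrong place, is to search the whole log group across all of its streams instead of opening streams one by one. A rough boto3 sketch (the log group name is a placeholder for one of my functions' groups):

import time

import boto3

logs = boto3.client("logs")

# Placeholder: Lambda uses /aws/lambda/<function-name> by default.
LOG_GROUP = "/aws/lambda/my-second-function"

# Search every stream in the group for events from the last 15 minutes.
response = logs.filter_log_events(
    logGroupName=LOG_GROUP,
    startTime=int((time.time() - 15 * 60) * 1000),  # epoch milliseconds
)

for event in response["events"]:
    print(event["logStreamName"], event["message"].rstrip())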
What do you think could be causing this?
2 Answers
I think I found the "problem". There was actually no problem at all; it was just my faulty understanding of how log streams work.
@Mark B told me to check all the log streams in the log group. The logs of the function executions were being written to existing log streams instead of creating new ones.
So, I understand this is the desired behavior: a new log stream is created only when the service connected to the log group (in this case, the Lambda function) is modified. If the function is not modified, the logs for all executions are stored in the same log stream.
Is that assumption correct?
I'm also noticing that after a certain time (~5 min), executing the Lambda function causes a new log stream to be created. Is that normal? Is there a way to change that interval?
Now, if that is correct, how many messages can be stored in a single log stream?
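For anyone who wants to verify this behavior, one way is to list the streams ordered by their last event time and check whether a fresh invocation reused the most recent stream or created a new one (a sketch; the log group name is a placeholder):

import boto3

logs = boto3.client("logs")

# Placeholder: the log group of one of the functions.
LOG_GROUP = "/aws/lambda/my-second-function"

# List streams with the most recently written one first.
response = logs.describe_log_streams(
    logGroupName=LOG_GROUP,
    orderBy="LastEventTime",
    descending=True,
    limit=5,
)

for stream in response["logStreams"]:
    print(stream["logStreamName"], stream.get("lastEventTimestamp"))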
UPDATE: As @Mark B specified in a comment on my first answer, the first time a Lambda function executes after being created, or after its code or resource configuration is updated, a new container with the appropriate resources is created to execute it. If another execution happens without the code having been modified and without too much time passing, Lambda reuses that container.
Each container has its own log stream, which is why the logs were sometimes written to an existing log stream and sometimes to a new one.
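A simple way to observe this is a toy handler that prints its own stream name via the context object; on warm invocations the name repeats, and a cold start produces a new one (a sketch, assuming Python):

def handler(event, context):
    # The stream name stays the same across warm invocations of one container
    # and changes when a new container (cold start) spins up.
    print("log stream:", context.log_stream_name)
    # The request id is unique per invocation, so it distinguishes executions
    # that share a stream.
    print("request id:", context.aws_request_id)
    return {"statusCode": 200}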