In Azure Data Factory, I recently switched a Cosmos DB connection from a connection string to a managed identity with RBAC (the Cosmos DB Built-in Data Reader role).
However, early in the pipeline, when many activities run simultaneously against the Cosmos database, some of them fail with a 408 timeout (see error below).
On the Cosmos side, the database appears healthy, with no reported throttling and not much usage.
When I go back to the connection string method, I do not experience this error and everything works as expected. I’m unsure if I’m doing anything wrong or if perhaps this is a limitation of this type of authentication with this database connection.
Operation on target My Activity Name failed: Failure happened on 'Source' side. ErrorCode=UserErrorDataStoreServiceThrottling,'Type=Microsoft.DataTransfer.Common.Shared.DataStoreThrottlingException,Message=Response status code does not indicate success: RequestTimeout (408); Substatus: 0; ActivityId: redacted; Reason: (GatewayStoreClient Request Timeout. Start Time UTC:4/10/2024 7:44:22 PM; Total Duration:36023.7489 Ms; Request Timeout 20000 Ms; Http Client Timeout:65000 Ms; Activity id: 1942511c-0e19-47d7-940d-4b49847c2f8c;);,Source=Microsoft.DataTransfer.ClientLibrary.CosmosDbSqlApiV3,''Type=Microsoft.Azure.Cosmos.CosmosException,Message=Response status code does not indicate success: RequestTimeout (408); Substatus: 0; ActivityId: redacted; Reason: (GatewayStoreClient Request Timeout. Start Time UTC:4/10/2024 7:44:22 PM; Total Duration:36023.7489 Ms; Request Timeout 20000 Ms; Http Client Timeout:65000 Ms; Activity id: redacted;);,Source=Microsoft.Azure.Cosmos.Client,''Type=System.Threading.Tasks.TaskCanceledException,Message=A task was canceled.,Source=mscorlib,'
3 Answers
I’m not sure if this is your issue, but have you checked other kinds of throttling, such as Entra ID token throttling?
https://github.com/microsoftgraph/microsoft-graph-docs-contrib/blob/main/concepts/throttling-limits.md#identity-and-access-service-limits
It is not the same error code, but the underlying code might not surface in the pipeline.
No, there is no execution limit tied to a particular authentication method.
Request throttling is the most common issue in Azure Cosmos DB. Azure Cosmos DB throttles requests if they exceed the allocated request units for the database or container.
The error suggests that so many read/write requests arrived at once that a limit was hit on Cosmos DB, which fits your note that many activities run against it simultaneously.
As a solution, you can try increasing the RU at the database or container level, or you can set the Retry option on the copy activity.
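In ADF itself, retries are configured through the activity's Retry and Retry interval settings, not in code. To illustrate what a retry does for a transient failure like the 408 above, here is a minimal Python sketch of retrying with exponential backoff and jitter. `TransientError` and `retry_with_backoff` are hypothetical names made up for this example and are not part of ADF or the Cosmos DB SDK.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as HTTP 408 or 429."""


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The delay doubles on each attempt and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so that parallel activities do not
            # all hit the database again at the same instant.
            sleep(delay * random.uniform(0.5, 1.0))
```

The jitter matters most in a case like this one, where many activities start at the same time and would otherwise also retry at the same time.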
Alternatively, in Azure Data Factory (ADF) you can divide the input file into smaller segments, process these batches in sequence, and manually introduce brief pauses between batches. This approach helps ensure that sufficient Request Units (RU) remain available for other operations between batch executions.
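The batch-and-pause idea can be sketched in a few lines of Python. This is only an illustration of the pattern, assuming the work can be expressed as a list of items. In an ADF pipeline the same role would typically be played by a ForEach activity set to run sequentially with a Wait activity inside it. `chunked`, `process_in_batches`, and `process_batch` are hypothetical names for this sketch.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def process_in_batches(items: Iterable[T], batch_size: int,
                       process_batch: Callable[[List[T]], None],
                       pause_seconds: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Process batches one after another, pausing between them.

    Returns the number of batches processed.
    """
    count = 0
    for batch in chunked(items, batch_size):
        if count:
            sleep(pause_seconds)  # pause between batches, not before the first
        process_batch(batch)
        count += 1
    return count
```

For example, `process_in_batches(range(7), 3, print, pause_seconds=0.5)` handles the items in three batches with a half-second pause between them.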
This is closer to your error code HTTP 408 and might be useful:
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/troubleshoot-dotnet-sdk-request-timeout?tabs=cpu-new