
Here is pseudo-code for what my current Lambda function looks like:

import pandas
import pymysql


def get_db_data(con_):
    query = "SELECT * FROM mytable"
    data = pandas.read_sql(query, con_)
    return data


def lambda_handler(event, context):
    con = pymysql.connect()
    data = get_db_data(con)
    """
    do other things with event
    """
    con.close()

I am debating if I can do this instead:

import pandas
import pymysql

con = pymysql.connect()


def get_db_data(con_):
    query = "SELECT * FROM mytable"
    data = pandas.read_sql(query, con_)
    return data


data = get_db_data(con)


def lambda_handler(event, context):
    """
    do other things with event
    """
    con.close()

But I am not sure if it is a good practice. What implications would the second option have on run-time and cost? Is it against the recommended way?

2 Answers


  1. In summary, both approaches have their merits, and the choice depends on factors such as cold start time sensitivity, resource management, and the nature of your Lambda workload. To reuse connections, consider using connection pooling and ensuring proper scoping to mitigate potential issues.

    First Version (Connection Created Inside lambda_handler):

    • Pros:

      • Resource Management: Ensures proper closure of the database connection after each Lambda invocation.
      • Isolation: Each invocation gets a new connection, ensuring isolation.
    • Cons:

      • Overhead: Establishing a new connection on every invocation adds the network and authentication handshake to each request's run time.

    Second Version (Connection Created Outside lambda_handler):

    • Pros:

      • Reuse: Reuses the same connection, potentially reducing the overhead of connection setup.
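      • Faster Warm Invocations: The connection is set up once during the cold start, so later invocations handled by the same execution environment skip that step. The cold start itself does not get faster.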
    • Cons:

      • Resource Leaks: A global connection is not closed when an invocation ends; it stays open until the execution environment is recycled, so idle environments can keep holding database connections.

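      • Concurrency Issues: An execution environment handles one invocation at a time, so concurrent invocations do not share the global connection. Instead, each concurrent environment opens its own connection, which can exhaust the database's connection limit under load.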

    Recommendations:

    • Limited Connection Scope: If reusing connections, limit the connection scope to the Lambda execution.
    • Use Connection Pooling: Consider using connection pooling for efficient connection management.
    • Lambda Concurrency: Be aware of Lambda concurrency limits and ensure thread safety or use connection pooling.
  2. When working with a database connection in a Lambda function, it is best to follow AWS best practices and use INIT code (which is where you are almost heading) to load expensive resources.

    Take advantage of execution environment reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler, and cache static assets locally in the /tmp directory. Subsequent invocations processed by the same instance of your function can reuse these resources. This saves cost by reducing function run time.

    Lambda can run from either a COLD or a WARM start. On a COLD start, the code outside the Lambda handler is executed. When a Lambda runs from a WARM start, the resources loaded during the COLD start are still available. By opening resources such as the database connection during the COLD start, subsequent WARM starts do not have to repeat the same expensive operation. A WARM start is only possible when invocations of the function arrive close enough together that the execution environment has not yet been recycled. This can greatly reduce the execution time of your Lambda functions and thus reduce costs!

    Based on where you were going, I would say to rewrite it as such:

    import pandas
    import pymysql
    
    con = pymysql.connect()  # connection arguments omitted; runs once per cold start
     
    def get_db_data(con_):
        query = "SELECT * FROM mytable"
        data = pandas.read_sql(query, con_)
        return data
        
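    def lambda_handler(event, context):
        data = get_db_data(con)
        """
        do other things with event
        """
        # Do not call con.close() here: closing the global connection
        # would break later invocations that reuse this environment.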
    

    This concept is also explained well in the AWS Lambda docs here.
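
To make the reuse behaviour concrete, here is a minimal sketch of caching the connection at module level and reopening it only when it is missing or closed. `FakeConnection`, `get_connection`, and `conn_id` are hypothetical names used so the example runs without a database; in a real function the marked line would call `pymysql.connect(...)` with your own settings, and how you check that a real connection is still alive is an assumption you would need to verify for your driver.

```python
import itertools

# Hypothetical stand-in for a real database connection, so this sketch
# can run anywhere. It is NOT part of pymysql.
_ids = itertools.count(1)


class FakeConnection:
    def __init__(self):
        self.conn_id = next(_ids)  # lets us see when a new connection is made
        self.open = True

    def close(self):
        self.open = False


# Module-level state: created once per execution environment and kept
# between invocations that are served by the same (warm) environment.
_con = None


def get_connection():
    """Return the cached connection, opening a new one only if needed."""
    global _con
    if _con is None or not _con.open:
        # Assumption: in real code this would be pymysql.connect(...).
        _con = FakeConnection()
    return _con


def lambda_handler(event, context):
    con = get_connection()
    # ... run queries with con here; the handler does not close it ...
    return {"connection_id": con.conn_id}


if __name__ == "__main__":
    # Two calls in the same process behave like two warm invocations.
    print(lambda_handler({}, None))
    print(lambda_handler({}, None))
```

Calling the handler twice in one process returns the same `connection_id`, which mirrors what happens on warm invocations. The check inside `get_connection` is also where a liveness test for a connection the server has timed out would go; the handler itself stays free of connection setup and teardown.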
