
I’m using the following code to run an sklearn transformation (processing) job in SageMaker:

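import os

import boto3
import sagemaker
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor

region = boto3.session.Session().region_name
role = sagemaker.get_execution_role()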
sklearn_processor = SKLearnProcessor(
    framework_version="1.0-1", role=role,
    instance_type="ml.m5.xlarge", instance_count=1,
    # sagemaker_session = Session()
)
# bucket and prefix are defined elsewhere in the notebook
out_path = os.path.join(bucket, prefix, 'test_transform/data.csv')
sklearn_processor.run(
    code="preprocess.py",
    inputs = [
        ProcessingInput(source = 'my_package/', destination = '/opt/ml/processing/input/code/my_package/')
    ],
    outputs=[
        ProcessingOutput(output_name="test_transform_data", 
                         source = '/opt/ml/processing/output/test_transform',
                         destination = out_path),
    ],
    arguments=["--time-slot-minutes", "30min"]
)

The code above runs preprocess.py, which loads data from a Snowflake database using credentials stored in AWS Secrets Manager:

region = boto3.Session().region_name
secrets_client = boto3.client(service_name='secretsmanager', region_name=region)

This is where the error happens: the first line above returns None for region, so the second line raises botocore.exceptions.NoRegionError: You must specify a region.

In this case, how can I pass the region to SKLearnProcessor, or is there another way to make this code work inside the processing job instance?

FYI:
The input source 'my_package/' has the structure below. It is used to install packages and to include the Python dependencies used by preprocess.py:

├── my_package
│   ├── file1.py
│   ├── file2.py
│   └── requirements.txt
└── preprocess.py

Thanks

2 Answers


  1. Chosen as BEST ANSWER

    Setting the following in preprocess.py, before the boto3 session or client is created, solved the issue:

    os.environ['AWS_DEFAULT_REGION'] = 'us-west-2' 
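
If you'd rather not hardcode the region at the top of the script, something along these lines might work as a rough sketch. The helper names resolve_region and make_secrets_client are hypothetical (they are not part of boto3 or SageMaker), and the assumption that the container may already expose AWS_DEFAULT_REGION or AWS_REGION is just that, an assumption; the 'us-west-2' fallback is taken from the answer above.

```python
import os


def resolve_region(fallback="us-west-2"):
    """Return the first region found in the environment, else the fallback.

    Hypothetical helper: checks AWS_DEFAULT_REGION first, then AWS_REGION.
    Whether either is set inside the processing container is an assumption.
    """
    for name in ("AWS_DEFAULT_REGION", "AWS_REGION"):
        value = os.environ.get(name)
        if value:
            return value
    return fallback


def make_secrets_client(fallback="us-west-2"):
    """Create a Secrets Manager client with an explicit region."""
    # Imported lazily so resolve_region can be used without boto3 installed.
    import boto3

    region = resolve_region(fallback)
    # Also export it so any later boto3 sessions pick up the same region.
    os.environ.setdefault("AWS_DEFAULT_REGION", region)
    return boto3.client(service_name="secretsmanager", region_name=region)
```

In preprocess.py you would then call secrets_client = make_secrets_client() instead of relying on boto3.Session().region_name.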
    

  2. The two easiest ways are either to set it in your ~/.aws/config:

    [default]
    region=us-west-2
    

    or you can use an environment variable as in:

    export AWS_DEFAULT_REGION=us-west-2

    but you do need to tell boto3 which region to use.

    See the boto3 configuration documentation for the ways to supply these settings:
    https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
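
For the processing job specifically, the environment variable route can probably be driven from the notebook side, since the container typically won't have your ~/.aws/config. The sketch below assumes the env parameter of SKLearnProcessor is available in your SageMaker SDK version; build_processor_env and make_processor are hypothetical helper names used only for illustration.

```python
def build_processor_env(region):
    """Environment variables to forward into the processing container."""
    if not region:
        raise ValueError("region must be a non-empty string")
    return {"AWS_DEFAULT_REGION": region}


def make_processor(role, region):
    """Build an SKLearnProcessor that exports the region to the container."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.sklearn.processing import SKLearnProcessor

    return SKLearnProcessor(
        framework_version="1.0-1",
        role=role,
        instance_type="ml.m5.xlarge",
        instance_count=1,
        # Assumption: env is accepted by the SDK version in use.
        env=build_processor_env(region),
    )
```

With this, the region already resolved in the notebook (region = boto3.session.Session().region_name) would be passed as make_processor(role, region), and boto3 inside preprocess.py should then find AWS_DEFAULT_REGION without any change to the script.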
