
I'm migrating a legacy project on Python 2.7 and Ubuntu 18.04 [piecemeal: Python 3.10 & 22.04 next… then Flask!] from vendored dependencies to requirements.txt. I removed the dependencies from the project root and enumerated them in my requirements.txt.

My requirements.txt contains google-cloud-storage==1.44.0, which I installed with venv-2-7/bin/python -m pip install -t lib -r requirements.txt. An appengine_config.py in the same directory as app.yaml contains:

# From https://cloud.google.com/appengine/docs/legacy/standard/python/tools/using-libraries-python-27
import os

from google.appengine.ext import vendor

vendor.add('lib')
vendor.add(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'lib'))

How do I resolve the error below? Running venv-2-7/bin/python -c 'import google.cloud.storage' works, but starting the development server:

$ venv-2-7/bin/python /google-cloud-sdk/platform/google_appengine/dev_appserver.py --host 127.0.0.1 .

fails [both from PyCharm and when run manually] with:

ImportError: No module named google.cloud.storage

EDIT0: Including more information [below] as per comment requests:

app.yaml
runtime: python27
api_version: 1
threadsafe: yes

instance_class: F4
automatic_scaling:
  min_idle_instances: automatic
    
handlers:
- url: /1/account.*
  script: api_account.app

inbound_services:
- warmup

libraries: # Also tried removing this section entirely
- name: webapp2
  version: latest
- name: jinja2
  version: latest
- name: ssl
  version: latest

env_variables:
  PYTHONHTTPSVERIFY: 1

skip_files:
- ^(.*/)?.*\.py[co]$

EDIT1: Tried both solutions independently and even together:

import os
import sys

import pkg_resources
import google

if os.environ.get("GOOGLE_CLOUD_SDK_APPENGINE"):
    sys.path.insert(0, os.environ["GOOGLE_CLOUD_SDK_APPENGINE"])

for lib_dir in os.path.join(os.path.dirname(__file__), 'lib'), 'lib':
    sys.path.insert(0, lib_dir)
    google.__path__.append(os.path.join(lib_dir, 'google'))
    pkg_resources.working_set.add_entry(lib_dir)

import google.cloud.storage

Is there some trick to using from google.appengine.ext import vendor? Should I not be using /google-cloud-sdk/platform/google_appengine as my GOOGLE_CLOUD_SDK_APPENGINE env var?

EDIT2: I tried inlining google.appengine.ext.vendor and calling it in the loop, which gave me an ImportError: No module named google.cloud._helpers error.
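
One way to narrow this down (a diagnostic sketch, not a fix) is to list which directories an already-imported top-level package searches for its subpackages. The helper name subpackage_locations is hypothetical; inside appengine_config.py it would be called with "google" and "cloud", while the usage lines below use the standard-library email package only so that they run anywhere.

```python
import importlib
import os


def subpackage_locations(package_name, child):
    """Return (directory, has_child) for every entry in the package's __path__."""
    pkg = importlib.import_module(package_name)
    # Plain modules have no __path__; treat them as having no search dirs.
    search_dirs = list(getattr(pkg, "__path__", []))
    return [(d, os.path.isdir(os.path.join(d, child))) for d in search_dirs]


# Stand-in usage; in appengine_config.py one might call
# subpackage_locations("google", "cloud") instead.
for directory, has_child in subpackage_locations("email", "mime"):
    print("%s -> %s" % (directory, has_child))
```

If no directory under lib is listed for google, the package was probably imported from somewhere else before lib was added, which would be consistent with the ImportError above.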

3 Answers


  1. Chosen as BEST ANSWER

    I went nuclear with this solution, and I hate it with a passion:

    import inspect
    import io
    import os.path
    import site
    import sys
    from collections import deque
    from copy import deepcopy
    from itertools import chain
    
    from distutils.sysconfig import get_python_lib
    from functools import partial
    from os import getcwd, listdir
    
    import google
    
    site_packages = os.path.join(
        os.path.dirname(os.path.dirname(sys.executable)), get_python_lib(prefix="")
    )
    new_sys_path = [getcwd(), site_packages]
    
    lib_dir = os.path.join(os.path.dirname(__file__), "lib")
    new_sys_path.append(lib_dir)
    
    appengine_python_sdk = (
        os.environ["GOOGLE_CLOUD_SDK_APPENGINE"]
        if os.environ.get("GOOGLE_CLOUD_SDK_APPENGINE")
        else os.path.dirname(
            os.path.dirname(
                os.path.dirname(
                    os.path.dirname(
                        os.path.dirname(
                            inspect.getsourcefile(os.environ["wsgi.errors"].__class__)
                        )
                    )
                )
            )
        )
    )
    
    lib_dir = os.path.join(os.path.dirname(__file__), "lib")
    
    google_cloud_sdk_appengine = os.path.dirname(os.path.dirname(appengine_python_sdk))
    
    non_venv_site_packages = os.path.dirname(inspect.getsourcefile(io))
    
    google_cloud_sdk_appengine_lib = os.path.join(google_cloud_sdk_appengine, "lib")
    
    
    def all_pkgs_for_dir(p):
        return map(
            partial(os.path.join, p),
            filter(
                lambda p: (lambda parts: parts[1] == "-" and parts[2][0].isdigit())(
                    p.rpartition("-")
                ),
                listdir(p),
            ),
        )
    
    
    app_yaml_libraries = (
        lambda g: [p[len(g) :].partition("-")[0] for p in sys.path if p.startswith(g)]
    )(google_cloud_sdk_appengine_lib + os.path.sep)
    

    Then construct the new_sys_path with the paths in the right order:

    new_sys_path = list(
        chain.from_iterable(
            (
                (
                    getcwd(),
                    appengine_python_sdk,
                    os.path.dirname(os.path.dirname(appengine_python_sdk)),
                    non_venv_site_packages,
                    os.path.join(non_venv_site_packages, "lib-dynload"),
                ),
                (
                    # Use new dependencies from `lib` dir if found
                    os.path.join(lib_dir, lib_p)
                    for lib_p in os.listdir(lib_dir)
                    if not lib_p.endswith(".dist-info")
                    and lib_p.partition("-")[0] in app_yaml_libraries
                ),
                (
                    lib_dir,
                    site_packages,
                    google_cloud_sdk_appengine_lib,
                ),
            )
        )
    )
    
    sys.path = deepcopy(new_sys_path)
    google.__path__ = deepcopy(new_sys_path)
    deque(map(site.addsitedir, new_sys_path), maxlen=0)
    

    If anyone has a less hacky solution I'm all ears 👂…

    FYI: I'm currently debugging an ImportError: No module named six error, even though os.path.isfile(os.path.join(lib_dir, "six.py")) returns True.
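
For this kind of "the file exists but the import fails" situation, a small diagnostic can show which entries of a search path could satisfy an import at all. This is only a sketch: locate_module is a hypothetical helper, it looks only for plain .py files and regular packages, and it knows nothing about zip imports, compiled extensions, or any import hooks that dev_appserver installs.

```python
import os
import sys


def locate_module(name, search_path):
    """Return every file under search_path that could satisfy `import name`."""
    hits = []
    for entry in search_path:
        for candidate in (
            os.path.join(entry, name + ".py"),              # single-file module
            os.path.join(entry, name, "__init__.py"),       # regular package
        ):
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


# Compare what is on disk with what the interpreter is searching right now.
print(locate_module("six", sys.path))
```

An empty list here while six.py exists in lib_dir would suggest that lib_dir is not on sys.path at the point where the import runs; a non-empty list would point toward something other than the search path.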


  2. I can reproduce your error.

    I used to have problems with protobuf (the error was somewhat similar to what you’re getting for storage). The solution was to update the google namespace package by extending its __path__. I tried the same solution now (for storage) and your error went away.

    Update the code in your appengine_config.py to:

        from google.appengine.ext import vendor
        import google, os
    
        lib_dir = os.path.join(os.path.dirname(__file__), 'lib')
        google.__path__.append(os.path.join(lib_dir, 'google'))
    
        vendor.add('lib')
    

    After updating appengine_config.py, the line import google.cloud.storage no longer raises the No module named cloud.storage error.
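
Why appending to google.__path__ helps can be reproduced without App Engine. The sketch below is an illustration of Python's package search path only: demo_ns is a made-up package name, and the two temporary directories merely stand in for the SDK's copy of google and for the vendored lib directory.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, *parts):
    """Create nested package directories under root, each with an __init__.py."""
    path = root
    for part in parts:
        path = os.path.join(path, part)
        if not os.path.isdir(path):
            os.mkdir(path)
        open(os.path.join(path, "__init__.py"), "a").close()
    return path


def demo():
    sdk_dir = tempfile.mkdtemp()  # stands in for the SDK's copy of the package
    lib_dir = tempfile.mkdtemp()  # stands in for the vendored `lib` directory
    make_package(sdk_dir, "demo_ns")
    make_package(lib_dir, "demo_ns", "cloud")

    sys.path.insert(0, sdk_dir)
    sys.path.append(lib_dir)
    # Python 3 caches directory listings; Python 2.7 has no such function.
    getattr(importlib, "invalidate_caches", lambda: None)()

    pkg = importlib.import_module("demo_ns")  # resolves to the sdk_dir copy
    try:
        importlib.import_module("demo_ns.cloud")
        found_before = True
    except ImportError:
        found_before = False

    # Same move as google.__path__.append(os.path.join(lib_dir, 'google')).
    pkg.__path__.append(os.path.join(lib_dir, "demo_ns"))
    importlib.import_module("demo_ns.cloud")
    found_after = "demo_ns.cloud" in sys.modules
    return found_before, found_after


found_before, found_after = demo()
print("%s %s" % (found_before, found_after))  # prints: False True
```

Once the first copy of a regular package has been imported, subpackages are looked up only in that package's __path__, so having lib on sys.path is not enough by itself; the second directory has to be added to __path__ explicitly.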

  3. I didn’t try to repro your error but believe you. At first glance, it looks like your appengine_config.py is incomplete. This suffices when you only have non-GCP 3P dependencies:

    from google.appengine.ext import vendor
    
    # Set PATH to your libraries folder.
    PATH = 'lib'
    # Add libraries installed in the PATH folder.
    vendor.add(PATH)
    

    However, if your requirements.txt has GCP client libraries, e.g., google-cloud-*, your appengine_config.py needs to use pkg_resources to support their use:

    import pkg_resources
    from google.appengine.ext import vendor
    
    # Set PATH to your libraries folder.
    PATH = 'lib'
    # Add libraries installed in the PATH folder.
    vendor.add(PATH)
    # Add libraries to pkg_resources working set to find the distribution.
    pkg_resources.working_set.add_entry(PATH)
    

    To bring in pkg_resources, you need to add setuptools and grpcio to your app.yaml, and to use google-cloud-storage specifically, you need to also add ssl:

    runtime: python27
    threadsafe: yes
    api_version: 1
    
    handlers:
    - url: /.*
      script: main.app
    
    libraries:
    - name: grpcio
      version: latest
    - name: setuptools
      version: latest
    - name: ssl
      version: latest
    

    All of these 3P pkg games "go away" when you finally upgrade to Python 3 where your requirements.txt remains the same, but you delete appengine_config.py and replace your app.yaml with the following (if you’re not serving static files):

    runtime: python310
    

    These same instructions can also be found in the App Engine documentation on the migrating bundled services page. That page basically says what I just did above for both Python 2 and 3.

    <ADVERTISEMENT>

    When you’re ready to upgrade to Python 3 and/or get off App Engine bundled services (NDB, Task Queue [push & pull], Memcache, Blobstore, etc.) to standalone Cloud equivalents (Cloud NDB, Cloud Tasks [push] or Cloud Pub/Sub [pull], Cloud Memorystore, Cloud Storage (GCS), etc.), or switch to Cloud Functions or Cloud Run, I’ve produced (well, am still producing) a modernization migration series complete with code samples, codelab tutorials, and videos, all of which complement the official migration docs. You can find more info, including links to all those resources, at its open source repo. In particular, your inquiry covers GCS, and that migration is covered by "Module 16." The Mod16 app is a sample that works with GCS and is a migration of its analog, the Mod15 app, which is based on Blobstore.

    </ADVERTISEMENT>
