
How do we copy files from Hadoop to ABFS (Azure Blob File System)?
I want to copy from the Hadoop filesystem to the ABFS filesystem, but it throws an error.
This is the command I ran:

hdfs dfs -ls abfs://….

ls: No FileSystem for scheme "abfs"

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem not found

Any idea how this can be done?

2 Answers


  1. In core-site.xml you need to add a config property fs.abfs.impl with the value org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem, and then add any other related authentication configuration it may need.

    More details on installation/configuration here: https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html
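
    For example, a minimal core-site.xml sketch (fs.abfs.impl is the property named above; the storage-account key entry is just one possible auth method, and the account name and key are placeholders):

    <property>
      <name>fs.abfs.impl</name>
      <value>org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem</value>
    </property>
    <property>
      <name>fs.azure.account.key.YOURACCOUNT.dfs.core.windows.net</name>
      <value>YOUR-STORAGE-ACCOUNT-KEY</value>
    </property>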

  2. The abfs binding is already in core-default.xml for any release with the ABFS client present. However, the hadoop-azure JAR and its dependencies are not in the Hadoop common/lib dir where they are needed (they are there in HDI and CDH, but not in the Apache release).

    You can tell the hadoop script to pick it and its dependencies up by setting the HADOOP_OPTIONAL_TOOLS env var; you can do this in ~/.hadoop-env, but just try it on your command line first:

    export HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws"
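
    To confirm the env var took effect, the hadoop-azure JAR (or the tools lib dir) should now show up on the classpath; a quick check, not part of the original answer:

    hadoop classpath | tr ':' '\n' | grep -i azure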
    

    After doing that, download the latest cloudstore JAR and use its storediag command to attempt to connect to an abfs URL; it is the place to start debugging classpath and config issues:

    https://github.com/steveloughran/cloudstore
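
    A sketch of the storediag invocation (the JAR filename and the container/account names are placeholders, not from the answer):

    hadoop jar cloudstore-1.0.jar storediag abfs://container@youraccount.dfs.core.windows.net/

    Once storediag connects cleanly, the copy itself is an ordinary filesystem operation; for example, with the source path on the default HDFS filesystem and placeholder names:

    hadoop distcp /source/dir abfs://container@youraccount.dfs.core.windows.net/dest/dir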
