Question:
I’m trying to set up a MongoDB instance in a Databricks notebook, using Databricks’ DBFS-mounted storage as its data storage location. I can mount my storage (Azure Blob or an S3 bucket) to DBFS, access it without issues from Databricks, and make POSIX calls against it with Python.
However, when I attempt to start MongoDB with the --dbpath flag set to my DBFS mount, I get the "Operation not supported" error:
{"t":{"$date":"2023-10-12T11:13:17.785+00:00"},"s":"I", "c":"CONTROL", "id":23285, "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
{"t":{"$date":"2023-10-12T11:13:17.786+00:00"},"s":"I", "c":"NETWORK", "id":4915701, "ctx":"main","msg":"Initialized wire specification","attr":{"spec":{"incomingExternalClient":{"minWireVersion":0,"maxWireVersion":21},"incomingInternalClient":{"minWireVersion":0,"maxWireVersion":21},"outgoing":{"minWireVersion":6,"maxWireVersion":21},"isInternalClient":true}}}
{"t":{"$date":"2023-10-12T11:13:17.788+00:00"},"s":"I", "c":"NETWORK", "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"REPL", "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"TenantMigrationDonorService","namespace":"config.tenantMigrationDonors"}}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"REPL", "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"TenantMigrationRecipientService","namespace":"config.tenantMigrationRecipients"}}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"CONTROL", "id":5945603, "ctx":"main","msg":"Multi threading initialized"}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"TENANT_M", "id":7091600, "ctx":"main","msg":"Starting TenantMigrationAccessBlockerRegistry"}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"CONTROL", "id":4615611, "ctx":"initandlisten","msg":"MongoDB starting","attr":{"pid":4048,"port":27017,"dbPath":"/dbfs/mnt/aws/fiftyone/db","architecture":"64-bit","host":"1006-121834-gtupj2oh-10-139-64-4"}}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"CONTROL", "id":23403, "ctx":"initandlisten","msg":"Build Info","attr":{"buildInfo":{"version":"7.0.2","gitVersion":"02b3c655e1302209ef046da6ba3ef6749dd0b62a","openSSLVersion":"OpenSSL 3.0.2 15 Mar 2022","modules":[],"allocator":"tcmalloc","environment":{"distmod":"ubuntu2204","distarch":"x86_64","target_arch":"x86_64"}}}}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"CONTROL", "id":51765, "ctx":"initandlisten","msg":"Operating System","attr":{"os":{"name":"Ubuntu","version":"22.04"}}}
{"t":{"$date":"2023-10-12T11:13:17.789+00:00"},"s":"I", "c":"CONTROL", "id":21951, "ctx":"initandlisten","msg":"Options set by command line","attr":{"options":{"storage":{"dbPath":"/dbfs/mnt/aws/fiftyone/db"}}}}
{"t":{"$date":"2023-10-12T11:13:18.185+00:00"},"s":"I", "c":"STORAGE", "id":22315, "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=10994M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,remove=true,path=journal,compressor=snappy),builtin_extension_config=(zstd=(compression_level=6)),file_manager=(close_idle_time=600,close_scan_interval=10,close_handle_minimum=2000),statistics_log=(wait=0),json_output=(error,message),verbose=[recovery_progress:1,checkpoint_progress:1,compact_progress:1,backup:0,checkpoint:0,compact:0,evict:0,history_store:0,recovery:0,rts:0,salvage:0,tiered:0,timestamp:0,transaction:0,verify:0,log:0],"}}
{"t":{"$date":"2023-10-12T11:13:19.576+00:00"},"s":"E", "c":"WT", "id":22435, "ctx":"initandlisten","msg":"WiredTiger error message","attr":{"error":95,"message":{"ts_sec":1697109199,"ts_usec":576472,"thread":"4048:0x7fcff4155c80","session_name":"connection","category":"WT_VERB_DEFAULT","category_id":9,"verbose_level":"ERROR","verbose_level_id":-3,"msg":"__posix_file_close:360:/dbfs/mnt/aws/fiftyone/db/journal/WiredTigerTmplog.0000000001: handle-close: close","error_str":"Operation not supported","error_code":95}}}
{"t":{"$date":"2023-10-12T11:13:20.307+00:00"},"s":"E", "c":"WT", "id":22435, "ctx":"initandlisten","msg":"WiredTiger error message","attr":{"error":95,"message":{"ts_sec":1697109200,"ts_usec":306958,"thread":"4048:0x7fcff4155c80","session_name":"connection","category":"WT_VERB_DEFAULT","category_id":9,"verbose_level":"ERROR","verbose_level_id":-3,"msg":"__posix_file_close:360:/dbfs/mnt/aws/fiftyone/db/journal/WiredTigerTmplog.0000000001: handle-close: close","error_str":"Operation not supported","error_code":95}}}
{"t":{"$date":"2023-10-12T11:13:21.059+00:00"},"s":"E", "c":"WT", "id":22435, "ctx":"initandlisten","msg":"WiredTiger error message","attr":{"error":95,"message":{"ts_sec":1697109201,"ts_usec":59781,"thread":"4048:0x7fcff4155c80","session_name":"connection","category":"WT_VERB_DEFAULT","category_id":9,"verbose_level":"ERROR","verbose_level_id":-3,"msg":"__posix_file_close:360:/dbfs/mnt/aws/fiftyone/db/journal/WiredTigerTmplog.0000000001: handle-close: close","error_str":"Operation not supported","error_code":95}}}
{"t":{"$date":"2023-10-12T11:13:21.063+00:00"},"s":"W", "c":"STORAGE", "id":22347, "ctx":"initandlisten","msg":"Failed to start up WiredTiger under any compatibility version. This may be due to an unsupported upgrade or downgrade."}
{"t":{"$date":"2023-10-12T11:13:21.063+00:00"},"s":"F", "c":"STORAGE", "id":28595, "ctx":"initandlisten","msg":"Terminating.","attr":{"reason":"95: Operation not supported"}}
{"t":{"$date":"2023-10-12T11:13:21.063+00:00"},"s":"F", "c":"ASSERT", "id":23091, "ctx":"initandlisten","msg":"Fatal assertion","attr":{"msgid":28595,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp","line":674}}
{"t":{"$date":"2023-10-12T11:13:21.063+00:00"},"s":"F", "c":"ASSERT", "id":23092, "ctx":"initandlisten","msg":"\n\n***aborting after fassert() failure\n\n"}
Does anyone have any idea why this is happening and how to get around it? Any help would be appreciated.
P.S. I am well aware of the caveats of running MongoDB on FUSE or NFS, performance included.
- I have tried the same thing in Google Colab with a Google Drive mount, and it works fine.
- I have tried mounting both Azure Blob Storage and an S3 bucket, with the same outcome. It works fine when I use a normal system path.
- I have tried MongoDB versions 5, 6, and 7 and various Databricks runtimes, all with the same problem on the DBFS mount.
2 Answers
It’s a really bad idea to use DBFS for that: the underlying cloud storage doesn’t support many of the operations a database needs, such as random writes and sparse files (these limitations are officially documented). Databases like MongoDB, Cassandra, and relational databases should use real disk storage.
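One way to follow this advice, as a rough sketch rather than a tested recipe: keep the dbpath on the cluster's local disk and copy only a dump archive to the mount. The paths below (`/tmp/mongo/...`, `/dbfs/mnt/aws/fiftyone/backups`) and the helper names are hypothetical. `mongod --dbpath/--fork/--logpath` and `mongodump --archive/--gzip` are standard options, but this assumes both binaries are on the PATH and that sequential file copies to the mount work in your workspace.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical locations: adjust to your cluster and mount.
LOCAL_DBPATH = Path("/tmp/mongo/db")                       # real local disk
LOCAL_DUMP = Path("/tmp/mongo/dump.archive.gz")            # dump written locally first
DBFS_BACKUP_DIR = Path("/dbfs/mnt/aws/fiftyone/backups")   # folder on the DBFS mount


def mongod_command(dbpath: Path, log_path: Path) -> list[str]:
    """Build the command that starts mongod in the background on a local dbpath."""
    # --fork requires a log destination, hence --logpath.
    return ["mongod", "--dbpath", str(dbpath), "--fork", "--logpath", str(log_path)]


def mongodump_command(archive: Path) -> list[str]:
    """Build the command that dumps all databases into one gzip-compressed archive."""
    return ["mongodump", f"--archive={archive}", "--gzip"]


def start_local_mongod() -> None:
    """Start mongod with its data files on local disk instead of the mount."""
    LOCAL_DBPATH.mkdir(parents=True, exist_ok=True)
    log_path = LOCAL_DBPATH.parent / "mongod.log"
    subprocess.run(mongod_command(LOCAL_DBPATH, log_path), check=True)


def backup_to_dbfs() -> None:
    """Dump to local disk, then copy the finished archive to the mount."""
    subprocess.run(mongodump_command(LOCAL_DUMP), check=True)
    DBFS_BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    # copyfile copies contents only, so no chmod is attempted on the mount.
    shutil.copyfile(LOCAL_DUMP, DBFS_BACKUP_DIR / LOCAL_DUMP.name)
```

Calling `start_local_mongod()` once and `backup_to_dbfs()` on whatever schedule suits you keeps WiredTiger on a filesystem with full POSIX semantics, while the mount only receives a single sequentially written file. Data written since the last dump is lost if the cluster terminates.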
The operation that is not supported is fsync.
WiredTiger relies on fsync for normal operations.
You will need to consult the DBFS documentation to see if there is a way to permit remote fsync locking.
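To see which file operations a given mount actually rejects, a small probe like the one below may help. It is only a sketch: the function name `probe_directory` and the scratch file name are made up for illustration, and the calls it makes (plain write, write plus `os.fsync`, in-place write, append) are not an exact replay of what WiredTiger does, so an "ok" result here does not guarantee that mongod will start.

```python
import os


def probe_directory(path: str) -> dict[str, str]:
    """Try a few file operations in `path`; report 'ok' or the OS error for each."""
    results: dict[str, str] = {}
    target = os.path.join(path, "fs_probe.tmp")  # hypothetical scratch file name

    def attempt(name, action):
        # Errors raised on write, fsync, or close all surface as OSError.
        try:
            action()
            results[name] = "ok"
        except OSError as exc:
            results[name] = f"failed: [errno {exc.errno}] {exc.strerror}"

    def sequential_write():
        with open(target, "wb") as f:
            f.write(b"0" * 4096)

    def write_with_fsync():
        with open(target, "wb") as f:
            f.write(b"0" * 4096)
            f.flush()
            os.fsync(f.fileno())

    def random_write():
        # Overwrite bytes in the middle of the existing file.
        with open(target, "r+b") as f:
            f.seek(1024)
            f.write(b"1" * 16)

    def append_write():
        with open(target, "ab") as f:
            f.write(b"2" * 16)

    attempt("sequential_write", sequential_write)
    attempt("write_with_fsync", write_with_fsync)
    attempt("random_write", random_write)
    attempt("append_write", append_write)

    # Best-effort cleanup of the scratch file.
    try:
        os.remove(target)
    except OSError:
        pass
    return results
```

For example, `probe_directory("/dbfs/mnt/aws/fiftyone/db")` can be compared against `probe_directory("/tmp")`. Any entry that fails with errno 95 on the mount but succeeds locally points to an operation the mount does not support.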