A team is developing a stateless Docker container with the Cassandra database, to be run under Kubernetes, with all data and metadata files shipped inside the container, so putting the database into a read-only mode would be ideal. The app that connects to this database is an infrequently updated feature store.
How can read-only mode be approximated as closely as possible, specifically for Cassandra, or perhaps even in general (if some of the steps are common to other databases)?
2 Answers
Even under a view-only user, a lot of write attempts still go on, e.g. to commit log files and `system` keyspace tables. This happens even under normal conditions, such as the orderly stops and starts of pod replicas that a horizontal pod autoscaler can initiate under k8s, not just when an eviction from a node occurs in abnormal conditions.

So first, these are config file settings that can theoretically prevent container crashes during failed write attempts to a write-protected location inside the container:
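The settings themselves are not shown here. A minimal sketch, assuming the answer refers to the two failure-policy keys in `cassandra.yaml` (both keys and the `ignore` value exist in Cassandra, but whether these are the exact settings meant is an assumption):

```yaml
# cassandra.yaml (fragment)
# Assumption: these are the settings meant above.
# Keep serving requests after a failed data-file write instead of stopping the node.
disk_failure_policy: ignore
# Keep running after a failed commit log write instead of shutting down.
commit_failure_policy: ignore
```

With `ignore`, failed writes are logged rather than treated as fatal, so this only suits a node whose data never needs to change.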
You can also point the non-data folders that Cassandra writes to at a universally writable location. The complete list of these folders can be found in the config file, but most of them are better left where they are (after being changed to be owned by `nobody` and writable by anybody), unless you want to re-create the entire original folder structure in the new location. Bitnami Cassandra containers let you easily change (with a dedicated env variable) the location of only one such folder, the one with commit logs, so this is a safe bet (I tested that Cassandra is unaffected even when the folder's contents are removed by each container restart).

Note: in the rootless `bitnami/cassandra` containers, the Cassandra config file is located at `/opt/bitnami/cassandra/conf/cassandra.yaml`, and a customized version can be substituted (but not simply copied into the container at build time: it has to be mounted at deployment time as a `ConfigMap` to have any effect).

The way to deal with the unwanted `system` keyspace writes is similar: point the data folder location to a folder writable by any user (such as `/tmp`), pre-populated with data files at build time and never mapped outside of the container at run time. If you can't move the entire data folder and its contents to `/tmp` (or you suspect that this may not prevent all write attempts to the original location, despite the changes made to the `data_file_directories` config file key), keep the data in the original data folder, but change its ownership to `nobody` and make it writable by all users (at container build time).

Committing all cached data fully to disk, by either calling `nodetool flush` or performing an orderly database shutdown (by scaling down the Cassandra pod to 0 replicas), will minimize disk operations at subsequent container startup (but will not eliminate writes at startup/shutdown completely).

For application stability I'd also recommend monitoring how much (if any) data gets written by Cassandra inside the container under production loads (even if they are restricted to read-only queries made by the client app).
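The commit log relocation and the `ConfigMap` mount described above could look roughly like the following pod spec fragment. This is a sketch under assumptions: `CASSANDRA_COMMITLOG_DIR` is assumed to be the dedicated Bitnami env variable, and the ConfigMap name `cassandra-config`, the container name, and the `/tmp/commitlog` path are hypothetical.

```yaml
# Fragment of a Deployment pod template (not a complete manifest).
spec:
  template:
    spec:
      containers:
        - name: cassandra                 # hypothetical container name
          image: bitnami/cassandra
          env:
            # Assumed Bitnami variable controlling the commit log folder.
            - name: CASSANDRA_COMMITLOG_DIR
              value: /tmp/commitlog       # hypothetical writable location
          volumeMounts:
            # Mount the customized config over the path named in the note above.
            - name: cassandra-config
              mountPath: /opt/bitnami/cassandra/conf/cassandra.yaml
              subPath: cassandra.yaml
      volumes:
        - name: cassandra-config
          configMap:
            name: cassandra-config        # hypothetical ConfigMap holding cassandra.yaml
```

The mounted `cassandra.yaml` is also where a relocated data folder would be listed under `data_file_directories`.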
Enable both authentication and authorization in your Cassandra image with:
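The snippet that belongs here is not shown. A minimal sketch, assuming it set the standard `cassandra.yaml` keys (the image may instead expose equivalent env variables):

```yaml
# cassandra.yaml (fragment)
# Require clients to authenticate with a role name and password.
authenticator: PasswordAuthenticator
# Enforce per-role permissions stored in Cassandra itself.
authorizer: CassandraAuthorizer
```

Cassandra's defaults for these keys (`AllowAllAuthenticator` and `AllowAllAuthorizer`) perform no checks, so both need changing for role permissions to take effect.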
Then provision a new role with only read (`SELECT`) permission on the keyspace or tables you want it to access, for example:
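The example statements are not shown here. One way to sketch them, assuming the Bitnami image runs `.cql` files from `/docker-entrypoint-initdb.d` on first boot, is a ConfigMap carrying the CQL. The ConfigMap name, the role name `app_reader`, and the keyspace name `app` are all hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cassandra-init-roles        # hypothetical name
data:
  # Assumed to be mounted under /docker-entrypoint-initdb.d in the container.
  readonly-role.cql: |
    -- Role that cannot log in directly.
    CREATE ROLE IF NOT EXISTS app_reader WITH LOGIN = false;
    -- Read-only access to the (hypothetical) app keyspace.
    GRANT SELECT ON KEYSPACE app TO app_reader;
```

The same two CQL statements can be run by hand in `cqlsh` under a superuser role instead.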
This role will not be able to log in and will only have read access to the app keyspace. Cheers!