I’m new to Python. I’m building a learning application that allows users to run sql queries on training tables directly from a browser. I am using FastAPI + Jinja2 + PostgreSQL. I am using ACE as a code editor. Is there a safe way to allow users to run any raw sql queries on the database? Can you give me some code examples ?
I’ve tried asyncpg and psycopg but I’m not sure how safe it is.
Thank you.
3
Answers
IMO, I don’t think it is a good idea to let any users run raw queries on a database. I would instead first look for a way to just validate queries and execute none of them, unless changes to the database are fundamental to the learning process (which you could also simulate as an option).
Nevertheless, in this post a solution was proposed to somebody that wanted to do something similar to what you are proposing. At the end of the post the solution suggests using one database per user as a brute force approach, I don’t know if it is a possibility but you could use the sqlite databases that Python provides creating them in memory (first checking wether they would be run on server side or client side).
On the part of just checking the validity of queries you could take a look at this, running a script on the server side using a file where you dumped the sql to validate and returning the result of the analysis to the user (be aware of checking input, I would be very cautios to execute anything on the server that could be minimally crafted by the user).
what is the definition of safe?
hopefully, this can help your learning in python!
As soon as you’re allowing untrusted users to write code that can be run on your systems things get difficult fast. It doesn’t matter which database interface you’re using.
I’d suggest running a separate copy of PostgreSQL for each user so that they can’t disrupt with each other, nor any database that your website "backend" is using. This way, it doesn’t matter what users do to their copy, whether they leave any cursors open, locks taken, zero-days exploited, etc. Further, you should probably also isolate each database server in its own VM, so that if they manage to hack it they’re not going to get very far. VM software lets you put quotas on CPU, RAM, and disk usage, so one user can’t dominate others as well.
Otherwise, if you’re happy assuming the user has a recent browser you could run Postgres in their browser, e.g. via WASM, and just copy the data periodically during their session.