We’re developing a REST API for our platform. Let’s say we have organisations and projects, and projects belong to organisations.
After reading this answer, I would be inclined to use numerical IDs in the URL, so that some of the URLs would become (say, with a prefix of /api/v1):
/organisations/1234
/organisations/1234/projects/5678
However, we want to use the same URL structure for our front-end UI, so that if you type these URLs in the browser, you get the relevant webpage in the response instead of a JSON file, much in the same way you see the names of persons and organisations in the URLs of sites like Facebook or GitHub.
Using this, we could get something like:
/organisations/dutchpainters
/organisations/dutchpainters/projects/nightwatch
It looks like GitHub actually exposes their API in the same way.
The advantages and disadvantages I can come up with for using names instead of IDs in URL definitions are the following:
Advantages:
- More intuitive URLs for end users
- 1 to 1 mapping of front end UI and JSON API
Disadvantages:
- Have to use unique names
- Have to take care of conflicts with reserved names, such as count, so that later on you can still develop an API endpoint like /organisations/count and actually get the number of organisations instead of the organisation called count.
Especially the latter one seems to become a potential pain in the rear. Still, after reading this answer, I’m almost convinced to use the string identifier, since it doesn’t seem to make a difference from a convention point of view.
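To make the reserved-name concern concrete, here is a minimal sketch of a check that could run when an organisation picks its name. The contents of RESERVED_NAMES, the allowed character set, and the function name is_allowed_name are all assumptions for illustration, not part of any existing API.

```python
import re

# Hypothetical path segments kept free for current or future endpoints.
RESERVED_NAMES = {"count", "search", "new"}

# Assumed naming rule: lowercase letters and digits, separated by single hyphens.
_NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_allowed_name(name: str, taken: set[str]) -> bool:
    """Return True if `name` can safely be used as a URL identifier."""
    if not _NAME_PATTERN.fullmatch(name):
        return False  # would need escaping or is ambiguous in a URL
    if name in RESERVED_NAMES:
        return False  # would shadow an endpoint such as /organisations/count
    return name not in taken  # names must be unique to identify a resource
```

Note that a list like this only helps if the reserved words are decided before users start registering names; reserving a word later can collide with a name that is already in use.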
My questions are:
- Did I miss important advantages / disadvantages of using strings instead of numerical IDs?
- Did GitHub develop their string-based approach after their platform matured, or did they know from the start that it would imply some limitations (like the one I mentioned earlier; it seems that they did not implement such functionality)?
3 Answers
It’s common to use a combination of both:
where the last part is simply ignored but used to make the url more readable.
In your case, with multiple levels of collections you could experiment with this format:
If somebody writes
it would still map to the rembrandt, but that should be OK. That leaves room for editing the names without breaking URLs that are already out there. Also, names don’t have to be unique if you don’t really need that.
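As a rough sketch of how such a combined scheme could be resolved on the server, assume a hypothetical layout in which the numerical ID and the readable name share one path segment, e.g. /organisations/1234-dutchpainters/projects/5678-nightwatch. The layout, the pattern and the function name resolve_ids are assumptions for illustration; the point is only that the lookup uses the ID and ignores the name.

```python
import re

# Hypothetical layout: /organisations/<id>[-<name>][/projects/<id>[-<name>]]
_PATH = re.compile(
    r"^/organisations/(?P<org_id>\d+)(?:-[^/]*)?"
    r"(?:/projects/(?P<project_id>\d+)(?:-[^/]*)?)?$"
)


def resolve_ids(path: str):
    """Return (org_id, project_id or None), or None if the path does not match.

    The readable name after the hyphen is never inspected, so a wrong or
    outdated name still resolves to the same record.
    """
    match = _PATH.match(path)
    if match is None:
        return None
    org_id = int(match.group("org_id"))
    project_id = match.group("project_id")
    return org_id, (int(project_id) if project_id is not None else None)
```

With this, resolve_ids("/organisations/1234-dutchpainters") and resolve_ids("/organisations/1234-renamed") both give (1234, None), which is what allows names to change without breaking old links.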
https://www.serviceobjects.com/blog/path-and-query-string-parameter-calls-to-a-restful-web-service
Consecutive numerical IDs are not recommended anymore, because they make it very easy to guess records in your database, and some might use that to obtain information they should not have access to.
Numerical IDs are used because the database stores them at a fixed length, which makes indexing easy. For example, in MySQL an INT takes 4 bytes and a BIGINT takes 8 bytes, so every value occupies the same amount of storage (100 as an INT takes the same space as 200), which makes it easy to index and search for records.
If you have a lot of entries in the database, then indexing a VARCHAR field is a bad idea. You should use a fixed-width field like CHAR(32) and pad shorter values with spaces, but you then have to add logic in your program to handle the padding when searching the database.
Another idea would be to use slugs, but here you should take into account that some records might end up with the same slug, depending on what you use to form it. https://en.wikipedia.org/wiki/Semantic_URL#Slug
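To illustrate the collision problem, here is a minimal sketch that derives a slug from a title and appends a numeric suffix when the slug is already taken. The function names slugify and unique_slug and the suffix convention are assumptions; in a real system the set of existing slugs would come from the database.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated ASCII slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def unique_slug(title: str, existing: set[str]) -> str:
    """Return a slug for `title` that does not occur in `existing`."""
    base = slugify(title)
    candidate = base
    suffix = 2
    while candidate in existing:
        candidate = f"{base}-{suffix}"
        suffix += 1
    return candidate
```

Two records titled "Night Watch" would then get night-watch and night-watch-2, at the cost of the second URL being slightly less clean.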
I would recommend using UUIDs, since they all have the same length and resolve this issue easily.
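For example, a random (version 4) UUID can be generated with Python's standard uuid module. The helper name new_public_id is an assumption for illustration; the sketch uses the 32-character hexadecimal form, which would fit a fixed-width CHAR(32) column, whereas the canonical form with hyphens is 36 characters long.

```python
import uuid


def new_public_id() -> str:
    """Return a random version-4 UUID as 32 lowercase hex characters."""
    return uuid.uuid4().hex


# Usage: every identifier has the same length and is not guessable by counting.
organisation_id = new_public_id()
url = f"/organisations/{organisation_id}"
```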