
I am developing an API that is consumed by a backbone web app (and also by iOS and Android apps) that right now is returning user-generated content with encoded HTML entities, so for example the API clients will get &lt;div&gt; instead of <div>.

The web app doesn’t have any problems with this. The problem is that the iOS and Android apps display &lt;div&gt; literally, as it is, to the end users.

The API is going to be released publicly soon for third-party apps. If it returns raw HTML, any user could inject malicious scripts to steal other users’ information whenever a third-party app has no way to prevent it.

Considering this potential security risk, what would be good practice for a RESTful JSON API: to return raw HTML or to return encoded HTML?

I have seen that the Twitter API returns raw HTML, so I have a mixed feeling about this, and I don’t know if there is some common standard / good practice that the community is following right now.

Thanks

2 Answers


  1. I believe that in general the best security practice is to ensure that the content is made “safe” when it is created by the user, not when you are doing something with it.

    I’m not sure if you have control over the process of storing the UGC, but if you do, clean it up at that point. If you don’t, you can of course clean it at the API level, but anything else using the same UGC input will have to implement the cleaning as well (which is why I believe it should be done when the data is stored…)

    XSS and other attack vectors evolve all the time, and there are many other questions about “the best approach”. Opinions vary, but the easiest is probably a white-list approach that explicitly allows certain tags and filters everything else out, using regex or whatever library you prefer for this type of parsing. (You didn’t mention which programming language you are using, so it’s a bit hard to provide examples…)
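
    As a language-neutral illustration of the white-list idea, here is a minimal sketch in Python using only the standard library. The `ALLOWED_TAGS` set and the `sanitize` helper are assumptions made for this example, not part of the question. It keeps a handful of formatting tags, strips every attribute, and drops script and style content entirely. It uses a parser rather than regex, and a maintained sanitizer library would be a safer choice in production.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: only plain formatting tags survive.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}
# Elements whose text content is dropped along with the tag.
DROP_CONTENT_TAGS = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS:
            # Attributes are discarded, so onclick= and friends never survive.
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            # Text is re-encoded so it cannot reintroduce markup.
            self.parts.append(escape(data))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

    Anything not on the list is removed rather than repaired, which is the point of a white-list: new attack vectors are rejected by default instead of having to be enumerated.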

    I feel your best source of information is OWASP. Read and understand how attacks happen, so you can also understand how to build your security to prevent them. Most importantly, keep up to date on the topic: security is not a one-off effort, it should be a continuous process throughout all your development.

    Also see this particular entry on why encoding is not really a good solution (as you suspected)
    https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet#Why_Can.27t_I_Just_HTML_Entity_Encode_Untrusted_Data.3F

    I also don’t think there is a single “best practice”. There are always factors you cannot control, and they force you to be inventive and secure the piece of the software you do control in a different way, even if that is not the “standard”.

    Another good practice is to include this type of vulnerability testing in your development/testing cycle, so you really know that your API is secure. There are a lot of tools out there (just google “web vulnerability testing tools” to get started…)

    I have had good experiences with OWASP’s Zed Attack Proxy in the past: https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project

  2. To start, I disagree with Niki Driessen’s answer.

    I believe that in general the best security practice is to ensure that
    the content is made “safe” when it is created by the user, not when
    you are doing something with it.

    The problem with this statement is that there is no such thing as “safe” without context.

    For example, take the string <script>alert(1)</script>.

    Is this “safe”? This cannot be answered without it being considered in the context it is used within.

    For example,

    • Output to the user in plain text: Safe, after all it is just a string and <script> means nothing to a text parser.
    • Storage within a database: Safe, it is a sequence of bytes stored within a data file.
    • Display within an app control (non HTML aware): Safe.
    • Display into HTML context of a web page: Unsafe, because <script> will be interpreted by the browser in order to start a code execution context.

    To fix the latter issue, the process that is creating the page for output should properly encode any HTML special characters to entity format (e.g. &lt;script&gt;…).
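
    A minimal sketch of encoding at output time, in Python. The `render_comment` function and its markup are hypothetical; the only library call is the standard `html.escape`. The stored value stays raw, and it is encoded only at the moment it is placed into an HTML context.

```python
from html import escape


def render_comment(username, comment):
    # Both values are untrusted, so both are encoded for the HTML context.
    return (
        '<div class="comment">'
        f"<b>{escape(username)}</b>: {escape(comment)}"
        "</div>"
    )


stored = "<script>alert(1)</script>"  # raw string, harmless as plain data
page_fragment = render_comment("bob", stored)
```

    The same `stored` string can still be passed unchanged to a plain-text or native control, where no encoding is needed.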

    Now back to your API question. There is no particular right way to do this.

    Advantages of returning raw content:

    • Clients do not have to trust that your API is not returning malicious script, because they simply encode everything for output themselves.
    • The same API can be used for multiple purposes (e.g. web, mobile).

    Disadvantages:

    • Clients will have to apply their own HTML formatting to returned output.
    • You may have to split the data that is returned, so that the client can differentiate between elements that may be best rendered individually (e.g. returning the username from a comment and the comment itself in two different JSON tags so that the client knows which is which).
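
    The raw-content option could look roughly like the sketch below. The field names and the three helper functions are assumptions made for illustration. The API returns each element as its own raw JSON field, and each client encodes (or does not encode) according to where it displays the value.

```python
import json
from html import escape


def build_comment_payload(username, comment):
    # Hypothetical response shape: raw values, one field per element.
    return json.dumps({"username": username, "comment": comment})


def render_for_web(payload):
    # A web client encodes every field before inserting it into HTML.
    data = json.loads(payload)
    return f"<p><b>{escape(data['username'])}</b>: {escape(data['comment'])}</p>"


def render_for_native_label(payload):
    # A non-HTML control shows the raw text, so no encoding is applied.
    data = json.loads(payload)
    return f"{data['username']}: {data['comment']}"
```

    With this split, a mobile client shows `<div>` as the user typed it, and the web client shows the same characters safely after encoding them itself.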

    Advantages of returning HTML:

    • Clients won’t have to do their own formatting.

    Disadvantages:

    • They are trusting your API that no malicious content is rendered. This may be a big risk even if they trust you because if your system allows HTML, they are also trusting that you have adequately prevented malicious content from being entered in some way*.
    • Clients are stuck with whichever presentation format you have decided on for the data, and their only option to change it may be the use of CSS.

    This is a decision for you to make, taking into consideration your likely use cases. You could of course have multiple methods, some of which return formatted HTML and others that return raw data. Formatted HTML is only really of use if you’re adding value by pre-formatting the output.

    *Yes, this contradicts what I said in the first part of my answer. However, if you are providing clients with pre-rendered HTML and letting users enter HTML to be rendered in fields that are returned from your API, then you must adequately filter this HTML to ensure no script is entered in the first place. This is because you are pre-defining the data to be output in an HTML context, rather than treating it as raw data. This is trickier than it sounds because there are so many ways of entering script into HTML (not just <script> tags). Use something like Google Caja if you need to do this, rather than rolling your own.
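
    To illustrate why filtering only <script> tags is not enough, here is a small Python sketch. The `naive_strip_script_tags` function is a deliberately weak, hypothetical filter, and the payloads are common examples of script carried by attributes and URLs rather than by a <script> element.

```python
import re


def naive_strip_script_tags(html_text):
    # Deliberately naive: removes <script>...</script> blocks and nothing else.
    return re.sub(
        r"<script\b.*?</script>",
        "",
        html_text,
        flags=re.IGNORECASE | re.DOTALL,
    )


payloads = [
    '<img src=x onerror="alert(1)">',           # event-handler attribute
    '<a href="javascript:alert(1)">click</a>',  # javascript: URL
    '<svg onload="alert(1)"></svg>',            # handler on a non-script element
]

# Every payload passes through the naive filter unchanged.
survivors = [p for p in payloads if naive_strip_script_tags(p) == p]
```

    All three payloads survive, which is why a vetted sanitizer is preferable to a hand-written filter.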
