A web application frontend often performs requests to a backend API. Even though this API is only supposed to be used by the frontend, it is usually also accessible with a browser. An attacker can use this to exploit vulnerabilities.
Accessing API endpoints with a browser
For example, a profile page may request /user/profile, which responds with the user properties, which the profile page then shows to the user. This backend endpoint, /user/profile, is only meant to be accessed by the profile page. But of course it can also be accessed directly with the browser, by browsing to the correct URL. This has security implications: the endpoint's response can be vulnerable to cross-site scripting or content injection. Even when the application as a whole is not vulnerable, /user/profile may still be vulnerable in itself if requested out of context of the rest of the application.
Replacing parts of the page with HTML
This pattern was all the rage around 2006, when it was possible to dynamically update parts of the page using XMLHttpRequest. This was made easier by jQuery, with jQuery.load. It lost popularity for a while, but is now back as the backbone of htmx and Unpoly.
Risk of accessing the HTML API with a browser
If there is a cross-site scripting (XSS) vulnerability in username.php, this can be exploited by luring a victim to open this page in the browser.
Besides cross-site scripting, there is also the risk of content injection. An attacker may put a convincing message on the page. Even though this message originates from the attacker, a user may think it is a trustworthy message from the domain owner.
How do we prevent API endpoints from being requested from another context than the web application frontend?
Binary content type
What is the correct content type for an HTML fragment? It probably uses text/html in most cases, even though it is not a full HTML document. Since it is content that is specific to our web application, it can be argued that it should be application/.... If we set the content type to application/html, the response will be offered for download in most browsers.
The disadvantage of this is that changing the content type from
text/html to something else disables cross-origin read blocking (CORB). CORB protects certain responses from being read through side-channel attacks. It only protects HTML, XML and JSON responses, so marking our response as something else disables this protection.
Conceptually, the response is not really a binary stream, and
text/html fits better. There is not really a standard MIME type for HTML fragments. Perhaps there should be!
The browser downloads the response as an HTML file. Of course, it would be logical for a user to immediately open this downloaded file to inspect what is in it, partially defeating the protection. When opened this way, the file is loaded from the file system and not from the domain. This means that cross-site scripting is no longer effective, because the script does not run on the application's origin. Injected content may still appear trustworthy, because the user just downloaded this file from a trusted domain. Other protective headers, such as Content-Security-Policy, no longer apply because the file is opened from disk.
Many developers seem to be under the impression that a web application should have one Content-Security-Policy for the whole domain, but it is actually a good idea to give different parts of the application different policies.
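As a rough illustration of per-path policies, the sketch below picks a Content-Security-Policy based on the request path. The path prefixes, the page policy and the function name csp_for_path are assumptions made up for this example; only the strict fragment policy is taken from this article.

```python
# Hypothetical route prefixes and policies; adjust to the application's real routes.
FRAGMENT_POLICY = "sandbox; default-src 'none'; frame-ancestors 'none'"
PAGE_POLICY = "default-src 'self'"  # assumed policy for full pages

# Ordered from most specific to least specific prefix.
POLICIES = [
    ("/fragments/", FRAGMENT_POLICY),
    ("/", PAGE_POLICY),
]


def csp_for_path(path: str) -> str:
    """Return the policy belonging to the first prefix that matches the path."""
    for prefix, policy in POLICIES:
        if path.startswith(prefix):
            return policy
    # Fall back to the page policy for paths that match no prefix.
    return PAGE_POLICY
```

The returned value would then be sent as the Content-Security-Policy response header for that request.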
Checking request headers
- HTMX has
- Unpoly has
- jQuery and many other libraries have
- Sec-Fetch-Dest, the initiator for the request: document for direct access.
- Sec-Fetch-Mode, the request mode.
- Sec-Fetch-Site, whether the request is same-origin, same-site or cross-site.
- Sec-Fetch-User, whether the request was initiated by the user.
These headers are supported in all modern browsers, and work across frameworks. This makes it easy to detect when our API endpoint is accessed in the intended manner, or whether an attacker lured a victim to open it in their browser.
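A minimal sketch of such a check is shown below. It assumes the web framework exposes the request headers as a mapping with lowercase names; the function name classify_request and the three labels are invented for this example. Requests without any Sec-Fetch headers (older browsers, non-browser clients) are reported separately so the application can decide how to treat them.

```python
from typing import Mapping


def classify_request(headers: Mapping[str, str]) -> str:
    """Classify a request as 'navigation', 'fetch' or 'unknown'.

    `headers` is assumed to map lowercase header names to their values.
    """
    mode = headers.get("sec-fetch-mode")
    dest = headers.get("sec-fetch-dest")
    if mode is None and dest is None:
        # Older browsers and non-browser clients do not send these headers.
        return "unknown"
    if mode == "navigate" or dest == "document":
        # The URL was opened directly, for example through a link or the address bar.
        return "navigation"
    # Anything else, such as a request made with fetch or XMLHttpRequest.
    return "fetch"
```

For example, classify_request({"sec-fetch-mode": "navigate", "sec-fetch-dest": "document"}) returns "navigation", while a request with Sec-Fetch-Mode: cors and Sec-Fetch-Dest: empty is classified as "fetch".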
When the endpoint is not used in the intended way, the application can handle it in several ways:
- Respond with 400 Bad Request or 403 Forbidden, and serve no content.
- Show the HTML response, but make it clear in the layout that this is an HTML fragment.
- Show the source code of the HTML fragment.
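The three options could look roughly like the sketch below. The function name respond_to_direct_access, the strategy names and the banner text are assumptions for illustration; the function returns a status line, a list of headers and a body, which would still have to be wired into whatever framework the application uses.

```python
def respond_to_direct_access(fragment: str, strategy: str = "deny"):
    """Return (status, headers, body) for a fragment that was requested directly."""
    if strategy == "deny":
        # Option 1: refuse the request and serve no content.
        return "403 Forbidden", [("Content-Type", "text/plain; charset=utf-8")], ""
    if strategy == "banner":
        # Option 2: render the fragment, but make clear that it is only a fragment.
        banner = "<p><strong>This is an HTML fragment, not a complete page.</strong></p><hr>"
        return "200 OK", [("Content-Type", "text/html; charset=utf-8")], banner + fragment
    if strategy == "source":
        # Option 3: text/plain makes the browser display the markup instead of rendering it.
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("X-Content-Type-Options", "nosniff"),  # ask the browser not to guess another type
        ]
        return "200 OK", headers, fragment
    raise ValueError(f"unknown strategy: {strategy}")
```

Note that the banner option still renders the fragment as HTML, so it mainly makes injected content less convincing; it does not by itself stop a script in the fragment from running.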
I would use the correct content type (i.e.
- Respond with the correct Content-Type, and set Content-Security-Policy: sandbox; default-src 'none'; frame-ancestors 'none'.
- For requests that have Sec-Fetch-Mode: navigate, deny the request or prepend a warning that the content is an HTML fragment.
- If the frontend and API are on the same origin/site, only allow requests where
I am pretty sure about the first two, but have less experience with checking Sec-Fetch headers.