Uploading files to an API might seem like a solved problem, and mostly it is. The trick is choosing the solution that best fits your situation.
This post provides a view of REST API file upload best practice for engineers and managers.
We’ll give an overview of the solutions for less technical readers and also dive into some of the technical details. The aim is to help engineers and managers understand each other, what is being proposed, and how to decide on a solution.
Let’s establish some context first: what do we mean by a REST API file upload, and why does it matter?
REST API File Upload: What is It, Why It Matters
Before discussing how to handle file uploads specifically, we need to cover some background:
- APIs are interfaces that define and enable two computer systems to communicate.
- REST is a common and highly popular approach to designing and building APIs.
- Files come in all shapes and sizes. They can range from a few bytes to megabytes or gigabytes.
Going into detail on these is beyond the scope of this post.
So now, why do we need to discuss file uploads and APIs? There are a few reasons:
- Response times: It’s generally best practice for APIs, particularly RESTful APIs, to respond quickly. Even if an API call initiates a long running operation, the API should respond quickly with something like “I’ve successfully started the job that will take a while”. Files can make this challenging because they can take time to upload.
- Blocking resources: APIs need short response times and need to stay available, but file uploads tend to consume API service resources and block them from use by other consumers or API calls. That is, while a file is uploading, an API service worker is busy receiving the file and cannot respond to other requests.
- Compute usage: Given API services usually have some functionality or smarts built in, receiving files usually isn’t the best use of compute power. ‘Dumber’ or fit for purpose file upload and receive services are better suited.
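The "respond quickly, finish the work later" idea from the first point can be sketched as follows. This is a minimal in-memory illustration, not a production design: the `jobs` store, `start_job` function and worker thread are hypothetical names, and the 202 status code is used here as the conventional HTTP way of saying "accepted, still processing". Note that this pattern only helps with slow processing. It does not shorten the time the upload itself takes, which is why the solutions below exist.

```python
import queue
import threading
import uuid

jobs = {}                    # job_id -> status; in-memory stand-in for a job store
work_queue = queue.Queue()   # hands slow work to a background worker


def start_job(payload: bytes):
    """Accept the request and return immediately with 202 Accepted."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = "pending"
    work_queue.put((job_id, payload))
    return 202, {"job_id": job_id, "status": "pending"}


def worker() -> None:
    """Process queued jobs off the request path."""
    while True:
        job_id, payload = work_queue.get()
        # Placeholder for the slow part, e.g. parsing or storing the payload.
        jobs[job_id] = f"done ({len(payload)} bytes)"
        work_queue.task_done()


# Daemon thread so the sketch does not keep the interpreter alive.
threading.Thread(target=worker, daemon=True).start()
```

A consumer would keep the returned `job_id` and check on the job later, rather than waiting on the original call.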
API File Upload Solutions
Now that we’ve covered the context and the problem, we can look at the different solutions you can provide for your API.
The solutions we’ll cover:
- Embed File
- Link File
- Upload then Link
- API Call then Upload
We’ll then close out by looking at some of the other considerations around files and APIs.
To help work through the solutions, let’s establish an example API we can come back to throughout.
Example API: PetCo
We need to deliver APIs for a pet insurance company, PetCo. We will mostly focus on PetCo’s Quote API, a fictitious API that we can call to get an insurance quote from PetCo.
For engineers the endpoint might be: GET api.petco.com/quote
Embed File in API Call
You can embed a file directly into an API call. When the API consumer sends their request for a quote, they include the file in the API call itself, so there is just one request.
In our PetCo Quote API example, if we required a CSV of past medical expenses as part of the quote, and we expected it to be fairly small for a single pet, there would be merit in embedding it in the GET api.petco.com/quote call.
The benefit of this is simplicity. It’s all in one. Easy to understand. Easy to use. Easy to build.
The downside is that you are using smarter compute resources to do something simple, often at a higher cost than purpose-built resources. The other downside is that file processing time slows down your response to the consumer.
This can be suitable for small file sizes.
This isn’t suitable where you will have one or more files that start to measure in megabytes.
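As a rough sketch of what embedding can look like, the snippet below places the CSV inside a JSON request body as base64 text. The field names (`pet_name`, `medical_expenses_csv`) and both helper functions are assumptions made for illustration, not part of any real PetCo API, and base64-in-JSON is only one way to embed a file (a multipart form body is another).

```python
import base64
import json


def build_quote_request(pet_name: str, csv_bytes: bytes) -> str:
    """Consumer side: return a JSON body with the CSV embedded as base64 text."""
    return json.dumps({
        "pet_name": pet_name,
        "medical_expenses_csv": base64.b64encode(csv_bytes).decode("ascii"),
    })


def read_embedded_csv(body: str) -> bytes:
    """Provider side: decode the embedded file back into raw bytes."""
    return base64.b64decode(json.loads(body)["medical_expenses_csv"])


csv_bytes = b"date,amount\n2023-01-10,120.50\n2023-06-02,80.00\n"
body = build_quote_request("Rex", csv_bytes)
```

Base64 encoding makes the payload roughly a third larger than the raw file, which is one more reason this approach fits small files best.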
Link File as URI in API Call
You can link to a file within the API call using a URI (essentially a URL like www.site.com/file-123.pdf). The API service then knows to fetch the file from the URI specified and can download it in whatever way works best.
In our PetCo Quote API example, if we want a photo of the dog as part of the quote, then someone developing against the API could make the photo available in a publicly accessible cloud bucket (e.g. AWS S3 or GCP Storage Bucket) and then put the link to it in the API call.
The benefit of this is simplicity for the API provider (PetCo in our example). It also allows you to use appropriate compute resources for downloading the file to where you need it.
The downside is that this approach, on its own, can be difficult for API consumers (the people developing against and using your API). They need to develop a solution to hold the files that your API needs access to, and this location must be accessible by your API. This may introduce security risks on their side if not implemented well (e.g. photos of anyone that got a quote are publicly available).
These downsides make this an unattractive solution in most cases. However, it gives us the foundation for some more solution options that we will cover next.
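A minimal sketch of the link approach, under stated assumptions: the `photo_uri` field name and both functions are hypothetical, and the HTTPS-only rule is an illustrative policy choice rather than something the approach requires. The request carries only a link, and the provider does a basic sanity check before handing the URI to whatever downloads it.

```python
import json
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"https"}   # assumed policy: only fetch over HTTPS


def build_linked_quote_request(pet_name: str, photo_uri: str) -> str:
    """Consumer side: the request body carries a link, not the file itself."""
    return json.dumps({"pet_name": pet_name, "photo_uri": photo_uri})


def is_fetchable(uri: str) -> bool:
    """Provider side: basic check before passing the URI to a downloader."""
    parsed = urlparse(uri)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)
```

A check like this is only a first filter. Deciding which hosts the service is willing to fetch from is a separate question that falls under the security considerations at the end of the post.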
Upload Then Link
You can build on the Link File solution by providing an upload service to your API consumers.
The way it would work is this:
- Consumer uploads files: API consumer uses your file upload service to upload their file(s) (e.g. PUT files.petco.com/quote/photos).
- Call API: API consumer then calls your API, giving it the links to the files that were uploaded (e.g. GET api.petco.com/quote).
The API call itself can respond quickly and you’ve made it easy for users to upload files. Compute resources are used effectively. So we’ve overcome some of the challenges with the approaches we’ve already covered.
The challenge with this approach is you’re relying on the user to associate files properly and you are potentially allowing file uploads from people that should not be allowed to do so (which makes various attacks possible, not to mention handling files unnecessarily).
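The two steps can be sketched with an in-memory stand-in for the upload service. Everything here is hypothetical: the `uploaded_files` dictionary replaces real storage, the generated link format is invented for the example, and the functions are plain Python rather than real HTTP handlers.

```python
import uuid

uploaded_files = {}   # file URI -> bytes; stand-in for the upload service's storage


def upload_file(data: bytes) -> str:
    """Step 1: the upload service stores the file and returns a link to it."""
    uri = f"https://files.petco.com/quote/photos/{uuid.uuid4()}"
    uploaded_files[uri] = data
    return uri


def get_quote(photo_uris: list) -> dict:
    """Step 2: the API call references previously uploaded files by link."""
    missing = [uri for uri in photo_uris if uri not in uploaded_files]
    if missing:
        return {"status": 400, "error": "unknown file link", "links": missing}
    return {"status": 200, "quote_id": str(uuid.uuid4()), "photos": len(photo_uris)}
```

Note that `upload_file` accepts data from anyone, before any quote exists, and that the consumer is responsible for passing the right links to `get_quote`. Those are exactly the two weaknesses described above.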
API Call Then Upload
This solution works in the following steps:
- Call the API: A consumer calls your API, which returns a token or location as well as authorisation to upload.
- Upload files: The consumer then uploads the files to the location(s) provided, using the token(s) provided as their authorisation to do so.
To continue the PetCo Quote example, here is how we would do a quote with a photo of the dog as well as the CSV with past expenses. Let’s step through it:
- An API consumer could call the Quote API to get a quote (e.g. GET api.petco.com/quote)
- The Get Quote API would respond quickly with a quote identifier or token to use.
- The API consumer then calls the Get Quote File Upload Service, specifying the quote identifier and file type (e.g. PUT files.petco.com/quote/{quote-id}/photos).
- After some time, authorisation to upload is revoked for this API consumer.
The API call itself can respond quickly and compute resources are used most effectively. You can also control data processing.
The downside is relatively minor: you’ve introduced an extra step, so it’s not as simple as embedding the file. However, the cost of this extra step is smaller than the downsides of embedding, even with smaller files.
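One possible way to implement the token and its time-limited authorisation is an HMAC-signed token that names the quote and an expiry time. This is a sketch under assumptions, not the only mechanism (pre-signed storage URLs are a common alternative): `SECRET`, the token layout and both function names are hypothetical, and a real deployment would manage the secret properly.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-secret"   # hypothetical secret shared by the Quote API and the upload service


def _sign(message: str) -> str:
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()


def issue_upload_token(quote_id: str, ttl_seconds: int = 900,
                       now: Optional[float] = None) -> str:
    """Quote API: sign the quote id together with an expiry timestamp."""
    expires = int((time.time() if now is None else now) + ttl_seconds)
    message = f"{quote_id}:{expires}"
    return f"{message}:{_sign(message)}"


def can_upload(token: str, quote_id: str, now: Optional[float] = None) -> bool:
    """Upload service: accept only an unexpired token signed for this quote."""
    try:
        token_quote_id, expires, signature = token.rsplit(":", 2)
        expires_at = int(expires)
    except ValueError:
        return False   # malformed token
    expected = _sign(f"{token_quote_id}:{expires}")
    current = time.time() if now is None else now
    return (
        hmac.compare_digest(signature.encode(), expected.encode())
        and token_quote_id == quote_id
        and current < expires_at
    )
```

Because the expiry is part of the signed token, authorisation lapses on its own after the chosen time, and the upload service can check it without calling back to the Quote API.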
Other Considerations for API File Uploads
Some of the other considerations for uploading files and APIs are:
- File Security: the file you are accepting, however you accept it, introduces security risks. Consider how your preferred solution needs to be secured.
- Authentication: if you have a separate service for file uploads, you will need to consider how you authenticate and secure this service.
- Transfers: there may be use cases where, independent of the API, you allow for files to be transferred to you before or after the API call.