Boundary in Form Data
In this article I discuss what the boundary parameter is in multipart/form-data, the encoding mainly used when an HTML form contains an input of type file. The boundary separates the name/value pairs in the multipart/form-data body: it acts as a marker between each pair of name and value. The boundary parameter is added automatically to the Content-Type header of the HTTP (Hypertext Transfer Protocol) request.
What is multipart/form-data?
It is one of the encoding methods that an HTML (Hypertext Markup Language) form can use to send its data. An HTML form provides three encoding methods:
- application/x-www-form-urlencoded (default)
- multipart/form-data
- text/plain
Generally you use multipart/form-data when your HTML form has an input of type file. You can use this encoding even if the form has no file input, but application/x-www-form-urlencoded is more appropriate in that case. Do not use text/plain for the Content-Type.
In short, when you make a POST request, your data needs to be encoded in the request body in some way, and this is where one of these encoding methods comes into the picture.
application/x-www-form-urlencoded is similar to the query string at the end of a URL. text/plain should be used only for debugging purposes. multipart/form-data is significantly more complex, but it allows the entire content of a file to be included in the body of the request.
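To make the comparison with a query string concrete, here is a small sketch using Python's standard library. The field names and values are made up for illustration; they do not come from any real form.

```python
from urllib.parse import urlencode

# Hypothetical form fields, chosen only for illustration.
fields = {"name": "John Doe", "city": "New York"}

# application/x-www-form-urlencoded uses the same syntax as a URL query
# string: key=value pairs joined by "&", with spaces encoded as "+".
urlencoded_body = urlencode(fields)
print(urlencoded_body)  # name=John+Doe&city=New+York
```

This compact format works well for short text fields, but it has no natural way to carry a file name, a per-field content type, or raw binary data, which is what multipart/form-data adds.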
Where do the name/value pairs come from?
Each name/value pair corresponds to the name and value of an input field in the HTML form that you define in the web page.
The name/value pairs are sent when you submit the HTML form, and the boundary parameter is added automatically upon form submission.
Is an arbitrary value allowed in boundary?
Yes, an arbitrary value is allowed in the boundary parameter. Make sure that the value does not exceed 70 bytes in length and consists only of 7-bit US-ASCII characters.
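The length and character rules can be checked in code. The sketch below follows the boundary grammar in RFC 2046, which is somewhat stricter than "any 7-bit US-ASCII character": it allows letters, digits, a small set of punctuation, and spaces, and the value must not end with a space. The function name is_valid_boundary is a hypothetical helper, not part of any library.

```python
import re

# Characters permitted by the RFC 2046 "bcharsnospace" rule.
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# 1 to 70 characters in total; spaces are allowed, but not as the last one.
_BOUNDARY_RE = re.compile(rf"[{_BCHARS_NOSPACE} ]{{0,69}}[{_BCHARS_NOSPACE}]")


def is_valid_boundary(value: str) -> bool:
    """Return True if value is usable as a multipart boundary parameter."""
    return _BOUNDARY_RE.fullmatch(value) is not None


print(is_valid_boundary("----WebKitFormBoundarydMIgtiA2YeB1Z0kl"))  # True
print(is_valid_boundary("x" * 71))                                  # False
```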
Is the boundary parameter mandatory in multipart/form-data?
Yes. It is mandatory not only in multipart/form-data but in every multipart/* content type.
If you do not specify the boundary parameter, the server will not be able to parse the request payload.
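A receiver has to read the boundary from the Content-Type header before it can split the body. As a rough sketch of that first step, the helper below (extract_boundary is a hypothetical name) uses the standard library's email.message.Message to parse the header; real web frameworks use their own parsers.

```python
from email.message import Message
from typing import Optional


def extract_boundary(content_type: str) -> Optional[str]:
    """Return the boundary of a multipart Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() != "multipart":
        return None
    # get_boundary() removes surrounding quotes and returns None when the
    # parameter is missing, in which case the body cannot be split.
    return msg.get_boundary()


print(extract_boundary(
    "multipart/form-data; boundary=----WebKitFormBoundarydMIgtiA2YeB1Z0kl"
))
print(extract_boundary("multipart/form-data"))  # None: nothing to split on
```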
Is a charset other than US-ASCII allowed?
Yes, you can set the charset parameter in the Content-Type header, for example to UTF-8. Set it unless you are absolutely sure that the payload uses only the US-ASCII charset, which is the default when the charset parameter is absent.
According to RFC 2046, the Content-Type field for multipart entities requires one parameter: boundary.
The boundary delimiter line is then defined as a line consisting entirely of two hyphen characters ("-", decimal value 45) followed by the boundary parameter value from the Content-Type header field, optional linear whitespace, and a terminating CRLF (Carriage Return Line Feed).
Boundary delimiters must not appear within the encapsulated material, and must be no longer than 70 characters, not counting the two leading hyphens.
The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value.
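The delimiter rules above can be sketched as a small encoder for plain text fields. This is a minimal illustration, not a complete implementation: encode_multipart is a hypothetical helper, it handles only string values, and it does no escaping of field names.

```python
def encode_multipart(fields: dict, boundary: str) -> bytes:
    """Encode simple text fields as a multipart/form-data body."""
    delimiter = f"--{boundary}"  # two hyphens + the boundary parameter value
    lines = []
    for name, value in fields.items():
        # The delimiter must not appear inside the encapsulated material.
        if delimiter in value:
            raise ValueError("boundary occurs in field value; pick another")
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")         # blank line separates headers from content
        lines.append(value)
    lines.append(delimiter + "--")  # closing delimiter: no more parts follow
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")  # every line ends with CRLF


body = encode_multipart({"foo": "foo", "bar": "bar"}, "----arbitrary boundary")
print(body.decode("utf-8"))
```

The body would be sent with a header such as Content-Type: multipart/form-data; boundary="----arbitrary boundary". The quotes are needed here because this boundary value contains a space.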
Examples – Boundary in multipart/form-data
Enough talk about the boundary parameter; let's look at some examples.
If you run the Python Flask File Upload example, you will see data similar to what is shown below.
Here I uploaded an image file using the Mozilla Firefox browser (you can use any browser).
Click the Network tab of the browser's developer tools to find this information.
-----------------------------293582696224464
Content-Disposition: form-data; name="file"; filename="roytuts.jpg"
Content-Type: image/jpeg

<content of the file>
-----------------------------293582696224464--
In the above example the boundary value is ---------------------------293582696224464. Each delimiter line is this value prefixed with two hyphens, and the content of each part is written between delimiter lines.
The last delimiter line ends with an additional --, which indicates that no further parts follow.
If you run the file upload example using the Restlet client, you will see a value similar to the following for the boundary parameter in the Content-Type header:
Content-Type: multipart/form-data; boundary=----WebKitFormBoundarydMIgtiA2YeB1Z0kl
Here is an example of an arbitrary boundary:
Content-Type: multipart/form-data; charset=utf-8; boundary="----arbitrary boundary"

------arbitrary boundary
Content-Disposition: form-data; name="foo"

foo
------arbitrary boundary
Content-Disposition: form-data; name="bar"

bar
------arbitrary boundary--
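To show how a receiver could use the boundary to take such a body apart, here is a naive sketch that splits the foo/bar payload on its delimiter. split_parts is a hypothetical helper; it assumes CRLF line endings and that the delimiter never occurs inside a part's content, and a production parser (such as the one inside a web framework) handles many more cases.

```python
def split_parts(body: bytes, boundary: str):
    """Split a multipart body into (raw_headers, content) tuples."""
    delimiter = b"--" + boundary.encode("ascii")
    # Ignore everything from the closing delimiter ("--boundary--") onward.
    main, _, _ = body.partition(delimiter + b"--")
    parts = []
    # The first chunk is the (usually empty) preamble; skip it.
    for chunk in main.split(delimiter)[1:]:
        # Drop the CRLF that ends the delimiter line before the part and
        # the CRLF that belongs to the delimiter line after it.
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]
        if chunk.endswith(b"\r\n"):
            chunk = chunk[:-2]
        # A blank line separates the part headers from the part content.
        headers, _, content = chunk.partition(b"\r\n\r\n")
        parts.append((headers, content))
    return parts


body = (
    b"------arbitrary boundary\r\n"
    b'Content-Disposition: form-data; name="foo"\r\n'
    b"\r\n"
    b"foo\r\n"
    b"------arbitrary boundary\r\n"
    b'Content-Disposition: form-data; name="bar"\r\n'
    b"\r\n"
    b"bar\r\n"
    b"------arbitrary boundary--\r\n"
)
parts = split_parts(body, "----arbitrary boundary")
for headers, content in parts:
    print(headers.decode(), "->", content.decode())
```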
That's all about how the boundary parameter works and why it is needed in multipart/form-data.