MultipartWriter quotes field name wrong #4012

Closed
@kohtala

Long story short

My client needs to send multipart/form-data to an API that expects field names containing []. The server rejects the submission made with the default set_content_disposition parameters because the field name is quoted incorrectly.

Expected behaviour

Content-Disposition: form-data; name="files[]"; filename="filename"

Actual behaviour

Content-Disposition: form-data; name="files%5B%5D"; filename="filename"; filename*=utf-8''filename

Steps to reproduce

Client code is like

    with aiohttp.MultipartWriter('form-data') as mpw:
        f = mpw.append(file)
        f.set_content_disposition("form-data", name="files[]", filename="filename")

    res = await self.session.post(url, data=mpw)

Your environment

aiohttp==3.5.4 async client, Ubuntu 18.04, python 3.6.8.

Analysis

RFC 7578, Returning Values from Forms: multipart/form-data, says:

In most multipart types, the MIME header fields in each part are
restricted to US-ASCII; for compatibility with those systems, file
names normally visible to users MAY be encoded using the percent-
encoding method in Section 2, following how a "file:" URI
[URI-SCHEME] might be encoded.

NOTE: The encoding method described in [RFC5987], which would add a
"filename*" parameter to the Content-Disposition header field, MUST
NOT be used.

It seems the current implementation misinterpreted this to mean that all field values are to be percent-encoded. But RFC 7578 is clear that the encoding is to be used only on file names. Furthermore, the filename*= form from MIME Parameter Value and Encoded Word Extensions should be used only for the other fields; since percent-encoding already brings the filename within US-ASCII, filename*= is not to be used on the filename.
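The output reported under "Actual behaviour" can be reproduced with the standard library alone. This is a minimal sketch that assumes the writer applies something equivalent to `urllib.parse.quote` to every parameter value; that is an assumption about the mechanism, not a reading of the aiohttp source, and `percent_encoded` is a hypothetical helper name.

```python
from urllib.parse import quote


def percent_encoded(value: str) -> str:
    """Percent-encode a parameter value as if it were a URL component."""
    # Assumption: this only mirrors the observed header, not aiohttp's code path.
    return quote(value, safe="")


field_name = "files[]"

# The brackets are escaped even though they are plain US-ASCII.
print(percent_encoded(field_name))  # files%5B%5D

# Every character is already US-ASCII, so no encoding is required at all.
print(all(ord(ch) < 128 for ch in field_name))  # True
```

The second line of output is the point of the analysis: a name that is already US-ASCII gains nothing from percent-encoding, and the server no longer recognises it.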

To convert a unicode string to bytes for percent-encoding, the user will need to specify a charset in some cases, as the RFC notes:

The encoding used for the file names is typically UTF-8, although
HTML forms will use the charset associated with the form.

Thus, in some cases, an additional charset parameter is needed in set_content_disposition. Is it needed in other functions?
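As a rough illustration of why the charset matters, the sketch below percent-encodes the same file name under two charsets. `encode_filename` and its `charset` parameter are hypothetical names used only for this example; they are not part of aiohttp.

```python
from urllib.parse import quote


def encode_filename(filename: str, charset: str = "utf-8") -> str:
    """Convert `filename` to bytes in `charset`, then percent-encode the bytes."""
    return quote(filename, safe="", encoding=charset)


# The same name yields different octets depending on the form's charset.
print(encode_filename("naïve.txt"))             # na%C3%AFve.txt
print(encode_filename("naïve.txt", "latin-1"))  # na%EFve.txt
```

A server that decodes with the form's charset would only recover the original name if the client encoded with the same one, which is the reason a charset argument might be needed.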

The RFCs refer to RFC 822 for the quoted-string definition; RFC 822 has since been superseded by RFC 5322, Internet Message Format.

   qtext           =   %d33 /             ; Printable US-ASCII
                       %d35-91 /          ;  characters not including
                       %d93-126 /         ;  "\" or the quote character
                       obs-qtext

   qcontent        =   qtext / quoted-pair

   quoted-string   =   [CFWS]
                       DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                       [CFWS]

   quoted-pair     =   ("\" (VCHAR / WSP)) / obs-qp

And from Augmented BNF for Syntax Specifications: ABNF

   VCHAR           =   %x21-7E            ; visible (printing) characters

   WSP             =   SP / HTAB          ; white space

The quoted-pair quoting of quoted-string is missing in the current implementation.
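A minimal sketch of quoted-pair quoting, following the grammar quoted above: a backslash is placed before every `"` and `\`, and everything else is left alone. `quoted_string` is a hypothetical helper, not an aiohttp function, and it deliberately ignores non-ASCII input and folding.

```python
import re


def quoted_string(value: str) -> str:
    """Wrap `value` in double quotes, escaping quote and backslash as quoted-pairs."""
    # Only these two characters fall outside qtext and therefore need a quoted-pair.
    escaped = re.sub(r'([\\"])', r'\\\1', value)
    return '"' + escaped + '"'


print(quoted_string("files[]"))       # "files[]"
print(quoted_string('say "hi".txt'))  # "say \"hi\".txt"
```

Brackets pass through untouched because they fall inside the qtext ranges, so `files[]` needs no escaping under this scheme.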

There is also the rather far-fetched case of extremely long values exceeding the line length limit of 998 characters (https://tools.ietf.org/html/rfc5322#section-2.1.1), which would require Folding White Space (FWS).
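For illustration, a check for that limit could look like the sketch below. It only detects an over-long line and does not attempt folding; `exceeds_line_limit` is a hypothetical helper name.

```python
MAX_LINE_LENGTH = 998  # RFC 5322 section 2.1.1 limit, excluding the CRLF


def exceeds_line_limit(header_line: str) -> bool:
    """Return True if the header line is longer than the RFC 5322 hard limit."""
    return len(header_line.encode("utf-8")) > MAX_LINE_LENGTH


short_line = 'Content-Disposition: form-data; name="files[]"'
long_line = 'Content-Disposition: form-data; name="' + "x" * 1000 + '"'

print(exceeds_line_limit(short_line))  # False
print(exceeds_line_limit(long_line))   # True
```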

I cannot tell whether simply changing the percent-quoting to the correct quoted-pair quoting would have any compatibility impact. Should the quote_fields parameter control the percent-encoding of the filename or the quoted-pair quoting of all fields?

The current behavior seems to be the result of the discussion in #916 to fix #903.

Activity

kohtala commented on Aug 31, 2019

I noticed that filename* is specified in RFC 6266, Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP). However, it clearly states:

  Note: This document does not apply to Content-Disposition header
  fields appearing in payload bodies transmitted over HTTP, such as
  when using the media type "multipart/form-data" ([RFC2388]).

RFC 2388 is now obsoleted by RFC 7578.

added 3 commits that reference this issue on Aug 31, 2019

Fix aio-libs#4012 encoding of content-disposition parameters

9d3f5a8

Fix aio-libs#4012 encoding of content-disposition parameters

95d2049

Fix aio-libs#4012 encoding of content-disposition parameters

1a72de7
added a commit that references this issue on Sep 2, 2019

Fix aio-libs#4012 encoding of content-disposition parameters

bf3480f
added a commit that references this issue on Oct 10, 2019

Fix aio-libs#4012 encoding of content-disposition parameters

1b5aa25

kohtala commented on Oct 10, 2019

I checked with Firefox: any form field names containing "[]" are sent without being encoded, and filenames are sent as 8-bit UTF-8 without any encoding. Rereading #916, it seems the quote_fields=False option was added to get the correct behavior. Any API expecting the default quote_fields=True behavior cannot work with submissions from HTML forms. With my patch, quote_fields=False additionally escapes any quotes in the file names.

added a commit that references this issue on Nov 20, 2019

Fix aio-libs#4012 encoding of content-disposition parameters

954fa1f
added 2 commits that reference this issue on Oct 16, 2020

Fix aio-libs#4012 encoding of content-disposition parameters

f886860

Fix aio-libs#4012 encoding of content-disposition parameters

781b507
added 2 commits that reference this issue on Nov 23, 2020

Fix aio-libs#4012 encoding of content-disposition parameters

93c86ba

Fix aio-libs#4012 encoding of content-disposition parameters

f26bded
