fix: strip BOM in Response::text_with_charset (seanmonstar#1898)
The byte order mark (BOM) is now stripped from utf-8 encoded response
bodies when calling `Response::text` and `Response::text_with_charset`.
This should prevent surprising behaviour when trying to use the returned
String.

Closes seanmonstar#1897
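
To illustrate the surprising behaviour mentioned above, here is a minimal std-only sketch. `strip_utf8_bom` is a hypothetical helper written for this example; it is not part of reqwest or `encoding_rs`, and it only shows what a leftover BOM does to a decoded string.

```rust
/// Hypothetical helper (not a reqwest API): drop a leading UTF-8 BOM, if present.
fn strip_utf8_bom(bytes: &[u8]) -> &[u8] {
    bytes.strip_prefix(b"\xEF\xBB\xBF").unwrap_or(bytes)
}

fn main() {
    // A response body that starts with the UTF-8 BOM (EF BB BF).
    let body: &[u8] = b"\xEF\xBB\xBF42";

    // Decoded as-is, the BOM survives as an invisible U+FEFF character,
    // so an otherwise valid number fails to parse.
    let raw = std::str::from_utf8(body).unwrap();
    assert!(raw.starts_with('\u{FEFF}'));
    assert!(raw.parse::<u32>().is_err());

    // With the BOM removed, the text behaves as expected.
    let stripped = std::str::from_utf8(strip_utf8_bom(body)).unwrap();
    assert_eq!(stripped.parse::<u32>(), Ok(42));
}
```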
ollyswanson authored Jul 6, 2023
1 parent 89df5d3 commit 3abcc7c
Showing 1 changed file with 5 additions and 9 deletions.
14 changes: 5 additions & 9 deletions src/async_impl/response.rs
@@ -1,4 +1,3 @@
-use std::borrow::Cow;
 use std::fmt;
 use std::net::SocketAddr;
 use std::pin::Pin;
@@ -130,6 +129,8 @@ impl Response {
     /// Encoding is determined from the `charset` parameter of `Content-Type` header,
     /// and defaults to `utf-8` if not presented.
     ///
+    /// Note that the BOM is stripped from the returned String.
+    ///
     /// # Example
     ///
     /// ```
@@ -155,6 +156,8 @@ impl Response {
     /// `charset` parameter of `Content-Type` header is still prioritized. For more information
     /// about the possible encoding name, please go to [`encoding_rs`] docs.
     ///
+    /// Note that the BOM is stripped from the returned String.
+    ///
     /// [`encoding_rs`]: https://docs.rs/encoding_rs/0.8/encoding_rs/#relationship-with-windows-code-pages
     ///
     /// # Example
@@ -185,14 +188,7 @@ impl Response {
         let full = self.bytes().await?;
 
         let (text, _, _) = encoding.decode(&full);
-        if let Cow::Owned(s) = text {
-            return Ok(s);
-        }
-        unsafe {
-            // decoding returned Cow::Borrowed, meaning these bytes
-            // are already valid utf8
-            Ok(String::from_utf8_unchecked(full.to_vec()))
-        }
+        Ok(text.into_owned())
     }
 
     /// Try to deserialize the response body as JSON.
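
The effect of the last hunk can be sketched with std only. This is an illustration under stated assumptions, not reqwest's code: `decode_utf8_skipping_bom` is a hypothetical stand-in for a decoder that removes a leading BOM and may return a `Cow::Borrowed` slice of its input. `old_text` mirrors the removed branch (with a checked conversion in place of `unsafe`), and `new_text` mirrors the `into_owned()` call.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a BOM-aware decoder; this is NOT the encoding_rs API.
/// It skips a leading UTF-8 BOM and borrows from the input when it is valid UTF-8.
fn decode_utf8_skipping_bom(bytes: &[u8]) -> Cow<'_, str> {
    let body = bytes.strip_prefix(b"\xEF\xBB\xBF").unwrap_or(bytes);
    String::from_utf8_lossy(body)
}

/// Mirrors the removed code path: on `Cow::Borrowed`, rebuild the String from
/// the *full* buffer, which brings back any BOM the decoder had skipped.
fn old_text(full: &[u8]) -> String {
    match decode_utf8_skipping_bom(full) {
        Cow::Owned(s) => s,
        Cow::Borrowed(_) => String::from_utf8(full.to_vec()).expect("valid UTF-8"),
    }
}

/// Mirrors the new code path: keep exactly what the decoder returned.
fn new_text(full: &[u8]) -> String {
    decode_utf8_skipping_bom(full).into_owned()
}

fn main() {
    let full = b"\xEF\xBB\xBFhi";
    // The old path keeps the BOM as a leading U+FEFF; the new path does not.
    assert_eq!(old_text(full), "\u{FEFF}hi");
    assert_eq!(new_text(full), "hi");
}
```

Under these assumptions, `into_owned()` both fixes the BOM handling and removes the need for the `unsafe` block, since the borrowed slice is already a valid `str`.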
