Description
@seanmonstar Thank you very much for this great crate! It is badly needed, and I appreciate that you shared it.
Run the following simple test:
extern crate reqwest;

use reqwest::Error;

fn main() {
    match run() {
        Ok(_) => println!("success!"),
        Err(e) => eprintln!("Error: {}", e),
    }
}

fn run() -> Result<(), Error> {
    // Fetch the page; the response declares charset=ISO-8859-1.
    let mut res = reqwest::get("http://google.com")?;
    // This is the call that fails: text() expects a valid UTF-8 body.
    let _text = res.text()?;
    Ok(())
}
This is the output:
sh-4.4$ ./target/debug/rtest
Error: stream did not contain valid UTF-8
The error happens because Response::text() ignores the Content-Type: text/html; charset=ISO-8859-1 header sent by Google. Response::text() uses read_to_string() from the std library, which explicitly requires UTF-8 input, so any body in another encoding fails as soon as it contains a byte sequence that is not valid UTF-8.
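The failure mode can be reproduced without any network access. The following is only a minimal sketch using the standard library: `read_as_utf8` is a hypothetical helper written for this illustration (it is not part of reqwest), and the sample bytes are made up rather than taken from the Google response.

```rust
use std::io::Read;

/// Hypothetical helper: reads a whole stream into a String with
/// `Read::read_to_string`, which requires the bytes to be valid UTF-8.
fn read_as_utf8(mut reader: impl Read) -> std::io::Result<String> {
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    Ok(text)
}

fn main() {
    // Made-up sample: "café" encoded as ISO-8859-1.
    // The lone byte 0xE9 is not a valid UTF-8 sequence.
    let latin1_body: &[u8] = b"caf\xE9";

    match read_as_utf8(latin1_body) {
        Ok(text) => println!("decoded: {}", text),
        Err(e) => eprintln!("Error: {}", e),
    }
}
```

Running this should take the `Err` branch with an `InvalidData` error, which matches the kind of failure reported above.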
I think it is a rather big problem if reqwest can't handle google.com. You could use the encoding crate and honor the charset given in the Content-Type header. As a short-term workaround, you could provide a method that returns the raw bytes as a &[u8] rather than a String, so that the user can decode the body and work around the bug.
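To illustrate what that workaround could look like on the user side, here is a rough sketch that assumes the body has already been obtained as raw bytes and that the charset label has been extracted from the Content-Type header. `decode_latin1` and `decode_body` are hypothetical names invented for this example, not reqwest APIs, and only two charsets are handled. Note that this is a strict ISO-8859-1 mapping; browsers treat that label as windows-1252, which differs for bytes 0x80 to 0x9F, so a real fix would more likely rely on an encoding library.

```rust
/// Hypothetical helper: decodes strict ISO-8859-1, where every byte
/// maps to the Unicode code point with the same numeric value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Hypothetical helper: decodes a body according to a charset label
/// (assumed to come from the Content-Type header).
fn decode_body(body: &[u8], charset: &str) -> Result<String, String> {
    match charset.to_ascii_lowercase().as_str() {
        "utf-8" => String::from_utf8(body.to_vec()).map_err(|e| e.to_string()),
        "iso-8859-1" => Ok(decode_latin1(body)),
        other => Err(format!("unsupported charset: {}", other)),
    }
}

fn main() {
    // Made-up sample body: "café" encoded as ISO-8859-1.
    let body: &[u8] = b"caf\xE9";

    match decode_body(body, "ISO-8859-1") {
        Ok(text) => println!("decoded: {}", text),
        Err(e) => eprintln!("Error: {}", e),
    }
}
```

The same bytes passed with the label "UTF-8" are rejected, which is why honoring the declared charset matters.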
NOTE: Google may change their page tomorrow and everything will work fine. Nonetheless, I am glad it broke, because otherwise this would have been hard to discover!
Here is the data in case Google changes their pages:
2018_01_16_www.google.com.data_non_utf8.txt.gz
2018_01_16_www.google.com.header.txt.gz
BTW, it is rather funny that the offending bytes are around the "Advertising Program" string :)