Add tests for get_title with multibyte characters #46584

haruhisa-shin · 2024-06-03T08:33:31Z

This PR adds tests for <title> tag values containing multibyte characters, such as those produced by entity references or CJK text.

We want to use these tests to confirm the bug reported in WebKit: https://bugs.webkit.org/show_bug.cgi?id=270063.

whimboo

Thanks for your help in extending the WebDriver tests for the classic protocol! We still lack coverage in a lot of places...

I've taken a look and both of the newly added tests work well for all the supported browsers.

…retrieved correctly
https://bugs.webkit.org/show_bug.cgi?id=270063

Reviewed by Alexey Proskuryakov.

If the document title contains multibyte characters, such as Japanese text or characters written as entity references, the "Get Title" result is garbled. The title string is obtained from JavaScript's "document.title", and this data is encoded in UTF-8. However, the StringBuilder.append() function used to create HTTP messages uses fromLatin1() internally to generate strings from byte data. This seems to be what garbles the multibyte characters.

This patch uses String::fromUTF8() before concatenation so that the correct WTF::String is restored even if it contains multibyte characters. The change to get.py is a regression test for this issue.

This is an export from web-platform-tests/wpt#46584.

* Source/WebDriver/socket/HTTPServerSocket.cpp:
(WebDriver::HTTPRequestHandler::packHTTPMessage const):
* WebDriverTests/imported/w3c/webdriver/tests/classic/get_title/get.py:
(test_strip_and_collapse):
(test_title_included_entity_references):
(test_title_included_multibyte_char):

Canonical link: https://commits.webkit.org/279767@main
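The kind of garbling described in the commit message can be reproduced in a few lines of Python. This is only an illustrative sketch, not the WebKit code path: the helper name `garble_as_latin1` and the sample title are assumptions made up for the example. It shows what happens when UTF-8 bytes are interpreted one byte per character, as a Latin-1 decoder does.

```python
# Illustrative sketch only; this is not the WebKit implementation.
# garble_as_latin1 is a hypothetical helper name chosen for this example.


def garble_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


title = "日本語のタイトル"  # sample title, 8 characters
garbled = garble_as_latin1(title)

# Each of these characters takes 3 bytes in UTF-8, and Latin-1 turns
# every byte into its own character, so the string grows threefold.
print(len(title), len(garbled))  # prints "8 24"

# ASCII-only titles are unaffected, which is why the bug only shows up
# with multibyte characters.
print(garble_as_latin1("Example") == "Example")  # prints "True"

# Decoding the same bytes as UTF-8 restores the original title.
print(garbled.encode("latin-1").decode("utf-8") == title)  # prints "True"
```

Because ASCII survives the wrong decoding unchanged, a test with a plain English title would not catch the problem; a title containing CJK text, as in the tests added here, does.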

Add tests for get_title with multibyte characters

fb794e3

wpt-pr-bot added webdriver wg-testing_browser labels Jun 3, 2024

wpt-pr-bot assigned jgraham Jun 3, 2024

wpt-pr-bot requested review from AutomatedTester, jgraham, jrandolf-2, juliandescottes, sadym-chromium, shs96c and whimboo June 3, 2024 08:33

haruhisa-shin mentioned this pull request Jun 3, 2024

[WebDriver][socket] Titles containing multibyte characters cannot be retrieved correctly WebKit/WebKit#25085

Merged

whimboo approved these changes Jun 3, 2024

whimboo merged commit 27c0328 into web-platform-tests:master Jun 3, 2024
19 checks passed
