Adapt to new ChatGPT HTML format (Aug 2023) #2
Motivation and context
Recently (presumably in the last 1-2 months), ChatGPT changed the HTML formatting of its Q&A page. As a result, the questions cannot be properly parsed.
I dug into this a bit and found that ChatGPT now wraps each question in <div class="empty:hidden"> and </div>.
What's changed
This PR changes the code to accommodate the new wrapper, adds a new test case (downloaded directly from the new ChatGPT page), and updates the old test cases by wrapping each question in <div class="empty:hidden">.
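
For illustration only, here is a minimal sketch of how text inside the new wrapper could be pulled out using just the Python standard library. It is not the code in this PR: the names `WrappedTextExtractor` and `extract_wrapped_text` are hypothetical, and it assumes only what is described above, namely that each question sits inside a <div class="empty:hidden"> element.

```python
from html.parser import HTMLParser


class WrappedTextExtractor(HTMLParser):
    """Collect the text inside each <div class="empty:hidden"> wrapper.

    Hypothetical helper for illustration; not the parser used by this project.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0      # <div> nesting depth inside the current wrapper; 0 means outside
        self.chunks = []    # text pieces collected for the current wrapper
        self.results = []   # one string per completed wrapper

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            # A nested <div> inside a wrapper: track it so we close at the right </div>.
            self.depth += 1
        elif "empty:hidden" in (dict(attrs).get("class") or "").split():
            # Entering a new wrapper.
            self.depth = 1
            self.chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The wrapper itself just closed.
                self.results.append("".join(self.chunks).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_wrapped_text(html):
    """Return the text of every <div class="empty:hidden"> block, in document order."""
    parser = WrappedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    sample = (
        '<div class="group"><div class="empty:hidden">What is HTML?</div></div>'
        '<div class="empty:hidden"><div>Nested <b>text</b></div></div>'
    )
    print(extract_wrapped_text(sample))
```

Tracking the nesting depth rather than stopping at the first </div> keeps the sketch correct when the wrapper contains further <div> elements, which seems likely on a real page.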