Added UK Publisher The Sun #445

Merged
merged 15 commits into from
May 6, 2024
feat: changed subheadline selector in TheSunParser
BorisKalika committed Apr 27, 2024
commit 79a60ad92a59d501b621c7111ade5d885e40c868
2 changes: 1 addition & 1 deletion src/fundus/publishers/uk/the_sun.py
@@ -16,7 +16,7 @@ class TheSunParser(ParserProxy):
    class V1(BaseParser):
        _summary_selector = CSSSelector("div[data-gu-name='standfirst'] p")
        _paragraph_selector = CSSSelector("div.article__content > p")
Collaborator

Some articles also contain subheadlines, such as this one: https://www.thesun.co.uk/betting/21748039/best-monopoly-live-casinos/. It would be great if you could also add a subheadline selector.

Contributor Author

@BorisKalika Apr 26, 2024

As requested, I added a subheadline selector and successfully executed python -m scripts.generate_parser_test_files -p TheSun -o

Contributor Author

@BorisKalika Apr 26, 2024

I also ran black, isort, mypy and pytest. All of them passed on my local machine without any errors :)

Collaborator

Perfect, thanks a lot :)

Contributor Author

@BorisKalika Apr 27, 2024

Did I change everything that was requested, or did I miss something? :)

I think I might've misunderstood what the subheadline of this article is. Could you maybe point it out for me please? :)

Contributor Author

I think I picked the wrong subheadline. I changed the subheadline selector, regenerated the test files, and ran pytest, black, isort and mypy. PyCharm reports no file changes, so I can't commit or push the tests.

Collaborator

I just reviewed it and, judging by the subheadline selector you chose, I think you got the right idea. A subheadline in Fundus is a line of text separating paragraphs into logical entities. For example, in https://www.thesun.co.uk/news/27470413/ukraine-torpedo-submarine-black-sea-battle/ "CAN IT BE REAL?" would be considered a subheadline. In this case I would suggest something like this as the subheadline selector: div.article__content > h2.wp-block-heading
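
To make the idea of a subheadline "separating paragraphs into logical entities" concrete, here is a minimal, standard-library-only sketch. It is not how Fundus implements section splitting; the `SectionSplitter` class, the `split_sections` helper and the sample markup are all hypothetical and only illustrate how each `h2` opens a new section that collects the paragraphs following it.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# A section is (subheadline, paragraphs); the leading section has an empty subheadline.
Section = Tuple[str, List[str]]


class SectionSplitter(HTMLParser):
    """Hypothetical helper: starts a new section at every <h2>, collects <p> text."""

    def __init__(self) -> None:
        super().__init__()
        self.sections: List[Section] = [("", [])]
        self._tag: Optional[str] = None  # "h2" or "p" while inside one of them
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "p"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of some nested or unrelated element
        text = "".join(self._buffer).strip()
        if tag == "h2":
            self.sections.append((text, []))  # subheadline opens a new section
        else:
            self.sections[-1][1].append(text)  # paragraph joins the current section
        self._tag = None


def split_sections(html: str) -> List[Section]:
    parser = SectionSplitter()
    parser.feed(html)
    parser.close()
    # Drop sections that ended up with neither a subheadline nor paragraphs.
    return [s for s in parser.sections if s[0] or s[1]]


# Made-up markup that mirrors the structure targeted by the suggested selector.
SAMPLE = """
<div class="article__content">
  <p>Intro paragraph.</p>
  <h2 class="wp-block-heading">First subheadline</h2>
  <p>Paragraph A.</p>
  <p>Paragraph B.</p>
  <h2 class="wp-block-heading">Second subheadline</h2>
  <p>Paragraph C.</p>
</div>
"""

sections = split_sections(SAMPLE)
```

Without a matching subheadline selector, all paragraphs would land in a single untitled section, which is the shape visible on one side of the failing comparison later in this thread.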

-        _sub_headline_selector = CSSSelector("div.article__content > h1")
+        _sub_headline_selector = CSSSelector("div.toplist_container__jpTyX thesun_container__fty3s > h2")
Collaborator

Suggested change
-        _sub_headline_selector = CSSSelector("div.toplist_container__jpTyX thesun_container__fty3s > h2")
+        _sub_headline_selector = CSSSelector("div.article__content > h2.wp-block-heading")
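
One possible reading of why the earlier selector is replaced: in CSS, whitespace is the descendant combinator, so a bare word without a leading dot is treated as an element name rather than a class. The sketch below is a rough illustration using only the standard library; `compound_selectors` and `describe` are hypothetical helpers that handle only simple selectors like the two in this thread (no attribute selectors or pseudo-classes).

```python
import re
from typing import List

OLD = "div.toplist_container__jpTyX thesun_container__fty3s > h2"
NEW = "div.article__content > h2.wp-block-heading"


def compound_selectors(selector: str) -> List[str]:
    """Split a simple selector on the child (>) and descendant (whitespace) combinators."""
    return [part for part in re.split(r"\s*>\s*|\s+", selector.strip()) if part]


def describe(compound: str) -> str:
    """Report whether a compound carries a class or is a bare type (element) selector."""
    return "has class" if "." in compound else "type selector only"


old_parts = compound_selectors(OLD)
new_parts = compound_selectors(NEW)
old_report = [(part, describe(part)) for part in old_parts]
```

In `old_report`, the middle part `thesun_container__fty3s` comes out as a type selector, meaning the selector would look for an element with that tag name, which is presumably not what was intended.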

Contributor Author

@BorisKalika Apr 28, 2024

Your suggested headline selector is failing 1 pytest test case :(

________________________________________________________________ TestParser.test_parsing[TheSun] _________________________________________________________________

self = <tests.test_parser.TestParser object at 0x1091b9970>, publisher = <UK.TheSun: 5>

    def test_parsing(self, publisher: PublisherEnum) -> None:
        comparative_data = load_test_case_data(publisher)
        html_mapping = load_html_test_file_mapping(publisher)
    
        for versioned_parser in publisher.parser:
            # validate json
            version_name = versioned_parser.__name__
            assert (
                version_data := comparative_data.get(version_name)
            ), f"Missing test data for parser version '{version_name}'"
    
            for key, value in version_data.items():
                if not value:
                    raise ValueError(
                        f"There is no value set for key '{key}' in the test JSON. "
                        f"Only complete articles should be used as test cases"
                    )
    
            # test coverage
            supported_attrs = set(versioned_parser.attributes().names)
            missing_attrs = attributes_required_to_cover & supported_attrs - set(version_data.keys())
            assert (
                not missing_attrs
            ), f"Test JSON for {version_name} does not cover the following attribute(s): {missing_attrs}"
    
            assert list(version_data.keys()) == sorted(
                attributes_required_to_cover & supported_attrs
            ), f"Test JSON for {version_name} is not in alphabetical order"
    
            assert (html := html_mapping.get(versioned_parser)), f"Missing test HTML for parser version {version_name}"
            # compare data
            extraction = versioned_parser().parse(html.content, "raise")
            for key, value in version_data.items():
>               assert value == extraction[key]
E               assert ArticleBody(s...s older."'))]) == ArticleBody(s...s older."'))])
E                 
E                 Omitting 1 identical items, use -vv to show
E                 Differing attributes:
E                 ['sections']
E                 
E                 Drill down into differing attribute sections:
E                   sections: [ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Inst...
E                 
E                 ...Full output truncated (5 lines hidden), use '-vv' to show

self = <tests.test_parser.TestParser object at 0x10527e5e0>, publisher = <UK.TheSun: 5>

    def test_parsing(self, publisher: PublisherEnum) -> None:
        comparative_data = load_test_case_data(publisher)
        html_mapping = load_html_test_file_mapping(publisher)
    
        for versioned_parser in publisher.parser:
            # validate json
            version_name = versioned_parser.__name__
            assert (
                version_data := comparative_data.get(version_name)
            ), f"Missing test data for parser version '{version_name}'"
    
            for key, value in version_data.items():
                if not value:
                    raise ValueError(
                        f"There is no value set for key '{key}' in the test JSON. "
                        f"Only complete articles should be used as test cases"
                    )
    
            # test coverage
            supported_attrs = set(versioned_parser.attributes().names)
            missing_attrs = attributes_required_to_cover & supported_attrs - set(version_data.keys())
            assert (
                not missing_attrs
            ), f"Test JSON for {version_name} does not cover the following attribute(s): {missing_attrs}"
    
            assert list(version_data.keys()) == sorted(
                attributes_required_to_cover & supported_attrs
            ), f"Test JSON for {version_name} is not in alphabetical order"
    
            assert (html := html_mapping.get(versioned_parser)), f"Missing test HTML for parser version {version_name}"
            # compare data
            extraction = versioned_parser().parse(html.content, "raise")
            for key, value in version_data.items():
>               assert value == extraction[key]
E               assert ArticleBody(summary=(), sections=[ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.', 'The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.', 'Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. 
I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))]) == ArticleBody(summary=(), sections=[ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.')), ArticleSection(headline=('Who is Rebecca Cooke?',), paragraphs=('Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.')), ArticleSection(headline=('How long has she been dating Phil Foden?',), paragraphs=('The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.')), ArticleSection(headline=('How many children do couple have?',), paragraphs=('Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the 
birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))])
E                 
E                 Matching attributes:
E                 ['summary']
E                 Differing attributes:
E                 ['sections']
E                 
E                 Drill down into differing attribute sections:
E                   sections: [ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.', 'The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.', 'Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. 
I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))] != [ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.')), ArticleSection(headline=('Who is Rebecca Cooke?',), paragraphs=('Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.')), ArticleSection(headline=('How long has she been dating Phil Foden?',), paragraphs=('The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.')), ArticleSection(headline=('How many children do couple have?',), paragraphs=('Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. 
I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))]
E                   At index 0 diff: ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.', 'The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.', 'Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. 
I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."')) != ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.'))
E                   Right contains 3 more items, first extra item: ArticleSection(headline=('Who is Rebecca Cooke?',), paragraphs=('Rebecca Cooke is the long-term partner of Manchester ...has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.'))
E                   Full diff:
E                     [
E                   +  ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.', 'The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.', 'Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."')),
E                   -  ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.')),
E                   -  ArticleSection(headline=('Who is Rebecca Cooke?',), paragraphs=('Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.')),
E                   -  ArticleSection(headline=('How long has she been dating Phil Foden?',), paragraphs=('The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.')),
E                   -  ArticleSection(headline=('How many children do couple have?',), paragraphs=('Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."')),
E                     ]
E                 Full diff:
E                 - ArticleBody(summary=(), sections=[ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.')), ArticleSection(headline=('Who is Rebecca Cooke?',), paragraphs=('Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.')), ArticleSection(headline=('How long has she been dating Phil Foden?',), paragraphs=('The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.')), ArticleSection(headline=('How many children do couple have?',), paragraphs=('Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. 
I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))])
E                 ?                                                                                                                                                                                                                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^             ^^^^^^^^^^^^^^^^^^         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                        ^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 + ArticleBody(summary=(), sections=[ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden.', 'And the couple have announced some great news with the pair expecting a new addition to the family.', 'Rebecca Cooke is the long-term partner of Manchester City midfielder Phil Foden.', 'Rebecca is thought to be 22 years old and the mother of two children with Phil.', 'She tends to keep out the spotlight and has her Instagram account currently set private, though it does seem to suggest that she goes by the nickname Becca.', 'The exact time at which they started dating is unknown, but they have been together since being teenagers.', 'At the age of 18 she became a mother to their son, Ronnie.', 'A fan account of the couple (@beccafodenx) on Instagram shows the two together, along with a closer look at the blonde bombshell.', 'Phil and Rebecca have a son called Ronnie, 4, and a daughter named True, 1.', 'In April 2024, the couple announced they are expecting a third child.', 'Speaking to Manchester City at the time of the birth of his son, Phil said: "I was there for the birth. I walked out of the room, gave it a little tear and then went back in like nothing happened.', '"I’m not one for crying in front of people. I like to be on my own, but I was there in the room, watched it happen and it was a special moment.', '"Your life changes."', 'He continued, speaking of the things he misses Ronnie doing due to football training: "There are things you miss when you’re not there because you’ve got an away game.', '"I was there when he started crawling, but I think I was in London when he started to walk.', '"Now he’s getting about and walking everywhere, so you have to have eyes in the back of your head or he starts running off.', '"It’s unfortunate to miss things like that but it’s a sacrifice that he’ll appreciate when he’s older."'))])
E                 ?                                                                                                                                                                                                                                                               ^^^             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^         ^^^^^^^^^^^^^^^^                                                                                                                                                                                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^                                                                                                                                                                                   ^^^^^^^^^^^^^^^^^^^     ^^^^^^^^^^^^^  ^^^^^^

tests/test_parser.py:182: AssertionError
==================================================================== short test summary info =====================================================================
FAILED tests/test_parser.py::TestParser::test_parsing[TheSun] - assert ArticleBody(summary=(), sections=[ArticleSection(headline=(), paragraphs=('REBECCA Cooke is the childhood sweetheart of England footballer Phil Foden....

Collaborator

Ah yes, but this is expected behavior since you updated the selectors. If you run pytest now, the extracted content will differ from the test files you generated earlier. If you run python -m scripts.generate_parser_test_files -p TheSun -oj (make sure it's the -oj flag though) and then run pytest again, everything should work just fine :)
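
The failure pattern described here can be sketched in a few lines. This is a simplified stand-in, not the Fundus test harness: `extract`, `ARTICLE` and the `headline_tag` parameter are hypothetical, with `headline_tag` playing the role of the subheadline selector and a JSON string playing the role of the stored test file.

```python
import json
from typing import Dict, List, Tuple

# Made-up article content as (tag, text) pairs.
ARTICLE: List[Tuple[str, str]] = [
    ("p", "Intro."),
    ("h2", "Who is she?"),
    ("p", "Details."),
]


def extract(article: List[Tuple[str, str]], headline_tag: str) -> Dict[str, List[str]]:
    """Hypothetical stand-in for a parser; `headline_tag` acts as the selector."""
    return {
        "headlines": [text for tag, text in article if tag == headline_tag],
        "paragraphs": [text for tag, text in article if tag == "p"],
    }


# 1. Test data is generated while the old selector (h1) is in place.
stored = json.dumps(extract(ARTICLE, "h1"))

# 2. The selector changes to h2, so the stored data no longer matches the extraction.
current = extract(ARTICLE, "h2")
stale_matches = json.loads(stored) == current

# 3. Regenerating the stored data with the new selector restores agreement.
stored = json.dumps(extract(ARTICLE, "h2"))
fresh_matches = json.loads(stored) == current
```

Here `stale_matches` is false and `fresh_matches` is true, which corresponds to the test failing before the test files are regenerated and passing afterwards.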

Contributor Author

You are right, I am sorry for the inconvenience.

I committed and pushed the feature change && newly generated test :)

Collaborator

No worries :)


@attribute
def body(self) -> ArticleBody: