Error using parse_pubmed_xml #95

Open
@sublimotion

Description

It looks like `parse_pubmed_xml` doesn't support the PubMed baseline files?

Calling it on a baseline file raises the error below. The test file doesn't appear to use a similar format either.

pubmed_dict = pp.parse_pubmed_xml('./data/pubmed20n1015.xml') # dictionary output
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-2b4cea8c6fb9> in <module>
----> 1 pubmed_dict = pp.parse_pubmed_xml('./data/pubmed20n1015.xml') # dictionary output

~/anaconda3/envs/python3/lib/python3.6/site-packages/pubmed_parser/pubmed_oa_parser.py in parse_pubmed_xml(path, include_path, nxml)
    155         journal = ""
    156 
--> 157     dict_article_meta = parse_article_meta(tree)
    158     pub_year_node = tree.find(".//pub-date/year")
    159     pub_year = pub_year_node.text if pub_year_node is not None else ""

~/anaconda3/envs/python3/lib/python3.6/site-packages/pubmed_parser/pubmed_oa_parser.py in parse_article_meta(tree)
     67     """
     68     article_meta = tree.find(".//article-meta")
---> 69     pmid_node = article_meta.find('article-id[@pub-id-type="pmid"]')
     70     pmc_node = article_meta.find('article-id[@pub-id-type="pmc"]')
     71     pub_id_node = article_meta.find('article-id[@pub-id-type="publisher-id"]')

AttributeError: 'NoneType' object has no attribute 'find'
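
The traceback shows `tree.find(".//article-meta")` returning `None`, which suggests the file has no `article-meta` element at all. A minimal sketch for checking which kind of file this is, assuming baseline files use a `PubmedArticleSet` root and PMC Open Access files use an `article` root. `guess_pubmed_format` is a hypothetical helper, not part of pubmed_parser:

```python
import io
import xml.etree.ElementTree as ET


def guess_pubmed_format(source):
    """Return 'medline', 'pmc_oa', or 'unknown' from the root element.

    `source` may be a file path or a binary file object.
    Hypothetical helper for illustration; not part of pubmed_parser.
    """
    # "start" events fire as soon as an opening tag is read, so only the
    # root tag is inspected and the rest of the file is never parsed.
    for _event, elem in ET.iterparse(source, events=("start",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "PubmedArticleSet":
            return "medline"  # assumed root of baseline files
        if tag == "article":
            return "pmc_oa"  # assumed root of PMC Open Access files
        return "unknown"
    return "unknown"


# Tiny in-memory samples standing in for real files
baseline_like = io.BytesIO(b"<PubmedArticleSet><PubmedArticle/></PubmedArticleSet>")
oa_like = io.BytesIO(b"<article><front><article-meta/></front></article>")
print(guess_pubmed_format(baseline_like))
print(guess_pubmed_format(oa_like))
```

If the baseline file reports `medline`, that would explain the failure here, since `parse_pubmed_xml` lives in `pubmed_oa_parser.py`. The package also exposes `parse_medline_xml`, which may be the intended entry point for this format. I have not verified that against this file.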
