Parsing a PDB file with the Biopython parser: how to get all atoms, including duplicates?
5.2 years ago
movchinar • 0

Hello everyone,

I am new to bioinformatics. When I tried to parse a PDB file with the Biopython library, I ran into this problem: for atoms that are defined twice in a residue, I cannot get the coordinates, id, name, etc.

How can I get all atoms, including the ones reported as duplicates?

Here is my code:

from Bio.PDB.PDBParser import PDBParser
parser = PDBParser(PERMISSIVE=1)

structure = parser.get_structure("test", "/home/chinar/Downloads/Serinthreonine_ protein kinase, PIM 2/1_doc.pdb")


for model in structure:
    for chain in model:
        for residue in chain:
            for atom in residue:
                print(atom)

Output:

Warning: PDBConstructionException: Atom C defined twice in residue <Residue UNK het=H_UNK resseq=0 icode= > at line 31.

Exception ignored.

Some atoms or residues may be missing in the data structure.
  % message, PDBConstructionWarning)
/usr/local/lib/python3.6/dist-packages/Bio/PDB/PDBParser.py:291:

PDBConstructionWarning: 

PDBConstructionException: Atom C defined twice in residue <Residue UNK het=H_UNK resseq=0 icode= > at line 32.

Exception ignored.

Some atoms or residues may be missing in the data structure.

  % message, PDBConstructionWarning)

<Atom C>

<Atom N>

Is this a PDB file you have created?

Is the file definitely correctly formed?


After the docking process, the result was a file in pdbqt format. I converted it from pdbqt to PDB with a UNIX command (cut -c-66 my_docking.pdbqt > my_docking.pdb).

Here is my PDB file:

MODEL 1

REMARK VINA RESULT:     -11.8      0.000      0.000

REMARK  1 active torsions:

REMARK  status: ('A' for Active; 'I' for Inactive)

REMARK    1  A    between atoms: C_18  and  C_23 

ROOT

HETATM    1  C   UNK     0      25.880   2.303   4.352  0.00  0.00

HETATM    2  C   UNK     0      26.930   7.122  -0.622  0.00  0.00

HETATM    3  C   UNK     0      27.007   7.975  -1.745  0.00  0.00

HETATM    4  C   UNK     0      26.792   7.469  -3.046  0.00  0.00

HETATM    5  C   UNK     0      26.498   6.104  -3.241  0.00  0.00

HETATM    6  C   UNK     0      26.639   5.755  -0.818  0.00  0.00

HETATM    7  C   UNK     0      26.423   5.271  -2.112  0.00  0.00

HETATM    8  N   UNK     0      26.153   3.962  -2.056  0.00  0.00

HETATM    9  N   UNK     0      26.501   4.704   0.026  0.00  0.00

HETATM   10  C   UNK     0      26.195   3.640  -0.750  0.00  0.00

HETATM   11  C   UNK     0      25.974   2.283  -0.221  0.00  0.00

HETATM   12  C   UNK     0      25.681   2.278   1.308  0.00  0.00

HETATM   13  C   UNK     0      26.620   4.618   1.394  0.00  0.00

HETATM   14  C   UNK     0      26.212   3.442   2.083  0.00  0.00

HETATM   15  C   UNK     0      26.293   3.445   3.503  0.00  0.00

HETATM   16  N   UNK     0      26.782   4.546   4.132  0.00  0.00

HETATM   17  C   UNK     0      27.213   5.646   3.465  0.00  0.00

HETATM   18  N   UNK     0      27.120   5.656   2.111  0.00  0.00

ENDROOT

BRANCH  17  19

HETATM   19  C   UNK     0      27.743   6.758   4.154  0.00  0.00

HETATM   20  C   UNK     0      27.954   7.982   3.476  0.00  0.00

HETATM   21  C   UNK     0      28.451   9.104   4.164  0.00  0.00

HETATM   22  C   UNK     0      28.735   9.018   5.539  0.00  0.00

HETATM   23  C   UNK     0      28.546   7.801   6.221  0.00  0.00

HETATM   24  C   UNK     0      28.061   6.673   5.531  0.00  0.00

ENDBRANCH  17  19

TORSDOF 1

ENDMDL

.

.

.

MODEL 20

.

.

.
11 months ago
IkramInf ▴ 20
from Bio.PDB.PDBParser import PDBParser
parser = PDBParser(PERMISSIVE=1)

structure = parser.get_structure("test", "/home/chinar/Downloads/Serinthreonine_ protein kinase, PIM 2/1_doc.pdb")

atoms = set()
for model in structure:
    for chain in model:
        for residue in chain:
            for atom in residue:
                atoms.add(atom)

print(atoms)
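
A note on the warnings in the question: they are raised while the file is being parsed, so the atoms are already missing from the structure before any loop runs over it. Every atom in residue UNK 0 is named just C or N, and Biopython identifies an atom within a residue by its name, so the second and later atoms with the same name are rejected ("Some atoms or residues may be missing in the data structure"). One possible workaround is to give the atoms unique names before parsing. The sketch below is only an illustration under assumptions: `make_atom_names_unique` is a hypothetical helper (not part of Biopython), it assumes fixed-column PDB records with the atom name in columns 13-16, and it assumes the names are bare element symbols as in the file posted above.

```python
from collections import defaultdict


def make_atom_names_unique(lines):
    """Number atom names within each residue: C, C, N -> C1, C2, N1.

    Hypothetical helper, not part of Biopython. Assumes fixed-column PDB
    records with the atom name in columns 13-16.
    """
    counters = defaultdict(int)
    renamed = []
    for line in lines:
        if line.startswith("MODEL"):
            # Each docking pose is its own model, so restart the numbering.
            counters.clear()
        if line.startswith(("ATOM  ", "HETATM")):
            name = line[12:16].strip()
            # Columns 18-27: residue name, chain, residue number, insertion code.
            residue_key = line[17:27]
            counters[(residue_key, name)] += 1
            new_name = f"{name}{counters[(residue_key, name)]}"
            if len(new_name) > 4:
                raise ValueError(f"atom name {new_name!r} does not fit in 4 columns")
            # Pad to 4 columns; one-letter element names conventionally
            # start in column 14.
            field = new_name if len(new_name) == 4 else " " + new_name.ljust(3)
            line = line[:12] + field + line[16:]
        renamed.append(line)
    return renamed


# Small in-memory demo using records copied from the file posted above.
demo = [
    "MODEL 1",
    "HETATM    1  C   UNK     0      25.880   2.303   4.352  0.00  0.00",
    "HETATM    2  C   UNK     0      26.930   7.122  -0.622  0.00  0.00",
    "HETATM    8  N   UNK     0      26.153   3.962  -2.056  0.00  0.00",
    "ENDMDL",
]
for line in make_atom_names_unique(demo):
    print(line)
```

To use it on a real file, read the lines of the converted PDB, pass them through the helper, write the result to a new file, and give that new file to `parser.get_structure`. Because the element columns were cut off during the pdbqt conversion, Biopython has to guess each element from the atom name, so it is worth checking the elements it assigns after renaming.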