Update README.md #379

Merged
merged 3 commits into from
Feb 2, 2017

Conversation

jeiros
Contributor

@jeiros jeiros commented Jan 31, 2017

Add instructions to install sklearn

@rbharath
Member

@jeiros Scikit-learn is included as part of the default Anaconda install, so there shouldn't be a need for a separate installation instruction here. Would you mind checking whether your conda install is missing scikit-learn for some reason?

@peastman
Contributor

A lot of people will be using Miniconda. They'll need to install it.

@jeiros
Contributor Author

jeiros commented Jan 31, 2017

It's not there if one creates a fresh environment, as I did to test out deepchem. These are the commands:

conda create -n deepchem2 python=3.5 -y
source activate deepchem2
conda install -c omnia openbabel=2.4.0 rdkit mdtraj -y
conda install joblib -y
pip install six tensorflow nose
git clone https://github.com/deepchem/deepchem.git
cd deepchem
python setup.py install
nosetests -v deepchem --nologcapture

And here is the log.

@jeiros
Contributor Author

jeiros commented Jan 31, 2017

And here is the output of conda list in this new environment. So sklearn needs to be installed:

# packages in environment at /Users/je714/anaconda3/envs/deepchem2:
#
deepchem                  0.0.5.dev1112             <pip>
hdf5                      1.8.17                        1
joblib                    0.9.4                    py35_0
libiconv                  1.14                          0
libxml2                   2.9.4                         0
mdtraj                    1.8.0               np111py35_1    omnia
mkl                       2017.0.1                      0
nose                      1.3.7                     <pip>
numexpr                   2.6.1               np111py35_2
numpy                     1.11.3                   py35_0
openbabel                 2.4.0                    py35_2    omnia
openssl                   1.0.2k                        0
pandas                    0.19.2              np111py35_1
pip                       9.0.1                    py35_1
protobuf                  3.2.0                     <pip>
pytables                  3.3.0               np111py35_0
python                    3.5.2                         0
python-dateutil           2.6.0                    py35_0
pytz                      2016.10                  py35_0
rdkit                     2015.09.1                py35_2    omnia
readline                  6.2                           2
scipy                     0.18.1              np111py35_1
setuptools                27.2.0                   py35_0
six                       1.10.0                   py35_0
sqlite                    3.13.0                        0
tensorflow                0.12.1                    <pip>
tk                        8.5.18                        0
wheel                     0.29.0                   py35_0
xz                        5.2.2                         1
zlib                      1.2.8                         3
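
A quick way to confirm this from inside the environment, rather than reading the `conda list` output by eye, is to ask the interpreter which modules it can find. The following is a minimal sketch, not part of the PR: the `REQUIRED` list is an assumption assembled from the install commands above, and `missing_modules` is a hypothetical helper name. Note that scikit-learn is imported as `sklearn`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed list of import names, based on the install commands in this thread.
REQUIRED = ["sklearn", "rdkit", "openbabel", "mdtraj", "joblib", "six", "tensorflow"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing from this environment:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run with the environment's own `python`, this should list `sklearn` as missing in the fresh environment shown above.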

@rbharath
Member

@jeiros Ah, I see, that makes a lot of sense. In that case, it might be worth adding a separate section to the README.md that explains how to install within a conda environment (the current write-up assumes that you're not in an environment). Would you be up for modifying the PR with full directions for conda envs? :-)

@jeiros
Contributor Author

jeiros commented Jan 31, 2017 via email

@jeiros
Contributor Author

jeiros commented Feb 2, 2017

@rbharath Done, let me know what you think, or if you want me to write more/less information.

@jeiros
Contributor Author

jeiros commented Feb 2, 2017

I'm trying the install commands in a new environment on a Linux machine (CentOS 7.2 x86_64) with 4 GPUs, and the tests are giving a segfault (output).

Could it be that it needs all the GPUs in the system to be free? At the moment one of them is running an MD production run, so maybe that's the reason:

$ nvidia-smi
Thu Feb  2 16:50:01 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 370.28                 Driver Version: 370.28                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  On   | 0000:05:00.0     Off |                  N/A |
| 22%   61C    P8    30W / 250W |      0MiB / 12206MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  On   | 0000:06:00.0     Off |                  N/A |
| 22%   55C    P8    16W / 250W |      0MiB / 12206MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TIT...  On   | 0000:09:00.0     Off |                  N/A |
| 24%   62C    P8    18W / 250W |      0MiB / 12206MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX TIT...  On   | 0000:0A:00.0     Off |                  N/A |
| 55%   83C    P2   174W / 250W |   1074MiB / 12206MiB |    100%   E. Process |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    3     30964    C   pmemd.cuda_SPFP                               1072MiB |
+-----------------------------------------------------------------------------+
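
One way to test that hypothesis is to hide the busy card from CUDA before TensorFlow starts. The table shows all four GPUs in exclusive-process compute mode ("E. Process"), and GPU 3 is occupied by `pmemd.cuda_SPFP`. Whether this is what triggers the segfault is unconfirmed. The sketch below only builds a `CUDA_VISIBLE_DEVICES` value that leaves out the busy device; `visible_devices_excluding` is a hypothetical helper name.

```python
import os


def visible_devices_excluding(all_gpu_ids, busy_gpu_ids):
    """Build a CUDA_VISIBLE_DEVICES value listing every GPU that is not busy."""
    busy = set(busy_gpu_ids)
    return ",".join(str(i) for i in all_gpu_ids if i not in busy)


# GPU 3 is running pmemd.cuda_SPFP in the nvidia-smi output above.
os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices_excluding(range(4), [3])
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints 0,1,2

# Import TensorFlow (or launch the tests) only after the variable is set.
```

If the tests pass with GPU 3 hidden, the busy card is the likely culprit. If they still segfault, the cause is elsewhere.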

I've checked that it's not the openbabel issue:

$ ipython
Python 3.5.1 |Anaconda custom (64-bit)| (default, Jun 15 2016, 15:32:45)
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import openbabel

So I tried importing deepchem directly to see if there's any problem with that, and it can't find rdkit, although it is installed:

In [2]: import deepchem as dc
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-e2622589c453> in <module>()
----> 1 import deepchem as dc

/home/je714/deepchem/deepchem/__init__.py in <module>()
      6 from __future__ import unicode_literals
      7
----> 8 import deepchem.data
      9 import deepchem.feat
     10 import deepchem.hyper

/home/je714/deepchem/deepchem/data/__init__.py in <module>()
      7
      8 # TODO(rbharath): Get rid of * import
----> 9 from deepchem.data.datasets import pad_features
     10 from deepchem.data.datasets import pad_batch
     11 from deepchem.data.datasets import Dataset

/home/je714/deepchem/deepchem/data/datasets.py in <module>()
     10 import random
     11 from functools import partial
---> 12 from deepchem.utils.save import save_to_disk
     13 from deepchem.utils.save import load_from_disk
     14 from deepchem.utils.save import log

/home/je714/deepchem/deepchem/utils/__init__.py in <module>()
     12 import pandas as pd
     13
---> 14 from rdkit import Chem
     15 from rdkit.Chem.Scaffolds import MurckoScaffold
     16

ImportError: No module named 'rdkit'

In [3]: import rdkit
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-3-e537e29c64f4> in <module>()
----> 1 import rdkit

ImportError: No module named 'rdkit'
$ conda list | grep 'rdkit'
rdkit                     2015.09.1                py35_2    omnia

Here's the full output of conda list in the new deepchem environment after following the installation steps:

# packages in environment at /home/je714/anaconda3/envs/deepchem:
#
boost                     1.59.0                   py35_1    omnia
bzip2                     1.0.6                         3
deepchem                  0.0.5.dev1119             <pip>
hdf5                      1.8.17                        1
joblib                    0.9.4                    py35_0
libgfortran               3.0.0                         1
libiconv                  1.14                          0
libxml2                   2.9.4                         0
mdtraj                    1.8.0               np111py35_1    omnia
mkl                       2017.0.1                      0
nose                      1.3.7                     <pip>
numexpr                   2.6.1               np111py35_2
numpy                     1.11.3                   py35_0
openbabel                 2.4.0                    py35_2    omnia
openssl                   1.0.2k                        0
pandas                    0.19.2              np111py35_1
pip                       9.0.1                    py35_1
protobuf                  3.2.0                     <pip>
pytables                  3.3.0               np111py35_0
python                    3.5.2                         0
python-dateutil           2.6.0                    py35_0
pytz                      2016.10                  py35_0
rdkit                     2015.09.1                py35_2    omnia
readline                  6.2                           2
scikit-learn              0.18.1              np111py35_1
scipy                     0.18.1              np111py35_1
setuptools                27.2.0                   py35_0
six                       1.10.0                   py35_0
sqlite                    3.13.0                        0
tensorflow-gpu            0.12.1                    <pip>
tk                        8.5.18                        0
wheel                     0.29.0                   py35_0
xz                        5.2.2                         1
zlib                      1.2.8                         3

@jeiros
Contributor Author

jeiros commented Feb 2, 2017

Ah, the rdkit problem comes from using ipython. Since ipython isn't installed in the new environment, the shell picks up the one from the default environment, which can't find rdkit. Weird. That still doesn't explain the segfault in the tests, though.
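
This kind of mix-up can be spotted by checking whether a command on `PATH` actually lives under the active environment's prefix. The sketch below is illustrative only; `is_inside` and `check_tool` are hypothetical helper names, and it should be run with the environment's own `python`.

```python
import os
import shutil
import sys


def is_inside(path, prefix):
    """Return True if path is located under the directory prefix."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    try:
        return os.path.commonpath([path, prefix]) == prefix
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def check_tool(name):
    """Report whether the first `name` on PATH belongs to this interpreter's environment."""
    found = shutil.which(name)
    if found is None:
        return name + ": not on PATH"
    where = "inside" if is_inside(found, sys.prefix) else "OUTSIDE"
    return "{}: {} ({} {})".format(name, found, where, sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print(check_tool("ipython"))
```

If `ipython` is reported as outside the environment prefix, installing it into the environment (or using plain `python`) avoids the misleading `ImportError`.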

@rbharath
Member

rbharath commented Feb 2, 2017

Python segfaults are nasty issues... One possible debugging strategy is to start python within gdb (https://wiki.python.org/moin/DebuggingWithGdb) and try to get the stack trace for the segfault. It would also help to figure out a minimal failing test case.
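
A lighter-weight complement to gdb, swapped in here as a suggestion rather than something proposed in this thread, is the standard-library `faulthandler` module. When enabled, it dumps the Python-level traceback to stderr when the process receives a fatal signal such as SIGSEGV. The sketch below starts a child interpreter with `-X faulthandler`; `run_with_faulthandler` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_faulthandler(python_args):
    """Run a child Python with faulthandler enabled and capture its output."""
    cmd = [sys.executable, "-X", "faulthandler"] + list(python_args)
    return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)


if __name__ == "__main__":
    # Confirm the handler is active in the child process.
    check = run_with_faulthandler(
        ["-c", "import faulthandler; print(faulthandler.is_enabled())"])
    print(check.stdout.strip())
```

Assuming nose can be launched as a module in this environment, the test run above could then be wrapped as `run_with_faulthandler(["-m", "nose", "-v", "deepchem", "--nologcapture"])`, and the captured `stderr` inspected for the traceback of the crashing test.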

I'd be OK merging this as-is, with a warning on the conda environment directions that segfaults have sometimes occurred.

@jeiros
Contributor Author

jeiros commented Feb 2, 2017

Done! It's not like modifying the README.md can break anything important 😄

@rbharath
Member

rbharath commented Feb 2, 2017

@jeiros LGTM. Congrats on your first DeepChem PR :-)

@rbharath rbharath merged commit 73d71f2 into deepchem:master Feb 2, 2017

3 participants