This is the course material for the "First Steps with Python in Life Science" three-day course of SIB-training.
The course is addressed to beginners wanting to become familiar with the Python syntax, environment, and the most common commands.
This course material provides an introduction to Python and Jupyter notebooks, a web-based notebook system for creating and sharing computational documents in an interactive manner.
Please ensure you have installed all the required software as indicated in the environment setup section before the start of the course.
- Course material: the easiest way to get the course material is to download the .zip file of the latest release that is available by clicking on this link.
- Google doc: can be used to ask questions (especially for courses taught online) or otherwise provide a means of communication between participants and trainers (e.g. to share code snippets).
The course revolves around a series of Jupyter notebooks that take you on your first steps in your Python journey.
Each jupyter notebook interleaves theory, code examples and exercises. We heartily recommend you execute and play around with these bits of code as you follow along: in programming, perhaps more than anywhere else, practice makes perfect.
Additionally, each notebook is associated with a number of exercises (generally in a separate notebook). Corrections are provided for all exercises.
If you are attending this course with a teacher (or if you are just curious), you can take a look at our schedule.
In short, lessons 0 to 4 deal with general aspects of the Python language, while lessons 5 to 8 present some of the most common modules used in data analysis and/or life sciences.
The notebooks/ directory contains each lesson:
- 00_jupyter_setup
- 01_python_basics
- 02_python_structures
- 03_reading_writing_files
- 04_modules
- 05_module_pandas: handle tabular data (data frames) with pandas
- 06_module_matplotlib: create nice graphics and plots with matplotlib
- 07_module_biopython: do all kinds of bioinformatics with [biopython](https://biopython.org/)
- 08_module_numpy_and_scipy: fast numerical computations with numpy + a bit of statistics with scipy.stats.
Exercise notebooks:
- 01_python_basics_exercises
- 02_python_structures_exercises
- 03_reading_writing_files_exercises
- 04_modules_exercises
- 05_module_pandas_exercises
- 06_module_matplotlib_exercises
- 07_module_biopython_exercises
Data and solutions:
- The data used in the practicals can be found in the notebooks/data directory.
- Solutions can be found in the notebooks/solutions/ directory, but can also be loaded directly from the exercise notebooks.
Have you ever been stuck with a file format that doesn't precisely conform to your needs, found yourself doing annoyingly repetitive data manipulations, or struggled to efficiently manage and explore your data? Python to the rescue!
Python is an open-source and general-purpose programming language which runs on all major operating systems. It was designed to be easily read and written with a comparatively simple syntax, and is thus a good choice for beginners in programming.
Python is applied in many disciplines and is one of the most common languages for bioinformatics. The Python community enthusiastically maintains a rich collection of libraries/modules for everything from web development to machine learning.
In this 3-day course for beginners, participants will learn the basic concepts, data types and code structures necessary to solve routine data manipulation tasks. The course also covers the concepts, terminology, and approach to documentation required to further develop skills in Python programming independently, helping participants take control of their research questions.
Topics covered by this course include:
- A basic introduction to Python and computing in general.
- Overview of the basic data types in python, such as strings, numbers, lists, tuples, and dictionaries.
- Overview of the basic code structures: if/else, for loops and functions.
- Writing your own functions.
- Reading from and writing to files.
- Best practices in Python programming.
- Debugging and documentation.
- Installing and importing external libraries/modules.
- Introduction to some useful python libraries in data science: pandas, matplotlib, scipy, numpy, biopython (note: participants can elect the modules they wish to look at, so not everyone will go through all libraries).
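As a small taste of the first few topics, the sketch below combines a dictionary, a list, a for loop, an if/else test and a user-defined function. The function name `gc_content` and the example sequences are illustrative assumptions and are not taken from the course notebooks.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    sequence = sequence.upper()
    if len(sequence) == 0:
        return 0.0
    gc_count = sequence.count("G") + sequence.count("C")
    return gc_count / len(sequence)


# A dictionary mapping (made-up) sequence names to DNA strings.
sequences = {
    "seq_a": "ATGCGC",
    "seq_b": "ATATAT",
    "seq_c": "GGCC",
}

# Collect the names of GC-rich sequences in a list.
gc_rich = []
for name, dna in sequences.items():
    fraction = gc_content(dna)
    if fraction > 0.5:
        gc_rich.append(name)
        print(name, "is GC-rich:", round(fraction, 2))
    else:
        print(name, "is not GC-rich:", round(fraction, 2))
```

Changing the 0.5 threshold or adding entries to `sequences` is a quick way to experiment with how the loop and the condition interact.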
The course is addressed to beginners who want to become familiar with writing Python code to accomplish common tasks such as automated data parsing, basic statistical operations and graphical representations.
For people who are proficient in programming: this course might be on the slow side for you and an intermediate Python class is recommended (regularly check our upcoming training courses).
By the end of this course, participants will be familiar with the following basic python concepts:
- Basic data types in python (strings, numbers, lists, tuples, dictionaries).
- Basic code structures: if/else, for loops and functions.
- Writing functions.
- Reading from and writing to files.
- Installing and importing external libraries/modules.
- Debugging and documentation.
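To illustrate how several of these concepts fit together, here is a minimal sketch that imports standard-library modules, writes a small table to a file, reads it back, and documents each function with a docstring. The file name `genes.tsv`, the helper names `write_table` and `read_table`, and the table contents are assumptions made up for this example.

```python
import csv
import tempfile
from pathlib import Path


def write_table(path, rows):
    """Write a list of rows to a tab-separated file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerows(rows)


def read_table(path):
    """Read a tab-separated file back into a list of rows (lists of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


# Made-up example data: a header row followed by two records.
rows = [["gene", "length"], ["geneA", "1200"], ["geneB", "850"]]

# Work in a temporary directory so no file is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    table_path = Path(tmp_dir) / "genes.tsv"
    write_table(table_path, rows)
    loaded = read_table(table_path)

print(loaded)
print(read_table.__doc__)
```

In a notebook, `help(read_table)` displays the same docstring, which is one way documentation supports debugging and reuse.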
Participants will also gain an overview of the Python ecosystem and some of its popular libraries in data science and bioinformatics. The basic concepts learned in the course should also enable them to further self-study specific topics of interest and/or attend more advanced python training courses.
This course is designed for beginners; there is no requirement for previous knowledge in Python or programming. However, we encourage people to have some familiarity with basic shell/terminal commands (e.g. navigating the filesystem on the command line). For people with no experience of shell commands, we recommend either taking the SIB's "First Steps with UNIX" course, or completing the SIB's "First Steps with UNIX" e-learning module. Familiarity with basic concepts of algorithmics is also a plus for this course.
Participants are required to have their own laptop with a reasonably recent version of Python installed (version >= 3.10), as well as the Jupyter notebook environment.
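One possible way to check the Python version requirement is to compare `sys.version_info` against the minimum version. This is only a sketch: the helper name `meets_requirement` is hypothetical, and the official installation checks are those described in the environment setup section.

```python
import sys

# Minimum version stated in the course prerequisites.
REQUIRED = (3, 10)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version is at least the required one."""
    return tuple(version_info[:2]) >= required


print("Running Python", sys.version.split()[0])
if meets_requirement():
    print("This Python version is recent enough for the course.")
else:
    print("Please install Python 3.10 or newer.")
```

The same code can be run in a terminal Python session or pasted into a notebook cell.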
For this course, we suggest installing Python via Anaconda, a free and operating system (OS)-agnostic platform for installing Python libraries and environments. Anaconda is bundled with Anaconda Navigator, a graphical user interface which will help ease you into what Python makes possible. For details, please see the environment setup section of the course webpage.
If you use/reuse this material, please cite as:
Robin Engler, Wandrille Duchemin, & Orlin Topalov. (2024, March 18). Course material First steps with Python in Life Sciences. Zenodo, https://doi.org/10.5281/zenodo.10829064.