Bonus page: selected Python packages

Below, you will find a collection of Python packages that might be useful on your journey into advanced Python programming and data analysis. This list is not exhaustive, but it should cover a good deal of ground!

Packages for data visualization

| Package Name | Description | Link |
|---|---|---|
| Matplotlib | Comprehensive data visualization library with both low-level and high-level interfaces for creating 2D plots and charts | https://matplotlib.org/ |
| Seaborn | High-level data visualization library built on top of Matplotlib for creating statistical graphics with attractive defaults | https://seaborn.pydata.org/ |
| Plotly | Interactive data visualization library with support for complex visualizations such as heatmaps, contour plots, and 3D scatterplots | https://plotly.com/python/ |
| Bokeh | Interactive data visualization library with support for creating dashboards and interactive visualizations in web browsers | https://bokeh.org/ |
| Altair | Declarative data visualization library with a simple syntax for creating interactive visualizations, built on the Vega-Lite JSON specification | https://altair-viz.github.io/ |
| ggplot | Python port of R's ggplot2, a data visualization library based on the grammar of graphics | https://pypi.org/project/ggplot/ |
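
As a quick illustration of the kind of code these libraries involve, here is a minimal Matplotlib sketch. The data values and labels are made up for the example, and the figure is written to an in-memory buffer rather than a file so that the snippet runs without a display; in a notebook you would more likely call `plt.show()`.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical measurements, invented for illustration only.
doses = [0, 1, 2, 3, 4]
responses = [0.0, 0.8, 1.9, 3.1, 3.9]

# Create a figure with a single set of axes and draw one line on it.
fig, ax = plt.subplots()
ax.plot(doses, responses, marker="o", label="response")
ax.set_xlabel("Dose")
ax.set_ylabel("Response")
ax.set_title("Dose-response (made-up data)")
ax.legend()

# Render the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

Seaborn builds on this same figure and axes model, so the axis-labelling calls above carry over when you move to it.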

Packages for data analysis

| Package Name | Description | Link |
|---|---|---|
| NumPy | Library for working with arrays and matrices, the foundation for most numerical computing in Python | https://numpy.org/ |
| Pandas | Library for data manipulation and analysis, often used for working with structured (tabular) data | https://pandas.pydata.org/ |
| SciPy | Library for scientific computing, including optimization, integration, interpolation, linear algebra, and statistics | https://www.scipy.org/ |
| Statsmodels | Library for statistical modeling and data analysis, often used for fitting regression models and time series analysis | https://www.statsmodels.org/stable/index.html |
| Scikit-learn | Library for machine learning, often used for classification, regression, and clustering | https://scikit-learn.org/ |
| NLTK | Library for natural language processing, often used for text analysis and processing | https://www.nltk.org/ |
| Gensim | Library for topic modeling and document similarity analysis | https://radimrehurek.com/gensim/ |
| PyCaret | Low-code machine learning library for training and deploying models | https://pycaret.org/ |
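
To give a feel for how NumPy and Pandas fit together, the following sketch computes per-group means with Pandas and then centers the raw values with NumPy. The column names and numbers are hypothetical and serve only to keep the example small.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements, invented for illustration only.
df = pd.DataFrame(
    {
        "group": ["control", "control", "treated", "treated"],
        "value": [1.0, 3.0, 4.0, 8.0],
    }
)

# Pandas: split the rows by group and average each group.
group_means = df.groupby("group")["value"].mean()

# NumPy: vectorized arithmetic on the underlying array,
# here subtracting the overall mean from every value.
values = df["value"].to_numpy()
centered = values - np.mean(values)

print(group_means)
print(centered)
```

Most of the other packages in this table (SciPy, Statsmodels, Scikit-learn) accept NumPy arrays or Pandas objects as input, which is why these two are usually learned first.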

Packages for biomedical data

| Package Name | Description | Link |
|---|---|---|
| Biopython | Library for working with biological data, including DNA, RNA, and protein sequences | https://biopython.org/ |
| Scikit-bio | Library for bioinformatics, including sequence alignment, phylogenetics, and microbiome analysis | https://scikit-bio.org/ |
| NetworkX | Library for working with complex networks, often used for analyzing biological pathways and interactions | https://networkx.github.io/ |
| PyMOL | Molecular visualization system for structural biology | https://pymol.org/ |
| RDKit | Collection of cheminformatics and machine learning tools for working with molecular data | https://www.rdkit.org/ |
| OpenCV | Library for computer vision, often used for analyzing medical images such as X-rays and MRI scans | https://opencv.org/ |

Packages to check for code quality

| Package Name | Description | Link |
|---|---|---|
| Flake8 | Tool for checking PEP 8 compliance and detecting code smells and errors | https://flake8.pycqa.org/ |
| Pylint | Tool for checking Python code against coding standards and detecting programming errors | https://www.pylint.org/ |
| Black | Opinionated code formatter that enforces a consistent style | https://black.readthedocs.io/en/stable/ |
| mypy | Static type checker for Python code | https://mypy.readthedocs.io/en/stable/index.html |
| Bandit | Tool for finding security issues in Python code | https://bandit.readthedocs.io/en/stable/ |
| Pytest | Testing framework for Python code, often used for unit testing and integration testing | https://docs.pytest.org/en/stable/ |
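
The sketch below shows the kind of file these tools operate on: a small type-annotated function (the part mypy checks) followed by two tests written in the naming style that Pytest discovers. The function `mean` and the suggested file name are hypothetical, and the tests use only plain `assert` statements so that the snippet has no third-party imports; in a real test suite, `pytest.raises` would be the more idiomatic way to check for the exception. The `list[float]` annotation assumes Python 3.9 or newer.

```python
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


# Pytest collects functions whose names start with "test_"
# and reports every assert that fails.
def test_mean_of_known_values() -> None:
    assert mean([1.0, 2.0, 3.0]) == 2.0


def test_mean_rejects_empty_input() -> None:
    try:
        mean([])
    except ValueError:
        return  # the expected error was raised
    raise AssertionError("expected ValueError for empty input")
```

Saved as, say, `test_stats.py`, the file could be run through `pytest test_stats.py` to execute the tests, `mypy test_stats.py` to check the annotations, and `black test_stats.py` to normalize the formatting.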