Data Science Tutorial
This tutorial covers topics that I find relevant or interesting for one reason or another 😄. Have fun!
🚧 At the moment this tutorial is under construction 🚧. It will stay in that state for quite some time, but I will add new tutorials one by one. So it might be a good idea to check back periodically for updates.
Installation
You can run the chapter notebooks on your local machine or on Google Colab.
On your local machine you need to clone the repo, create a virtualenv and start your preferred notebook server.
$ git clone git@github.com:ephes/data_science_tutorial.git
$ python -m venv venv
$ venv/bin/python -m pip install jupyterlab
$ venv/bin/jupyter lab
Usually, the dependencies needed for a notebook should be installed from within the notebook itself using the %pip magic command. Packages that you install using %pip are installed into the environment of the running kernel, which is normally the virtualenv that you started the notebook server from. It should make no difference whether you use classic or conda virtual environments.
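If you are unsure which environment a notebook is actually using, a quick check from a notebook cell can help. The following is a minimal sketch, not part of the tutorial notebooks: the helper name `describe_environment` is hypothetical, and the `conda-meta` directory check is only a common heuristic for spotting a conda environment.

```python
import sys
from pathlib import Path


def describe_environment() -> dict:
    """Report which Python environment the current kernel runs in."""
    return {
        # Interpreter that executes this notebook; %pip installs for this one
        "executable": sys.executable,
        # Root directory of the active environment
        "prefix": sys.prefix,
        # Inside a venv/virtualenv the prefix differs from the base install
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Heuristic (assumption): conda environments contain a conda-meta directory
        "in_conda_env": (Path(sys.prefix) / "conda-meta").is_dir(),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `executable` points into the `venv` directory you created above, a cell such as `%pip install numpy` should place the package into that same environment.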
Text Classification
Chapter 01: Getting started with Text Classification docs
- Local: Notebook
- Google Colab: Open in Colab
Foundations
🏗 I have already added the notebooks below because they should work, but they are still very much a work in progress. 👷
Numpy
Numpy is the basis for much of the data science tooling in Python.
Chapter 08: Numpy Overview Covering the Basic Features docs
- Local: Notebook
- Google Colab: Open in Colab
Pandas
Pandas is very useful for all kinds of pre-processing and data cleaning.
Chapter 09: Using Pandas docs
- Local: Notebook
- Google Colab: Open in Colab