PyCaret
PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows (pycaret.org).
It is simple and easy to learn, so you spend less time coding and more time on analysis.
Modules in PyCaret
PyCaret is organized into modules, each covering one machine learning use case. The following modules are supported:
- Classification
- Regression
- Clustering
- Anomaly Detection
- Natural Language Processing
- Association Rules Mining
- Time Series (beta)
Note: The Time Series module is in development and will be available in the next major release.
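Each use case above is imported from its own submodule. The sketch below maps the list to import paths and loads one on demand. The paths follow PyCaret 2.x naming and are an assumption here, since they can differ between releases; `MODULE_PATHS` and `load_module` are hypothetical helper names, not part of PyCaret.

```python
import importlib

# Assumed mapping from use case to import path (PyCaret 2.x naming).
# Check the documentation of your installed version before relying on it.
MODULE_PATHS = {
    "Classification": "pycaret.classification",
    "Regression": "pycaret.regression",
    "Clustering": "pycaret.clustering",
    "Anomaly Detection": "pycaret.anomaly",
    "Natural Language Processing": "pycaret.nlp",
    "Association Rules Mining": "pycaret.arules",
    "Time Series (beta)": "pycaret.time_series",
}


def load_module(use_case: str):
    """Import and return the module for `use_case`, or None if it cannot be imported.

    Raises KeyError if `use_case` is not a key of MODULE_PATHS.
    """
    path = MODULE_PATHS[use_case]
    try:
        return importlib.import_module(path)
    except ImportError:
        # PyCaret is not installed, or this module is missing in this release.
        return None
```

Calling `load_module("Regression")` returns the regression module when PyCaret is installed and `None` otherwise, so a notebook can check which modules are available before using them.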
Installing PyCaret
Install PyCaret with pip:
pip3 install pycaret
If you get a legacy-install-failure error, try this:
pip3 install -U --pre pycaret
Alternatively, use Jupyter Lab in a new virtual environment; see this guide: Using Jupyter Lab in a Virtual Environment
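To confirm that the installation worked, you can check which version of PyCaret is visible to the current Python environment. This is a minimal sketch using only the standard library (Python 3.8 or later); `installed_version` is a hypothetical helper name, not part of PyCaret.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


pycaret_version = installed_version("pycaret")
if pycaret_version is None:
    print("PyCaret is not installed in this environment.")
else:
    print(f"PyCaret {pycaret_version} is installed.")
```

If the check reports that PyCaret is missing even though pip finished without errors, pip may have installed it into a different environment from the one running the script.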
Data
For this demo, you will use a dataset from a case study by the Darden School of Business, published in Harvard Business.
The goal of this tutorial is to predict a diamond's price from attributes such as carat weight, cut, and color. You can download the dataset from PyCaret's repository: https://github.com/pycaret/pycaret/tree/master/datasets
Load the dataset from PyCaret
You can load the data in Python with this code:
# Imports
from pycaret.datasets import get_data
# Get data
data = get_data('diamond')
print(data.head())