Scikit-learn

A powerful Python library for machine learning, featuring various algorithms and tools for data analysis

Overview

Scikit-learn is a free, open-source machine learning library for the Python programming language. It provides classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN.
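
As a rough illustration of how these algorithm families share one estimator interface, the sketch below fits a support vector classifier and a k-means clusterer on the bundled iris data. The dataset choice, `n_clusters=3`, `n_init=10`, and `random_state=0` are assumptions made for this example, not recommendations.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Load the iris features (X) and class labels (y) as NumPy arrays.
X, y = load_iris(return_X_y=True)

# Supervised: a support vector classifier learns from features and labels.
svc = SVC()
svc.fit(X, y)
svc_predictions = svc.predict(X)

# Unsupervised: k-means groups the same features without using the labels.
# n_clusters=3 is an assumption chosen because iris has three species.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Both estimators return one value per sample.
print(svc_predictions.shape, cluster_ids.shape)
```

Both estimators follow the same pattern of constructing an object and then calling `fit`; the classifier needs labels, while the clusterer works from the features alone.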

Key Features

  • Simple and efficient tools for data mining and data analysis
  • Built on NumPy, SciPy, and matplotlib
  • Accessible to everybody and reusable in various contexts
  • Open source, commercially usable
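
To make the NumPy foundation concrete, here is a minimal sketch that passes a NumPy array through a scikit-learn transformer and gets a NumPy array back. The feature values are made up for illustration, and `StandardScaler` is just one convenient transformer to demonstrate with.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# A small made-up feature matrix: 3 samples, 2 features.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# Standardize each column to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# With default settings the result is an ordinary NumPy array,
# so NumPy functions apply to it directly.
print(type(X_scaled).__name__)
print(np.allclose(X_scaled.mean(axis=0), 0.0))
```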

Getting Started

# Install scikit-learn (run this in a shell, not in the Python interpreter)
pip install scikit-learn

# Import the bundled datasets, the train/test splitting helper, and the classifier (Python from here on)
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

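# Load the iris dataset and hold out 30% of the samples for testing
# (random_state is fixed here only so that the split is reproducible)
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)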

# Create a random forest classifier with default settings and train it on the training split
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
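
A natural next step after training is to check the model on the held-out data. The sketch below repeats the steps above so that it runs on its own, then predicts on the test split and reports accuracy. The use of `accuracy_score` and the fixed `random_state=0` values are assumptions for this example; other metrics may suit a given problem better.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the data and hold out 30% of the samples for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Train the classifier on the training split only.
clf = RandomForestClassifier(random_state=0)
clf.fit(X_train, y_train)

# Predict labels for the unseen test samples and compare with the truth.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Scoring on data the model has not seen gives a more honest estimate of performance than scoring on the training split.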