
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "packages/scikit-learn/auto_examples/plot_california_prediction.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_packages_scikit-learn_auto_examples_plot_california_prediction.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_packages_scikit-learn_auto_examples_plot_california_prediction.py:


A simple regression analysis on the California housing data
===========================================================

Here we perform a simple regression analysis on the California housing
data, comparing two types of regressors: a linear regression and
gradient-boosted trees.

.. GENERATED FROM PYTHON SOURCE LINES 9-14

.. code-block:: Python


    from sklearn.datasets import fetch_california_housing

    data = fetch_california_housing(as_frame=True)


.. GENERATED FROM PYTHON SOURCE LINES 15-16

Plot a histogram of the quantity to predict: the house price, in units of $100k.

.. GENERATED FROM PYTHON SOURCE LINES 16-24

.. code-block:: Python

    import matplotlib.pyplot as plt

    plt.figure(figsize=(4, 3))
    plt.hist(data.target)
    plt.xlabel("price ($100k)")
    plt.ylabel("count")
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_001.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 25-26

Plot the price against each feature, one scatter plot per feature.

.. GENERATED FROM PYTHON SOURCE LINES 26-35

.. code-block:: Python


    for index, feature_name in enumerate(data.feature_names):
        plt.figure(figsize=(4, 3))
        plt.scatter(data.data[feature_name], data.target)
        plt.ylabel("Price", size=15)
        plt.xlabel(feature_name, size=15)
        plt.tight_layout()


.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_002.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_002.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_003.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_004.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_004.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_005.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_006.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_006.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_007.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_007.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_008.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_009.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_009.png
         :class: sphx-glr-multi-img


.. GENERATED FROM PYTHON SOURCE LINES 36-37

Simple prediction with a linear regression

.. GENERATED FROM PYTHON SOURCE LINES 37-58

.. code-block:: Python


    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(data.data, data.target)

    from sklearn.linear_model import LinearRegression

    clf = LinearRegression()
    clf.fit(X_train, y_train)
    predicted = clf.predict(X_test)
    expected = y_test

    plt.figure(figsize=(4, 3))
    plt.scatter(expected, predicted)
    plt.plot([0, 8], [0, 8], "--k")
    plt.axis("tight")
    plt.xlabel("True price ($100k)")
    plt.ylabel("Predicted price ($100k)")
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_010.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_010.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 59-60

Prediction with gradient-boosted trees

.. GENERATED FROM PYTHON SOURCE LINES 60-77

.. code-block:: Python


    from sklearn.ensemble import GradientBoostingRegressor

    clf = GradientBoostingRegressor()
    clf.fit(X_train, y_train)

    predicted = clf.predict(X_test)
    expected = y_test

    plt.figure(figsize=(4, 3))
    plt.scatter(expected, predicted)
    plt.plot([0, 5], [0, 5], "--k")
    plt.axis("tight")
    plt.xlabel("True price ($100k)")
    plt.ylabel("Predicted price ($100k)")
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_011.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_011.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 78-79

Print the root-mean-square (RMS) error of the gradient-boosting predictions, in units of $100k.

.. GENERATED FROM PYTHON SOURCE LINES 79-84

.. code-block:: Python

    import numpy as np

    print(f"RMS: {np.sqrt(np.mean((predicted - expected) ** 2))!r} ")

    plt.show()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    RMS: np.float64(0.5314909993118918)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.958 seconds)


.. _sphx_glr_download_packages_scikit-learn_auto_examples_plot_california_prediction.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_california_prediction.ipynb <plot_california_prediction.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_california_prediction.py <plot_california_prediction.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_california_prediction.zip <plot_california_prediction.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
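
The example computes the RMS error inline and only for the last fitted model. As a minimal sketch, the same formula can be wrapped in a small helper so it can be reused for each regressor. The name ``rms_error`` and the toy values below are hypothetical and are not part of the original script.

.. code-block:: Python

    import numpy as np


    def rms_error(expected, predicted):
        """Root-mean-square difference between two equal-length sequences.

        Hypothetical helper: same formula as the inline expression in the
        example, np.sqrt(np.mean((predicted - expected) ** 2)).
        """
        expected = np.asarray(expected, dtype=float)
        predicted = np.asarray(predicted, dtype=float)
        return float(np.sqrt(np.mean((predicted - expected) ** 2)))


    # Toy values (not taken from the housing data): every prediction is off
    # by exactly 2, so the RMS error is 2.
    toy_expected = [3.0, 3.0, 3.0, 3.0]
    toy_predicted = [1.0, 5.0, 1.0, 5.0]
    print(f"RMS: {rms_error(toy_expected, toy_predicted):.3f}")  # prints "RMS: 2.000"

In the example, calling such a helper on ``expected`` and ``predicted`` right after each model's ``predict`` step would report an error for both the linear regression and the gradient-boosted trees. Because ``train_test_split`` is called without a ``random_state``, the exact values change from run to run.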