
DATA 8.2 lab12.REGRESSION


lab12
April 18, 2020

1 Lab 12: Regression

Welcome to Lab 12! Today we will get some hands-on practice with linear regression. You can find more information about this topic in section 15.2.

[1]: # Run this cell, but please don't change it.

     # These lines import the Numpy and Datascience modules.
     import numpy as np
     from datascience import *

     # These lines do some fancy plotting magic.
     import matplotlib
     %matplotlib inline
     import matplotlib.pyplot as plots
     plots.style.use('fivethirtyeight')
     import warnings
     warnings.simplefilter('ignore', FutureWarning)
     warnings.simplefilter('ignore', UserWarning)

     # These lines load the tests.
     import otter
     grader = otter.Notebook()

1.1 1. How Faithful is Old Faithful? Revisited

Let’s revisit a question from lab 1. Last lab, we investigated Old Faithful, a geyser in Yellowstone National Park in the central United States. It’s famous for erupting on a fairly regular schedule. To recap, some of Old Faithful’s eruptions last longer than others. Today, we will use the same dataset on eruption durations and waiting times to see if we can predict the wait time from the eruption duration using linear regression.

The dataset has one row for each observed eruption. It includes the following columns:
- duration: Eruption duration, in minutes
- wait: Time between this eruption and the next, also in minutes

Run the next cell to load the dataset.

[2]: faithful = Table.read_table("faithful.csv")
     faithful

[2]: duration | wait
     3.6      | 79
     1.8      | 54
     3.333    | 74
     2.283    | 62
     4.533    | 85
     2.883    | 55
     4.7      | 88
     3.6      | 85
     1.95     | 51
     4.35     | 85
     … (262 rows omitted)

Remember from last lab that we concluded eruption time and waiting time are positively correlated. The table below, called faithful_standard, contains the eruption durations and waiting times in standard units.
[3]: duration_mean = np.mean(faithful.column("duration"))
     duration_std = np.std(faithful.column("duration"))
     wait_mean = np.mean(faithful.column("wait"))
     wait_std = np.std(faithful.column("wait"))

     faithful_standard = Table().with_columns(
         "duration (standard units)", (faithful.column("duration") - duration_mean) / duration_std,
         "wait (standard units)", (faithful.column("wait") - wait_mean) / wait_std
     )
     faithful_standard

[3]: duration (standard units) | wait (standard units)
     0.0984989                 | 0.597123
     -1.48146                  | -1.24518
     -0.135861                 | 0.228663
     -1.0575                   | -0.655644
     0.917443                  | 1.03928
     -0.530851                 | -1.17149
     1.06403                   | 1.26035
     0.0984989                 | 1.03928
     -1.3498                   | -1.4
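
The standard-unit conversion above is the first step toward a regression line, so a small worked sketch may help. It uses plain NumPy instead of the datascience Table, and only the ten rows visible in the preview of faithful, so its numbers describe that small sample and will not match results from the full dataset. The helper names standard_units, correlation, and regression_line are illustrative choices for this sketch and are not taken from the lab itself. The 3.0-minute eruption at the end is a hypothetical query.

```python
import numpy as np

# The ten rows shown in the preview of `faithful` (262 further rows are omitted
# there), so everything computed below describes this small sample only.
duration = np.array([3.6, 1.8, 3.333, 2.283, 4.533, 2.883, 4.7, 3.6, 1.95, 4.35])
wait = np.array([79, 54, 74, 62, 85, 55, 88, 85, 51, 85], dtype=float)


def standard_units(values):
    """Convert an array to standard units: (value - mean) / SD."""
    return (values - np.mean(values)) / np.std(values)


def correlation(x, y):
    """Correlation r: the mean of the products of x and y in standard units."""
    return np.mean(standard_units(x) * standard_units(y))


def regression_line(x, y):
    """Slope and intercept of the least-squares line predicting y from x."""
    r = correlation(x, y)
    slope = r * np.std(y) / np.std(x)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept


r = correlation(duration, wait)
slope, intercept = regression_line(duration, wait)

# Hypothetical query: predicted wait after a 3.0-minute eruption (sample fit only).
predicted_wait = slope * 3.0 + intercept

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted wait after a 3.0-minute eruption: {predicted_wait:.1f} minutes")
```

In standard units the regression line has slope r and passes through the origin. Converting back to minutes gives the slope r * SD(wait) / SD(duration) used in regression_line. np.std defaults to the population SD, the same default the lab's cell relies on.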


Last updated: 3 years ago
