11.8. Linear Regression Homework¶
Note
We will complete this assignment together in class. Watch for the video recording if you are not in class or on Zoom when we work on it.
Old Faithful is a cone geyser located in Yellowstone National Park in Wyoming, United States. It was named in 1870 during the Washburn-Langford-Doane Expedition and was the first geyser in the park to receive a name. It is a highly predictable geothermal feature, which makes it a favorite to visitors. In 1939, the average wait time between eruptions was 66.5 minutes, but wait times have slowly increased to an average of 90 minutes today. It has erupted every 44 minutes to two hours since 2000. The duration of the last eruption is a general indicator of the wait until the next eruption. A longer eruption predicts a longer wait until the next eruption starts.
Several years ago, measurements were taken of the duration and wait times until
the next eruption for 272 eruptions. This data is in the file
faithful.csv.
Use linear regression to fit a prediction line to the data. Then calculate the coefficient of determination to assess the accuracy of the prediction line. See Goodness of a Fit. Turn in the Python script and a plot image.
Note
Although the polyfit function provides a simple way to compute
regression coefficients, you need to build a design matrix and determine
the coefficients and projection as we did in class
(see Linear Regression Example). This is a requirement because I want everyone
to learn and practice using the linear algebra equations, rather than use
a function that does it for you.