Commit 8bd3e1f

With solutions
1 parent f695ccf commit 8bd3e1f

1 file changed, +22 -5 lines changed

Week 5 - Introduction to Modeling/Lecture5.R

Lines changed: 22 additions & 5 deletions
@@ -162,15 +162,32 @@ mean((lm.train$residuals)^2)
 #
 ###################
 #1. Develop a linear model to predict the rain, using all attributes (hint: you can write the formula this way: lm(rain~., data = fire) to indicate you are using all attributes)
-
+lm.rain <- lm(rain~., data = fire)
 #2. Look at the summary of the model. Which attributes are statistically significant? (with p-value < 0.05)
-
+summary(lm.rain)
 #3. Develop a linear model only using attributes that are statistically significant
-
+lm.rain.reduced <- lm(rain~day+temp+RH, data = fire)
 #4. Compare model 1 and model 3. Are they statistically different? Which one would you choose?
+summary(lm.rain)
+summary(lm.rain.reduced)
+anova(lm.rain,lm.rain.reduced) #Not statistically different because anova gives a p-value of 0.515. Choose the simpler model to avoid overfitting
 
 #5. Split the dataset into training and testing sets. Then use the training set to train the model you selected.
-
+set.seed(1234)
+sample_size <- floor(0.75*nrow(fire))
+train_idx <- sample(seq(nrow(fire)),size = sample_size)
+train_data <- fire[train_idx,]
+test_data <- fire[-train_idx,]
+
+rain.train <- lm(rain~day+temp+RH, data = train_data)
+summary(rain.train)
 #6. Generate predictions for the testing sets.
+rain_pred <- predict(rain.train,test_data)
 
-#7. Compute the mean squared error of the predictions
+#7. Compute the mean squared error of the predictions
+mse <- function(p, r)
+{
+  mean((p-r)^2)
+
+}
+mse(rain_pred,test_data$rain)
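
Note on exercise 4: the `anova()` call in the diff compares two nested linear models with a partial F-test. The following is a minimal sketch of that kind of comparison, not the course solution. It assumes the built-in `mtcars` data as a stand-in, since the `fire` data is not included on this page, and the names `full`, `reduced`, `cmp`, `p_value`, and `chosen` are hypothetical.

```r
# Illustrative only: mtcars stands in for the `fire` data used in the lecture.
full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)

# Partial F-test for nested models. The null hypothesis is that the
# coefficients of the dropped terms (qsec, drat) are all zero.
cmp <- anova(reduced, full)

# The p-value sits in the second row of the "Pr(>F)" column;
# the first row is NA because it has nothing to be compared against.
p_value <- cmp[["Pr(>F)"]][2]

# If the extra terms do not significantly reduce the residual sum of
# squares, prefer the smaller model.
chosen <- if (p_value < 0.05) full else reduced
```

The diff passes the larger model first, while this sketch passes the smaller one first. The F statistic and p-value are the same either way; only the sign of the reported degrees-of-freedom difference changes.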

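Note on exercises 5 to 7: the diff fits the model on a 75% training split and scores it on the held-out rows. The sketch below runs the same workflow end to end on the built-in `mtcars` data, which is an assumption made only so the example runs without the `fire` data. The formula `mpg ~ wt + hp` and the name `train_mse` are likewise hypothetical.

```r
# Illustrative only: mtcars stands in for the `fire` data used in the lecture.
set.seed(1234)                       # make the random split reproducible
n <- nrow(mtcars)
train_idx  <- sample(seq_len(n), size = floor(0.75 * n))
train_data <- mtcars[train_idx, ]
test_data  <- mtcars[-train_idx, ]   # rows not drawn for training

# Fit on the training rows only, then predict the held-out rows.
fit  <- lm(mpg ~ wt + hp, data = train_data)
pred <- predict(fit, newdata = test_data)

# Mean squared error between predicted and observed values.
mse <- function(predicted, observed) mean((predicted - observed)^2)

train_mse <- mean(residuals(fit)^2)  # in-sample error
test_mse  <- mse(pred, test_data$mpg)  # out-of-sample error
```

Comparing `train_mse` with `test_mse` gives a rough check for overfitting: a test error far above the training error suggests the model does not generalize well. `seq_len(n)` is used here instead of `seq(nrow(...))` because it behaves safely when the data frame has zero rows.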