Add multiple regression and chi squared test

master
Ryan Stewart 11 months ago
parent
commit
9b0ab32bef
30
project.rmd

@ -62,7 +62,7 @@ mdl1=lm(below_wage~education,data=df)
summary(mdl1)
```
Here, it can be seen that as education increases, the value of `below_wage` (lower-wage jobs) decreases.
If null hypothesis is that $\beta_1$ = 0, showing no correlation, the p-value would be evaluated highly. In this cause, the p-value is low enough to reject that null hypothesis.
If the null hypothesis is that $\beta_1 = 0$ (no linear relationship), we would expect a large p-value. In this case, the p-value is low enough to reject that null hypothesis.
(I'm just writing generally here, I will make it better when everything else is done)
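The slope test described above can be read directly from the coefficient table of a fitted model, and the same model can feed a `predict` call. This is a minimal sketch, assuming a simulated data frame `sim` in place of `df`: the numbers are made up for illustration, and only the column names `education` and `below_wage` come from the project.

```r
# Simulated stand-in for df; the values are illustrative, not project data.
set.seed(1)
sim <- data.frame(education = runif(47, 8, 13))
sim$below_wage <- 400 - 20 * sim$education + rnorm(47, sd = 15)

mdl <- lm(below_wage ~ education, data = sim)

# The "education" row of the coefficient table holds the slope estimate,
# its standard error, the t statistic and the p-value for H0: beta_1 = 0.
slope_row <- coef(summary(mdl))["education", ]
slope_p <- slope_row[["Pr(>|t|)"]]
reject_null <- slope_p < 0.05

# Predicted below_wage at a hypothetical education value of 10.5,
# with a 95% prediction interval (columns fit, lwr, upr).
pred <- predict(mdl, newdata = data.frame(education = 10.5),
                interval = "prediction")
```

With the real data, `mdl1` would take the place of `mdl`, and the `newdata` value would be whatever education level is of interest.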
```{r}
@ -86,17 +86,22 @@ plot(mdl2)
```
This model assumes homoscedasticity and normally distributed errors. Based on the Residuals vs Fitted and Normal Q-Q plots, those assumptions can be questioned, though the departures are not drastic.
1. Can the number of people in poverty predict crime rate? - Linear regression (should also do a predict command with this)
Here we take a look at the unemployment rate compared to the crime rate. We do not have a single overall unemployment figure, so we must take the sum of the young and old. (We might want to drop this)
See what if anything affects crime rate
```{r}
mdl3 = lm(crime_rate ~ education+expenditure_year0+youth_unemployment+state_size+wage,df)
summary(mdl3)
```
By looking at the p-values, we can see that very little affects the crime rate; the only predictor with a very strong relationship with crime rate is expenditure. The model shows that higher police expenditure is associated with a higher recorded crime rate in those areas, which does not by itself show that spending causes crime.
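Reading the p-values off the summary can also be done programmatically. The following is a sketch under assumptions: `sim` is simulated data that reuses the project's column names, and by construction only `expenditure_year0` drives the simulated `crime_rate`, so it illustrates the mechanics rather than the project's result.

```r
# Simulated stand-in for df; only the column names come from the project.
set.seed(2)
n <- 47
sim <- data.frame(
  education = rnorm(n, 10.5, 1),
  expenditure_year0 = rnorm(n, 85, 30),
  youth_unemployment = rnorm(n, 95, 18),
  state_size = rnorm(n, 37, 38),
  wage = rnorm(n, 525, 96)
)
# By construction, only expenditure affects the simulated crime rate.
sim$crime_rate <- 300 + 7 * sim$expenditure_year0 + rnorm(n, sd = 100)

mdl <- lm(crime_rate ~ education + expenditure_year0 + youth_unemployment +
            state_size + wage, data = sim)

coefs <- coef(summary(mdl))
# Drop the intercept row and keep predictors with a p-value below 0.05.
p_values <- coefs[-1, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]
```

Applied to `mdl3`, `significant` would list the predictors that pass the 0.05 threshold.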
1. How does average education level of people in the area affect the amount of crime that occurs? - Years of education per crime rate
```{r}
plot(mdl3)
```
```{r}
ggplot(df, aes(education, i_crime_rate)) + geom_point() + theme_minimal()
```
As this graph shows, there is little to no correlation between the two variables. Creating any type of model would be ineffective.
1. Can education level predict the amount of crime that occurs? - Linear regression (with prediction?)
@ -104,8 +109,17 @@ As you can see by looking at this graph, there is little to no correlation at al
No
Put this together with the one above
1. Is there more crime from young males compared to any other group? (multiple regression)
young males, mature males, crime rate
1. Is there a relationship between high youth unemployment and southern states?
```{r}
chisq.test(df$southern, df$youth_unemployment)
```
Using the chi-squared test, we fail to find a statistically significant association between southern status and youth unemployment. The p-value is nearly 3 times greater than our significance level of 0.05.
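One caveat: `chisq.test` treats both arguments as categorical, so a numeric column would be tabulated with one level per distinct value. If `youth_unemployment` is numeric (an assumption, since the diff does not show its type), one possible alternative is Welch's two-sample t-test on the group means. This is a sketch on simulated data, not the project's data:

```r
# Simulated stand-in for df; group sizes and values are made up.
set.seed(3)
sim <- data.frame(southern = rep(c(0, 1), times = c(31, 16)))
sim$youth_unemployment <- rnorm(nrow(sim), mean = 95, sd = 18)

# t.test() defaults to Welch's test (var.equal = FALSE), which does not
# assume equal variances in the two groups.
res <- t.test(youth_unemployment ~ southern, data = sim)
res$p.value
```

Here `res$p.value` plays the same role as the chi-squared p-value: a value above 0.05 means no significant difference in mean youth unemployment between the two groups.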
1. Did things get worse over time? (Can interpret this on any other questions we have because it provides both some year and 10 years later) - Do tests on all previous questions
1. How does average education level of people in the area affect the amount of crime that occurs? - Years of education per crime rate
```{r}
ggplot(df, aes(education, crime_rate)) + geom_point() + theme_minimal()
```
As this graph shows, there is little to no correlation between education and crime rate. Fitting any type of model would be ineffective. Therefore, education level does not appear to be a good predictor of crime rate.
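The visual impression of "little to no correlation" could be backed by a number. This is a minimal sketch using `cor.test` on simulated, independent columns; the values are made up, and only the column names match the project.

```r
# Simulated stand-in for df with two unrelated columns.
set.seed(4)
sim <- data.frame(education = rnorm(47, 10.5, 1),
                  crime_rate = rnorm(47, 900, 380))

# Pearson correlation with a test of H0: correlation = 0.
ct <- cor.test(sim$education, sim$crime_rate)
r <- unname(ct$estimate)
p <- ct$p.value
```

On the real data, an `r` near zero with a large `p` would support the reading of the scatter plot.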