Logistic Regression Homework Help


LOGISTIC REGRESSION
This question set tests students' understanding of logistic regression and scatter plots. A data set is provided, and students are expected to write the fitted logistic regression equation, create scatter plots, and write expressions for the estimated logistic probabilities. Our website statisticsassignmentexperts.com provides help with statistics topics.
SOLUTION
1
Number of obs = 30
LR chi2(3)    = 26.92
Prob > chi2   = 0.000
Pseudo R2     = 0.6474
y       Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
x       -1.026793   0.528061    -1.94   0.052   -2.06177    0.008187
td1     -38.19539   16.05323    -2.38   0.017   -69.6591    -6.73165
X_TD1   2.497904    1.035859    2.41    0.016   0.467659    4.52815
_cons   15.75439    8.265005    1.91    0.057   -0.44472    31.9535
The output above is the logistic regression of Y, which records whether a subject successfully completed a task, on X, TD1 (which equals 1 for training and 0 for drugs), and their interaction X_TD1. The interaction is significant, since p = 0.016 < 0.05. When TD1 = 0, a one-unit increase in X reduces the log odds of successfully completing the task by 1.03. When TD1 = 1, a one-unit increase in X raises the log odds by 1.47 (that is, -1.03 + 2.50). At X = 0, the log odds of completing the task are 38.2 lower for TD1 = 1 than for TD1 = 0.
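The fitted equation implied by this interpretation can be sketched as below. The signs of the coefficients are taken from the interpretation (X lowers the log odds when TD1 = 0, and TD1 = 1 lowers them by 38.2), and the coefficients are rounded to three decimals, so this is an illustration and not a reproduction of the original solution.

```latex
\begin{align*}
\operatorname{logit} P(Y=1) &= 15.754 - 1.027\,X - 38.195\,TD1 + 2.498\,(X \times TD1) \\
TD1 = 0:\quad \text{slope of } X &= -1.027 \\
TD1 = 1:\quad \text{slope of } X &= -1.027 + 2.498 = 1.471
\end{align*}
```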
2
1
i
ii
iii
We used logistic regression because the outcome variable is binary, with only two categories (lived or died).
2
3
i
Age category   Mean of STA
15-24          0.077
25-34          0
35-44          0.182
45-54          0.2
55-64          0.205
65-74          0.18
75-84          0.3
85-94          0.455
ii
4
i
The likelihood function can be written as:
The log likelihood function can be written as:
ii
The expression for the likelihood function can be written as:
The expression for the log likelihood is given as:
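In the standard notation for a logistic model with a single covariate, these expressions take the form sketched below. The symbols $\pi(x_i)$, $\beta_0$, $\beta_1$ and $n$ are assumptions made for this sketch, with $y_i$ the binary outcome (STA) and $x_i$ the covariate (age) for subject $i$.

```latex
\begin{align*}
\pi(x_i) &= \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}} \\
L(\beta_0, \beta_1) &= \prod_{i=1}^{n} \pi(x_i)^{y_i}\,\bigl[1 - \pi(x_i)\bigr]^{1 - y_i} \\
\ln L(\beta_0, \beta_1) &= \sum_{i=1}^{n} \Bigl\{ y_i \ln \pi(x_i) + (1 - y_i) \ln\bigl[1 - \pi(x_i)\bigr] \Bigr\}
\end{align*}
```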
5
i
sta     Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
age     0.027543    0.010565    2.61    0.009   0.006836    0.048249
_cons   -3.05851    0.696122    -4.39   0.000   -4.42289    -1.69414
ii
iii
6
Test   Chi2   p
LR     7.85   0.0051
Wald   6.8    0.0091
ii
The variance must not be misspecified for the p-values to be valid.
iii
For the likelihood ratio test, the statistic follows a chi-square distribution with 1 degree of freedom under the null hypothesis.
For the Wald test, the Wald statistic z follows a standard normal distribution (mean 0, variance 1) under the null hypothesis; equivalently, its square follows a chi-square distribution with 1 degree of freedom.
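As a quick check on these reference distributions, the p-values reported for the two tests can be recomputed from the chi-square statistics. The sketch below uses only the Python standard library; the function name `chi2_1df_pvalue` is a hypothetical helper, and the identity it relies on holds only for 1 degree of freedom.

```python
import math

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 df at `stat`."""
    # For 1 df, P(X > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Test statistics as reported in the LR / Wald table.
p_lr = chi2_1df_pvalue(7.85)
p_wald = chi2_1df_pvalue(6.8)
print(round(p_lr, 4), round(p_wald, 4))
```

Both values should agree with the reported p-values to the precision allowed by the rounded statistics.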
iv
The deviance of the model is given as:
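In general terms, and assuming ungrouped binary data (so that the saturated model has likelihood 1), the deviance reduces to minus twice the fitted log likelihood. The symbols $D$, $\hat{\pi}_i$ and $n$ are notation assumed for this sketch; no numerical value is implied.

```latex
\begin{align*}
D &= -2 \ln\!\left[\frac{L(\text{fitted model})}{L(\text{saturated model})}\right]
   = -2 \ln L(\hat{\beta}_0, \hat{\beta}_1) \\
  &= -2 \sum_{i=1}^{n} \Bigl\{ y_i \ln \hat{\pi}_i + (1 - y_i) \ln\bigl(1 - \hat{\pi}_i\bigr) \Bigr\}
\end{align*}
```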
7
i
The 95% confidence interval is given by Stata as:
Lower      Upper
0.006836   0.048249
ii
The 95% confidence interval for the slope ranges from 0.0068 to 0.0482. We are therefore 95% confident that a one-year increase in age raises the log odds of STA = 1 by between 0.0068 and 0.0482, which corresponds to multiplying the odds by a factor between about 1.007 and 1.049.
8
i
The variance-covariance matrix is given as:
            age        _cons
sta:age     0.000112
sta:_cons   -0.0071    0.484586
ii
The logit is given as:
The estimated probability is given as:
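Using the coefficients from the fit in question 5 and the age of 60 referred to in part iv, the calculation can be sketched as follows. The symbols $\hat{g}$ and $\hat{\pi}$ are notation assumed here, and the intercept is taken as negative, which is what the reported probability of 0.8031 for "lived" implies.

```latex
\begin{align*}
\hat{g}(60) &= -3.05851 + 0.027543 \times 60 = -1.40593 \\
\hat{\pi}(60) &= \frac{e^{\hat{g}(60)}}{1 + e^{\hat{g}(60)}} = \frac{e^{-1.40593}}{1 + e^{-1.40593}} \approx 0.1969 \\
1 - \hat{\pi}(60) &\approx 0.8031
\end{align*}
```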
iii
From the estimated variance-covariance matrix:
The 95% confidence interval of the logit is:
The 95% confidence interval of the estimated probability is:
iv
The estimated probability that the vital status at hospital discharge is "lived", given an age of 60, is 0.8031. We are 95% confident that the population value lies between approximately 0.74 and 0.86.
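The interval calculation for an age of 60 can be sketched in a few lines of Python. This is an illustration under stated assumptions: the intercept and the age/intercept covariance are taken as negative, the covariance is only available to four decimals (so the endpoints are approximate), and the helper names `logit_ci` and `inv_logit` are hypothetical.

```python
import math

# Coefficients from the `logit sta age` fit (intercept assumed negative).
b0, b1 = -3.05851, 0.027543
# Variance-covariance entries; the covariance is assumed negative and is
# rounded to four decimals, so the resulting interval is approximate.
var_b0, var_b1, cov_b0b1 = 0.484586, 0.000112, -0.0071

def logit_ci(age, z=1.96):
    """Return (logit, lower, upper) for the fitted logit at `age`."""
    g = b0 + b1 * age
    # Var(b0 + b1*age) = Var(b0) + age^2 * Var(b1) + 2 * age * Cov(b0, b1)
    var_g = var_b0 + age ** 2 * var_b1 + 2 * age * cov_b0b1
    half_width = z * math.sqrt(var_g)
    return g, g - half_width, g + half_width

def inv_logit(g):
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-g))

g, lo, hi = logit_ci(60)
p_died = inv_logit(g)          # P(STA = 1 | age = 60)
p_lived = 1.0 - p_died         # P(STA = 0 | age = 60)
# The interval for P(lived) is the complement of the interval for
# P(died), with the endpoints swapped.
ci_lived = (1.0 - inv_logit(hi), 1.0 - inv_logit(lo))
print(round(p_lived, 4), [round(v, 3) for v in ci_lived])
```

Transforming the endpoints of the logit interval, instead of building a symmetric interval on the probability scale, keeps the result inside (0, 1).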
CODES
//question 1
// interaction between x and the treatment indicator td1
gen X_TD1 = x * td1
logit y x td1 X_TD1
//question 2
//2.2
// STA coding assumed to be 0 = lived, 1 = died, consistent with the fitted probabilities
label define st 0 "lived" 1 "died"
label values sta st
twoway (scatter sta age), ytitle(STA) xtitle(AGE) title("Scatterplot of STA against AGE")
//2.3(i)
// group age into 10-year bands
gen agecat = .
replace agecat = 1 if age >= 15 & age <= 24
replace agecat = 2 if age >= 25 & age <= 34
replace agecat = 3 if age >= 35 & age <= 44
replace agecat = 4 if age >= 45 & age <= 54
replace agecat = 5 if age >= 55 & age <= 64
replace agecat = 6 if age >= 65 & age <= 74
replace agecat = 7 if age >= 75 & age <= 84
replace agecat = 8 if age >= 85 & age <= 94
label define agect 1 "15-24 years" 2 "25-34 years" 3 "35-44 years" 4 "45-54 years" 5 "55-64 years" 6 "65-74 years" 7 "75-84 years" 8 "85-94 years"
label values agecat agect
// mean of STA within each age band
sort agecat
by agecat: summarize sta
//2.3(ii)
// mean of STA per age band and the midpoint of each band
input meansta midage
.0769231 19.5
0 29.5
.1818182 39.5
0.2 49.5
.2051282 59.5
0.18 69.5
0.3 79.5
.4545455 89.5
end
twoway (scatter sta age) (scatter meansta midage), ytitle(STA) xtitle(AGE)
//2.5(i)
logit sta age
estimates store m
//2.5(iii)
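// fitted P(sta = 1) = 1 / (1 + exp(-(b0 + b1*age))), with b0 = -3.05851 and b1 = 0.027543
gen expneg = exp(3.05851 - 0.027543 * age)
gen fitp = 1 / (1 + expneg)
twoway (scatter fitp age), ytitle(Predicted probabilities) xtitle(AGE) xlabel(20(20)100) ylabel(0(0.2)1)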
//2.6(i)
// intercept-only model for the likelihood ratio test against model m
logit sta
estimates store m2
lrtest m m2
// Wald test of the age coefficient
logit sta age
test age
//2.8(i)
logit sta age
matrix list e(V)