Logistic Regression Homework Help

LOGISTIC REGRESSION

This question set tests students' understanding of logistic regression and scatter plots. A data set is provided, and students are expected to write the logistic regression equation, create a scatter plot, and write the equation for the estimated logistic probabilities. Our website, statisticsassignmentexperts.com, offers help with these and other statistics topics.

SOLUTION

1

Number of obs = 30
LR chi2(3)    = 26.92
Prob > chi2   = 0.000
Pseudo R2     = 0.6474

y        Coef.        Std. Err.   z       P>|z|   [95% Conf. Interval]
x        -1.026793    0.528061    -1.94   0.052   -2.06177    0.008187
td1      -38.19539    16.05323    -2.38   0.017   -69.6591    -6.73165
X_TD1    2.497904     1.035859    2.41    0.016   0.467659    4.52815
_cons    15.75439     8.265005    1.91    0.057   -0.44472    31.9535

The output above reports the logistic regression of Y, which records whether or not a subject successfully completed a task, on X, TD1 (which equals 1 for training and 0 for drugs), and their interaction X_TD1. The interaction is significant at the 5% level (p = 0.016 < 0.05). When TD1 = 0, a one-unit increase in X reduces the log odds of successfully completing the task by 1.03, although this effect is only marginally significant (p = 0.052). When TD1 = 1, a one-unit increase in X increases the log odds of successfully completing the task by 1.47 (-1.03 + 2.50). At X = 0, the log odds of completing the task are lower for TD1 = 1 than for TD1 = 0 by 38.2; because of the interaction, this gap changes with X.
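The slope arithmetic in this interpretation can be checked with a short Python sketch. The coefficients are copied from the output above; the function name log_odds and the crossover calculation are illustrative additions, not part of the original solution.

```python
import math

# Coefficients copied from the Stata output for question 1.
b_cons = 15.75439     # intercept
b_x = -1.026793       # slope of x when td1 = 0
b_td1 = -38.19539     # shift in log odds for td1 = 1 at x = 0
b_inter = 2.497904    # interaction: extra slope of x when td1 = 1

def log_odds(x, td1):
    """Fitted log odds of y = 1 for a given x and td1 (0 or 1)."""
    return b_cons + b_x * x + b_td1 * td1 + b_inter * x * td1

slope_td0 = b_x               # effect of x on the log odds when td1 = 0
slope_td1 = b_x + b_inter     # effect of x on the log odds when td1 = 1

# Value of x at which the two groups have equal fitted log odds
# (solves b_td1 + b_inter * x = 0). Illustrative only.
crossover_x = -b_td1 / b_inter

print(f"slope when td1 = 0: {slope_td0:.3f} (odds ratio {math.exp(slope_td0):.3f})")
print(f"slope when td1 = 1: {slope_td1:.3f} (odds ratio {math.exp(slope_td1):.3f})")
print(f"gap at x = 0: {log_odds(0, 1) - log_odds(0, 0):.2f}")
print(f"groups cross at x = {crossover_x:.2f}")
```

The gap of -38.2 applies only at X = 0; the sketch shows it shrinking as X grows, which is what a positive interaction term implies.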

2

1

i

ii

iii

We used logistic regression because the outcome variable is binary, with only two categories (lived or died).

2

3

i

Age category   Mean of STA
15-24          0.077
25-34          0
35-44          0.182
45-54          0.2
55-64          0.205
65-74          0.18
75-84          0.3
85-94          0.455

ii

4

i

The likelihood function can be written as

The log likelihood function may be written as:

ii

The expression for the likelihood function may be written as:

The expression for the log likelihood is given as
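For reference, the standard likelihood and log likelihood of a binary logistic model can be written as below. The notation is an assumption and is not taken from the original solution: y_i is STA for subject i, x_i is AGE, n is the number of subjects, and \pi(x_i) is P(y_i = 1 | x_i).

```latex
\pi(x_i) = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}

L(\beta_0, \beta_1) = \prod_{i=1}^{n} \pi(x_i)^{y_i} \, \bigl[1 - \pi(x_i)\bigr]^{1 - y_i}

\ell(\beta_0, \beta_1) = \ln L(\beta_0, \beta_1)
  = \sum_{i=1}^{n} \Bigl\{ y_i \ln \pi(x_i) + (1 - y_i) \ln\bigl[1 - \pi(x_i)\bigr] \Bigr\}
  = \sum_{i=1}^{n} \Bigl\{ y_i (\beta_0 + \beta_1 x_i) - \ln\bigl(1 + e^{\beta_0 + \beta_1 x_i}\bigr) \Bigr\}
```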

5

i

sta      Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
age      0.027543    0.010565    2.61    0.009   0.006836    0.048249
_cons    -3.05851    0.696122    -4.39   0.000   -4.42289    -1.69414

ii

iii

6

        Chi2    p
LR      7.85    0.0051
Wald    6.8     0.0091

ii

The variance must be correctly specified for the p-values to be valid.

iii

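For the likelihood ratio test, the statistic follows (asymptotically) a chi-square distribution with 1 degree of freedom under the null hypothesis.

For the Wald test, the z statistic follows (asymptotically) a standard normal distribution, with mean 0 and variance 1, under the null hypothesis; equivalently, its square, the Wald chi-square statistic, follows a chi-square distribution with 1 degree of freedom.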
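The two p-values in the table for this question can be reproduced from these reference distributions. The sketch below uses only the Python standard library; the helper name chi2_sf_1df is made up for illustration, and the Wald statistic is rebuilt from the rounded coefficient and standard error reported for age, so small rounding differences are expected.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # If Z ~ N(0, 1) then Z^2 ~ chi-square(1), so P(Z^2 > s) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Likelihood ratio statistic as reported in the table.
lr_stat = 7.85

# Wald statistic rebuilt from the reported age coefficient and standard error.
coef_age = 0.027543
se_age = 0.010565
wald_z = coef_age / se_age
wald_stat = wald_z ** 2

p_lr = chi2_sf_1df(lr_stat)
p_wald = chi2_sf_1df(wald_stat)

print(f"LR:   chi2 = {lr_stat:.2f}, p = {p_lr:.4f}")
print(f"Wald: z = {wald_z:.2f}, chi2 = {wald_stat:.2f}, p = {p_wald:.4f}")
```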

iv

The deviance of the model is given as

7

i

The 95% confidence interval given by Stata is

lower       upper
0.006836    0.048249

ii

The 95% confidence interval for the slope estimate ranges from 0.0068 to 0.0482. We are therefore 95% confident that a one-unit increase in age increases the log odds of STA = 1 by between 0.0068 and 0.0482. The coefficient is on the log-odds scale, so it is not a change in probability.
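As a check, the interval can be rebuilt from the coefficient and standard error reported for age, and exponentiated to put it on the odds-ratio scale. This is a sketch from the rounded values in the output; the variable names are illustrative.

```python
import math

# Values copied from the logit output for sta on age.
coef_age = 0.027543
se_age = 0.010565
z_crit = 1.959964  # 97.5th percentile of the standard normal distribution

# Wald interval on the log-odds scale.
lower = coef_age - z_crit * se_age
upper = coef_age + z_crit * se_age

# Exponentiating moves from the log-odds scale to the odds-ratio scale.
or_point = math.exp(coef_age)
or_lower = math.exp(lower)
or_upper = math.exp(upper)

print(f"log-odds slope: {coef_age:.6f}, 95% CI ({lower:.6f}, {upper:.6f})")
print(f"odds ratio per unit of age: {or_point:.4f}, 95% CI ({or_lower:.4f}, {or_upper:.4f})")
```

An odds ratio interval that lies entirely above 1 says the same thing as a log-odds interval that lies entirely above 0.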

8

i

The variance-covariance matrix is given as:

             age         _cons
sta:age      0.000112
sta:_cons    -0.0071     0.484586

ii

The logit is given as:

The estimated probability is given as
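A worked version of these two quantities, assuming the age of 60 used in part iv and the coefficients from the logit output for sta on age. The symbol \hat{g} for the estimated logit is notation assumed here. The second probability is the complement, 1/(1 + e^{\hat{g}}), which is the form computed as fitp in the code section and the value quoted in part iv.

```latex
\hat{g}(60) = \hat{\beta}_0 + \hat{\beta}_1 \times 60
            = -3.05851 + 0.027543 \times 60
            = -1.40593

\hat{P}(\mathrm{STA} = 1 \mid \mathrm{age} = 60)
  = \frac{e^{\hat{g}(60)}}{1 + e^{\hat{g}(60)}}
  = \frac{e^{-1.40593}}{1 + e^{-1.40593}} \approx 0.1969

\hat{P}(\mathrm{STA} = 0 \mid \mathrm{age} = 60)
  = \frac{1}{1 + e^{\hat{g}(60)}}
  = 1 - 0.1969 \approx 0.8031
```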

iii

From the estimated variance-covariance matrix,

The 95% confidence interval of the estimated logit is

The 95% confidence interval of the estimated probability is
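The steps in this part can be sketched in Python as follows. The sketch assumes age = 60 (the value used in part iv) and uses the rounded variance-covariance entries printed above, so the results are approximate. The variance of the logit is var(_cons) + age^2 * var(age) + 2 * age * cov(age, _cons); the variable and function names are illustrative.

```python
import math

# Coefficients and (rounded) variance-covariance entries from the output above.
b0, b1 = -3.05851, 0.027543
var_b0 = 0.484586
var_b1 = 0.000112
cov_b0_b1 = -0.0071
age = 60          # assumed: the age used in part iv
z_crit = 1.96

def inv_logit(g):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-g))

# Estimated logit and its standard error at the chosen age.
logit = b0 + b1 * age
var_logit = var_b0 + age ** 2 * var_b1 + 2 * age * cov_b0_b1
se_logit = math.sqrt(var_logit)

# Interval on the logit scale, then transformed to the probability scale.
logit_lo = logit - z_crit * se_logit
logit_hi = logit + z_crit * se_logit
p1, p1_lo, p1_hi = inv_logit(logit), inv_logit(logit_lo), inv_logit(logit_hi)

# Complementary probability P(STA = 0); the interval endpoints swap.
p0, p0_lo, p0_hi = 1 - p1, 1 - p1_hi, 1 - p1_lo

print(f"logit = {logit:.5f}, se = {se_logit:.4f}, CI ({logit_lo:.4f}, {logit_hi:.4f})")
print(f"P(STA = 1) = {p1:.4f}, CI ({p1_lo:.4f}, {p1_hi:.4f})")
print(f"P(STA = 0) = {p0:.4f}, CI ({p0_lo:.4f}, {p0_hi:.4f})")
```

Building the interval on the logit scale and then transforming it keeps both endpoints between 0 and 1, and the interval always contains the point estimate.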

iv

The estimated probability that vital status at hospital discharge is "lived", given an age of 60, is 0.8031. Based on the rounded variance-covariance matrix above, we are 95% confident that the population value lies between about 0.74 and 0.86.

CODES

//question 1
gen X_TD1 = x * td1
logit y x td1 X_TD1

//question 2
//2.2
label define st 1 "lived" 0 "died"
label values sta st
twoway (scatter sta age), ytitle(STA) xtitle(AGE) title(scatterplot of STA against AGE)

//2.3(i)
// build ten-year age categories, then summarize sta within each category
gen agecat = .
replace agecat = 1 if age >= 15 & age <= 24
replace agecat = 2 if age >= 25 & age <= 34
replace agecat = 3 if age >= 35 & age <= 44
replace agecat = 4 if age >= 45 & age <= 54
replace agecat = 5 if age >= 55 & age <= 64
replace agecat = 6 if age >= 65 & age <= 74
replace agecat = 7 if age >= 75 & age <= 84
replace agecat = 8 if age >= 85 & age <= 94
label define agect 1 "15-24 years" 2 "25-34 years" 3 "35-44 years" 4 "45-54 years" 5 "55-64 years" 6 "65-74 years" 7 "75-84 years" 8 "85-94 years"
label values agecat agect
sort agecat
by agecat: summarize sta

//2.3(ii)
// category means of sta and the midpoint of each age category
input meansta midage
.0769231 19.5
0 29.5
.1818182 39.5
0.2 49.5
.2051282 59.5
0.18 69.5
0.3 79.5
.4545455 89.5
end
twoway (scatter sta age) (scatter meansta midage), ytitle(STA) xtitle(AGE)

//2.5(i)
logit sta age
estimates store m

//2.5(iii)
// fitp = 1/(1 + exp(xb)) is the fitted probability that sta = 0
gen expxb = exp(-3.05851 + 0.027543 * age)
gen fitp = 1 / (1 + expxb)
twoway (scatter fitp age), ytitle(Predicted probabilities) xtitle(AGE) xlabel(20 (20) 100) ylabel(0 (0.2) 1)

//2.6(i)
// intercept-only model for the likelihood ratio test against model m
logit sta
estimates store m2
lrtest m m2
// Wald test of the age coefficient
logit sta age
test age

//2.8(i)
logit sta age
matrix list e(V)