NUR 627: Advanced Epidemiology and Biostatistics for Nursing Papers
This test includes two sections.
· The FIRST SECTION includes 28 questions
· The SECOND SECTION includes 5 questions on analyzing and interpreting data and on summarizing the findings and their impact on nursing practice. Download and use the file “sample_data_final_exam_SPRING2019.sav”.
The total score is 50 points (25% of the grade).
SECTION 1: Below is the list of QUESTIONS.
Use the space assigned after each question for your answer. All questions must be answered. For full credit, show your calculations and write your justifications.
1. In 1945, there were 1,000 women who worked in a factory painting radium dials on watches. The incidence of bone cancer in these women up to 1975 was compared to that of 1,000 women who worked as telephone operators in 1945. Twenty of the radium dial painters and four of the telephone operators developed bone cancer between 1945 and 1975. This is an example of: [ONE POINT]
a. Cohort study
b. Experimental study
c. Clinical trial
d. Cross-sectional study
e. Case-control study
This is a cohort study: two groups defined by exposure (radium dial painters versus telephone operators) are followed over time, and the incidence of bone cancer is compared between them.
2. Correlation coefficient implies causation [ONE POINT]
a. True
b. False
This statement is false. Causation means that one variable produces a change in another, whereas a correlation coefficient only measures the strength and direction of a linear association. Correlation alone cannot be used to prove causation.
3. Select the correct statement: [ONE POINT]
a. The attributable risk is the excess risk of disease in the exposed compared to the nonexposed during a defined period of time
b. The attributable risk is a ratio of the disease risk in the exposed compared to the nonexposed during a defined period of time
c. The attributable risk is a ratio of the disease risk in the nonexposed compared to the exposed during a defined period of time
d. The attributable risk is the prevalence of disease in the exposed minus the prevalence of disease in the nonexposed
e. The attributable risk is the disease risk in a defined group at a specific point in time
Statement a is correct. Attributable risk is the difference in disease risk (incidence) between the exposed and the nonexposed over a defined period, that is, the excess risk attributable to the exposure. It is a difference, not a ratio, and it is based on incidence and not on prevalence.
4. Investigators feel it is important to reduce the probability of a Type I error so they set alpha at 0.01. The resulting p-value from their study is 0.02. Select the correct answer: [ONE POINT]
a. The study result might be a Type II error
b. The study result is statistically significant
c. The study result is not important to patient care
d. The study’s power is 99%
e. Under the assumption that the null hypothesis is true, the probability of getting a result as large or larger than 0.01 is 2%
Because the p-value (0.02) is greater than alpha (0.01), the investigators fail to reject the null hypothesis and the result is not statistically significant. Failing to reject the null leaves open the possibility of a Type II error (a true effect that was missed), so the answer is a. The p-value is the probability, assuming the null hypothesis is true, of a result at least as extreme as the one observed. It is not the probability that the null hypothesis is true, and it does not measure effect size, replicability, or clinical importance.
5. The following refers to questions A and B. Researchers determine in a case-control study that 30 of 100 patients with bladder cancer smoke cigarettes while 50 of 600 patients without bladder cancer smoke cigarettes.
A. Calculate the appropriate measure of association for developing bladder cancer in smokers versus nonsmokers. [ONE POINT]
Odds ratio = (30 ÷ (100 − 30)) ÷ (50 ÷ (600 − 50)) = (30 ÷ 70) ÷ (50 ÷ 550) = 4.71
B. Interpret the measure of association you calculated in question “A”. [ONE POINT]
Because this is a case-control study, the appropriate measure of association is the odds ratio. An odds ratio of 4.71 means the odds of bladder cancer among smokers are about 4.71 times the odds among nonsmokers. A value above 1 indicates a positive association between smoking and bladder cancer.
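The odds ratio arithmetic for question 5 can be checked with a short script. This is a minimal sketch; the function name `odds_ratio` and its argument names are illustrative and not part of the exam.

```python
def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Odds ratio for a case-control 2x2 table (exposure odds in cases vs. controls)."""
    unexposed_cases = total_cases - exposed_cases
    unexposed_controls = total_controls - exposed_controls
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# 30 of 100 bladder cancer patients smoke; 50 of 600 patients without it smoke.
bladder_or = odds_ratio(30, 100, 50, 600)
print(round(bladder_or, 2))  # 4.71
```

The same value comes from the cross-product form (30 × 550) ÷ (70 × 50).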
6. Tell whether the following statement is true or false: [TWO POINTS]
A. Reliability is the degree to which an instrument measures what it is supposed to measure.
This is false. Reliability is the degree to which an instrument produces consistent results on repeated use, whereas validity is the degree to which an instrument measures what it is supposed to measure.
B. Internal consistency reliability is the extent to which the different items of the scale are not reliably and consistently measuring attribute.
This is false. Internal consistency reliability is the extent to which the different items of a scale consistently measure the same attribute.
7. Researchers recorded data on diastolic blood pressure and height on 70 subjects with a mean age of 70 and a median age of 70. These 70 subjects had a mean diastolic blood pressure of 70 mm Hg and a mean height of 70 inches. The researchers considered height as the independent variable and diastolic blood pressure as the dependent variable when they performed simple linear regression. They reported a correlation of (r) = 0.84.
a. Interpret the correlation (r) of 0.84? [ONE POINT]
A correlation of 0.84 indicates a strong, positive linear relationship between height and diastolic blood pressure.
b. Calculate and interpret the R Squared? [ONE POINT]
R squared is 0.84 × 0.84 = 0.7056, or about 70%. It represents the proportion of the variation in diastolic blood pressure that is explained by the variation in height.
c. Is the R square clinically significant? Explain. [ONE POINT]
The R squared value is well above .20 (20%), so height accounts for a large share of the variation in diastolic blood pressure. This suggests the relationship is clinically meaningful and not only statistically significant.
8. Researchers randomly select a sample of male and female to determine if the mean cholesterol level in male is different from the mean cholesterol levels in female. Assume the data are normally distributed. Which statistical test should the researchers use? [ONE POINT]
The appropriate statistical test is the two-sample (independent samples) t-test, which compares the means of two unrelated groups.
9. If on a group of 457 patients, for a risk factor we calculated a Relative Risk RR= 7.74, the possibility of developing the disease being investigated is: [ONE POINT]
a. very high when exposed to the factor
b. very small when exposed to the factor
c. lower in the exposed than in the unexposed, RR being less than 100
An RR of 7.74 is high: individuals exposed to the factor are 7.74 times as likely to develop the disease as unexposed individuals.
10. For a clinical trial, the Sensitivity is 0.53 and Specificity is 0.88. This means that: [ONE POINT]
a. The test is a valuable test because both indicators are more than 50%
b. The test is worthless, since it gives errors when detecting both sick and healthy subjects
c. The test is a worthless test, because the sensitivity is too low (lower than 75%)
d. a perfect test
With a sensitivity of 0.53, the test misses about 47% of the people who have the disease (false negatives). A sensitivity below 75% is too low for the test to be useful.
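The error rates implied by the sensitivity and specificity in question 10 follow directly from their definitions. A minimal sketch, using only the two values given in the question:

```python
sensitivity = 0.53  # proportion of diseased subjects the test detects
specificity = 0.88  # proportion of healthy subjects the test clears

# Complements give the two kinds of error.
false_negative_rate = 1 - sensitivity  # diseased subjects the test misses
false_positive_rate = 1 - specificity  # healthy subjects flagged as diseased

print(f"False negative rate: {false_negative_rate:.2f}")
print(f"False positive rate: {false_positive_rate:.2f}")
```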
11. A regression line is a straight line which: (select all applicable answers) [ONE POINT]
a. is located as close as possible to all the points of a scatter chart
b. is defined by an equation having 2 parameters: the slope and the intercept
c. provides an approximate relationship between the values of two parameters
d. is parallel to one of the coordinate axes
Options a, b, and c describe a regression line. In statistical modeling, regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables.
12. Ŷ represents the predicted value of y (systolic blood pressure [SBP]) calculated using the equation Ŷ = a + bx. In the formula, SBP = 71 + 2.0x, where x = value of age (years) for an adult person.
a. What is the value of the intercept (a)? [ONE POINT]
This value is the y intercept and is 71.
b. Interpret the value of the intercept [ONE POINT]
The intercept is the point where the regression line crosses the y-axis: the predicted SBP is 71 when age (x) is 0.
c. What is the value of the slope (b)? (ONE POINT)
The value of the slope is a positive 2.0.
d. Interpret the value of the slope? [ONE POINT]
The slope is positive: for each 1-year increase in age, the predicted SBP increases by 2.0.
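The intercept and slope in question 12 can be illustrated by evaluating the equation at a few ages. This is a minimal sketch; the function name `predict_sbp` and the example ages are illustrative assumptions.

```python
def predict_sbp(age_years, intercept=71.0, slope=2.0):
    """Predicted systolic blood pressure from SBP = 71 + 2.0 * age."""
    return intercept + slope * age_years


print(predict_sbp(0))   # intercept: predicted SBP at age 0
print(predict_sbp(40))  # predicted SBP for a 40-year-old
# Slope: the change in predicted SBP for one additional year of age.
print(predict_sbp(41) - predict_sbp(40))
```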
13. Which of the following statements about R² is not true? (ONE POINT)
a. It is sometimes called the coefficient of determination.
b. Its value indicates percentage of variation of Y explained by all predictors as a set.
c. It is a measure of magnitude but not direction of relationships.
d. Its values can range from −1.00 to +1.00.
Statement d is not true: R squared values range from 0 to 1 and are often expressed as percentages.
14. The chi-square test is used to test the null hypothesis that: (ONE POINT)
a. The medians of groups being compared are equal
b. Two categorical variables are independent (not related)
c. The expected cell sizes are zero
d. The odds ratio is zero
The null hypothesis of the chi-square test is that the two categorical variables are independent (unrelated).
15. What are the uses of factor analysis? (ONE POINT)
Based on the data on table 1 below:
A. How many factors will be extracted? Provide justifications. [ONE POINT]
Three factors will be extracted. Under the eigenvalue-greater-than-1 rule, only factors with eigenvalues above 1 are retained, which here are the first three factors (2.00, 2.12, and 1.10).
B. What percent of the variance the extracted factor(s) will explain? Provide justifications. [ONE POINT]
To find the percent of variance explained by the extracted factors, add each factor's percent of variance: 35.8 (factor 1) + 32.3 (factor 2) + 16.3 (factor 3) = 84.4%.
Table 1. Factor analysis
FACTORS  Eigen Value  Percent of Variance 
1  2.00  35.8 
2  2.12  32.3 
3  1.10  16.3 
4  .43  7.2 
5  .38  6.4 
6  .12  2.0 
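The extraction rule used for Table 1 (retain factors with eigenvalue above 1, then sum their percent of variance) can be sketched as follows. The tuple layout and variable names are illustrative; the numbers are copied from Table 1.

```python
# (factor number, eigenvalue, percent of variance) from Table 1
factors = [
    (1, 2.00, 35.8),
    (2, 2.12, 32.3),
    (3, 1.10, 16.3),
    (4, 0.43, 7.2),
    (5, 0.38, 6.4),
    (6, 0.12, 2.0),
]

# Keep only factors whose eigenvalue exceeds 1.
retained = [f for f in factors if f[1] > 1.0]
explained = sum(percent for _, _, percent in retained)

print([number for number, _, _ in retained])  # [1, 2, 3]
print(round(explained, 1))                    # 84.4
```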
16. Use the table below (Table 2. Component Matrix) and indicate the loading of the variables for factor 2. [ONE POINT]
The loadings for factor 2 are the values in the FACTOR2 column. The variables that load strongly on factor 2 (absolute loading of about .6 or higher) are Q1 (.800), Q4 (.651), Q5 (.634), and Q7 (−.651).
Table 2. Component Matrix
FACTOR1 FACTOR2
Q1 0.260 0.800
Q2 0.619 0.387
Q3 0.760 0.091
Q4 0.396 0.651
Q5 0.030 0.634
Q6 0.805 0.064
Q7 0.399 -0.651
Q8 0.794 0.167
Q9 0.728 0.097
Q10 0.770 0.183
Extraction Method: Principal Component Analysis.
17. An instrument with 12 questions [i.e., a scale of 12 variables] was evaluated for internal consistency (reliability). The following is the result:
Cronbach’s Alpha N of Items
0.623 12
Is the scale internally consistent? Provide rationale. [ONE POINT]
The scale is not adequately internally consistent: a Cronbach’s Alpha of 0.623 falls below the .80 cutoff used here, and also below the commonly cited .70 minimum.
Question 18 refers to the following information:
A study was performed on 200 elementary school students to investigate whether regular Vitamin A supplementation was effective in preventing colds during the month of March. 100 were randomized to receive daily Vitamin A supplements during the month of March, and 100 students were randomized to a placebo group (and did not receive Vitamin A) during the same month. The number of students getting at least one cold in March was computed in the two groups.
18. If you were interested in testing for a statistical relationship between taking Vitamin A (yes/no) and cold status (yes/no) in the population of elementary school students, the correct statistical test is: [ONE POINT]
a. Two-sample t-test
b. Ad hoc test for testing the equality of things
c. Chi-squared test
The chi-square test for independence, also called Pearson’s chi-square test or the chi-square test of association, is used to discover if there is a relationship between two categorical variables.
19. A study was done to investigate whether there is a relationship between survival of patients with coronary heart disease and pet ownership. A representative sample of 92 patients with CHD was taken. Each of these patients was classified as having a pet or not and by whether they survived one year following their first heart attack. Of 53 pet owners, 50 survived. Of 39 non-pet owners, 28 survived.
– What is the design of this study? [ONE POINT]
This is a nonexperimental (observational) study: patients were classified by pet ownership, not assigned to it. Nonexperimental research lacks the manipulation of an independent variable, random assignment of participants to conditions or orders of conditions, or both, which are characteristics of experimental designs (O’Dwyer & Bernauer, 2013).
20. To evaluate the effectiveness of 2 different smoking cessation programs, smokers are randomized to receive either program A or program B. Of 6 smokers on program A, 1 stopped smoking and 5 did not. Of 6 smokers on program B, 4 stopped smoking and 2 did not. Which statistical procedure would you use to test the null hypothesis that the programs are equally effective and to obtain a p-value? [ONE POINT]
a. Chi Square Test
b. One way Analysis of Variance
c. 2 sample t test
d. Fisher’s exact test
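e. Mann-Whitney nonparametric test
Fisher’s exact test is used to determine whether there is a nonrandom association between two categorical variables. It is preferred over the chi-square test here because the sample is small (12 smokers), so the expected cell counts are all below 5.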
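To illustrate how Fisher's exact test produces a p-value for the 2×2 table in question 20, the sketch below enumerates every table with the same margins using the hypergeometric distribution. It is a hand-rolled illustration using only the standard library, not the procedure the course software uses; the function name `fisher_exact_two_sided` is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def weight(k):
        # Number of ways to get k successes in row 1 with the margins fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    # Integer weights avoid floating-point ties.
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / total_tables


# Program A: 1 stopped, 5 did not. Program B: 4 stopped, 2 did not.
p_value = fisher_exact_two_sided(1, 5, 4, 2)
print(round(p_value, 3))  # 0.242
```

With a two-sided p-value of about 0.24, the data would not lead to rejecting the null hypothesis that the programs are equally effective at the usual 0.05 level.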
21. The objective of a study is to understand the factors that are associated with systolic blood pressure in infants. Systolic blood pressure, weight (ounces) and age (days) are measured in 100 infants. A multiple linear regression is performed to predict blood pressure (mm Hg) from age and weight. The following results are presented in a journal article. (Question A refers to these results)
Multiple Linear Regression Analysis of the Predictors of Systolic Blood Pressure in Infants
b̂’s (coefficients)
Intercept 50
Birth Weight 0.10
Age (days) 4.0
A. How much higher would you expect the blood pressure to be of an infant who weighed 120 ounces compared to an infant who weighed 90 ounces if both infants were of exactly the same age of 10 days? [ONE POINT]
a. 0.1 mm Hg
b. 1.0 mm Hg
c. 2.0 mm Hg
d. 3.0 mm Hg
e. 4.0 mm Hg
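Y = 50 + (120 × 0.10) + (4.0 × 10) = 102 mm Hg
Y = 50 + (90 × 0.10) + (4.0 × 10) = 99 mm Hg
102 − 99 = 3.0 mm Hg (answer d). The age term is the same for both infants, so only the weight term contributes to the difference.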
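The comparison in question 21 can be reproduced from the reported coefficients. This is a minimal sketch; the function name `predict_infant_sbp` is illustrative, and the age term is included even though it cancels in the difference.

```python
def predict_infant_sbp(weight_oz, age_days,
                       intercept=50.0, b_weight=0.10, b_age=4.0):
    """Predicted systolic BP (mm Hg) from the reported regression coefficients."""
    return intercept + b_weight * weight_oz + b_age * age_days


heavier = predict_infant_sbp(weight_oz=120, age_days=10)
lighter = predict_infant_sbp(weight_oz=90, age_days=10)
difference = heavier - lighter

print(round(difference, 1))  # 3.0
```

Holding age fixed, the difference equals the weight coefficient times the 30-ounce gap: 0.10 × 30 = 3.0 mm Hg.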
22. A recent study of the relationship between Scholastic Aptitude Test (SAT) scores and U.S. state level characteristics found a statistically significant (p < .05) relationship between average SAT scores and the percent of high school seniors who actually took the SAT within a state. Linear regression was used to estimate this relationship, and the resulting regression equation was: y = 1024 + 2.3x where y represents average SAT score, and “x” represents the percentage of high school seniors taking the SAT. The coefficient of determination, R², is 0.76. What is the correlation coefficient, r? [ONE POINT]
a. The correlation coefficient is −0.76
b. The correlation coefficient is 0.76
c. The correlation coefficient is .87
d. The correlation coefficient is −.87
The correlation coefficient is the square root of R squared: √0.76 = .87. It takes the sign of the slope, which is positive (2.3), so r = .87, a strong, positive linear relationship.
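For simple linear regression, r can be recovered from R squared by taking the square root and attaching the sign of the slope. A minimal sketch of that step for question 22; the function name is illustrative.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """Pearson r for simple linear regression: sqrt(R^2) with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)


r = correlation_from_r_squared(0.76, slope=2.3)
print(round(r, 2))  # 0.87
```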
23. The relationship between forced expiratory volume (FEV), which is measured in liters, and age, which is measured in years, is evaluated in a random sample of 200 men between the ages of 20 and 60. A simple linear regression analysis is performed to predict FEV from age. The Intercept (constant) = 4.0 and the Regression coefficient (slope) for Age = 0.05.
a. Interpret the regression coefficient for age (0.05) in words. [ONE POINT]
The regression coefficient for age is 0.05: each 1-year increase in age is associated with a 0.05-liter increase in predicted FEV.
b. Interpret the intercept (constant) in words. [ONE POINT]
The intercept is the predicted FEV when age is 0, which is 4.0 liters. Age 0 lies outside the sampled range of 20 to 60 years, so the intercept is an extrapolation and not a meaningful prediction.
24. The nurse researcher is reading about multiple regression. What is multiple regression? [ONE POINT]
a. Make predictions about the values of one variable based on values of a second variable
b. Make predictions about the values of two variables based on values of a third and fourth variables
c. Method of predicting a continuous dependent variable on the basis of two or more independent variables
d. Method of predicting a continuous independent variable on the basis of two or more predictor variables
Multiple regression involves a single dependent variable and two or more independent variables. It is a statistical technique that simultaneously develops a mathematical relationship between two or more independent variables and an interval scaled dependent variable.
25. The director of a major hospital complex conducts a study to discover the types of critical incidents that have occurred in this hospital and its sister hospital over the past five years. She makes a list of every critical incident that has occurred over this period. Choose the true statements about this list. [ONE POINT]
a. The list is the dependent variable.
b. The list represents the hospital director’s assumptions.
c. The list is an extraneous variable.
d. The list represents the sample.
e. If the two hospitals have been in operation only five years, the list represents the population.
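Statements d and e are true. The list of incidents drawn from these two hospitals over five years is the sample. If the two hospitals have been in operation for only five years, the list contains every incident that has ever occurred there and therefore represents the population.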
26. A researcher identifies three variables and formulates a hypothesis that links them. That hypothesis is testable. What does it mean that the hypothesis is testable? [ONE POINT]
a. The value of the hypothesis is low.
b. The hypothesis must be replaced by a research question.
c. All the variables in the hypothesis are measurable.
d. The hypothesis is causational.
27. What is a multiple regression equation? (Select all that apply.) [ONE POINT]
a. One that represents the mathematical effect that several independent variables have on the dependent variable
b. One in which the x-values are multiplied by one another
c. One that explains more of the variance in y than does a single linear regression equation
d. An experimental model for determining best practices
e. One that uses more than one predictor variable to predict the value of the outcome variable
f. One that explains all of the variance in the dependent variable, in terms of several independent variables
Multiple regression involves a single dependent variable and two or more independent variables. It is a statistical technique that simultaneously develops a mathematical relationship between two or more independent variables and an interval scaled dependent variable.
28. Table 3 below shows the multiple logistic regression for factors associated with having three or more cardiovascular disease risk factors among a population of 500 participants who participated in a cross-sectional survey.
Which variable(s) is considered to be statistically significant? Interpret the odds ratio and the 95% confidence interval for the significant variable(s). [THREE POINTS]
Table 3. Multiple Logistic Regression for the predictors of having three or more Cardio Vascular Disease risk factors (N=486)
Variables  Odds ratio  95% confidence interval
Perceived health status
Excellent/Very Good  2.4  0.8–4.2
Good  Reference
Fair/Poor  3.1  1.8–6.1
Gender
Female  1.6  1.1–2.9
Male  Reference
Race/Ethnicity
African American  2.4  1.2–3.0
Hispanic  0.6  0.3–1.4
White  Reference
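One way to approach question 28 is to check which 95% confidence intervals exclude 1, since an odds ratio of 1 means no association. The sketch below assumes the flattened intervals in Table 3 read as 0.8 to 4.2, 1.8 to 6.1, 1.1 to 2.9, 1.2 to 3.0, and 0.3 to 1.4; the dictionary keys are illustrative labels.

```python
# (odds ratio, CI lower bound, CI upper bound); the interval boundaries
# are an assumed reading of the flattened Table 3.
odds_ratios = {
    "Excellent/Very Good health": (2.4, 0.8, 4.2),
    "Fair/Poor health": (3.1, 1.8, 6.1),
    "Female": (1.6, 1.1, 2.9),
    "African American": (2.4, 1.2, 3.0),
    "Hispanic": (0.6, 0.3, 1.4),
}


def is_significant(lower, upper):
    """An odds ratio is significant at the 5% level if its 95% CI excludes 1."""
    return lower > 1.0 or upper < 1.0


significant = [name for name, (_, lower, upper) in odds_ratios.items()
               if is_significant(lower, upper)]
print(significant)
```

Under that reading, Fair/Poor health (versus Good), Female (versus Male), and African American (versus White) are the significant categories, each with higher odds of having three or more risk factors than its reference group.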
USING THE DATA SET DOWNLOADED FROM BLACKBOARD “sample_data_final_exam_SPRING2019.sav”
We will use the data to answer the following two research questions:
A. What are the predictors of systolic blood pressure level in the study?
B. What are the factors associated with cardiovascular disease in the study?
Answer the following questions [total=10 POINTS]:
A. What are the predictors of SYSTOLIC BLOOD PRESSURE LEVEL in the study?
A1. Do descriptive statistics for the following variables and interpret the findings:
[TWO POINTS]
– Gender, OBESITY, PHYSICAL ACTIVITY, RACE/ETHNICITY, CVD
– SYSTOLIC_BP [descriptive and histogram]
A2. Is there a correlation between DIASTOLIC BLOOD PRESSURE and FGL, age, SYSTOLIC BLOOD PRESSURE? [TWO POINTS]
a. Report the correlation coefficient (r) [direction and strength and statistical significance] and interpret the results.
Correlations
| | | systolic blood pressure | diastolic blood pressure | Fasting glucose level | age in years |
| systolic blood pressure | Pearson Correlation | 1 | .782** | .295** | .778** |
| | Sig. (2-tailed) | | .000 | .000 | .000 |
| | N | 200 | 200 | 200 | 200 |
| diastolic blood pressure | Pearson Correlation | .782** | 1 | .331** | .713** |
| | Sig. (2-tailed) | .000 | | .000 | .000 |
| | N | 200 | 200 | 200 | 200 |
| Fasting glucose level | Pearson Correlation | .295** | .331** | 1 | .323** |
| | Sig. (2-tailed) | .000 | .000 | | .000 |
| | N | 200 | 200 | 200 | 200 |
| age in years | Pearson Correlation | .778** | .713** | .323** | 1 |
| | Sig. (2-tailed) | .000 | .000 | .000 | |
| | N | 200 | 200 | 200 | 200 |
**. Correlation is significant at the 0.01 level (2-tailed).
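The significance flags in the correlation table can be cross-checked by converting each r to a t statistic with n − 2 degrees of freedom. This is a sketch for the three correlations with diastolic blood pressure asked about in A2; the critical value of about 2.60 (two-tailed, alpha = 0.01, 198 degrees of freedom) is an assumed table lookup, not something taken from the SPSS output.

```python
import math


def t_statistic_for_r(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


N = 200
T_CRITICAL_01 = 2.60  # assumed two-tailed critical value, alpha = 0.01, df = 198

# Correlations of diastolic blood pressure with each variable, from the table.
correlations = {
    "systolic blood pressure": 0.782,
    "Fasting glucose level": 0.331,
    "age in years": 0.713,
}

t_values = {name: t_statistic_for_r(r, N) for name, r in correlations.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}, significant at 0.01: {t > T_CRITICAL_01}")
```

All three t statistics exceed the critical value, which agrees with the ** flags: diastolic blood pressure has a strong positive correlation with systolic blood pressure and age, and a weaker positive correlation with fasting glucose level.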
A3. Which variable(s) in the data (gender, race/ethnicity, obesity, physical activity, systolic blood pressure, and diastolic blood pressure) is a predictor of SYSTOLIC BLOOD PRESSURE? [TWO POINTS]
Coefficients (a)
| Model 1 | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. | 95.0% Confidence Interval for B: Lower Bound | Upper Bound |
| (Constant) | 72.763 | 4.871 | | 14.937 | .000 | 63.155 | 82.371 |
| Gender | .060 | .902 | .003 | .066 | .947 | -1.719 | 1.839 |
| race/ethnicity | 5.148 | .991 | .231 | 5.196 | .000 | 3.194 | 7.102 |
| obese | 4.419 | 1.394 | .193 | 3.169 | .002 | 1.669 | 7.170 |
| physically active | 1.400 | 1.337 | .062 | 1.047 | .296 | -1.237 | 4.037 |
| diastolic blood pressure | .734 | .054 | .635 | 13.698 | .000 | .629 | .840 |
a. Dependent Variable: systolic blood pressure
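The coefficient table can be read by rebuilding each 95% confidence interval as B ± t × Std. Error and checking whether it excludes zero. The sketch below copies B and Std. Error from the table; the critical value 1.972 (two-tailed, alpha = 0.05, 194 degrees of freedom for 200 cases and 5 predictors) is an assumed table lookup.

```python
# (B, Std. Error) for each predictor, copied from the Coefficients table.
coefficients = {
    "Gender": (0.060, 0.902),
    "race/ethnicity": (5.148, 0.991),
    "obese": (4.419, 1.394),
    "physically active": (1.400, 1.337),
    "diastolic blood pressure": (0.734, 0.054),
}
T_CRITICAL = 1.972  # assumed two-tailed 5% critical value, df = 194


def confidence_interval(b, se, t_crit=T_CRITICAL):
    """95% confidence interval for an unstandardized coefficient."""
    margin = t_crit * se
    return b - margin, b + margin


significant = []
for name, (b, se) in coefficients.items():
    lower, upper = confidence_interval(b, se)
    if lower > 0 or upper < 0:  # interval excludes zero
        significant.append(name)
    print(f"{name}: B = {b:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")

print(significant)
```

The recomputed intervals for Gender and physically active have negative lower bounds and so include zero, consistent with their Sig. values of .947 and .296. Race/ethnicity, obesity, and diastolic blood pressure are the significant predictors in this model.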
What statistical technique will you use? Explain and interpret the findings.
B. What are the factors associated with cardiovascular disease in the study?
B1. Which variable(s) in the data (gender, race/ethnicity, obesity, physical activity, FGL, systolic blood pressure, and diastolic blood pressure) is associated with CVD? [TWO POINTS]
Coefficients (a)
| Model 1 | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. | 95.0% Confidence Interval for B: Lower Bound | Upper Bound |
| (Constant) | -.926 | .384 | | -2.408 | .017 | -1.684 | -.167 |
| Gender | -.026 | .044 | -.026 | -.589 | .556 | -.112 | .061 |
| race/ethnicity | .025 | .052 | .026 | .478 | .633 | -.078 | .128 |
| obese | .406 | .069 | .408 | 5.851 | .000 | .269 | .542 |
| physically active | -.255 | .065 | -.258 | -3.917 | .000 | -.383 | -.126 |
| Fasting glucose level | .000 | .002 | -.013 | -.264 | .792 | -.004 | .003 |
| systolic blood pressure | .004 | .003 | .087 | 1.085 | .279 | -.003 | .011 |
| diastolic blood pressure | .009 | .004 | .181 | 2.461 | .015 | .002 | .016 |
a. Dependent Variable: cardiovascular disease
What statistical technique will you use? Explain and interpret the findings.
C. Write a ONE PAGE summary report for the results of the study and its impact on nursing practice (i.e., summarize the findings from question A1 to B3) [TWO POINTS]