SOSC Stats 2006 Cumulative Take Home Final 1-9 F2006

Name____________________________

SOSC 2225 Statistics for the Social Sciences Cumulative Final Ch 1-9 F2006

1. In social research the purpose of statistics is to

a. prove that the research theory is correct

b. validate the research project design

c. manipulate and analyze data

d. ensure acceptance by the scientific community

2. In the research process, theory

a. is unnecessary

b. is always fully developed

c. is developed only after the data have been completely analyzed

d. attempts to explain the relationship between phenomena

3. In the language of science, a variable that is thought to be causal is called

a. an independent variable

b. a hypothetical variable

c. a primary variable

d. a dependent variable

4. "Ninety percent of dorm residents approved a proposed ban on smoking." This statement is an example of the use of

a. inferential statistics

b. univariate descriptive statistics

c. multivariate descriptive statistics

d. inductive statistics

5. A public opinion poll that gauges the popularity of the President of the United States is an example of

a. descriptive statistics

b. inferential statistics

c. analytical statistics

d. reductionist statistics

6. Which of the following is a discrete variable?

a. height

b. age

c. miles per gallon

d. number of children

7. Which of the following questions would generate a continuous variable?

a. How old are you?

b. How many books do you own?

c. How many times have you ever changed a flat tire?

d. How many degrees do you have?

8. Categories of nominal level variables should be

a. mutually exclusive to avoid ambiguity in classifying cases

b. exhaustive so that every case fits into a category

c. relevant to the research goals

d. all the above

9. Choose the nominal level variable below:

a. size of family unit

b. eye color of students in a statistics class

c. speed of travel by jet

d. your weight

10. In addition to saying that one case is different from another, the ordinal level of measurement allows us to

a. order categories from high to low

b. measure the distance between high and low

c. say that one case is more or less than another

d. both a and c

e. all the above

11. Prejudice, when measured on a scale ranging from "most prejudiced" to "least prejudiced," is an example of which level of measurement?

a. actual

b. ordinal

c. nominal

d. interval-ratio

12. The number of years that a couple has been happily married is an example of

a. nominal level data

b. ordinal level data

c. interval-ratio level data

d. ordinary level data

13. Addition and subtraction are completely justified only when the variables are

a. discrete

b. continuous

c. ordinal

d. interval-ratio

For the items below, indicate the level of measurement.

Use (a) for nominal, (b) for ordinal, (c) for interval-ratio.

14. __________________ age of patients in a mental health unit

15. __________________ classification of pain: very bad, bad, slightly bad, none

16. __________________ different types of medical personnel

17. __________________ the number of errors a rat makes while running a maze

18. The purpose of univariate descriptive statistics is to

a. summarize relationships between many variables

b. display the essential meaning of variables measured at the interval-ratio level

c. combine nominal and discrete variables

d. summarize a single variable

19. To calculate a proportion, the number of cases in any category (f) is divided by

a. the total number of categories (k)

b. the number of cases in all categories (N)

c. the cases in that category (f)

d. the number of cases in adjacent categories (k-1)

20. Forty of every 200 students attend all their classes. What percentage of the student body is this?

a. 5%

b. 50%

c. 2%

d. 20%

Identify which type of sampling is used:

(a) simple random, (b) stratified, (c) systematic, (d) cluster, or (e) convenience.

21. A reporter for the AJC newspaper interviews the first 20 doctors entering the hospital cafeteria.

22. A tobacco lobbyist writes the name of each U.S. Senator on a separate card, shuffles the cards, and then draws 10 names.

23. Planned Parenthood polls 500 men and 500 women about their views concerning the use of contraceptives.

24. A medical researcher from Johns Hopkins University interviews all leukemia patients in each of 20 randomly selected hospitals.

25. A psychiatric researcher from Emory University interviews every 10th mentally ill patient (names received from the patient directory).

26. A line chart or frequency polygon is based on

a. the upper limits of each interval

b. the lower limits of each interval

c. the midpoints of each interval

d. any limit the researcher selects

For questions 27-31, answer with (a) mode, (b) median, (c) mean, or (d) all the above.

27. The _______ measures central tendency in terms of the most common.

28. The __________ measures central tendency in terms of the average score.

29. The _________ measures central tendency in terms of the middle score.

30. The ________ is the measure of central tendency that is affected by every score in the distribution.

31. When data are badly skewed, the most appropriate measure of central tendency is the ______________.

32. If scores on a variable are 11, 14, 18, 19, 20, and 25, the median is

a. 3

b. 18

c. 18.5

d. 19

e. all the above

33. The fourth decile would be at the same location as

a. the second quartile

b. the fourth percentile

c. the fortieth percentile

d. the fourth quartile

34. The purpose of measures of central tendency is to describe what value of a distribution of scores?

a. the most typical or representative

b. the most surprising or unexpected

c. the most significant or important

d. all the above

35. Measures of dispersion provide an indication of the

a. typical or most common score

b. variety within the distribution of scores

c. size of the sample

d. adequacy of the selection criteria for the sample

36. Which of the following data sets shows the greatest variability?

a. 100, 101, 102

b. 0, 6, 10

c. 60, 70, 180

d. 2, 4, 6

37. The index of qualitative variation (IQV) varies from 0.00 to 1.00. Which of the IQVs below shows the greatest degree of homogeneity?

a. 0.25

b. 0.50

c. 0.75

d. 1.00

38. One problem with the range (R) as a measure of dispersion is that it

a. is very difficult to calculate

b. ignores the most extreme scores

c. can be used only for nominal level data

d. is based on only the most extreme scores

39. Your score on the test is the same as the third quartile (Q3). You may conclude that

a. you scored higher than 75% of the people who took the test

b. the distribution of scores is skewed

c. your score is typical since it is the same value as the median

d. you scored higher than 25% of the people who took the test

40. As the distribution of scores becomes more variable, the value of the standard deviation

a. decreases

b. stays the same

c. increases

d. becomes unpredictable

41. A defining characteristic of the normal curve is that it is

a. theoretical

b. positively skewed

c. negatively skewed

d. all the above

42. The tails of the normal curve

a. intersect with the horizontal axis beyond the 3rd standard deviation

b. intersect with the y axis at 0

c. never touch the horizontal axis

d. none of the above

43. On all normal curves the area between the mean and +/- 2 standard deviations is

a. about 34% of the area

b. about 95% of the area

c. less than 50% of the area

d. about 68% of the area

44. If a Z score is 0, then the value of the corresponding raw score would be

a. 0

b. the same as the mean of the empirical distribution

c. the same as the standard deviation of the empirical distribution

d. none of the above

45. The probability of getting a king out of a deck of 52 cards is

a. 1/4

b. 1/13

c. 1/52

d. 1/6

46. Social scientists gather data from samples instead of populations because

a. samples are much larger and more complete

b. samples are more trustworthy

c. populations are often too large to test

d. samples are more meaningful and interesting

e. all the above

47. Compared to probability samples, non-probability samples

a. are usually cheaper to assemble

b. are always much larger

c. are usually more expensive to assemble

d. allow for generalizations to populations

48. According to the theorems in chapter 6, we can be sure that the sampling distribution is normal if

a. the sample is large

b. the sample is stratified

c. the population is small

d. the sample is normal

49. Between 70% and 80% of the people who do the family grocery shopping are women. This is

a. not a finding which can be generalized

b. a point estimate

c. an interval estimate

d. an example of sexism

50. From a random sample of 300 state university students, you found that the average number of hours of study time each week is 30 with a standard deviation of 5. A point estimate of the average study time for all state university students would be

a. 5

b. 30

c. 300

d. 15 +/- 1 standard deviation

51. An estimator is unbiased if the mean of its sampling distribution is equal to

a. the midpoint of the distribution

b. the sample mean

c. the population value

d. all the above

52. The efficiency of any estimator can be improved by

a. increasing the sample size

b. decreasing the sample size

c. making the sample representative

d. changing the sample

53. The probability that an interval estimate does not include the population value is called

a. the margin

b. alpha

c. an error

d. the odds

54. To decrease the probability that a confidence interval will NOT include the population parameter

a. lower the alpha level

b. raise the alpha level

c. lower the beta level

d. set efficiency to 0

55. In the formula for finding a confidence interval when the value of the population standard deviation is unknown, we change N to N-1. The reason for this change is

a. to correct for the fact that s is biased

b. the standard deviation of a sample is always greater than the standard deviation of the population

c. the standard deviation of a sample is unbiased

d. sample size is much too large

56. If a researcher changes from the 90% confidence interval to the 95% level, the confidence interval will

a. widen

b. decrease in width

c. not be affected

d. widen only if N is greater than 100

57. The width of an interval estimate can be controlled by

a. changing the confidence level

b. changing the alpha level

c. changing the sample size

d. any of the above

58. The central problem in the case of a one-sample hypothesis test is to determine

a. if a sample is random

b. if sample statistics are the same as those of the sampling distribution

c. if parameters are representative of the population

d. if a sample came from a population with a certain characteristic

59. Which assumption must be true in order to justify the use of hypothesis testing?

a. random sampling

b. very large samples

c. interval-ratio level of measurement

d. samples have been stratified

60. The null hypothesis in the one sample case is a statement of

a. agreement with the research hypothesis

b. rejection

c. acceptance

d. no difference

61. The research hypothesis (H1) typically states what the researcher expects to find and

a. contradicts the null hypothesis

b. verifies the null hypothesis

c. modifies the null hypothesis

d. all the above

62. If we reject a null hypothesis at the 0.05 level

a. the odds are 20 to 1 in our favor that we have made a correct decision

b. the null hypothesis is true

c. the odds are 5 to 1 in our favor that we have made a correct decision

d. the research hypothesis is true

63. The critical region is

a. the area under the curve between +/- 2 standard deviations

b. the area under the curve that includes those values of a sample statistic that will lead to rejection of the null

c. the area under the curve between +/- 3 standard deviations

d. all the above

64. If the critical region begins at +/-2.56 and the test statistic is -2.5, we

a. fail to reject the null hypothesis

b. reject the null hypothesis

c. cannot make a decision because the test statistic is so close to the critical region

d. change the alpha level

65. A researcher is interested in the effect that neighborhood crime-watch efforts have on the crime rate in the inner city, but s/he is unwilling to predict the direction of the difference. The appropriate test is

a. one-tailed

b. two-tailed

c. descriptive

d. symmetrical

66. Do sex education classes and free clinics that offer counseling for teenagers reduce the number of pregnancies among teenagers? The appropriate test of hypothesis would be

a. one-tailed test

b. two-tailed test

c. cross-sectional

d. participant observation

67. If we reject a null hypothesis which is in fact true, we

a. have made a correct decision

b. have made a Type I error

c. have made a Type II error

d. should have used a one-tailed test

68. The probability of a Type I error is

a. beta

b. 0.01

c. alpha level

d. 0.05

69. As the critical region decreases in size

a. the probability of Type I error increases

b. the probability of rejecting the null hypothesis increases

c. alpha increases

d. the probability of Type II error increases

70. The t distribution, compared to the Z distribution, is

a. more skewed

b. more peaked for small samples but increasingly like the Z distribution as N increases

c. bimodal

d. flatter for small samples but increasingly like the Z distribution as N increases

71. In a t test of differences between means, increasing sample size will affect

a. degrees of freedom

b. the standard deviation of the sampling distribution

c. t score

d. all of the above

72. The central problem in the case of a two-sample hypothesis test is to determine

a. if the samples are random

b. if sample statistics are the same as those of the sampling distribution

c. if the parameters are representative of the populations

d. if two populations differ significantly on the trait in question

73. When testing for the significance of the difference between two samples, which is the proper assumption for step 1?

a. random sampling

b. ordinal level of measurement

c. degrees of freedom are zero

d. samples are independent as well as random

74. When testing for the significance of the difference between two samples, the null hypothesis reminds us that our interest is on differences between the

a. samples

b. populations

c. sampling distributions

d. standard deviations

75. When conducting hypothesis tests for two sample means, the test statistic is

a. alpha

b. the difference in the sample means

c. the degrees of freedom

d. the difference in the population means

Problems

1. The Newport Chronicle claims that pregnant mothers can increase their chances of having healthy babies by eating lobsters. That claim is based on a study showing that babies born to lobster-eating mothers have fewer health problems than babies born to mothers who don’t eat lobsters. What is wrong with this claim?

2. The test scores of the following students are summarized in the frequency table below.

Score Students

90-99 6

80-89 9

70-79 14

60-69 6

50-59 5

a)___________ lower apparent limit of class 70-79

b) __________ lower real limit of class 70-79

c)___________ upper apparent limit of class 70-79

d)___________ upper real limit of class 70-79

e)___________ class interval size

f)____________midpoint of class 50-59

g) ___________N =

h) draw a line chart

i) draw a pie chart

3. The number of students enrolled in social science classes for a given semester is recorded.

9 15 17 19 20 20

Find:

a)_____________N

b)_____________ΣX

c)_____________mean

d)_____________median

e)_____________Q1

f)_____________Q3

g)_____________Interquartile range

h)_____________mode

4. At a local preschool, children were observed for one week and the number of aggressive acts committed was recorded in a grouped frequency distribution. Compute the mean, the variance, and the standard deviation.

Apparent Limits Frequency

0-2 1

3-5 2

6-8 4

9-11 3

12-14 1
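For reference, the grouped-data arithmetic can be sketched in Python as follows. This is only an illustration: the table in the code is hypothetical (not the data above), the function name `grouped_stats` is made up for this sketch, and it assumes that every case is treated as falling at its class midpoint and that the variance divides by N. Some texts divide by N - 1 instead.

```python
import math

def grouped_stats(classes):
    """classes: list of (lower_limit, upper_limit, frequency), apparent limits."""
    n = sum(f for _, _, f in classes)
    # Treat every case as if it fell at the midpoint of its class.
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
    # Assumption: divide by N (descriptive formula); some texts use N - 1.
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / n
    return mean, variance, math.sqrt(variance)

# Hypothetical frequency table, NOT the exam data.
example = [(10, 14, 2), (15, 19, 5), (20, 24, 3)]
mean, variance, sd = grouped_stats(example)
print(mean, variance, sd)
```

For the hypothetical table the midpoints are 12, 17, and 22, so the mean is 175 / 10 = 17.5.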

5. A sample of university students has an average GPA of 2.78 with a standard deviation of 0.45. If GPA is normally distributed, what percentage of the students has GPAs

Z score Proportion Percentage

a. less than 2.10? _______ _________ _________

b. less than 2.80? _______ _________ _________

c. more than 2.10? _______ _________ _________

d. more than 3.00? _______ _________ _________

e. between 2.50 and 3.50? _______ _________ _________

f. between 2.00 and 2.50? _______ _________ _________

g. Find the z score and raw score at the 30th percentile

z=____________ x= ____________

h. Find the z score and raw score at the 90th percentile

z=____________ x =_____________
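As a cross-check on table lookups, the conversion between raw scores, Z scores, and areas under the normal curve can be sketched with Python's standard-library `statistics.NormalDist`. The mean, standard deviation, and scores below are hypothetical values chosen for illustration, not the GPA figures above, and the helper names `z_score` and `raw_score` are made up for this sketch.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Convert a raw score to a Z score."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Convert a Z score back to a raw score."""
    return mean + z * sd

std_normal = NormalDist()  # standard normal curve: mean 0, sd 1

# Hypothetical distribution, NOT the exam data.
mean, sd = 100, 15

z = z_score(115, mean, sd)        # one standard deviation above the mean
below = std_normal.cdf(z)         # proportion of cases below that score
above = 1 - below                 # proportion of cases above that score

z_75 = std_normal.inv_cdf(0.75)   # Z score at the 75th percentile
x_75 = raw_score(z_75, mean, sd)  # raw score at the 75th percentile

print(round(z, 2), round(below, 4), round(above, 4))
print(round(z_75, 2), round(x_75, 1))
```

Values from a printed Z table may differ from these in the last decimal place because of rounding.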

6. For each situation, find Z critical

Alpha Form Z critical

.05 One-tailed

.10 Two-tailed

.06 Two-tailed

.01 One-tailed

.02 Two-tailed

7. For each situation, find t critical

Alpha Form n t critical

.05 two-tailed 121

.01 one-tailed 21

.10 two-tailed 30

.01 two-tailed 5

8. For each situation below, compute the test statistic

a. μ = 2.39, X bar = 2.23, σ = 0.75, n = 200

b. μ = 17.1, X bar = 17.8, s = 0.92, n = 105

c. μ = 10.2, X bar = 9.4, s = 1.7, n = 40

d. μ = 142, X bar = 145, σ = 10, n = 35
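The general form of a one-sample test statistic can be sketched in Python as below. The values are hypothetical (not taken from the items above), and the function name `one_sample_statistic` and its `sd_is_sample` flag are made up for this sketch. It rests on one labeled assumption: when only the sample standard deviation s is available, the standard error is taken as s divided by the square root of n - 1, a convention some texts use. Other texts compute s with n - 1 and then divide by the square root of n, so check the formula used in the course.

```python
import math

def one_sample_statistic(sample_mean, pop_mean, sd, n, sd_is_sample=False):
    """Return (sample mean - population mean) / standard error.

    sd_is_sample=False: sd is the population standard deviation (sigma),
        so the standard error is sd / sqrt(n).
    sd_is_sample=True: sd is the sample standard deviation (s).
        Assumption: the standard error is then sd / sqrt(n - 1).
    """
    divisor = n - 1 if sd_is_sample else n
    standard_error = sd / math.sqrt(divisor)
    return (sample_mean - pop_mean) / standard_error

# Hypothetical values, NOT the exam data.
z_obtained = one_sample_statistic(52, 50, 10, 100)                      # sigma known
t_obtained = one_sample_statistic(52, 50, 10, 101, sd_is_sample=True)   # only s known
print(z_obtained, t_obtained)
```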

For the following questions, use the five-step model and write a sentence or two interpreting your results.

9. A sample of 117 workers in the Overkill Division of the Machismo Toy Factory earns an average of $24,600 per year. The average salary for all workers is $24,240 with a standard deviation of $521. Are workers in the Overkill Division overpaid?

10. A random sample of 39 local sociology graduates scored an average of 449 on the GRE advanced sociology test with a standard deviation of 21. Is this significantly different from the national average of 445?

11. The mean of a statistics test is 78 and the standard deviation of this test is 4. The following are the scores from a statistics class. How does the class compare to the population?

Test score

Student A 56

Student B 88

Student C 62

Student D 91

Student E 43

12. A sample of students attending a large university has been selected. Is there a statistically significant difference between Liberal Arts majors and other students on average number of books (other than those required by course work) read per year?

Liberal Arts          Other

X bar (1) = 16.2      X bar (2) = 13.7

s(1) = 2.3            s(2) = 9.0

N (1) = 236           N (2) = 321