Prior to reading any one of these, please take a moment to look in the book and the professor’s lectures to see if you can find an answer on your own. Seeking out the knowledge for yourself will help to fix it in your mind, and you are less likely to forget it come test day.
Unless otherwise noted, almost all of the answers come from the book Methods in Behavioral Research; no copyright infringement or plagiarism is intended.
Define reliability of a measure of behavior and describe the difference between test-retest, internal consistency, and interrater reliability
Discuss ways to establish construct validity, including face validity, content validity, predictive validity, concurrent validity, convergent validity, and discriminant validity.
Describe the problem of reactivity of a measure of behavior and discuss ways to minimize reactivity.
Describe the properties of the four scales of measurement: nominal, ordinal, interval, and ratio.
Contrast the three ways of describing results: comparing group percentages, correlating scores, and comparing group means
Comparing Group Percentages: The percentage of individuals in each group who give a particular response is computed and compared across groups; used when the variable is nominal (e.g., the percentage in each group who agree).
Correlating Individual Scores: Individuals are measured on two variables, and each variable has a range of numerical values.
Comparing Group Means: Outcomes are measured for groups that received different conditions, the mean of each group is calculated, and the means are compared to see whether there is a statistically significant difference between the groups.
Describe a frequency distribution, including the various ways to display a frequency distribution
Frequency distribution: the number of individuals who receive each possible score on a variable.
Pie Charts: divide a whole circle, or “pie” into “slices” that represent relative percentages.
Bar graphs: use a separate and distinct bar for each piece of information. Typically used to chart categorical scores that are distinct from one another (e.g., liking vs. disliking, red vs. blue).
Frequency polygons: use a line to represent the distribution of frequencies of scores. Most helpful when the data represent ratio or interval scores.
Histograms: use bars to show frequencies, similar to a bar graph, but the scores are continuous, such as age or height.
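For illustration, a minimal Python sketch of building a frequency distribution (the scores are made up, not from the book). The loop prints a crude text "bar" per score, a text-mode stand-in for a histogram:

```python
from collections import Counter

# Hypothetical quiz scores; Counter tallies how many individuals
# received each possible score -- the frequency distribution.
scores = [3, 5, 5, 4, 3, 5, 2, 4, 5, 3]
freq = Counter(scores)

# Show the distribution in score order with a text "bar" per score.
for score in sorted(freq):
    print(f"{score}: {'#' * freq[score]}")
```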
Describe the measures of central tendency and variability
Central tendency tells us what the sample as a whole, or on the average, is like. There are three measures of central tendency.
Mean: The average of a set of scores, obtained by adding all of the scores and dividing by the number of scores. It is symbolized by x̄ and abbreviated as M.
Median: The score that divides the distribution in half: half of the scores fall below it and half above. Abbreviated as Mdn.
Mode: The most frequent score. This is the only measure of central tendency that is appropriate if a nominal scale is used.
Variability: A number that characterizes the amount of spread in a distribution of scores.
Standard Deviation: the average deviation of scores from the mean. Symbolized as s, abbreviated as SD.
The formula for the standard deviation differs depending on whether it is the population or a sample being described.
Population: σ = √( Σ(x − μ)² / N )
Sample: s = √( Σ(x − x̄)² / (n − 1) )
Variance: s² – just the SD squared
Range: The difference between the highest score and the lowest score.
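The measures above can be computed with Python's standard statistics module; a small sketch with made-up scores (note stdev uses n − 1, pstdev uses N):

```python
import statistics

# Hypothetical interval/ratio scores for illustration.
scores = [2, 3, 3, 5, 7]

mean = statistics.mean(scores)      # sum of scores / number of scores
median = statistics.median(scores)  # middle score when sorted
mode = statistics.mode(scores)      # most frequent score

sample_sd = statistics.stdev(scores)       # divides by n - 1
population_sd = statistics.pstdev(scores)  # divides by N
variance = sample_sd ** 2                  # SD squared
data_range = max(scores) - min(scores)     # highest minus lowest
```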
Define a correlation coefficient
Correlation coefficient: a statistic that describes how strongly variables are related to one another.
Define effect size
Effect size: the strength of association between variables
Describe the use of a regression equation and multiple correlation to predict behavior
Regression equations are calculations used to predict a person’s score on one variable when that person’s score on another variable is already known. The general form of a regression equation is:
Y = a + bX
where Y is the score we wish to predict, a is a constant, b is a weighting adjustment factor, and X is the known score.
When researchers are interested in predicting some future behavior (called the criterion variable) on the basis of a person’s score on some other variable (called the predictor variable), it is first necessary to demonstrate that there is a reasonably high correlation between the criterion and predictor variables. The regression equation then provides the method for making predictions on the basis of the predictor variable score only.
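A minimal sketch of fitting Y = a + bX by least squares and using the equation to predict a new score (the data are made up for illustration):

```python
xs = [1.0, 2.0, 3.0, 4.0]   # predictor variable scores
ys = [2.0, 4.0, 6.0, 8.0]   # criterion variable scores

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares weight b and constant a.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
    sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

def predict(x):
    """Predicted criterion score Y = a + bX for a known score x."""
    return a + b * x
```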
A technique called multiple correlation is used to combine a number of predictor variables to increase the accuracy of prediction of a given criterion or outcome variable. A multiple correlation is the correlation between a combined set of predictor variables and a single criterion variable. Taking all of the predictor variables into account usually permits greater accuracy of prediction than if any single predictor is considered alone.
A multiple regression equation can be calculated that takes the following form:
Y = a + b1X1 + b2X2 + … + bnXn
where Y is the criterion variable, X1 to Xn are the predictor variables, a is a constant, and b1 to bn are weights that are multiplied by scores on the predictor variables.
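A sketch of the multiple regression equation using numpy least squares, with made-up data where Y happens to equal 1 + 2·X1 + 3·X2; the multiple correlation R is the correlation between predicted and actual criterion scores:

```python
import numpy as np

# Hypothetical data: four people measured on two predictors and a criterion.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([0.0, 1.0, 0.0, 1.0])
y = 1 + 2 * x1 + 3 * x2

# Design matrix: a column of ones (for the constant a) plus the
# predictor columns; least squares recovers a, b1, b2.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

predicted = X @ coef
# Multiple correlation R: correlation between predicted and actual Y.
R = np.corrcoef(predicted, y)[0, 1]
```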
Discuss how a partial correlation addresses the third-variable problem
A technique called partial correlation provides a way of statistically controlling third variables. A partial correlation is a correlation between the two variables of interest, with the influence of the third variable removed from, or “partialed out of,” the original correlation. This provides an indication of what the correlation between the primary variables would be if the third variable were held constant. This is not the same as actually keeping the variable constant, but it is a useful approximation.
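A sketch of partialing out a third variable using the standard formula for a first-order partial correlation (the scores are made up; here y is exactly 2·x, so controlling for z cannot reduce the perfect correlation):

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two lists of scores."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a)
                           * sum((y - mb) ** 2 for y in b))

def partial_r(x, y, z):
    """Correlation of x and y with the third variable z partialed out."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]   # y = 2x, so r(x, y) = 1
z = [1.0, 1.0, 2.0, 2.0, 3.0]    # candidate third variable
```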
Summarize the purpose of structural equation models
Note: likely won’t be on test
Structural equation modeling (SEM) is a set of techniques used to examine models that specify a set of relationships among variables using quantitative non-experimental methods. A model is an expected pattern of relationships among a set of variables. The proposed model is based on a theory of how the variables are causally related to one another. After the data have been collected, statistical methods can be applied to examine how closely the proposed model actually fits the obtained data.
Explain how researchers use inferential statistics to evaluate sample data
Inferential statistics are used to determine whether we would obtain the same results if we were to conduct the experiment again and again with multiple samples. In essence, we are asking whether we can infer that the difference in the sample means reflects a true difference in the population means. Inferential statistics give the probability that the difference between means reflects random error rather than a real difference.
Distinguish between the null hypothesis and the research hypothesis
Statistical inference begins with a statement of the null hypothesis and a research (or alternative) hypothesis. The null hypothesis is simply that the population means are equal – the observed difference is due to random error. The research hypothesis is that the population means are, in fact, not equal. The null hypothesis states that the independent variable had no effect; the research hypothesis states that the independent variable did have an effect. If we can determine that the null hypothesis is incorrect, then we accept the research hypothesis as correct.
Discuss probability in statistical inference, including the meaning of statistical significance
Probability is the likelihood of the occurrence of some event or outcome.
Statistical significance: A significant result is one that has a very low probability of occurring if the population means are equal. Significance indicates that there is a low probability that the difference between the obtained sample means was due to random error.
Describe the t test and explain the difference between one-tailed and two-tailed tests
The t test is commonly used to examine whether two groups are significantly different from each other. The sampling distribution of t reflects all the possible outcomes we could expect if we compared the means of two groups and the null hypothesis were correct. To use this distribution to evaluate our data, we calculate a value of t from the obtained data and evaluate it in terms of the sampling distribution of t that is based on the null hypothesis. If the obtained t has a low probability of occurrence (.05 or less), then the null hypothesis is rejected. The t value is a ratio of two aspects of the data: the difference between the group means and the variability within groups. The ratio may be described as follows:
t = group difference / within-group variability
The group difference is simply the difference between your obtained means; under the null hypothesis, you expect this difference to be zero. The value of t increases as the difference between your obtained sample means increases.
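The ratio above can be sketched in Python for two independent groups (made-up scores; this is the equal-variance, pooled form of the t test):

```python
import math
import statistics

# Hypothetical scores for two independent groups.
group1 = [5.0, 6.0, 7.0, 8.0]
group2 = [1.0, 2.0, 3.0, 4.0]

n1, n2 = len(group1), len(group2)
m1, m2 = statistics.mean(group1), statistics.mean(group2)
# statistics.variance is the sample variance (n - 1 denominator).
v1, v2 = statistics.variance(group1), statistics.variance(group2)

# Pooled estimate of within-group variability.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)

# t = group difference / within-group variability
t = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
```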
Describe the F test, including systematic variance and error variance
Describe what a confidence interval tells you about your data
Distinguish between Type I and Type II errors
Type I error – occurs when we reject the null hypothesis although the null hypothesis is actually true.
Type II error – occurs when the null hypothesis is accepted, although in the population the research hypothesis is true.
The probability of a Type I error is called alpha (α) and the probability of a Type II error is called beta (β).
Discuss the factors that influence the probability of a Type II error
- The significance (alpha) level. If we set a very low significance level to decrease the chances of a Type I error, we increase the chances of a Type II error. (As alpha decreases, beta increases.)
- Sample size – true differences are more likely to be detected if the sample size is large.
- Effect size – If the effect size is large, a Type II error is unlikely. However, a small effect size may not be significant with a small sample.
Discuss the reasons a researcher may obtain nonsignificant results
Define power of a statistical test
Describe the criteria for selecting an appropriate statistical test