
Wednesday, December 29, 2010

PSYC 2001: Research Designs

1. Qualitative Research

In contrast to quantitative research, there are four major approaches to qualitative research:

  • Naturalistic Observation: describes behaviour in real life.
    • Strengths: gives insight into contexts, relationships, and behaviours (closer to real life); flexible.
    • Weaknesses: time-consuming; hard to document (observe, don't interpret); access may be hard to obtain; informed consent may be difficult to get; and the sample may not be representative.
  • Questionnaires.
  • Conversation Analysis.
  • Single-Case Study: an in-depth analysis of a single person or small group. If there is no treatment, it’s called a case history.

We can use indirect measures (e.g., observing people) or content analysis (e.g., looking at patterns and themes in written text or media).                

  • Program Evaluation: in order of increasing usefulness, we can use reactions and feelings, learning, changes in skills, and effectiveness to evaluate a program.
    • Issue: Participants may not volunteer or may feel coerced to participate.

There are six ways to evaluate a program:

  1. Questionnaires: used to quickly/easily access information from a large sample in a non-threatening way.
    • Pros: anonymous, cheap, easy to compare and analyze, already exist.
    • Cons: getting careful feedback is tough, wording can bias responses, representative sample may not be available, limited to questions on the sheet.
  2. Interviews: used to fully understand someone’s answers or expand on questionnaire responses.
    • Pros: full range and depth of information, develops relationship with client.
    • Cons: time consuming, hard to analyze and compare, expensive, interviewer can bias response.
  3. Documentation Review: used when we want an impression of how a program operates without interrupting the program (where we review applications, finances, memos, etc).
    • Pros: historical information is available, doesn’t interrupt the client, few biases in information.
    • Cons: information may be incomplete, no flexibility (you have what exists and nothing more), need to be clear about what you’re looking for (so much data).
  4. Observation: used to gather information about how a program operates (particularly about specific processes).
    • Pros: views operations of a program as they are actually occurring, can adapt to events as they occur.
    • Cons: difficult to interpret observations, hard to categorize observations, can influence behaviour of participants.
  5. Focus Groups: used to explore a topic in depth with a discussion (useful in evaluations and marketing).
    • Pros: quickly and reliably get common impressions, fast and efficient way to get a wide range of information, can convey key information about programs.
    • Cons: hard to analyze responses, need a good facilitator, hard to schedule people together.
  6. Case Studies: used to fully understand a client’s experience in a program and conduct comprehensive examination through cross comparison of cases.
    • Pros: fully depicts client’s experience, powerful way to portray program to outsiders.
    • Cons: time consuming, provides depth but not breadth.
Developing Surveys and Questionnaires

Surveys evaluate attitudes, feelings, experience, and knowledge, whereas questionnaires evaluate aptitudes, abilities, and characteristics.

There are two characteristics of tests:

  • Reliability: inter-rater reliability; split-half reliability (does performance in one section relate to performance in another?); and test-retest reliability (are scores consistent over time?).
  • Validity: concurrent validity (do results concur with other tests?) and criterion validity (does it predict behaviour? Is it meaningful? Is it useful?).

Though we can use existing measures, they can be expensive, intended for special populations, and have restricted access.

There are five types of questions:

  1. Yes-No: easy to mark. Measure with frequency/percentage.
  2. Forced-Alternative: easy to score but nothing ever fits anyone perfectly.
  3. Multiple-Choice: easy to mark.
  4. Likert Scales: people tend not to choose the extremes – if you use even numbers, people have to at least agree/disagree. Measure with a mean/modal response.
  5. Open-Ended: allows a wide variety of responses. Measure by coding first.

Remove questions that have low variability or that participants omitted or commented on, or do an item analysis (if too many people get a question wrong, remove it).

There are three ways to distribute surveys (in addition to online and interviews):

Mail
  • Pros: no need to be present; potential for a larger sample.
  • Cons: no idea who completed it; questions may not be answered in order; low return rate (biased return, 20–30%); incomplete or unanswered questions; answers may not be representative.
Phone
  • Pros: no need to be present; random sample possible; questions answered in the correct order; survey gets completed; clarification possible.
  • Cons: random-number dialing may call the same person twice; hang-ups, blocked calls, and screening; no visuals in the survey; harder to establish rapport; lower response rate (between mail and in-person).
In Person
  • Pros: high return rate (90%); answer clarification possible; ensures surveys are completed; questions answered in the correct order.
  • Cons: time; money; training; interviewer bias; harder to access (in-home).

2. Quantitative Research

There are five major approaches to quantitative research:

  1. Descriptive Statistics: describes a group using numerical scores on variables.
    • Use: often a starting point in research.
    • Statistics: mean, median, and standard deviation.
  2. Correlational: where we look for relationships between two or more variables. We measure each participant on each variable, but no causal inferences can be made.
    • Use: for testing theories and forming predictions; often a secondary analysis.
    • Statistics: often requires a linear relationship which is evaluated on strength and direction.
  3. Experimental: tests your hypotheses about the causal effects of the IV on the DV. The IV is actively manipulated and has at least two levels. Participants must be randomly assigned to conditions, you need to specify your procedures for testing your hypothesis, and you have to control for major threats to internal validity.
  4. Quasi-Experimental: where we test our hypotheses about relationships between the IV and DV but we have limited ability to actively manipulate the IV or randomly assign participants to conditions. We must still include specific procedures for testing the hypotheses and still control for the major threats to internal validity. We cannot make true causal inferences.
  5. Non-Experimental: see book!

3. Experimental Designs

There are four characteristics of an experimental design:

  1. Two levels of the IV (by manipulating the IV).
  2. Include a measurement variable (DV) to compare the participants in each condition.
  3. Look for consistent differences between groups.
  4. Include control for the major threats to internal validity.

You should also test your hypothesis about the causal effects of the IV, randomly assign participants to conditions, and limit the influence of extraneous variables (so they don’t become confounds).

Causation and Directionality: when we find a relationship between two variables, we have to ensure that we manipulate and control our variables to ensure that we can make a causal and directional conclusion.

  • Some variables may be hard to manipulate (ethically or logistically).

3.1. Control Groups

A good experiment will always have a control group, otherwise we can’t be sure if the treatment caused the change or whether it was due to some other variable. The control group must receive the exact same experience as the treatment group (with the exception of the actual treatment).

There are two types of control groups:

  1. No-Treatment Group: where the group receives no treatment. This provides a baseline to compare the treatment group with.
  2. Placebo Group: where the group receives a placebo treatment. It helps rule out the possibility that the difference between the treatment and no-treatment group are due to participant expectations and regression to the mean (i.e., people will ‘get better’ without treatment so we have to know if it’s a result of the treatment or time).
    • There are three types of placebos:
      • Inert substance – a sugar pill.
      • Sham surgery.
      • False information – telling participants they’re getting the ‘better’ treatment.

Manipulation Check: tests whether the manipulation of the IV actually worked as intended.

3.2. Between-Subject Designs

Between-Subject Design: where we compare scores from separate groups (different treatment conditions) to determine which treatment is more effective. Scores are independent.

Groups should be equivalent. The process in which they are created should be the same, they should be treated equally, and they should be composed of equivalent individuals.

  • Advantages: each score is independent of other scores.
    • Not influenced by experience from other treatments (i.e., practice effects).
    • Not influenced by fatigue or boredom.
    • Not influenced by contrast effects.
  • Disadvantages:
    • Requires a large number of participants (can be difficult with special populations).
    • Individual Differences: characteristics that differ from one participant to another; these can become confounding variables and can produce high variability in the scores.
      • Assignment bias (e.g. older versus younger group)
      • Confounding environmental variables

Variance Between and Within Groups

We want variance between groups (due to the treatment) but we want to control unwanted sources of variance. There are two forms of variance:

  1. Systematic Between-Group Variance: experimental variance due to the IV, or extraneous variance due to confounding variables.
  2. Non-systematic Within-Group Variance: due to chance or individual differences.

To determine if there is a systematic difference between groups, we must compare between-group and within-group variance. If between-group variance is greater than within-group variance, there is a significant, systematic difference, meaning the null hypothesis should likely be rejected.

Using a t-test or ANOVA, we divide the difference between groups by the difference expected by chance:

t = (difference between groups) / (difference expected by chance [error])
F = (variance between groups) / (variance expected by chance [error])

To determine whether a difference is due to something other than chance, we compute the signal-to-noise ratio. The signal-to-noise ratio must be large in order to conclude that the difference is due to something other than chance.

  • Noise: the standard deviation (the variability expected by chance).
  • Signal: the difference between the group means.
  • A ratio of 10:3 would be large, whereas 8:7 would be small.
  • Signal : noise, where signal > noise.
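The t ratio described above can be sketched by hand with Python's standard library; the group scores below are invented purely for illustration:

```python
from statistics import mean, variance

# Hypothetical scores for two independent groups (invented for illustration).
treatment = [8, 9, 7, 10, 9, 8]
control = [5, 6, 7, 5, 6, 6]

n1, n2 = len(treatment), len(control)
signal = mean(treatment) - mean(control)  # difference between groups

# Pooled variance estimates the variability expected by chance (the "noise").
pooled_var = ((n1 - 1) * variance(treatment)
              + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
noise = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

t = signal / noise  # (difference between groups) / (difference expected by chance)
print(round(t, 2))
```

Here the signal (the mean difference) is large relative to the noise, so the t ratio is large and the null hypothesis would likely be rejected.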

There are three ways we can control the variance:

  • Maximize experimental variance: make sure that the differences between groups are real by using a manipulation check or maximizing the effect size.
  • Control extraneous variance: rule out confounds by making sure the groups are as similar as possible.
  • Minimize error variance: control your study with careful measurements.

Threats to Internal Validity

There are two major threats to internal validity with this design:

  • Differential Attrition: if too many participants drop out, your groups may be uneven, creating problems.
  • Communication between groups:
    • Diffusion: where participants from the no-treatment group start using techniques from the treatment group.
    • Compensatory Equalization: where the no-treatment group wants the treatment.
    • Compensatory Rivalry: where the no-treatment group tries harder than the treatment group to prove themselves.
    • Resentful Demoralization: where the no-treatment group resents the treatment group and performs poorly.

Applications

Two-Group Mean Difference: where we compare only two groups of participants and the researcher manipulates only one independent variable with only two levels (single-factor two-group design).

  • Often used when we compare treatment and control group.
  • Advantage: simple and provides best opportunity to maximize variance between groups.
  • Disadvantage: provides little data for comparison (only 2 data sets).

Comparing Means for More than Two Groups: where we compare multiple groups and the researcher manipulates one independent variable with several levels (single-factor multiple-group design).

  • Advantage: provides stronger evidence for real cause-and-effect relationship.
  • Disadvantage: more levels mean smaller differences between adjacent groups, so we have to ensure we manipulate the independent variable sufficiently.

Comparing Proportions for Two or More Groups: when you use nominal or ordinal scales, you can’t compute a mean but you can compare proportions using a chi-square test.
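A chi-square test of independence on a 2 × 2 table can be computed by hand; the counts below are invented purely for illustration:

```python
# Hypothetical 2x2 table: rows = groups, columns = outcome (yes/no); counts invented.
observed = [[30, 20],   # treatment group
            [18, 32]]   # control group

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Chi-square: sum of (observed - expected)^2 / expected over all cells,
# where expected = (row total * column total) / grand total.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi_sq += (obs - expected) ** 2 / expected

print(round(chi_sq, 2))  # compare against the critical value for df = 1
```

A chi-square value larger than the critical value for the table's degrees of freedom suggests the proportions differ between groups.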

3.3. Within-Subject Designs

Within-Subject Design: where we use a single group of participants and test/observe each individual in all of the different treatments being compared (also known as a repeated-measures design). Scores are dependent.

  • Advantages: requires fewer participants and eliminates issues relating to individual differences.
    • Possible to measure differences between treatments without involving individual differences.
    • Possible to measure differences between individuals – these individual differences can be measured and removed.
    • Easier to detect a treatment effect with this design.
  • Disadvantages:
    • Time variables (fatigue, weather) can influence participants and threaten internal validity.
    • Participant attrition – counteract this by getting more participants at the outset than you think you’ll need.

There are two major threats to internal validity:

  • Confounding environmental variables (e.g., morning versus afternoon).
  • Confounding time variables (history, maturation, instrumentation, testing effects, and regression).

There are two types of order effects:

  • Carryover Effect: (treatment-specific experience) the participant develops new skills/techniques that carry over to the next treatment.
  • Progressive Error: (general experience)
    • Practice Effects: progressive improvement in performance as a participant gains experience.
    • Fatigue: progressive decline in performance as a participant gains experience.

Three ways to control time-related threats and order effects:

  1. Controlling Time: shortening the time between treatments can reduce time-related threats to internal validity but increases order effects.
  2. Switch to a Between-Subject Design: whenever you have reason to suspect that order effects will play a substantial role, switch to a between-subject design.
  3. Counterbalancing – Matching Treatments with Respect to Time: where we change the order in which treatments are given; the goal is to use every possible order of treatments with an equal number of participants in each sequence.
    • Note that this method doesn't eliminate order effects but distributes them evenly across treatment conditions.
    • Limitations:
      • Counterbalancing and Variance: counterbalancing increases the variance within the treatment groups.
      • Asymmetrical Order Effects: counterbalancing doesn’t take into account that some treatments may be more demanding (have an unbalanced amount of order effects).
      • Counterbalancing and the Number of Treatments: n! (n factorial) is the number of sequences needed for a completely counterbalanced study; partial counterbalancing can be used instead (a Latin square or randomly assigned treatment orders).
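The factorial growth of complete counterbalancing, and a Latin-square shortcut for partial counterbalancing, can be sketched as follows (the treatment labels are hypothetical):

```python
from itertools import permutations
from math import factorial

treatments = ["A", "B", "C"]  # hypothetical treatment labels
n = len(treatments)

# Complete counterbalancing: every possible order, i.e. n! sequences.
orders = list(permutations(treatments))
assert len(orders) == factorial(n)  # 3! = 6 sequences

# Partial counterbalancing with a cyclic Latin square: each treatment
# appears exactly once in every ordinal position, using only n sequences.
latin_square = [treatments[i:] + treatments[:i] for i in range(n)]
for row in latin_square:
    print(" -> ".join(row))
```

With 5 treatments, complete counterbalancing would already need 5! = 120 sequences, which is why partial schemes like the Latin square are used.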

Applications

Two-Treatment Designs: where there are only two treatment conditions.

  • Advantages: easy to conduct and results are easy to understand, easy to counterbalance, easy to get a statistical difference.
  • Disadvantages: doesn’t demonstrate any functional relationship between the two data sets – we can’t know how gradual changes in the independent variable would affect the dependent variable.
  • Statistical Analysis: a repeated-measures t-test or analysis of variance can be used (for interval or ratio scales). For ordinal scales, the Wilcoxon test can be used.

Multiple-Treatment Designs:

  • Advantage: demonstrates stronger cause-and-effect relationship.
  • Disadvantage: too many treatments may not produce statistically significant differences, and the added time needed to complete the study increases participant attrition.

3.4. Comparing Within- and Between-Subject Designs

There are three ways in which these designs differ significantly:

  • Individual differences: within-subject designs handle them well; between-subject designs are vulnerable to them.
  • Time-related factors and order effects: between-subject designs handle them well; within-subject designs are vulnerable to them.
  • Few available participants: within-subject designs work well; between-subject designs require more participants.

3.5. Single-Subject Research Design

Single-subject research designs allow researchers to manipulate IVs and measure the DV with a single person (or small group) and infer a causal relationship. There is no control group.

  • Useful for studying unusual disorders and unique experiences, or when we’re concerned that group variation will hide individual effects. (Often used in clinical psychology.)
  • Flexible and you only need one (or a small group of) participant(s).

Classical Research Design: [baseline] – [treatment] – [post-treatment]

ABAB Reversal Design: [baseline] – [treatment] – [baseline] – [treatment]

  • In the treatment conditions, we're looking for a full reversal of behaviour so that we can conclude that the treatment is affecting the behaviour.
  • The baseline conditions act as the ‘control’.
  • This can’t be used for treatments that are expected to have long-lasting or permanent effects.

There are three Multiple-Baseline Designs:

  • Multiple-Baseline Across Subjects: where a baseline is established for multiple participants and the treatment is staggered across participants (each participant receives it only once).
  • Multiple-Baseline Across Behaviours: where baseline behaviours are established for a single participant and treatments are staggered to determine whether the treatment had an effect.
  • Multiple-Baseline Across Situations: where baseline behaviours are established in two different situations and the treatments are staggered.
  • All of the multiple-baseline designs follow this pattern:
    Observation 1: [---baseline---]           [---treatment---]
    Observation 2: [---baseline---]                            [---treatment---]
  • Strengths: good for treatments with long-lasting effects.
  • Weaknesses: hard to discern between individual behaviours; treatments may produce a general effect.

There are three other types of single-subject designs:

  1. Dismantling (Component-Analysis) Design: where each phase adds or subtracts one component of a complex treatment to determine how each component contributes to the overall treatment.
  2. Changing Criterion Design: where each phase is defined by a specific criterion that determines a target level of behaviour (e.g., 20 cigarettes a day for the first week, 10 for the next). To ensure there is a correlation between the treatment and the behaviour, researchers often include backward phases (“now have 30 cigarettes!”) to make sure the behaviour is causally related to the treatment.
  3. Alternating-Treatment Design: where two or more treatment conditions are randomly alternated from one observation to the next.

Single-subject designs have low external validity but high internal validity. We can generalize our results using systematic, clinical, and direct replication.

  • Systematic replication is where we test the effectiveness of a given treatment across a variety of behaviours (e.g., change the participant, setting, or behaviour).
  • Clinical replication is where we test the effectiveness of a combination of treatments for various behaviours, participants, and settings.
  • Direct replication is where we test the effectiveness of a given treatment for a single behaviour, multiple times. ABAB design uses this approach.

4. Quasi- and Non-Experimental

Quasi-experimental designs make some attempt to reduce threats to internal validity, whereas non-experimental designs make none. In both cases, the conditions are not created by manipulating an IV but are defined by pre-existing conditions (e.g., gender, age) or in terms of time (e.g., before and after). These designs look for differences between groups.

There are two categories of designs that can be classified as quasi- or non-experimental:

  • Between-subject designs, or non-equivalent group designs; and
  • Within-subject designs, or pre-post designs.

Non-equivalent group (between-subject) designs are so named because the researcher cannot control the assignment of individuals to a group.

  • All three design subtypes suffer from assignment bias.
  • The three design subtypes:
    • Differential Design (non-experimental): where participants are assigned to groups based on a particular participant characteristic (such as gender or race). The DV is compared between the groups to see if there is a consistent difference. Similar to correlational research.
    • Non-Equivalent Group Design (non-experimental): where the control and treatment groups are pre-existing and not randomly assigned.
      • Posttest-only Non-equivalent Control Group Design: where two non-equivalent groups are compared – they are measured at the same time but only one group receives the treatment.
        X O (treatment group)
            O (control group)
    • Pretest-Posttest Non-Equivalent Group Design (quasi): where two non-equivalent groups (treatment and control) are compared – the treatment group is measured before AND after the treatment, whereas the control group is measured twice (before and after) but with no treatment in between.
      O X O (treatment group)
      O     O (control group)

      This design reduces assignment bias, limits threats from time-related factors, and provides evidence for a cause-and-effect relationship.
  • History effects that affect groups differently are called differential threats. Time-related threats may occur with all of the above as well.

Pre-post group designs (within-subject) measure the pretest and posttest scores of a group before and after treatment to determine whether there's an effect. There is no control group with this design.

There are two types of designs:

  • One-Group Pretest-Posttest Design (non-experimental): only one group is used
    O X O
  • Time-Series Design (quasi-experimental): multiple observations are made before and after the treatment. This is often used with predictable events (e.g., a change in law).
    O O O X O O O

These designs suffer from time-related threats to internal validity (history, instrumentation, testing effects, maturation, and statistical regression).

Developmental Research Designs: experiments that examine changes in behaviour that are related to age. There are three types:

  1. Cross-Sectional: (between-subject) different cohorts are measured at one point in time and then compared.
    • Cohorts: individuals born at roughly the same time and growing up in similar conditions.
    • Cohort Effects: differences between age groups caused by unique experiences other than age. (Be careful! Developmental differences may just be cohort effects.)
  2. Longitudinal: examines development by observing or measuring one group of cohorts over time.
  3. Cross-Sectional Longitudinal Designs: compare results obtained from separate samples that were obtained at different times (i.e., many cohorts, many times).
  Cross-Sectional
    • Strengths: time efficient; no long-term cooperation required; easier to generalize.
    • Weaknesses: individual changes not assessed; cohort or generation effects.
  Longitudinal
    • Strengths: no cohort or generation effects; can assess individual behaviour changes.
    • Weaknesses: time consuming; participant dropout may create bias; potential for practice effects; harder to generalize results.

5. Factorial Designs

Experimental factorial designs are studies that contain more than one IV (or more than one factor) and allow us to examine the interactive effects of IVs. This is good for external validity.

We use ANOVA for factorial designs. The design notation expresses the number of levels of each IV.

For example, if there are two IVs and the first IV has 2 levels and the second IV has 5 levels, it would be a 2 × 5 factorial design.

A Main Effect describes the individual effect of each IV on the DV.

  • On a graph, the variable on the x-axis has a main effect if the average of the points differs across its levels (the mean line is not flat).
  • The variable plotted as separate lines has a main effect if the lines differ in overall height.

An Interaction Effect describes the combined effect of two or more IVs on the DV.

  • On a graph, there is an interaction if the lines are not parallel.
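Reading main effects and an interaction off cell means can be sketched as follows (the 2 × 2 cell means are invented for illustration):

```python
# Hypothetical 2 x 2 cell means: rows = levels of IV A, columns = levels of IV B.
cell_means = [[10.0, 20.0],
              [12.0, 34.0]]

# Main effect of A: compare the row means (averaging over B).
row_means = [sum(row) / len(row) for row in cell_means]

# Main effect of B: compare the column means (averaging over A).
col_means = [sum(col) / len(col) for col in zip(*cell_means)]

# Interaction: the effect of B differs across levels of A
# (on a graph, the lines would not be parallel).
effect_of_b = [row[1] - row[0] for row in cell_means]
interaction = effect_of_b[0] != effect_of_b[1]

print(row_means, col_means, interaction)
```

In this invented example, both IVs show a main effect (the row means differ and the column means differ), and the effect of B is larger at the second level of A, so there is an interaction.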

[Figures: interaction and main effect for BAR; interaction and main effect for DRINK]

6. Correlational Designs

Correlational designs allow us to measure each participant on each variable and look for a relationship between two variables (calculated with a correlation coefficient and visualized with a scatter plot). This design looks for a relationship between two or more variables.

There are three characteristics of a correlation:

  1. The direction can be upward (+r), downward (−r), or non-existent (r = 0).
  2. The nature of the relationship: linear (Pearson’s r) or curved (Spearman’s rho, ρ).
  3. The consistency or strength. The closer r is to 1, the more consistent the relationship (i.e., the closer the data points appear on the scatter plot).

To interpret a correlation:

  • The strength of the relationship is indicated by the coefficient of determination (r2). This tells us the proportion of variability in one variable that is predicted by the other.
    • For example, if the correlation coefficient r = .80, then r2 = .64 which means that 64% of the differences are predictable.
    • Small (r = .10), medium (r = .30), large (r = .50).
      Small (r2 = .01), medium (r2 = .09), large (r2 = .25).
  • The significance of the relationship is indicated by the statistical significance of the correlation, which means that the result is unlikely to have occurred by random chance.
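Pearson's r and the coefficient of determination can be computed directly from paired scores; the data below are invented for illustration:

```python
from statistics import mean

# Hypothetical paired scores for two variables.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

mx, my = mean(x), mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
sx = sum((a - mx) ** 2 for a in x) ** 0.5
sy = sum((b - my) ** 2 for b in y) ** 0.5

r = cov / (sx * sy)        # Pearson correlation coefficient
r_squared = r ** 2         # proportion of variability in y predicted by x
print(round(r, 2), round(r_squared, 2))
```

For these invented scores r is about .83, so roughly 69% of the differences in one variable are predictable from the other.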

There are three applications of correlational designs:

  1. Prediction: if x and y are correlated, we can use the x value to predict the y value! We can use a regression line to plot the prediction.
  2. Test-Retest Reliability: where we can see if our first and second tests are positively correlated.
  3. Concurrent Validity: where we can see if our test correlates positively with a well-tested test.
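The prediction application above can be sketched with a least-squares regression line; the hours/score data are invented for illustration:

```python
from statistics import mean

# Hypothetical paired observations: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [52, 58, 65, 68, 77]

mx, my = mean(x), mean(y)

# Least-squares slope and intercept for the regression line y = intercept + slope * x.
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx

predicted = intercept + slope * 6  # predict y for a new x value
print(round(slope, 2), round(intercept, 2), round(predicted, 1))
```

Once x and y are correlated, the fitted line lets us plug in a new x value and read off a predicted y.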

There are strengths and weaknesses of correlational designs:

Strengths:

  • Describes relationships between variables
  • Nonintrusive – allows natural behaviour to emerge
  • High external validity

Weaknesses:

  • Cannot assess causality
  • Third-variable problem
  • Directionality problem
  • Low internal validity
