
Wednesday, December 29, 2010

PSYC 2001: Research Designs

1. Qualitative Research

In contrast to quantitative research, there are four major approaches to qualitative research:

  • Naturalistic Observation: describes behaviour in real-life settings.
    • Strengths: gives insight into contexts, relationships, and behaviours (closer to real life); flexible.
    • Weaknesses: time-consuming; hard to document (observe, don’t interpret); access may be hard to get; informed consent may be difficult to obtain; may not be representative.
  • Questionnaires.
  • Conversation Analysis.
  • Single-Case Study: an in-depth analysis of a single person or small group. If there is no treatment, it’s called a case history.

We can use indirect measures (e.g., observing people) or content analysis (e.g., looking at patterns and themes in written text or media).                

  • Program Evaluation: in order of increasing usefulness, we can use reactions and feelings, learning, changes in skills, and effectiveness to evaluate a program.
    • Issue: Participants may not volunteer or may feel coerced to participate.

There are 6 ways to evaluate a program:

  1. Questionnaires: used to quickly/easily access information from a large sample in a non-threatening way.
    • Pros: anonymous, cheap, easy to compare and analyze, already exist.
    • Cons: getting careful feedback is tough, wording can bias responses, representative sample may not be available, limited to questions on the sheet.
  2. Interviews: used to fully understand someone’s answers or expand on questionnaire responses.
    • Pros: full range and depth of information, develops relationship with client.
    • Cons: time consuming, hard to analyze and compare, expensive, interviewer can bias response.
  3. Documentation Review: used when we want an impression of how a program operates without interrupting it (we review applications, finances, memos, etc.).
    • Pros: historical information is available, doesn’t interrupt the client, few biases in information.
    • Cons: information may be incomplete, no flexibility (you have what exists and nothing more), need to be clear about what you’re looking for (so much data).
  4. Observation: used to gather information about how a program operates (particularly about specific processes).
    • Pros: views operations of a program as they are actually occurring, can adapt to events as they occur.
    • Cons: difficult to interpret observations, hard to categorize observations, can influence behaviour of participants.
  5. Focus Groups: used to explore a topic in depth with a discussion (useful in evaluations and marketing).
    • Pros: quickly and reliably get common impressions, fast and efficient way to get a wide range of information, can convey key information about programs.
    • Cons: hard to analyze responses, need a good facilitator, hard to schedule people together.
  6. Case Studies: used to fully understand a client’s experience in a program and conduct comprehensive examination through cross comparison of cases.
    • Pros: fully depicts client’s experience, powerful way to portray program to outsiders.
    • Cons: time consuming, provides depth but not breadth.
Developing Surveys and Questionnaires

Surveys evaluate attitudes, feelings, experiences, and knowledge, whereas tests evaluate aptitudes, abilities, and characteristics.

There are two characteristics of tests:

  • Reliability: inter-rater reliability, split-half reliability (does performance on one section relate to performance on another?), and test-retest reliability (are scores consistent over time?)
  • Validity: concurrent (do results concur with other tests?) and criterion (does it predict behaviour? Is it meaningful? Is it useful?)

Though we can use existing measures, they can be expensive, intended for special populations, and have restricted access.

There are five types of questions:

  1. Yes-No: easy to mark. Measure with frequency/percentage.
  2. Forced-Alternative: easy to score but nothing ever fits anyone perfectly.
  3. Multiple-Choice: easy to mark.
  4. Likert Scales: people tend not to choose the extremes – if you use even numbers, people have to at least agree/disagree. Measure with a mean/modal response.
  5. Open-Ended: allows a wide variety of responses. Measure by coding first.

Remove questions that have low variability or that participants omitted or commented on, and do an item analysis (if too many people get an item wrong, remove it).

There are three ways to distribute surveys (in addition to online and interviews):

  • Mail
    • Pros: no need to be present; potential for a larger sample.
    • Cons: no idea who completed it; questions may not be answered in order; low return rate (biased return, 20–30%); incomplete or unanswered questions; answers may not be representative.
  • Phone
    • Pros: no need to be present; random sampling possible; questions answered in the correct order; surveys get completed; clarification possible.
    • Cons: random-digit dialing may call the same person twice; hang-ups, blocked calls, and call screening; no visuals in the survey; harder to establish rapport; lower response rate (between mail and in person).
  • In Person
    • Pros: high return rate (90%); answer clarification possible; ensures surveys are completed; questions answered in the correct order.
    • Cons: time, money, and training required; interviewer bias; harder to access (in-home).

2. Quantitative Research

There are five major approaches to quantitative research:

  1. Descriptive Statistics: describes a group using numerical scores on variables.
    • Use: often a starting point in research.
    • Statistics: mean, median, and standard deviation.
  2. Correlational: where we look for relationships between two or more variables. We measure each participant on each variable, but no causal inferences can be made.
    • Use: for testing theories and helping form predictions; often a secondary analysis.
    • Statistics: usually assumes a linear relationship, which is evaluated on strength and direction.
  3. Experimental: tests your hypotheses about the causal effects of the IV on the DV. The IV is actively manipulated and has at least two levels. Your participants must be randomly assigned to conditions and you need to specify your procedures for testing your hypothesis. Also, you have to control for major threats to internal validity.
  4. Quasi-Experimental: where we test our hypotheses about relationships between the IV and DV but we have limited ability to actively manipulate the IV or randomly assign participants to conditions. We must still include specific procedures for testing the hypotheses and still control for the major threats to internal validity. We cannot make true causal inferences.
  5. Non-Experimental: see book!

3. Experimental Designs

There are four characteristics of an experimental design:

  1. At least two levels of the IV (created by manipulating the IV).
  2. Include a measurement variable (DV) to compare the participants in each condition.
  3. Look for consistent differences between groups.
  4. Include control for the major threats to internal validity.

You should also test your hypothesis about the causal effects of the IV, randomly assign participants to conditions, and limit the influence of extraneous variables (so they don’t become confounds).

Causation and Directionality: when we find a relationship between two variables, we have to ensure that we manipulate and control our variables to ensure that we can make a causal and directional conclusion.

  • Some variables may be hard to manipulate (ethically or logistically).

3.1. Control Groups

A good experiment will always have a control group, otherwise we can’t be sure if the treatment caused the change or whether it was due to some other variable. The control group must receive the exact same experience as the treatment group (with the exception of the actual treatment).

There are two types of control groups:

  1. No-Treatment Group: where the group receives no treatment. This provides a baseline to compare the treatment group with.
  2. Placebo Group: where the group receives a placebo treatment. It helps rule out the possibility that the difference between the treatment and no-treatment groups is due to participant expectations or regression to the mean (i.e., people will ‘get better’ without treatment, so we have to know if improvement is a result of the treatment or time).
    • There are three types of placebos:
      • Inert substance – a sugar pill.
      • Sham surgery.
      • False information – telling participants they’re getting the ‘better’ treatment.

Manipulation Check: tests whether the manipulation of the IV actually worked as intended.

3.2. Between-Subject Designs

Between-Subject Design: where we compare scores from separate groups (different treatment conditions) to determine which treatment is more effective. Scores are independent.

Groups should be equivalent. The process in which they are created should be the same, they should be treated equally, and they should be composed of equivalent individuals.

  • Advantages: each score is independent of other scores.
    • Not influenced by experience from other treatments (i.e., practice effects).
    • Not influenced by fatigue or boredom.
    • Not influenced by contrast effects.
  • Disadvantages:
    • Requires a large number of participants (can be difficult with special populations).
    • Individual Differences: characteristics that differ from one participant to another; these can become confounding variables and can produce high variability in the scores.
      • Assignment bias (e.g. older versus younger group)
      • Confounding environmental variables

Variance Between and Within Groups

We want systematic variance between groups (caused by the IV), and we want to control unwanted sources of variance. There are two forms of variance:

  1. Systematic Between-Group Variance: experimental variance due to the IV, or extraneous variance due to confounding variables.
  2. Non-systematic Within-Group Variance: due to chance or individual differences.

To determine if there is a systematic difference between groups, we must compare between-group and within-group variance. If between-group variance is greater than within-group variance, there is a significant, systematic difference, meaning the null hypothesis should likely be rejected.

Using a t-test or ANOVA, we calculate the difference between groups divided by the difference expected by chance: t (or F) = (difference between groups) / (difference expected by chance [error]).

To decide whether the difference is bigger than chance alone would produce, we look at the signal-to-noise ratio. The signal-to-noise ratio must be large in order to conclude that the difference is due to something other than chance.

  • Signal: the difference between the group means.
  • Noise: the variability expected by chance (e.g., the standard deviation).
  • We need signal > noise: a ratio of 10:3 would be large, whereas 8:7 is small.
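
To make the ratio concrete, here’s a minimal Python sketch (all scores are made up) that computes an independent-samples t as signal over noise, using the pooled standard error as the “difference expected by chance”:

```python
# Signal-to-noise sketch for an independent-samples t; hypothetical scores.
from statistics import mean, stdev

treatment = [12, 15, 14, 16, 13, 15]
control = [10, 11, 9, 12, 10, 11]

signal = mean(treatment) - mean(control)  # difference between group means
n1, n2 = len(treatment), len(control)

# Pooled variance, then the standard error of the difference (the "noise").
sp2 = ((n1 - 1) * stdev(treatment) ** 2 + (n2 - 1) * stdev(control) ** 2) / (n1 + n2 - 2)
noise = (sp2 * (1 / n1 + 1 / n2)) ** 0.5

t = signal / noise
print(f"signal = {signal:.2f}, noise = {noise:.2f}, t = {t:.2f}")
```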

There are three ways we can control the variance:

  • Maximize experimental variance: make sure that the differences between groups are real by using a manipulation check or maximizing the effect size.
  • Control extraneous variance: rule out confounds by making sure the groups are as similar as possible.
  • Minimize error variance: control your study with careful measurements.

Threats to Internal Validity

There are two major threats to internal validity with this design:

  • Differential Attrition: if too many participants drop out, your groups may be uneven, creating problems.
  • Communication between groups:
    • Diffusion: where participants from the no-treatment group start using techniques from the treatment group.
    • Compensatory Equalization: where the no-treatment group wants the treatment.
    • Compensatory Rivalry: where the no-treatment group tries harder than the treatment group to prove themselves.
    • Resentful Demoralization: where the no-treatment group resents the treatment group and performs poorly.

Applications

Two-Group Mean Difference: where we compare only two groups of participants and the researcher manipulates only one independent variable with only two levels (single-factor two-group design).

  • Often used when we compare treatment and control group.
  • Advantage: simple and provides best opportunity to maximize variance between groups.
  • Disadvantage: provides little data for comparison (only 2 data sets).

Comparing Means for More than Two Groups: where we compare multiple groups and the researcher manipulates one independent variable with several levels (single-factor multiple-group design).

  • Advantage: provides stronger evidence for real cause-and-effect relationship.
  • Disadvantage: the differences between adjacent groups become smaller, so we have to ensure we manipulate the independent variable sufficiently.

Comparing Proportions for Two or More Groups: when you use nominal or ordinal scales, you can’t compute a mean but you can compare proportions using a chi-square test.
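
As a minimal sketch of such a comparison (assuming SciPy is available; the counts are hypothetical), a chi-square test on a 2 × 2 table of proportions might look like this:

```python
# Comparing proportions of "improved" participants across two groups.
from scipy.stats import chi2_contingency

#        improved  not improved
table = [[30, 20],   # treatment group
         [18, 32]]   # control group

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```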

3.3. Within-Subject Designs

Within-Subject Design: where we use a single group of participants and test/observe each individual in all of the different treatments being compared. (Aka. repeated-measures design.) Scores are dependent.

  • Advantages: requires fewer participants and eliminates issues relating to individual differences.
    • Possible to measure differences between treatments without involving individual differences.
    • Possible to measure differences between individuals – these individual differences can be measured and removed.
    • Easier to detect a treatment effect with this design.
  • Disadvantages:
    • Time variables (fatigue, weather) can influence participants and threaten internal validity.
    • Participant attrition – counteract this by getting more participants at the outset than you think you’ll need.

There are two major threats to internal validity:

  • Confounding environmental variables (e.g., morning versus afternoon).
  • Confounding time variables (history, maturation, instrumentation, testing effects, and regression).

There are two types of order effects:

  • Carryover Effect: (treatment-specific experience) the participant develops new skills or strategies in one treatment that carry over to the next.
  • Progressive Error: (general experience) performance changes as the participant accumulates experience in the study:
    • Practice Effects: progressive improvement in performance as a participant gains experience.
    • Fatigue: progressive decline in performance as a participant becomes tired or bored.

Three ways to control time-related threats and order effects:

  1. Controlling Time: shortening the time between treatments can reduce time-related threats to internal validity but increases order effects.
  2. Switch to a Between-Subject Design: whenever you have reason to suspect that order effects will play a substantial role, switch to between-subject design
  3. Counterbalancing – Matching Treatments with Respect to Time: where we change the order in which treatments are given – the goal is to use every possible order of treatments with an equal number of participants in each sequence.
    • Note that this method doesn’t eliminate order effects but distributes the effects evenly across treatment groups
    • Limitations:
      • Counterbalancing and Variance: counterbalancing increases the variance within the treatment groups.
      • Asymmetrical Order Effects: counterbalancing doesn’t take into account that some treatments may be more demanding (have an unbalanced amount of order effects).
      • Counterbalancing and the Number of Treatments: n! (n factorial) is the number of sequences needed to get a complete counterbalance study; partial counterbalancing can be used (Latin square or randomly assigning treatment group orders).
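
A minimal sketch of the two options, with three hypothetical treatments: complete counterbalancing enumerates all n! orders, while a simple Latin-square-style rotation gives each treatment every ordinal position exactly once:

```python
# Complete vs. partial counterbalancing for hypothetical treatments A, B, C.
from itertools import permutations

treatments = ["A", "B", "C"]

# Complete counterbalancing: all n! = 6 possible orders.
complete = list(permutations(treatments))
print(len(complete), "orders:", complete)

# Partial counterbalancing via cyclic rotation (a simple Latin square):
# each treatment appears exactly once in each position.
latin_square = [treatments[i:] + treatments[:i] for i in range(len(treatments))]
print(latin_square)
```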

Applications

Two-Treatment Designs: where there are only two treatment conditions.

  • Advantages: easy to conduct and results are easy to understand, easy to counterbalance, easy to get a statistical difference.
  • Disadvantages: doesn’t demonstrate any functional relationship between the two data sets – we can’t know how gradual changes in the independent variable would affect the dependent variable.
  • Statistical Analysis: a repeated-measures t test or analysis of variance can be used (interval or ratio scales). For ordinal scales, the Wilcoxon test can be used.

Multiple-Treatment Designs:

  • Advantage: demonstrates stronger cause-and-effect relationship.
  • Disadvantage: too many treatments may not produce statistically significant differences, and more treatments increase the time needed to complete the study, which increases participant attrition.

3.4. Comparing Within- and Between-Subject Designs

There are three ways in which these designs differ significantly:

  • Individual differences: within-subject designs handle these well; between-subject designs are vulnerable to them.
  • Time-related factors and order effects: between-subject designs handle these well; within-subject designs are vulnerable to them.
  • Few available participants: within-subject designs work well; between-subject designs need more people.

3.5. Single-Subject Research Design

Single-subject research designs allow researchers to manipulate IVs and measure the DV with a single person (or small group) and infer a causal relationship. There is no control group.

  • Useful for studying unusual disorders and unique experiences, or when we’re concerned that group variation will hide individual effects. (Often used in clinical psychology.)
  • Flexible and you only need one (or a small group of) participant(s).

Classical Research Design: [baseline] – [treatment] – [post-treatment]

ABAB Reversal Design: [baseline] – [treatment] – [baseline] – [treatment]

  • In the treatment conditions, we’re looking for a full reversal of behaviour so we can conclusively say that the treatment is affecting the behaviour.
  • The baseline conditions act as the ‘control’.
  • This can’t be used for treatments that are expected to have long-lasting or permanent effects.

There are three Multiple-Baseline Designs:

  • Multiple-Baseline Across Subjects: where a baseline is established for multiple participants and the treatment is introduced at staggered times (only once per participant).
  • Multiple-Baseline Across Behaviours: where baselines for several behaviours are established for a single participant and treatments are staggered to determine whether the treatment had an effect.
  • Multiple-Baseline Across Situations: where baseline behaviours are established in two different situations and the treatments are staggered.
  • All of the multiple-baseline designs follow this pattern:
    Observation 1: [---baseline---]           [---treatment---]
    Observation 2: [---baseline---]                            [---treatment---]
  • Strengths: good for treatments with long-lasting effects.
  • Weaknesses: hard to discern between individual behaviours; treatments may produce a general effect.
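
As a rough illustration of the staggering pattern above (session counts and start points are invented), a multiple-baseline-across-subjects schedule delays when each participant’s treatment begins:

```python
# Staggered baseline (B) -> treatment (T) phases across three participants.
starts = {"S1": 3, "S2": 5, "S3": 7}  # hypothetical treatment start sessions
sessions = 9

for subject, start in starts.items():
    phases = ["B" if session < start else "T" for session in range(sessions)]
    print(subject, " ".join(phases))
```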

There are three other types of single-subject designs:

  1. Dismantling (Component-Analysis) Design: where each phase adds or subtracts one component of a complex treatment to determine how each component contributes to the overall treatment.
  2. Changing Criterion Design: where each phase is defined by a specific criterion that determines a target level of behaviour (e.g., 20 cigarettes a day for the first week, 10 for the next). To ensure there is a correlation between the treatment and the behaviour, researchers often include backward phases (“now have 30 cigarettes!”) to make sure the behaviour is causally related to the treatment.
  3. Alternating-Treatment Design: where two or more treatment conditions are randomly alternated from one observation to the next.

Single-subject designs have low external validity but high internal validity. We can generalize our results using systematic, clinical, and direct replication.

  • Systematic replication is where we test the effectiveness of a given treatment across a variety of behaviours (e.g., change the participant, setting, or behaviour).
  • Clinical replication is where we test the effectiveness of a combination of treatments for various behaviours, participants, and settings.
  • Direct replication is where we test the effectiveness of a given treatment for a single behaviour, multiple times. ABAB design uses this approach.

4. Quasi- and Non-Experimental

Quasi-experimental research makes some attempt to reduce threats to internal validity, whereas non-experimental research makes none. In both cases, the conditions are not created by manipulating an IV but are defined by pre-existing conditions (e.g., gender, age) or in terms of time (e.g., before and after). These designs look for differences between groups.

There are two categories of designs that can be classified as quasi- or non-experimental:

  • Between-subject designs, or non-equivalent group designs; and
  • Within-subject designs, or pre-post designs.

Non-equivalent group (between-subject) designs are so named because the researcher cannot control the assignment of individuals to a group.

  • All three design subtypes suffer from assignment bias.
  • The three design subtypes:
    • Differential Design (non-experimental): where participants are assigned to groups based on a particular participant characteristic (such as gender or race). The DV is compared between the groups to see if there is a consistent difference. Similar to correlational research.
    • Non-Equivalent Group Design (non-experimental): where the control and treatment groups are pre-existing and not randomly assigned.
      • Posttest-only Non-equivalent Control Group Design: where two non-equivalent groups are compared – they are measured at the same time but only one group receives the treatment.
        X O (treatment group)
            O (control group)
    • Pretest-Posttest Non-Equivalent Group Design (quasi-experimental): where two non-equivalent groups (treatment and control) are compared – the treatment group is measured before AND after the treatment, whereas the control group is measured twice (before and after) with no treatment in between.
      O X O (treatment group)
      O     O (control group)

      This design reduces assignment bias, limits threats from time-related factors, and provides evidence for a cause-and-effect relationship.
  • History effects that affect groups differently are called differential threats. Time-related threats may occur with all of the above as well.

Pre-post group designs (within-subject) measure the pretest and posttest scores of a group before and after treatment to determine if there’s an effect. There is no control group with this design.

There are two types of designs:

  • One-Group Pretest-Posttest Design (non-experimental): only one group is used
    O X O
  • Time-Series Design (quasi-experimental): multiple observations are made before and after the treatment. This is often used with predictable events (e.g., a change in law).
    O O O X O O O

These designs suffer from time-related threats to internal validity (history, instrumentation, testing effects, maturation, and statistical regression).

Developmental Research Designs: experiments that examine changes in behaviour that are related to age. There are three types:

  1. Cross-Sectional: (between-subject) different cohorts are measured at one point in time and then compared.
    • Cohorts: individuals born at roughly the same time and growing up in similar conditions.
    • Cohort Effects: differences between age groups caused by unique experiences other than age. (Be careful! Developmental differences may just be cohort effects.)
  2. Longitudinal: examines development by observing or measuring one group of cohorts over time.
  3. Cross-Sectional Longitudinal Designs: compare results obtained from separate samples that were obtained at different times (i.e., many cohorts, many times).
  Cross-Sectional
    • Strengths: time efficient; no long-term cooperation required; easier to generalize.
    • Weaknesses: individual changes are not assessed; cohort or generation effects.
  Longitudinal
    • Strengths: no cohort or generation effects; individual changes in behaviour can be assessed.
    • Weaknesses: time consuming; participant dropout may create bias; potential for practice effects; harder to generalize results.

5. Factorial Designs

Experimental factorial designs are studies that contain more than one IV (or more than one factor) and allow us to examine the interactive effects of IVs. This is good for external validity.

We use ANOVA for factorial designs. The design notation expresses the number of levels of each IV.

For example, if there are two IVs and the first IV has 2 levels and the second IV has 5 levels, it would be a 2 × 5 factorial design.

A Main Effect describes the individual effect of each IV on the DV.

  • On a graph, the x-axis variable has a main effect if the line of mean scores is not parallel to the x-axis (i.e., not flat).
  • The variable plotted as separate lines has a main effect if the lines’ means are not in the same place (the lines sit at different heights).

An Interaction Effect describes the combined effect of two or more IVs on the DV.

  • On a graph, there is an interaction if the lines are not parallel.
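
To make the graph rules concrete, here’s a minimal sketch with a hypothetical 2 × 2 table of cell means (IV A as rows, IV B as columns): the marginal means give the main effects, and the difference of differences indicates an interaction:

```python
# Main effects and interaction from a 2 x 2 table of hypothetical cell means.
means = [[10.0, 20.0],   # A1: (B1, B2)
         [12.0, 14.0]]   # A2: (B1, B2)

a1, a2 = sum(means[0]) / 2, sum(means[1]) / 2   # marginal means of A
b1 = (means[0][0] + means[1][0]) / 2            # marginal means of B
b2 = (means[0][1] + means[1][1]) / 2

print("main effect of A:", a1 - a2)   # nonzero suggests a main effect of A
print("main effect of B:", b1 - b2)
# Interaction: does the effect of B change across levels of A?
# (Graphically, non-parallel lines.)
print("interaction:", (means[0][1] - means[0][0]) - (means[1][1] - means[1][0]))
```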

[Figures: interaction and main effect for BAR; interaction and main effect for DRINK]

6. Correlational Designs

Correlational designs allow us to measure each participant on each variable and find a relationship between two variables (which is calculated using a correlational coefficient and a scatter plot). This design looks for a relationship between two or more variables.

There are three characteristics of a correlation:

  1. The direction can be upward (+r), downward (−r), or non-existent (r = 0).
  2. The nature of the relationship: linear (Pearson’s r) or curved (Spearman’s rho, ρ).
  3. The consistency or strength. The closer r is to 1, the more consistent the relationship (i.e., the closer the data points appear on the scatter plot).

To interpret a correlation:

  • The strength of the relationship is indicated by the coefficient of determination (r²). This tells us the proportion of variability in one variable that is predicted by the other.
    • For example, if the correlation coefficient r = .80, then r² = .64, which means that 64% of the differences are predictable.
    • Small (r = .10), medium (r = .30), large (r = .50); equivalently, small (r² = .01), medium (r² = .09), large (r² = .25).
  • The significance of the relationship is indicated by the statistical significance of the correlation, which means that the result is unlikely to have occurred by random chance.
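
To make the r and r² numbers concrete, here’s a quick sketch (requires Python 3.10+ for statistics.correlation; the data are invented):

```python
# Pearson's r and the coefficient of determination for two variables.
from statistics import correlation  # Python 3.10+

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]

r = correlation(x, y)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")  # r^2: proportion of variability predicted
```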

There are three applications of correlational designs:

  1. Prediction: if x and y are correlated, we can use the x value to predict the y value! We can use a regression line to plot the prediction (see the sketch after this list).
  2. Test-Retest Reliability: where we can see if our first and second tests are positively correlated.
  3. Concurrent Validity: where we can see if our test correlates positively with a well-tested test.
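
Following up on the prediction idea, a minimal sketch (again Python 3.10+; invented data) that fits a regression line and predicts y for a new x:

```python
# Using a regression line to predict y from x.
from statistics import linear_regression  # Python 3.10+

x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.9]

slope, intercept = linear_regression(x, y)
new_x = 6
print(f"predicted y for x = {new_x}: {slope * new_x + intercept:.1f}")
```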

There are strengths and weaknesses of correlational designs:

Strengths:

  • Describes relationships between variables
  • Nonintrusive – allows natural behaviour to emerge
  • High external validity

Weaknesses:

  • Cannot assess causality
  • Third-variable problem
  • Directionality problem
  • Low internal validity

PSYC 2001: Operational Definitions, Scales of Measurement, and Evaluating Measures

Hypothesis: a sentence that clearly states the relationship we expect to see between our IV(s) and DV(s). Variables are stated in operational definition terms.

1. Operational Definitions

Operational Definition (OD): describes your variables in terms of the way you will measure them.

  • Importance: ODs are necessary in order to replicate studies.
    Look at past research to see how variables are operationally defined.

Example of an OD: A nacho can be defined as: the result of covering a corn chip with cut vegetables (olives, red peppers, mushrooms, and onions) and shredded cheese, and baking it in the oven at 350°F for approximately 15 minutes.

Constructs: hypothetical mechanisms or attributes to explain behaviour (e.g., motivation, self-esteem).

Variables: a characteristic or condition that changes or has different values for different individuals; any measure or characteristic that we use in research.

Examples of Typical Psychological Variables:
Cognitive: IQ, memory, reading, language
Perceptual: vision, audition, tactile
Social: personality, self-esteem, attachment

There are three types of variables:

  • Independent Variables (IV): what we manipulate.
  • Dependent Variables (DV): what we analyze or measure.
  • Extraneous Variables (EV): any other variable (confound or not) that is not an IV or a DV.

There are two ways to categorize variables:

  • Categorical (qualitative): descriptive qualities that describe something (e.g., male/female, University attended)
  • Continuous (quantitative): quantifiable qualities that describe something (e.g., age, height)

    Note: we can often convert between the two. For example, I may be 22 (continuous) but I am also an adult (categorical). How you define your variables will depend on your study.

We can have any number and combination of IVs and DVs.

We can measure some variables directly (e.g., IQ can be measured using an IQ test), but others have to be measured using more creative techniques; unfortunately, when we can’t measure something directly, we can’t be sure the measurement is accurate. We can use constructs to help us measure behaviour.

2. Scales of Measurement

There are four ways to measure variables (NOIR):

  1. Nominal: categorizing variables in no particular order.
    • E.g., by gender (male/female/other), or religion (Jewish, Muslim, Hindu)
    • Statistics: chi-square; proportions, percentages, and mode.
  2. Ordinal: categorizing variables unequally along a continuum. (We can talk about magnitude.)
    • E.g., by satisfaction (1 through 7).
    • Statistics: Mann-Whitney U; proportions, percentages, mode, and median.
  3. Interval: categorizing variables equally along a continuum. There is no true zero.
    • E.g., (temperature) it can be 10°C today and 20°C tomorrow but that doesn’t mean that tomorrow will be twice as hot as today.
    • E.g., (IQ) if you have an IQ of zero, you don’t have an absence of intelligence.
    • Statistics: t-test or ANOVA; median, mean and standard deviation.
  4. Ratio: categorizing variables equally along a continuum with a true zero.
    • E.g., length, height, and reaction times can all have zero values. A 10 second reaction time is twice as fast as a 5 second reaction time.
    • Statistics: t-test or ANOVA; median, mean and standard deviation.
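
One way to keep the scale-to-statistics mapping straight is a simple lookup table; this sketch just restates the list above as data:

```python
# Descriptive statistics the notes list as appropriate for each NOIR scale.
PERMISSIBLE = {
    "nominal": ["proportions/percentages", "mode"],
    "ordinal": ["proportions/percentages", "mode", "median"],
    "interval": ["median", "mean", "standard deviation"],
    "ratio": ["median", "mean", "standard deviation"],
}

print(PERMISSIBLE["ordinal"])
```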

In Psychology, we normally use interval scales in tests and measurement.

3. Evaluating Measures

There are three ways to evaluate your measures: the reliability, effective range, and validity.

3.1. Reliability

Reliability: the consistency of a measurement. There are three ways to measure how reliable your study is:

  1. Inter-rater: the degree of agreement between two independent raters.
  2. Test-retest: the degree of consistency (in scores) over time.
  3. Internal Consistency: the degree to which all items of a test measure the same thing. This is often measured using split-half reliability, where scores on one random half of the items are correlated with scores on the other half.
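
A minimal sketch of split-half reliability (invented item scores; Python 3.10+). The notes don’t mention it, but the standard Spearman–Brown correction is usually applied to estimate full-test reliability from the half-test correlation:

```python
# Split-half reliability: correlate scores on odd vs. even items.
from statistics import correlation  # Python 3.10+

item_scores = [            # rows = participants, columns = items (0/1)
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

odd = [sum(row[0::2]) for row in item_scores]    # odd-numbered items
even = [sum(row[1::2]) for row in item_scores]   # even-numbered items

r_half = correlation(odd, even)
r_full = 2 * r_half / (1 + r_half)   # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, corrected reliability = {r_full:.2f}")
```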

There are three factors that influence the reliability of your study:

  1. Clarity: clarity of your operational definition;
  2. Follow-Through: The extent to which your operational definition is followed; and
  3. Observations: The number of observations the overall score is based on (more is better).

3.2. Effective Range

Effective Range: the range of scores we need to get a good idea of the performance on our dependent variable(s). We have to consider:

  • Characteristics of the Sample: formulate your study to match the characteristics of your sample. E.g., don’t ask blind people to watch a movie.

Scale Attenuation Effects: where the range is restricted – people above or below the effective range aren’t accurately measured; this distorts the scores, reducing reliability and validity. This can occur in two ways:

  1. Ceiling Effects: there’s an insufficient range at the top of the scale – most high scores are bunched together, preventing us from increasing scores.
  2. Floor Effects: there’s an insufficient range at the bottom of the scale – most low scores are bunched together, preventing us from decreasing scores.
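
A quick screening sketch (scores invented): if most scores pile up at either end of the scale, attenuation is a concern.

```python
# Screening for ceiling/floor effects on a hypothetical 0-10 scale.
scores = [9, 10, 10, 8, 10, 9, 10, 10, 7, 10]
low, high = 0, 10

at_ceiling = sum(s == high for s in scores) / len(scores)
at_floor = sum(s == low for s in scores) / len(scores)
print(f"at ceiling: {at_ceiling:.0%}, at floor: {at_floor:.0%}")
```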

3.3. Validity

There are four major types of validity (ESCI) and five others:

  1. External: the degree to which your study applies to the real world (i.e., generalizability).
    • Influenced by: quality of sample selection (e.g., random selection, representativeness).
  2. Statistical: the degree to which your decision about the null hypothesis is reasonable.
    • Influenced by: reliability of your measures, ability to meet assumptions of statistical tests used.
  3. Construct: the degree to which underlying theories fit with your results (and vice versa).
    • Influenced by: quality of rationalistic reasoning, consideration of possible theories.
  4. Internal: the degree to which your study accurately demonstrates that measured changes were a result of manipulating the IV.
    • Influenced by: the degree to which you controlled extraneous (confounding) variables.
    • We want to ensure:
      • One Explanation: there should be only one explanation for the results.
      • Groups Differ: the treatment and non-treatment groups differ.
      • Other: we have to consider that other variables may have an effect (e.g., environment, groups, and time variables).
  5. Face Validity: a measure superficially appears to measure what it claims to measure.
  6. Concurrent Validity: scores obtained from a new measure relate to scores obtained from an old, established measure. (Note: this always compares new and old.)
  7. Predictive Validity: scores obtained from a measure accurately predict behaviour according to a theory.
  8. Convergent Validity: strong relationship between scores obtained from two different methods of measuring the same construct.
  9. Divergent Validity: demonstrated by using two different methods to measure two different constructs. Then convergent validity must be shown for each of the two constructs. Finally, there should be little or no relationship between scores obtained for the two different constructs when they are measured by the same method. This shows that we’re measuring one construct, not two.  (Note: this always compares new methods.)

3.4. Confounding Variables

We must control our confounding variables only when:

  • A variable changes systematically with the IV; or
  • A variable influences the DV.

There are three types of confounding variables (or many threats to internal validity):

  1. Participant: each person has different demographics.
    • Subject Effects: people tend to be suspicious when they are in an experiment.
      • Demand Characteristics: where people interpret what the purpose of the experiment is and change their behaviour unconsciously.
      • Hawthorne Effect: people try harder because they’re in an experiment.
      • Social Desirability Bias: tendency to respond in a way that will be viewed favourably by others.
      • Control by using deception, placebos, and single/double-blind designs.
    • Selection Effects: use random assignment or match groups to ensure participants are the same before the treatment.
    • Attrition Effects: where participants drop out or die before completing the experiment.
  2. Environmental: experiments may be conducted in different environments.
    • Experimenter Effects: experimenters may introduce bias by interpreting the data in a way that supports their hypothesis.
      • Clever Hans: the horse that appeared to count but was actually responding to subtle cues from its audience.
      • Greenspoon Effect: an experimenter increased how often participants said plural nouns by subtly reinforcing them (e.g., saying “mmm-hmm”), demonstrating the experimenter effect.
      • Control by using single/double-blind designs, standardized instructions, and automated and objective measurements or multiple observers.
    • Diffusion of Treatment: where treatment effects spread from the treatment to the control group.
  3. Time: over time, time variables may influence the study.
    • Sequence/Order Effects: experience with other conditions influences performance in each condition.
      • Control by counterbalancing the order of conditions across participants.
    • Practice Effects: experience with a study can influence the performance.
      • Control with counterbalancing.
    • Maturation: participants get older and mature during the experiment.
    • History: events outside the study that influence a participant’s behaviour.
    • Instrumentation: where the instrument used to assess, measure, or evaluate the participant changes over the course of the study.
    • Regression to the Mean: where extreme scores tend to move toward the mean when measured again.

There are three ways to control confounds:

  1. Holding the variable constant: where we make sure the confound is the same for everyone.
    • Con: limits external validity.
  2. Matching: where we match the level of the confound across groups.
    • Participants: match by number/proportion or by average. (E.g., each group has 50% males and 50% females, or the average IQ of each group is 120.)
    • Environment: e.g., have each experimenter work with 50% of participants from each group.
    • Time: counterbalance the order of conditions, the time between groups, etc.
    • Con: can be intrusive to participants and time consuming (you have to measure the confound).
  3. Random Assignment: randomly assign participants to treatment and non-treatment groups.
    • Advantages: allows us to control many variables simultaneously.
    • Disadvantages: only works with large groups – make sure you do a check to compare groups on potential confounds because it may not be ‘random’.
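
A minimal sketch of random assignment with the suggested balance check (the participants and ages are invented):

```python
# Randomly assign participants to two groups, then check a potential confound.
import random
from statistics import mean

participants = [{"id": i, "age": random.randint(18, 65)} for i in range(40)]
random.shuffle(participants)

treatment, control = participants[:20], participants[20:]

# With small groups, "random" can still leave the groups unbalanced.
print("mean age (treatment):", mean(p["age"] for p in treatment))
print("mean age (control):  ", mean(p["age"] for p in control))
```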

PSYC 2001: Ethics in Research

1. Ethical Guidelines

Ethical guidelines are present in any type of research:

  • Human: focuses on protecting the rights of participants.
  • Animal: focuses on proper care and minimizing pain.
    • Animals cannot give informed consent.
    • This type of research tends to be more invasive.

1.1. Human Research Ethics

All research proposals must be approved by a Research Ethics Board (REB). The first safeguard is an informed consent form. Though we focus on protecting the rights of participants, ethics also applies to the storage of data, anonymity of files, adequate supervision, plagiarism, and honesty with data.

We also have to ensure that our research includes a diverse sample of people; that is, our studies should include all ages, sexes, and ethnicities. Historically, the majority of research was done on adult white males.

1.2. General Ethical Guidelines

There are many facets of the guidelines (this is just a sample):

  • Honesty
  • Objectivity
  • Carefulness
  • Openness
  • Respect for intellectual property
  • Confidentiality
  • Responsible publication
  • Social responsibility
  • Non-discrimination
  • Competence
  • Legality
  • Animal care
  • Human subjects protection

And there are many ethical principles (this is just a sample):

  • Protection from harm (physical, psychological, and social/economical)
  • Respect for human dignity and privacy
  • Minimize use of deception (active and passive)
  • Free and informed consent
  • Debriefing

Consent Forms: these are designed to inform the participant of their involvement in the research. When participants give their consent, it must be given intelligently, knowingly, and voluntarily:

  • Freely: informed consent must be voluntarily given, without manipulation, undue influence or coercion.
  • Maintained: no undue influence or coercion to continue the study
    • Participants always have the right to withdraw but they should also be informed of the consequences beforehand (e.g., what will happen to data already collected, and compensation)

1.3. Ethical Checks

  • Will the study have informational value?
  • Does the study pose risks to participants? If so, are there sufficient controls for those risks?
  • Is there a provision for informed consent?
  • Is there a provision for adequate feedback?
  • Do I accept full responsibility for the ethical conduct of the study?
  • Has the proposal been approved by the REB?

PSYC 2001: Introduction to Research

Normally I post notes in the order of the lecture but seeing as I took condensed lecture-book notes for the exam, they are in order of topic rather than lectures.

1. What is Research?

Inductive Reasoning is where you come up with a general theory from specific observations whereas Deductive Reasoning is where you use a general theory and apply it to specific observations.

Methods of Knowledge Acquisition:

  • Tenacity (habit or superstition)
  • Intuition (hunch or feeling)
  • Authority (expert)
  • Rationalism (reasoning)
  • Empiricism (direct sensory observation)
  • Science

Science is different from pseudoscience in that it must withstand rigorous tests.

The Scientific Method: first you develop interest through observation, form a tentative explanation (hypothesis), generate a testable prediction, test the prediction using systematic observations, and evaluate the hypothesis using observations.

  • Advantages: experimental findings and methods are precise and clear which makes it easier to replicate studies.
  • Good Science is empirical, public, and objective.

Basic research is done to advance our knowledge, whereas applied research is done to help people (for society). Applied research receives more funding and often builds on basic research.