

Wednesday, December 29, 2010

PSYC 2001: Operational Definitions, Scales of Measurement, and Evaluating Measures

Hypothesis: a sentence that clearly states the relationship we expect to see between our IV(s) and DV(s). Variables are stated in terms of their operational definitions.

1. Operational Definitions

Operational Definition (OD): describes your variables in terms of the way you will measure them.

  • Importance: ODs are necessary in order to replicate studies.
    Look at past research to see how variables are operationally defined.

Example of an OD: A nacho can be defined as: the result of covering a corn chip with cut vegetables (olives, red peppers, mushrooms, and onions) and shredded cheese, and baking it in the oven at 350°F for approximately 15 minutes.

Constructs: hypothetical mechanisms or attributes to explain behaviour (e.g., motivation, self-esteem).

Variables: a characteristic or condition that changes or has different values for different individuals; any measure or characteristic that we use in research

Examples of Typical Psychological Variables:
Cognitive: IQ, memory, reading, language
Perceptual: vision, audition, tactile
Social: personality, self-esteem, attachment

There are three types of variables:

  • Independent Variables (IV): what we manipulate.
  • Dependent Variables (DV): what we analyze or measure.
  • Extraneous Variables (EV): any other variable (confound or not) that is not an IV or a DV.

There are two ways to categorize variables:

  • Categorical (qualitative): descriptive qualities that describe something (e.g., male/female, University attended)
  • Continuous (quantitative): quantifiable qualities that describe something (e.g., age, height)

    Note: we can often convert between the two. For example, I may be 22 (continuous) but I am also an adult (categorical). How you define your variables will depend on your study.

We can have any number and combination of IVs and DVs.

We can measure some variables directly (e.g., IQ can be measured using an IQ test), but others must be measured using more creative techniques; unfortunately, when we can't measure something directly, we can't be sure the measurement is accurate. Constructs can help us measure such behaviour.

2. Scales of Measurement

There are four ways to measure variables (NOIR):

  1. Nominal: categorizing variables in no particular order.
    • E.g., by gender (male/female/other), or religion (Jewish, Muslim, Hindu)
    • Statistics: chi-square; proportions, percentages, and mode.
  2. Ordinal: ordered categories along a continuum, with unequal (or unknown) intervals between ranks. (We can talk about magnitude.)
    • E.g., by satisfaction (1 through 7).
    • Statistics: Mann-Whitney U; proportions, percentages, mode, and median.
  3. Interval: categorizing variables equally along a continuum. There is no true zero.
    • E.g., (temperature) it can be 10°C today and 20°C tomorrow but that doesn’t mean that tomorrow will be twice as hot as today.
    • E.g., (IQ) if you have an IQ of zero, you don’t have an absence of intelligence.
    • Statistics: t-test or ANOVA; median, mean and standard deviation.
  4. Ratio: categorizing variables equally along a continuum with a true zero.
    • E.g., length, height, and reaction times can all have true zero values, and ratios are meaningful: a 10-second reaction time is twice as long as a 5-second one.
    • Statistics: t-test or ANOVA; median, mean and standard deviation.

In Psychology, we normally use interval scales in tests and measurement.
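The scale-to-statistic pairings above can be sketched with Python's standard `statistics` module; all of the data below are made-up illustrations:

```python
import statistics

religion = ["Jewish", "Muslim", "Hindu", "Muslim", "Muslim"]  # nominal
satisfaction = [3, 5, 5, 6, 2, 4, 7]                          # ordinal (1-7)
temp_c = [10, 20, 15, 25]                                     # interval (deg C)
reaction_s = [0.45, 0.52, 0.61, 0.48]                         # ratio (seconds)

print(statistics.mode(religion))         # nominal: mode is the only option
print(statistics.median(satisfaction))   # ordinal: median (and mode)
print(statistics.mean(temp_c), statistics.stdev(temp_c))  # interval: mean, SD
print(statistics.mean(reaction_s))       # ratio: mean, SD, and ratios

# Interval scales have no true zero, so ratios are not meaningful:
# 20 deg C / 10 deg C == 2, but on the kelvin scale the ratio is only
# about 1.035, so tomorrow is not "twice as hot" as today.
print(round((20 + 273.15) / (10 + 273.15), 3))
```

Ratio data such as reaction times, by contrast, do support statements like "twice as long".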

3. Evaluating Measures

There are three ways to evaluate your measures: reliability, effective range, and validity.

3.1. Reliability

Reliability: the consistency of a measurement. There are three ways to assess how reliable your measures are:

  1. Inter-rater: the degree of agreement between two independent raters.
  2. Test-retest: the degree of consistency (in scores) over time.
  3. Internal Consistency: the degree to which all items of a test measure the same thing. This is often measured using the split-half reliability test, where the scores taken from a random sample of questions match the scores taken from another sample of questions.
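The split-half idea can be sketched in Python. The item responses below are invented, and the final step uses the Spearman-Brown formula, r_full = 2r / (1 + r), which corrects for the fact that each half is only half the test's length:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical item responses (1 = correct, 0 = incorrect) for a six-item
# test taken by five participants; all numbers are invented.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
]

# Split each test into odd- and even-numbered items and total each half.
odd_scores = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]

r_half = pearson(odd_scores, even_scores)

# Spearman-Brown step-up: the half-test correlation underestimates the
# full-test reliability, so we adjust for test length.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_full, 2))  # 0.72
```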

There are three factors that influence the reliability of your study:

  1. Clarity: Clarity of your operational definition;
  2. Follow-Through: The extent to which your operational definition is followed; and
  3. Observations: The number of observations the overall score is based on (more is better).
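The third factor can be illustrated with a small simulation (all numbers are hypothetical): a score averaged over more observations varies less from study to study, i.e., it is more reliable.

```python
import random
import statistics

random.seed(1)  # fixed seed for a reproducible illustration

def mean_of_noisy_scores(n_obs):
    """Average n_obs noisy observations of a true score of 100 (SD 15)."""
    return statistics.mean(random.gauss(100, 15) for _ in range(n_obs))

# Simulate 500 "studies" that average 3 observations, and 500 that average 30.
few = [mean_of_noisy_scores(3) for _ in range(500)]
many = [mean_of_noisy_scores(30) for _ in range(500)]

# The 30-observation means cluster much more tightly around 100:
print(statistics.stdev(few) > statistics.stdev(many))  # True
```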

3.2. Effective Range

Effective Range: the range of scores we need to get a good idea of the performance on our dependent variable(s). We have to consider:

  • Characteristics of the Sample: formulate your study to match the characteristics of your sample. E.g., don't ask blind people to watch a movie.

Scale Attenuation Effects: where the range is restricted – people above or below the effective range aren’t accurately measured; this distorts the scores, reducing reliability and validity. This can occur in two ways:

  1. Ceiling Effects: there's an insufficient range at the top of the scale – high scores bunch together, so the measure cannot detect further increases or distinguish among high scorers.
  2. Floor Effects: there's an insufficient range at the bottom of the scale – low scores bunch together, so the measure cannot detect further decreases or distinguish among low scorers.
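A minimal sketch of a ceiling effect, assuming made-up "true" abilities and a hypothetical test whose maximum possible score is 85:

```python
import statistics

true_scores = [55, 60, 70, 80, 90, 95, 105, 110]  # hypothetical true abilities

# The test cannot score anyone above 85, so high scorers hit the ceiling
# and become indistinguishable from one another:
observed = [min(score, 85) for score in true_scores]
print(observed)  # [55, 60, 70, 80, 85, 85, 85, 85]

# The restricted range also shrinks the variability we can observe:
print(statistics.stdev(true_scores) > statistics.stdev(observed))  # True
```

A floor effect is the mirror image: replace `min(score, 85)` with a `max(score, minimum)` cutoff at the bottom of the scale.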

3.3. Validity

There are four major types of validity (ESCI) and five others:

  1. External: the degree to which your study applies to the real world (i.e., generalizability).
    • Influenced by: quality of sample selection (e.g., random selection, representativeness).
  2. Statistical: the degree to which your decision about the null hypothesis is reasonable.
    • Influenced by: reliability of your measures, ability to meet assumptions of statistical tests used.
  3. Construct: the degree to which underlying theories fit with your results (and vice versa).
    • Influenced by: quality of rationalistic reasoning, consideration of possible theories.
  4. Internal: the degree to which your study accurately demonstrates that measured changes were a result of manipulating the IV.
    • Influenced by: the degree to which you controlled extraneous (confounding) variables.
    • We want to ensure:
      • One Explanation: there should be only one explanation for the results.
      • Groups Differ: the treatment and non-treatment groups differ.
      • Other: we have to consider that other variables may have an effect (e.g., environment, groups, and time variables).
  5. Face Validity: a measure superficially appears to measure what it claims to measure.
  6. Concurrent Validity: scores obtained from a new measure relate to scores obtained from an old, established measure. (Note: this always compares new and old.)
  7. Predictive Validity: scores obtained from a measure accurately predict behaviour according to a theory.
  8. Convergent Validity: strong relationship between scores obtained from two different methods of measuring the same construct.
  9. Divergent Validity: demonstrated by using two different methods to measure two different constructs. Then convergent validity must be shown for each of the two constructs. Finally, there should be little or no relationship between scores obtained for the two different constructs when they are measured by the same method. This shows that we’re measuring one construct, not two.  (Note: this always compares new methods.)

3.4. Confounding Variables

A variable is a confound that we must control only when both of the following hold:

  • it changes systematically with the IV; and
  • it influences the DV.

There are three types of confounding variables (or many threats to internal validity):

  1. Participant: each person has different demographics.
    • Subject Effects: people behave differently because they know they are in an experiment.
      • Demand Characteristics: participants pick up cues about the purpose of the experiment and change their behaviour, often unconsciously.
      • Hawthorne Effect: people try harder because they’re in an experiment.
      • Social Desirability Bias: tendency to respond in a way that will be viewed favourably by others.
      • Control by using deception, placebos, and single/double-blind designs.
    • Selection Effects: use random assignment or match groups to ensure participants are the same before the treatment.
    • Attrition Effects: where participants drop out or die before completing the experiment.
  2. Environmental: experiments may be conducted in different environments.
    • Experimenter Effects: experimenters may introduce bias by interpreting the data in a way that supports their hypothesis.
      • Clever Hans: the horse that appeared to do arithmetic but was actually responding to unintentional cues from its questioners.
      • Greenspoon Effect: subtle verbal reinforcement from the experimenter (e.g., saying "mmm-hmm" after certain responses) increases the frequency of those responses.
      • Control by using single/double-blind designs, standardized instructions, and automated and objective measurements or multiple observers.
    • Diffusion of Treatment: where treatment effects spread from the treatment to the control group.
  3. Time: over time, time variables may influence the study.
    • Sequence/Order Effects: experience with other conditions influences performance in each condition.
      • Control by counterbalancing the order of conditions across participants.
    • Practice Effects: experience with a study can influence the performance.
      • Control with counterbalancing.
    • Maturation: participants get older and mature during the experiment.
    • History: events outside the study that influence a participant’s behaviour.
    • Instrumentation: where the instrument used to assess, measure, or evaluate the participant changes over the course of the study.
    • Regression to the Mean: extreme scores on a first measurement tend to fall closer to the mean when measured again.
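Counterbalancing, the control named under sequence/order and practice effects above, can be sketched with the standard library (condition names and group sizes are hypothetical):

```python
import itertools

conditions = ["A", "B", "C"]

# Full counterbalancing: use every possible order of the conditions,
# so order and practice effects average out across participants.
orders = list(itertools.permutations(conditions))
print(len(orders))  # 3! = 6 possible orders

# Assign 12 participants to the 6 orders in rotation:
assignment = {p: orders[p % len(orders)] for p in range(12)}
print(assignment[0])  # ('A', 'B', 'C')
```

With many conditions full counterbalancing becomes impractical (k conditions need k! orders), so partial schemes such as Latin squares are often used instead.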

There are three ways to control confounds:

  1. Holding the variable constant: where we make sure the confound is the same for everyone.
    • Con: limits external validity.
  2. Matching: where we match the level of the confound across groups.
    • Participants: match by number/proportion or by average. (E.g., each group has 50% males and 50% females, or the average IQ of each group is 120.)
    • Environment: (e.g., have each experimenter work with 50% of the participants from each group.)
    • Time: counterbalance the order of conditions, the time between groups, etc.
    • Con: can be intrusive to participants and time consuming (you have to measure the confound).
  3. Random Assignment: randomly assign participants to treatment and non-treatment groups.
    • Advantages: allows us to control many variables simultaneously.
    • Disadvantages: only works well with large groups – check that your groups are comparable on potential confounds, because chance can still leave them unbalanced.
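A minimal sketch of random assignment with the suggested confound check, using made-up IQ scores:

```python
import random
import statistics

random.seed(0)  # fixed seed for a reproducible illustration

iqs = [95, 100, 102, 98, 110, 105, 99, 101, 97, 108, 103, 96]  # hypothetical

# Shuffle the participants and split them into two equal groups:
shuffled = iqs[:]
random.shuffle(shuffled)
treatment, control = shuffled[:6], shuffled[6:]

# Balance check: with small groups, randomization can leave groups unequal
# by chance, so compare them on the potential confound before proceeding.
diff = statistics.mean(treatment) - statistics.mean(control)
print(len(treatment), len(control), round(abs(diff), 2))
```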
