Sunday, April 4, 2010

Knowledge about the level of measurement helps not only in interpreting the data but also in selecting the appropriate statistical procedures to analyze the data.

The four levels of measurement, in ascending order of precision, are:
nominal, ordinal, interval and ratio.

  • Nominal data has no order and the use of numbers to classify categories is arbitrary. Example: 1=Honda, 2=Toyota, 3=Nissan, etc. can be used to classify different makes of Japanese car.
    Arithmetic operations (+, -, ÷, ×) cannot be performed on nominal data.

  • Ordinal data has order, but the interval between values is not interpretable. Rank data and Likert scales are ordinal data. Example: 1=High School, 2=Diploma, 3=Degree, 4=Post Graduate Degree may be used to represent the ranking of Academic Achievement.
    Arithmetic operations cannot be performed on ordinal data.

  • Interval data has order and the interval between values is interpretable. Temperatures in degrees Fahrenheit are interval data, and counts (integers) such as Years of Education are at least interval. Note that for interval data, ratios do not make sense and a zero value is not meaningful. For example, 40 degrees is not twice as hot as 20 degrees and 0 degrees does not mean ‘no temperature’.
    Only addition and subtraction can be performed on interval data.

  • Ratio data are interval data with a meaningful zero value. Examples are Age, Length, Weight and Income. Note that A, with a monthly income of RM4,000, earns twice as much as B, whose monthly income is only RM2,000. Also, RM0 means no income.
    All four arithmetic operations can be performed on ratio data.
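
To see why ratios do not make sense for interval data, here is a minimal Python sketch of the temperature example. The function names are our own, chosen only for illustration. It re-expresses the same two temperatures on other scales: the ‘twice as hot’ ratio changes with the scale, whereas a ratio of two differences does not.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvin (a scale with a true zero)."""
    return fahrenheit_to_celsius(f) + 273.15


hot, cold = 40.0, 20.0  # the two temperatures from the example, in Fahrenheit

# The same pair of temperatures gives a different ratio on every scale.
ratio_f = hot / cold
ratio_c = fahrenheit_to_celsius(hot) / fahrenheit_to_celsius(cold)
ratio_k = fahrenheit_to_kelvin(hot) / fahrenheit_to_kelvin(cold)

print(ratio_f)            # 2.0
print(round(ratio_c, 3))  # -0.667
print(round(ratio_k, 3))  # 1.042

# Differences are interpretable: a 20-degree gap is twice a 10-degree gap
# whether we measure in Fahrenheit or in Celsius.
gap_ratio_f = (40.0 - 20.0) / (30.0 - 20.0)
gap_ratio_c = (fahrenheit_to_celsius(40.0) - fahrenheit_to_celsius(20.0)) / (
    fahrenheit_to_celsius(30.0) - fahrenheit_to_celsius(20.0)
)
```

Only on the kelvin scale, which has a meaningful zero, is the ratio of the two temperatures itself meaningful, and there it is about 1.04 rather than 2.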

We can see that there is a hierarchy in the four levels of measurement, with nominal being the least precise and ratio the most precise. So when the phrase ‘at least ordinal’ is used, it refers to all levels except nominal.
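
The hierarchy can be sketched in Python as an ordered enumeration. This is only an illustration: the names Level, PERMITTED_OPERATIONS and is_at_least are our own and do not come from SPSS. The permitted operations simply restate the bullet points above.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Arithmetic operations that are meaningful at each level,
# as described in the list above.
PERMITTED_OPERATIONS = {
    Level.NOMINAL: set(),
    Level.ORDINAL: set(),
    Level.INTERVAL: {"+", "-"},
    Level.RATIO: {"+", "-", "*", "/"},
}


def is_at_least(level, minimum):
    """Return True if `level` sits at or above `minimum` in the hierarchy."""
    return level >= minimum


# 'At least ordinal' means every level except nominal.
at_least_ordinal = [lv.name for lv in Level if is_at_least(lv, Level.ORDINAL)]
print(at_least_ordinal)  # ['ORDINAL', 'INTERVAL', 'RATIO']
```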

Note: According to Garson, G.D. (1999), ‘Likert scales are very commonly used with interval procedures, provided the scale item has at least 5 and preferably 7 categories’.

Categorical Data and Continuous Data

It is customary to refer to nominal and ordinal data as categorical data and interval and ratio data as continuous data.

References

Garson, G.D. (2009). SPSS Tutorial. URL:http://faculty.chass.ncsu.edu/garson/PA765/datalevl.htm. Accessed: 2010-03-23. (Archived by WebCite® at http://www.webcitation.org/5oSViac3a)

Trochim, W.M.K. (2006). Web Center for Social Research Methods. URL:http://www.socialresearchmethods.net/kb/measlevl.php. Accessed: 2010-03-23. (Archived by WebCite® at http://www.webcitation.org/5oSUm5C7F)

Friday, April 2, 2010

Testing

Before you begin to analyze your data, it is important to check the assumptions associated with each statistical procedure. Many of the procedures in SPSS assume that the scores on each of the variables are normally distributed. This is because most of the test statistics, such as the t statistic, the F statistic and the r statistic (Pearson correlation coefficient), are derived theoretically from a normal distribution.

SPSS provides three methods for assessing normality:
  • Graphical methods – this involves examining the histogram, Q-Q plot and box plot
  • Descriptive methods – this involves examining the mean, median, mode, skewness and kurtosis
  • Tests of normality – SPSS gives two tests: the Kolmogorov-Smirnov test and the Shapiro-Wilk test. The Shapiro-Wilk test is recommended for sample sizes up to 2000; for sample sizes larger than 2000, use the Kolmogorov-Smirnov test. If the p-value given under the Sig. column is smaller than 0.05, the normality assumption is taken to be violated.
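
The rule of thumb for the tests of normality can be written down as a small Python sketch. The function names are our own, and the sample size and p-value in the usage lines are made-up numbers for illustration, not SPSS output.

```python
def recommended_test(sample_size):
    """Pick the normality test using the sample-size cut-off quoted above."""
    if sample_size <= 2000:
        return "Shapiro-Wilk"
    return "Kolmogorov-Smirnov"


def normality_violated(p_value, alpha=0.05):
    """Return True if the Sig. value falls below the chosen alpha level."""
    return p_value < alpha


# Hypothetical example: 150 cases and a Sig. value of 0.012.
print(recommended_test(150))      # Shapiro-Wilk
print(normality_violated(0.012))  # True
```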

A researcher should use all three methods to assess normality. If the distribution is found to be non-normal because it is positively or negatively skewed, several transformations can be used to correct the skew; these are called normalizing transformations.
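
For readers who want to see what the skewness and kurtosis figures measure, here is a minimal Python sketch using the simple moment-based formulas. This is an assumption on our part for the sake of a short example: SPSS reports bias-adjusted versions of these statistics, so its values may differ slightly, especially for small samples. The data lists are made up.

```python
def central_moment(values, order):
    """Return the central moment of the given order (dividing by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n


def skewness(values):
    """Moment-based skewness: > 0 means a longer right tail, < 0 a longer left tail."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return central_moment(values, 4) / m2 ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]  # a few large values: positive skew
left_tailed = [1, 8, 9, 9, 10, 10]  # a few small values: negative skew

print(skewness(symmetric))                   # 0.0
print(round(excess_kurtosis(symmetric), 2))  # -1.3
print(skewness(right_tailed) > 0)            # True
print(skewness(left_tailed) < 0)             # True
```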

Normalizing Transformations

A. For a positively skewed distribution, the suggested transformations are:
  • Square root
    Formula: new variable = SQRT(old variable)
  • Natural logarithm
    Formula: new variable = LN(old variable)
  • Inverse
    Formula: new variable = 1/(old variable)
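
Outside SPSS, the three formulas above can be sketched in Python as follows. The function names and the sample values are our own. Note that the square root needs non-negative values, the natural logarithm needs strictly positive values, and the inverse is undefined at zero.

```python
import math


def sqrt_transform(values):
    """new variable = SQRT(old variable); requires values >= 0."""
    return [math.sqrt(x) for x in values]


def ln_transform(values):
    """new variable = LN(old variable); requires values > 0."""
    return [math.log(x) for x in values]


def inverse_transform(values):
    """new variable = 1/(old variable); requires values != 0."""
    return [1 / x for x in values]


# Hypothetical positively skewed scores: most are small, one is large.
scores = [1, 4, 9, 100]

print(sqrt_transform(scores))     # [1.0, 2.0, 3.0, 10.0]
print(inverse_transform([2, 4]))  # [0.5, 0.25]
```

In the square-root example the largest score is pulled in from 100 to 10 while the smallest stays at 1, which is how the transformation shortens the right tail.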

B. For a negatively skewed distribution, the suggested transformations are:

  • Reflect and square root
    Formula: new variable = SQRT(K - old variable)
  • Reflect and natural logarithm
    Formula: new variable = LN(K - old variable)
  • Reflect and inverse
    Formula: new variable = 1/(K - old variable),

where K = (the largest value) + 1.
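
The reflected transformations can be sketched in Python in the same way. Again, the function names and the sample values are our own. Because K is the largest value plus 1, the smallest reflected value is 1, so the square root, logarithm and inverse are all defined.

```python
import math


def reflect(values):
    """Return K - old variable, where K = (the largest value) + 1."""
    k = max(values) + 1
    return [k - x for x in values]


def reflect_sqrt(values):
    """new variable = SQRT(K - old variable)."""
    return [math.sqrt(r) for r in reflect(values)]


def reflect_ln(values):
    """new variable = LN(K - old variable)."""
    return [math.log(r) for r in reflect(values)]


def reflect_inverse(values):
    """new variable = 1/(K - old variable)."""
    return [1 / r for r in reflect(values)]


# Hypothetical negatively skewed scores: most are high, one is low.
scores = [2, 7, 9, 10]

print(reflect(scores))  # [9, 4, 2, 1]  (K = 11)
```

Keep in mind that reflection reverses the order of the scores: the highest original score becomes the lowest reflected value, so the direction of any relationship has to be read the other way round when interpreting results for the square-root and logarithm versions.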

The transformed variables should again be assessed for normality. If none of the transformations works, stay cool: SPSS provides alternative non-parametric procedures.

References

Garson, G.D. (2010). SPSS Tutorial. URL: http://faculty.chass.ncsu.edu/garson/PA765/assumpt.htm#normal

Pallant, J. (2007). SPSS survival manual (3rd edn). Two Penn Plaza, New York, NY: McGraw-Hill.

Thursday, April 1, 2010
