SPSS Tutorial #7: Preliminary Analysis using Descriptive Statistics in SPSS

After checking for and correcting errors in your dataset, the next important step before running your analysis is to conduct preliminary analysis to explore the nature of your data. One can conduct preliminary analysis using descriptive statistics or graphs. This post focuses on descriptive statistics in SPSS.

Descriptive statistics play two major roles:

  • They describe the nature of the data and the variables
  • They help to check for violation of assumptions behind the statistical techniques.
    • The assumptions vary from one statistical technique to another and it is important for a researcher to know them and to know how to check for their possible violation.

Descriptive statistics in SPSS using Codebook

The Codebook feature allows one to quickly get a summary of the data.

To perform the Codebook function, follow the procedure below:

  • Click Analyze menu > Reports > Codebook
  • In the Codebook dialogue box, click on the Variables tab, select the variables you want and move them to the Codebook Variables box.
  • In the same dialogue box, click on the Output tab and select Label, Value Labels and Missing Values from the Variable Information list.
  • In the same dialogue box, click on the Statistics tab and select all the options listed under Counts and Percents and under Central Tendency and Dispersion.
  • Click OK.

The procedure is demonstrated in the images below:

The output will show different statistics for categorical and numerical variables.

For categorical variables, only the count and percentages will be displayed.

For numerical variables, the mean, standard deviation and quartiles will be displayed.
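
To make the difference between the two kinds of summary concrete, here is a minimal Python sketch of the same idea. It is not SPSS code and does not reproduce the Codebook layout. The function name codebook_summary, the use of None to mark missing values, and the sample data are all assumptions made for illustration. Quartile values depend on the interpolation method, so they may differ slightly from what SPSS prints.

```python
from collections import Counter
from statistics import mean, quantiles, stdev


def codebook_summary(values, measurement):
    """Summarise one variable by its measurement level (hypothetical helper)."""
    # Assumption: None marks a missing value and is left out of the summary.
    valid = [v for v in values if v is not None]
    if measurement == "categorical":
        # Count and percentage (of valid cases) for each category.
        counts = Counter(valid)
        return {cat: (n, round(100 * n / len(valid), 1)) for cat, n in counts.items()}
    # Numerical variable: mean, sample standard deviation and quartiles.
    q1, q2, q3 = quantiles(valid, n=4)
    return {"mean": mean(valid), "std_dev": stdev(valid), "quartiles": (q1, q2, q3)}


# Made-up example data
gender = ["F", "M", "F", None, "F"]
age = [23, 35, 41, 29, 52]

print(codebook_summary(gender, "categorical"))
print(codebook_summary(age, "numerical"))
```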

Some examples are shown in the images below:

Descriptive statistics for categorical variables

Besides the Codebook, one can get information on categorical variables using the frequencies feature.

To do this, follow the procedure below:

  • Click Analyze menu > Descriptive Statistics > Frequencies
  • Select the variables of interest and move them to the variables box
  • Click OK.

This is demonstrated in the images below:

Interpreting the Frequencies output for categorical variables

The output from the Frequencies table has the following key information:

  • The frequency of cases in each category of the variable.
  • The percentage of cases in each category of the variable.
  • The number and percentage of valid and missing cases for the variable as a whole.
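
The figures listed above can be worked out by hand. The Python sketch below is a rough illustration rather than SPSS code: the function name frequency_table, the use of None for missing responses, and the sample data are assumptions. It also reports a valid percent (based on non-missing cases only) and a running cumulative percent alongside the overall percent.

```python
from collections import Counter


def frequency_table(values, missing=(None,)):
    """Build a simple frequency table for one categorical variable (hypothetical helper)."""
    total = len(values)
    valid = [v for v in values if v not in missing]
    counts = Counter(valid)

    rows = []
    cumulative = 0.0
    for category in sorted(counts):
        n = counts[category]
        valid_pct = 100 * n / len(valid)  # share of non-missing cases
        cumulative += valid_pct
        rows.append({
            "category": category,
            "frequency": n,
            "percent": round(100 * n / total, 1),  # share of all cases
            "valid_percent": round(valid_pct, 1),
            "cumulative_percent": round(cumulative, 1),
        })

    n_missing = total - len(valid)
    summary = {
        "valid": len(valid),
        "missing": n_missing,
        "missing_percent": round(100 * n_missing / total, 1),
    }
    return rows, summary


# Made-up example data: eight respondents, two of whom did not answer
responses = ["Yes", "No", "Yes", None, "Yes", "No", None, "Yes"]
rows, summary = frequency_table(responses)
for row in rows:
    print(row)
print(summary)
```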

Descriptive statistics for continuous variables

Besides the Codebook, you can also use the Descriptives feature in SPSS to obtain more information about continuous variables.

To do this, follow the procedure below:

  • Click on Analyze menu > Descriptive Statistics > Descriptives
  • From the dialogue box that opens, select all the continuous variables of interest and move them to the variables box
  • Click on Options
  • Tick the mean, standard deviation, minimum, maximum, skewness and kurtosis
  • Click Continue > OK.

This is demonstrated in the images below:

Interpreting the Descriptives output for continuous variables

The output has the following key information:

  • The number of valid cases for each variable.
  • The minimum and maximum values for each variable.
  • The mean for each variable.
  • The standard deviation for each variable.
  • The skewness statistic and its standard error for each variable.
    • Skewness measures how asymmetrical the distribution of the data is.
    • There are three types of skewness: left (or negative) skewness, right (or positive) skewness, and zero skewness.
    • Left (negative) skewness: most data points are concentrated on the right, with the longer tail extending towards the left of the distribution graph.
    • Right (positive) skewness: most data points are concentrated on the left, with the longer tail extending towards the right of the distribution graph.
    • Zero skewness: data points are distributed symmetrically around the mean. A normal distribution has zero skewness, but symmetry alone does not make a distribution normal.
  • The kurtosis statistic and its standard error for each variable.
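    • Kurtosis measures the peakedness of the data and the heaviness of its tails, that is, how sharply peaked or flat the distribution is compared with a normal distribution.
    • SPSS reports kurtosis relative to the normal distribution, which has a value of zero, so the statistic can be positive or negative.
    • Positive kurtosis indicates a sharply peaked distribution with heavy tails, that is, more data points in the extreme tails than a normal distribution would have.
    • Negative kurtosis indicates a flatter distribution with light tails, that is, fewer data points in the extreme tails than a normal distribution would have.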
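
For readers who want to see how these statistics are calculated, below is a minimal Python sketch, not SPSS code. It uses the sample-adjusted skewness and kurtosis formulas that are commonly documented for SPSS, with kurtosis expressed relative to a normal distribution (zero). The function name descriptives, the use of None for missing values, and the sample data are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def descriptives(values):
    """Compute descriptive statistics for one continuous variable (hypothetical helper)."""
    # Assumption: None marks a missing value and is excluded.
    x = [v for v in values if v is not None]
    n = len(x)
    if n < 4:
        raise ValueError("at least 4 valid cases are needed to compute kurtosis")
    m, s = mean(x), stdev(x)  # stdev is the sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("skewness and kurtosis are undefined when all values are equal")

    # Sums of the third and fourth powers of the standardised values.
    z3 = sum(((v - m) / s) ** 3 for v in x)
    z4 = sum(((v - m) / s) ** 4 for v in x)

    # Sample-adjusted skewness and excess kurtosis.
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3)) * z4 \
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

    # Standard errors depend only on the number of valid cases.
    se_skewness = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurtosis = sqrt(4 * (n ** 2 - 1) * se_skewness ** 2 / ((n - 3) * (n + 5)))

    return {
        "n": n, "minimum": min(x), "maximum": max(x),
        "mean": m, "std_dev": s,
        "skewness": skewness, "se_skewness": se_skewness,
        "kurtosis": kurtosis, "se_kurtosis": se_kurtosis,
    }


# Made-up example data: one large value pulls the tail to the right
scores = [1, 1, 2, 2, 3, 10]
for name, value in descriptives(scores).items():
    print(name, round(value, 3))
```

In this example the single large value stretches the tail to the right, so the skewness statistic comes out positive.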

In conclusion, this post has explained and illustrated how to conduct preliminary analysis using descriptive statistics in SPSS. Descriptive statistics for both categorical and continuous variables have been explained using the appropriate SPSS functions for each variable type. Preliminary analysis is a crucial step before conducting any advanced data analysis.

Grace Njeri-Otieno

Grace Njeri-Otieno is a Kenyan, a wife, a mom, and currently a PhD student, among many other balls she juggles. She holds Bachelor's and Master's degrees in Economics and has more than 7 years' experience with an INGO. She was inspired to start this site so as to share the lessons learned throughout her PhD journey with other PhD students. Her vision for this site is "to become a go-to resource center for PhD students in all their spheres of learning."
