Multiple Regression in Dissertation & Thesis Research


     For your dissertation or thesis, you might want to see if your variables are related, or correlated. A correlation indicates the size and direction of any relationship between variables. If, however, your hypothesis involves prediction (such as variables "A", "B", and "C" predict variable "D"), then regression is the analysis you will use.

     If you have only one independent variable and one dependent variable, you would use a bivariate linear regression (the straight line that best fits your data on a scatterplot) for your analysis. When your research involves more than one independent variable and you want to see whether they predict one dependent variable, you can use a multiple regression equation, although we won't discuss the mathematical equation here. (Multiple regression is sometimes loosely called "multivariate" regression, but that term more properly describes an analysis with more than one dependent variable.)
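
     For readers who would like to see the bivariate case in concrete form, here is a minimal Python sketch of fitting the best-fitting straight line by least squares. The data values and the function name `fit_line` are made up for illustration; your statistical software performs the same calculation for you.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one independent variable (x), one dependent variable (y)
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)
print(f"predicted y = {intercept:.2f} + {slope:.2f} * x")
```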

Types of Regression Analysis

     There are several types of regression analysis -- simple (also called simultaneous or standard), hierarchical, and stepwise -- and the one you choose will depend on the variables in your research and the theory behind them. The big difference between these types of regression analysis is the way the variables are entered into the regression equation when analyzing your data. (Note: In most statistical software packages, you simply select the type of regression you want to use for your analysis from a drop-down menu.)

     In a simple regression analysis, all of your predictor variables are entered together. The statistical software will treat each of the predictor/independent variables as though it had been entered after all of the other predictor variables, so each one is judged by what it adds beyond the rest. To use a hierarchical regression in your analysis, you must tell the statistical software the order in which to enter your predictor variables into the regression equation. In a stepwise regression, the order in which your predictor variables are entered is decided by the software on statistical grounds, not by the theory on which your dissertation is based.

     To determine which of these regressions you should use to analyze your data, you must look to the underlying question or theory on which your dissertation or thesis is based. If your paper is based on a theory that suggests a particular order in which your predictor variables should be entered, then use a hierarchical regression for the analysis.

     If your theory doesn't suggest a clear order of entry for your predictor variables, then use a simple regression for your analysis. For reasons we won't go into here, it is not normally recommended that you analyze your data using a stepwise regression, as it often capitalizes on chance, and your results may not generalize to other similar samples.

     To illustrate these regression analyses, let's say that your research has led you to believe that alcohol use, socioeconomic status, and education (independent variables) are related to the incidence of child abuse (dependent variable). Your dissertation hypothesizes that these three variables predict the incidence of child abuse. From your research, you learn that there is a strong correlation between alcohol use and the incidence of child abuse. Your research also has indicated that socioeconomic status is correlated with child abuse, but not as much as alcohol use. Let's say that your research did not provide any clear evidence that education was related to child abuse, but you think it is.

     Based on your research, an order of entry is suggested for your analysis, so you would use a hierarchical regression. As your research has indicated that alcohol use is the strongest predictor of child abuse, you would enter that predictor variable into the regression equation first. Since your research suggests that socioeconomic status also contributes to child abuse, but not as much as alcohol use, you would enter that predictor variable next. Given that your research didn't produce any indication that education was related to child abuse, you would enter that predictor variable last. The incidence of child abuse would be entered as your dependent variable.
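
     The order of entry described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not a substitute for your statistical software: the data are entirely synthetic (generated from made-up coefficients), and the helper names `solve` and `r_squared` are hypothetical. At each step the sketch reports R-squared and how much it increased when the new predictor was added.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(predictors, y):
    """Regress y on the given predictor columns (plus an intercept); return R-squared."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, standardized scores; the coefficients below are invented for illustration.
rng = random.Random(0)
n = 200
alcohol = [rng.gauss(0, 1) for _ in range(n)]
ses = [rng.gauss(0, 1) for _ in range(n)]
education = [rng.gauss(0, 1) for _ in range(n)]
abuse = [0.6 * a + 0.3 * s + 0.1 * e + rng.gauss(0, 1)
         for a, s, e in zip(alcohol, ses, education)]

# Hierarchical entry: the researcher, not the software, fixes this order.
steps = [
    ("Step 1: alcohol use", [alcohol]),
    ("Step 2: + socioeconomic status", [alcohol, ses]),
    ("Step 3: + education", [alcohol, ses, education]),
]

r2_by_step = []
previous_r2 = 0.0
for label, predictors in steps:
    r2 = r_squared(predictors, abuse)
    print(f"{label}: R^2 = {r2:.3f}, change = {r2 - previous_r2:.3f}")
    r2_by_step.append(r2)
    previous_r2 = r2
```

     Because each step keeps every predictor from the step before, R-squared can only stay the same or rise; the question of interest is whether the rise at a given step is larger than chance would produce.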

     After you enter all your variables and run the analysis, your statistical software package should provide a significance value (p-value) for each step. Using your preset alpha level (usually .05 or .01), you can determine whether your results are significant. If the p-value for the final step, in which education was entered, is less than your alpha level, then education level is a significant predictor of child abuse, even when your other variables (alcohol use and socioeconomic status) are accounted for.
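
     As a rough sketch of where such a p-value can come from, the Python below computes the standard F test for the change in R-squared when a predictor is added, then compares the p-value with alpha. The R-squared values, sample size, and function names are hypothetical, chosen only for illustration. The F-distribution tail probability is computed with a textbook continued-fraction routine for the incomplete beta function so that the example needs nothing beyond the standard library; your statistical software reports this p-value directly.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_survival(f, df1, df2):
    """Upper-tail probability P(F > f) for an F distribution with (df1, df2) df."""
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


def r2_change_test(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic and p-value for the R-squared gained by adding k_added predictors."""
    df1 = k_added
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, f_survival(f, df1, df2)


# Hypothetical values: R-squared before and after education is entered last.
alpha = 0.05
f_stat, p_value = r2_change_test(r2_reduced=0.30, r2_full=0.33, n=200, k_full=3, k_added=1)
significant = p_value < alpha
print(f"F change = {f_stat:.2f}, p = {p_value:.4f}, significant at {alpha}: {significant}")
```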

     If your research did not indicate that any of your independent variables (alcohol use, socioeconomic status, education) were related to your dependent variable (child abuse), then there is no clear theory on which your dissertation is based to dictate the order in which you should enter these variables into the regression equation. If this is the case, then use a simple regression for the analysis.
