# Whole Assignment

PSY 870: Module 3 Problem Set

GAF, Consumer Satisfaction, and Type of Clinical Agency (Public or Private)

A researcher wants to know if mental health clients of private versus public service agencies differ on Global Assessment of Functioning (GAF) scores and on Satisfaction with Services (Satisfaction). She has collected data for 34 clients from a private agency and for 47 clients of a public agency.

Directions: Use the SPSS data file for Module 3 (located in Topic Materials) to answer the following questions:

1. What is the independent variable in this study? What are the dependent variables?
2. The first step for the researcher will be to clean and screen the data. Please do this for the researcher and report your findings. Be sure to check the data for possible coding errors, and complete the screening to see whether the data meet the assumptions for parametric tests. Did you find any errors that the researcher made when setting up the SPSS data file (check the Variable View)? If so, what did you find? How did you correct it?
   HINT: Yes, one of the variables is incorrectly listed as scale.
3. Were there missing values on any of the variables? If so, what might you do about those for the independent variable? What about those for each of the dependent variables? Explain your reasoning.
   HINTS:
   - Yes, each variable has some missing data. Describe how many values (and what % of all values) are missing on each variable.
   - When considering what to do about the missing values on each variable, consider whether you really can guess what agency a person came from. Next, for the continuous variables, consider (1) what % of values are missing (if more than 5% are missing, what might this mean?) and (2) whether there is a pattern to the missing scores. Include information from the Output file of your SPSS Explore analyses to provide the specific number and % of missing values on each of the dependent variables. Based on this, what recommendation would you make for what to do about the missing values?
4. Did you find any outliers on the dependent variables that were due to errors of coding? If so, what and why? How would you correct an error of coding?
   HINT: One of the outliers on one continuous variable clearly is a coding error. Which one is that? What would be the best way to handle that outlier?
5. How might you deal with outliers that are not due to coding errors? Explain your reasoning.
   HINT: Use the information you have from your Output file from your Explore analyses to describe the outliers (e.g., how many outliers are there on each continuous variable; do they fall above and/or below the mean). What are ways to handle outliers on the continuous variables? Might there be some arguments against deleting outliers? What are these?
6. Check the descriptive statistics, histograms, stem-and-leaf plots, and the tests for normality that you obtained from your analyses (see the box to check in "Plots" when using Explore to analyze descriptive statistics of your data). Considering the skewness and kurtosis values, as well as the Shapiro-Wilk results (preferred for small sample sizes), did the distribution of scores on either of the dependent variables violate the assumption of normality? How can you tell from the information you obtained from your analyses?
   HINTS:
   - First, look at your histograms and stem-and-leaf plots to see if you observe marked skewness or other indicators of differences between the distribution of scores and the normal distribution.
   - Next, inspect the computed values for skewness and kurtosis for your variables from your analyses. Report these values in your answer for the continuous dependent variables. Which ones are greater than ±1.0? What does having a skewness or kurtosis value that is greater than ±1.0 tell you about normality? Then, discuss what having these kinds of values tells you about the normality of the distribution of scores on that variable.
   - Next, look at the Shapiro-Wilk tests of normality that you ran.
     Results with p < .001 indicate a violation of the normality assumption using this type of evaluation.

Solution: The frequencies of the categorical variables are given below.

There is only one continuous variable; its descriptive statistics, histogram, stem-and-leaf plot, and tests for normality are given below.

GAF Stem-and-Leaf Plot

| Frequency | Stem | Leaf |
| --- | --- | --- |
| 1.00 | Extremes | (=<16) |
| 3.00 | 3 | 123 |
| 6.00 | 3 | 999999 |
| 11.00 | 4 | 13344444444 |
| 14.00 | 4 | 56666666667999 |
| 16.00 | 5 | 1111133333444444 |
| 5.00 | 5 | 55555 |
| .00 | 6 | |
| 4.00 | 6 | 9999 |
| 1.00 | 7 | 1 |
| 12.00 | Extremes | (>=76) |

Stem width: 10. Each leaf: 1 case(s).

From the analysis above, the mean is 54.58 and the standard deviation is 21.899. The standard deviation is very high in this case. The skewness value is 4.257, which is greater than 0 and well beyond the ±1.0 guideline: the data are positively skewed, with most values concentrated to the left of the mean and extreme values to the right. The kurtosis value is 27.455. SPSS reports excess kurtosis, for which a normal distribution has a value of 0 (equivalent to a raw kurtosis of 3), so this value indicates a strongly leptokurtic distribution: more sharply peaked than a normal distribution, with values concentrated around the mean and heavier tails. This means a high probability of extreme values.

The histogram shows that the data are not normal and that there are some extremely high values and one extremely low value. The same pattern can be seen in the stem-and-leaf plot and in the box-and-whisker plot. In the normal probability plot, the points do not all fall on a straight line, which means that the data depart from normality. The Shapiro-Wilk test leads to the same conclusion: its p-value, reported as 0.0000 and therefore below the .001 threshold, indicates that the data are not normally distributed. We can conclude that the distribution of scores on the dependent variable violates the assumption of normality.

7. If in #6 you identified any distributions that violate the assumption of normality, what are some options you might use to try to correct the distribution to get closer to normality? (You do not need to do these steps. Just describe them.)
8. Write a sample results section discussing your data screening activity.
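
The screening checks described in questions 3 through 6 (percentage of missing values, out-of-range codes, boxplot outliers, skewness and kurtosis) can also be sketched outside SPSS. What follows is a minimal Python sketch using only the standard library; it is not the assignment's SPSS procedure. Several things here are assumptions: the `gaf` list is made-up illustrative data rather than the Module 3 file, the function names are hypothetical, the valid GAF range is taken to be 1 to 100, outliers and extremes are taken to lie beyond 1.5 and 3 interquartile ranges from the quartiles, and the bias-adjusted skewness and excess-kurtosis formulas are chosen to resemble what statistical packages typically report. The quartile method may differ slightly from the hinges SPSS uses for its boxplots.

```python
import math
import statistics

# Hypothetical GAF scores for illustration only; None marks a missing value.
gaf = [45, 51, None, 39, 46, 53, 44, 510, 49, None,
       55, 41, 47, 52, 43, 46, 50, 44, 48, 54]


def missing_summary(values):
    """Return (count, percent) of missing entries."""
    n_missing = sum(v is None for v in values)
    return n_missing, 100.0 * n_missing / len(values)


def out_of_range(values, low=1, high=100):
    """Return non-missing values outside the assumed valid range.

    Such values cannot be real scores, so they are candidate coding errors.
    """
    return [v for v in values if v is not None and not low <= v <= high]


def boxplot_flags(values):
    """Return (outliers, extremes) using 1.5*IQR and 3*IQR fences."""
    x = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    iqr = q3 - q1
    outliers = [v for v in x if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]
    extremes = [v for v in x if v < q1 - 3.0 * iqr or v > q3 + 3.0 * iqr]
    return outliers, extremes


def skew_kurtosis(values):
    """Return bias-adjusted sample skewness and excess kurtosis.

    Excess kurtosis is 0 for a normal distribution. Requires at least
    four non-missing values with nonzero variance.
    """
    x = [v for v in values if v is not None]
    n = len(x)
    mean = statistics.fmean(x)
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    g1 = m3 / m2 ** 1.5          # population skewness
    g2 = m4 / m2 ** 2 - 3.0      # population excess kurtosis
    skew = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return skew, kurt


count, pct = missing_summary(gaf)
print(f"missing: {count} ({pct:.1f}%)")
print("out of range:", out_of_range(gaf))
print("outliers, extremes:", boxplot_flags(gaf))
print("skewness, excess kurtosis:", skew_kurtosis(gaf))
```

In this made-up data, the value 510 falls outside the assumed valid range, so the range check flags it as a likely coding error before any outlier rule is applied. Separating the range check from the boxplot fences keeps impossible values, which call for correction against the source records, distinct from legitimate but extreme scores, which call for a judgment about retaining, transforming, or removing them.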
