Module 6: Transforming data
Homework assignment: choosing a transformation
A random distribution of 10,000 values has the following characteristics:
Minimum value: 0.0186
First quartile: 0.4977
Median: 0.9822
Mean: 1.6280
Third quartile: 1.9480
Maximum value: 45.180
A. Is this distribution symmetric, left-skewed, or right-skewed? Calculate (HU-M)/(M-HL), that is, (the upper hinge minus the median) divided by (the median minus the lower hinge), to justify your answer.
B. Should you transform the value up or down the ladder of powers and roots to make the distribution symmetric?
C. You are choosing among five transformations to remove the skewness: X², √X, ln(X), 1/√X, and 1/X. Which transformation would you choose? Use the table below to justify your answer. You may eliminate some choices as moving the wrong way up or down the ladder of powers and roots.
Transformation | HL | Median | HU | (HU-M)/(M-HL)
X              |    |        |    |
X²             |    |        |    |
√X             |    |        |    |
ln(X)          |    |        |    |
1/√X           |    |        |    |
1/X            |    |        |    |
What do the values represent? Are the values given in the question values for X or for Y? And what is the equation for the relationship between X and Y? I'm confused about part B and the explanation for part C.
[NEAS: These are characteristics of a distribution. The random variable is denoted by X; there is no Y in this exercise.]
The values given are summary statistics of the distribution, not the distinct values of x and y. Those are not needed for this problem. The skewness of the distribution (positive or negative) determines whether you should ascend the ladder of powers or descend the ladder of powers to transform the distribution into something that is less skewed; see the gray callout box on p.57. For the chart, you are modeling different transformations of the summary statistics to see if the skew is increased or decreased. See the chart on the bottom of p.55 for an example.
[NEAS: Correct]
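The approach described above (apply each candidate transformation to the summary statistics, then recompute the hinge ratio) can be sketched in Python. This is only an illustration under stated assumptions: the first and third quartiles are used as the hinges, which should be nearly identical for 10,000 values; the dictionary labels and the helper name hinge_ratio are made up for this sketch; and the formula is applied to the transformed original hinges, which for the order-reversing transformations 1/√X and 1/X gives the same ratio as using -1/√X and -1/X so that the order of the values is preserved.

```python
import math

# Summary statistics quoted in the assignment.
# Assumption: the quartiles stand in for the hinges.
HL, M, HU = 0.4977, 0.9822, 1.9480

# Hypothetical labels for the candidate transformations in part C.
transforms = {
    "X": lambda x: x,
    "X^2": lambda x: x ** 2,
    "sqrt(X)": math.sqrt,
    "ln(X)": math.log,
    "1/sqrt(X)": lambda x: 1 / math.sqrt(x),
    "1/X": lambda x: 1 / x,
}

def hinge_ratio(f, hl=HL, m=M, hu=HU):
    """Return (f(HU) - f(M)) / (f(M) - f(HL)) for a monotone transformation f.

    For a decreasing f, numerator and denominator are both negative, so the
    ratio equals the one obtained from -f, which keeps the original order.
    """
    return (f(hu) - f(m)) / (f(m) - f(hl))

ratios = {name: hinge_ratio(f) for name, f in transforms.items()}

# Print one row per transformation, mirroring the last column of the table.
for name, r in ratios.items():
    print(f"{name:>10}: {r:.4f}")
```

A ratio above 1 means the upper half of the middle 50% is more spread out than the lower half (right skew); a ratio below 1 means the reverse. The transformation whose ratio lands closest to 1 is the natural candidate, and transformations that push the ratio further from 1 are moving the wrong way on the ladder.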
Quick clarification: positive skew is right-skewed, and negative skew is left-skewed, correct?
[NEAS: Yes.]
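The sign convention can be checked against the numbers in the assignment with a short Python sketch. It assumes the quartiles serve as the hinges; the function name skew_direction and the tolerance tol are hypothetical choices for illustration, not part of the course material.

```python
# Summary statistics quoted in the assignment.
# Assumption: the quartiles stand in for the hinges.
HL, M, HU, MEAN = 0.4977, 0.9822, 1.9480, 1.6280

# Hinge ratio from part A: (upper hinge - median) / (median - lower hinge).
ratio = (HU - M) / (M - HL)

def skew_direction(r, tol=0.05):
    """Classify a hinge ratio; tol is an arbitrary band around 1."""
    if r > 1 + tol:
        return "right-skewed (positive skew)"
    if r < 1 - tol:
        return "left-skewed (negative skew)"
    return "roughly symmetric"

print(f"hinge ratio = {ratio:.4f}: {skew_direction(ratio)}")

# A mean well above the median points the same way: a long right tail
# pulls the mean up more than it moves the median.
print(f"mean - median = {MEAN - M:.4f}")
```

The hinge ratio only looks at the middle half of the data, so comparing the mean with the median is a useful second check because it also reflects the tails.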
What can I do to deal with 1/X or 1/√X? Eliminate them? Thank you!