Exploring Skewness for Odd-Exponent Transformations of Symmetric Distributions

Thanks to a question on Quora, I’ve had the chance to explore the skewness of samples from symmetric distributions, before and after odd-exponent transformations such as y = x^3. While the answer is posted there, I’d like to explore related odd transformations here and their effect on skewness. The simple experiment below shows how a mean far from zero affects the skewness of data under a cube transformation.

Cube Transformations for Non-Trivial Means

# Symmetric sample: 10^4 normal draws with mean 100 and s.d. 10
x1 <- rnorm(10^4, 100, 10)
# Cube transformation
y1 <- x1^3
hist(x1, breaks = 200, col = "light blue", main = "x1; Mean = 100, s.d. = 10")
hist(y1, breaks = 200, col = "light green", main = "y1 = x1^3")

[Figures 1 and 2: histograms of x1 and y1 = x1^3]
Right skewness is visible when samples with significant positive means are transformed.

From here it is apparent that a sample with a significant positive mean becomes right-skewed under the cube transformation. Significant negative means produce the mirror-image result:

[Figure 3: histogram of the cubed sample for a negative mean]
Left skewness is visible when samples with significant negative means are transformed.
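The two histograms can also be compared numerically. The following is a minimal sketch, not the code behind the figures: the mean of -100 for the negative case is an assumption (the post does not state the value used), and skew() is a hypothetical base R helper standing in for e1071's skewness().

```r
# Hypothetical helper: moment-based sample skewness in base R
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)  # fixed seed so the run is repeatable

# Same s.d. as above; the negative mean of -100 is an assumed value
x_pos <- rnorm(10^4, 100, 10)
x_neg <- rnorm(10^4, -100, 10)

# Skewness of the cubed samples
skew_pos <- skew(x_pos^3)
skew_neg <- skew(x_neg^3)

cat("skewness of x^3, mean  100:", skew_pos, "\n")
cat("skewness of x^3, mean -100:", skew_neg, "\n")
```

The sign of the skewness should follow the sign of the mean, matching the right and left tails in the histograms.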

Odd-Power Transformations

Next, we explore how the skewness varies with increased exponents, for similar samples.

# Exploring skewness in symmetric distributions
# under the mappings x -> x^3, x^5, x^7, x^9
# Means swept over non-negative values, 0 to 1000

library(e1071)  # provides skewness()
data <- data.frame()

for (mu in seq(0, 10^3, 1)) {
  x <- rnorm(10^4, mu, 10)  # symmetric sample: mean mu, s.d. 10
  y <- x^3
  y1 <- x^5
  y2 <- x^7
  y3 <- x^9
  # One row per mean: skewness before and after each transformation
  data <- rbind(data, c(mu, skewness(x), skewness(y),
                        skewness(y1), skewness(y2), skewness(y3)))
}

colnames(data) <- c("mu", "skew_x", "skew_x3", "skew_x5", "skew_x7", "skew_x9")

# Plotting x -> x^3
plot(skew_x ~ skew_x3, data = data,
 main = "Skewness(x) vs. Skewness(y = x^3)",
 sub = "Nontrivial positive values of mean",
 col = "dark blue",
 pch = "*",
 xlim = c(-1, 40)
 )
abline(h = 0, v = 0, col= "red")

# Plotting x -> x^5
plot(skew_x ~ skew_x5, data = data,
 main = "Skewness(x) vs. Skewness(y = x^5)",
 sub = "Nontrivial positive values of mean",
 col = "dark green",
 pch = "*",
 xlim = c(-1, 40)
)
abline(h = 0, v = 0, col= "red")

# Plotting x -> x^7
plot(skew_x ~ skew_x7, data = data,
 main = "Skewness(x) vs. Skewness(y = x^7)",
 sub = "Nontrivial positive values of mean",
 col = "dark red",
 pch = "*",
 xlim = c(-1, 40)
)
abline(h = 0, v = 0, col= "red")

# Plotting x -> x^9
plot(skew_x ~ skew_x9, data = data,
 main = "Skewness(x) vs. Skewness(y = x^9)",
 sub = "Nontrivial positive values of mean",
 col = "purple",
 pch = "*",
 xlim = c(-1, 40)
)
abline(h = 0, v = 0, col= "red")

[Figures: Skewness(x) vs. Skewness(y) for y = x^3, x^5, x^7 and x^9]
The chance of higher skewness increases with higher non-trivial positive values of the mean.
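The plots above put skewness on both axes, so the mean itself never appears on them. A short sketch that tabulates the skewness of x^3 directly against a few means may help in reading them. The particular means chosen here are arbitrary, and skew() is a hypothetical base R helper used in place of e1071's skewness() so the snippet runs without extra packages.

```r
# Hypothetical helper: moment-based sample skewness in base R
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(42)  # fixed seed so the run is repeatable

# A handful of means from the sweep above (arbitrary choice), s.d. fixed at 10
mus <- c(50, 100, 500, 1000)

res <- data.frame(
  mu = mus,
  skew_x3 = sapply(mus, function(mu) skew(rnorm(10^4, mu, 10)^3))
)
print(res)
```

Swapping the exponent 3 for 5, 7 or 9 gives the corresponding columns for the other transformations.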

These patterns resemble what are called fat-tailed distributions, which are common in complex systems. Given that skewness is defined as the third standardized moment, E\left[\left(\frac{x-\mu}{\sigma}\right)^3\right], it is understandable that this behaviour exists when we have higher values of \mu. However, I wonder if this is the exact reason, or if there are deeper statistical reasons behind this pattern. For one thing, you’d expect x to increase proportionally with \mu for symmetric distributions. Further, do non-trivial values of \sigma affect the occurrence of such fat tails? If you happen to know, please comment; I’d love to know more.
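One way to probe the question about \mu and \sigma is a small-noise expansion. What follows is only a sketch under stated assumptions: x = \mu + \sigma Z with Z standard normal, an odd exponent k, and |\mu| much larger than \sigma. The symbols Z, r, k, a and b are introduced here for illustration and are not part of the original experiment.

```latex
% Write the transformed variable in terms of r = \sigma / \mu
x^k = \mu^k (1 + rZ)^k, \qquad r = \frac{\sigma}{\mu}

% Expand to second order in r
(1 + rZ)^k \approx 1 + aZ + bZ^2, \qquad a = kr, \quad b = \binom{k}{2} r^2

% Leading-order variance and third central moment,
% using E[Z^2] = 1 and E[Z^4] = 3
\mathrm{Var} \approx a^2 = k^2 r^2
E\left[(aZ + b(Z^2 - 1))^3\right] \approx 3a^2 b \, E\left[Z^2 (Z^2 - 1)\right] = 6a^2 b = 3k^3 (k - 1) r^4

% Skewness is the ratio; the sign of \mu^k restores the sign of \mu for odd k
\mathrm{skew}(x^k) \approx \frac{6a^2 b}{|a|^3} \, \mathrm{sign}(\mu) = 3(k - 1) \frac{\sigma}{\mu}
```

For k = 3, \mu = 100 and \sigma = 10 this gives roughly 0.6. If the approximation holds, the sign of the skewness follows the sign of \mu, the size grows with both the exponent and \sigma, and it shrinks as |\mu| grows relative to \sigma. That would suggest the largest skewness values in the plots come from the small-\mu end of the sweep, where the expansion no longer applies, though this is worth checking against the simulation.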
