Thanks to a question on Quora, I’ve had the chance to explore the skewness of samples from symmetric distributions, before and after odd-exponent transformations such as $x \to x^3$. While the answer is posted there, I’d like to explore related odd transformations here and their effect on skewness. A simple experiment below reveals the impact of non-trivial (non-zero) means on the skewness of data under a cube transformation.

### Cube Transformations for Non-Trivial Means

    x1 <- rnorm(10^4, 100, 10)
    y1 <- x1^3
    hist(x1, breaks = 200, col = "light blue", main = "x1; Mean = 100, s.d. = 10")
    hist(y1, breaks = 200, col = "light green", main = "y1 = x1^3")
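
To put a number on what the histograms show, the sample skewness of both vectors can be computed directly. The following is a minimal sketch in base R. The helper `sample_skewness` is a hypothetical name introduced here, it uses the plain moment estimator (third central moment over the variance to the power 3/2), and its values may differ slightly from those of a package estimator. The seed is also an assumption, added only for reproducibility.

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2)
sample_skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)                      # assumed seed, only for reproducibility
x1 <- rnorm(10^4, 100, 10)
y1 <- x1^3

skew_x1 <- sample_skewness(x1)   # should sit close to 0 for a symmetric sample
skew_y1 <- sample_skewness(y1)   # should be clearly positive after cubing
print(c(x1 = skew_x1, y1 = skew_y1))
```

Comparing the two printed values separates the skew introduced by the transformation from the small sampling noise already present in `x1`.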

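From here it is apparent that a positive mean leads to right-skewed transformed data, even though x1 itself is symmetric. Significantly negative means produce the mirror image: since the cube is an odd function, the long tail moves to the left and the skewness changes sign.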
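
The negative-mean case can be sketched in the same way as the first experiment. The mean of -100, the seed and the names `x2` and `y2` are assumptions chosen here to mirror the example above, not values taken from the original run.

```r
set.seed(2)                        # assumed seed, only for reproducibility
x2 <- rnorm(10^4, -100, 10)        # assumed mirror of the first example
y2 <- x2^3

hist(x2, breaks = 200, col = "light blue", main = "x2; Mean = -100, s.d. = 10")
hist(y2, breaks = 200, col = "light green", main = "y2 = x2^3")

# Rough indicator of left skew: the mean is pulled below the median
print(c(mean = mean(y2), median = median(y2)))
```

The mean-versus-median comparison is only a quick heuristic, but it needs no extra package and points in the same direction as the histogram.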

### Odd-Power Transformations

Next, we explore how the skewness varies with increased exponents, for similar samples.

    # Exploring skewness in symmetric distributions
    # For mappings of random variable x -> x^3, x^5, x^7, x^9
    # Non-trivial positive values of mean
    library(e1071)  # provides skewness()

    # One row per mean: skewness of x and of each odd power of x
    rows <- lapply(seq(0, 10^3, 1), function(mu) {
      x <- rnorm(10^4, mu, 10)
      c(mu, skewness(x), skewness(x^3), skewness(x^5), skewness(x^7), skewness(x^9))
    })
    data <- as.data.frame(do.call(rbind, rows))
    colnames(data) <- c("mu", "skew_x", "skew_x3", "skew_x5", "skew_x7", "skew_x9")

    # Plotting x -> x^3
    plot(skew_x ~ skew_x3, data = data,
         main = "Skewness(x) vs. Skewness(y = x^3)",
         sub = "Nontrivial positive values of mean",
         col = "dark blue", pch = "*", xlim = c(-1, 40))
    abline(h = 0, v = 0, col = "red")

    # Plotting x -> x^5
    plot(skew_x ~ skew_x5, data = data,
         main = "Skewness(x) vs. Skewness(y = x^5)",
         sub = "Nontrivial positive values of mean",
         col = "dark green", pch = "*", xlim = c(-1, 40))
    abline(h = 0, v = 0, col = "red")

    # Plotting x -> x^7
    plot(skew_x ~ skew_x7, data = data,
         main = "Skewness(x) vs. Skewness(y = x^7)",
         sub = "Nontrivial positive values of mean",
         col = "dark red", pch = "*", xlim = c(-1, 40))
    abline(h = 0, v = 0, col = "red")

    # Plotting x -> x^9
    plot(skew_x ~ skew_x9, data = data,
         main = "Skewness(x) vs. Skewness(y = x^9)",
         sub = "Nontrivial positive values of mean",
         col = "purple", pch = "*", xlim = c(-1, 40))
    abline(h = 0, v = 0, col = "red")
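
For a more compact view of the same comparison, the skewness of each odd power can be tabulated at a handful of means instead of plotted across all of them. This is only a sketch: the three means, the seed and the helper `sample_skewness` are assumptions introduced here, and the helper is a plain moment estimator written in base R rather than the `e1071` function used above.

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2)
sample_skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(42)                       # assumed seed, only for reproducibility
mus       <- c(50, 100, 500)       # assumed illustrative means
exponents <- c(3, 5, 7, 9)

# Rows: one per mean; columns: one per odd exponent (same sample x per row)
skew_table <- t(sapply(mus, function(mu) {
  x <- rnorm(10^4, mu, 10)
  sapply(exponents, function(p) sample_skewness(x^p))
}))
dimnames(skew_table) <- list(paste0("mu=", mus), paste0("x^", exponents))
print(round(skew_table, 2))
```

Reading along a row shows how the skewness changes with the exponent for a fixed mean, and reading down a column shows how it changes with the mean for a fixed exponent.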

These patterns are what are called fat-tailed distributions, which are common in complex systems. Given that skewness is defined as the third standardized moment, $\gamma_1 = \mathbb{E}\left[\left((X - \mu)/\sigma\right)^3\right]$, it is understandable that this behaviour exists when we have higher values of . However, I wonder if this is the exact reason, or if there are deeper technical and statistical reasons behind this pattern. For one thing, you’d expect that x increases proportionally with as for symmetric distributions. Further, do non-trivial values of affect the occurrence of such fat tails? If you happen to know, please comment; I’d love to know more.
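
One possible way to reason about this, offered as a rough sketch rather than a full answer: write $X = \mu + \sigma Z$ with $Z$ standard normal, expand $Y = X^n$ (odd $n$) to second order around $\mu$, and keep only the leading term of the third central moment. This assumes $\sigma \ll |\mu|$, so it says nothing about means near zero. The symbols $Z$, $n$ and $\gamma_1$ are notation introduced here.

```latex
\begin{aligned}
Y = X^{n} &\approx \mu^{n} + n\mu^{n-1}\sigma Z
            + \tfrac{1}{2}\,n(n-1)\,\mu^{n-2}\sigma^{2} Z^{2}, \\
\operatorname{Var}(Y) &\approx n^{2}\mu^{2n-2}\sigma^{2}, \\
\mathbb{E}\!\left[(Y - \mathbb{E}Y)^{3}\right]
          &\approx 3\,n^{3}(n-1)\,\mu^{3n-4}\sigma^{4}, \\
\gamma_1(Y) &\approx \frac{3\,(n-1)\,\sigma}{\mu}.
\end{aligned}
```

For the opening example ($n = 3$, $\mu = 100$, $\sigma = 10$) this gives a skewness of roughly 0.6. Under this approximation the skewness grows with the exponent, takes the sign of the mean, and shrinks as $|\mu|$ grows relative to $\sigma$.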