Data scientists come in many shapes and sizes and are a diverse lot. More importantly, they can perform diverse functions in organizations and still qualify under the same criteria we use to define data scientists. In this cross-post from a Quora answer, I wish to elucidate the different kinds of […] Read more "Different Kinds of Data Scientists"
Data analytics and statistics haven’t historically been associated with the strategic decisions that leaders take in small and medium-sized businesses. Data analytics has been used in larger organizations for some years, and organizations with larger user bases are also benefiting from it, thanks to the use of big data to drive consumer and business insight in […] Read more "Data and Strategy for Small and Medium Organizations"
Introduction: The more advanced methods in statistics have generally been developed to answer real-world questions, and ANOVA is no different. How do we answer real-world questions, such as which route from home to work is easier on your daily commute? Or how would you know which air-conditioner to choose out of a […] Read more "Two Way ANOVA in R"
A quick experiment in R can reveal the impact of sample size on the estimates we make from data. A small number of samples gives us less information about the process or system from which we’re collecting data, while a large number can help ground our findings in near certainty. See the earlier post on […] Read more "Animated: Mean and Sample Size"
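The effect of sample size on an estimate of the mean can be sketched in a few lines. The sketch below is written in Python rather than the R the post uses, and everything in it is an assumption for illustration rather than something taken from the post: the function name `mean_spread`, the normal population with mean 10 and standard deviation 2, the sample sizes, and the seed.

```python
import random
import statistics


def mean_spread(sample_size, trials=500, mu=10.0, sigma=2.0, seed=42):
    """Spread (standard deviation) of the sample mean over repeated samples.

    Hypothetical helper: draws `trials` samples of `sample_size` values from
    an assumed normal population and measures how much their means vary.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# Larger samples should give sample means that cluster more tightly
# around the population mean.
for n in (5, 50, 500):
    print(n, round(mean_spread(n), 3))
```

Under these assumptions the spread of the sample mean should shrink roughly in proportion to one over the square root of the sample size, which is one way to read the claim that more samples ground an estimate more firmly.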
Linear systems are systems whose outputs change predictably, and in proportion, when there are small changes in the inputs to the system. Nonlinear systems are those that produce disproportionate results for proportional changes in the inputs. Both linear and nonlinear systems are common enough in nature and industrial processes, or more accurately, many industrial and natural processes can […] Read more "Animated Logistic Maps of Chaotic Systems in R"
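The logistic map named in the post's title is a compact example of such nonlinear behaviour. A minimal sketch follows, in Python rather than R, using the standard recurrence x_{n+1} = r * x_n * (1 - x_n). The function name `logistic_map`, the starting value, the step count, and the two values of r are assumptions chosen for illustration, not values from the post.

```python
def logistic_map(r, x0=0.2, steps=100):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


# For r = 2.5 the trajectory settles onto the fixed point 1 - 1/r.
settled = logistic_map(2.5)
print(round(settled[-1], 6))

# For r = 3.9 two almost identical starting points drift apart:
# a tiny change in the input produces a disproportionate change later on.
a = logistic_map(3.9, x0=0.2)
b = logistic_map(3.9, x0=0.2 + 1e-9)
print(max(abs(p - q) for p, q in zip(a, b)))
```

Comparing the two runs at r = 3.9 is a simple way to see sensitivity to initial conditions without plotting anything; an animated version would draw these same trajectories frame by frame.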
Not all data in this world is predictable in exactly the same way, of course, and not all data can be modeled using the Gaussian distribution. There are times when we have to compare data using one of many distributions that represent data which could show patterns other than the familiar and […] Read more "Comparing Non-Normal Data Graphically and with Non-Parametric Tests"
Businesses are increasingly using data to drive decision making, and often rely on hypothesis tests to do so. Hypothesis tests are used to differentiate between a pair of potential solutions, or to understand the performance of systems before and after a certain change. We’ve already seen t-tests and how they’re used to ascribe a range to […] Read more "Hypothesis Tests: 2 Sample Tests (A/B Tests)"