Although the data science and big data buzzwords have been bandied about for years now, and although artificial intelligence has been talked about for decades, the two fields are irrevocably interrelated and interdependent. For one thing, the wide interest in data science started just as we were beginning to leverage distributed data storage and computation […]Read more "The Expert System Anachronism in the Data Science and AI Divergence"
A decade ago, Microsoft looked very different from the Microsoft we see today – it has been a remarkable transformation. One of the areas where MS have made a big push is machine learning and data analytics. Although the CRAN repository is going strong with >10,000 packages as of today, the MRAN repository (Microsoft’s Managed […]Read more "Azure ML Studio and R"
This may sound weird, but one sure way to not have perspective about the business in an innovative and constantly changing industry is to bury yourself within regular work. This is the meaning of the title – which comes from a book of the same name. By regular work, I mean work in which you […]Read more "Data Perspectives: “Orbiting The Giant Hairball”"
The insights we get from data depend on the quality of the data itself, and as the saying goes, “Garbage In, Garbage Out”. The volume of data doesn’t matter as much as its quality. Data quality and data quality assurance are therefore of growing importance in today’s Big Data arena. With the […]Read more "Quality and the Data Lifecycle"
A number of inferential statistical tests (A/B tests and significance tests) assume that the underlying data that we’re comparing come from a normal (Gaussian) distribution. However, this isn’t generally true for a number of data sets in practice. In order to use the tools that assume normality, we have to transform the data (and the limits […]Read more "Johnson Transformation for Non-Normal Data"