I’m given to spurts of activity on Quora. Over the past year, I’ve had the opportunity to answer several questions there on the topics of data science, big data and data engineering. Some answers here are career-specific, while others are of a technical nature. Then there are interesting and nuanced questions that are always a […]Read more "Quora Data Science Answers Roundup"
One of the changes envisioned in the big data space is that there is the need to receive data that isn’t so much big in volume, as big in relevance. Perhaps this is a crucial distinction to make. Here, we examine business manifestations of relevant data, as opposed to just large volumes of data. What Managers Want From […]Read more "Big Data: Size and Velocity"
As a data science consultant that routinely deals with large companies and their data analysis, data science and machine learning challenges, I have come to understand one key element of the data scientist’s skill set that isn’t oft-discussed in data science circles online. In this post I hope to elucidate on the importance of domain […]Read more "Domain: The Missing Element in Data Science"
Data products are one inevitable result and culmination of the information age. With enough information to process, and with enough data to build massively validated mathematical models like never before, the natural urge is to take a shot at solving some of the world’s problems that depend on data. Data Product Maturity There are some […]Read more "Insights about Data Products"
Being data-driven in organizations is a bigger challenge than it is made out to be. For managers to suspend judgement and make decisions that are informed by facts and data is hard, even in this age of Big Data. I was spurred by a set of tweets I posted, to think through this subject. Decision […]Read more "“Small Data”and Being Data-Driven"
I recently came across this Hortonworks Data Flow presentation where the concept of the jagged edge of real time data analytics is discussed. The context that suits a discussion on this is to me, centred around prioritization of what comes in through the sensors (or other big data gathered from these “jagged edge” sources). This is a […]Read more "The “Jagged Edge” of Real Time Analytics"
The insights we get from data depend on the quality of the data itself, and as the saying goes, “Garbage In, Garbage Out”. The volumes of data don’t matter as much as the quality of the data itself. Data quality and data quality assurance are therefore of growing importance in today’s Big Data arena. With the […]Read more "Quality and the Data Lifecycle"