IBM Cloud Object Storage (or COS) is a highly scalable cloud storage service, designed for high durability, resiliency and security. It is..
Recently I had to deal with a dataset of hundreds of .tar.gz files dumped in an IBM Cloud Object Storage (IBM CleverSafe) accessible via a..
This is a quick tutorial on installing Jupyter and setting up the PySpark and the R kernel (IRkernel) for Spark development. The pre-reqs ..
It's been over a month since IBM released version 4.2 of their Hadoop distribution (BigInsights), so I decided to do a quick write-up on t..
It's been a couple of weeks since I got accepted into the closed beta testing programme for IBM Data Science Experience (DSX), and it is abo..
Here is a quick recording about Apache SystemML - the declarative large-scale machine learning platform.
If you feel like installing an..
Apache SystemML is a declarative, large-scale machine learning platform that provides automatic optimisation for custom machine learning a..
This is a brief note on doing some rough estimates for sizing Hadoop worker nodes.
This post is in no way exhaustive and there is much..
The Curse of Dimensionality, a term initially introduced by Richard Bellman, is a phenomenon that arises when applying machine learning alg..
I recently needed to generate some data as a function of an input variable, with some added Gaussian noise. This comes in handy when you wan..