It's been a couple of weeks since I got accepted in the closed beta testing programme for IBM Data Science Experience (DSX), and it is abo..
Here is a quick recording about Apache SystemML - the declarative large-scale machine learning platform.
If you feel like installing an..
Apache SystemML is a declarative, large-scale machine learning platform that provides automatic optimisation for custom machine learning a..
This is a brief note on doing some rough estimates for sizing Hadoop worker nodes.
This post is in no way exhaustive and there is much..
The Curse of Dimensionality, a term initially introduced by Richard Bellman, is a phenomena that arises when applying machine learning alg..
I recently needed to generate some data for as a function of , with some added Gaussian noise. This comes in handy when you wan..
In Part I we looked at Apache Hive, Cloudera Impala, and IBM Big SQL. We now continue with Pivotal HAWQ, Apache Drill, Spark SQL, Presto, ..
Apache Zeppelin is a multipurpose notebook for interactive data analytics. It supports Scala and Python (with Spark), SparkSQL, Hive, Shel..
Offloading relational data to HDFS is a common use-case for Hadoop known as "online archive". This approach allows companies to solve a qu..