Installation of IBM Open Platform with Apache Hadoop (version 4.0)


In the previous post I wrote about the importance of the Open Data Platform initiative. In this tutorial I will go over the steps of installing IBM Open Platform with Apache Hadoop on CentOS 6.7.

Prerequisites

  • You can get the IBM Open Platform from https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=iopah4. You will need to register for an IBMid but it’s a quick and straightforward process.

  • You will need a machine running CentOS 6.7. It can be a VM, as long as it has enough resources. According to IBM’s documentation you will need a minimum of 24 GB of RAM and a minimum of 80 GB of free disk space.
    I have successfully deployed the Open Platform with some of the value adds on top in a VM with 12 GB of RAM and I haven’t encountered any issues. This was, however, a test configuration, so I would recommend you follow the official documentation for production deployments. You should also note that CentOS is not officially supported: IBM Open Platform is certified only against Red Hat Enterprise Linux Server 6.
  • Note that the machine I am using in this tutorial is called iop.example.com and its IP address is 192.168.59.199.

System preparation

Start by making sure that all devices have assigned UUIDs:

[root@iop ~]# sudo blkid
/dev/sda1: UUID="4304f1a2-26e5-440f-8b27-5055c96fcbac" TYPE="ext4"
/dev/sda2: UUID="6Zn0RU-Zm13-zXbp-5RjD-sSJw-vPnf-suWzMq" TYPE="LVM2_member"
/dev/mapper/vg_centos-lv_swap: UUID="b9ceac5e-c057-4efe-ac80-c7595777588f" TYPE="swap"
/dev/mapper/vg_centos-lv_root: UUID="3db9ee1f-97fb-460b-b136-3873c42764b7" TYPE="ext4"
/dev/mapper/vg_centos-lv_home: UUID="9df29554-dfd0-4aba-b72e-c8b08607b6a9" TYPE="ext4"
[root@iop ~]#

Confirm that /etc/fstab is using the UUIDs for identifying the devices.

[root@iop ~]# cat /etc/fstab

#
# /etc/fstab
# Created by anaconda on Fri Jul 31 17:02:05 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=3db9ee1f-97fb-460b-b136-3873c42764b7 /                       ext4    defaults        1 1
UUID=4304f1a2-26e5-440f-8b27-5055c96fcbac /boot                   ext4    defaults        1 2
UUID=9df29554-dfd0-4aba-b72e-c8b08607b6a9 /home                   ext4    defaults        1 2
UUID=b9ceac5e-c057-4efe-ac80-c7595777588f swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
[root@iop ~]#
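
If you prefer not to eyeball the file, the check above can be scripted. This is only a minimal sketch: the function name non_uuid_devices is my own (it is not part of any IBM tooling), and it simply assumes that any entry whose device field starts with /dev/ is one you would want to convert to UUID= form.

```shell
# List fstab entries that reference a raw /dev/... path instead of UUID=...
# non_uuid_devices is a hypothetical helper name used for illustration.
non_uuid_devices() {
    # $1: path to an fstab-style file
    # Comment lines and pseudo filesystems (tmpfs, proc, ...) never have a
    # first field starting with /dev/, so they are skipped automatically.
    awk '$1 ~ /^\/dev\// { print $1, $2 }' "$1"
}
```

Running non_uuid_devices /etc/fstab against the file shown above should print nothing, because every disk-backed entry already uses a UUID.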

Confirm that /etc/hosts has an entry for your hostname and that you can resolve the name to an IP address.

[root@iop ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.59.199  iop     iop.example.com
[root@iop ~]# ping iop.example.com
PING iop (192.168.59.199) 56(84) bytes of data.
64 bytes from iop (192.168.59.199): icmp_seq=1 ttl=64 time=0.045 ms
64 bytes from iop (192.168.59.199): icmp_seq=2 ttl=64 time=0.042 ms
^C
--- iop ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1215ms
rtt min/avg/max/mdev = 0.042/0.043/0.045/0.006 ms
[root@iop ~]#

Make sure that the hostname command returns the correct long and short hostnames. Do not skip this check: without it you won’t be able to install the Big SQL value add later, because its monitoring script relies on getting the correct long and short hostname.

[root@iop ~]# hostname --long
iop.example.com
[root@iop ~]# hostname --short
iop
[root@iop ~]#
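
The hosts-file part of these checks can be automated with a few lines of shell. This is a rough sketch rather than an official check: hosts_entry_ok is a name I made up, and it only verifies that one line for the given IP address lists both the long and the short name.

```shell
# Succeed if the hosts file has a line for the IP that lists both names.
# hosts_entry_ok is a hypothetical helper name used for illustration.
hosts_entry_ok() {
    # $1: hosts file, $2: IP address, $3: long hostname, $4: short hostname
    awk -v ip="$2" -v fqdn="$3" -v shortname="$4" '
        $1 == ip {
            has_fqdn = 0; has_short = 0
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # ignore trailing comments
                if ($i == fqdn)      has_fqdn = 1
                if ($i == shortname) has_short = 1
            }
            if (has_fqdn && has_short) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

On the machine used here you would call it as hosts_entry_ok /etc/hosts 192.168.59.199 iop.example.com iop and test the exit status.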

Next we have to configure password-less login with SSH keys.

[root@iop ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
86:6e:bb:2a:5a:a3:a6:2e:2a:96:1f:62:b8:65:3c:a5 root@iop
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|       .         |
|    . . S        |
|.. o . .         |
|.oE.  o          |
|+Xoo.. .         |
|#o.o..o.         |
+-----------------+
[root@iop ~]#
[root@iop ~]# cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
[root@iop ~]# chmod 700 /root/.ssh
[root@iop ~]# chmod 640 /root/.ssh/authorized_keys
[root@iop ~]#
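
Note that the cat command above overwrites authorized_keys, which is fine on a fresh machine. If the file may already contain other keys, an append that skips duplicates is safer. The sketch below is one way to do it; authorize_key is a hypothetical helper name, and the 600 mode is my own (stricter) choice rather than the 640 used above.

```shell
# Add a public key to an authorized_keys file without overwriting it and
# without adding the same key twice.
# authorize_key is a hypothetical helper name used for illustration.
authorize_key() {
    # $1: public key file, $2: authorized_keys file
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    chmod 700 "$(dirname "$auth")"
    touch "$auth"
    # -F: fixed strings, -x: whole-line match, -f: read patterns from the key file
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}
```

For the setup in this tutorial the call would be authorize_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys.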

Make sure both the short and long hostnames are added to known_hosts.

[root@iop ~]# ssh root@iop.example.com date
The authenticity of host 'iop.example.com (192.168.59.199)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'iop.example.com,192.168.59.199' (RSA) to the list of known hosts.
Thu Aug  6 09:49:25 BST 2015
[root@iop ~]# ssh root@iop
The authenticity of host 'iop (192.168.59.199)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'iop' (RSA) to the list of known hosts.
Last login: Thu Aug  6 09:42:17 2015 from 192.168.59.3
[root@iop ~]#

Disable SELinux permanently by setting SELINUX to disabled in /etc/selinux/config.

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

Don’t forget to reboot the machine for the changes to take effect.
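
If you want to script the edit instead of opening the file in an editor, something along these lines should work. Treat it as a sketch: disable_selinux_in is a made-up function name, and it assumes the file already contains a line starting with SELINUX=.

```shell
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
# disable_selinux_in is a hypothetical helper name used for illustration.
disable_selinux_in() {
    # $1: path to the config file (normally /etc/selinux/config)
    cfg=$1
    tmp=$(mktemp) || return 1
    # ^SELINUX= does not match SELINUXTYPE=, so that line is left untouched.
    # Writing back with cat keeps the original file's ownership and mode.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return $status
}
```

After running disable_selinux_in /etc/selinux/config and rebooting, getenforce should report Disabled.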

Disable the firewall (again, the assumption is that this is an isolated test system).

[root@iop ~]# chkconfig iptables off
[root@iop ~]# service iptables stop
[root@iop ~]#

Next we have to disable Transparent Huge Pages. Append the following to /etc/rc.local:

if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi

To avoid another restart, we can also disable Transparent Huge Pages in the running OS:

[root@iop ~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
[root@iop ~]#
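
To confirm that the change took effect, you can read the file back: the kernel lists all possible modes and puts the active one in square brackets. A small sketch for extracting it follows; thp_mode is a hypothetical helper name, and the default path is simply the one used in this tutorial.

```shell
# Print the active Transparent Huge Pages mode, i.e. the bracketed word
# in a line such as "always madvise [never]".
# thp_mode is a hypothetical helper name used for illustration.
thp_mode() {
    # $1: optional path to the "enabled" file
    f=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$f"
}
```

Calling thp_mode with no arguments after the echo above should print never. On a kernel that exposes the setting under a different path, pass that path as the argument.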

We also have to disable IPv6. Set the following parameters in /etc/sysctl.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

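Use sysctl to load the new settings from sysctl.conf. The “unknown key” errors for the net.bridge.* parameters in the output appear when the bridge kernel module is not loaded and can be ignored here.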

[root@iop ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
[root@iop ~]#
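
If you are scripting the preparation, it helps to set these parameters in a way that can be re-run without creating duplicate lines. Below is a minimal sketch of that idea; set_sysctl_key is a name I invented, and it assumes the usual "key = value" layout of sysctl.conf.

```shell
# Set "key = value" in a sysctl-style file, replacing any previous
# assignment of the same key so that repeated runs stay idempotent.
# set_sysctl_key is a hypothetical helper name used for illustration.
set_sysctl_key() {
    # $1: config file, $2: key, $3: value
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line whose key (text before "=", whitespace removed) differs.
    awk -v k="$key" -F= '{ name = $1; gsub(/[ \t]/, "", name) } name != k' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_sysctl_key /etc/sysctl.conf net.ipv6.conf.all.disable_ipv6 1, repeated for the default and lo keys, followed by sysctl -p as shown above.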

Finally we have to configure and enable NTPD.

Make sure you’ve got a set of NTP servers listed in /etc/ntp.conf, then enable and start the ntpd daemon.

[root@iop ~]# cat /etc/ntp.conf |grep server |grep -v "#"
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
[root@iop ~]#
[root@iop ~]# chkconfig --add ntpd
[root@iop ~]# service ntpd start
Starting ntpd:                                             [  OK  ]
[root@iop ~]# chkconfig ntpd on
[root@iop ~]#

Make sure that the host is now synchronized:

[root@iop ~]# ntpstat
synchronised to NTP server (151.236.19.231) at stratum 4
   time correct to within 1065 ms
   polling server every 64 s
[root@iop ~]#
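
Right after ntpd starts, it can take a few minutes before ntpstat reports a synchronised clock. In a scripted setup you can poll for it, relying on the fact that ntpstat exits with status 0 only when the clock is synchronised. The helper below is a generic sketch; wait_for is a hypothetical name, and the attempt count and delay are arbitrary values you should tune.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# wait_for is a hypothetical helper name used for illustration.
wait_for() {
    # $1: maximum attempts, $2: seconds between attempts, rest: the command
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        attempts=$((attempts - 1))
        # Sleep only if another attempt will follow.
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example, wait_for 30 10 ntpstat || echo "clock is still not synchronised" would poll every 10 seconds for up to about 5 minutes.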

Installing additional packages

Let’s start by installing MySQL. It is required by the Hive metastore service, which needs a relational database to store metadata for Hive tables and partitions.

[root@iop ~]# yum install -y mysql-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  mysql.x86_64 0:5.1.73-5.el6_6                               perl-DBD-MySQL.x86_64 0:4.013-3.el6                               perl-DBI.x86_64 0:1.609-4.el6

Complete!
[root@iop ~]#

Enable MySQL to start on boot and start the service.

[root@iop ~]# chkconfig mysqld on
[root@iop ~]# service mysqld start
Initializing MySQL database:  Installing MySQL system tables...
OK
Filling help tables...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:

/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h iop password 'new-password'

Alternatively you can run:
/usr/bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd /usr/mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

                                                           [  OK  ]
Starting mysqld:                                           [  OK  ]
[root@iop ~]#

We also need to install and configure PostgreSQL, which hosts the Ambari database.

[root@iop ~]# yum install -y postgresql-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  postgresql.x86_64 0:8.4.20-3.el6_6                                                         postgresql-libs.x86_64 0:8.4.20-3.el6_6

Complete!
[root@iop ~]#
[root@iop ~]# chkconfig postgresql on
[root@iop ~]# service postgresql initdb
Initializing database:                                     [  OK  ]
[root@iop ~]#
[root@iop ~]# service postgresql start
Starting postgresql service:                               [  OK  ]
[root@iop ~]#

We also need the netcat utility and OpenJDK 1.7.0.

[root@iop ~]# yum install -y nc
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Installed:
  nc.x86_64 0:1.84-22.el6

Complete!
[root@iop ~]#
[root@iop ~]# yum install -y java-1.7.0-openjdk.x86_64 java-1.7.0-openjdk-devel.x86_64
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  giflib.x86_64 0:4.1.6-3.1.el6             jpackage-utils.noarch 0:1.7.5-3.14.el6  pcsc-lite-libs.x86_64 0:1.5.2-15.el6  ttmkfdir.x86_64 0:3.0.9-32.1.el6  tzdata-java.noarch 0:2015e-1.el6
  xorg-x11-fonts-Type1.noarch 0:7.2-11.el6

Complete!
[root@iop ~]#
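
Before moving on, it is worth a quick check that everything from this section actually landed on the PATH. Here is a small sketch for that; missing_commands is a made-up helper name, and the list of commands in the usage note is my assumption about what the packages above provide.

```shell
# Print each of the given command names that cannot be found on the PATH.
# missing_commands is a hypothetical helper name used for illustration.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

Something like missing_commands mysql psql nc java javac should print nothing if all the installations above succeeded.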

Installing IBM Open Platform 4.0

Copy the IOP-4.0.0.0.x86_64.rpm package, which you’ve downloaded from the IBM website, onto the host and install it.

[root@iop ~]# rpm -ivh IOP-4.0.0.0.x86_64.rpm
Preparing...                ########################################### [100%]
   1:IOP                    ########################################### [100%]
[root@iop ~]#

This package adds the IBM Open Platform repository to the local Yum configuration. We should now be able to use Yum to install Ambari.

[root@iop ~]# yum install -y ambari-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Installed:
  ambari-server.noarch 0:1.7.0_IBM-4

Complete!
[root@iop ~]#

Run ambari-server setup to configure Ambari. When prompted for the JDK, select Custom JDK and provide the path to the OpenJDK 1.7.0 JRE.

[root@iop ~]# ambari-server setup
Using python  /usr/bin/python2.6
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)?
Adjusting ambari-server permissions and ownership...
Checking firewall...
Checking JDK...
[1] - Open JDK 1.7
[2] - Custom JDK
==============================================================================
    Enter choice (1):2
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
Path to JAVA_HOME: /usr/lib/jvm/jre-1.7.0-openjdk.x86_64
Validating JDK on Ambari Server...done.
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)?
Default properties detected. Using built-in database.
Checking PostgreSQL...
Configuring local database...
Connecting to local database...done.
Configuring PostgreSQL...
Backup for pg_hba found, reconfiguration not required
Extracting system views...
.
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
[root@iop ~]#
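
The warning in the setup output says that JAVA_HOME must be valid on all hosts. A quick way to sanity-check a candidate path before typing it into the prompt is sketched below. The function name valid_java_home is hypothetical, and the check only looks for an executable bin/java, which is weaker than the validation Ambari itself performs.

```shell
# Succeed if the directory looks like a usable JAVA_HOME, i.e. it exists
# and contains an executable bin/java.
# valid_java_home is a hypothetical helper name used for illustration.
valid_java_home() {
    # $1: candidate JAVA_HOME directory
    [ -d "$1" ] && [ -x "$1/bin/java" ]
}
```

With the path used above: valid_java_home /usr/lib/jvm/jre-1.7.0-openjdk.x86_64 && echo "JAVA_HOME looks usable".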

Start Ambari.

[root@iop ~]# ambari-server start
Using python  /usr/bin/python2.6
Starting ambari-server
Ambari Server running with 'root' privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.
[root@iop ~]#

Open a web browser and connect to the IP address of the target host on port 8080 (http://192.168.59.199:8080 in this tutorial).

Log in as admin/admin (the default Ambari administrative account).

On the Welcome page, click Launch Install Wizard.

Provide a name for your IBM Open Platform cluster and click Next.

Select the BigInsights 4.0 services stack and click Next.

Provide the fully qualified name of your target host. Copy the contents of the SSH private key (/root/.ssh/id_rsa) and paste it in the Host Registration Information section. Click Register and Confirm.

Select the target host and click Next. The system will run a set of checks and after the validation completes you will be presented with a list of services you can deploy as part of the installation.

Keep the default selection of services and click Next.

You can’t actually reassign masters, as you’ve only got a single machine, so leave everything as it is and click Next.

Again, no role separation is possible in a single node installation so just click Next.

Select the individual tabs and provide the missing configuration information for Nagios, Hive, Oozie, and Knox. The missing values are the default administrative users and passwords. Once you’ve set credentials for all four, the red markers will go away and you can click Next.

Review the summary and click Deploy. This initiates the services deployment. Once the services are installed and configured they will start automatically.

Click Next.

Review the installation summary and click Complete.

Congratulations! You now have a working installation of the IBM Open Platform with Apache Hadoop. You can see the available services on the left-hand side of the Ambari console and the Dashboard tab provides a general overview of the system.