In the previous post I wrote about the importance of the Open Data Platform initiative. In this tutorial I will go over the steps of installing IBM Open Platform with Apache Hadoop on CentOS 6.7.
- You can get the IBM Open Platform from https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=iopah4. You will need to register for an IBMid but it’s a quick and straightforward process.
- You will need a machine with CentOS 6.7. It can be a VM, as long as it has enough resources. According to IBM’s documentation you will need a minimum of 24 GB of RAM and a minimum of 80 GB of free disk space.
I have successfully deployed the Open Platform with some of the value adds on top in a VM with 12 GB of RAM and I haven’t encountered any issues. This was however a test configuration so I would recommend you follow the official documentation for production deployments. You should also note that CentOS is not officially supported – IBM Open Platform is certified only against Red Hat Enterprise Linux Server 6.
- Note that the machine I am using in this tutorial is called iop.example.com and its IP address is 192.168.59.199.
Start by making sure that all devices have assigned UUIDs:
[root@iop ~]# sudo blkid
/dev/sda1: UUID="4304f1a2-26e5-440f-8b27-5055c96fcbac" TYPE="ext4"
/dev/sda2: UUID="6Zn0RU-Zm13-zXbp-5RjD-sSJw-vPnf-suWzMq" TYPE="LVM2_member"
/dev/mapper/vg_centos-lv_swap: UUID="b9ceac5e-c057-4efe-ac80-c7595777588f" TYPE="swap"
/dev/mapper/vg_centos-lv_root: UUID="3db9ee1f-97fb-460b-b136-3873c42764b7" TYPE="ext4"
/dev/mapper/vg_centos-lv_home: UUID="9df29554-dfd0-4aba-b72e-c8b08607b6a9" TYPE="ext4"
[root@iop ~]#
Confirm that /etc/fstab is using the UUIDs for identifying the devices.
[root@iop ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jul 31 17:02:05 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=3db9ee1f-97fb-460b-b136-3873c42764b7 / ext4 defaults 1 1
UUID=4304f1a2-26e5-440f-8b27-5055c96fcbac /boot ext4 defaults 1 2
UUID=9df29554-dfd0-4aba-b72e-c8b08607b6a9 /home ext4 defaults 1 2
UUID=b9ceac5e-c057-4efe-ac80-c7595777588f swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
[root@iop ~]#
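As a quick cross-check of this step, here is a minimal sketch that lists any entries in an fstab-style file that still refer to a device by its /dev/ path rather than by UUID. The function name non_uuid_mounts is my own hypothetical helper, not part of any IBM tooling, and it assumes the standard whitespace-separated fstab layout.

```shell
# Hypothetical helper: print "device mountpoint" for every fstab entry
# whose first field is a /dev/ path instead of UUID=... or LABEL=...
# Note that /dev/mapper/... LVM paths are reported as well.
non_uuid_mounts() {
    fstab="${1:-/etc/fstab}"
    # Skip comment lines and anything with fewer than two fields.
    awk '$1 !~ /^#/ && NF >= 2 && $1 ~ /^\/dev\// { print $1, $2 }' "$fstab"
}
```

Calling non_uuid_mounts with no arguments reads /etc/fstab. Empty output means every real filesystem is referenced by UUID or label.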
Confirm that /etc/hosts has an entry for your hostname and that you can resolve the name to an IP address.
[root@iop ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.59.199 iop iop.example.com
[root@iop ~]# ping iop.example.com
PING iop (192.168.59.199) 56(84) bytes of data.
64 bytes from iop (192.168.59.199): icmp_seq=1 ttl=64 time=0.045 ms
64 bytes from iop (192.168.59.199): icmp_seq=2 ttl=64 time=0.042 ms
^C
--- iop ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1215ms
rtt min/avg/max/mdev = 0.042/0.043/0.045/0.006 ms
[root@iop ~]#
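If you prefer a scripted check over reading the file by eye, the sketch below looks a name up in a hosts-style file and prints the address it maps to. The helper name hosts_lookup is hypothetical; it only parses the file and does not go through the system resolver.

```shell
# Hypothetical helper: print the IP address that a hosts-style file
# assigns to the given name (first match wins, comment lines skipped).
hosts_lookup() {
    name="$1"
    hosts="${2:-/etc/hosts}"
    awk -v n="$name" '
        $1 !~ /^#/ {
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; exit }
        }' "$hosts"
}
```

On the tutorial machine, hosts_lookup iop.example.com and hosts_lookup iop should both print 192.168.59.199. For a check that does go through the resolver you can use getent hosts iop.example.com.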
Make sure that the hostname command returns correct values for the long and short hostname. Do not ignore this as you won’t be able to install the Big SQL value add later – its monitoring script relies on getting correct long and short hostname.
[root@iop ~]# hostname --long
iop.example.com
[root@iop ~]# hostname --short
iop
[root@iop ~]#
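The relationship between the two names can be verified with a small sketch like the one below. The function check_hostnames is a hypothetical helper of mine. It only checks that the long name is the short name followed by a dot and a domain, which is my assumption about what "correct" means here, not something taken from the Big SQL documentation.

```shell
# Hypothetical helper: succeed when the long hostname is the short
# hostname followed by a dot and a domain part.
check_hostnames() {
    long="$1"
    short="$2"
    if [ -z "$short" ]; then
        echo "mismatch"
        return 1
    fi
    case "$long" in
        "$short".?*) echo "ok" ;;
        *)           echo "mismatch"; return 1 ;;
    esac
}
```

A typical call would be check_hostnames "$(hostname --long)" "$(hostname --short)", which should print ok on the machine used in this tutorial.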
Next we have to configure password-less login with SSH keys.
[root@iop ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
86:6e:bb:2a:5a:a3:a6:2e:2a:96:1f:62:b8:65:3c:a5 root@iop
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| . |
| . . S |
|.. o . . |
|.oE. o |
|+Xoo.. . |
|#o.o..o. |
+-----------------+
[root@iop ~]#
[root@iop ~]# cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
[root@iop ~]# chmod 700 /root/.ssh
[root@iop ~]# chmod 640 /root/.ssh/authorized_keys
[root@iop ~]#
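The cat command above overwrites authorized_keys, which is fine on a fresh machine. If the file may already contain other keys, a variation along the following lines appends the key only when it is missing. This is a sketch under my own assumptions: authorize_key is a hypothetical helper, and it sets the file mode to 600, slightly tighter than the 640 used above.

```shell
# Hypothetical helper: add a public key to <sshdir>/authorized_keys
# unless an identical line is already present, then fix permissions.
authorize_key() {
    pubkey="$1"
    sshdir="$2"
    mkdir -p "$sshdir"
    chmod 700 "$sshdir"
    touch "$sshdir/authorized_keys"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    grep -qxF -f "$pubkey" "$sshdir/authorized_keys" ||
        cat "$pubkey" >> "$sshdir/authorized_keys"
    chmod 600 "$sshdir/authorized_keys"
}
```

For the root account in this tutorial the call would be authorize_key /root/.ssh/id_rsa.pub /root/.ssh. Running it twice leaves a single copy of the key in the file.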
Make sure both the short and long hostnames are added to known_hosts.
[root@iop ~]# ssh root@iop.example.com date
The authenticity of host 'iop.example.com (192.168.59.199)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'iop.example.com,192.168.59.199' (RSA) to the list of known hosts.
Thu Aug 6 09:49:25 BST 2015
[root@iop ~]# ssh root@iop
The authenticity of host 'iop (192.168.59.199)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'iop' (RSA) to the list of known hosts.
Last login: Thu Aug 6 09:42:17 2015 from 192.168.59.3
[root@iop ~]#
Disable SELinux permanently by setting SELINUX to disabled in /etc/selinux/config.
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Don’t forget to reboot the machine for the changes to take effect.
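If you would rather not edit the file by hand, the change can be scripted. The sketch below assumes GNU sed (which is what CentOS ships) and uses a hypothetical helper name, set_selinux_disabled. It rewrites only the SELINUX= line and leaves SELINUXTYPE= and the comments untouched.

```shell
# Hypothetical helper: set SELINUX=disabled in an SELinux config file.
# The pattern is anchored so that SELINUXTYPE= and the commented
# "# SELINUX=" line are not affected. Requires GNU sed for -i.
set_selinux_disabled() {
    conf="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}
```

Called without arguments it edits /etc/selinux/config. After the reboot, getenforce should report Disabled.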
Disable the firewall (again, the assumption is that this is an isolated test system).
[root@iop ~]# chkconfig iptables off
[root@iop ~]# service iptables stop
[root@iop ~]#
Next we have to disable Transparent Huge Pages. Append the following to /etc/rc.local:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
To avoid another restart we can also disable Transparent Huge Pages in the running OS:
[root@iop ~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
[root@iop ~]#
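To confirm the setting took effect, you can read the same sysfs file back. The kernel lists all possible values and marks the active one with square brackets, for example "always madvise [never]". The sketch below extracts the bracketed value; thp_mode is a hypothetical helper name.

```shell
# Hypothetical helper: print the active Transparent Huge Pages mode,
# i.e. the value shown in square brackets in the sysfs file.
thp_mode() {
    thp_file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$thp_file"
}
```

After the echo command above, thp_mode should print never.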
We also have to disable IPv6. Set the following parameters in /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Use sysctl to load the new settings from sysctl.conf.
[root@iop ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
[root@iop ~]#
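When this step is scripted, it helps if the script can be re-run without duplicating lines in sysctl.conf. The following is a minimal sketch of that idea, assuming GNU sed; ensure_sysctl is a hypothetical helper that updates an existing key in place or appends it when it is missing.

```shell
# Hypothetical helper: make sure "key = value" is present exactly once
# in a sysctl.conf-style file. Requires GNU sed for -i.
# Caveat: the dots in the key are treated as regex wildcards, which is
# good enough for a sketch but not strictly precise.
ensure_sysctl() {
    key="$1"
    value="$2"
    conf="${3:-/etc/sysctl.conf}"
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
        sed -i "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$conf"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}
```

For the three IPv6 settings you could loop over all, default and lo, calling ensure_sysctl "net.ipv6.conf.$k.disable_ipv6" 1 for each, and then run sysctl -p as shown above.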
Finally we have to configure and enable NTPD.
Make sure you’ve got a set of NTP servers listed in /etc/ntp.conf, then enable and start the ntpd daemon.
[root@big ~]# cat /etc/ntp.conf |grep server |grep -v "#"
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
[root@big ~]#
[root@iop ~]# chkconfig --add ntpd
[root@iop ~]# service ntpd start
Starting ntpd: [ OK ]
[root@iop ~]# chkconfig ntpd on
[root@iop ~]#
Make sure that the host is now synchronized:
[root@iop ~]# ntpstat
synchronised to NTP server (126.96.36.199) at stratum 4
   time correct to within 1065 ms
   polling server every 64 s
[root@iop ~]#
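Synchronisation can take a few minutes after ntpd starts, so a scripted install may want to wait for it. The sketch below relies on ntpstat returning exit status 0 once the clock is synchronised. The helper wait_for_sync, its default of 10 attempts and the 30 second delay are my own assumptions, not values from the IBM documentation.

```shell
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the number of attempts is exhausted.
# Arguments: check command (default ntpstat), attempts, delay in seconds.
wait_for_sync() {
    check_cmd="${1:-ntpstat}"
    tries="${2:-10}"
    delay="${3:-30}"
    i=1
    while [ "$i" -le "$tries" ]; do
        if $check_cmd >/dev/null 2>&1; then
            echo "synchronised after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not synchronised"
    return 1
}
```

Calling wait_for_sync with no arguments polls ntpstat for up to roughly five minutes. The check command is a parameter mainly so that the loop can be tried out with true or false in place of ntpstat.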
Installing additional packages
Let’s start by installing MySQL. It is required by the Hive metastore service, which needs a relational database to store metadata for Hive tables and partitions.
[root@iop ~]# yum install -y mysql-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  mysql.x86_64 0:5.1.73-5.el6_6
  perl-DBD-MySQL.x86_64 0:4.013-3.el6
  perl-DBI.x86_64 0:1.609-4.el6
Complete!
[root@iop ~]#
Enable MySQL to start on boot and start the service.
[root@iop ~]# chkconfig mysqld on
[root@iop ~]# service mysqld start
Initializing MySQL database: Installing MySQL system tables...
OK
Filling help tables...
OK
To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system
PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h iop password 'new-password'
Alternatively you can run:
/usr/bin/mysql_secure_installation
which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.
See the manual for more instructions.
You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &
You can test the MySQL daemon with mysql-test-run.pl
cd /usr/mysql-test ; perl mysql-test-run.pl
Please report any problems with the /usr/bin/mysqlbug script!
[ OK ]
Starting mysqld: [ OK ]
[root@iop ~]#
We also need to install and configure PostgreSQL, which hosts the Ambari database.
[root@iop ~]# yum install -y postgresql-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  postgresql.x86_64 0:8.4.20-3.el6_6
  postgresql-libs.x86_64 0:8.4.20-3.el6_6
Complete!
[root@iop ~]#
[root@iop ~]# chkconfig postgresql on
[root@iop ~]# service postgresql initdb
Initializing database: [ OK ]
[root@iop ~]#
[root@iop ~]# service postgresql start
Starting postgresql service: [ OK ]
[root@iop ~]#
We also need the netcat utility and OpenJDK 1.7.0.
[root@iop ~]# yum install -y nc
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Installed:
  nc.x86_64 0:1.84-22.el6
Complete!
[root@iop ~]#
[root@iop ~]# yum install -y java-1.7.0-openjdk.x86_64 java-1.7.0-openjdk-devel.x86_64
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Dependency Installed:
  giflib.x86_64 0:4.1.6-3.1.el6
  jpackage-utils.noarch 0:1.7.5-3.14.el6
  pcsc-lite-libs.x86_64 0:1.5.2-15.el6
  ttmkfdir.x86_64 0:3.0.9-32.1.el6
  tzdata-java.noarch 0:2015e-1.el6
  xorg-x11-fonts-Type1.noarch 0:7.2-11.el6
Complete!
[root@iop ~]#
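Before moving on to the IBM Open Platform packages, it can be worth confirming that the prerequisites are actually on the PATH. The following sketch prints the names of any commands that cannot be found; missing_cmds is a hypothetical helper, and the list of binaries to check is my own choice based on the packages installed above.

```shell
# Hypothetical helper: print each command name that is not available
# on the current PATH. Empty output means everything was found.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For this tutorial a reasonable call would be missing_cmds mysql psql nc java javac, which should print nothing if all the packages above were installed successfully.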
Installing IBM Open Platform 4.0
Copy the IOP-188.8.131.52.x86_64.rpm package, which you’ve downloaded from the IBM website, onto the host and install it.
[root@iop ~]# rpm -ivh IOP-184.108.40.206.x86_64.rpm
Preparing... ########################################### [100%]
1:IOP ########################################### [100%]
[root@iop ~]#
This package adds the IBM Open Platform repository to the local Yum configuration. We should now be able to use Yum to install Ambari.
[root@iop ~]# yum install -y ambari-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Loading mirror speeds from cached hostfile
…
Installed:
  ambari-server.noarch 0:1.7.0_IBM-4
Complete!
[root@iop ~]#
Run ambari-server setup to configure Ambari. Select Custom JDK and provide the path to the OpenJDK 1.7.0 JRE.
[root@big ~]# ambari-server setup
Using python /usr/bin/python2.6
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)?
Adjusting ambari-server permissions and ownership...
Checking firewall...
Checking JDK...
 - Open JDK 1.7
 - Custom JDK
==============================================================================
Enter choice (1):2
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
Path to JAVA_HOME: /usr/lib/jvm/jre-1.7.0-openjdk.x86_64
Validating JDK on Ambari Server...done.
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)?
Default properties detected. Using built-in database.
Checking PostgreSQL...
Configuring local database...
Connecting to local database...done.
Configuring PostgreSQL...
Backup for pg_hba found, reconfiguration not required
Extracting system views...
.
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
[root@big ~]#
[root@iop ~]# ambari-server start
Using python /usr/bin/python2.6
Starting ambari-server
Ambari Server running with 'root' privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.
[root@iop ~]#
Open a web browser and connect to the IP address of the target host at port 8080.
Login as admin/admin (the default Ambari administrative account).
At the Welcome page click Launch Install Wizard.
Provide a name for your IBM Open Platform cluster and click Next.
Select the BigInsights 4.0 services stack and click Next.
Provide the fully qualified name of your target host. Copy the contents of the SSH private key (/root/.ssh/id_rsa) and paste it in the Host Registration Information section. Click Register and Confirm.
Select the target host and click Next. The system will run a set of checks and after the validation completes you will be presented with a list of services you can deploy as part of the installation.
Leave the default service selection as it is and click Next.
You can’t actually assign masters as you’ve only got a single machine so leave everything as it is and click Next.
Again, no role separation is possible in a single node installation so just click Next.
Select the individual tabs and provide the missing configuration information for Nagios, Hive, Oozie, and Knox. These are the default administrative users and passwords. Once you’ve set credentials for all four, the red markers will go away and you can click Next.
Review the summary and click Deploy. This initiates the services deployment. Once the services are installed and configured they will start automatically.
Review the installation summary and click Complete.
Congratulations! You now have a working installation of the IBM Open Platform with Apache Hadoop. You can see the available services on the left-hand side of the Ambari console and the Dashboard tab provides a general overview of the system.