In a previous post I wrote about the importance of the IBM Open Data Platform, and I also wrote a brief HOWTO on installing BigInsights 4.0. BigInsights 4.1 has been out for some time now, so it is about time I put together a quick and dirty installation guide for 4.1.
- You will need a machine with Red Hat Enterprise Linux Server 6 or 7. These are the only operating systems supported for BigInsights 4.1, as you can confirm in the detailed system requirements.
The machine can be a VM; just make sure it has enough resources. According to IBM’s documentation you will need a minimum of 24 GB of RAM and a minimum of 80 GB of free disk space.
I have successfully deployed the Open Platform, with some of the value adds on top, in a VM with 12 GB of RAM and I haven’t encountered any issues. This was, however, a test configuration, so I would recommend you follow the official documentation for production deployments.
- You’ll need to download the IBM Open Platform RPMs from https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=iopah4. You will need to register for an IBMid but it’s a quick and straightforward process.
- Note that the machine I will be using in this tutorial is called big.example.com, its IP address is 192.168.59.200, and it is running Red Hat Enterprise Linux Server 6.7.
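Before going further it can be handy to see how much RAM the machine actually reports. The snippet below is only a sketch: mem_total_gb is a name I made up for illustration, and it assumes the usual /proc/meminfo layout where MemTotal is given in kB.

```shell
#!/bin/sh
# Hypothetical helper: print the MemTotal value of a meminfo-style file in GB
# (one decimal place). Assumes the value is reported in kB.
mem_total_gb() {
    awk '$1 == "MemTotal:" { printf "%.1f\n", $2 / 1024 / 1024 }' "$1"
}
```

On the target host you would call it as `mem_total_gb /proc/meminfo`. Expect the figure to be a little lower than the installed RAM, because the kernel reserves some memory for itself.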
Start by making sure that all devices have assigned UUIDs and confirm that /etc/fstab is using the UUIDs for identifying the devices.
[root@big ~]# blkid
/dev/mapper/vg_big-lv_root: UUID="35757d7d-a8d8-4433-be20-e72141dab488" TYPE="ext4"
/dev/sda1: UUID="aef9dc20-b6f7-447b-95ef-45d0f7c9df36" TYPE="ext4"
/dev/sda2: UUID="tIQ4ff-IjOs-Sp9H-nR5M-SIu7-7ijC-RLjGT5" TYPE="LVM2_member"
/dev/mapper/vg_big-lv_swap: UUID="e24d88ef-5a30-4fc3-abff-975a8c42c8f1" TYPE="swap"
/dev/mapper/vg_big-lv_home: UUID="22a70f06-2381-47d1-8930-693c069b25ae" TYPE="ext4"
[root@big ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Tue Sep 8 19:43:07 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=35757d7d-a8d8-4433-be20-e72141dab488 / ext4 defaults 1 1
UUID=aef9dc20-b6f7-447b-95ef-45d0f7c9df36 /boot ext4 defaults 1 2
UUID=22a70f06-2381-47d1-8930-693c069b25ae /home ext4 defaults 1 2
UUID=e24d88ef-5a30-4fc3-abff-975a8c42c8f1 swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
[root@big ~]#
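If your fstab is longer than mine, a small script can do the eyeballing for you. This is a rough sketch rather than an official check: fstab_non_uuid is a hypothetical function name, and it simply flags every entry whose first field is a /dev/ path instead of a UUID.

```shell
#!/bin/sh
# Hypothetical helper: list fstab entries that reference a block device by path.
# Prints "device mountpoint" for each offender and returns non-zero if any exist.
fstab_non_uuid() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }            # skip blank lines and comments
        $1 ~ /^\/dev\// { print $1, $2; bad = 1 } # device path instead of UUID=
        END { exit bad + 0 }
    ' "$1"
}
```

Running `fstab_non_uuid /etc/fstab` on the host above should print nothing, since every disk-backed entry already uses UUID=.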
Confirm that /etc/hosts has an entry for your host name and that you can resolve the name to an IP address.
[root@big ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.59.200 big.example.com big
[root@big ~]# ping big.example.com
PING big.example.com (192.168.59.200) 56(84) bytes of data.
64 bytes from big.example.com (192.168.59.200): icmp_seq=1 ttl=64 time=0.015 ms
64 bytes from big.example.com (192.168.59.200): icmp_seq=2 ttl=64 time=0.018 ms
^C
--- big.example.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1201ms
rtt min/avg/max/mdev = 0.015/0.016/0.018/0.004 ms
[root@big ~]#
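To double-check which address a name maps to in the hosts file itself (rather than through whatever resolver ping happens to use), something like the following would do. It is an illustrative sketch; hosts_lookup is a name I invented, and it only understands the plain "address name alias..." layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the address a hosts-style file maps NAME to.
# Usage: hosts_lookup FILE NAME   (first matching line wins)
hosts_lookup() {
    awk -v name="$2" '
        $1 ~ /^#/ { next }                       # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break             # ignore trailing comments
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}
```

For the host in this tutorial, `hosts_lookup /etc/hosts big.example.com` and `hosts_lookup /etc/hosts big` should both print 192.168.59.200 and not a loopback address.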
Make sure that the hostname command returns correct values for the long and short host name. Do not skip this check: if the names are wrong you won’t be able to install the Big SQL value add later, because its monitoring script relies on getting correct long and short host names.
[root@big ~]# hostname --long
big.example.com
[root@big ~]# hostname --short
big
[root@big ~]#
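If you want to script this check, the comparison boils down to "the long name is fully qualified and starts with the short name". The function below is my own sketch of that idea (names_consistent is a made-up name); it does not reproduce whatever the Big SQL monitoring script does internally.

```shell
#!/bin/sh
# Hypothetical helper: succeed when LONG is a dotted name whose first label
# equals SHORT. Usage: names_consistent LONG SHORT
names_consistent() {
    long=$1
    short=$2
    case $long in
        *.*) ;;             # the long name needs at least one dot
        *) return 1 ;;
    esac
    [ "${long%%.*}" = "$short" ]
}
```

On the host you would run `names_consistent "$(hostname --long)" "$(hostname --short)" && echo OK`.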
Next we have to configure password-less login with SSH keys.
[root@big ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a5:39:a1:71:fe:92:4f:5d:4b:66:ae:f9:0b:49:84:08 root@big
The key's randomart image is:
+--[ RSA 2048]----+
| E |
| . . . |
| . + o . |
| = = . |
| . S . = |
| + o B . |
| o o + o |
| + + |
| . o.o. |
+-----------------+
[root@big ~]# cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
[root@big ~]# chmod 700 /root/.ssh
[root@big ~]# chmod 640 /root/.ssh/authorized_keys
[root@big ~]#
Make sure both the short and long host names are added to known_hosts.
[root@big ~]# ssh root@big.example.com date
The authenticity of host 'big.example.com (192.168.59.200)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'big.example.com,192.168.59.200' (RSA) to the list of known hosts.
Tue Sep 1 09:10:16 BST 2015
[root@big ~]# ssh root@big date
The authenticity of host 'big (192.168.59.200)' can't be established.
RSA key fingerprint is 26:15:71:25:08:a9:56:c7:06:c4:a0:e6:d1:57:45:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'big' (RSA) to the list of known hosts.
Tue Sep 1 09:10:23 BST 2015
[root@big ~]#
Disable SELinux permanently by setting SELINUX to disabled in /etc/selinux/config.
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Don’t forget to reboot the machine for the changes to take effect.
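A quick way to confirm what the config file says before you reboot is to pull out the SELINUX value. This is a minimal sketch; selinux_configured_mode is a hypothetical name and the function only reads the file, it changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to SELINUX= in an SELinux
# config file. Comment lines and SELINUXTYPE= are not matched.
selinux_configured_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$1"
}
```

`selinux_configured_mode /etc/selinux/config` should print disabled. After the reboot, getenforce reports the mode that is actually in effect.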
Disable the firewall (again, the assumption is that this is an isolated test system).
[root@big ~]# chkconfig iptables off
[root@big ~]# service iptables stop
[root@big ~]#
Next we have to disable Transparent Huge Pages. Append the following to /etc/rc.local:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
To avoid another restart we can also disable Transparent Huge Pages in the running OS:
[root@big ~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
[root@big ~]#
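To verify that the change took effect, read the file back: the kernel lists all possible values and puts the active one in square brackets, for example "always madvise [never]". The helper below is a sketch of mine (thp_active is a made-up name) that extracts the bracketed value from whatever file you point it at.

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (active) value from a THP
# "enabled" file, whose content looks like: always madvise [never]
thp_active() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$1"
}
```

`thp_active /sys/kernel/mm/transparent_hugepage/enabled` should print never. I believe some RHEL 6 kernels expose the directory as redhat_transparent_hugepage instead, so treat the path as something to verify on your own system.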
We also have to disable IPv6. Set the following parameters in /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Use sysctl to load the new settings from sysctl.conf. The "unknown key" errors for the net.bridge.* entries in the output only mean that the bridge kernel module isn't loaded; they don't affect the IPv6 settings.
[root@big ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
[root@big ~]#
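If you manage several machines you may want to check that all three keys really made it into the file. Here is a small sketch under the assumption that the file uses the usual "key = value" lines; ipv6_disabled_in is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when all three IPv6 disable keys are set to 1
# in a sysctl.conf-style file. Usage: ipv6_disabled_in FILE
ipv6_disabled_in() {
    for key in all default lo; do
        awk -F= -v k="net.ipv6.conf.$key.disable_ipv6" '
            { gsub(/[ \t]/, "") }                # drop spaces around "="
            $1 == k && $2 == "1" { found = 1 }
            END { exit !found }
        ' "$1" || return 1
    done
}
```

Usage would be `ipv6_disabled_in /etc/sysctl.conf || echo "IPv6 settings missing"`. Commented-out lines do not count as set.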
Finally we have to configure and enable NTPD. If NTP isn’t already installed start by adding the ntp package.
[root@big ~]# yum install -y ntp
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
...
Complete!
[root@big ~]#
Make sure you’ve got a set of NTP servers listed in /etc/ntp.conf then enable and start the ntpd daemon.
[root@big ~]# chkconfig --add ntpd
[root@big ~]# chkconfig ntpd on
[root@big ~]# service ntpd start
Starting ntpd: [ OK ]
[root@big ~]#
Make sure that the host is now synchronized:
[root@big ~]# ntpstat
synchronised to NTP server (184.108.40.206) at stratum 3
time correct to within 354 ms
polling server every 64 s
[root@big ~]#
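For the "set of NTP servers listed in /etc/ntp.conf" part, listing the configured servers is a one-liner. The sketch below wraps it in a function (ntp_servers is a name I made up) and only looks at lines that begin with the server keyword.

```shell
#!/bin/sh
# Hypothetical helper: print the hosts named on "server" lines of an
# ntp.conf-style file. Commented-out server lines are ignored.
ntp_servers() {
    awk '$1 == "server" { print $2 }' "$1"
}
```

If `ntp_servers /etc/ntp.conf` prints nothing, add at least one server line before starting ntpd. In a script you can also rely on the exit status of ntpstat, which is 0 once the clock is synchronised.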
Installing additional packages
With 4.1 we don’t have to manually install MySQL and PostgreSQL anymore. The only additional packages we need are nc and OpenJDK 1.8.0.
[root@big ~]# yum install -y nc
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
...
Installed:
nc.x86_64 0:1.84-24.el6
Complete!
[root@big ~]# yum install -y java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
...
Installed:
java-1.8.0-openjdk.x86_64 1:220.127.116.11-1.b16.el6_7
java-1.8.0-openjdk-devel.x86_64 1:18.104.22.168-1.b16.el6_7
Dependency Installed:
giflib.x86_64 0:4.1.6-3.1.el6
java-1.8.0-openjdk-headless.x86_64 1:22.214.171.124-1.b16.el6_7
jpackage-utils.noarch 0:1.7.5-3.14.el6
ttmkfdir.x86_64 0:3.0.9-32.1.el6
tzdata-java.noarch 0:2015f-1.el6
xorg-x11-fonts-Type1.noarch 0:7.2-11.el6
Complete!
[root@big ~]#
Installing IBM Open Platform 4.1
Copy the iop-126.96.36.199-1.el6.x86_64.rpm package, which you’ve downloaded from the IBM website, onto the host and install it.
[root@big ~]# yum install -y iop-188.8.131.52-1.el6.x86_64.rpm
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
…
Installed:
IOP.x86_64 0:184.108.40.206-1
Complete!
[root@big ~]#
This package adds the IBM Open Platform repository to the local Yum configuration. We should now be able to use Yum to install Ambari.
[root@big ~]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit
Cleaning repos: BI_AMBARI-2.1.0 base extras updates
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@big ~]# yum install -y ambari-server
Loaded plugins: fastestmirror, refresh-packagekit
Setting up Install Process
Determining fastest mirrors
…
Installed:
ambari-server.x86_64 0:2.1.0_IBM-4
Dependency Installed:
postgresql.x86_64 0:8.4.20-3.el6_6
postgresql-libs.x86_64 0:8.4.20-3.el6_6
postgresql-server.x86_64 0:8.4.20-3.el6_6
Complete!
[root@big ~]#
Run ambari-server setup to configure Ambari and don't forget to set the path to the OpenJDK 1.8.0 JRE.
[root@big ~]# ambari-server setup -j /usr/lib/jvm/java-1.8.0-openjdk.x86_64/
Using python /usr/bin/python2.6
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)?
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
WARNING: JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk must be valid on ALL hosts
WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)?
Configuring database...
Default properties detected. Using built-in database.
Configuring ambari database...
Checking PostgreSQL...
Running initdb: This may take upto a minute.
Initializing database: [ OK ]
About to start PostgreSQL
Configuring local database...
Connecting to local database...done.
Configuring PostgreSQL...
Restarting PostgreSQL
Extracting system views...
.ambari-admin-2.1.0_IBM_4.jar
.....
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
[root@big ~]#
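The JAVA_HOME warning in that output is worth taking seriously, because a wrong path passed to -j only shows up later. A cheap sanity check is to confirm that the directory contains an executable bin/java. The function below is a sketch of mine (valid_java_home is a hypothetical name) and makes no claim about how Ambari itself validates the path.

```shell
#!/bin/sh
# Hypothetical helper: succeed when DIR looks like a usable JAVA_HOME,
# i.e. DIR/bin/java exists and is executable. A trailing slash is tolerated.
valid_java_home() {
    [ -x "${1%/}/bin/java" ]
}
```

For the path used in this guide: `valid_java_home /usr/lib/jvm/java-1.8.0-openjdk.x86_64/ || echo "check the JDK path"`.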
Then start the Ambari server.
[root@big ~]# ambari-server start
Using python /usr/bin/python2.6
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.
[root@big ~]#
Open a web browser and connect to the IP address of the target host at port 8080.
Log in as admin/admin (the default Ambari administrative account).
At the Welcome page click Launch Install Wizard.
Provide a name for your IBM Open Platform cluster and click Next.
Select the BigInsights 4.1 services stack and click Next.
Provide the fully qualified name of your target host. Copy the contents of the SSH private key (/root/.ssh/id_rsa) and paste it in the Host Registration Information section. Click Register and Confirm.
Select the target host and click Next. The system will run a set of checks and after the validation completes you will be presented with a list of services you can deploy as part of the installation.
Leave the default service selection as it is and click Next.
You can’t actually assign masters as you’ve only got a single machine so leave everything as it is and click Next.
Again, no role separation is possible in a single node installation so just click Next.
Select the individual tabs and provide the missing configuration information for Oozie and Knox. This is about default administrative users and passwords. Once you’ve set all the required credentials the red markers will go away and you can click Next.
Review the summary and click Deploy. This initiates the services deployment. Once the services are installed and configured they will start automatically.
Review the installation summary and click Complete.
Congratulations! You now have a working installation of the IBM Open Platform 4.1 with Apache Hadoop. You can see the available services on the left-hand side of the Ambari console and the Dashboard tab provides a general overview of the system.
Test your installation (optional)
Open a console and become the ambari-qa user.
[root@big ~]# su - ambari-qa
[ambari-qa@big ~]$
Run the following commands to initiate a series of test jobs. Make sure every single job completes successfully.
[ambari-qa@big ~]$ export HADOOP_MR_DIR=/usr/iop/current/hadoop-mapreduce-client
[ambari-qa@big ~]$ yarn jar $HADOOP_MR_DIR/hadoop-mapreduce-examples.jar teragen 1000 /tmp/tgout
15/09/01 10:50:23 INFO impl.TimelineClientImpl: Timeline service address: http://big.example.com:8188/ws/v1/timeline/
...
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=100000
[ambari-qa@big ~]$ yarn jar $HADOOP_MR_DIR/hadoop-mapreduce-examples.jar terasort /tmp/tgout /tmp/tsout
15/09/01 10:51:08 INFO terasort.TeraSort: starting
…
15/09/01 10:51:33 INFO mapreduce.Job: Job job_1441100130926_0008 completed successfully
…
File Input Format Counters
Bytes Read=100000
File Output Format Counters
Bytes Written=100000
15/09/01 10:51:33 INFO terasort.TeraSort: done
[ambari-qa@big ~]$ yarn jar $HADOOP_MR_DIR/hadoop-mapreduce-examples.jar teravalidate /tmp/tsout /tmp/tvout
15/09/01 10:53:36 INFO impl.TimelineClientImpl: Timeline service address: http://big.example.com:8188/ws/v1/timeline/
...
15/09/01 10:53:56 INFO mapreduce.Job: Job job_1441100130926_0009 completed successfully
...
File Input Format Counters
Bytes Read=100000
File Output Format Counters
Bytes Written=21
[ambari-qa@big ~]$
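Scrolling through the job output looking for "completed successfully" gets tedious, so you might prefer to save each run to a file and count the success lines. This is only a sketch: jobs_succeeded is a made-up name, the log file is whatever you choose to write, and the pattern is based on the log lines shown above.

```shell
#!/bin/sh
# Hypothetical helper: count the "completed successfully" job lines in a
# saved job log. Prints 0 when none are found.
jobs_succeeded() {
    grep -c 'INFO mapreduce.Job: Job job_[0-9_]* completed successfully' "$1"
}
```

Assuming the client writes its log messages to stderr, you could capture a run with `yarn jar ... 2>&1 | tee /tmp/terasort.log` and then expect `jobs_succeeded /tmp/terasort.log` to print 1 for a single job.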