Installing Big SQL 3.0 on IBM Open Platform with Apache Hadoop 4.1

Installing Big SQL 3.0 on IBM Open Platform with Apache Hadoop 4.1


In "The SQL on Hadoop landscape: an overview" post (Part I and Part II) I wrote about the importance of having a SQL engine on top of Hadoop. In this post I'll show you how to quickly install IBM's SQL on Hadoop engine called Big SQL.

Big SQL is a great piece of technology, based on the mature and proven DB2 engine, which has shown again and again that it is faster than its rivals, as well as being fully ANSI SQL compliant.

IBM Big SQL Architecture

IBM Big SQL Architecture

IBM Big SQL doesn't rely on MapReduce for query execution. Instead, it uses its own MPP engine with workers written in C++, that access HDFS data directly but also integrate with the Hive metastore.

Big SQL uses shared nothing architecture and pushes processing out to data nodes to maximize data locality. It also performs intelligent data partition elimination based on SQL predicates. Access to Big SQL is via ODBC/JDBC drivers.

Prerequisites

To follow this tutorial you will need an installation of the IBM Open Platform with Apache HAdoop (IBM's ODPi compliant Hadoop distribution).

If you don't have an IBM Open Platform installation, you can follow this step-by-step guide to deploy one in a virtual machine.

The VM I'll be using in this tutorial is an IBM Open Platform 4.1 deployment, running in a host called big.example.com.

Getting Big SQL

While IBM Open Platform is an open source and freely available piece of software, Big SQL is part of the IBM licensable value-add components. IBM, however, provides a free, non-production version of the components bundle called IBM BigInsights Quick Start Edition.

Visit this URL and log in using your IBMid (if you haven't got one you can register for free – it is a quick and pretty straightforward process).

BigInsights QSE Download

Select your preferred download method (Download Director or HTTP) and download BigInsights Quick Start Edition for Non-Production Environment 4.1.0.1 for Linux on Intel 64-bit.

Put the downloaded file (BI_QSE_1.1.0.1.el.x86_64.bin) in the /root folder of your IBM Open Platform VM.

Installing Big SQL

Start by setting the executable bit and then running the Quick Start edition file.

[root@big ~]# chmod +x BI_QSE_1.1.0.1.el.x86_64.bin
[root@big ~]# ./BI_QSE_1.1.0.1.el.x86_64.bin
Creating directory ./BigInsights
Verifying archive integrity... All good.
Uncompressing Big Insights Quick Start Edition  100%
************************************************************************
 License Files
************************************************************************
License files are available in the directory:
/root/BigInsights/licenses

1. Do you Accept the Terms and Conditions in the Licenses Directory ? (y/n) : y

************************************************************************
 Installation Type (Online/Offline)
************************************************************************
2. Will this be an ONLINE(1) or OFFLINE(2) Installation ? 1

************************************************************************
 Check if IOP or HDP is already installed
************************************************************************
The directory /usr/iop exists, so you currently have IOP installed.

************************************************************************
 Installing Package...
************************************************************************
Downloading Package...
--2015-10-22 16:27:00--  https://ibm-open-platform.ibm.com/repos/BigInsights-Val                                                                                     ueAdds/rhel/6/x86_64/4.1.x/Updates/4.1.0.1/BI-ANA-IOP-RHEL6.rpm
Resolving ibm-open-platform.ibm.com... 198.11.219.186
Connecting to ibm-open-platform.ibm.com|198.11.219.186|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 264076 (258K) [application/x-rpm]
Saving to: “packages/BI-ANA-IOP-RHEL6.rpm”

100%[======================================>] 264,076      232K/s   in 1.1s

2015-10-22 16:27:02 (232 KB/s) - “packages/BI-ANA-IOP-RHEL6.rpm” saved [264076/2                                                                                     64076]

Installing Package...
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is receiving updates from RHN Classic or RHN Satellite.
Setting up Install Process
Examining packages/BI-ANA-IOP-RHEL6.rpm: BI-Analyst-IOP-2.13.1-IOP-4_1-2.13.1-1.noarch
Marking packages/BI-ANA-IOP-RHEL6.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package BI-Analyst-IOP-2.13.1-IOP-4_1.noarch 0:2.13.1-1 will be installed
...
Installed:
  BI-Quick-Start-Edition-IOP-2.13.1-IOP-4_1.noarch 0:2.13.1-1

Complete!
BigInsights QSE Package Installation Succeeded
Installation Complete
[root@big ~]#

Next, restart Ambari so it can pick up the newly added repositories.

[root@big ~]# ambari-server restart
Using python  /usr/bin/python2.6
Restarting ambari-server
Using python  /usr/bin/python2.6
Stopping ambari-server
Ambari Server stopped
Using python  /usr/bin/python2.6
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.
[root@big ~]#

Install KSH as it is a prerequisite for Big SQL.

[root@big ~]# yum -y install ksh
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is receiving updates from RHN Classic or RHN Satellite.
Setting up Install Process
BIGINSIGHTS-VALUEPACK-1.1.0.1                                                                                                                                         | 2.9 kB     00:00
BIGINSIGHTS-VALUEPACK-1.1.0.1/primary_db                                                                                                                              |  16 kB     00:00
Resolving Dependencies
--> Running transaction check
---> Package ksh.x86_64 0:20120801-28.el6_7.3 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                             Arch                                   Version                                               Repository                                            Size
=============================================================================================================================================================================================
Installing:
 ksh                                 x86_64                                 20120801-28.el6_7.3                                   rhel-x86_64-server-6                                 760 k

Transaction Summary
=============================================================================================================================================================================================
Install       1 Package(s)

Total download size: 760 k
Installed size: 1.7 M
Downloading Packages:
ksh-20120801-28.el6_7.3.x86_64.rpm                                                                                                                                    | 760 kB     00:05
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : ksh-20120801-28.el6_7.3.x86_64                                                                                                                                            1/1
  Verifying  : ksh-20120801-28.el6_7.3.x86_64                                                                                                                                            1/1

Installed:
  ksh.x86_64 0:20120801-28.el6_7.3

Complete!
[root@big ~]#

Edit /etc/sudoers and comment out the "Defaults requiretty" line.

#
# Disable "ssh hostname sudo ", because it will show the password in clear.
#         You have to run "ssh -t hostname sudo ".
#
#Defaults    requiretty

Open your Ambari console using a web browser at port 8080 and login as admin/admin (by default).

Ambari Dashboard

Click the Actions drop down list below the cluster services and select "Add Service".

Ambari Add Service

Scroll down and select "BigInsights – Big SQL", "BigInsights Data Server Manager", and "BigInsights Home", then click Next.

Select services

If you are using a single VM, leave the service layout by default and click Next.

Select Service Layout

Leave the Slaves and Clients assignments by default and click Next.

Component Assignments

Go to the "BigInsights Data Server Manager" tab and assign a password for the bigsql user.

Service Customization

Click Next.

Customized Service

Review the service settings and click Deploy.

Review Screen

Wait for the Big SQL service deployment to complete.

Deployment in progress

After the deployment has finished, click Next.

Deployment Completed

Click Complete and Ambari will send you back to the main dashboard.

Summary screen

At the main dashboard, click each of the services that Ambari indicates require restart and restart them.

Ambari Services Restart

Go back to your terminal session and use the knox_setup.sh script using the correct Ambari port and admin account (8080 and admin/admin by default).

[root@big ~]# /usr/ibmpacks/bin/2.13.1/knox_setup.sh -u admin -p admin -x 8080
Protocol : http
Cluster: democluster BigInsights 4.1
Cluster name: democluster, stack name: BigInsights, stack version: 4.1, stack directory: /usr/iop/current
***********************************************************************
Is Ambari User Name admin correct? y
***********************************************************************
Is Ambari Cluster Name democluster correct? y
***********************************************************************
Is Ambari port 8080 correct? y
***********************************************************************
Is Knox gateway servers  big.example.com correct? y
***********************************************************************
Is IBM BigInsights IOP installed? y
***********************************************************************
***********************************************************************
Updating KNOX jars on big.example.com:
***********************************************************************
gateway-service-bigsheets-5.11.2.jar                                                                                                                       100%   13KB  13.0KB/s   00:00
gateway-service-dsm-1.2.jar                                                                                                                                100%   13KB  13.2KB/s   00:00
gateway-service-text-analytics-web-tooling-3.4.jar                                                                                                         100%   23KB  22.8KB/s   00:00
gateway-service-web-ui-2.7.jar                                                                                                                             100%   17KB  16.7KB/s   00:00
Updating KNOX jars on all KNOX servers Completed
***********************************************************************
Update Topology:
***********************************************************************
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
107   428  107   428    0     0  77974      0 --:--:-- --:--:-- --:--:-- 85600
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4620  100  4620    0     0   958k      0 --:--:-- --:--:-- --:--:-- 1127k
Checking for updates for /usr/ibmpacks/bin/2.13.1/conf/knox_conf.json
roles== WEBUIFRAMEWORK
Adding role WEBUIFRAMEWORK with url: http://{{ui_server_host}}:{{ui_server_port}}/biginsights
roles== BIGSHEETS
Adding role BIGSHEETS with url: http://{{bigsheets_server_host}}:{{bigsheets_server_port}}/bigsheets
roles== TEXTANALYTICS
Adding role TEXTANALYTICS with url: http://{{ta_server_host}}:{{ta_server_port}}/TextAnalyticsWeb
roles== DSM
Adding role DSM with url: http://{{dsm_server_host}}:{{dsm_server_port}}/console
Adding provider iop-util
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8513  100  4605  100  3908   105k  91866 --:--:-- --:--:-- --:--:--  107k
Cluster: democluster BigInsights 4.1
Cluster name: democluster, stack name: BigInsights
ssh: Could not resolve hostname EMPTY: Name or service not known
Stack version: 4.1, stack directory: /usr/iop/current
***********************************************************************
Cleanup KNOX deployment files on big.example.com:
***********************************************************************
Cleanup KNOX deployment files on all KNOX servers Completed
***********************************************************************
Updating KNOX params.py file:
***********************************************************************
Completed
***********************************************************************
Restarting AMBARI:
***********************************************************************
Using python  /usr/bin/python2.6
Restarting ambari-server
Using python  /usr/bin/python2.6
Stopping ambari-server
Ambari Server stopped
Using python  /usr/bin/python2.6
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start....................
Ambari Server 'start' completed successfully.
Waiting for Ambari to come up and configuring Knox.  This may take a while.   Please wait ...

***********************************************************************
Stop KNOX:
***********************************************************************
Stopping KNOX
Service KNOX stopped
Stopping KNOX succeeded
***********************************************************************
Redeploy KNOX on big.example.com:
***********************************************************************
Redeploy KNOX on all KNOX servers Completed
***********************************************************************
Start KNOX:
***********************************************************************
Starting KNOX
Service KNOX started
Starting KNOX succeeded
[root@big ~]# 

Make sure that all of your services now have green status.

Ambari Dashboard - Green Status

Go to your BigInsights home URL (for my big VM host this is https://big:8443/gateway/default/BigInsightsWeb/index.html#/welcome). Log in using the default LDAP account (guest/guest-password) and click Data Server Manager for Big SQL.

BigInsights Home Screen

You can use Data Server Manager to monitor your Big SQL services and run SQL queries.

Data Server Manager Screen

This completes the installation of IBM Big SQL on Open Platform with Apache Hadoop.