This article explains how to migrate an HBase 1.1 (HDI 3.6) Accelerated Writes cluster that uses the default Ambari meta DB to an HBase 2.1 (HDI 4.0) Accelerated Writes cluster that uses a custom Ambari meta DB. In normal cluster creation, as described in other articles such as Set up clusters in HDInsight, Ambari is deployed in an S0 Azure SQL Database that is managed by HDInsight and is not accessible to users.
Also, starting July 1, 2021, Microsoft will offer only the Basic support plan for certain HDInsight 3.6 cluster types. This plan will be available until April 3, 2022, so it is recommended to migrate to HDInsight 4.0 as early as possible.
Understanding the Use Case:
HDInsight allows you to take control of your data and metadata with external data stores. This feature is available for the Apache Hive metastore, the Apache Oozie metastore, and the Apache Ambari database. Here we focus on the Apache Ambari database. Ambari is used to monitor HDInsight clusters, make configuration changes, and store cluster management information as well as job history. HDInsight provides a default SQL database for each cluster, which is adequate for test workloads. For production use, a custom SQL database is recommended so that it can be sized to handle the cluster load as business requirements grow. It is also possible to start with a basic database and upgrade later.
In this example, we will create a custom meta DB, configure it for an HDI 4.0 HBase cluster, migrate the data from HDI 3.6 to HDI 4.0, and then validate the result.
Below are the steps for the migration.
Source and Destination Cluster setup
Step 1: Create a source HBase HDI 3.6 cluster with the default meta DB.
Step 2: Create a destination HBase HDI 4.0 cluster with a custom Ambari DB.
Step 2.1: From the Azure portal, create an external SQL database.
Step 2.2: Choose the right DTU tier based on the number of nodes.
Step 2.3: Select this database as the Ambari meta DB while creating the HDInsight cluster.
Once both clusters are ready, follow the steps below to migrate:
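For readers who prefer the command line to the portal, Steps 2.1 and 2.2 might be scripted roughly as follows. This is only a sketch: it assumes the Azure CLI is installed and signed in and that a logical SQL server already exists. The resource group, server and database names are hypothetical placeholders, and the S2 service objective is an illustrative value rather than a sizing recommendation. By default the script only prints the command; pass apply as the first argument to run it.

```shell
#!/bin/sh
# Sketch: create the external Azure SQL database for Ambari with the Azure CLI.
# All names below are hypothetical placeholders; edit them before use.
RESOURCE_GROUP="my-hdi-rg"
SQL_SERVER="my-sql-server"     # existing logical SQL server (assumed)
AMBARI_DB="ambari-meta-db"
SERVICE_OBJECTIVE="S2"         # DTU tier placeholder; choose per node count

DRY_RUN=1   # switched off only by the "apply" argument below

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 2.1 and 2.2: create the database at the chosen DTU tier.
create_ambari_db() {
  run az sql db create \
    --resource-group "$RESOURCE_GROUP" \
    --server "$SQL_SERVER" \
    --name "$AMBARI_DB" \
    --service-objective "$SERVICE_OBJECTIVE"
}

if [ "${1:-plan}" = "apply" ]; then
  DRY_RUN=0
fi
create_ambari_db
```

Step 2.3 is not scripted here; the database is still selected as the Ambari DB during cluster creation.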
Steps to be followed on Source Cluster HDInsight 3.6
Step 1: Log in to the source cluster and create a sample table using HBase perf.
Step 2: Flush the table data.
Step 3: Stop HBase from Ambari.
Step 4: Back up the WAL folder.
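The source-cluster steps above can be sketched as a small script run from a head node over SSH. Several details are assumptions rather than values from this walkthrough: the row count, the /hbase-wal-backup folder name, and hdfs://mycluster/hbasewal as the WAL location. TestTable is the default table name of the HBase PerformanceEvaluation tool. Without arguments the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the source-cluster (HDI 3.6) steps, run from a head node over SSH.
# ASSUMPTIONS, not from the article: the row count, the /hbase-wal-backup
# folder name, and hdfs://mycluster/hbasewal as the WAL location.
TABLE="TestTable"                    # default table of PerformanceEvaluation
ROWS=100000                          # illustrative row count
WAL_DIR="hdfs://mycluster/hbasewal"  # assumed WAL location
WAL_BACKUP="/hbase-wal-backup"       # hypothetical backup folder

DRY_RUN=1   # commands are only printed unless a real mode is requested below

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Send one statement to the HBase shell, or print it in dry-run mode.
hbase_shell() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hbase shell: $1"
  else
    echo "$1" | hbase shell
  fi
}

# Steps 1 and 2: create the sample table, flush it, and note the row count.
prepare_table() {
  run hbase org.apache.hadoop.hbase.PerformanceEvaluation \
    --nomapred --rows="$ROWS" sequentialWrite 1
  hbase_shell "flush '$TABLE'"
  hbase_shell "count '$TABLE'"
}

# Step 4: back up the WAL folder (only after HBase was stopped in Ambari).
backup_wal() {
  run hdfs dfs -mkdir "$WAL_BACKUP"
  run hdfs dfs -cp "$WAL_DIR" "$WAL_BACKUP"
}

case "${1:-plan}" in
  prepare) DRY_RUN=0; prepare_table ;;
  backup)  DRY_RUN=0; backup_wal ;;
  *)       prepare_table; backup_wal ;;
esac
```

The two modes are kept separate because Step 3 is manual: run prepare, stop HBase in Ambari, then run backup. The row count printed during prepare serves as the baseline for the final validation.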
Steps to be followed on Destination Cluster HDInsight 4.0
Step 1: Stop HBase from Ambari.
Step 2: Under Services > HDFS > Configs > Advanced > Advanced core-site, change the fs.defaultFS HDFS setting to point to the source cluster’s container name, for example cluster1testhbase-2021-05-12t07-23-50-453z.
Step 3: Under Services > HBASE > Configs > Advanced > Advanced hbase-site, change the hbase.rootdir path to point to the container of the source cluster.
Step 4: Clean the ZooKeeper data on the destination cluster. The cleanup can be run from any of the ZooKeeper nodes or worker nodes.
Step 5: Restart all components that require a restart from Ambari.
Step 6: Clean the WAL FS data for the destination cluster, and copy the WAL directory from the source cluster into the destination cluster’s HDFS. The copy can be run from any of the ZooKeeper nodes or worker nodes.
Step 7: Copy the apps folder from the destination container to the source container.
Step 8: Restart all components that require a restart from Ambari.
Step 9: Validation
Validate the table and record count on the source cluster.
Validate the table and record count on the destination cluster.
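The commands behind Steps 4, 6 and 9 above can be sketched as follows. This is a hedged outline, not the exact commands of the original walkthrough. It assumes a wasbs:// storage URI, a hypothetical storage account name, an /hbase root folder, the /hbase-unsecure znode, hdfs://mycluster/hbasewal as the WAL location, and a WAL backup stored under /hbase-wal-backup in the source container. Step 7 is left out because the apps folder path is not given here. Without arguments the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the destination-cluster (HDI 4.0) commands, run over SSH.
# ASSUMPTIONS, not from the article: storage account name, wasbs:// scheme,
# /hbase root folder, /hbase-unsecure znode, hdfs://mycluster/hbasewal,
# and the /hbase-wal-backup folder in the source container.
SRC_CONTAINER="cluster1testhbase-2021-05-12t07-23-50-453z"  # from Step 2
STORAGE_ACCOUNT="mystorageaccount"                           # hypothetical
SRC_URI="wasbs://${SRC_CONTAINER}@${STORAGE_ACCOUNT}.blob.core.windows.net"
WAL_DIR="hdfs://mycluster/hbasewal"
TABLE="TestTable"

DRY_RUN=1   # commands are only printed unless a real mode is requested below

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Send one statement to the HBase shell, or print it in dry-run mode.
hbase_shell() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hbase shell: $1"
  else
    echo "$1" | hbase shell
  fi
}

# Steps 2 and 3: values to enter in Ambari (printed only, never applied).
show_config() {
  echo "fs.defaultFS = $SRC_URI"
  echo "hbase.rootdir = $SRC_URI/hbase"
}

# Step 4: remove the stale HBase znode. Piping into zkcli is an assumption;
# the same two commands can be typed at the interactive prompt instead.
clean_zookeeper() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hbase zkcli: rmr /hbase-unsecure"
  else
    printf 'rmr /hbase-unsecure\nquit\n' | hbase zkcli
  fi
}

# Step 6: replace the destination WAL directory with the source backup.
restore_wal() {
  run sudo -u hbase hdfs dfs -rm -r "$WAL_DIR"
  run sudo -u hbase hdfs dfs -cp "$SRC_URI/hbase-wal-backup/hbasewal" hdfs://mycluster/
}

# Step 9: list the tables and count the rows to compare with the source.
validate() {
  hbase_shell "list"
  hbase_shell "count '$TABLE'"
}

case "${1:-plan}" in
  zk)       DRY_RUN=0; clean_zookeeper ;;
  wal)      DRY_RUN=0; restore_wal ;;
  validate) DRY_RUN=0; validate ;;
  *)        show_config; clean_zookeeper; restore_wal; validate ;;
esac
```

The modes are separate because Ambari restarts (Steps 5 and 8) happen between them. The migration can be considered validated when the count on the destination matches the count recorded on the source.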