Evolution of our Hadoop Architecture - Appendix: Add Multiple Hive Interactive Servers

Datum Scientia
Jan 29, 2021

Steps to host another interactive server with container-backed LLAP on a Kerberized HDP 2.6.5 cluster

CAUTION: Test the steps below in a lower environment first.

LLAP Architecture

Implementation Plan

Enable Multiple Hive LLAP Instances on Cluster

Take a backup of the following script on the Ambari host:

/var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/scripts/hive_server_interactive.py
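
One way to take the backup is a small Python helper, sketched below. The function name `backup_script` and the timestamped `.bak` suffix are illustrative choices, not something Ambari prescribes; a plain `cp -p` does the same job.

```python
import shutil
import time
from pathlib import Path


def backup_script(path: Path) -> Path:
    """Copy `path` next to itself with a timestamped .bak suffix (hypothetical naming)."""
    stamp = time.strftime("%Y%m%d%H%M%S")
    target = path.with_name(f"{path.name}.{stamp}.bak")
    # copy2 keeps the permission bits and modification time of the original.
    shutil.copy2(path, target)
    return target
```

Calling `backup_script(Path("/var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/scripts/hive_server_interactive.py"))` on the Ambari host returns the path of the copy, which can be restored later if the edits need to be undone.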

Modify the script “hive_server_interactive.py” on the Ambari-Server host:

  1. vi /var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/scripts/hive_server_interactive.py
  • # Line 219: Add “--name {params.llap_app_name}” after “--service llap”
Before:
      cmd = format("{stack_root}/current/hive-server2-hive2/bin/hive --service llap --slider-am-container-mb {params.slider_am_container_mb} "
After:
      cmd = format("{stack_root}/current/hive-server2-hive2/bin/hive --service llap --name {params.llap_app_name} --slider-am-container-mb {params.slider_am_container_mb} "
  • # Line 246: Modify “.slider/keytabs/{params.hive_user}/” to “.slider/keytabs/{llap_app_name}/”
Before:
        cmd += format(" --slider-keytab-dir .slider/keytabs/{params.hive_user}/ --slider-keytab "
After:
        cmd += format(" --slider-keytab-dir .slider/keytabs/{llap_app_name}/ --slider-keytab "
  • # Line 356: Modify “--folder {params.hive_user}” to “--folder {llap_app_name}”
Before:
      slider_keytab_install_cmd = format("slider install-keytab --keytab {params.hive_llap_keytab_file} --folder {params.hive_user} --overwrite")
After:
      slider_keytab_install_cmd = format("slider install-keytab --keytab {params.hive_llap_keytab_file} --folder {llap_app_name} --overwrite")
  • # Line 401: Add “--name {llap_app_name}” after “--service llapstatus”
Before:
      llap_status_cmd = format("{stack_root}/current/hive-server2-hive2/bin/hive --service llapstatus -w -r {percent_desired_instances_to_be_up} -i {refresh_rate} -t {total_timeout}")
After:
      llap_status_cmd = format("{stack_root}/current/hive-server2-hive2/bin/hive --service llapstatus --name {llap_app_name} -w -r {percent_desired_instances_to_be_up} -i {refresh_rate} -t {total_timeout}")
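
The four edits above are plain string substitutions, so they can also be applied programmatically instead of by hand in vi. The sketch below is one possible way to do that; `EDITS` and `patch_source` are hypothetical names. It assumes each target string occurs exactly once in the file, and refuses to continue otherwise (for example if the script was already patched, or the line contents differ in your Ambari version).

```python
# Each pair is (text to find, replacement), taken from the four edits above.
EDITS = [
    ("--service llap --slider-am-container-mb",
     "--service llap --name {params.llap_app_name} --slider-am-container-mb"),
    ("--slider-keytab-dir .slider/keytabs/{params.hive_user}/",
     "--slider-keytab-dir .slider/keytabs/{llap_app_name}/"),
    ("--folder {params.hive_user} --overwrite",
     "--folder {llap_app_name} --overwrite"),
    ("--service llapstatus -w",
     "--service llapstatus --name {llap_app_name} -w"),
]


def patch_source(source: str) -> str:
    """Apply the four edits; raise unless every target occurs exactly once."""
    for old, new in EDITS:
        count = source.count(old)
        if count != 1:
            raise ValueError(f"expected exactly one match for {old!r}, found {count}")
        source = source.replace(old, new)
    return source
```

Reading the script into a string, passing it through `patch_source`, and writing the result back gives the same file as the manual edits. Running it a second time raises `ValueError`, which guards against patching twice.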

2. Restart Ambari-server

3. Set up the YARN queues

Default:
Llap_custom:
Etl_custom:

4. Verify that the keytab “hive.llap.zk.sm.keytab” exists on the HiveServer2 Interactive host.

5. If it doesn’t exist, create it by copying the Hive service keytab:

cp -p /etc/security/keytabs/hive.service.keytab /etc/security/keytabs/hive.llap.zk.sm.keytab
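
Steps 4 and 5 together amount to "copy only if missing". A minimal Python sketch of that check is below; `ensure_keytab` is a hypothetical helper name. It mirrors `cp -p` by keeping mode, timestamps and ownership, and keeping ownership for another user's file requires running as root.

```python
import os
import shutil
from pathlib import Path


def ensure_keytab(source: Path, target: Path) -> bool:
    """Copy `source` to `target` only if `target` is missing.

    Returns True if a copy was made, False if `target` already existed.
    """
    if target.exists():
        return False
    shutil.copy2(source, target)            # keeps mode bits and timestamps
    st = source.stat()
    os.chown(target, st.st_uid, st.st_gid)  # cp -p also keeps owner and group
    return True
```

On the HiveServer2 Interactive host this would be called with `/etc/security/keytabs/hive.service.keytab` as the source and `/etc/security/keytabs/hive.llap.zk.sm.keytab` as the target. An existing keytab is never overwritten.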

Further steps to add a new Interactive server on a new host:

  1. Run the following curl command against the Ambari-Server API to add the HIVE_SERVER_INTERACTIVE component to the new host.
curl --user <userName>:<Password> -H "X-Requested-By: ambari" -i -X POST http://<ambariHostName>:8080/api/v1/clusters/<ClusterName>/hosts/<etl_host_name>/host_components/HIVE_SERVER_INTERACTIVE

Where:

  • <userName> = ambari-server admin user name
  • <Password> = ambari-server admin password
  • <ambariHostName> = ambari-server host name
  • <ClusterName> = cluster name // you can fetch it from: http://<ambariHostName>:8080/api/v1/clusters/
  • <etl_host_name> = host name of the new HiveServer2 Interactive
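
For scripted setups, the same request can be built with Python's standard library. This is a sketch under the same assumptions as the curl command (HTTP on port 8080, basic authentication); `host_component_request` is a hypothetical helper name, and the function only builds the request without sending it.

```python
import base64
import urllib.request


def host_component_request(ambari_host, cluster, host, user, password, method="POST"):
    """Build the Ambari API request for HIVE_SERVER_INTERACTIVE on one host."""
    url = (f"http://{ambari_host}:8080/api/v1/clusters/{cluster}"
           f"/hosts/{host}/host_components/HIVE_SERVER_INTERACTIVE")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "X-Requested-By": "ambari",          # same header as in the curl command
        "Authorization": f"Basic {token}",   # equivalent of curl --user
    }
    return urllib.request.Request(url, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` sends it. Calling the helper with `method="DELETE"` produces the request used in the rollback plan.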

2. Once the command has executed successfully, the service appears in Ambari in the “Install Pending” state.

3. Click on “Re-Install” and check that the service gets installed successfully.

4. Create a config group for the new “HiveServer2 Interactive” and assign the new host to it:

  • In Ambari, navigate to HIVE > Config > Manage Config Groups.
  • Create a new config group named “llap1”, add the new host to it, and click Save.

5. Configure the properties that differentiate each LLAP instance.

6. Review the following properties and tune them for the HiveServer2 Interactive instance that uses the etl_custom queue, in line with its YARN queue resource allocation:

Advanced hive-interactive-env > hive_heapsize= 32768 // Heap Size of HiveServer2 Interactive
Interactive Query > hive.llap.daemon.queue.name=etl_custom // Yarn Queue Name
Advanced hive-interactive-env > llap_app_name=llap1 // LLAP Instance Name
Advanced hive-interactive-site > hive.server2.zookeeper.namespace=hiveserver2-llap1 // LLAP Instance ZK Discovery Name
Advanced hive-interactive-site > hive.llap.daemon.service.hosts=@llap1 // Match LLAP Instance Name
Advanced hive-interactive-env > slider_am_container_mb= 4608
Advanced hive-interactive-site > hive.server2.tez.default.queues=etl_custom
Advanced hive-interactive-site > hive.server2.tez.sessions.per.default.queue=1
Advanced tez-interactive-site > tez.am.resource.memory.mb=4096
Advanced hive-interactive-site > hive.llap.daemon.yarn.container.mb=4608 // (Memory per Daemon)
Advanced hive-interactive-env > llap_heap_size= 2560 // (LLAP Daemon Heap Size)
Advanced hive-interactive-site > hive.llap.daemon.num.executors=1 // n_executors
Advanced hive-interactive-env > llap_headroom_space= 1024 // (LLAP Daemon Container Max Headroom)
Advanced hive-interactive-site > hive.llap.io.memory.size= 1024 // (In-Memory Cache per Daemon)
Advanced hive-interactive-site > hive.llap.io.threadpool.size=1
Advanced hive-interactive-site > hive.auto.convert.join.noconditionaltask.size= 3221225472 // 25% of Each Task Memory = 25% * (llap_heap_size) / hive.llap.daemon.num.executors
Advanced tez-interactive-site > tez.runtime.io.sort.mb=100 // 40% of Each Task Memory = 25% * (llap_heap_size) / hive.llap.daemon.num.executors
Interactive Query > num_llap_nodes=1
Advanced hive-interactive-env > num_llap_nodes_for_llap_daemons=1 // Number of Node(s) for running Hive LLAP daemon
Advanced hive-interactive-site
Restricted session configs= hive.execution.engine
hive.execution.mode= container
hive.auto.convert.join.noconditionaltask.size=3221225472
Number of executors per LLAP Daemon=1
hive.llap.execution.mode=none
hive.llap.io.enabled=true
hive.llap.io.threadpool.size=1
hive.tez.container.size=9216
General:
hive.tez.java.opts= -server -Xmx7373m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps
Advanced hive-env
Hive PID Dir=/var/run/hive
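
Several of the memory values above line up arithmetically, which is a useful sanity check when adapting them to a different queue size. The sketch below recomputes three of them. The three rules it uses (daemon container = heap + headroom + cache, -Xmx at about 80% of the Tez container, map-join threshold at one third of the Tez container) are inferred from how the listed numbers relate to each other, and are assumptions rather than rules stated in this article. `derived_sizes` is a hypothetical helper name.

```python
MB = 1024 * 1024


def derived_sizes(heap_mb, headroom_mb, io_cache_mb, tez_container_mb):
    """Recompute dependent memory settings from the base values (assumed rules)."""
    return {
        # hive.llap.daemon.yarn.container.mb: heap + headroom + in-memory cache
        "daemon_container_mb": heap_mb + headroom_mb + io_cache_mb,
        # -Xmx in hive.tez.java.opts: about 80% of hive.tez.container.size
        "tez_xmx_mb": round(0.8 * tez_container_mb),
        # hive.auto.convert.join.noconditionaltask.size: one third of the
        # Tez container, expressed in bytes
        "mapjoin_bytes": tez_container_mb // 3 * MB,
    }


# Base values copied from the settings above (MB).
sizes = derived_sizes(heap_mb=2560, headroom_mb=1024, io_cache_mb=1024,
                      tez_container_mb=9216)
```

With the values from this article, the result matches the configured 4608 MB daemon container, the -Xmx7373m heap, and the 3221225472-byte map-join threshold.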

Rollback Plan:

  1. Stop the newly hosted second LLAP instance from Ambari.
  2. Delete the second Hive Interactive Server component with an API call to the Ambari Server:
curl --user <userName>:<Password> -H "X-Requested-By: ambari" -i -X DELETE http://<ambariHostName>:8080/api/v1/clusters/<ClusterName>/hosts/<etl_host_name>/host_components/HIVE_SERVER_INTERACTIVE

3. Remove the host from the config group created for the second LLAP instance, then delete the config group and save. This reverts all the service-level parameters set for the newly hosted LLAP instance.

4. Remove the YARN queue created for etl_custom and restore the original capacity allocation of the remaining queues.
