Big Data Cloud Convenience Functions

Big Data Cloud scripts come with the following helper functions for convenience.

# 
# hdfsCopy 
# Copies data between the object store and HDFS or the local file system
# Usage:
#   hdfsCopy swift://container1.default/logs/one.log hdfs:///tmp/
#      OR
#   hdfsCopy hdfs:///tmp/one.log swift://container1.default/logs/
#      OR
#   hdfsCopy swift://container1.default/logs/one.log file:///tmp/
#
# hdfsStat
# Checks whether the given file or directory is present on HDFS or the object store
# Usage:
#   hdfsStat hdfs:///user
#      OR
#   hdfsStat swift://container1.default/bdscsce/
#
# hdfsMkdir
# Creates a directory on HDFS or the object store
# Usage:
#   hdfsMkdir hdfs:///tmp/trial
#      OR
#   hdfsMkdir swift://container1.default/bdscsce/trial
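#
# Example (illustrative): create a staging directory on HDFS if it does not
# already exist, then copy a log file into it from the object store.
# The paths are hypothetical, and this assumes hdfsStat returns a nonzero
# exit status when the path is absent.
#   if ! hdfsStat hdfs:///tmp/staging; then
#       hdfsMkdir hdfs:///tmp/staging
#   fi
#   hdfsCopy swift://container1.default/logs/one.log hdfs:///tmp/staging/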
#
# getDefaultContainer
# returns the default container that is registered with the cluster
# Usage:
#   default_container=$(getDefaultContainer)
#
# getBaseObjectStoreUrl
# returns a URL that points to the default object store container
# the URL does not have a trailing '/', to make it easy to append a path
# Usage:
#   objectStoreURL=$(getBaseObjectStoreUrl)
#      OR
#   objectStoreURL=$(getBaseObjectStoreUrl)/bdcsce/logs/one.log
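#
# Example (illustrative): copy a local result file into the default object
# store container; "results" and /tmp/out.csv are hypothetical names.
#   hdfsCopy file:///tmp/out.csv $(getBaseObjectStoreUrl)/results/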
#
# getClusterName
# returns the cluster name
# Usage:
#   cluster_name=$(getClusterName)
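#
# Example (illustrative): create a per-cluster scratch directory on HDFS,
# named after the cluster.
#   hdfsMkdir hdfs:///tmp/$(getClusterName)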
#
# getMasterNodes
# returns a space-separated list of master nodes
# The following for loop can be used to print all master nodes
#   for i in $(getMasterNodes); do echo $i; done
# Usage:
#   master_nodes=$(getMasterNodes)
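#
# Example (illustrative): check whether the current host is a master node;
# assumes getMasterNodes returns host names in the same form as hostname -f.
#   for node in $(getMasterNodes); do
#       if [ "$node" = "$(hostname -f)" ]; then
#           echo "Running on master node $node"
#       fi
#   done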
#
# getComputeOnlySlaveNodes
# returns a space-separated list of compute-only slave nodes
# Usage:
#   compute_only_slaves=$(getComputeOnlySlaveNodes)
#
# getComputeAndStorageSlaveNodes
# returns a space-separated list of compute and storage slave nodes
#
# getAllNodes
# returns a space-separated list of all nodes
#
# getAmbariServerNodes
# returns a space-separated list of all Ambari server nodes
#
# getSparkThriftServerNodes
# returns a space-separated list of all Spark Thrift Server nodes
#
# getHive2ServerNodes
# returns a space-separated list of all Hive2 server nodes
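#
# Example (illustrative): report the size of each node group; wc -w counts
# the space-separated host names returned by each function.
#   echo "All nodes:           $(getAllNodes | wc -w)"
#   echo "Master nodes:        $(getMasterNodes | wc -w)"
#   echo "Compute-only slaves: $(getComputeOnlySlaveNodes | wc -w)"
#   echo "Ambari servers:      $(getAmbariServerNodes | wc -w)"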