Configuring Object Storage with Flink
Flink can use Object Storage as a sink for persisting data. This is achieved through the HDFS connector. To configure it:
- Access Apache Ambari.
- From the side toolbar, under Services, select HDFS.
- Select Configs.
- Verify or update the relevant HDFS connector properties in core-site.xml, and then restart HDFS.
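After these steps, core-site.xml should hold the connection properties of the Object Storage HDFS connector. A minimal sketch of what such a configuration might look like, assuming the OCI HDFS connector property names; every value shown is an illustrative placeholder, not taken from this document:

```xml
<!-- core-site.xml fragment (illustrative sketch; replace placeholders with real values) -->
<property>
  <name>fs.oci.client.hostname</name>
  <value>https://objectstorage.<region>.oraclecloud.com</value>
</property>
<property>
  <name>fs.oci.client.auth.tenantId</name>
  <value><tenancy-ocid></value>
</property>
<property>
  <name>fs.oci.client.auth.userId</name>
  <value><user-ocid></value>
</property>
<property>
  <name>fs.oci.client.auth.fingerprint</name>
  <value><api-key-fingerprint></value>
</property>
<property>
  <name>fs.oci.client.auth.pemfilepath</name>
  <value>/path/to/api_key.pem</value>
</property>
```

Depending on the cluster's authentication setup, a different set of connector properties may apply; consult the connector documentation for the exact names required.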
To write stream data to Object Storage, run:
export HADOOP_CLASSPATH=`hadoop classpath`; sudo /usr/odh/current/flink/bin/flink run-application -t yarn-application -yD classloader.check-leaked-classloader=false -yD security.kerberos.login.keytab=/etc/security/keytabs/smokeuser.headless.keytab -yD security.kerberos.login.principal=<ambari-qa-principal> /usr/odh/current/flink/examples/streaming/WordCountStreaming.jar --input <object_storage_input_file_location> --output <object_storage_output_file_location>
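For reference, the <object_storage_input_file_location> and <object_storage_output_file_location> placeholders are Object Storage URIs using the connector's oci:// scheme. A hedged, illustrative example; the bucket and namespace names are placeholders, not values from this document:

```
--input  oci://<bucket>@<namespace>/wordcount/input.txt
--output oci://<bucket>@<namespace>/wordcount/output
```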
To set Object Storage as sink in Batch mode, run:
export HADOOP_CLASSPATH=`hadoop classpath`; sudo /usr/odh/current/flink/bin/flink run-application -t yarn-application -yD classloader.check-leaked-classloader=false -yD security.kerberos.login.keytab=/etc/security/keytabs/smokeuser.headless.keytab -yD security.kerberos.login.principal=<ambari-qa-principal> /usr/odh/current/flink/examples/batch/WordCount.jar --input <object_storage_input_file_location> --output <object_storage_output_file_location>
- To use Object Storage for storing savepoints, set the state.savepoints.dir parameter to <object_storage_file_location>.
- To use Object Storage for storing checkpoints, configure the checkpoint interval and the Object Storage path in the client.
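The two settings above can be sketched as Flink configuration entries. This is an illustrative fragment only, assuming the connector's oci:// scheme and placeholder bucket and namespace names; the checkpoint interval can alternatively be enabled from the client application via StreamExecutionEnvironment.enableCheckpointing:

```
# flink-conf.yaml fragment (illustrative sketch; paths are placeholders)
state.savepoints.dir: oci://<bucket>@<namespace>/flink/savepoints
state.checkpoints.dir: oci://<bucket>@<namespace>/flink/checkpoints
# trigger a checkpoint every 60 seconds
execution.checkpointing.interval: 60000
```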