arttuladhar
8/11/2017 - 3:04 AM

Hadoop CLI Cheatsheet

HDFS

### HDFS CHMOD - Changes file or directory permissions (777 grants read, write and execute to everyone)
hdfs dfs -chmod 777 /user

### HDFS MKDIR - Creates one or more directories (-p also creates missing parent directories)
hdfs dfs -mkdir <paths>

# Examples
hdfs dfs -mkdir /user/hadoop
hdfs dfs -mkdir /user/hadoop/art /user/hadoop/ant /user/hadoop/bat
hdfs dfs -mkdir -p /user/data/weather/mn/minnesota

# ---------------------------------------------------------------------- #

### HDFS GET - Copies / Downloads files from HDFS to Local File System
hdfs dfs -get <hdfs-src> <local-dest>

# Example
hdfs dfs -get /user/data/salaries/sf-2017 /home/

# ---------------------------------------------------------------------- #

### HDFS PUT - Copies a single source file or multiple source files from the local file system to HDFS
hdfs dfs -put <local-src> ... <HDFS_dest_path>

# Example
hdfs dfs -put sf-salaries-2011-2013.csv /user/hadoop/sf-salaries-2011-2013/sf-salaries-2011-2013.csv
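The example above writes into a subdirectory, and -put normally fails if the parent directory does not exist yet. A minimal sketch that creates the parent first; `run` and `hdfs_put_with_parents` are hypothetical helper names (not Hadoop commands), and setting DRY_RUN=1 is an assumed convention for printing the commands instead of running them:

```shell
# run: hypothetical wrapper. Prints the command when DRY_RUN=1, otherwise executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# hdfs_put_with_parents: hypothetical helper, not part of Hadoop.
# Usage: hdfs_put_with_parents <local-src> <hdfs-dest-file>
hdfs_put_with_parents() {
  src="$1"
  dest="$2"
  parent="$(dirname "$dest")"
  # Create the destination's parent directory (no error if it already exists), then upload.
  run hdfs dfs -mkdir -p "$parent" && run hdfs dfs -put "$src" "$dest"
}
```

Calling it with DRY_RUN=1 first is a cheap way to check the paths before anything is written to the cluster.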

# ---------------------------------------------------------------------- #

### HDFS GETMERGE - Concatenates the files in one or more HDFS source directories (or individual source files) into a single local destination file
hdfs dfs -getmerge <src> <localdst>

# Example
hdfs dfs -getmerge /user/hadoop/sf-salaries-2011-2013/ /user/hadoop/sf-salaries-2014/ /root/output.csv

# ---------------------------------------------------------------------- #

### YARN KILL - Kills a YARN application by its application ID
yarn application -kill <app_id>
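
Application IDs have the form application_<cluster-timestamp>_<sequence>, and `yarn application -list` shows the IDs of current applications. A minimal sketch that checks the ID format before killing; `is_yarn_app_id` and `kill_yarn_app` are hypothetical helper names, not YARN commands:

```shell
# is_yarn_app_id: hypothetical helper. Succeeds if $1 looks like application_<digits>_<digits>.
is_yarn_app_id() {
  printf '%s\n' "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'
}

# kill_yarn_app: hypothetical helper. Refuses anything that is not a YARN application ID.
# Usage: kill_yarn_app <app_id>
kill_yarn_app() {
  if ! is_yarn_app_id "$1"; then
    echo "not a YARN application ID: $1" >&2
    return 1
  fi
  yarn application -kill "$1"
}
```

The check only guards against pasting the wrong kind of ID (for example an Oozie job ID); it does not confirm that the application exists.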

Oozie

Run a workflow or coordinator:

oozie job -config ${property_file_name} -run

Example

oozie job -config newtonpublishLoader.properties -run

To find the coordinator ID for a job:

oozie jobs --oozie http://bigredoozie.target.com:11000/oozie -filter name=${coordinator_name} -jobtype coordinator

Example

oozie jobs --oozie http://bigredoozie.target.com:11000/oozie -filter name=tpa_mkdn_tbl_grpLoader_COORDINATOR -jobtype coordinator
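
The Oozie CLI falls back to the OOZIE_URL environment variable when no server URL is given on the command line, which shortens the lookup. A minimal sketch; `find_coordinator` is a hypothetical helper name, not an Oozie command:

```shell
# Default server URL for the Oozie CLI, used when no URL option is passed.
export OOZIE_URL="http://bigredoozie.target.com:11000/oozie"

# find_coordinator: hypothetical helper that lists coordinators matching a name.
# Usage: find_coordinator <coordinator_name>
find_coordinator() {
  oozie jobs -filter "name=$1" -jobtype coordinator
}
```

For example, `find_coordinator tpa_mkdn_tbl_grpLoader_COORDINATOR` runs the same query as the example above.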

To list the workflow IDs for a given coordinator:

oozie job -info ${coordinator_id} -allruns

Example

oozie job -info 0017964-160420173116398-oozie-oozi-C -allruns

To kill the coordinator:

oozie job -kill ${coordinator_id}

Example

oozie job -kill 0017964-160420173116398-oozie-oozi-C
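
Oozie job IDs end in a suffix that indicates the job type: -W for a workflow, -C for a coordinator and -B for a bundle. A minimal sketch that uses the suffix to avoid killing the wrong kind of job; `oozie_job_type` and `kill_coordinator` are hypothetical helper names, not Oozie commands:

```shell
# oozie_job_type: hypothetical helper. Prints the job type implied by the ID suffix.
oozie_job_type() {
  case "$1" in
    *-W) echo workflow ;;
    *-C) echo coordinator ;;
    *-B) echo bundle ;;
    *)   echo unknown ;;
  esac
}

# kill_coordinator: hypothetical helper. Refuses IDs that are not coordinator IDs.
# Usage: kill_coordinator <coordinator_id>
kill_coordinator() {
  if [ "$(oozie_job_type "$1")" != "coordinator" ]; then
    echo "not a coordinator ID: $1" >&2
    return 1
  fi
  oozie job -kill "$1"
}
```

Like the commands above, this assumes the Oozie server URL is available to the CLI, for example through OOZIE_URL.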