Hadoop FS Commands


The File System (FS) shell includes various shell-like commands that directly interact with the Hadoop Distributed File System (HDFS) as well as other file systems that Hadoop supports. The FS shell is invoked by:

Usage:
bin/hadoop fs <args>

All FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the Local FS the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to point to hdfs://namenodehost).
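As a concrete illustration, the same file can be addressed either way (the NameNode host below is a placeholder for your own cluster):

```shell
# Fully qualified URI; namenodehost stands in for your NameNode address
hadoop fs -ls hdfs://namenodehost/parent/child

# Short form, resolved against the default file system from the configuration
hadoop fs -ls /parent/child

# Local file system via the file scheme
hadoop fs -ls file:///tmp
```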

Most of the commands in FS shell behave like corresponding Unix commands. Let’s get started.


Create a directory in HDFS - mkdir

The hadoop mkdir command creates directories in HDFS, similar to the Unix mkdir command. It takes path URIs as arguments and creates the directories; use the -p option to create parent directories along the path as well.

Usage:
hadoop fs -mkdir [-p] <paths>
Examples:
hadoop fs -mkdir /user/hadoop/corejavaguru
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir -p /user/hadoop/corejavaguru/fscommands/demo

List the contents of an HDFS directory - ls

The ls command lists directories and files.

For a file ls returns stat on the file with the following format:

permissions number_of_replicas userid groupid filesize modification_date modification_time filename

For a directory it returns list of its direct children as in Unix. A directory is listed as:

permissions userid groupid modification_date modification_time dirname
Usage:
hadoop fs -ls <args>
Example:
hadoop fs -ls /user/hadoop/file1
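Putting the two formats together, a listing might look like this (the sizes, dates, and names shown are illustrative, not real cluster output):

```shell
hadoop fs -ls /user/hadoop
# Output shape — a file shows a replication count and size,
# a directory shows "-" for replicas and a size of 0:
# -rw-r--r--   3 hadoop supergroup       1366 2023-04-01 10:15 /user/hadoop/file1
# drwxr-xr-x   - hadoop supergroup          0 2023-04-01 10:20 /user/hadoop/dir1
```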

Upload a file into HDFS - put

The put command copies a single source, or multiple sources, to the destination file system. It also reads input from stdin and writes to the destination file system. The different ways to use the put command are:

Usage:
hadoop fs -put <localsrc> ... <dst>
Example:
hadoop fs -put /home/hadoop/Samplefile.txt  /user/hadoop/dir3/
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
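Since put also reads from stdin, a dash can be used in place of a local source (the target path below is hypothetical):

```shell
# Write the output of a pipeline directly into an HDFS file
echo "hello hdfs" | hadoop fs -put - /user/hadoop/stdin.txt
```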

Download a file from HDFS - get

Hadoop get command copies the files from HDFS to the local file system. The syntax of the get command is shown below:

Usage:
hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
Example:
hadoop fs -get /user/hadoop/file localfile
hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile

See contents of a file in HDFS - cat

The cat command prints the contents of a file to stdout.

Usage:
hadoop fs -cat URI [URI ...]
Example:
hadoop fs -cat /user/hadoop/dir1/xyz.txt

Copy a file from source to destination in HDFS - cp

The cp command copies a source to a target. It also accepts multiple sources, in which case the destination must be a directory.

Usage:
hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>
Example:
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir

Copy a file from Local file system to HDFS - copyFromLocal

The hadoop copyFromLocal command copies a file from the local file system to HDFS. It is similar to the put command, except that the source is restricted to a local file reference.

Usage: 
hadoop fs -copyFromLocal <localsrc> URI
Example:
hadoop fs -copyFromLocal /home/hadoop/xyz.txt  /user/hadoop/xyz.txt

Copy a file from HDFS to Local file system - copyToLocal

The hadoop copyToLocal command copies a file from HDFS to the local file system. It is similar to the get command, except that the destination is restricted to a local file reference.

Usage: 
hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Example:
hadoop fs -copyToLocal /user/hadoop/xyz.txt /home/hadoop/xyz.txt

Move file from source to destination in HDFS - mv

Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. Note: Moving files across file systems is not permitted.

Usage: 
hadoop fs -mv URI [URI ...] <dest>
Example:
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/dir1

Remove a file or directory in HDFS - rm, rmdir

rm

Deletes the files specified as args. Use the -r (or -R) option to delete directories and their contents recursively.

Usage: 
hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]
Example:
hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir
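To remove a directory together with its contents, pass -r (the paths below are hypothetical):

```shell
# Recursively delete a directory and everything under it
hadoop fs -rm -r /user/hadoop/olddir

# If trash is enabled, -skipTrash deletes permanently instead of moving to trash
hadoop fs -rm -r -skipTrash /user/hadoop/olddir
```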

rmdir

Deletes the directories specified as args.

Usage: 
hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]
Example:
hadoop fs -rmdir /user/hadoop/emptydir

Options: --ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.


Display last few lines of a file in HDFS - tail

Displays the last kilobyte of the file to stdout.

Usage: 
hadoop fs -tail [-f] URI
Example:
hadoop fs -tail /user/hadoop/demo.txt
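As in Unix tail, the -f option keeps the command running and prints data as it is appended to the file (the path below is hypothetical):

```shell
# Follow a growing log file in HDFS
hadoop fs -tail -f /user/hadoop/app.log
```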

Print statistics about the file or directory in HDFS - stat

Use stat to print statistics about the file/directory at <path> in the specified format.

Usage: 
hadoop fs -stat [format] <path> ...
Example:
hadoop fs -stat /user/hadoop/
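The format string accepts specifiers such as %n (name), %b (size in bytes), %y (modification time), %u and %g (owner and group), and %F (type); for example (the path is hypothetical):

```shell
# Print type, owner:group, size, modification time, and name
hadoop fs -stat "%F %u:%g %b %y %n" /user/hadoop/file1
```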

Display the size of files and directories in HDFS - du

The du command displays the aggregate length of the files contained in a directory, or the length of a file in case it is just a file.

Usage :
hadoop fs -du URI [URI ...]
Example:
hadoop fs -du /user/hadoop/dir1/xyz.txt

Change group of files in HDFS - chgrp

The hadoop chgrp shell command is used to change the group association of files. The user must be the owner of files, or else a super-user.

Usage: 
hadoop fs -chgrp [-R] GROUP URI [URI ...]
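For example (the group name and paths below are hypothetical):

```shell
# Change the group of a single file
hadoop fs -chgrp analytics /user/hadoop/file1

# Recursively change the group of a whole directory tree
hadoop fs -chgrp -R analytics /user/hadoop/dir1
```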

Change the permissions of files in HDFS - chmod

The hadoop chmod command is used to change the permissions of files. The user must be the owner of the file, or else a super-user.

Usage: 
hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]
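MODE can be given in octal or symbolic form, as with Unix chmod (the paths below are hypothetical):

```shell
# Octal form: rwxr-xr-x
hadoop fs -chmod 755 /user/hadoop/script.sh

# Symbolic form, applied recursively: remove write permission for group and others
hadoop fs -chmod -R go-w /user/hadoop/dir1
```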

Change the owner of files in HDFS - chown

The hadoop chown command is used to change the ownership of files. The user must be a super-user.

Usage: 
hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]
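For example (the owner, group, and paths below are hypothetical):

```shell
# Change the owner only
hadoop fs -chown alice /user/hadoop/file1

# Change owner and group recursively
hadoop fs -chown -R alice:analytics /user/hadoop/dir1
```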

Help for an individual HDFS command - usage

The command below returns the help text for an individual command.

Usage: 
hadoop fs -usage command
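For example:

```shell
# Show the usage summary for the mkdir command
hadoop fs -usage mkdir

# For fuller help text, -help works similarly
hadoop fs -help mkdir
```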