
HDFS - Useful commands


Configuration Files:
/etc/hadoop/conf/core-site.xml
/etc/hadoop/conf/hdfs-site.xml

Most common port for the NameNode web UI: 50070 (Hadoop 2; the default in Hadoop 3 is 9870)

To view a properties file:
> view core-site.xml

Useful property from core-site.xml
fs.defaultFS -> hdfs://nameNodeIp:8020
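
A property can also be read without opening the file in a viewer. The sketch below is a minimal example, assuming the usual layout in which each <name> and <value> element sits on its own line inside a <property> block. get_prop is a hypothetical helper, and the sample file is generated only for illustration. On a machine with Hadoop installed, `hdfs getconf -confKey fs.defaultFS` reports the effective value instead.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> of property NAME from a Hadoop XML config file.
# Hypothetical helper; assumes <name> comes before <value> within each <property>.
get_prop() {
  awk -v key="$2" '
    /<name>/ { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n); found = (n == key) }
    found && /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v); print v; exit }
  ' "$1"
}

# Sample file for illustration only; the real one is /etc/hadoop/conf/core-site.xml.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nameNodeIp:8020</value>
  </property>
</configuration>
EOF

get_prop "$sample_conf" fs.defaultFS
```

A property that is not present in the file produces no output, which usually means the built-in default is in effect.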

Useful properties from hdfs-site.xml
> dfs.blocksize -> <Bytes Value> #Default value in Hadoop 2 is 128 MB (134217728 bytes)
> dfs.replication -> <Number of replicas> #Default value is 3
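
Together these two properties determine how a file is laid out: it is split into blocks of dfs.blocksize bytes, and each block is stored dfs.replication times. A small worked example, assuming the default values of 128 MB and 3 and a hypothetical 300 MB file (none of the numbers are read from a cluster):

```shell
#!/bin/sh
# All values are illustrative assumptions, not read from a cluster.
block_size=$((128 * 1024 * 1024))   # dfs.blocksize default in Hadoop 2, in bytes
replication=3                       # dfs.replication default
file_size=$((300 * 1024 * 1024))    # hypothetical 300 MB file

# Ceiling division: the last block may be only partly filled.
blocks=$(( (file_size + block_size - 1) / block_size ))
# Blocks are not padded, so raw usage is the file size times the replication factor.
raw_bytes=$(( file_size * replication ))

echo "blocks: $blocks"
echo "block replicas: $(( blocks * replication ))"
echo "raw storage (MB): $(( raw_bytes / 1024 / 1024 ))"
```

With these numbers the file occupies 3 blocks (two full, one partial), 9 block replicas across the data nodes, and 900 MB of raw disk.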

Useful Linux commands
> du -sh <file> #Show the size of a local file or directory in human-readable form

Show all hadoop command line commands:
> hadoop fs

> hadoop fs -ls /user/<USERNAME> #Show the user's home directory in HDFS

> hadoop fs -ls <dirName> #List the files in the given directory, with their permissions

> hadoop fs -help <command> #Show help for the given command

> hadoop fs -copyFromLocal <localDirPath> <targetDirPath>/. #Copy from the local file system to HDFS. Alternative to the -put command.
#NOTE: The trailing /. is required so that the source folder itself is copied into the target directory, not just the source folder's contents.

> hadoop fs -du -s -h <filePath> #Show the size of a file or directory in HDFS

> hadoop fs -tail <hdfs-file-path> #Show the last kilobyte of a file in HDFS (use -cat for the whole file)

The following commands are used by administrators (on recent releases, hdfs fsck is the preferred form of hadoop fsck):
> hadoop fsck / -files #File system check. Lists every file in HDFS as it is checked

> hadoop fsck / -files -blocks #Also lists the blocks of each file

> hadoop fsck / -files -blocks -locations #Also lists where each block is stored, i.e. the data nodes holding its replicas

> hadoop fsck / -files -blocks -locations -racks #Also shows the network topology (rack) of each data-node location

> hadoop fsck / -delete #Delete corrupted files in HDFS
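
The reporting flags build on one another, so a small wrapper can select the level of detail by number. This is only a sketch: fsck_flags and its levels 1 to 4 are hypothetical shorthand, not part of Hadoop, and the script runs the check only if the hadoop client is found on the PATH.

```shell
#!/bin/sh
# fsck_flags LEVEL: print the fsck reporting flags for a detail level from 1 to 4.
# Hypothetical helper; the levels are shorthand for the four commands listed above.
fsck_flags() {
  case "$1" in
    1) echo "-files" ;;
    2) echo "-files -blocks" ;;
    3) echo "-files -blocks -locations" ;;
    4) echo "-files -blocks -locations -racks" ;;
    *) echo "usage: fsck_flags 1|2|3|4" >&2; return 1 ;;
  esac
}

flags=$(fsck_flags 3)
if command -v hadoop >/dev/null 2>&1; then
  # $flags is left unquoted on purpose so that it splits into separate arguments.
  hadoop fsck / $flags
else
  # No hadoop client installed: only show what would be run.
  echo "would run: hadoop fsck / $flags"
fi
```

Checking a narrower path than / keeps the output manageable on a large cluster, since each extra flag adds lines for every file.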

