The replication factor in HDFS is the number of copies of each block of a file that the file system maintains. A Hadoop application can specify the number of replicas of a file it wants HDFS to maintain.
This information is stored in the NameNode.
We can set the replication factor in the following ways:
We can use the Hadoop fs shell to specify the replication factor for a file. The command is as follows:
$ hadoop fs -setrep -w 5 /file_name
In the above command, the replication factor of the file file_name is set to 5. The -w flag makes the command wait until the replication has completed.
We can also use the Hadoop fs shell to specify the replication factor of all the files in a directory.
$ hadoop fs -setrep -w 2 /dir_name
In the above command, the replication factor of all the files under the directory dir_name is set to 2.
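For a single file, setting the replication factor and then reading it back can be sketched as a small shell function. This is only an illustration: the function name set_and_check_replication is hypothetical and not part of Hadoop, and it assumes the hadoop command is on the PATH. It relies on the -stat option of the fs shell, whose %r format prints the replication factor of a file.

```shell
# Hypothetical helper (not part of Hadoop): set the replication factor
# of a single file and print the value HDFS reports afterwards.
set_and_check_replication() {
  local factor="$1"
  local path="$2"

  # Reject empty or non-numeric values before calling Hadoop.
  case "$factor" in
    ''|*[!0-9]*)
      echo "replication factor must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ "$factor" -lt 1 ]; then
    echo "replication factor must be at least 1" >&2
    return 1
  fi

  # -w waits until the file has reached the requested replication.
  hadoop fs -setrep -w "$factor" "$path" || return 1

  # %r prints the replication factor recorded for the file.
  hadoop fs -stat %r "$path"
}
```

Calling set_and_check_replication 5 /file_name would run the same setrep command shown earlier and then print the replication factor that HDFS reports for /file_name. Because -w can take a long time on large files, it may be reasonable to drop it when an immediate return matters more than confirmation.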