</code>

If you want, you can make a backup of a directory as well. This will take some time, since the data is stored in blocks across multiple nodes, and the client has to pull them back from each one.

<code>
hdfs dfs -copyToLocal /user/beandog/bigdata /home/beandog/bigdata.bak
</code>

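Once the copy finishes, it's worth sanity-checking that the local copy is the size you expect. A quick comparison, using the same example paths as above:

<code>
hdfs dfs -du -s -h /user/beandog/bigdata
du -sh /home/beandog/bigdata.bak
</code>
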
On the HDFS NameNode (the primary server that keeps track of all the metadata), edit the HDFS configuration file to exclude that node as a place for storage.

In the ''hdfs-site.xml'' file, you'll add a new property named ''dfs.hosts.exclude''. Its value is the path to a file somewhere on the filesystem that lists each host to be decommissioned, one per line.

In this case, I'm putting the text file at ''/etc/hdfs-removed-nodes''.

First, the XML file addition:

<code>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hdfs-removed-nodes</value>
</property>
</code>

And the contents of ''hdfs-removed-nodes'':

<code>
hadoop-node4.lan
</code>

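The hostname here needs to match the name the NameNode already knows the DataNode by. If you're not sure what that is, you can grep for it in the DataNode report (using the example hostname from above):

<code>
hdfs dfsadmin -report | grep -i hadoop-node4
</code>
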
Tell the NameNode to re-read its list of nodes:

<code>
hdfs dfsadmin -refreshNodes
</code>

The HDFS node will be decommissioned, which will take some time, since its blocks have to be re-replicated to the remaining nodes first. You can view the status either through the web interface, or using ''hdfs dfsadmin'':

<code>
hdfs dfsadmin -report
</code>

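Each DataNode entry in the report includes a ''Decommission Status'' line, so you can filter for just that part (the exact output layout varies a bit between Hadoop versions):

<code>
hdfs dfsadmin -report | grep 'Decommission Status'
</code>
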
Once the node is completely decommissioned, you can remove it from the ''slaves'' file in your Hadoop configuration directory, and restart HDFS:

<code>
stop-dfs.sh
start-dfs.sh
</code>
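
After HDFS comes back up, run the report one more time to confirm the remaining DataNodes have all checked in and the old node is gone (the summary wording differs between releases, so treat the grep pattern as approximate):

<code>
hdfs dfsadmin -report | grep -i datanodes
</code>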