hdfs [2016/04/21 14:32] (current) – created – external edit 127.0.0.1
====== hdfs ======

  * [[Hadoop]]
  * [[HDFS Filesystem]]
  * [[hdfs dfs]]
  * [[webhdfs]]

  * [[http://
  * [[http://
  * [[http://

  * [[https://

==== User Commands ====

  * [[hdfs archive]]
  * [[hdfs distcp]]
  * [[hdfs dfs]]
  * [[hdfs fs]]
  * [[hdfs fsck]]
  * [[hdfs fetchdt]]

==== Admin Commands ====

  * [[hdfs balancer]]
  * [[hdfs daemonlog]]
  * [[hdfs datanode]]
  * [[hdfs dfsadmin]]
  * [[hdfs namenode]]
  * [[hdfs secondarynamenode]]

=== Working Directory ===

Relative paths are resolved against the user's home directory, which is assumed (e.g., ''/

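As a sketch (the user ''alice'' and home directory ''/user/alice'' are assumptions for illustration, not from the original), these two listings refer to the same directory:

<code>
hdfs dfs -ls input
hdfs dfs -ls /user/alice/input
</code>
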
=== Removing an HDFS node ===

For some reason or another, you may need to remove an HDFS node from the cluster; doing maintenance would be an example. The filesystem needs to be checked first to see if there are any problems with the blocks stored on that node.

Before doing anything, get a report about the node to see what its status is.

<code>
hdfs dfsadmin -report
</code>

<code>
Name: 192.168.12.24:
Hostname: hadoop-node4.lan
Decommission Status : Normal
Configured Capacity: 5487524069376 (4.99 TB)
DFS Used: 9204400128 (8.57 GB)
Non DFS Used: 279243939840 (260.07 GB)
DFS Remaining: 5199075729408 (4.73 TB)
DFS Used%: 0.17%
DFS Remaining%: 94.74%
</code>

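On a cluster with many DataNodes the report gets long; one way to pull out a single node's section is to grep for its ''Name'' line with some trailing context (the address is the one from the example output above):

<code>
hdfs dfsadmin -report | grep -A 10 'Name: 192.168.12.24'
</code>
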
Next, run a filesystem check to see if there are any corrupt, missing, or under-replicated blocks.

<code>
hdfs fsck /
hdfs fsck /
</code>

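If more detail is useful, ''fsck'' can also print per-file and per-block information; ''-files'', ''-blocks'', and ''-locations'' are standard ''hdfs fsck'' options:

<code>
hdfs fsck / -files -blocks -locations
</code>
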
If you want, you can make a backup of a directory as well. This will take some time, since the data is stored across multiple nodes and parts of it have to be pulled from each one.

<code>
hdfs dfs -copyToLocal /
</code>

On the HDFS NameNode (the primary server that keeps track of all the metadata), edit the HDFS configuration so that the node is excluded as a place for storage.

In the ''hdfs-site.xml'' file, point the ''dfs.hosts.exclude'' property at a text file that lists the nodes to remove.

In this case, the location I'm putting the text file is in ''/

First, the XML file addition (substitute the actual path to your exclude file):

<code>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/path/to/hdfs-removed-nodes</value>
</property>
</code>

And the contents of hdfs-removed-nodes:

<code>
hadoop-node4.lan
</code>

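As a minimal local sketch of creating that file (the ''/tmp'' path here is only an assumption for illustration; use whatever path ''dfs.hosts.exclude'' actually points at):

```shell
# Write the exclude file: one hostname per line.
# /tmp/hdfs-removed-nodes is a stand-in path for this sketch.
exclude_file=/tmp/hdfs-removed-nodes
echo 'hadoop-node4.lan' > "$exclude_file"
cat "$exclude_file"
```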
Tell the NameNode to re-read the host lists:

<code>
hdfs dfsadmin -refreshNodes
</code>

The HDFS node will be decommissioned, which can take a while since its blocks have to be re-replicated across the remaining nodes. You can check on the progress with another report:

<code>
hdfs dfsadmin -report
</code>

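A rough way to wait for decommissioning to finish is to poll the report; while blocks are still being moved, the node's status line reads ''Decommission in progress'' (a sketch; verify the exact status string on your Hadoop version):

<code>
while hdfs dfsadmin -report | grep -q 'Decommission in progress'; do
  sleep 60
done
</code>
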
Once the node is completely decommissioned, restart HDFS so the cluster picks up the change:

<code>
stop-dfs.sh
start-dfs.sh
</code>