Hadoop for DBAs (4/13): Introduction to HDFS

HDFS is the Hadoop Distributed File System; it provides high availability through replication and automatic failover. This 4th article of the Hadoop for DBAs series digs into HDFS concepts. It is probably the most fun part for DBAs to learn, and also the closest to what you are used to. There are dozens of features to be interested in and learn. This article focuses on some of the concepts and a look under the hood. It also demonstrates HDFS high availability for DataNodes and the NameNode. Its 3 sections are independent and demonstrate:

  • A First Look at HDFS Underlying Structures;
  • HDFS Replication and DataNode Loss;
  • NameNode Failure and QJM.
HDFS is a file system and, in some ways, something you are already very used to. It provides parameters to bypass caches or to force data into caches. You can influence block distribution and geo-distribution, as well as the level of replication. It is built to feed processes running in parallel. It can be accessed in many different ways, including a CLI, Java, C and REST APIs. It provides advanced features like snapshots and rolling upgrades. You can move data dynamically and also extend or shrink your cluster. It is impressive!
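If you want a quick taste of the REST access, the WebHDFS API is one curl call away. This is just a sketch, assuming dfs.webhdfs.enabled is set to true on the NameNode:

# list /user/hadoop through the WebHDFS REST API exposed on the NameNode HTTP port
curl -s "http://yellow:50070/webhdfs/v1/user/hadoop?op=LISTSTATUS"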
On the other hand, you probably don't know anything quite like HDFS yet. It could be because there are not that many distributed file systems around anyway. Plus, it is written in Java to store blocks of files across thousands of servers. It is built for large files written by one process at a time, it still only supports appending data to existing files, and it relies on local file systems on every node. So, it is not something you would have needed anyway, unless you want to deal with at least a few terabytes of data and don't want to invest in proprietary hardware and software. But enough of the overview, let us look inside... And before you go ahead, make sure you have installed and started HDFS as described in Hadoop for DBAs (3/13): A Simple 3-Node Hadoop Cluster.

A First Look at HDFS Underlying Structures

As you’ve figured out by installing and starting it, HDFS relies on 2 main components:

  • The NameNode manages metadata, including directories and files, file-to-block mappings, ongoing and some past operations, as well as the status of DataNodes
  • DataNodes store, as you can expect, data blocks, which are chunks of files

The number of operations performed by the NameNode is kept minimal so that HDFS can scale out without making the infrastructure more complex by adding namespaces. As a result, clients get data from and send data directly to DataNodes. The diagram below shows how a client interacts with HDFS when storing a file:

  1. The client coordinates with the NameNode to get and store metadata. Once done, it sends the file blocks to a DataNode
  2. The DataNode replicates the data to other DataNodes, which send feedback to the NameNode so that it can keep track of the metadata

hdfs put command
To check the details of those operations, clean up your HDFS file system and restart the NameNode. You will then find an fsimage file in the metadata directory. From the yellow server, run:

bin/hdfs dfs -rm -R -f -skipTrash /user/hadoop/*
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh start namenode

Once the NameNode has restarted, you should find an fsimage file to be used in case of a crash. That file contains an image of the HDFS files and directories. You can use the Offline Image Viewer, or oiv, to check its content:

ls -tr /u01/metadata/current
[...]
fsimage_0000000000000000503
fsimage_0000000000000000503.md5
VERSION
edits_inprogress_0000000000000000504
seen_txid
bin/hdfs oiv -i /u01/metadata/current/fsimage_0000000000000000503 \
    -o /tmp/fsimage -p Indented
cat /tmp/fsimage
drwxr-xr-x  -   hadoop supergroup 1409403918690 0 /
drwxrwx---  -   hadoop supergroup 1409421293892 0 /tmp
drwxr-xr-x  -   hadoop supergroup 1409403927142 0 /user
drwxr-xr-x  -   hadoop supergroup 1409421326069 0 /user/hadoop
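As a side note, restarting is not the only way to force a fresh fsimage to be written; you can also trigger a checkpoint manually by putting the NameNode in safe mode and saving the namespace:

bin/hdfs dfsadmin -safemode enter
bin/hdfs dfsadmin -saveNamespace
bin/hdfs dfsadmin -safemode leave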

Put a file in HDFS to figure out what happens; you can run hdfs dfs -put from pink:

dd if=/dev/zero of=/tmp/data bs=64M count=3
bin/hdfs dfs -put /tmp/data data
bin/hdfs dfs -ls /user/hadoop
Found 1 items
-rw-r--r-- 3 hadoop supergroup  201326592 2014-08-30 20:25 /user/hadoop/data

Once the file is in HDFS, you can check how it is stored with hdfs fsck:

bin/hdfs fsck /user/hadoop/data -files -blocks -locations
Connecting to namenode via http://yellow:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.56.4 for path /user/hadoop/data at Sat Aug 30 20:34:11
 CEST 2014
/user/hadoop/data 201326592 bytes, 2 block(s):  OK
0. BP-6177126-192.168.56.3-1407174901393:blk_1073741890_1066 len=134217728 repl=3 [192.168.56.4:50010, 192.168.56.5:50010, 192.168.56.3:50010]
1. BP-6177126-192.168.56.3-1407174901393:blk_1073741891_1067 len=67108864 repl=3 [192.168.56.4:50010, 192.168.56.5:50010, 192.168.56.3:50010]
Status: HEALTHY
 Total size:    201326592 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      2 (avg. block size 100663296 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Sat Aug 30 20:34:11 CEST 2014 in 11 milliseconds
The filesystem under path '/user/hadoop/data' is HEALTHY

Stop the NameNode again and check the content of the metadata directory. The edit files keep track of the operations managed by the NameNode. You can use the Offline Edits Viewer to understand what is managed by the NameNode and how/when operations are performed on DataNodes:

sbin/hadoop-daemon.sh stop namenode
ls -tr /u01/metadata/current
[...]
fsimage_0000000000000000503
[...]
seen_txid
edits_inprogress_0000000000000000517
bin/hdfs oev -i /u01/metadata/current/edits_inprogress_0000000000000000517 \
   -o /tmp/edit.xml -p XML
cat /tmp/edit.xml
[...]
 <RECORD>
    <OPCODE>OP_CLOSE</OPCODE>
    <DATA>
      <TXID>525</TXID>
      <LENGTH>0</LENGTH>
      <INODEID>0</INODEID>
      <PATH>/user/hadoop/data._COPYING_</PATH>
      <REPLICATION>3</REPLICATION>
      <MTIME>1409423158612</MTIME>
      <ATIME>1409423152498</ATIME>
      <BLOCKSIZE>134217728</BLOCKSIZE>
      <CLIENT_NAME></CLIENT_NAME>
      <CLIENT_MACHINE></CLIENT_MACHINE>
      <BLOCK>
        <BLOCK_ID>1073741890</BLOCK_ID>
        <NUM_BYTES>134217728</NUM_BYTES>
        <GENSTAMP>1066</GENSTAMP>
      </BLOCK>
      <BLOCK>
        <BLOCK_ID>1073741891</BLOCK_ID>
      <NUM_BYTES>67108864</NUM_BYTES>
        <GENSTAMP>1067</GENSTAMP>
      </BLOCK>
      <PERMISSION_STATUS>
        <USERNAME>hadoop</USERNAME>
        <GROUPNAME>supergroup</GROUPNAME>
        <MODE>420</MODE>
      </PERMISSION_STATUS>
    </DATA>
  </RECORD>
[...]

As you can guess from the above, the default block size is 128 MB, except for the last block of the file. If you look for those blocks in the underlying file system, you should find them named "blk_<blockid>" like below:

find /u01/files/current -iname "*107374189?"
.../BP-6177126-192.168.56.3-1407174901393/current/finalized/blk_1073741890
.../BP-6177126-192.168.56.3-1407174901393/current/finalized/blk_1073741891
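The 128 MB value is the default of the dfs.blocksize property, and it can be overridden per file. As a sketch, you should be able to write a second copy of the file with 64 MB blocks through a generic -D option (data2 being an arbitrary target name):

# write a copy of the file with 64 MB blocks instead of the 128 MB default
bin/hdfs dfs -D dfs.blocksize=67108864 -put /tmp/data data2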

HDFS Replication and DataNode Loss

As you can see from above, the default replication factor for every file block is 3. Because we have a 3-node cluster, that won't help to demonstrate that blocks are automatically re-replicated in the event of a server crash. To begin this section, change the level of replication to 2 (a command sketch follows the list):

  • Stop HDFS NameNode and DataNodes
  • Add a dfs.replication property in etc/hadoop/hdfs-site.xml and set it to 2
  • Remove the content of the data and metadata directories
  • Format the new file system
  • Start the HDFS NameNode and DataNodes back up
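As a sketch, assuming the data and metadata directories from the previous article live under /u01/files and /u01/metadata, those steps could look like this:

sbin/hadoop-daemon.sh stop namenode          # on yellow
sbin/hadoop-daemons.sh stop datanode         # stops the DataNode on every slave
# add the property below inside <configuration> in etc/hadoop/hdfs-site.xml
# on every node:
#   <property>
#     <name>dfs.replication</name>
#     <value>2</value>
#   </property>
for i in yellow pink green; do
   ssh $i rm -rf /u01/files/* /u01/metadata/*
done
bin/hdfs namenode -format                    # on yellow only
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemons.sh start datanode

Note that for an existing file you could also have changed the replication level without reformatting anything, with something like bin/hdfs dfs -setrep -w 2 /user/hadoop/data. Once HDFS is back up and formatted, recreate the user directory and put the file back: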
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop
dd if=/dev/zero of=/tmp/data bs=64M count=3
bin/hdfs dfs -put /tmp/data data
bin/hdfs fsck /user/hadoop/data -files -blocks -locations
Connecting to namenode via http://yellow:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.56.3 for path /user/hadoop/data at Sun Aug 31 06:47:24 CEST 2014
/user/hadoop/data 201326592 bytes, 2 block(s):  OK
0. BP-399541416-192.168.56.3-1409430021329:blk_1073741827_1003 len=134217728 repl=2 [192.168.56.4:50010, 192.168.56.3:50010]
1. BP-399541416-192.168.56.3-1409430021329:blk_1073741828_1004 len=67108864 repl=2 [192.168.56.3:50010, 192.168.56.5:50010]
Status: HEALTHY
 Total size:    201326592 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      2 (avg. block size 100663296 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Sun Aug 31 06:47:24 CEST 2014 in 4 milliseconds
The filesystem under path '/user/hadoop/data' is HEALTHY

As you can see above, there are only 2 copies of each block. Now stop one DataNode that contains some blocks. For this example, we will stop the one on yellow (192.168.56.3):

sbin/hadoop-daemon.sh stop datanode

As you can see from hdfs dfsadmin -report, the failed DataNode is not marked as dead right away. By default, it takes 10 minutes and 30 seconds for it to be detected:

bin/hdfs dfsadmin -report
Configured Capacity: 72417030144 (67.44 GB)
Present Capacity: 61966094336 (57.71 GB)
DFS Remaining: 61560250368 (57.33 GB)
DFS Used: 405843968 (387.04 MB)
DFS Used%: 0.65%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)
Live datanodes:
Name: 192.168.56.5:50010 (green.resetlogs.com)
Hostname: green
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 67649536 (64.52 MB)
Non DFS Used: 3213279232 (2.99 GB)
DFS Remaining: 20858081280 (19.43 GB)
DFS Used%: 0.28%
DFS Remaining%: 86.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 06:57:04 CEST 2014
Name: 192.168.56.3:50010 (yellow.resetlogs.com)
Hostname: yellow.resetlogs.com
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 202915840 (193.52 MB)
Non DFS Used: 3726766080 (3.47 GB)
DFS Remaining: 20209328128 (18.82 GB)
DFS Used%: 0.84%
DFS Remaining%: 83.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 06:56:23 CEST 2014
Name: 192.168.56.4:50010 (pink.resetlogs.com)
Hostname: pink
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 135278592 (129.01 MB)
Non DFS Used: 3510890496 (3.27 GB)
DFS Remaining: 20492840960 (19.09 GB)
DFS Used%: 0.56%
DFS Remaining%: 84.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 06:57:04 CEST 2014
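The 10 minutes and 30 seconds are not arbitrary: the NameNode declares a DataNode dead after 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval, i.e. 2 x 5 minutes + 10 x 3 seconds with the defaults. As a sketch, for a lab you could shrink that window by lowering the recheck interval in hdfs-site.xml:

  <property>
    <name>dfs.namenode.heartbeat.recheck-interval</name>
    <!-- default is 300000 ms, i.e. 5 minutes; lowering it speeds up dead-node detection -->
    <value>60000</value>
  </property>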

During that period of time, if you try to get files, the client might report some exceptions, as you can see below; however, it should still be able to get the file from the remaining block copies:

bin/hdfs dfs -get /user/hadoop/data /tmp/greg2
14/08/31 07:00:40 WARN hdfs.BlockReaderFactory: I/O error constructing remote block
reader.
java.net.ConnectException: Connection refused
[...]
14/08/31 07:00:40 WARN hdfs.DFSClient: Failed to connect to /192.168.56.3:50010 for
block, add to deadNodes and continue. java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
[...]
14/08/31 07:00:40 INFO hdfs.DFSClient: Successfully connected to /192.168.56.4:50010
 for BP-399541416-192.168.56.3-1409430021329:blk_1073741827_1003

After a while, the missing DataNode should be reported as dead:

bin/hdfs dfsadmin -report
Configured Capacity: 72417030144 (67.44 GB)
Present Capacity: 61966077952 (57.71 GB)
DFS Remaining: 61560233984 (57.33 GB)
DFS Used: 405843968 (387.04 MB)
DFS Used%: 0.65%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 2 (3 total, 1 dead)
Live datanodes:
Name: 192.168.56.5:50010 (green.resetlogs.com)
Hostname: green
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 67649536 (64.52 MB)
Non DFS Used: 3213287424 (2.99 GB)
DFS Remaining: 20858073088 (19.43 GB)
DFS Used%: 0.28%
DFS Remaining%: 86.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 07:09:01 CEST 2014
Name: 192.168.56.4:50010 (pink.resetlogs.com)
Hostname: pink
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 135278592 (129.01 MB)
Non DFS Used: 3510898688 (3.27 GB)
DFS Remaining: 20492832768 (19.09 GB)
DFS Used%: 0.56%
DFS Remaining%: 84.90%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 07:09:01 CEST 2014
Dead datanodes:
Name: 192.168.56.3:50010 (yellow.resetlogs.com)
Hostname: yellow.resetlogs.com
Decommission Status : Normal
Configured Capacity: 24139010048 (22.48 GB)
DFS Used: 202915840 (193.52 MB)
Non DFS Used: 3726766080 (3.47 GB)
DFS Remaining: 20209328128 (18.82 GB)
DFS Used%: 0.84%
DFS Remaining%: 83.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Sun Aug 31 06:56:23 CEST 2014

File blocks are re-replicated to the remaining DataNodes so that you don't get any exceptions when reading the file:

bin/hdfs fsck /user/hadoop/data -files -blocks -locations
Connecting to namenode via http://yellow:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.56.3 for path /user/hadoop/data at Sun Aug 31 07:11:32 CEST 2014
/user/hadoop/data 201326592 bytes, 2 block(s):  OK
0. BP-399541416-192.168.56.3-1409430021329:blk_1073741827_1003 len=134217728 repl=2 [192.168.56.4:50010, 192.168.56.5:50010]
1. BP-399541416-192.168.56.3-1409430021329:blk_1073741828_1004 len=67108864 repl=2 [192.168.56.5:50010, 192.168.56.4:50010]
Status: HEALTHY
 Total size:    201326592 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      2 (avg. block size 100663296 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          2
 Number of racks:               1
FSCK ended at Sun Aug 31 07:11:32 CEST 2014 in 1 milliseconds
The filesystem under path '/user/hadoop/data' is HEALTHY
bin/hdfs dfs -get /user/hadoop/data /tmp/copy3
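If you run the check right after the DataNode is declared dead, re-replication may still be in flight; a quick way to watch it converge is to grep the fsck summary:

# the count drops back to 0 once every block is back to 2 replicas
bin/hdfs fsck / | grep -i "Under-replicated"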

If you restart the failed DataNode, the NameNode will report it back alive right away, and each block will have 3 copies again:

bin/hdfs fsck /user/hadoop/data -files -blocks -locations
Connecting to namenode via http://yellow:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.56.3 for path /user/hadoop/data at Sun Aug 31 07:17:25 CEST 2014
/user/hadoop/data 201326592 bytes, 2 block(s):  OK
0. BP-399541416-192.168.56.3-1409430021329:blk_1073741827_1003 len=134217728 repl=2 [192.168.56.4:50010, 192.168.56.5:50010, 192.168.56.3:50010]
1. BP-399541416-192.168.56.3-1409430021329:blk_1073741828_1004 len=67108864 repl=2 [192.168.56.5:50010, 192.168.56.4:50010, 192.168.56.3:50010]
Status: HEALTHY
 Total size:    201326592 B
 Total dirs:    0
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      2 (avg. block size 100663296 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Sun Aug 31 07:17:25 CEST 2014 in 1 milliseconds
The filesystem under path '/user/hadoop/data' is HEALTHY
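Note that each block now has one replica more than the replication factor of 2. The NameNode schedules the deletion of the excess replicas, so re-running the same check a few minutes later should show only 2 locations per block:

# run again after a few minutes: the extra replicas get deleted
bin/hdfs fsck /user/hadoop/data -blocks -locations | grep repl=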

NameNode Failure and QJM

In the previous configuration, the NameNode remains a single point of failure, especially if disks are local to servers and not protected by any kind of RAID/LVM. Keep in mind that HDFS metadata is only accessible from the NameNode server. To secure that configuration, HDFS provides a way to replicate metadata to a standby NameNode. For now, and at least up to release 2.5.0, there can only be 2 NameNodes per configuration: one active and one standby. In case of a server loss, or to manage server maintenance, you can fail over between the active and standby NameNodes. That replication infrastructure is named the Quorum Journal Manager and it works as described below:

  1. The active NameNode sends edits to the JournalNodes so that metadata can be maintained by the other NameNode; with 2N+1 JournalNodes, a write succeeds once a majority acknowledge it, so the cluster tolerates N JournalNode failures
  2. The standby NameNode pulls edits from the JournalNodes and maintains a copy of the metadata
  3. DataNodes send their block reports and heartbeats not only to the active NameNode but to all NameNodes in the configuration, so that a standby node can be made active by a simple failover command

Quorum Journal Manager

NameNode Standby Configuration

In order to configure the Quorum Journal Manager for manual failover, proceed as follows:

  • Stop the NameNode and DataNodes
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop datanode
  • Create the Journal Directory on every node
for i in yellow pink green; do
   ssh $i mkdir /u01/journal
done
  • Create the metadata directory on green
ssh green mkdir /u01/metadata
  • Make sure fuser is installed on every one of your servers; the sshfence fencing method configured below relies on it
yum install -y psmisc
  • Add the following properties to etc/hadoop/hdfs-site.xml
  <property>
    <name>dfs.nameservices</name>
    <value>colorcluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.colorcluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.colorcluster.nn1</name>
    <value>yellow.resetlogs.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.colorcluster.nn2</name>
    <value>green.resetlogs.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.colorcluster.nn1</name>
    <value>yellow.resetlogs.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.colorcluster.nn2</name>
    <value>green.resetlogs.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://yellow.resetlogs.com:8485;pink.resetlogs.com:8485;green.resetlogs.com:8485/colorcluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/u01/journal</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.colorcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_dsa</value>
  </property>

Note:
For details about those parameters, refer to HDFS High Availability Using the Quorum Journal Manager.

  • Change core-site.xml fs.defaultFS property to match the cluster name
cat etc/hadoop/core-site.xml
<configuration>
   <property>
        <name>fs.defaultFS</name>
        <value>hdfs://colorcluster</value>
    </property>
</configuration>
  • Distribute those 2 files to all the nodes of the cluster
for i in pink green; do
  scp etc/hadoop/hdfs-site.xml $i:/usr/local/hadoop/etc/hadoop/.
  scp etc/hadoop/core-site.xml $i:/usr/local/hadoop/etc/hadoop/.
done
  • Start the JournalNode and DataNode processes
sbin/hadoop-daemons.sh start journalnode
sbin/hadoop-daemons.sh start datanode
  • Initialize the JournalNodes' shared edits directory from yellow
bin/hdfs namenode -initializeSharedEdits
  • Start the NameNode on yellow
sbin/hadoop-daemon.sh start namenode
  • Configure the NameNode on green to synchronize with the primary NameNode
# on green
bin/hdfs namenode -bootstrapStandby
  • Start the NameNode on green
sbin/hadoop-daemon.sh start namenode
  • Check the status of both NameNodes; right after startup, both report standby:
bin/hdfs haadmin -getServiceState nn1
standby
bin/hdfs haadmin -getServiceState nn2
standby
  • Activate one of the NameNodes
bin/hdfs haadmin -transitionToActive nn1
  • Test HDFS; the client should be able to access HDFS
bin/hdfs dfs -ls /user/hadoop
-rw-r--r--   2 hadoop supergroup  201326592 2014-08-31 13:29 /user/hadoop/data
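To see the QJM replication described earlier with your own eyes, you can check that every JournalNode keeps its own copy of the edit log. This is a sketch, assuming the usual <dfs.journalnode.edits.dir>/<nameservice>/current layout:

for i in yellow pink green; do
   ssh $i ls /u01/journal/colorcluster/current
done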

Manual Failover

If you lose the active NameNode on yellow, you won't be able to access HDFS anymore:

ps -ef |grep [n]amenode |awk '{print "kill -9",$2}'|sh

However, you can easily fail over to the standby node with a single command:

bin/hdfs haadmin -failover nn1 nn2
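You can verify the transition took effect the same way you checked the state earlier:

bin/hdfs haadmin -getServiceState nn2
active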

The client should be able to use HDFS again:

bin/hdfs dfs -ls /user/hadoop
-rw-r--r--   2 hadoop supergroup  201326592 2014-08-31 13:29 /user/hadoop/data

Startup and Shutdown Procedure

Once you've configured QJM, the procedure to start and stop the whole HDFS cluster differs. In order to stop the whole cluster, connect to yellow and run:

cd /usr/local/hadoop
ssh green `pwd`/sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemons.sh stop journalnode
sbin/hadoop-daemons.sh stop datanode

To restart HDFS, run:

cd /usr/local/hadoop
sbin/hadoop-daemons.sh start journalnode
sbin/hadoop-daemon.sh start namenode
ssh green `pwd`/sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemons.sh start datanode
bin/hdfs haadmin -failover nn2 nn1
bin/hdfs haadmin -getServiceState nn1
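As a side note, and assuming your slaves file and HA settings are in place, the bundled start/stop wrappers should handle most of this sequence in one shot (in 2.x they start both NameNodes, the DataNodes and, with a qjournal:// shared edits directory, the JournalNodes); you would still need to make one NameNode active afterwards:

sbin/stop-dfs.sh
sbin/start-dfs.sh
bin/hdfs haadmin -transitionToActive nn1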

Conclusion

This 4th part of the series shows how HDFS works. It also shows how DataNode replication behaves, and how to set up and test the Quorum Journal Manager to handle manual NameNode failover. There are many other aspects of HDFS you may want to dig into, like the use of ZooKeeper to automate NameNode failover or the configuration of rack awareness for replication. Next week you will start developing MapReduce jobs on this colorcluster. Stay tuned!

Reference
To know more about HDFS, refer to HDFS Users Guide.
