The Hadoop file system also drew its ideas from the Google File System, and it resembles GFS in its system architecture and features. It was developed to satisfy the following requirements.

Figure 1. GFS Architecture [STUDY:1]
POSIX
POSIX (Portable Operating System Interface) [STUDY:2] is an application interface standard established by the IEEE. It consolidates the APIs shared by the different UNIX operating systems so that highly portable UNIX applications can be developed.
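As a small illustration (a sketch, not from the original text): Java's java.nio.file API, available since Java 7, exposes the POSIX permission model directly, which is one concrete place where this standardized interface shows up in application code. The HDFS listing later in this section ("-rw-r--r--") uses the same POSIX-style permission notation. The class name here is ours.

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PosixDemo {
    public static void main(String[] args) throws Exception {
        // Create a temp file and set the POSIX-defined rwx permission bits on it.
        Path p = Files.createTempFile("posix-demo", ".txt");
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rw-r--r--");
        Files.setPosixFilePermissions(p, perms);
        // Prints "rw-r--r--" on any POSIX-compliant OS (Mac OS X, Linux, ...)
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(p)));
    }
}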

Figure 2. HDFS (Hadoop Distributed File System) Architecture
Download Hadoop 0.20.203.0 from the mirror below (for a single-node setup on Mac OS X, see also [STUDY:3]):
http://apache.tt.co.kr/hadoop/common/hadoop-0.20.203.0/

In conf/hadoop-env.sh, set JAVA_HOME and, optionally, the heap size:
# The java implementation to use. Required.
export JAVA_HOME=/Library/Java/Home
# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=2000
conf/core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/jin-uyu/hadoop-0.20.203.0/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
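fs.default.name is what the Hadoop client library resolves whenever code asks for "the" file system. A minimal sketch of how these site files surface in Java, assuming the conf directory is on the classpath (the class name is ours):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ShowDefaultFs {
    public static void main(String[] args) throws Exception {
        // new Configuration() loads core-default.xml and core-site.xml from the classpath
        Configuration conf = new Configuration();
        System.out.println(conf.get("fs.default.name")); // hdfs://localhost:9000
        // The hdfs:// scheme selects the DistributedFileSystem implementation
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri());
    }
}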
conf/mapred-site.xml (note: in 0.20 the old mapred.tasktracker.tasks.maximum property has been split into separate map and reduce limits):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
    <description>
    The maximum number of map tasks that will be run simultaneously by
    a task tracker.
    </description>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>8</value>
    <description>
    The maximum number of reduce tasks that will be run simultaneously by
    a task tracker.
    </description>
  </property>
  <property>
    <name>tasktracker.http.threads</name>
    <value>400</value>
    <description>The number of worker threads for the TaskTracker's HTTP
    (shuffle) server, raised from the default of 40; see the shuffle error
    below.
    </description>
  </property>
</configuration>
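mapred.job.tracker is read the same way on the MapReduce side. A short sketch (class name ours) of how a client sees the "local" vs. host:port distinction the description above mentions:

import org.apache.hadoop.mapred.JobConf;

public class ShowJobTracker {
    public static void main(String[] args) {
        // JobConf additionally loads mapred-default.xml and mapred-site.xml
        JobConf conf = new JobConf();
        String tracker = conf.get("mapred.job.tracker", "local");
        if ("local".equals(tracker)) {
            // LocalJobRunner: map and reduce run in-process, no daemons needed
            System.out.println("jobs run in-process");
        } else {
            System.out.println("jobs go to the JobTracker at " + tracker); // localhost:9001 here
        }
    }
}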
conf/hdfs-site.xml (tasktracker.http.threads is a MapReduce-side property, so it belongs in mapred-site.xml above, not here):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>
    Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified at create time.
    </description>
  </property>
</configuration>
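As the description says, replication can also be set per file at create time. A sketch using the public FileSystem API (the file name and values are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Override dfs.replication for this one file at create time
        FSDataOutputStream out = fs.create(new Path("demo.txt"),
                true,               // overwrite if it exists
                4096,               // io buffer size
                (short) 3,          // replication factor for this file only
                64L * 1024 * 1024); // block size (64 MB is the 0.20 default)
        out.writeBytes("hello\n");
        out.close();
        // ...or change it after the fact
        fs.setReplication(new Path("demo.txt"), (short) 1);
    }
}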
jin-u-yuui-MacBook-Pro:bin jin-uyu$ ./hadoop namenode -format
12/03/26 22:46:05 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = jin-u-yuui-MacBook-Pro.local/10.64.139.69
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.203.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
12/03/26 22:46:06 INFO util.GSet: VM type = 64-bit
12/03/26 22:46:06 INFO util.GSet: 2% max memory = 39.83375 MB
12/03/26 22:46:06 INFO util.GSet: capacity = 2^22 = 4194304 entries
12/03/26 22:46:06 INFO util.GSet: recommended=4194304, actual=4194304
12/03/26 22:46:06 INFO namenode.FSNamesystem: fsOwner=jin-uyu
12/03/26 22:46:06 INFO namenode.FSNamesystem: supergroup=supergroup
12/03/26 22:46:06 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/03/26 22:46:06 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/03/26 22:46:06 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/03/26 22:46:06 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/03/26 22:46:06 INFO common.Storage: Image file of size 113 saved in 0 seconds.
12/03/26 22:46:06 INFO common.Storage: Storage directory /Users/jin-uyu/hadoop-0.20.203.0/hadoop-jin-uyu/dfs/name has been successfully formatted.
12/03/26 22:46:06 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at jin-u-yuui-MacBook-Pro.local/10.64.139.69
************************************************************/
jin-u-yuui-MacBook-Pro:~ jin-uyu$ hadoop-*/bin/start-all.sh
namenode running as process 35948. Stop it first.
localhost: starting datanode, logging to /Users/jin-uyu/hadoop-0.20.203.0/bin/../logs/hadoop-jin-uyu-datanode-jin-u-yuui-MacBook-Pro.local.out
localhost: starting secondarynamenode, logging to /Users/jin-uyu/hadoop-0.20.203.0/bin/../logs/hadoop-jin-uyu-secondarynamenode-jin-u-yuui-MacBook-Pro.local.out
jobtracker running as process 36035. Stop it first.
localhost: starting tasktracker, logging to /Users/jin-uyu/hadoop-0.20.203.0/bin/../logs/hadoop-jin-uyu-tasktracker-jin-u-yuui-MacBook-Pro.local.out
jin-u-yuui-MacBook-Pro:~ jin-uyu$ hadoop-*/bin/stop-all.sh
jin-u-yuui-MacBook-Pro:~ jin-uyu$ hadoop-*/bin/hadoop dfs -put test.txt input/test.txt
jin-u-yuui-MacBook-Pro:~ jin-uyu$ hadoop-*/bin/hadoop dfs -ls input
Found 1 items
-rw-r--r-- 1 jin-uyu supergroup 75 2012-03-27 23:21 /user/jin-uyu/input/test.txt
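The same -put and -ls steps can also be driven through the FileSystem API, which is useful when loading HDFS from another program; a minimal sketch (class name ours):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutAndList {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration()); // hdfs://localhost:9000
        // Equivalent of: hadoop dfs -put test.txt input/test.txt
        fs.copyFromLocalFile(new Path("test.txt"), new Path("input/test.txt"));
        // Equivalent of: hadoop dfs -ls input
        for (FileStatus stat : fs.listStatus(new Path("input"))) {
            System.out.println(stat.getPath() + "\t" + stat.getLen());
        }
    }
}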
jin-u-yuui-MacBook-Pro:hadoop-0.20.203.0 jin-uyu$ ./bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount input output/print.txt
12/03/27 23:27:57 INFO input.FileInputFormat: Total input paths to process : 1
12/03/27 23:27:58 INFO mapred.JobClient: Running job: job_201203262247_0001
12/03/27 23:27:59 INFO mapred.JobClient: map 0% reduce 0%
12/03/27 23:28:13 INFO mapred.JobClient: map 100% reduce 0%
12/03/27 23:34:31 INFO mapred.JobClient: Task Id : attempt_201203262247_0001_r_000000_0, Status : FAILED
Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
12/03/27 23:35:31 WARN mapred.JobClient: Error reading task outputconnect timed out
12/03/27 23:36:31 WARN mapred.JobClient: Error reading task outputconnect timed out
The reduce task kept failing with the shuffle error above. Raising the TaskTracker's HTTP thread pool, i.e. adding the following property to conf/mapred-site.xml (already reflected in the configuration shown earlier) and restarting the daemons, resolved it:
<property>
  <name>tasktracker.http.threads</name>
  <value>400</value>
</property>
jin-u-yuui-MacBook-Pro:hadoop-0.20.203.0 jin-uyu$ ./bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount input output
12/03/27 23:45:08 INFO input.FileInputFormat: Total input paths to process : 1
12/03/27 23:45:09 INFO mapred.JobClient: Running job: job_201203272343_0002
12/03/27 23:45:10 INFO mapred.JobClient: map 0% reduce 0%
12/03/27 23:45:23 INFO mapred.JobClient: map 100% reduce 0%
12/03/27 23:45:35 INFO mapred.JobClient: map 100% reduce 100%
12/03/27 23:45:40 INFO mapred.JobClient: Job complete: job_201203272343_0002
12/03/27 23:45:40 INFO mapred.JobClient: Counters: 25
12/03/27 23:45:40 INFO mapred.JobClient: Job Counters
12/03/27 23:45:40 INFO mapred.JobClient: Launched reduce tasks=1
12/03/27 23:45:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13273
12/03/27 23:45:40 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/03/27 23:45:40 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/03/27 23:45:40 INFO mapred.JobClient: Launched map tasks=1
12/03/27 23:45:40 INFO mapred.JobClient: Data-local map tasks=1
12/03/27 23:45:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10427
12/03/27 23:45:40 INFO mapred.JobClient: File Output Format Counters
12/03/27 23:45:40 INFO mapred.JobClient: Bytes Written=89
12/03/27 23:45:40 INFO mapred.JobClient: FileSystemCounters
12/03/27 23:45:40 INFO mapred.JobClient: FILE_BYTES_READ=151
12/03/27 23:45:40 INFO mapred.JobClient: HDFS_BYTES_READ=189
12/03/27 23:45:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=43213
12/03/27 23:45:40 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=89
12/03/27 23:45:40 INFO mapred.JobClient: File Input Format Counters
12/03/27 23:45:40 INFO mapred.JobClient: Bytes Read=75
12/03/27 23:45:40 INFO mapred.JobClient: Map-Reduce Framework
12/03/27 23:45:40 INFO mapred.JobClient: Reduce input groups=14
12/03/27 23:45:40 INFO mapred.JobClient: Map output materialized bytes=151
12/03/27 23:45:40 INFO mapred.JobClient: Combine output records=14
12/03/27 23:45:40 INFO mapred.JobClient: Map input records=5
12/03/27 23:45:40 INFO mapred.JobClient: Reduce shuffle bytes=151
12/03/27 23:45:40 INFO mapred.JobClient: Reduce output records=14
12/03/27 23:45:40 INFO mapred.JobClient: Spilled Records=28
12/03/27 23:45:40 INFO mapred.JobClient: Map output bytes=147
12/03/27 23:45:40 INFO mapred.JobClient: Combine input records=18
12/03/27 23:45:40 INFO mapred.JobClient: Map output records=18
12/03/27 23:45:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=114
12/03/27 23:45:40 INFO mapred.JobClient: Reduce input records=14
jin-u-yuui-MacBook-Pro:hadoop-0.20.203.0 jin-uyu$ ./bin/hadoop dfs -cat output/*
Geenoo 1
a 1
am 2
are 2
boy 1
friends 1
geenoo 1
girl 1
good 2
guy 1
i 2
is 1
we 1
you 1
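Counting is case-sensitive, which is why "Geenoo" and "geenoo" appear as separate keys. For reference, the core of the wordcount example is roughly the sketch below, written against the newer org.apache.hadoop.mapreduce API ([STUDY:5] walks through an older-API variant). The combiner is also visible in the counters above: 18 map output records collapse to 14 combined records.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one); // emit <word, 1> for every token
            }
        }
    }
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            result.set(sum);
            context.write(key, result); // emit <word, total count>
        }
    }
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // pre-aggregates on the map side
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}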
NameNode web UI: http://localhost:50070
Map/Reduce (JobTracker) web UI: http://localhost:50030
jin-u-yuui-MacBook-Pro:hadoop-0.20.203.0 jin-uyu$ ./bin/hadoop dfs
Usage: java FsShell
[-ls <path>]
[-lsr <path>]
[-du <path>]
[-dus <path>]
[-count[-q] <path>]
[-mv <src> <dst>]
[-cp <src> <dst>]
[-rm [-skipTrash] <path>]
[-rmr [-skipTrash] <path>]
[-expunge]
[-put <localsrc> ... <dst>]
[-copyFromLocal <localsrc> ... <dst>]
[-moveFromLocal <localsrc> ... <dst>]
[-get [-ignoreCrc] [-crc] <src> <localdst>]
[-getmerge <src> <localdst> [addnl]]
[-cat <src>]
[-text <src>]
[-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
[-moveToLocal [-crc] <src> <localdst>]
[-mkdir <path>]
[-setrep [-R] [-w] <rep> <path/file>]
[-touchz <path>]
[-test -[ezd] <path>]
[-stat [format] <path>]
[-tail [-f] <file>]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-chgrp [-R] GROUP PATH...]
[-help [cmd]]
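FsShell, the class named in the usage banner, is an ordinary Hadoop Tool, so the same commands can be invoked from Java as well; a small sketch (class name ours):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class ShellFromJava {
    public static void main(String[] args) throws Exception {
        // Runs the same code path as "hadoop dfs -ls input"
        int rc = ToolRunner.run(new Configuration(), new FsShell(), new String[] {"-ls", "input"});
        System.exit(rc);
    }
}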
[STUDY:1] The Google File System, http://www.cs.brown.edu/courses/cs295-11/2006/gfs.pdf
[STUDY:2] POSIX, http://ko.wikipedia.org/wiki/POSIX
[STUDY:3] Running Hadoop On OS X 10.5 64-bit (Single-Node Cluster), http://wiki.apache.org/hadoop/Running_Hadoop_On_OS_X_10.5_64-bit_(Single-Node_Cluster)
[STUDY:4] Hadoop Shell Commands, http://hadoop.apache.org/common/docs/r0.17.1/hdfs_shell.html
[STUDY:5] MapReduce Tutorial, http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Example%3A+WordCount+v1.0