Welcome to the EASYTEAM blog (formerly ArKZoYd)

  • 21 August 2014
  • Gregory Guillou

Hadoop for DBAs (2/13): Building Hadoop for Oracle Linux 7

You may prefer to install Hadoop from a distribution such as Cloudera’s or Hortonworks’. Even better, deploy Hadoop in the cloud or on an appliance such as NetApp’s or Oracle’s. Those solutions help you build and manage a cluster, and they come ready for your operating system. That is not so obvious with the Apache Hadoop software library itself.
No pain, no gain! This series of articles is written so that you can taste the benefits of Hadoop as well as some of its challenges. It relies on what is roughly the latest release from Apache, i.e. 2.4.1, which lets you test the latest features if needed. And because Apache Hadoop ships with 32-bit compiled native libraries, we’ll need to rebuild it from source. I’m keen on it, so I’ll be using Oracle Linux 7. It should not be too difficult to adapt the steps to RHEL 7, CentOS 7 or Fedora.

Package Installation

To build Hadoop from source, several packages, libraries and tools are required. The three commands below install more than is strictly necessary for that task:

# Tools and libraries useful for Oracle
yum install procps module-init-tools ethtool \
initscripts bc bind-utils nfs-utils \
util-linux-ng pam xorg-x11-utils \
xorg-x11-xauth smartmontools binutils \
compat-libstdc++-33 gcc gcc-c++ glibc \
glibc-devel ksh libaio libaio-devel \
libgcc libstdc++ libstdc++-devel make \
sysstat openssh-clients compat-libcap1
# Additional tools and libraries for Hadoop
yum install lzo zlib-devel autoconf automake \
libtool openssl-devel cmake
# Additional tools for me (and probably you)
yum install curl zip unzip gzip bzip2 rsync git mlocate \
strace gdb perf openssh-server elinks
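
Before moving on, it can be worth confirming that the build tools actually ended up on the PATH. The helper below is a minimal sketch: `missing_tools` is a hypothetical name, and the list of tools to pass to it is only a suggestion based on the packages installed above.

```shell
# missing_tools NAME...: print every command name that cannot be found on PATH.
# (missing_tools is a hypothetical helper written for this article.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call (the tool list is an assumption, adjust it to your needs):
#   missing_tools gcc g++ make cmake autoconf automake libtool
```

An empty output means every tool in the list was found.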

Google Protocol Buffers Installation

An important prerequisite for compiling Apache Hadoop is Google Protocol Buffers 2.5. You can install it from the EPEL 7 repository (in beta for now), or build it from source:

curl -O https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.bz2
tar -jxvf protobuf-2.5.0.tar.bz2
cd protobuf-*
./configure
make
sudo make install

Note:
When built from source, protobuf installs under /usr/local by default, which puts the protoc compiler in /usr/local/bin. Make sure that directory is included in the PATH variable, or pass a different --prefix to ./configure, before you build Hadoop.
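
A quick way to check that the right compiler is picked up is to look at what `protoc --version` reports; for this release it prints `libprotoc 2.5.0`. The function below is a small sketch of that check, and `is_protobuf_25` is a hypothetical name.

```shell
# is_protobuf_25 TEXT: succeed only if TEXT, the output of `protoc --version`,
# reports a 2.5.x release. (Hypothetical helper written for this article.)
is_protobuf_25() {
    case "$1" in
        "libprotoc 2.5."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call:
#   is_protobuf_25 "$(protoc --version 2>/dev/null)" || echo "protoc 2.5 not found on PATH"
```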

Java SE 8 JDK Installation

Most of Hadoop is written in Java, so you’ll need a Java SE JDK too. You can use the Java SE 8 RPM from the Oracle website or rely on OpenJDK [2]:

yum install jdk-8u11-linux-x64.rpm

Add the two lines below to ~/.bashrc or another profile file so that Java is available during the build:

export JAVA_HOME=/usr/java/jdk1.8.0_11
export PATH=$JAVA_HOME/bin:$PATH
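
The build needs a full JDK, not just a runtime, so JAVA_HOME should point to a directory that contains the `javac` compiler. The sketch below checks exactly that; `has_jdk` is a hypothetical helper name.

```shell
# has_jdk DIR: succeed only if DIR is non-empty and contains an executable bin/javac.
# (Hypothetical helper written for this article.)
has_jdk() {
    [ -n "$1" ] && [ -x "$1/bin/javac" ]
}

# Example call:
#   has_jdk "$JAVA_HOME" || echo "JAVA_HOME does not point to a JDK"
```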

Maven Installation

Maven is used to build Hadoop. Download the Maven distribution from one of the Apache mirror sites and install it:

cd /usr/local/
sudo tar -zxvf /home/hadoop/apache-maven-3.2.2-bin.tar.gz \
   --transform s/apache-maven-3.2.2/apache-maven/

Add the lines below to ~/.bashrc or another profile file so that Maven is available during the build:

export M2_HOME=/usr/local/apache-maven
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
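
If you script this setup, appending the export lines blindly would duplicate them on every run. The sketch below adds a line to a file only when an identical line is not already there; `append_once` is a hypothetical helper, not something provided by Maven or the shell.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line is already present.
# (Hypothetical helper written for this article.)
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example calls (single quotes keep $M2_HOME unexpanded until the profile is read):
#   append_once ~/.bashrc 'export M2_HOME=/usr/local/apache-maven'
#   append_once ~/.bashrc 'export M2=$M2_HOME/bin'
#   append_once ~/.bashrc 'export PATH=$M2:$PATH'
```

The same helper works for the Java variables from the previous section.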

Download Hadoop Source

Like most Apache projects, Hadoop keeps its source code in Subversion. The good news is that Apache also provides a Git repository. Clone it and check out the 2.4.1 release tag:

git clone git://git.apache.org/hadoop-common.git
cd hadoop-common
git tag -l
git checkout tags/release-2.4.1

Build Hadoop

You are done installing the prerequisites and should be ready to run the build. The command below generates the distribution file, including the 64-bit native C libraries. The result is a compressed tar archive in the hadoop-dist/target directory:

mvn package -Dmaven.javadoc.skip=true -Pdist,native -DskipTests -Dtar

Note:
The Hadoop Javadoc comments are not well formed; they include some unescaped punctuation characters. That is why Javadoc generation has to be skipped during the build.
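
Since the whole point of rebuilding is to get 64-bit native libraries, it is worth checking the result. An ELF file stores its word size in the fifth byte of its header (1 for 32-bit, 2 for 64-bit), so the check needs nothing more than `od`. This is a sketch only: `elf_class` is a hypothetical helper, and the library path in the example call is an assumption about where the build places the native libraries.

```shell
# elf_class FILE: print 32 or 64 according to the ELF header of FILE,
# or print "unknown" and fail if FILE is not an ELF file.
# (Hypothetical helper written for this article.)
elf_class() {
    # First five bytes as hex: the magic number 7f 45 4c 46, then the class byte.
    header=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$header" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *) echo unknown; return 1 ;;
    esac
}

# Example call (the path is an assumption, adjust it to your build output):
#   elf_class hadoop-dist/target/hadoop-2.4.1/lib/native/libhadoop.so
```

If the `file` utility is installed, running it against the same library gives an equivalent, more verbose answer.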

Here we are, ready to install a Hadoop cluster on Oracle Linux 7…

References:
To learn more about building Hadoop, read:
[1] How to Contribute to Hadoop Common
[2] Hadoop Wiki Java Versions Page


