What is Hadoop?
Following my high-level write-up of Hadoop and Big Data, this article will
present each of the components or projects that make up Hadoop with a
technical description of each.
First, what is Hadoop?
Hadoop stores and processes large volumes of a wide variety of data that
changes rapidly. It analyses and summarizes the data. Examples include a city
census, web page analytics, threat analysis, risk models and network failures.
Hadoop is redundant and reliable, powerful and focused on batch processing.
Hadoop divides a large data processing job into many smaller tasks that can
be distributed across all the nodes.
Hadoop comprises two main components:
MapReduce: The processing component that analyses the data and summarizes the
results.
HDFS: The distributed file system, running on commodity server hardware, that
holds the data.
On each server there is a task tracker and a data nod...
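To make the idea of splitting one large job into smaller tasks concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and does not use any Hadoop API: the function names (`split_into_tasks`, `map_task`, `shuffle`, `reduce_task`, `run_job`) and the word-count job itself are assumptions chosen for illustration. In a real cluster the chunks would be blocks stored in HDFS and the tasks would run on different nodes.

```python
from collections import defaultdict
from itertools import chain


def split_into_tasks(records, num_tasks):
    """Divide the job's input into num_tasks chunks, one per task."""
    return [records[i::num_tasks] for i in range(num_tasks)]


def map_task(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_task(key, values):
    """Reduce step: summarize all values seen for one key."""
    return key, sum(values)


def run_job(records, num_tasks=3):
    """Run the whole toy job: split, map, shuffle, reduce."""
    chunks = split_into_tasks(records, num_tasks)
    mapped = chain.from_iterable(map_task(chunk) for chunk in chunks)
    return dict(reduce_task(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    lines = [
        "big data big jobs",
        "data nodes store data",
        "jobs split into tasks",
    ]
    print(run_job(lines))
```

Each call to `map_task` only sees its own chunk, which is what would let the tasks run on separate nodes; only the shuffle step needs to bring results for the same key together.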
In prior blog posts, I described Infrastructure as a Service (IaaS) and
Platform as a Service (PaaS).
If I use IaaS, I get servers onto which I can load software and applications
which I then maintain, though I don't need to maintain the hardware. I can
customize the applications and software running on the servers at will. If I
use PaaS, I get a platform of ready-to-use web servers, application servers,
databases, etc. I write my own software application and host it at the PaaS
provider. I maintain the software I write, but not the application servers,
databases or ha...
OpenStack is an Infrastructure as a Service offering (see my prior post for
an explanation of IaaS).
OpenStack is an open source project, founded by Rackspace, NASA and others.
OpenStack can be deployed as a public or private cloud.
The OpenStack projects are: CINDER, GLANCE, KEYSTONE, NOVA, QUANTUM, SWIFT.
OpenStack Compute: (NOVA)
Project NOVA, or OpenStack Compute, provisions and manages on-demand virtual
machines and associated resources: CPU, Memory, Disk and Network.
Virtual machines can be started, stopped, suspended, created and deleted,
while network options for a ...
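The virtual machine operations listed above (create, start, stop, suspend, delete) can be pictured as a small state machine. The sketch below is a toy model in Python and is not the OpenStack Nova API: the class `ToyVM`, the state labels and the transition table are all hypothetical, and using "start" to bring a suspended machine back is a simplification made for this example.

```python
from dataclasses import dataclass

# Hypothetical transition table: (current state, action) -> next state.
# The state names are illustrative labels, not Nova's own status values.
TRANSITIONS = {
    ("RUNNING", "stop"): "STOPPED",
    ("RUNNING", "suspend"): "SUSPENDED",
    ("STOPPED", "start"): "RUNNING",
    ("SUSPENDED", "start"): "RUNNING",
    ("RUNNING", "delete"): "DELETED",
    ("STOPPED", "delete"): "DELETED",
    ("SUSPENDED", "delete"): "DELETED",
}


@dataclass
class ToyVM:
    """A toy virtual machine with the resources named in the text."""
    name: str
    cpus: int
    memory_mb: int
    disk_gb: int
    state: str = "RUNNING"  # creating the object stands in for "create"

    def apply(self, action):
        """Apply an action, rejecting any transition not in the table."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} a VM in state {self.state}")
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    vm = ToyVM("web-1", cpus=2, memory_mb=4096, disk_gb=40)
    for action in ("suspend", "start", "stop", "delete"):
        print(action, "->", vm.apply(action))
```

Keeping the allowed transitions in one table makes it easy to see which operations are valid in each state, for example that a deleted machine cannot be started again.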
The Security for Cloud Computing: 10 Steps to Ensure Success white paper
provides a practical reference to help enterprise information technology (IT)
and business decision makers as they analyze and consider the security
implications of cloud computing on their business. The paper includes a list
of steps, along with guidance and strategies, designed to help these decision
makers evaluate and compare security offerings in key areas from different
cloud providers. The paper discusses the threats, technology risks, and
safeguards for cloud computing environments, and provides the ...
VMware vCloud Director is used by organizations wishing to build a private
cloud (Infrastructure as a Service - IaaS). The purpose of this guide is to
help you set up vCloud. Thus, the guide configures vCloud for local storage
and direct-connect networking. Storage and network provisioning is not
configured because these are complex configurations requiring a SAN/NAS,
vShield Edge Gateway, VXLAN, etc., most of which are not available in a simple
To understand what IaaS cloud computing is, I offer some blogs I have written
and a short video at these links:
Introduction to...