What is Hadoop?

Following my high-level write-up of Hadoop and Big Data, this article presents each of the components or projects that make up Hadoop, with a technical description of each. First, what is Hadoop? Hadoop stores and processes large volumes of a wide variety of rapidly changing data. It analyses and summarizes the data. Example uses include a city census, web page analytics, threat analysis, risk models and network failures. Hadoop is redundant, reliable, powerful and focused on batch processing. Hadoop divides a large data processing job into many smaller tasks that can be distributed across all the nodes. Hadoop comprises two main components: MapReduce, the task that analyses the data and summarizes the results; and HDFS, the distributed file system, on commodity server hardware, that contains the data. On each server there is a task tracker and a data nod... (more)
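To make the idea of dividing a job into smaller tasks concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps, using word counting as the job. It is plain Python, not Hadoop's API: the function names (`split_into_tasks`, `map_task`, `shuffle`, `reduce_task`, `word_count`) and the sample input are hypothetical, chosen only for illustration, and nothing here is distributed across nodes.

```python
from collections import defaultdict
from itertools import chain


def split_into_tasks(records, num_tasks):
    """Divide the input into num_tasks chunks (stand-ins for per-node tasks)."""
    return [records[i::num_tasks] for i in range(num_tasks)]


def map_task(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key so each key is reduced in one place."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_task(key, values):
    """Reduce step: summarize all values seen for one key."""
    return key, sum(values)


def word_count(records, num_tasks=3):
    """Run the whole job: split, map each chunk, shuffle, then reduce."""
    chunks = split_into_tasks(records, num_tasks)
    mapped = chain.from_iterable(map_task(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_task(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its data from HDFS.
    print(word_count(["big data big", "data hadoop"], num_tasks=2))
```

In a real cluster each chunk would be processed on the node that holds the data, and the framework would handle scheduling and failures; this sketch only shows the shape of the computation.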

A Comparison Between OpenStack and VMware vCloud IaaS Offerings

I previously wrote a review of the Microsoft Azure public cloud, which included a comparison between Azure and AWS (Amazon Web Services). I will now compare OpenStack and VMware vCloud. For a review of IaaS (Infrastructure as a Service), see my blog post and video. This table provides a simple, high-level comparison of OpenStack and vCloud.

Feature | OpenStack | VMware vCloud
Virtualization layer | Type 2 virtualization - Libvirt layered on top of Linux. Supports various hypervisors: Xen, KVM, Hyper-V... | Type 1 virtualization - bare metal; vSphere hypervisor only.
Management | Open API... (more)

Introduction to Cloud Computing for Newbies

Cloud computing is a general term for computing services delivered over the Internet, as opposed to computing services hosted inside your own network, on your own premises. These computing services can be as simple as Internet-based email or as complex as a Customer Relationship Management (CRM) application. Cloud computing offers cost savings because users don't have to spend capital budget on hardware and software, nor bear the operating costs of electric power, space and cooling for the hardware, or the employee costs of maintaining the hardware and software. The maj... (more)

Cloud Computing Easily Understood - SaaS

In prior blog posts, I described Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). To recap: If I use IaaS, I get servers onto which I can load software and applications, which I then maintain, though I don't need to maintain the hardware. I can customize the applications and software running on the servers at will. If I use PaaS, I get a platform of ready-to-use web servers, application servers, databases, etc. I write my own software application and host it at the PaaS provider. I maintain the software I write, but not the application servers, databases or ha... (more)

Architecting for the Cloud Using Amazon Web Services

Traditional IT environments built on physical servers can scale and grow only by buying new hardware and software, then taking time to rack and install the hardware and configure the software and the application. When the excess capacity is not needed, the servers stand idle, consuming power, cooling and rack space. This is inefficient and a waste of money. Amazon Web Services (AWS) allows customers to scale elastically with demand. Just as a rubber band stretches to accommodate more items, AWS provides elastic computing to allow a customer to scale up (or down); to... (more)