What is Spark? Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. It is a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. Why…
4 Things You Need To Know About YARN
What is YARN? YARN stands for Yet Another Resource Negotiator. It is a general-purpose platform for managing resources in a cluster. YARN was introduced with Hadoop 2.0, an open-source distributed processing framework from Apache. Why YARN? The main challenges for Hadoop 1.x are…
6 Things You Need To Know About Hadoop
What is Hadoop? Hadoop is an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. Hadoop is an Apache top-level project, licensed under the Apache License 2.0. Why Hadoop? Since there are…
Must-Watch Videos for Tech Lovers
Are you trying to learn something new but do not know how to get started? Check YouTube videos! I recently started watching YouTube videos, and I really wish I had started earlier. There are so…