After completing the course, the student is expected to:
- know the typical characteristics and common applications of big data
- know the basics of distributed file systems, databases and computing
- have gained practical data processing skills with the MapReduce framework / Apache Hadoop.
Contents
Characteristics and applications of big data. Structured and unstructured data. Distributed file systems. Distributed and relational/non-relational databases. Distributed computing. MapReduce framework. Apache Hadoop.