UVA HPC CURSUS January 2018 - STEP UP TO SUPERCOMPUTING

9. Data-intensive Computing with Spark & Hadoop

Content: Data sets of increasing volume and complexity are often very difficult to process with 'standard' HPC or DBMS technology. Large-scale data processing is particularly popular in linguistics, data mining, machine learning, bioinformatics and the social sciences, but it is certainly not limited to those disciplines. Open-source frameworks such as Apache Spark and Hadoop were developed with this challenge in mind and can be of great benefit for data-intensive computing.

This workshop offers:
  • Background: learn about the underlying concepts of Apache Spark & Hadoop
  • Hands-on session: get experience with Spark in a Python notebook environment
  • Optional: discuss your own data problem
 
  • Duration: 8 hours
  • Date and Time: see Schedule.
  • Location: Science Park 904, Room: see Schedule.
  • Target group: Researchers who need to analyze large amounts of data.
  • Course Leader: Jeroen Schot (SURFsara).

 
