TU Wien, Faculty of Informatics, DBAI (Database and Artificial Intelligence Group)

Basic Spark Tutorial

DBAI Research Seminar, October 12th 2017

Theresa Csar


This is a brief introduction to Spark and some related technologies. Spark is an engine for large-scale data processing and is often regarded as the successor to Hadoop MapReduce. Spark's key concepts are resilient distributed datasets (RDDs) and their lazy evaluation. The topics of this tutorial are: basic concepts of Spark, RDDs, DataFrames, Spark SQL and GraphX.
We will use the Databricks community platform for the tutorial, so you will have to create an account at https://community.cloud.databricks.com/ to follow the practical part.
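
To make the idea of lazy evaluation more concrete before the practical part, here is a minimal sketch in plain Python. It is not Spark code and does not use the Spark API: the class `LazyDataset` and the helper `square` are hypothetical names invented for this illustration. The sketch only mimics the general pattern that Spark follows, where transformations such as `map` and `filter` merely record what should be done, and the work happens only when an action such as `collect` or `count` is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not the Spark API).

    Transformations are only recorded; they run when an action is called.
    """

    def __init__(self, source, steps=()):
        self._source = source  # underlying re-iterable data, e.g. a range or list
        self._steps = steps    # tuple of pending (kind, function) transformations

    def map(self, fn):
        # Transformation: returns a new dataset with one more pending step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded, nothing is computed here.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def _run(self):
        # Build the processing pipeline from the recorded steps.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    def collect(self):
        # Action: triggers the computation and returns all results as a list.
        return list(self._run())

    def count(self):
        # Action: triggers the computation and returns the number of results.
        return sum(1 for _ in self._run())


calls = []  # records every time square() actually runs


def square(x):
    calls.append(x)
    return x * x


numbers = LazyDataset(range(1, 6))
evens_squared = numbers.map(square).filter(lambda y: y % 2 == 0)

calls_before_action = len(calls)  # no work has been done yet
result = evens_squared.collect()  # the action runs the recorded steps
print(calls_before_action, result, len(calls))
```

In this toy version, `square` is not called at all until `collect()` runs. Calling a second action on the same dataset recomputes the steps from the source, which loosely resembles how Spark recomputes an uncached RDD from its lineage.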

You can download the slides here: Slides

Create a user account for the Databricks Community Edition: Databricks
Go to your "Workspace" in Databricks and import the following notebooks (right-click) using the URLs:
