Introduction to Data Science Workshops - Fall 2019/2020


With demand for data science tools and workflows increasing on the KAUST campus, the KAUST Visualization Core Laboratory (KVL) has organized a series of “Introduction to Data Science Workshops” to equip the research community with core data science tools and to enable future data science applications at KAUST.

Through this series of workshops, KVL aims to complement its existing training series on data visualization and to advance the state of the art in both data science and visualization by providing advanced facilities, training, services, and consulting to the KAUST community and the Kingdom.

Identifying Core Competencies for Data Science

According to a recent O’Reilly Data Science Survey, most data scientists use multiple programming languages on a daily basis to solve their data science problems. The top four programming languages used by data scientists are SQL, Python, R, and Bash. The ability to share and reproduce data science workflows is critical, whether the workflows provide decision support in industrial applications or generate novel insights from scientific data. Core tools for facilitating reproducible data science workflows include version control tools such as Git, virtual environment tools such as Conda, and container technologies such as Docker.

Building Data Science Capacity at KAUST

KVL has organized a series of Introduction to Data Science workshops to build capacity in the core data science tools and enable future data science applications at KAUST.

• Introduction to Python for Data Science, 1 September 2019 and 3 November 2019. 
• Introduction to R for Data Science, 2 September 2019 and 4 November 2019.
• Introduction to Conda for (Data) Scientists, 10 September 2019. 
• Introduction to Shell for (Data) Scientists, 15 September 2019. 
• Introduction to Version Control using Git for (Data) Scientists, 29 September 2019. 
• Introduction to SQL for Data Science, 13 October 2019.

The core workshop material largely follows a curriculum developed by Software Carpentry and Data Carpentry, two global non-profit organizations that teach foundational coding and data science skills to researchers worldwide. The full curriculum will be offered every Fall and Spring semester to give KAUST students, post-docs, staff, and researchers an opportunity to develop their skills in these core data science tools.

Helping to Advance the State of the Art in Data Science at KAUST

In addition to building capacity in core data science tools, KVL and the KAUST Supercomputing Core Laboratory (KSL) are planning to offer advanced training courses in tools used in state-of-the-art data science applications, with a particular focus on enabling data science with GPUs.