Undergraduate Course: Introduction to Data Science (MATH08077)
Course Outline
School | School of Mathematics |
College | College of Science and Engineering |
Credit level (Normal year taken) | SCQF Level 8 (Year 1 Undergraduate) |
Availability | Not available to visiting students |
SCQF Credits | 20 |
ECTS Credits | 10 |
Summary | This is an introductory-level course on data science and statistical thinking. Students will learn to explore, visualize, and analyze data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, and to do so in a reproducible and shareable manner. In doing so, they will gain experience in data wrangling and visualization, exploratory data analysis, predictive modelling, and effective communication of results while working on problems and case studies inspired by and based on real-world questions. The course will focus on the R statistical computing language, together with the use of GitHub for version control and code collaboration. No statistical or computing background is necessary. |
Course description |
This course comprises three learning units:
Unit 1 - Understanding and exploring data: This unit focuses on data types and classes, wrangling, and visualization.
Unit 2 - Modelling and prediction: This unit introduces simple and multiple linear regression models (prediction) and logistic regression (classification) models, with a focus on interpretations, visualizing interactions, model selection, and model validation.
Unit 3 - Making rigorous conclusions: This unit introduces statistical inference for drawing data-based conclusions from a simulation-based perspective, focusing on model validation (cross-validation) and uncertainty quantification (bootstrapping).
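As an illustration of the simulation-based approach named in Unit 3, the following is a minimal sketch of a percentile bootstrap interval for a sample mean. It is not taken from the course materials: the course itself uses R, Python is used here only for illustration, and the function name `bootstrap_ci`, its parameters, and the data values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper, for illustration)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Cut off an equal share of the resampled estimates in each tail.
    alpha = (1 - level) / 2
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Invented data, purely for illustration.
heights_cm = [162, 171, 158, 180, 175, 169, 166, 173, 177, 164]
low, high = bootstrap_ci(heights_cm)
print(f"95% bootstrap interval for the mean: ({low:.1f}, {high:.1f})")
```

The spread of the resampled means stands in for the sampling variability of the estimate, which is why no distributional formula is needed.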
|
Course Delivery Information
|
Academic year 2024/25, Not available to visiting students (SS1)
|
Quota: 0 |
Course Start |
Semester 1 |
Timetable |
Learning and Teaching activities (Further Info) |
Total Hours: 200 (Lecture Hours 22, Seminar/Tutorial Hours 17, Supervised Practical/Workshop/Studio Hours 11, Summative Assessment Hours 3, Programme Level Learning and Teaching Hours 4, Directed Learning and Independent Learning Hours 143)
|
Assessment (Further Info) |
Written Exam 0%, Coursework 100%, Practical Exam 0%
|
Feedback |
Not entered |
No Exam Information |
Learning Outcomes
On completion of this course, the student will be able to:
- employ all stages of a modern data science pipeline, including data import, visualization, modelling, and communication.
- critique data-based claims and evaluate data-based decisions.
- interpret results correctly, effectively, and in context with or without relying on statistical jargon.
- use the statistical computing language R to perform fully reproducible data analyses.
- use the basics of Git and GitHub for the purpose of code collaboration and reproducibility.
|
Reading List
There is no compulsory course text, but weekly slides and related reading documents are shared regularly during the semester. The following books are useful complements to parts of the course for those who prefer learning from textbooks. Both books are freely available online, and related readings are attached to the weekly course material.
- R for Data Science - Grolemund, Wickham. O'Reilly, 1st edition, 2016
- OpenIntro: Introduction to Modern Statistics - Çetinkaya-Rundel, Hardin. CreateSpace, Preliminary Edition, 2020
|
Additional Information
Graduate Attributes and Skills |
Not entered |
Keywords | IDS |
Contacts
Course organiser | Dr Cecilia Balocchi
Tel:
Email: |
Course secretary | Mrs Frances Reid
Tel: (0131 6)50 4883
Email: |
|
|