Undergraduate Course: Scalable Data Management Systems (INFR11123)
Course Outline
School | School of Informatics |
College | College of Science and Engineering |
Credit level (Normal year taken) | SCQF Level 11 (Year 4 Undergraduate) |
Availability | Available to all students |
SCQF Credits | 10 |
ECTS Credits | 5 |
Summary | The course focuses on the core systems-building aspects of data management. The material is driven by technology, techniques, and architectures. A key aspect of the course is to weave technology together with algorithms and systems, presenting fundamental data-centric computing challenges in light of the systems being built to address them. The course content is dynamic and continuously updated to cover the state of the art in scalable data management systems. |
Course description |
Background: Fundamental challenges introduced by data-centric computation; where current practices stop and where the need for new techniques arises; how new technology addresses this need and what challenges arise with the introduction of this new technology; how the hardware and software substrates of computing platforms come together in addressing these challenges.
* Technology: Existing technology is not enough to deal with contemporary data needs, both at a single-server level and at the distributed computation level. We will first deal with advances in memory technology and specifically flash, solid-state, and non-volatile memory. We will introduce the notions of massively parallel processors and discuss the programming models and performance implications that come with incorporating them into the systems stack.
* Techniques: We will then introduce rack-scale computing and remote direct memory access as mechanisms by which future deployments will leverage this newly found power. We will discuss techniques for ensuring high performance in storing and accessing data, algorithms for data management, techniques for ensuring data consistency, and protocols for data coherence.
* Architectures: We will focus on new architectures and programming substrates for building systems that access and manipulate large volumes of data with high performance. We will discuss NoSQL and cluster-based solutions, and introduce the notion of high-level languages for managing data on clusters.
The course will take a practical approach towards introducing scalable data management systems. We will start from the ground up, first focusing on cutting-edge technology, moving on to data structures and algorithms, and then building systems for large-scale deployments.
|
Entry Requirements (not applicable to Visiting Students)
Pre-requisites |
|
Co-requisites | |
Prohibited Combinations | |
Other requirements | None |
Information for Visiting Students
Pre-requisites | None |
High Demand Course? | Yes |
Course Delivery Information
|
Academic year 2017/18, Available to all students (SV1)
|
Quota: None |
Course Start | Semester 2 |
Timetable |
Learning and Teaching activities (Further Info) |
Total Hours: 100 (Lecture Hours 18, Seminar/Tutorial Hours 8, Feedback/Feedforward Hours 1, Summative Assessment Hours 1, Programme Level Learning and Teaching Hours 2, Directed Learning and Independent Learning Hours 70)
|
Assessment (Further Info) |
Written Exam 0%, Coursework 100%, Practical Exam 0%
|
Additional Information (Assessment) |
For proper evaluation, students must be presented with real problems rather than 'toy' ones that can be solved in a very limited time. The focus of this course will be on large-scale problem solving and critical thinking. To that end, a list of projects will be given out at the beginning of the semester; each student will pick one and will have until the end of Week 3 to shape up the project description.
Assessment weightings:
Written Examination: 0%
Practical Examination: 0%
Coursework: 100%
The students will deliver their work in three instalments:
* a refinement of the project description at the end of Week 4 (worth 25%);
* an architectural blueprint and an implementation plan at the end of Week 6 (worth 25%); and
* a final (short) report and a demonstration of their work at the end of the semester (worth 50%).
|
Feedback | Not entered |
No Exam Information |
Learning Outcomes
On completion of this course, the student will be able to:
- Describe and justify the differences between large-scale data management and general distributed computing and the need for customisation.
- Deploy state-of-the-art hardware concepts like persistent memory in a stand-alone application or in the context of a data management system and justify that the deployment is appropriate.
- Build scalable data management applications using multi-core and heterogeneous architectures to provide single-server massively parallel deployments.
- Implement state-of-the-art query processing techniques (e.g. code generation) in either a stand-alone application or in the context of a managed runtime and be able to identify appropriate contexts for such techniques.
- Implement state-of-the-art distributed data management techniques in either a stand-alone application or in the context of a data processing system and describe the storage and query processing algorithms involved.
|
Additional Information
Course URL | http://course.inf.ed.ac.uk/sdms/ |
Graduate Attributes and Skills | Not entered |
Keywords | Database systems, Data management, Scalability, Hardware, Software Engineering |
Contacts
Course organiser | Dr Stratis Viglas
Tel: (0131 6)50 5183
Email: |
Course secretary | Mr Gregor Hall
Tel: (0131 6)50 5194
Email: |
|
© Copyright 2017 The University of Edinburgh - 6 February 2017 8:10 pm
|