STATS 4022 - Data Science - Honours

North Terrace Campus - Semester 2 - 2024

This course will introduce the fundamental concepts of modern data science. It will provide students with tools to deal with real, messy data, an understanding of the appropriate methods to use, and the ability to use these tools safely. Topics will include data structures; regression models including lasso regression, ridge regression and non-linearity with splines; classification models including logistic regression, linear discriminant analysis, support vector machines and random forests; and unsupervised learning methods such as principal component analysis, k-means and hierarchical clustering. The practical skills will be focused on data science in R.

  • General Course Information
    Course Details
    Course Code: STATS 4022
    Course: Data Science - Honours
    Coordinating Unit: Mathematical Sciences
    Term: Semester 2
    Level: Undergraduate
    Location/s: North Terrace Campus
    Units: 3
    Contact: Up to 3 hours per week
    Available for Study Abroad and Exchange: Y
    Prerequisites: STATS 2107 or (MATHS 2201 and MATHS 2202) or (MATHS 2106 and MATHS 2107)
    Incompatible: STATS 3022
    Assumed Knowledge: Experience with the statistical package R such as would be obtained from STATS 1005 or STATS 2107.
    Restrictions: Honours students only
    Course Staff

    Course Coordinator: Dr Jono Tuke

    Course Timetable

    The full timetable of all activities for this course can be accessed from Course Planner.

  • Learning Outcomes
    Course Learning Outcomes
    Syllabus:

    The topics covered will include:

    Overview of modelling framework
    Preprocessing
    Model theory
    Resampling
    Penalised regression
    Classification modelling
    LDA / SVM
    Non-parametric
    Trees
    Random forests
    Feature selection
    Unsupervised learning
    Learning outcomes:

    On successful completion of this course, students will:

    1. Demonstrate an understanding of the foundational principles of machine learning.
    2. Recognise which method to use for a given data analysis problem.
    3. Demonstrate an understanding of the statistical underpinning of the chosen method.
    4. Implement safely any chosen method and interpret the results.
    5. Be confident in applying the methods to large datasets.
    6. Apply the theory in the course to solve a range of problems at an appropriate level of difficulty.
    University Graduate Attributes

    This course will provide students with an opportunity to develop the Graduate Attribute(s) specified below:

    University Graduate Attribute Course Learning Outcome(s)

    Attribute 1: Deep discipline knowledge and intellectual breadth

    Graduates have comprehensive knowledge and understanding of their subject area, the ability to engage with different traditions of thought, and the ability to apply their knowledge in practice including in multi-disciplinary or multi-professional contexts.

    1, 2, 3, 4, 5, 6

    Attribute 2: Creative and critical thinking, and problem solving

    Graduates are effective problem-solvers, able to apply critical, creative and evidence-based thinking to conceive innovative responses to future challenges.

    2, 3, 5, 6

    Attribute 3: Teamwork and communication skills

    Graduates convey ideas and information effectively to a range of audiences for a variety of purposes and contribute in a positive and collaborative manner to achieving common goals.

    6

    Attribute 4: Professionalism and leadership readiness

    Graduates engage in professional behaviour and have the potential to be entrepreneurial and take leadership roles in their chosen occupations or careers and communities.

    5, 6

    Attribute 7: Digital capabilities

    Graduates are well prepared for living, learning and working in a digital society.

    1, 2, 3, 4, 5, 6

    Attribute 8: Self-awareness and emotional intelligence

    Graduates are self-aware and reflective; they are flexible and resilient and have the capacity to accept and give constructive feedback; they act with integrity and take responsibility for their actions.

    4
  • Learning & Teaching Activities
    Learning & Teaching Modes
    The course structure consists of:

    - Weekly topic videos, watched in the student's own time.
    - One workshop a week on advanced R methods, held in workshop time.
    - One implementation workshop a week, held in practical time.
    Workload

    The information below is provided as a guide to assist students in engaging appropriately with the course requirements.

    Activity              Quantity   Workload hours
    Topic videos          12         12
    Practicals            12         24
    Advanced R workshop   12         24
    Assignments           3          51
    Online tests          3          33
    Online quizzes        12         12
    Total                            156
    Learning Activities Summary
    Week Topic
    1 Assessing model accuracy/Bias-Variance
    2 Regression models/Classification/ROC
    3 EDA/Pre-processing
    4 LDA/QDA/naïve Bayes/CV
    5 Model selection/Ridge regression
    6 Lasso/PCR/PLS
    7 Polynomial and step functions/Smoothing splines/LOESS
    8 MARS/GAM/NLS/Decision trees
    9 Bagging/Boosting/RF/SVM
    10 PCA/Clustering
    11 MDS/EM
    12 VIP
  • Assessment

    The University's policy on Assessment for Coursework Programs is based on the following four principles:

    1. Assessment must encourage and reinforce learning.
    2. Assessment must enable robust and fair judgements about student performance.
    3. Assessment practices must be fair and equitable to students and give them the opportunity to demonstrate what they have learned.
    4. Assessment must maintain academic standards.

    Assessment Summary
    Assessment                Percent of final mark
    Online quizzes            5
    Written assignments (3)   15
    Tests (3)                 30
    Practical exam            25
    Written exam              25
    Assessment Detail
    Assessment Distributed Due Weighting
    A1 Week 2 Friday Week 4 5%
    A2 Week 6 Friday Week 8 5%
    A3 Week 10 Friday Week 12 5%
    Test 1 Week 2 10%
    Test 2 Week 6 10%
    Test 3 Week 10 10%
    Online quizzes Weekly Weekly 5%
    Practical exam Week 13 Week 13 25%
    Written exam Exam period Exam period 25%
    Submission
    Homework assignments must be submitted on MyUni. It will be assumed that students have read and accepted the Academic Honesty Statement on MyUni.

    Assignments will be returned within two weeks. Students may apply to be excused from or obtain an extension for an assignment for medical or compassionate reasons. Documentation is required and the lecturer must be notified as soon as possible.

    Course Grading

    Grades for your performance in this course will be awarded in accordance with the following scheme:

    M11 (Honours Mark Scheme)
    Grade                Grade reflects following criteria for allocation of grade   Reported on Official Transcript
    Fail                 A mark between 1-49                                         F
    Third Class          A mark between 50-59                                        3
    Second Class Div B   A mark between 60-69                                        2B
    Second Class Div A   A mark between 70-79                                        2A
    First Class          A mark between 80-100                                       1
    Result Pending       An interim result                                           RP
    Continuing           Continuing                                                  CN
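
    As a rough illustration of how the Assessment Summary weightings and the M11 bands fit together, the following Python sketch computes a weighted final mark and looks up the corresponding transcript code. The function names, dictionary keys and example marks are hypothetical. The sketch assumes that each component mark is a percentage and that components are combined linearly, with no rounding, scaling or hurdle requirements; none of this is stated in the outline. It also treats any mark below 50 as a Fail.

```python
# Weights (percent of final mark) copied from the Assessment Summary.
WEIGHTS = {
    "online_quizzes": 5,
    "written_assignments": 15,
    "tests": 30,
    "practical_exam": 25,
    "written_exam": 25,
}

# Lower bound of each passing M11 band, highest first.
GRADE_BANDS = [
    (80, "1"),   # First Class
    (70, "2A"),  # Second Class Div A
    (60, "2B"),  # Second Class Div B
    (50, "3"),   # Third Class
]


def final_mark(component_marks):
    """Return the weighted mark out of 100.

    Assumption: each component mark is a percentage (0-100) and the
    components are combined linearly, without rounding or hurdles.
    """
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * component_marks[name] for name in WEIGHTS) / 100


def honours_grade(mark):
    """Map a final mark to the transcript code in the M11 scheme.

    Assumption: anything below 50 is reported as F.
    """
    for lower_bound, code in GRADE_BANDS:
        if mark >= lower_bound:
            return code
    return "F"


# Made-up component marks, for illustration only.
example_marks = {
    "online_quizzes": 90,
    "written_assignments": 80,
    "tests": 70,
    "practical_exam": 75,
    "written_exam": 65,
}

mark = final_mark(example_marks)
print(mark, honours_grade(mark))  # prints "72.5 2A"
```

    For these made-up marks the weighted sum is 72.5, which falls in the 70-79 band (Second Class Div A).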

    Further details of the grades/results can be obtained from Examinations.

    Grade Descriptors are available which provide a general guide to the standard of work that is expected at each grade level. More information at Assessment for Coursework Programs.

    Final results for this course will be made available through Access Adelaide.

  • Student Feedback

    The University places a high priority on approaches to learning and teaching that enhance the student experience. Feedback is sought from students in a variety of ways including on-going engagement with staff, the use of online discussion boards and the use of Student Experience of Learning and Teaching (SELT) surveys as well as GOS surveys and Program reviews.

    SELTs are an important source of information to inform individual teaching practice, decisions about teaching duties, and course and program curriculum design. They enable the University to assess how effectively its learning environments and teaching practices facilitate student engagement and learning outcomes. Under the current SELT Policy (http://www.adelaide.edu.au/policies/101/) course SELTs are mandated and must be conducted at the conclusion of each term/semester/trimester for every course offering. Feedback on issues raised through course SELT surveys is made available to enrolled students through various resources (e.g. MyUni). In addition, aggregated course SELT data is available.

  • Student Support
  • Policies & Guidelines
  • Fraud Awareness

    Students are reminded that in order to maintain the academic integrity of all programs and courses, the university has a zero-tolerance approach to students offering money or goods or services of significant value to any staff member who is involved in their teaching or assessment. Students offering lecturers, tutors or professional staff anything more than a small token of appreciation is totally unacceptable, in any circumstances. Staff members are obliged to report all such incidents to their supervisor/manager, who will refer them for action under the university's student disciplinary procedures.

The University of Adelaide is committed to regular reviews of the courses and programs it offers to students. The University of Adelaide therefore reserves the right to discontinue or vary programs and courses without notice. Please read the important information contained in the disclaimer.