October 2, 2022

PDF Grading - University of Birmingham

Abstract

Graide is an assessment and feedback platform designed to speed up grading and improve the consistency and quality of feedback. It uses technology to automate administrative tasks and optionally uses artificial intelligence to reduce repetition when giving feedback.

This case study looks at the use of PDF grading in the 2021/22 academic year at the University of Birmingham in the Physics & Astronomy, Engineering, Materials and Metallurgy, and Foundation Year departments.

46 staff and postgraduate teaching assistants used Graide during the pilot, and 31 participated in the survey. Headline statistics from the pilot are below:

Introduction

Marking and feedback are an integral part of higher education. Yet this component places stress on educators, takes up a significant amount of time, and leaves students unsatisfied.

University of Birmingham postgraduates are hired for 11.2 minutes of grading per script on medium-length assessments. This can be two to three times longer for exams or coursework. In addition, marks for each question have to be summed and often transferred to university systems, which can add a minute or more per student and is prone to errors. Finally, these problems are reflected in National Student Survey scores: assessment and feedback is consistently the lowest-scoring “education delivery” category at most universities across the country.

Graide is a spin-out founded by University of Birmingham postgraduate students, designed to directly address these issues.

In the rest of this case study we will introduce Graide and its functionality, define success for this pilot, describe the methodology for measuring against the criteria, and show the results from this pilot.

What is Graide

Graide is an end-to-end assessment and feedback platform for STEM. It lets you build assignments in minutes that you can deliver digitally or on paper. 

Administrative benefits of Graide include:

Grading benefits of Graide include:

Examples of the grading interface for both PDF and digital answers are below.

AI suggestions follow an assistive workflow: for one question at a time, suggestions are drawn from previously graded answers to that question.

Defining Success

The University of Birmingham wanted to improve assessment and feedback in a variety of areas. They defined success by the following metrics that, when put together, make a meaningful impact on assessment and feedback:

Methodology

As we did not have access to historical data against which to compare marking times, we relied on user surveys and data analytics from system usage.

User Surveys

Users were surveyed on their opinions in two sections:

  1. Comparing Graide to other marking systems they have used (e.g. on paper or Canvas SpeedGrader)
  2. Usability of Graide as a platform

In section one we asked the following questions, with the corresponding definitions:

Consistent feedback is when answers with the same method get the same marks and feedback, in order to be fair to students.

High-quality feedback is when: enough is provided and in enough detail; it focuses on students’ performance/learning/actions, rather than them or their characteristics; it is timely (received while it still matters and in time for further learning/assistance); it is appropriate to the assignment's criteria for success; it is related to students’ understanding of what they are supposed to be doing.

Each question was multiple choice, with options indicating the following set of changes:

E.g. faster / slower, easier / harder.

In the next section we asked users to what degree they agree or disagree with the following statements, where 1 is strongly disagree, 2 is disagree, 3 is neutral, 4 is agree, and 5 is strongly agree.

Data analytics from the system usage

In addition to user surveys we have analysed data from the usage of Graide. This includes:

When each question is loaded into the grading interface, the time is logged, and when the marking is submitted, the time is logged again. The difference between the two gives an estimate of the time taken to grade the question. It is only an estimate because it cannot account for breaks taken by the user.
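As an illustration of the timing estimate described above, the following is a minimal sketch and not Graide's actual implementation. The function name and the use of Python datetime values for the two logged times are assumptions made for this example.

```python
from datetime import datetime


def grading_minutes(loaded_at: datetime, submitted_at: datetime) -> float:
    """Estimate grading time as the gap between load and submit, in minutes.

    Hypothetical helper: any break the marker takes between the two
    logged times is included, so the result is an upper-bound estimate.
    """
    return (submitted_at - loaded_at).total_seconds() / 60.0


# Example with made-up timestamps: loaded at 10:00:00, submitted at 10:02:30.
loaded = datetime(2022, 3, 1, 10, 0, 0)
submitted = datetime(2022, 3, 1, 10, 2, 30)
print(grading_minutes(loaded, submitted))
```

Because breaks are indistinguishable from marking time in this scheme, long gaps inflate the estimate for individual questions.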

As the feedback is digital we can count the words of feedback given to the student submissions. As feedback often involves mathematical symbols, a “word” is defined as a string of characters separated by a space.
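The word definition above can be sketched in a few lines. This is an illustrative example only: the function name is hypothetical, and dropping the empty strings produced by repeated spaces is an assumption not stated in the definition.

```python
def count_feedback_words(feedback: str) -> int:
    """Count "words" as runs of characters separated by spaces.

    Mathematical symbols such as "=" or "m/a" count as words in their
    own right. Empty strings caused by consecutive spaces are ignored
    (an assumption made for this sketch).
    """
    return len([token for token in feedback.split(" ") if token])


# "Use", "F", "=", "ma,", "not", "F", "=", "m/a" are each one word.
print(count_feedback_words("Use F = ma, not F = m/a"))
```

Under this definition punctuation attached to a token does not create an extra word, so "ma," counts once.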

Additionally, because a rubric is used, we can count the words of feedback within the rubric and calculate the reduction in repetition.
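The case study does not give the formula used for the reduction in repetition. One plausible way to compute it, shown purely as an assumption, is to compare the words a marker writes once in the rubric with the words students receive when each rubric item is reused. The function names and the example rubric below are hypothetical.

```python
def count_words(text: str) -> int:
    """Count space-separated tokens, ignoring empty strings."""
    return len([token for token in text.split(" ") if token])


def repetition_reduction(rubric_usage: dict[str, int]) -> float:
    """Fraction of delivered feedback words the marker did not retype.

    rubric_usage maps each rubric comment to the number of times it was
    applied. Assumed formula: 1 - (words written once) / (words delivered).
    """
    written = sum(count_words(comment) for comment in rubric_usage)
    delivered = sum(
        count_words(comment) * uses for comment, uses in rubric_usage.items()
    )
    if delivered == 0:
        return 0.0
    return 1.0 - written / delivered


# Made-up example: a 6-word comment used 10 times and a 2-word comment
# used 5 times. Written = 8 words, delivered = 70 words.
usage = {"Check the sign of the acceleration": 10, "Units missing": 5}
print(round(repetition_reduction(usage), 3))
```

A value near 1 would mean almost all delivered feedback came from reused rubric items; a value of 0 would mean every comment was used only once.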

Results

In this section we see that the user survey indicates Graide hits the success criteria set out in the previous section. Users found that Graide:

Whilst analysing the data from system usage we found that:

Additionally 94% of users want to continue using Graide.

User Surveys

46 staff have used Graide to mark work, and 31 participated in the survey. Any comparisons are to existing systems such as Canvas SpeedGrader or grading work on paper.

System Marking Time Analysis

The histogram below shows the time taken to grade each question in the data set. The distribution has a long tail, but a large share of questions fall at the shorter times.

Average amount of time (mins) to grade an assignment

Each row represents a unique assignment which has been redacted for confidentiality.

Average amount of feedback (words) on an assignment

Each row represents a unique assignment which has been redacted for confidentiality.

Conclusion

Graide is a platform designed to alleviate the pains associated with assessment and feedback. It does this in three ways:

The purpose of this pilot was to assess whether these claims are applicable to the assessment and feedback processes in certain University of Birmingham departments.

After Graide was tested by 46 staff across 16 modules to grade 38 assignments (22,532 questions), we found that Graide