How the UCAT Score is Calculated

Many students find UCAT scaling mysterious, but it doesn’t have to be. In this article, we explain how the UCAT is scaled, the method most likely used, and how that method can be applied.

Your UCAT score governs which course you get into, or don’t. So how is it calculated? Below, we set out the likely way the UCAT score is scaled and calculated, and discuss the IRT method of scaling.

 

UCAT Score

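As you know, the UCAT test consists of five subtests: Verbal Reasoning, Decision Making, Quantitative Reasoning, Abstract Reasoning and Situational Judgement.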

The result received after the test is completed will look something like this:

UCAT ANZ SUBTEST SCORES
Verbal Reasoning: 700
Decision Making: 710
Quantitative Reasoning: 830
Abstract Reasoning: 730
Total Score: 2970
Situational Judgement: 694

 

 

UCAT Scaling

The result for each subtest is a scaled score between 300 and 900.

The first four subtests all test different cognitive abilities, and their scaled scores are added together to give a total score between 1200 and 3600.

The Situational Judgement subtest does not test cognitive abilities, so its score is presented on its own.
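As a quick check of the arithmetic, here is a minimal Python sketch that adds up the four cognitive subtest scores from the sample result above. The function name `total_score` and the range check are our own illustration, not anything published by the UCAT Consortium.

```python
# Scaled scores for the four cognitive subtests, taken from the sample result above.
# Situational Judgement is left out because it is reported separately.
cognitive_scores = {
    "Verbal Reasoning": 700,
    "Decision Making": 710,
    "Quantitative Reasoning": 830,
    "Abstract Reasoning": 730,
}


def total_score(scores):
    """Add up the subtest scores, checking that each lies between 300 and 900."""
    for name, score in scores.items():
        if not 300 <= score <= 900:
            raise ValueError(f"{name} score {score} is outside 300-900")
    return sum(scores.values())


print(total_score(cognitive_scores))  # 2970
```

With four subtests each scored from 300 to 900, the lowest possible total is 4 x 300 = 1200 and the highest is 4 x 900 = 3600.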

Why is the UCAT Scaled?

There are two main reasons why the UCAT is scaled:

1. Scaling produces a normalised score that allows the performance in the different subtests to be compared.

This is required because the five subtests are so different: each tests a different ability and has different question types, a different number of questions and a different time limit.

 

2. Scaling allows the difficulty of the question to be accounted for.

Since UCAT questions are mostly worth one mark (with the exception of two-mark questions in Decision Making and partial marks in Situational Judgement), the hardest questions would be worth just as much as the easiest questions if the scores were not scaled.

 

So, what is the method for scaling UCAT?

The UCAT is most likely scaled using a method called Item Response Theory (IRT, or IRT scaling). The UCAT Consortium does not release details of how its scaling is carried out, so what follows is one way it could have been implemented.

Item Response Theory is a method of estimating a student’s “ability” in a certain area by considering both their mark in every question and the relative difficulty of each question.

As a result, the scaled score given is a reflection of the student’s ability, rather than a direct conversion of their raw mark from the test.

In IRT, a correct answer on a harder question is worth more than a correct answer on an easier question.

For example, two students who get the same raw mark will not necessarily get the same score. It will depend on which questions they got right and wrong. The student who got the more difficult questions correct will get a higher score.

I love maths and data, tell me how IRT scaling works!

Okay, here we go with the nitty-gritty data details.

For the scaling to be applied, the difficulty of each question must be determined. This can be achieved by testing the question out on a large group of students and seeing what fraction of students choose the correct answer.
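To make this concrete, here is a minimal Python sketch of turning trial results into a difficulty value. It converts the fraction of students who answered correctly into a number on the "logit" scale, which is a common rough starting point in IRT. The trial numbers and the function name `difficulty_from_trial` are made up for illustration, and a real calibration would be more involved than this.

```python
import math


def difficulty_from_trial(num_correct, num_attempts):
    """Convert a trial result into a rough difficulty on the logit scale.

    Higher values mean harder questions; 0.0 means half the students
    answered correctly.
    """
    p = num_correct / num_attempts
    if not 0 < p < 1:
        raise ValueError("need at least one correct and one incorrect answer")
    return math.log((1 - p) / p)


# Hypothetical trial data: (students answering correctly, students attempting).
trial = {"Q1": (900, 1000), "Q2": (500, 1000), "Q3": (200, 1000)}

for question, (correct, attempts) in trial.items():
    print(question, round(difficulty_from_trial(correct, attempts), 2))
```

In this made-up data, Q1 (answered correctly by 90% of students) comes out with a negative difficulty, Q2 (50%) sits at zero, and Q3 (20%) comes out positive, so it is the hardest of the three.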

Once the difficulty is determined, a statistical model is developed that relates the student’s ability to their performance in the test. The statistical model will essentially say:

“If the student’s ability corresponds to a score of X, then the student is most likely to get these Y questions correct, and these Z questions incorrect.”
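A model like this is usually expressed in terms of probabilities rather than certainties. The Python sketch below uses the simplest IRT variant, the one-parameter (Rasch) model, in which the chance of a correct answer depends only on the gap between the student's ability and the question's difficulty. Choosing this variant is our assumption, since the actual UCAT model is not public, and the difficulty values are hypothetical.

```python
import math


def p_correct(ability, difficulty):
    """Rasch model: chance that a student answers one question correctly."""
    return 1 / (1 + math.exp(-(ability - difficulty)))


# Hypothetical question difficulties on the logit scale (higher = harder).
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

for ability in (-1.0, 0.0, 1.0):
    probs = [p_correct(ability, b) for b in difficulties]
    expected_raw_mark = sum(probs)  # predicted number of correct answers
    print(ability, [round(p, 2) for p in probs], round(expected_raw_mark, 2))
```

When ability equals difficulty the chance of a correct answer is exactly 50%. It rises towards 100% for questions that are easy relative to the student and falls towards 0% for questions that are hard. Adding up the probabilities gives the raw mark the model predicts for a student of that ability.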

The model is then applied to a student’s result and attempts to answer the question:

“Given the difficulty of each question and this student’s responses, what is their most likely ability score?”

This is how the UCAT scaled score between 300 and 900 is determined.

The score is determined by estimating a student’s ability based on their responses, and refining that estimate until the difference between their actual result and the result predicted by the model is as small as possible. This is done separately for each UCAT subtest.
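The refinement loop described above can be sketched in Python as follows. This assumes the one-parameter (Rasch) model and uses Newton-Raphson steps to find the most likely ability. Both choices are ours for illustration, as are the function names and the question difficulties. How an ability estimate like this would be converted to the 300 to 900 scale is not public, so that step is not shown.

```python
import math


def p_correct(ability, difficulty):
    """Rasch model: chance that a student answers one question correctly."""
    return 1 / (1 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, difficulties, tolerance=1e-6, max_steps=50):
    """Estimate ability from 1/0 responses by maximum likelihood.

    Repeatedly nudges the estimate until the raw mark the model predicts
    matches the raw mark the student actually got.
    """
    raw_mark = sum(responses)
    if raw_mark == 0 or raw_mark == len(responses):
        # All wrong or all correct: the likelihood keeps improving as the
        # estimate moves towards minus or plus infinity.
        raise ValueError("no finite estimate for all-wrong or all-correct responses")

    ability = 0.0  # starting guess
    for _ in range(max_steps):
        probs = [p_correct(ability, b) for b in difficulties]
        gap = raw_mark - sum(probs)  # actual minus predicted raw mark
        if abs(gap) < tolerance:
            break
        information = sum(p * (1 - p) for p in probs)
        step = gap / information
        ability += max(-1.0, min(1.0, step))  # limit the step size for stability
    return ability


# Hypothetical question difficulties and one student's responses (1 = correct).
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [1, 1, 1, 0, 0]
print(round(estimate_ability(responses, difficulties), 2))
```

Two things are worth noting about this sketch. First, a student who answers every question correctly (or incorrectly) has no finite estimate, which is why the function refuses those cases. Second, in this simplest variant the estimate depends only on the raw mark and the difficulties of the questions in the test. Variants with extra per-question parameters are needed before the particular pattern of right and wrong answers changes the result.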

Each year, some undisclosed UCAT questions are not counted when determining the final score, but students don’t know which questions these are. It is plausible that these questions are being trialled so that their difficulty can be determined before they are used in future tests.

In addition, the student scores would be re-analysed to re-determine the difficulty of each question, and if there are any inconsistencies (e.g. a question believed to be difficult was answered correctly by many students) the process is re-calibrated until consistency is achieved.

This IRT method is intended to make the scaling fair.

However, this method of scaling does not discriminate well at the high and low ends of the ability scale (e.g. you can’t tell the difference between two students who got full marks). It is also sensitive to the overall average score, so it might not be enough to get a good score: you will need to get a better score than others.

 

Why do I need to know about UCAT scaling?

Honestly, you don’t! This will only help satisfy your curiosity, should you be curious about these things. We care about it because we love data and we scale the results of our UCAT Mock Exam Day.

When it comes to the UCAT, you should not worry about scaling or try to take it into account in any way. Just focus on answering the questions and doing your best!
