Assessment using genAI to develop evaluative judgement

Pattern

Stats

Context & Scale

This pattern is concerned with developing students’ critical thinking and evaluative judgement through a task that requires them to produce an assessment artefact with the aid of a generative artificial intelligence agent and reflect on the process of pulling together the components of the artefact. An assessment with this structure can replace an assessment that has been compromised by the advent of generative AI. The original assessment artefact is still produced, but students are explicitly required to use a generative AI agent, and the marks assigned to the assessment artefact are greatly reduced. The students’ explicit reflection on how the assessment was put together makes up the rest of the marks.

As this assessment format is designed to replace an existing written assessment, the marking workload is comparable to that of the assessment it replaces.

In addition, students can ask the agent as many questions as they want. These can include questions about the format of the assessment (as the agent has access to the information documents) and requests for feedback on their writing, which supports deployment at scale.

Problem

The drive to develop evaluative judgement (Tai et al., 2018) and digital literacy (Morgan et al., 2022) in tertiary-educated students has been accelerated by the arrival of generative AI. Traditional assessment models are no longer reliable ways for students to demonstrate their learning (Bearman et al., 2024). The sector has, however, been moving for some time towards assessment practices that focus more on the processes of student learning (Boud, 2000; Carless, 2015).

Solution

Replace a written assessment with one that still requires students to produce a written artefact, but with the aid of a generative AI agent. An explicit reflection on the process of deciding what material goes into the artefact carries the bulk of the marks in the new format.

Implementation

Determine the artefact that students will be producing, for example a report or an analysis.
Choose a generative AI agent to which all students have equal access, for instance Microsoft Copilot or a university-supported chatbot agent (such as the University of Sydney’s Cogniti).
Write very clear instructions on how the AI agent is to be used in the assessment. Many students expect an AI agent to be able to produce an acceptable report with one prompt. Part of the aim of an assessment like this is for students to learn to break down what information they need, then to judge its worth. You may want to consider instructing students to limit the agent’s output (for instance to 300 words) to scaffold their understanding of the process you are asking them to undertake. Require that students append their full conversations with the agent to their assessments.
Optional: write or program an agent that specialises in the artefact the students will be producing, connecting it to resources from the course.
Plan a practice session that will not only familiarise students with the agent, but also explain why being able to use generative AI in this way is important in their future work lives. Students have a wide range of literacy in using AI and will need guidance. They also need reassurance that they are developing transferable skills. A tutorial is an ideal time for such a familiarisation session.
Ensure that the marking rubric requires students to make explicit reference to the decisions made, and to the artefact produced, in their reflective writing. This way, students will have to go beyond the kinds of generic reflections that can be produced by generative AI.
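
If students are asked to keep the agent's replies under a word limit and to append their full conversations, a marker may want a quick way to see which replies went over. The following is a minimal Python sketch under stated assumptions: the conversation is represented as a list of (speaker, text) pairs, the speaker labels "student" and "agent" are hypothetical, and words are counted by splitting on whitespace. It does not reflect how any particular agent platform exports conversations.

```python
def count_words(text):
    """Count words by splitting on whitespace (a deliberately simple measure)."""
    return len(text.split())


def replies_over_limit(conversation, limit=300):
    """Return (turn index, word count) for each agent reply exceeding the limit."""
    return [
        (index, count_words(text))
        for index, (speaker, text) in enumerate(conversation)
        if speaker == "agent" and count_words(text) > limit
    ]


# Hypothetical appended conversation, for illustration only.
sample = [
    ("student", "What sections does a feasibility report need?"),
    ("agent", "word " * 320),
    ("student", "Please shorten that."),
    ("agent", "word " * 120),
]

flagged = replies_over_limit(sample, limit=300)
print(flagged)
```

A check like this only flags length. Judging whether students broke the task into useful questions still depends on reading the conversation and the reflection.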

Examples of pattern in use

Global Entrepreneurship (IBUS2105)

Acknowledgements

Joe Boulis, Dat Le and Rachael Lowe

Context

This pattern was evaluated in one semester of an undergraduate unit in International Business. The unit had 104 enrolled students.

Description

This unit develops a range of global entrepreneurship skills in students. Being able to critically evaluate information from a wide variety of sources is an important skill in entrepreneurship, so this assessment was deployed in week 3 to develop that skill early in the unit.

The assessment was worth 20 marks: 5 marks for a feasibility report and 15 marks for a reflection on putting the report together.
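
To make the weighting concrete, the split above puts a quarter of the marks on the report and three quarters on the reflection. The short Python sketch below works through that arithmetic; the function name `component_weights` and the dictionary keys are illustrative only and are not part of the unit's materials.

```python
from fractions import Fraction


def component_weights(marks):
    """Return each component's share of the total marks as an exact fraction."""
    total = sum(marks.values())
    return {name: Fraction(value, total) for name, value in marks.items()}


# Marks reported for this assessment: 5 for the report, 15 for the reflection.
assessment_marks = {"feasibility_report": 5, "reflection": 15}
weights = component_weights(assessment_marks)

for name, share in weights.items():
    print(f"{name}: {float(share):.0%} of the assessment")
```

The same function can be used to compare alternative splits when adapting the pattern to an assessment with a different total.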

Two identical Cogniti agents were written: one for the practice session in tutorial time in Week 2, and one for the assessment. This kept the two sets of conversations separate for evaluation purposes. The agents were designed to be experts in writing feasibility reports, and were connected to all unit resources on Canvas. Their output was limited to 400 words, the same limit as the feasibility report. This turned out to be confusing and will be amended in future iterations: the agent output may be limited to 300 words and the feasibility report to 500 words, as many students felt that they could not express their understanding within the word limit.

Technology and resources

The Cogniti agent, developed at The University of Sydney, was used to give all students equal access to a generative AI agent. It could be embedded in Canvas for ease of use, and its output could be shaped through how the agent was programmed. All student conversations (deidentified) were visible to the unit coordinator, so he could monitor the progress of the practice session and the students’ development of their assessment.

The practice session in tutorial time familiarised students with the use of the agent, and allowed them to ask questions in real time. This was a valuable exercise as it helped develop digital literacy, and legitimised the use of the tool for students.

The marking rubric was designed to reward explicit reflection on using external sources (students could add to the output of the agent in putting together their report) in order to scaffold evaluative judgement, and to protect against students using generative AI to write the reflection itself.

Findings

The teaching team reported a range of sophistication in using the agent, and a great deal of evidence of thoughtful reflection on the utility of the agent in producing a report.

A survey of students found that the agent was of at least some use in producing the report, but that they had to add other sources for a variety of reasons. They found it most useful for scoping the report and generating content. On the whole, they described its output in terms such as:

‘extremely limited’; ‘Everything it generated was underwhelming’; ‘Concepts and arguments explored in the generated report were extremely surface level and limited’; ‘I had to feed it so much information it defeated the point.’

However, a majority of respondents (11 of 16) would have liked to see a generative AI agent like this used in other assessments.

Students reported that they would have liked more practice and clearer instructions on how they were expected to use the agent; both will be built into the next iteration of the assessment.

About the Authors

Alison Casey