Artificial intelligence (AI) models can achieve better average grades in university writing assignments than real-life students, a new report has suggested.
The research, published in Scientific Reports, found ChatGPT matched or exceeded the efforts of students when answering assessment questions in subjects including computer science, political studies, engineering, and psychology.
It also found almost three-quarters of students surveyed (74%) would use ChatGPT to help with their assignments, despite 70% of educators viewing it as plagiarism.
ChatGPT – a chatbot that can provide detailed prose responses and engage in human-like conversations using prompts – burst into the public consciousness following its release in November last year.
In the research, faculty members on 32 courses at New York University Abu Dhabi (NYUAD) provided three student submissions each for 10 assessment questions.
ChatGPT was also asked to produce three sets of answers to the 10 questions, which were then assessed alongside the students’ responses by three blind graders.
The findings showed ChatGPT-generated answers achieved a similar or higher average grade than students in 12 of the 32 courses, with maths and economics the only two disciplines where students consistently outperformed AI.
The gap in performance between ChatGPT and students was much smaller on questions requiring high levels of knowledge and cognitive process, compared to those requiring intermediate levels.
ChatGPT only outperformed the students on questions requiring factual knowledge, as opposed to skills like creativity, and struggled most in comparison to students where trick questions were included in the assignment.
According to the report, there was a consensus among educators and students that the use of ChatGPT in schoolwork should be acknowledged.
Students also agreed that, in their future jobs, they would be able to outsource mundane tasks to ChatGPT, allowing them to focus on substantive and creative work.
Talal Rahwan and Yasir Zaki, computer science professors at NYUAD who led the project, said: “AI tools such as ChatGPT have already reached a level where they can outperform students in a considerable number of university-level courses.
“Moreover, as our survey indicates, the majority of students intend to use such tools to solve homework assignments.
“These findings suggest that evaluating students through homework assignments may no longer serve its purpose in the age of AI, raising a serious challenge for educational institutions worldwide.”
Mr Rahwan and Mr Zaki added that educational institutions need to “urgently craft appropriate academic integrity policies” as a means of regulation.
Two tools used for identifying AI-generated text also struggled to correctly state the origin of the assignments in the research.
OpenAI’s Text Classifier misclassified almost half (49%) of ChatGPT’s submissions as human-generated, whilst GPTZero did the same for around a third (32%).
Mr Rahwan and Mr Zaki said: “Current AI-text classifiers cannot reliably detect ChatGPT’s use in schoolwork, due to both their propensity to classify human-written answers as AI-generated, as well as the relative ease with which AI-generated text can be edited to evade detection.
“This suggests that educators need to come up with alternative solutions to integrate, rather than prevent, the use of AI in schoolwork.”