The Problem with AI Grading
Reflections on Automation, Assessment, and the Human Element in Education
Teaching research methods to graduate students always requires a delicate balance, and that balance becomes particularly challenging in a field as diverse as digital media. When most people think of digital media, they picture web design or social media, but our scope extends far beyond that — encompassing everything from computer animation to game design and virtual production. This breadth demands an equally expansive approach to research methodology. In my research methods course, we explore this rich landscape while grounding ourselves in foundational concepts from the philosophy of science. Students engage with seminal ideas like Popper's falsificationism and Kuhn's analysis of scientific revolutions, learning to apply these frameworks to contemporary digital media research.
The Evolution of Assessment in the AI Era
The assignments in my course on research methods in digital media have traditionally centered on analyzing popular media through academic lenses. One engaging example involves students watching "The Matrix" and examining Jean Baudrillard's claim that the film's directors misinterpreted the postmodernist philosophy of his "Simulacra and Simulation" — a book that was required reading for the film's actors. These assignments make complex philosophical ideas accessible by grounding them in familiar cultural references, and students gain a clearer understanding of abstract concepts by applying theoretical frameworks to media they regularly consume.
When designing assignments for this course, I've always emphasized critical thinking over mere regurgitation of facts. I encourage students to form their own opinions while exploring challenging theoretical frameworks. This approach worked well for years, fostering rich discussions and producing insightful analyses that often surprised me with their depth and creativity.
Over the last two years, however, the landscape of our classroom shifted dramatically: AI could suddenly complete virtually all our assignments in ways indistinguishable from human work. This realization wasn't simply about academic integrity — it challenged the very foundation of how we assess learning in higher education. If AI could generate sophisticated analyses of complex texts and concepts, what were our assignments actually measuring?
This insight led me to fundamentally reimagine my approach to assessment and grading. Rather than resisting technological advancements, I encouraged students to use AI tools for their essays, provided they acknowledged their use. The focus of assessment shifted to the classroom, where students actively discussed their work. Written assignments transformed from end products into conversation starters, launching points for deeper exploration and genuine intellectual exchange.
An Experiment with AI Grading
This year I decided to additionally bring ChatGPT o1 pro into my grading process. I wanted to see how an advanced AI system would analyze student work: would it catch nuances I might miss? Could it offer fresh perspectives on their arguments? After carefully anonymizing all assignment submissions, I began feeding essays into the system. This process required meticulous preparation, as each paper needed to be contextualized with relevant course materials and theoretical frameworks while maintaining strict student privacy.
The results were remarkable. The AI analysis was so precise and thorough that it occasionally provided insights beyond my expertise as an instructor. While specific examples must remain private for confidentiality reasons, the depth and quality of the feedback were truly impressive. The AI could identify subtle connections between different theoretical frameworks, point out implicit assumptions in arguments, and suggest areas for deeper investigation that I might have overlooked.
The Unexpected Drawbacks
Despite the system's unprecedented ability to provide comprehensive feedback, I ultimately returned to traditional assessment methods. My decision was based on three observations that I believe shed a fascinating light on education and assessment in the AI-enhanced digital age.
First, using AI for assessment proved surprisingly time-consuming. Providing the AI with the necessary context, including book excerpts and video transcripts, while simultaneously ensuring student privacy was often more labor-intensive than traditional grading. What initially seemed like a time-saving solution actually created additional work, and formatting, contextualizing, and reviewing the AI's feedback added even more time and effort.
Second, while technically excellent, the AI's feedback lacked a certain quality that I can only describe as the "human touch." The analysis, though precise, felt mechanical and somehow disconnected from the human experience of learning and understanding. There was something missing in the way it engaged with students' ideas — a warmth, an intuitive grasp of the learning process, and an ability to recognize and nurture genuine intellectual curiosity.
Most significantly, this experiment raised a fundamental question: If students use AI to write essays and instructors use AI to evaluate them, what is the underlying educational value of the exercise? This question has no simple answer, even for someone like myself who advocates for AI integration in education. It forces us to confront deep questions about the nature of learning, understanding, and assessment in an age where artificial intelligence can simulate many aspects of human intellectual work.
Finding Balance in the AI Era
After careful consideration, I've returned to a hybrid approach that better serves our educational goals. Students retain the freedom to use AI in their essay writing, provided they're transparent about its use. The evaluation process, however, has returned to human hands, with class participation during discussions remaining the primary basis for grading.
This approach acknowledges AI's presence in academic writing while preserving the irreplaceable value of human interaction in education. Essays serve not as products to be evaluated but as starting points for meaningful classroom discussions where students explain their thinking, defend their positions, and engage with their peers.
These discussions highlight aspects of understanding that AI text generation struggles to replicate. When students discuss their work, they demonstrate not just their grasp of concepts but their ability to apply them creatively, respond to challenges, and engage in genuine intellectual discourse. These interactions provide insights into their learning that no AI-generated essay, no matter how well-crafted, could reveal.
Looking Forward
The challenges of AI in education extend far beyond simple questions of academic integrity. As we continue to navigate this evolving landscape, we must carefully consider how to preserve the essential human elements of education while embracing technological advancement. Perhaps the most valuable lesson from this experiment is that the true worth of educational assignments lies not in their final form but in the discussions, insights, and human connections they generate.
This experience has once again taught me that while AI can be an incredibly powerful tool for both students and educators, it should augment, rather than replace, human judgment and interaction in education. The technology's limitations, particularly its inability to understand the nuances of human learning and foster genuine connection, help us identify what is truly essential in education: the human capacity for understanding, empathy, and genuine intellectual engagement.
In the end, the "problem" with AI grading might not be about technical capabilities at all, but about maintaining the delicate balance between technological efficiency and the human nature of education. As we move forward, finding this balance will be crucial for educators across all disciplines. Our challenge is not to resist technological change but to harness it thoughtfully.
Although AI's involvement in education will undoubtedly expand, my experience suggests a need for cautious and controlled integration. In our enthusiasm to embrace new technologies, we must preserve the essential human connections and experiences that make education truly meaningful.
I enjoyed reading this post and how you documented our class experience.
I used to agree that 'AI writing lacks a human touch', but there are two things preventing me from accepting that these days. The first is that the more I interact with AI, the more I see that a 'human touch' can actually be added to it, for example by giving it a personality or asking it to write from experience, as a human would.
The second thing is again the question: 'if you didn't know AI generated it, would you still rate it as mechanical?' Or is it just knowing that AI wrote it, or that AI graded it, that makes us think it lacks that human touch?