Think Forward: AI and educational assessments
Plus useful research, policy resources, some fun new tools, and district innovations.
Welcome back. Lots of new AI/edu material is coming in lately, including new policy guidance, research, and examples from the field. But I’ve recently been digging into AI and the future of assessments, so we’ll begin with a look at that topic.
Deep Dive: Harnessing AI to Revolutionize Educational Assessments
Anyone tracking education policy over the last decade knows that the assessment movement has taken some political hits. Both federal and state governments have gradually relaxed assessment standards and requirements, and advocates now find themselves in a defensive stance, simply trying to hold the line on annual testing.
Some of the government's reconsideration of assessments is justified. While standardized tests are crucial, their limitations are evident. Results often arrive too late to inform classroom instruction, and the tests are limited in the kinds of skills they can assess. Standardized tests rely heavily on multiple-choice questions, which fail to capture essential "durable skills" like critical thinking and creativity—skills that are increasingly valuable in the age of AI-driven employment. Additionally, these tests can be inaccessible to English learners, students with disabilities, and other marginalized groups.
Not all motives for reducing commitment to state standards are benign, however. Some resist the accountability measures that come with assessments. Without robust assessments, we lose visibility into achievement and opportunity gaps and miss the chance to recognize effective educational strategies. Absent reliable data, educators are left to rely on guesswork and unverified resources like Pinterest.
Can AI address the shortcomings of current assessments while still ensuring data is used to enhance schools and close learning gaps? New developments suggest reason for hope.
The Programme for International Student Assessment (PISA), a leading global exam, has begun exploring AI's role in grading open-ended questions, aiming to assess higher-order thinking skills such as creativity and critical analysis. Will the College Board, Measures of Academic Progress (MAP), National Assessment of Educational Progress (NAEP), and others follow suit? We shall see.
Of course, there are challenges. Researchers are still evaluating AI's effectiveness in tasks like essay grading. Early findings suggest that even at this nascent stage, AI can perform comparably to human graders, particularly when paired with human oversight. While AI has the potential to reduce some human biases, it may also perpetuate or exacerbate others. Still, the opportunity for improvement warrants continued investment in AI development and testing.
Meanwhile, some teachers are already leveraging AI to tailor tests to student interests and to customize assessments for students with learning disabilities and those needing language translation. This reduces bias and removes some of the tedium associated with traditional assessments.
Additionally, new ed tech tools are helping people make sense of existing assessment data more easily. Zelma AI allows users to analyze publicly available data in highly accessible ways.
My content director used Zelma to generate information about math scores in Washington state (pardon Taylor Swift’s new album playing in the background). Try it for yourself here.
This seems like a critical moment to work with AI developers to design the assessments of educators’ and policymakers' dreams. The question is: who will lead this charge?
The Data Problem
Personalizing education via AI holds so much promise, but you cannot personalize without data. And reliable student-level data is very hard to come by in education. Mark Schneider, former director of the federal Institute of Education Sciences (IES), penned a terrific piece that detailed the challenges and potential solutions in making assessment and other data more accessible for these purposes. In Schneider's words: "IES must lead an effort to better balance the concern for protecting student privacy against the reality that the nation also needs breakthroughs that only access to bias-free and representative data of the sort generated by [assessments like] NAEP can provide."
The Good, the Bad, and the Ugly: What AI Can Do
Here’s a really fascinating post about how well AI (Claude, specifically) analyzes Supreme Court cases. Pretty amazing. Here’s a teaser from the post:
In my view, in accuracy and creativity, Claude’s answers are at or above the level of a human Supreme Court clerk. Not only is Claude able to make sensible recommendations and draft judicial opinions, but Claude effortlessly does things like generate novel legal standards and spot methodological errors in expert testimony. Claude does occasionally make mistakes, but humans do too.
On the other hand, this chilling story from The Hill about “deepfaking” visuals and audio gives a glimpse at how AI is taking school bullying to a new, very disturbing level. Policymakers take note of this line: “Advocates are sounding the alarm on the potential damage — and on gaps in both the law and school policies.”
New Research
Teacher attitudes: We previously shared results from a RAND/CRPE report on teacher adoption of AI. Now Pew Research Center data is showing similar results: the vast majority of teachers are either unsure or quite skeptical about whether AI carries more benefit than risk, while a small percentage is quite optimistic. This pretty well comports with the RAND data showing that only a small percentage of teachers are actively using AI in the classroom.
Community input: Here is an interesting study showing how an AI-powered tool is incorporating public input into LLMs (large language models) to better reflect values that may differ from those of developers. There’s interesting potential for this tool to be used in education settings as a way of aligning around pedagogy, local values, etc.
Policy Resources
The American Federation of Teachers (AFT) has a new guidance document on AI out this week. In it, you will find some very detailed and helpful suggestions for how schools can navigate tricky AI issues like plagiarism. The document also includes some examples of how educators are using AI in positive ways to enhance teaching and learning. There is a big call for professional development investments and more planning time. Overall, this is a pretty positive document, and the AFT’s Rob Weil is probably largely to thank for that. He is thoughtful about the opportunities AI presents in education, but also clear-eyed on risk.
Not surprisingly, the AFT calls for teachers to have a say in which technologies are used in schools and how. There are also some interesting statements about how students should demonstrate mastery. Both issues are probably important ones to watch for union-related concerns over AI use. Here is an example statement on assessment, which struck me as sort of odd, since technology that lets students quickly find answers has existed for years:
With the existence of technology that gives students the ability to quickly search for and find answers to questions, educators must use different methods to demonstrate mastery, including project-based work, essays, speaking/interviews, discussion and art.
The ILO Group released these terrific resources to help state education agencies (SEAs) implement AI.
And here is a really good resource for policymakers (or realistically: their staffers) who are going to be regulating AI. It’s got everything regulators need to know (this week, anyway)!
Cool Edutools
This is an interesting post about FoondaMate, an AI-powered study aid built with Meta's Llama that provides personalized assistance to millions of middle and high school students in emerging markets via WhatsApp and Messenger.
The founders trained the bot to rephrase content for different English comprehension levels, include local language and slang, and add appropriate emojis, informed by data on how teenagers in emerging markets chat, learn, and engage. The post suggests that this tool might allow kids who don’t get enough personal attention in large classes to better prepare for classes and for college. I like to look to emerging markets in education to consider what innovations might help U.S. kids.
Cool Stuff Happening in Districts
In our last edition, I did an all-call for school districts that are using AI to analyze data or to take on other critical central office functions. I got a response! It turns out that folks in the Peninsula School District, just across Puget Sound from my home base, have been investigating how to put AI to use for several years. They’ve shared their AI policy with other school districts across the U.S. and are using AI with multilingual learners and students with learning differences. Great to see a school district being so thoughtful about AI. We’re building a database of “first adopter” school systems, so do drop us a line if you know about others!
Time Savers
I am in the midst of finalizing our annual State of the American Student report. Apparently, the state of the student is complicated, because the report was getting to be more than 40 pages. I asked GPT-4o to suggest a more streamlined version appropriate for a policymaker audience. It did a beautiful (amazing, really) edit and cut the content by at least half, reminding me that policy people need information distilled as concisely as possible. I knew this, of course, but using ChatGPT saved me (and my editor) days of work. We'd love to know: have you used ChatGPT or other chatbots to save yourself time at work?
That’s all for now. We’ll be back soon on a (more or less) biweekly basis.
-R