June 14, 2015
June 17, 2015
Computing & Information Technology
26.270.1 - 26.270.8
Scalable Assessment: Present and Future

A perennial problem in teaching is securing enough resources to adequately assess student work. In recent years, tight budgets have constrained the dollars available to hire teaching assistants. Concurrent with this trend, the rise of MOOCs has raised assessment challenges to a new scale. In MOOCs, it is necessary to give feedback, and assign grades, to thousands of students who do not bring in any revenue. As MOOCs begin to credential students, accurate assessment will become even more important. These two developments have created an acute need for scalable assessment mechanisms that can handle large numbers of students without a proportionate increase in costs.

There are three main approaches to scalable assessment: autograding, peer review, and automated essay scoring. Autograding is provided, on a basic scale, by any LMS that can administer quizzes made up of multiple-choice or fill-in-the-blank questions. Autograding is also done by publishers' apps like WileyPLUS, third-party grading systems like WebAssign, and content-specific grading applications like Web-CAT for computer programming assignments. Autograding scales very well, because almost all of the effort is expended in creating the quiz, which can then be administered to any number of students automatically. Autograding systems can randomize the ordering of questions and answers, and randomize the parameters of numeric problems, making it harder to cheat. With a little more effort, multiple-choice distractors can be keyed to specific misconceptions. Advanced analytics can identify which material the students are having trouble with.

Peer review is supported by almost every LMS, as well as by standalone systems like Calibrated Peer Review, Peerceptiv, and Peer Scholar. It is also a feature of several MOOC platforms, notably Coursera and Canvas. Its greatest strength is the ability to give each student formative assessment on their work.
When used for summative assessment in MOOCs, its accuracy is questionable, especially for assignments that the students do not understand well. However, researchers are working on better techniques to identify accurate reviews, which would improve summative scores. Researchers are also developing techniques that allow peer reviewers to annotate submitted documents and respond to annotations. Other research applies natural-language processing techniques to estimate the quality of a review before it is submitted and to give the reviewer feedback on how to improve it.

Automated essay scoring (AES) uses software that predicts how an instructor would score a piece of prose, using metrics such as the correlation of its vocabulary with essays scored highly by humans, average word length, and the number of grammatical errors. An instructor trains an AES system by scoring a number of essays (e.g., 100) to teach the system which characteristics are important. AES is used by MOOC providers such as edX, as well as for high-stakes educational testing, and research is ongoing into applying similar techniques to the grading of non-prose submissions.

The presentation will cover the current capabilities and research directions for these systems, and will also show how they can be used to improve assessment in any context by providing greater formative feedback to students.
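The parameter randomization described in the autograding paragraph can be illustrated with a minimal sketch. The Ohm's-law question template, the per-student seed, and the 1% grading tolerance are illustrative assumptions, not features of any particular system:

```python
import random

def make_question(seed):
    """Generate a per-student numeric problem by randomizing its parameters.

    Seeding with a student-specific value makes each student's instance
    different but reproducible for regrading.
    """
    rng = random.Random(seed)
    voltage = rng.randint(5, 24)                    # randomized parameter
    resistance = rng.choice([100, 220, 470, 1000])  # randomized parameter
    prompt = (f"A {voltage} V source drives a {resistance} ohm resistor. "
              f"What is the current, in amperes?")
    answer = voltage / resistance
    return prompt, answer

def autograde(submitted, answer, tolerance=0.01):
    """Score a numeric response automatically, within a relative tolerance."""
    return abs(submitted - answer) <= tolerance * abs(answer)

# Each student ID yields a different instance of the same problem,
# so copying a neighbor's numeric answer generally fails the check.
prompt_a, ans_a = make_question(seed=1001)
prompt_b, ans_b = make_question(seed=1002)
```

Once the template is written, any number of students can be graded automatically, which is the source of autograding's favorable scaling.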
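The concern about summative accuracy in peer review can also be made concrete. A crude stand-in for the accuracy-identification research mentioned above is to aggregate with the median, which blunts the effect of a single inaccurate reviewer, and to flag reviews far from the consensus. The scores below are invented for illustration:

```python
import statistics

def summative_score(peer_scores):
    """Combine peer scores; the median resists one wildly-off reviewer."""
    return statistics.median(peer_scores)

def reviewer_error(peer_scores):
    """Distance of each review from the consensus, as a rough signal of
    which reviews are candidates for down-weighting."""
    m = statistics.median(peer_scores)
    return [abs(s - m) for s in peer_scores]

# Three reviewers agree; one misunderstands the rubric.
scores = [8, 9, 8, 2]
# The mean (6.75) is dragged down; the median stays at the consensus of 8.
```

Real research techniques are considerably more sophisticated (e.g., modeling reviewer reliability across many assignments), but the goal is the same: keep summative scores close to what an instructor would assign.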
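The AES training step can be sketched with the surface features the abstract names (average word length, vocabulary overlap with highly scored essays). This toy predictor scores a new essay as the mean instructor score of its k most feature-similar training essays; the essays and the nearest-neighbor approach are illustrative assumptions, and production AES systems use far richer features and models:

```python
import statistics

def features(text, reference_vocab):
    """Surface features of the kind AES systems use (illustrative set)."""
    words = text.lower().split()
    avg_word_len = sum(map(len, words)) / len(words)
    vocab_overlap = len(set(words) & reference_vocab) / len(set(words))
    return (avg_word_len, vocab_overlap)

def train(scored_essays):
    """'Training' here just featurizes instructor-scored essays, building a
    reference vocabulary from the highly scored ones."""
    vocab = set()
    for text, score in scored_essays:
        if score >= 4:
            vocab |= set(text.lower().split())
    model = [(features(text, vocab), score) for text, score in scored_essays]
    return model, vocab

def predict(text, model, vocab, k=3):
    """Predict a score as the mean of the k most feature-similar essays."""
    f = features(text, vocab)
    dist = lambda g: sum((a - b) ** 2 for a, b in zip(f, g))
    nearest = sorted(model, key=lambda item: dist(item[0]))[:k]
    return statistics.mean(score for _, score in nearest)

# Invented miniature training set, scored 1-5 by an "instructor".
data = [
    ("the experiment clearly demonstrates conservation of energy", 5),
    ("energy conservation is demonstrated with careful measurement", 4),
    ("it was good", 1),
    ("i did stuff", 1),
]
model, vocab = train(data)
```

The point of the sketch is the workflow the abstract describes: an instructor scores a sample of essays, the system extracts characteristics from them, and subsequent essays are scored automatically.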
ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2015 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015