How AimFive Grades AP Essays

Most AI grading tools claim accuracy without explaining how they reach a score. Here's the full methodology behind AimFive's rubric scoring: what model, what data, what works, and what doesn't.

The Goal

For every DBQ, LEQ, SAQ, or FRQ a student writes, AimFive returns:

A score per rubric point (earned / not earned, with reasoning).
Specific feedback on which sentences earned which points.
One actionable suggestion for the next attempt.
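As a rough illustration, the three outputs above could be carried in a structure like the one below. This is a minimal sketch, not AimFive's actual schema: the class names, field names, and the `total_earned` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RubricPointResult:
    """One rubric point: earned or not, with reasoning (hypothetical shape)."""
    point_name: str
    earned: bool
    reasoning: str
    # Sentences from the essay that the score is tied to.
    supporting_sentences: List[str] = field(default_factory=list)


@dataclass
class GradingResult:
    """Everything returned for one essay (hypothetical shape)."""
    essay_format: str  # e.g. "DBQ", "LEQ", "SAQ", "FRQ"
    points: List[RubricPointResult]
    suggestion: str  # the single actionable suggestion for the next attempt

    @property
    def total_earned(self) -> int:
        # Each rubric point is binary, so the score is a simple count.
        return sum(1 for p in self.points if p.earned)


example = GradingResult(
    essay_format="LEQ",
    points=[
        RubricPointResult("thesis", True, "States a defensible claim.",
                          ["Trade reshaped the region after 1450."]),
        RubricPointResult("complexity", False, "No counterargument addressed."),
    ],
    suggestion="Address one counterargument in the final body paragraph.",
)
```

Keeping the cited sentences next to each point is what lets the feedback say which sentences earned which points.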

How It Works

Rubric loading: Each AP essay format has a structured rubric extracted from College Board's published scoring guidelines. AimFive references the exact criteria, not a paraphrase.
Structural parse: The essay is broken into thesis, body paragraphs, evidence claims, and conclusion. Each part is mapped to relevant rubric criteria.
Per-criterion evaluation: An LLM evaluates whether each rubric criterion is met by the student's writing — citing specific text from the essay as evidence.
Score aggregation + feedback: Earned rubric points are summed; specific feedback is generated per missed point with what would have earned it.
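The four steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not AimFive's implementation: the rubric entries are placeholders rather than College Board wording, the structural parse is a naive paragraph split, and `stub_evaluator` stands in for the LLM call so that the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rubric: criterion id -> (description, essay part it applies to).
# Placeholder text only; a real rubric would come from the published guidelines.
EXAMPLE_RUBRIC: Dict[str, Tuple[str, str]] = {
    "thesis": ("Makes a defensible claim", "thesis"),
    "evidence": ("Supports the claim with specific evidence", "body"),
}

# (criterion description, essay text) -> (criterion met?, quoted evidence)
Evaluator = Callable[[str, str], Tuple[bool, str]]


def parse_structure(essay: str) -> Dict[str, str]:
    """Naive parse: first paragraph = thesis, last = conclusion, rest = body."""
    paragraphs = [p.strip() for p in essay.split("\n\n") if p.strip()]
    if not paragraphs:
        return {"thesis": "", "body": "", "conclusion": ""}
    return {
        "thesis": paragraphs[0],
        "body": "\n\n".join(paragraphs[1:-1]),
        "conclusion": paragraphs[-1] if len(paragraphs) > 1 else "",
    }


def stub_evaluator(description: str, text: str) -> Tuple[bool, str]:
    """Stand-in for the LLM: treats any part of 5+ words as meeting the criterion."""
    met = len(text.split()) >= 5
    quote = text.split(".")[0] if met else ""
    return met, quote


def grade(essay: str, rubric: Dict[str, Tuple[str, str]],
          evaluate: Evaluator) -> Dict[str, object]:
    """Evaluate each criterion against its essay part, then sum earned points."""
    parts = parse_structure(essay)
    results: Dict[str, Dict[str, object]] = {}
    for criterion_id, (description, part) in rubric.items():
        met, quote = evaluate(description, parts[part])
        results[criterion_id] = {"earned": met, "evidence": quote}
    missed: List[str] = [c for c, r in results.items() if not r["earned"]]
    total = len(results) - len(missed)
    return {"total": total, "max": len(rubric), "results": results, "missed": missed}


essay = (
    "Trade reshaped the region after 1450.\n\n"
    "Document 1 shows merchants paying new port taxes.\n\n"
    "Thus trade mattered."
)
report = grade(essay, EXAMPLE_RUBRIC, stub_evaluator)
```

The `missed` list is where per-point feedback would attach: for each missed criterion, the feedback step describes what would have earned it.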

Calibration

The grader has been calibrated against:

Officially released College Board scoring guides for each AP essay format.
Sample student essays from the College Board's published sample exams.
AP-experienced teacher annotations on a small holdout set used for evaluation.

Limits — Where It Fails

We're explicit about this because pretending AI grading is perfect makes it less trustworthy, not more.

Complexity / sophistication point: The hardest rubric point to grade reliably for any grader (including humans). AimFive's agreement with human teachers drops here. We err on the side of NOT awarding the point unless multiple criteria are clearly met.
Document analysis on DBQs: When students misattribute or misquote documents, our grader sometimes misses the error. We're improving this with each iteration.
Handwritten responses: AimFive grades typed responses only. Grading handwriting from photos is on the roadmap but not live.
Languages other than English: AP Spanish Language grading uses a different model and is in beta.
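The conservative rule for the complexity / sophistication point (no award unless multiple criteria are clearly met) can be pictured as a simple gate. The sketch below is hypothetical: the signal names, the 0.8 confidence cutoff, and the two-signal minimum are assumptions chosen for illustration, not AimFive's actual thresholds.

```python
from typing import Dict


def award_complexity_point(signals: Dict[str, float],
                           min_confidence: float = 0.8,
                           min_signals: int = 2) -> bool:
    """Award the point only if enough sub-criteria are clearly met.

    `signals` maps a sub-criterion name to a confidence in [0, 1].
    Both thresholds are illustrative assumptions.
    """
    clearly_met = [name for name, conf in signals.items() if conf >= min_confidence]
    return len(clearly_met) >= min_signals


# One strong signal is not enough: the default is to withhold the point.
borderline = award_complexity_point({"counterargument": 0.95, "cross_period_link": 0.5})
clear = award_complexity_point({"counterargument": 0.9, "cross_period_link": 0.85})
```

A gate like this trades some missed awards for fewer unearned ones, which matches the stated preference to err on the side of not awarding the point.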

The Independent Agreement Study (Planned)

We plan to run a formal agreement study comparing AimFive's grader to AP-experienced teachers across multiple essay formats. The study has not started yet; we'll begin once we have a stable pool of student essays to draw from. Results will be published in full, whether agreement is high, low, or somewhere in between. To get updates, email research@aimfive.com.
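The study design has not been published, so the statistic it will report is unknown. One common way to measure grader-versus-teacher agreement on earned / not earned decisions is raw percent agreement alongside Cohen's kappa, which corrects for agreement expected by chance. The sketch below uses made-up labels purely to show the arithmetic; the function names are illustrative.

```python
from typing import Sequence


def percent_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of rubric points where both raters gave the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters."""
    p_o = percent_agreement(a, b)
    n = len(a)
    labels = set(a) | set(b)
    # Chance agreement: probability both raters pick the same label independently.
    p_e = sum((list(a).count(lab) / n) * (list(b).count(lab) / n) for lab in labels)
    if p_e == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and returning 1.0 here is only a convention.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Made-up labels for eight rubric points (1 = earned, 0 = not earned).
grader = [1, 1, 0, 0, 1, 0, 1, 1]
teacher = [1, 0, 0, 0, 1, 0, 1, 1]
agreement = percent_agreement(grader, teacher)  # 7 of 8 labels match
kappa = cohens_kappa(grader, teacher)
```

In this toy data the raters agree on 7 of 8 points (0.875), chance agreement is 0.5, and kappa comes out to 0.75. Reporting both numbers matters because raw agreement can look high when most points are earned (or not earned) by nearly everyone.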

How to Report Bad Grading

If AimFive grades an essay incorrectly, tell us. Email grading@aimfive.com with the essay text and the score you think it should have earned. Every report goes into our training/eval set. We pay $50 for each validated case where our grader got the score wrong.


AP and Advanced Placement are trademarks of College Board. AimFive is not affiliated with or endorsed by College Board.