Careers · Evaluator role · Parker Dewey & university partnerships

Real product work, with a clear boundary.

Researchers join as evaluators inside Evaluation Labs, judging whether Lucia's responses actually help the person in front of her, whether that person runs the property or is staying in it. The role is structured, mentored, and scoped: you don't need to be an AI expert to begin, and you won't touch infrastructure or admin tools. You provide honest human judgment, and that judgment is the product.

The role · Evaluator (v1)

What an evaluator does

Evaluate Lucia's behavior inside an assigned set of work: read the prompt, read the response, use the review controls honestly, and leave a short note when it helps.

You do

  • Run assigned custom evaluations.
  • Read each prompt and Lucia's response before scoring.
  • Score the visible dimensions honestly.
  • Write a short, specific note when context matters.
  • Flag uncertainty or risk for senior review.
  • Finalize your own run when every item is reviewed.

You don't

  • Need to know the full system to begin.
  • Inspect infrastructure or use owner/admin tools.
  • Decide whether Lucia is ready for live use.
  • Review runs that aren't yours.
  • Change the assignment mid-run.
  • Pass a response just because it sounds polished.

What you're judging

Review the response, not the vibe.

Polished language isn't enough. For each item you read the prompt, read Lucia's response, glance at any suggested selections — then decide what you believe.

  • Did Lucia understand the user?
  • Did she avoid overclaiming?
  • Did she give a useful next move?
  • Was the answer clear without extra scanning?
  • Was the tone calm and appropriate?
  • Did the answer preserve trust?

What researchers gain

This is a genuine on-ramp into applied AI — the part of the field that's hardest to learn from a textbook.

  • Applied AI evaluation experience — how real models are tested, scored, and improved.
  • Structured critical judgment — reasoning about truthfulness, intent, and trust, then defending it in a note.
  • Mentorship & escalation — a layered review model where senior reviewers adjudicate and turn findings into reusable learning.
  • Real product impact — your reviews directly shape how Lucia behaves in the next version.

A guided first assignment

New evaluators start with a short onboarding path and a smoke test — one simple prompt to confirm the full loop runs end to end before any scored work begins.

  • Read the prompt.
  • Read Lucia's response.
  • Score the dimensions honestly.
  • Add a short note if it helps.
  • Flag uncertainty or risk.
  • Save & continue, then finalize the run.

Interested researchers & reviewers

The role is posted first on Parker Dewey and is also placed through our university partnerships and Handshake. For questions about the role, the partnerships, or employer verification, reach us directly.

Placements: Parker Dewey · University partnerships · Handshake
Evaluator onboarding: welcome.evaluationlabs.ai