This is what you get.
Not a CV. An intel report.
Below are 3 sample Promptintern profiles. For every answer, we show you the prompt, what we were testing for, and what the answer revealed. Read the thesis, not just the response.
How to read this report
- The prompt: the exact question or task we put in front of the candidate.
- What we test: the grader's thesis, meaning what a good answer reveals about the candidate.
- Signal: our read on what their actual answer tells you. The 'so what'.
Demo profile. Fictional candidate, real format. Live profiles use the same structure with real submissions.
Aarav Mehta
B.Tech CS, IIT Bombay (3rd yr) · AI / Prompt Engineering
Builds tooling around LLMs for fun. Spends weekends benchmarking prompt strategies on open eval sets. Comfortable shipping ugly v1s and iterating.
Verified Intel Score
Composite of 4 independent assessment layers.
+5 reviewer bonus
Layer 1
Analytical thinking
Real-world task. Open prompt. No retakes.
We gave you a real production prompt used to classify customer support tickets. It's underperforming (~64% accuracy on our holdout set). Diagnose what's wrong, propose a fix, and back it up with evidence. 90 min, open notebook, you can run code.
- Can they actually run an experiment, not just theorize?
- Do they identify ROOT cause, or surface-level symptoms?
- Do they quantify trade-offs (cost, latency, accuracy) instead of hand-waving?
Their response
The given prompt fails on long-context inputs because it asks the model to summarize before classifying — the summary collapses the signal the classifier needs. My fix: invert the order (classify first using a 3-shot example block, then summarize only the matched class). On the 50 samples I ran locally, accuracy went from 64% → 89%. Trade-off: ~1.4x token cost, which is fine for this use case. Full notebook attached with eval harness.
- Diagnosed root cause (order-of-operations), not symptom
- Ran an actual eval (n=50) instead of vibes
- Surfaced cost trade-off unprompted
- Sample size small for a production claim; flagged it himself
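
To make the sample-size caveat concrete, here is a minimal sketch of the kind of check an eval harness can run. It is not the candidate's notebook, which is not reproduced in this report. The function names are illustrative, and the counts are hypothetical: 44 correct out of 50 (88%) is used as a stand-in close to the reported 89%.

```python
import math


def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one labelled example")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical counts, not taken from the candidate's run.
low, high = wilson_interval(44, 50)
print(f"accuracy={44 / 50:.2f}, 95% interval=({low:.2f}, {high:.2f})")
```

With these assumed counts the interval runs from roughly 76% to 94%. That still sits above the 64% baseline, but it is too wide to quote a single production number, which is the caveat the candidate raised.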
Layer 2
Communication & soft skills
Recorded video round. Unedited. One take.
Recorded answers · ~2 min
Question 1
Tell us about a project you finished that nobody asked you to do.
What we test
Internal drive vs external compliance. Self-starters ship without permission.
Their answer (transcript)
I'd rather work on something boring that ships than something exciting that lives on a Notion page. The internship I learned most from was unpaid work on an OSS RAG repo — 4 PRs merged, one became the default retriever.
Signal
Concrete output (4 merged PRs), prefers shipped-boring over hypothetical-exciting. No ego attachment to credit.
Question 2
Tell us about a time you were wrong about something technical.
What we test
Calibration & ego — can they update their model when reality disagrees?
Their answer (transcript)
Last time I was wrong about something technical: I assumed embedding similarity was enough for our doc search. It wasn't — users phrase queries totally differently than docs. Switched to hybrid + rerank in a weekend.
Signal
Owned the mistake without defensiveness. Time-to-fix was a weekend — fast feedback loop.
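
For readers unfamiliar with the "hybrid" part of that answer: the transcript does not say how the candidate combined keyword and embedding results. One common option is reciprocal rank fusion, sketched below under that assumption. The document ids and the function name are hypothetical, and the rerank step is left out.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc id so the output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists for a single query.
keyword_hits = ["doc_refunds", "doc_billing", "doc_login"]
embedding_hits = ["doc_billing", "doc_pricing", "doc_refunds"]

fused = reciprocal_rank_fusion([keyword_hits, embedding_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which helps when users phrase queries differently from the docs and neither retriever is reliable alone.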
Question 3
What's the worst thing about working with you?
What we test
Self-awareness. Most candidates either deflect or fake-flaw ('I work too hard').
Their answer (transcript)
Honestly, the worst thing about me as a teammate is I move too fast on the first draft. I've learned to slow down and write a one-pager before touching code, especially when others depend on the interface.
Signal
Real flaw + already has a mitigation. Doesn't perform humility, just describes the loop.
Layer 4
Drive & motivation
Why they want this. In their own words.
Why this, why now? In 3-5 sentences — no fluff, no 'I'm passionate about innovation.'
Their answer
I don't want a CV-internship where I make slides. I want to ship something a real team actually uses. Promptintern's task round was the first application that actually tested if I could think, not if I could write a cover letter.
Signal
Specific rejection of low-leverage work + signals he's already self-selected for output-driven teams. Not performative.
Top 2% of Q2 applicants. His task response identified two failure modes in the prompt we didn't expect candidates to catch. Hire fast — he won't be on the market long.
— Promptintern review team
Why this beats their CV
You don't need a second screening call. Everything you'd ask is already above.
| Dimension | CV / Resume | Promptintern report |
|---|---|---|
| Source of truth | Self-written | Independently graded |
| Analytical proof | Bullet points | Live task response |
| Communication | Claimed | Recorded video round |
| Work readiness | Unknown | Setup audited (Layer 3) |
| Motivation | Cover-letter fluff | On-record statement |
| Trust level | Take their word | Verified by Promptintern |
Ready to hire from this?
Browse verified interns, request a warm intro, and skip the screening calls. We've already done that part.