🧪 TDD Challenge · beginner · ⏱️ 15–30m · ⭐ 100 XP
M-063 Build Your First LLM Eval Scorer
Description
Nebula Corp's AI team has no way to measure whether their LLM outputs are any good. Build a simple evaluation scorer that checks LLM responses against expected answers using three metrics: exact match (after trimming and lowercasing), keyword containment, and a basic similarity score based on word overlap (the fraction of expected words that also appear in the response). Checks like these are the foundation of most eval pipelines.
Test Cases (3)
Exact match works
Should match after trimming and lowercasing
Input: exactMatch(' Hello World ', 'hello world')
Expected: true
Keyword containment check
Should find all keywords in the text
Input: containsAllKeywords('The quick brown fox jumps over the lazy dog', ['quick', 'fox', 'dog'])
Expected: true
Word overlap scoring
4 of the 6 expected words ('the', 'cat', 'on', 'mat') appear in the response: 4/6 ≈ 0.67
Input: wordOverlapScore('the cat sat on the mat', 'the cat is on a mat')
Expected: STARTS_WITH:0.6
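The three test cases above could be satisfied by something like the following TypeScript sketch. It is one possible approach, not the challenge's reference solution. The helper names `normalize` and `tokenize` are hypothetical, and several behaviors are assumptions inferred from the test cases: the response is the first argument and the expected answer the second, keyword matching is a case-insensitive substring check, words are split on whitespace without stripping punctuation, and an empty expected answer scores 0.

```typescript
// Hypothetical helper: trim surrounding whitespace and lowercase the text.
function normalize(text: string): string {
  return text.trim().toLowerCase();
}

// Hypothetical helper: split normalized text into words on runs of whitespace.
// Punctuation is NOT stripped (an assumption; the tests contain none).
function tokenize(text: string): string[] {
  return normalize(text)
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// Exact match after trimming and lowercasing both sides.
function exactMatch(output: string, expected: string): boolean {
  return normalize(output) === normalize(expected);
}

// True when every keyword occurs somewhere in the text.
// Assumption: case-insensitive substring matching, so 'fox' also matches 'foxes'.
function containsAllKeywords(text: string, keywords: string[]): boolean {
  const haystack = text.toLowerCase();
  return keywords.every((kw) => haystack.indexOf(kw.toLowerCase()) !== -1);
}

// Fraction of the expected words that also occur in the output.
// Assumption: each expected word occurrence is counted, and an empty
// expected answer scores 0 rather than dividing by zero.
function wordOverlapScore(output: string, expected: string): number {
  const expectedWords = tokenize(expected);
  if (expectedWords.length === 0) {
    return 0;
  }
  const outputWords = tokenize(output);
  const found = expectedWords.filter(
    (word) => outputWords.indexOf(word) !== -1
  ).length;
  return found / expectedWords.length;
}
```

With these definitions, `wordOverlapScore('the cat sat on the mat', 'the cat is on a mat')` evaluates to 4/6, which prints as `0.6666666666666666` and so satisfies the `STARTS_WITH:0.6` check. Dividing by the number of expected words (rather than by the response length) measures how much of the expected answer the response covers, which is what the "4 of 6 expected words" note describes.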