We are seeking an AI Evaluation & Model Quality Specialist to support the delivery and validation of AI-driven solutions in collaboration with a global technology partner. The role will focus on defining and executing robust evaluation frameworks to measure model accuracy, reliability, and production readiness across speech-to-text, summarisation, and intent-based AI systems.
Working closely with engineering, product, and partner teams, the successful candidate will design metrics, curate high-quality ground-truth datasets, and conduct rigorous model validation to ensure solutions meet agreed performance and governance standards before deployment.
Key Responsibilities
- Design and implement evaluation frameworks for AI models, including speech-to-text and generative AI outputs.
- Define and apply appropriate performance metrics (e.g., word error rate, semantic accuracy, relevance, completeness) and establish acceptance thresholds; see the word error rate sketch after this list.
- Create, validate, and maintain high-quality labelled ground-truth datasets to support transcription, summarisation, and intent evaluation.
- Conduct statistical analysis and systematic error diagnostics to identify root causes and compare model performance; see the paired bootstrap sketch after this list.
- Support model validation and governance activities, including regression testing and quality sign-off across SIT, UAT, and production readiness cycles.
- Provide empirical insights to guide prompt optimisation and model tuning, balancing accuracy, latency, and cost considerations.
- Contribute to post-deployment monitoring frameworks, including model performance tracking, drift detection, and continuous improvement processes.
- Translate technical evaluation outcomes into clear, evidence-based insights for business and stakeholder audiences.
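To make the metrics responsibility concrete, the following is a minimal sketch of word error rate (WER) computation for transcription evaluation. It assumes whitespace-tokenised reference and hypothesis transcripts; the function name and example strings are illustrative, not a prescribed toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit-distance table between the word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions needed to reach an empty hypothesis
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions needed from an empty reference
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substitution and one deletion against a six-word reference.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))  # ~0.333
```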
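Similarly, the statistical model-comparison responsibility might look like the paired bootstrap sketch below, which estimates how often one candidate model beats another when the test set is resampled. The per-utterance error values and resample count are placeholder assumptions for illustration.

```python
import random

def paired_bootstrap(errors_a, errors_b, n_resamples=10_000, seed=0):
    """Estimate how often model B beats model A under resampling."""
    rng = random.Random(seed)
    n = len(errors_a)
    wins = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample utterance indices
        if sum(errors_b[i] for i in idx) < sum(errors_a[i] for i in idx):
            wins += 1
    return wins / n_resamples

# Per-utterance WERs for two candidate models on the same test set (toy data).
model_a = [0.12, 0.30, 0.08, 0.25, 0.18, 0.22, 0.15, 0.27]
model_b = [0.10, 0.28, 0.09, 0.20, 0.16, 0.21, 0.12, 0.24]
print(f"P(B better than A) = {paired_bootstrap(model_a, model_b):.3f}")
```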
Key Skills & Experience
- Strong understanding of AI evaluation methodologies and performance metrics, particularly for speech-to-text and generative AI systems.
- Experience designing and managing labelled datasets for model testing and validation.
- Proficiency in statistical analysis, model benchmarking, and structured error analysis.
- Experience working within model validation, testing, or AI governance frameworks.
- Familiarity with prompt engineering and empirical model optimisation approaches.
- Understanding of monitoring strategies for deployed AI systems, including performance degradation and drift detection; see the drift-monitoring sketch after this list.
- Strong communication skills with the ability to present technical findings clearly to non-technical stakeholders.
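As an illustration of the drift-detection skill above, here is a minimal sketch using the population stability index (PSI) over model confidence scores. The 0.2 alert threshold is a common rule of thumb rather than a mandated standard, and the synthetic score distributions are assumptions for demonstration.

```python
import numpy as np

def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """PSI = sum((cur% - base%) * ln(cur% / base%)) over shared bins."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    # Current values outside the baseline range are dropped by np.histogram;
    # acceptable for a sketch, but worth handling explicitly in production.
    cur_counts, _ = np.histogram(current, bins=edges)
    eps = 1e-6  # small epsilon avoids log(0) and division by zero
    base_pct = base_counts / max(base_counts.sum(), 1) + eps
    cur_pct = cur_counts / max(cur_counts.sum(), 1) + eps
    return float(np.sum((cur_pct - base_pct) * np.log(cur_pct / base_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.80, 0.05, 5000)  # e.g. historical confidence scores
current = rng.normal(0.72, 0.07, 5000)   # shifted production distribution
score = psi(baseline, current)
print(f"PSI = {score:.3f} -> {'drift alert' if score > 0.2 else 'stable'}")
```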
The role will operate within a cross-functional delivery team and collaborate closely with a global technology partner to ensure AI solutions are rigorously evaluated, governed, and ready for enterprise deployment.