Job Details

AI Evaluation and Red Teaming Consultant

BH-32748
  • £600 to £650 per day (benefits: N/A)
  • Greater London, South East
  • Contract
A leading financial services organisation is seeking an AI Evals & Red Teaming Expert to design and operate a robust evaluation and adversarial testing capability for production-grade AI systems.

The successful candidate will implement automated adversarial testing within CI/CD pipelines using tools such as AgentDojo, Garak, and PyRIT, with formal release gating to ensure safe and compliant deployment of AI systems.

They will establish and own a comprehensive AI measurement framework, including success rate tracking, uncertainty quantification, and model drift detection. This includes building repeatable evaluation standards that can be applied across all agentic systems within the enterprise.

A key aspect of the role involves close collaboration with security and governance stakeholders to map threats to test cases and generate EU AI Act (Article 15) compliance evidence. The role also includes ownership of the organisation’s AI Bill of Materials (AI-BOM), ensuring supply chain integrity, monitoring model drift, and maintaining signed artefacts across the AI lifecycle.

The expert will design and implement testing strategies covering bias detection, hallucination analysis, and memorisation risk assessment, embedding these into a centralised evaluation platform used across all AI systems in production.
Common requirements across all AI engineering roles in the organisation include:
  • Strong experience within UK financial services, with working knowledge of DORA, FCA Operational Resilience, and the EU AI Act
  • Hands-on experience with AWS Bedrock (including Agents, Knowledge Bases, Guardrails, and model lifecycle management)
  • Solid understanding of AI/ML fundamentals, including foundation models, RAG architectures, non-deterministic agent behaviour, and tool-using systems
  • Strong knowledge of secure AI practices, including OWASP LLM Top 10, agentic AI threat modelling, and familiarity with the NIST AI Risk Management Framework (AI RMF)
This is a UK-based contract role inside IR35, working via an umbrella company, so you must be resident in the UK to be considered.

Joe Matthews, Associate Director
