AI Needs Us!

The pace of change in artificial intelligence (AI) within procurement is astonishing. Only a short time ago, the conversation focused on which tasks AI tools could automate to make our lives easier – checking whether a tenderer’s Project Specific Questionnaire (PSQ) was complete, for example, or assisting with moderation by comparing a human Subject Matter Expert’s (SME) evaluation of a method statement against the scoring guidance.

Now, the conversation has shifted. Instead of the AI Agent checking the human’s work, it is often the human SME checking the AI Agent’s work.

Why AI Needs Us – and We Need AI

The simple truth is this: we need AI, and AI needs us.

I recently spoke with an organisation that uses multiple tools in their procurement process – a tender portal, SharePoint, and bespoke evaluation templates. They asked why they needed AI to support tender clarifications when the portal already handled the communication.

The answer is that the portal only transmits queries and responses; humans are still responsible for spotting where clarification is required. An AI Agent can support them by reviewing tender responses, highlighting gaps, drafting clarification questions, and even summarising the answers. These are cognitive tasks usually performed by people.

That’s why we should stop thinking of AI as a “robot” doing repetitive tasks and start thinking of it as a colleague: one that can read, analyse, and draft at speed, but still needs direction and oversight.

Training the AI “Colleague”

In the same way evaluation teams receive training before an exercise, an AI Agent must also be “trained” through careful prompting.

AI can process information millions of times faster than a human, but it lacks judgement and context. The SME provides prompts and framing, so the AI can analyse material and present reasoned outputs. The better the input, the better the outcome.

Let’s look at an example to illustrate the point.

Example – Planned Maintenance Works Framework

The tenderer is responding to a question in a tender for a place on a framework to undertake planned maintenance works for a housing care provider. The specific question is as follows:

“This project involves undertaking works to accommodate residents with acute needs and compliance with Approved Document M of the Building Regulations is therefore required.  Please set out your approach to ensuring that consultation takes place with all relevant stakeholders to ensure that the customer need is met.”

The tenderer’s written response to the question was put into ChatGPT, along with the following prompt:

<Evaluate the response against the question and provide a score out of 5>

ChatGPT provided the following response:

The response provides a comprehensive and structured approach to stakeholder consultation in relation to meeting the requirements of Approved Document M and ensuring that residents with acute needs are properly accommodated.

  • General Strategy: The answer begins with a clear commitment to compliance and inclusivity, supported by lessons learned from previous similar projects. It highlights structured consultation, inclusive communication strategies, and a review process for continuous improvement.

  • Stakeholder Engagement: The response identifies and details engagement methods for all relevant stakeholders, including:

    • The Client: Frequent meetings, training for staff, and strong senior management involvement.

    • Local Service Managers: Profiling resident needs, regular feedback loops, and minimising disruption.

    • OTs, Healthcare Professionals & Psychologists: Early and ongoing engagement to co-produce designs that address physical and psychological needs.

    • Residents: Direct involvement through workshops, pilot works, reviews, and post-completion testing.

    • Design Team: Workshops, compliance reviews, risk registers, and collaborative inspections.

    • Local Authorities: Regular engagement with planning and building control officers, joint reviews, and clear communication channels for feedback.

  • Strengths:

    • Very detailed and stakeholder-specific.

    • Demonstrates awareness of both compliance and practical resident needs.

    • Emphasises early engagement, ongoing dialogue, and feedback loops.

    • Shows a strong focus on co-production, inclusivity, and lessons learned.

  • Weaknesses:

    • Could be more concise — it risks being overly detailed and repetitive, which may dilute key points.

    • Some areas are generic (e.g., “structured strategy” without outlining measurable KPIs or tools for monitoring).

    • A stronger link between consultation activities and how they directly ensure Approved Document M compliance would improve clarity.

 Score: 4 out of 5.

It is a strong and thorough response that clearly addresses the question, but could be improved by being slightly more concise, better linking consultation methods to compliance outcomes, and including clearer measures of success.


The process was then re-run, using the same tender response but with the following prompts:

  1. Here is a copy of the project brief for the specific project in question;

  2. Here is a copy of the assessment methodology. Please use this to ensure that your assessment is in line with the methodology and that the scoring criteria are strictly followed;

  3. The evaluation must consider that the response is forward-looking and focuses on what the tenderer can do, not what they have done in the past;

  4. The assessment should be objective;

  5. The assessment should identify any added value items that go beyond the basic requirement;

  6. The response should provide evidence to back up the tenderer’s proposals;

  7. The assessment summary should be no longer than 150 words.

ChatGPT provided the following response:

The response is forward-looking and sets out clear, structured methods of engaging stakeholders throughout the major refurbishment works. It addresses the scenario’s specific challenges, including compliance with Part M for accessible units, resident groups with acute needs, conservation area restrictions, and coordination with local authorities. Evidence of added value is demonstrated through proposed resident workshops, pilot works, co-production with healthcare professionals, and structured lessons-learned reviews, all of which go beyond basic consultation. The tenderer provides specific mechanisms for engagement (e.g., weekly site meetings, daily liaison with service managers, and design risk registers) that credibly evidence their proposals. The approach directly links stakeholder consultation to ensuring compliance with Approved Document M and minimising disruption during works across varied sites. Overall, the submission gives confidence that requirements will be fully met and exceeded, with transparent and inclusive methods tailored to the scenario. 

Overall Score: 5/5 – Fully meets and exceeds the criteria.

This is a real-life example. The human SME scored it 4/5!

Incidentally, the response time for both queries was a matter of seconds.
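
For readers who want to see how this kind of structured prompting could be made repeatable, rather than pasting material into the ChatGPT window each time, the sketch below assembles the same framing points and sends them to an LLM API. It is a minimal illustration only, assuming the OpenAI Python client and hypothetical file names (brief.txt, methodology.txt, response.txt); it is not the exact process used in the example above.

```python
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable

# Hypothetical input files: the project brief, the assessment
# methodology, and the tenderer's written response.
brief = Path("brief.txt").read_text()
methodology = Path("methodology.txt").read_text()
tender_response = Path("response.txt").read_text()

# Framing instructions, mirroring points 1-7 above.
instructions = f"""
Here is a copy of the project brief:
{brief}

Here is a copy of the assessment methodology. Use it to ensure that the
assessment is in line with the methodology and that the scoring criteria
are strictly followed:
{methodology}

The evaluation must consider that the response is forward-looking and
focuses on what the tenderer can do, not what they have done in the past.
The assessment should be objective, should identify any added-value items
that go beyond the basic requirement, and should note the evidence given
to back up the tenderer's proposals. The assessment summary should be no
longer than 150 words.
"""

completion = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": instructions},
        {
            "role": "user",
            "content": "Evaluate the response against the question and "
                       "provide a score out of 5:\n" + tender_response,
        },
    ],
)
print(completion.choices[0].message.content)
```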

What do we learn from this?

Well, firstly, the AI Agent is incredibly efficient. It can read and analyse a response in a fraction of the time a human SME needs, and it can write a succinct summary that is grammatically and orthographically correct. But it’s not perfect. I feel it doesn’t align completely with the scoring criteria, and I’d like to see it be more explicit in this regard. For example, to achieve a score of 5, the criteria state:

“The response very clearly addresses all the required elements of the question. It provides compelling evidence and assurance that the specified requirements will be met and exceeded.  There are no concerns, and the proposal provides full confidence that delivery of the outcomes will be to an excellent standard.”   

The summary provided by ChatGPT is good, but the commentary could align better with the scoring criteria. I also have concerns about how it scores a response. Having played around with a few of these recently, I have found the AI Agent’s scoring quite inconsistent, although better-quality prompting does produce scores more closely aligned with the human SME’s.
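
One way to surface that inconsistency, sketched below under the same assumptions as the earlier example (OpenAI Python client, hypothetical file names), is to re-run the identical evaluation several times and flag any spread in the scores for the human SME to moderate.

```python
import re
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# Hypothetical inputs: the framing instructions and tender response
# assembled exactly as in the earlier sketch.
instructions = Path("assembled_instructions.txt").read_text()
tender_response = Path("response.txt").read_text()

def extract_score(text: str) -> int | None:
    """Pull a score written as 'n/5' or 'n out of 5' from the summary."""
    match = re.search(r"(\d)\s*(?:/|out of)\s*5", text)
    return int(match.group(1)) if match else None

# Re-run the identical evaluation several times and compare the scores.
scores = []
for _ in range(5):
    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": "Evaluate the response against the "
                                        "question and provide a score out "
                                        "of 5:\n" + tender_response},
        ],
    )
    scores.append(extract_score(completion.choices[0].message.content))

# Any spread across runs is a cue for the human SME to step in.
if len(set(scores)) > 1:
    print(f"Inconsistent scoring across runs: {scores} - refer to the SME")
else:
    print(f"Consistent score across runs: {scores[0]}/5")
```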

Conclusion

It is clear to me that there is some way to go before we can rely on an AI Agent being the SME in undertaking a quality evaluation. In the meantime, if we use it for the initial review and the human SME checks its work, it could be a useful time-saving tool, help to improve consistency, and potentially avoid bias. That clever little AI Agent will continue to learn at pace, and the more information it is provided, the greater its cognitive skills will become. With care and effective prompting, the human SME can work alongside the AI Agent in undertaking quality evaluations. And let’s face it, the Procurement Act 2023 and guidance from the Cabinet Office do not prohibit its use. The only requirement I see is that the Contracting Authority should be transparent (Regulation 23) about the award criteria and, ideally, should set out how it proposes to use AI within its Assessment Methodology. This could even include a summary of the prompts it proposes to use in undertaking the evaluation.

AI cannot replace judgement, but it can free up SMEs to focus on higher-value evaluation and assurance tasks.  In summary, the AI Agent needs us! 

Find out more

If you want to learn more about how you can integrate AI into your procurement practices, then please get in touch for an informal chat.  We work with AI developers who can realise your dream of “the art of the possible”.  We also work with leading procurement lawyers who can provide assurance that the AI solution adopted is compliant with the Act.

Neil Thody is a Fellow of the RICS, leading procurement specialist, CEDR/RICS Mediator, RICS Adjudicator and Independent Adviser, working with clients across multiple sectors.   
