Google’s contractors reportedly use an internal platform for comparing Gemini’s outputs to those of other AI models.
The new Gemini 2.0 Flash is available as an experimental model to developers via the Gemini API.
Google reportedly relies on Anthropic’s Claude to improve the responses provided by its AI model Gemini.
Contractors hired by the tech giant are shown answers generated by Gemini and Claude in response to a user prompt. They then have 30 minutes to rate each model’s output based on truthfulness and verbosity.
A report has said Google is using Anthropic’s Claude AI model to evaluate the performance of its own Gemini AI, raising questions about whether this practice complies with Anthropic’s terms of service.
TechCrunch claims to have viewed internal correspondence, which suggests that contractors working on Gemini are comparing its responses to Claude’s. These contractors are tasked with rating the accuracy and quality of Gemini’s outputs based on criteria like truthfulness and verbosity.
“The contractors are given up to 30 minutes per prompt to determine whose answer is better, Gemini’s or Claude’s,” the report said.
Difference between Gemini and Claude’s answers
TechCrunch also says contractors noticed explicit references to Claude within the internal Google platform for AI model comparisons.
The report added that in some instances, Claude’s responses appeared to prioritise safety more than Gemini’s, refusing to answer unsafe prompts or generating more cautious responses. In one instance, Gemini’s response was flagged as a “huge safety violation” for including “nudity and bondage.”
“Claude’s safety settings are the strictest” among AI models, a contractor said.
What Anthropic’s terms say and Google’s clarification
Anthropic’s terms of service explicitly prohibit using Claude to “build a competing product or service” or “train competing AI models” without prior approval.
A Google DeepMind spokesperson confirmed comparing model outputs for evaluation purposes but denied using anthropogenic models to train Gemini. Google is also a major investor in Anthropic.
“Of course, in line with standard industry practice, in some cases, we compare model outputs as part of our evaluation process,” said Shira McNamara, a spokesperson for Google DeepMind.
“However, any suggestion that we have used anthropic models to train Gemini is inaccurate,” McNamara added.