Dria

Swan
NFTHardhat
21,000 USDC
View results
Submission Details
Severity: low
Valid

Inaccurate best response selection in `LLMOracleCoordinator::getBestResponse`.

Summary

The LLMOracleCoordinator::getBestResponse function may not return the best performing result of the given task. This behaviour can lead to less than optimal purchases, especially in scenarios where response validation has not taken place.

Description

The LLMOracleCoordinator::getBestResponse function aims to return the best-performing response based on validation scores for a given task. However, there is an issue when the task responses have no validation requirements, as it is when task.parameters.numValidations == 0 in LLMOracleCoordinator::respond. In such cases, all responses will be assigned a default score of 0 (see [https://github.com/Cyfrin/2024-10-swan-dria/blob/main/contracts/llm/LLMOracleCoordinator.sol#L226]) and will not require any validation, setting the task status to Completed. This may lead to the following consequences:

** The LLMOracleCoordinator::getBestResponse function selects the first response from taskResponses array, which may not represent the best or most accurate result. In fact it could be an empty, irrelevant or false response. Since all scores are 0, this does not indicate performance quality.

function getBestResponse(uint256 taskId) external view returns (TaskResponse memory) {
TaskResponse[] storage taskResponses = responses[taskId];
if (requests[taskId].status != LLMOracleTask.TaskStatus.Completed) {
revert InvalidTaskStatus(taskId, requests[taskId].status, LLMOracleTask.TaskStatus.Completed);
}
@> TaskResponse storage result = taskResponses[0];
@> uint256 highestScore = result.score;
for (uint256 i = 1; i < taskResponses.length; i++) {
@> if (taskResponses[i].score > highestScore) {
highestScore = taskResponses[i].score;
result = taskResponses[i];
}
}
return result;}

** When the task parameter numValidations is set to 0, the system lacks a mechanism to validate responses effectively. Consequently, even incorrect or irrelevant responses can be treated as acceptable output and returned as the best result after calling LLMOracleCoordinator::getBestResponse.

PoC

For the proof of concept here is a valid test case, please paste it into the LLMOracleCoordinator.test.ts file:

describe("without validation", function () {
const [numGenerations, numValidations] = [2, 0];
let generatorAllowancesBefore: bigint[];
this.beforeAll(async () => {
taskId++;
generatorAllowancesBefore = await Promise.all(
generators.map((g) => token.allowance(coordinatorAddress, g.address))
);
});
it("should make a request", async function () {
await safeRequest(coordinator, token, requester, taskId, input, models, {
difficulty,
numGenerations,
numValidations,
});
});
it("should NOT respond if not a registered Oracle", async function () {
const generator = dummy;
await expect(safeRespond(coordinator, generator, output, metadata, taskId, 0n))
.to.revertedWithCustomError(coordinator, "NotRegistered")
.withArgs(generator.address);
});
it("should respond (1/2) to a request only once", async function () {
// using the first generator
const generator = generators[0];
await safeRespond(coordinator, generator, output, metadata, taskId, 0n);
// should NOT respond again
await expect(safeRespond(coordinator, generator, output, metadata, taskId, 0n))
.to.be.revertedWithCustomError(coordinator, "AlreadyResponded")
.withArgs(taskId, generator.address);
});
it("should respond (2/2)", async function () {
// use the second generator
const generator = generators[1];
await safeRespond(coordinator, generator, output, metadata, taskId, 1n);
});
it("should NOT respond if task is not pending generation", async function () {
// this time we use the other generator
const generator = generators[2];
await expect(safeRespond(coordinator, generator, output, metadata, taskId, 2n))
.to.revertedWithCustomError(coordinator, "InvalidTaskStatus")
.withArgs(taskId, TaskStatus.Completed, TaskStatus.PendingGeneration);
});
it("should NOT respond to a non-existent request", async function () {
const generator = generators[0];
const nonExistentTaskId = 999n;
await expect(safeRespond(coordinator, generator, output, metadata, nonExistentTaskId, 0n))
.to.revertedWithCustomError(coordinator, "InvalidTaskStatus")
.withArgs(nonExistentTaskId, TaskStatus.None, TaskStatus.PendingGeneration);
});
// Test case returns the very first response when no validations are present
it("should return the first response as the 'best' when no validations are present", async function () {
const task = await coordinator.requests(taskId);
console.log(`Task Status: ${task.status}`);
expect(task.status).to.equal(TaskStatus.Completed);
const responses = await coordinator.getResponses(taskId);
console.log("Generator responses:");
responses.forEach((response: any, index: number) => {
console.log(` Generator ${index} Address: ${response.responder}, Score: ${response.score}`);
});
const bestResponse = await coordinator.getBestResponse(taskId);
console.log(`Best Response - Responder: ${bestResponse.responder}, Score: ${bestResponse.score}`);
expect(bestResponse.responder).to.equal(generators[0].address);
expect(bestResponse.score).to.equal(0);
});
it("should see rewards", async function () {
const task = await coordinator.requests(taskId);
for (let i = 0; i < numGenerations; i++) {
const allowance = await token.allowance(coordinatorAddress, generators[i].address);
expect(allowance - generatorAllowancesBefore[i]).to.equal(task.generatorFee);
}
});
});

The test can be run with yarn test ./test/LLMOracleCoordinator.test.ts --verbose .
Here are the logs:

Task Status: 3
Generator responses:
Generator 0 Address: 0x90F79bf6EB2c4f870365E785982E1f101E93b906, Score: 0
Generator 1 Address: 0x15d34AAf54267DB7D7c367839AAf71A00a2C6A65, Score: 0
Best Response - Responder: 0x90F79bf6EB2c4f870365E785982E1f101E93b906, Score: 0

As you can see from the error logs, if oracle response validation is not required, all generated responses get the default score of 0 and the LLMOracleCoordinator::getBestResponse function returns the very first result pushed into the TaskResponse array of structs.

Impact

A core BuyerAgent::purchase function relies on the results of the LLMOracleCoordinator::getBestResponse function, which may not reflect the response quality or relevance, resulting in purchases based on inaccurate information. If users consistently receive from oracles poor answers that are labelled as the best, this would lead to financial losses as users would not purchase the best items, resulting in frustration and reduced confidence in the system's capabilities.

Tools Used

Manual Code Review, Hardhat

Recommendations

Consider introducing a threshold for the number of validations required before considering any response valid. It can be set in LLMOracleManager.sol. This ensures that responses are subjected to some level of scrutiny before being considered for the best-performing result.
Revise the logic in the getBestResponse function to ensure that it only selects responses with non-zero scores. If all responses have a score of 0, the function should revert with a clear message indicating that no valid responses were found, or return a designated fallback value that signifies the absence of a suitable response.

Updates

Lead Judging Commences

inallhonesty Lead Judge 12 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Return value of `getBestResponse` when no validators

Appeal created

sovaslava Auditor
12 months ago
inallhonesty Lead Judge
12 months ago
inallhonesty Lead Judge 12 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Return value of `getBestResponse` when no validators

Support

FAQs

Can't find an answer? Chat with us on Discord, Twitter or Linkedin.