Dria

Swan

NFTHardhat

21,000 USDC

View results

Previous Next

Submission Details

Severity: low

Valid

Inaccurate best response selection in `LLMOracleCoordinator::getBestResponse`.

v1vah0us3

Summary

The LLMOracleCoordinator::getBestResponse function may not return the best performing result of the given task. This behaviour can lead to less than optimal purchases, especially in scenarios where response validation has not taken place.

Description

The LLMOracleCoordinator::getBestResponse function aims to return the best-performing response based on validation scores for a given task. However, there is an issue when the task responses have no validation requirements, as it is when task.parameters.numValidations == 0 in LLMOracleCoordinator::respond. In such cases, all responses will be assigned a default score of 0 (see [https://github.com/Cyfrin/2024-10-swan-dria/blob/main/contracts/llm/LLMOracleCoordinator.sol#L226]) and will not require any validation, setting the task status to Completed. This may lead to the following consequences:

** The LLMOracleCoordinator::getBestResponse function selects the first response from taskResponses array, which may not represent the best or most accurate result. In fact it could be an empty, irrelevant or false response. Since all scores are 0, this does not indicate performance quality.

function getBestResponse(uint256 taskId) external view returns (TaskResponse memory) {

TaskResponse[] storage taskResponses = responses[taskId];

if (requests[taskId].status != LLMOracleTask.TaskStatus.Completed) {

revert InvalidTaskStatus(taskId, requests[taskId].status, LLMOracleTask.TaskStatus.Completed);

}

@> TaskResponse storage result = taskResponses[0];

@> uint256 highestScore = result.score;

for (uint256 i = 1; i < taskResponses.length; i++) {

@> if (taskResponses[i].score > highestScore) {

highestScore = taskResponses[i].score;

result = taskResponses[i];

}

return result;}

** When the task parameter numValidations is set to 0, the system lacks a mechanism to validate responses effectively. Consequently, even incorrect or irrelevant responses can be treated as acceptable output and returned as the best result after calling LLMOracleCoordinator::getBestResponse.

PoC

For the proof of concept here is a valid test case, please paste it into the LLMOracleCoordinator.test.ts file:

describe("without validation", function () {

const [numGenerations, numValidations] = [2, 0];

let generatorAllowancesBefore: bigint[];

this.beforeAll(async () => {

taskId++;

generatorAllowancesBefore = await Promise.all(

generators.map((g) => token.allowance(coordinatorAddress, g.address))

);

});

it("should make a request", async function () {

await safeRequest(coordinator, token, requester, taskId, input, models, {

difficulty,

numGenerations,

numValidations,

});

it("should NOT respond if not a registered Oracle", async function () {

const generator = dummy;

await expect(safeRespond(coordinator, generator, output, metadata, taskId, 0n))

.to.revertedWithCustomError(coordinator, "NotRegistered")

.withArgs(generator.address);

});

it("should respond (1/2) to a request only once", async function () {

// using the first generator

const generator = generators[0];

await safeRespond(coordinator, generator, output, metadata, taskId, 0n);

// should NOT respond again

await expect(safeRespond(coordinator, generator, output, metadata, taskId, 0n))

.to.be.revertedWithCustomError(coordinator, "AlreadyResponded")

.withArgs(taskId, generator.address);

});

it("should respond (2/2)", async function () {

// use the second generator

const generator = generators[1];

await safeRespond(coordinator, generator, output, metadata, taskId, 1n);

});

it("should NOT respond if task is not pending generation", async function () {

// this time we use the other generator

const generator = generators[2];

await expect(safeRespond(coordinator, generator, output, metadata, taskId, 2n))

.to.revertedWithCustomError(coordinator, "InvalidTaskStatus")

.withArgs(taskId, TaskStatus.Completed, TaskStatus.PendingGeneration);

});

it("should NOT respond to a non-existent request", async function () {

const generator = generators[0];

const nonExistentTaskId = 999n;

await expect(safeRespond(coordinator, generator, output, metadata, nonExistentTaskId, 0n))

.to.revertedWithCustomError(coordinator, "InvalidTaskStatus")

.withArgs(nonExistentTaskId, TaskStatus.None, TaskStatus.PendingGeneration);

});

// Test case returns the very first response when no validations are present

it("should return the first response as the 'best' when no validations are present", async function () {

const task = await coordinator.requests(taskId);

console.log(`Task Status: ${task.status}`);

expect(task.status).to.equal(TaskStatus.Completed);

const responses = await coordinator.getResponses(taskId);

console.log("Generator responses:");

responses.forEach((response: any, index: number) => {

console.log(` Generator ${index} Address: ${response.responder}, Score: ${response.score}`);

});

const bestResponse = await coordinator.getBestResponse(taskId);

console.log(`Best Response - Responder: ${bestResponse.responder}, Score: ${bestResponse.score}`);

expect(bestResponse.responder).to.equal(generators[0].address);

expect(bestResponse.score).to.equal(0);

});

it("should see rewards", async function () {

const task = await coordinator.requests(taskId);

for (let i = 0; i < numGenerations; i++) {

const allowance = await token.allowance(coordinatorAddress, generators[i].address);

expect(allowance - generatorAllowancesBefore[i]).to.equal(task.generatorFee);

}

});

The test can be run with yarn test ./test/LLMOracleCoordinator.test.ts --verbose .
Here are the logs:

Task Status: 3

Generator responses:

Generator 0 Address: 0x90F79bf6EB2c4f870365E785982E1f101E93b906, Score: 0

Generator 1 Address: 0x15d34AAf54267DB7D7c367839AAf71A00a2C6A65, Score: 0

Best Response - Responder: 0x90F79bf6EB2c4f870365E785982E1f101E93b906, Score: 0

As you can see from the error logs, if oracle response validation is not required, all generated responses get the default score of 0 and the LLMOracleCoordinator::getBestResponse function returns the very first result pushed into the TaskResponse array of structs.

Impact

A core BuyerAgent::purchase function relies on the results of the LLMOracleCoordinator::getBestResponse function, which may not reflect the response quality or relevance, resulting in purchases based on inaccurate information. If users consistently receive from oracles poor answers that are labelled as the best, this would lead to financial losses as users would not purchase the best items, resulting in frustration and reduced confidence in the system's capabilities.

Tools Used

Manual Code Review, Hardhat

Recommendations

Consider introducing a threshold for the number of validations required before considering any response valid. It can be set in LLMOracleManager.sol. This ensures that responses are subjected to some level of scrutiny before being considered for the best-performing result.
Revise the logic in the getBestResponse function to ensure that it only selects responses with non-zero scores. If all responses have a score of 0, the function should revert with a clear message indicating that no valid responses were found, or return a designated fallback value that signifies the absence of a suitable response.

Updates

Lead Judging Commences

inallhonesty Lead Judge about 1 year ago

Submission Judgement Published

Validated

Assigned finding tags:

Return value of `getBestResponse` when no validators

Appeal created

sovaslava Auditor

about 1 year ago

inallhonesty Lead Judge

about 1 year ago

inallhonesty Lead Judge about 1 year ago

Submission Judgement Published

Validated

Assigned finding tags:

Return value of `getBestResponse` when no validators

Prize pool breakdown

Total prize

21,000 USDC

nSLOC

1,034

20 USDC / LOC

High Medium

20,000 USDC

Low

1,000 USDC

Live

The contest is live. Earn rewards by submitting a finding.

Judging

Submissions are being carefully reviewed by our judges.

View all submissions

Appeals

This is your time to appeal against judgements on your submissions.

Appeals Review

Appeals are being carefully reviewed by our judges.

Rewards Distribution

The contest is complete and the rewards are being distributed.

View results

Support

FAQs

Can't find an answer? Chat with us on Discord, Twitter or Linkedin.

Give us feedback!