Dria

Swan

NFT, Hardhat

21,000 USDC


Submission Details

Severity: low

Valid

Arbitrary Response Selection When No Validations or Low Validation Count

0xnbvc

Vulnerability Details

In LLMOracleCoordinator.getBestResponse(), when there are no validations (or too few validations to be meaningful), the function arbitrarily returns the first response without any quality assessment:

LLMOracleCoordinator.sol#L400-L423

/// @notice Returns the best performing result of the given task.
/// @dev For invalid task IDs, the status check will fail.
/// @param taskId The ID of the task to get the result for.
/// @return The best performing response w.r.t validation scores.
function getBestResponse(uint256 taskId) external view returns (TaskResponse memory) {
    TaskResponse[] storage taskResponses = responses[taskId];

    // ensure that task is completed
    if (requests[taskId].status != LLMOracleTask.TaskStatus.Completed) {
        revert InvalidTaskStatus(taskId, requests[taskId].status, LLMOracleTask.TaskStatus.Completed);
    }

    // pick the result with the highest validation score
->  TaskResponse storage result = taskResponses[0];
->  uint256 highestScore = result.score;
    for (uint256 i = 1; i < taskResponses.length; i++) {
        if (taskResponses[i].score > highestScore) {
            highestScore = taskResponses[i].score;
            result = taskResponses[i];
        }
    }

    return result;
}
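The tie-breaking behaviour of the loop above can be reproduced in isolation. What follows is a minimal sketch, not the audited contract: `SelectionSketch`, `Response`, `addResponse`, and `bestIndex` are hypothetical names introduced only for illustration, and the struct is reduced to the two fields the loop needs.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical, stripped-down model of the selection loop (assumption, not the audited code).
contract SelectionSketch {
    struct Response {
        address responder;
        uint256 score;
    }

    Response[] private responses;

    /// Appends a response in arrival order, as a responder would.
    function addResponse(address responder, uint256 score) external {
        responses.push(Response(responder, score));
    }

    /// Same strict `>` comparison as the loop in the finding:
    /// on ties, the earliest index wins.
    function bestIndex() public view returns (uint256 best) {
        require(responses.length > 0, "no responses");
        uint256 highestScore = responses[0].score;
        for (uint256 i = 1; i < responses.length; i++) {
            if (responses[i].score > highestScore) {
                highestScore = responses[i].score;
                best = i;
            }
        }
    }
}
```

If three responses are added with a score of 0 each, which is the situation when no validation ever ran, `bestIndex()` returns 0: the strict comparison never fires, so arrival order alone decides the "best" response.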

This is problematic in two scenarios:

1. When numValidations = 0: every response has a score of 0, so the first response is always selected.
2. When numValidations <= 2: the mean and standard deviation computed from so few samples carry little information. For example:
   - With 1 validation: the variance is zero, so the standard deviation cannot separate good scores from bad ones.
   - With 2 validations: any two scores form a "valid" mean and standard deviation, even if both are completely incorrect.
   - Meaningful statistical analysis typically requires at least 3-5 samples.
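The small-sample problem can be made concrete with a short sketch. It assumes an inlier rule of the form "a score is accepted when it lies within one population standard deviation of the mean"; that rule, the library name `SmallSampleSketch`, and the function `isInlier` are illustrative assumptions and may differ from the exact rule in the audited contract. To avoid rounding from integer division and square roots, the check is written in an equivalent exact form: with n scores summing to S, score x is an inlier when n * (n*x - S)^2 <= sum over i of (n*x_i - S)^2.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper (assumption): one-standard-deviation inlier test.
library SmallSampleSketch {
    /// True when scores[k] lies within one population standard deviation of the mean.
    /// Exact integer form of (x - mean)^2 <= variance, scaled by n^2 on both sides.
    function isInlier(uint256[] memory scores, uint256 k) internal pure returns (bool) {
        uint256 n = scores.length;
        require(k < n, "index out of range");

        uint256 sum = 0;
        for (uint256 i = 0; i < n; i++) {
            sum += scores[i];
        }

        // Sum of squared, n-scaled deviations: sum_i (n * x_i - S)^2.
        uint256 sumSq = 0;
        for (uint256 i = 0; i < n; i++) {
            uint256 d = _absDiff(n * scores[i], sum);
            sumSq += d * d;
        }

        uint256 dk = _absDiff(n * scores[k], sum);
        return dk * dk * n <= sumSq;
    }

    function _absDiff(uint256 a, uint256 b) private pure returns (uint256) {
        return a > b ? a - b : b - a;
    }
}
```

Under this rule a single score is always an inlier (both sides are 0), and two scores a and b are always both inliers, since each side of the inequality equals 2 * (a - b)^2. Only from three samples onward can the test reject anything: for the scores 0, 0, 100 the first two pass and the third fails.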

Impact

- Users receive arbitrary responses that may not be the best quality.
- The first responder is unfairly advantaged in no-validation scenarios.
- The protocol's quality assurance mechanism is bypassed.
- Users paying for multiple generations don't get the intended benefit of selecting the best response.
- The statistical scoring system becomes trivially gameable with 2 or fewer validations.

Recommendations

Allow users to select their preferred response when the validation count is <= 2,
or add a minimum validation threshold for automated selection.
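Both recommendations can be sketched together. The example below is a hedged illustration under stated assumptions, not a patch for the audited contract: `ThresholdSketch`, `MIN_VALIDATIONS`, `InsufficientValidations`, `record`, and `getResponses` are hypothetical names, the threshold of 3 is simply the smallest count above the "<= 2" case described in this finding, and the per-task validation count is stored in a plain mapping for brevity.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical sketch (assumption): automated selection only above a validation threshold.
contract ThresholdSketch {
    struct Response {
        address responder;
        uint256 score;
    }

    /// Hypothetical error and threshold, introduced for illustration.
    error InsufficientValidations(uint256 taskId, uint256 have, uint256 need);
    uint256 public constant MIN_VALIDATIONS = 3;

    mapping(uint256 => Response[]) private responses;
    mapping(uint256 => uint256) private numValidations;

    /// Test helper: stores a response and the task's validation count.
    function record(uint256 taskId, address responder, uint256 score, uint256 validations) external {
        responses[taskId].push(Response(responder, score));
        numValidations[taskId] = validations;
    }

    /// Manual path: the caller inspects every response and picks one itself.
    function getResponses(uint256 taskId) external view returns (Response[] memory) {
        return responses[taskId];
    }

    /// Automated path: refuses to pick a "best" response from too few validations.
    function getBestResponse(uint256 taskId) external view returns (Response memory) {
        uint256 have = numValidations[taskId];
        if (have < MIN_VALIDATIONS) {
            revert InsufficientValidations(taskId, have, MIN_VALIDATIONS);
        }

        Response[] storage taskResponses = responses[taskId];
        require(taskResponses.length > 0, "no responses");

        uint256 best = 0;
        for (uint256 i = 1; i < taskResponses.length; i++) {
            if (taskResponses[i].score > taskResponses[best].score) {
                best = i;
            }
        }
        return taskResponses[best];
    }
}
```

With this shape, a task that had 0, 1, or 2 validations makes `getBestResponse` revert instead of silently returning index 0, and the caller can fall back to `getResponses` to choose a response manually.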

Updates

Lead Judging Commences

inallhonesty Lead Judge about 1 year ago

Submission Judgement Published

Validated

Assigned finding tags:

Return value of `getBestResponse` when no validators

Prize pool breakdown

Total prize: 21,000 USDC
nSLOC: 1,034 (20 USDC / LOC)
High / Medium: 20,000 USDC
Low: 1,000 USDC
