Dria

Swan
NFTHardhat
21,000 USDC
Submission Details
Severity: low
Valid

Arbitrary Response Selection When No Validations or Low Validation Count

Vulnerability Details

LLMOracleCoordinator.getBestResponse() performs no check on how many validations a task received. When there are no validations, it returns the first response without any quality assessment; when there are too few validations to be meaningful, it selects on scores that carry little information:

LLMOracleCoordinator.sol#L400-L423

    /// @notice Returns the best performing result of the given task.
    /// @dev For invalid task IDs, the status check will fail.
    /// @param taskId The ID of the task to get the result for.
    /// @return The best performing response w.r.t validation scores.
    function getBestResponse(uint256 taskId) external view returns (TaskResponse memory) {
        TaskResponse[] storage taskResponses = responses[taskId];

        // ensure that task is completed
        if (requests[taskId].status != LLMOracleTask.TaskStatus.Completed) {
            revert InvalidTaskStatus(taskId, requests[taskId].status, LLMOracleTask.TaskStatus.Completed);
        }

        // pick the result with the highest validation score
        TaskResponse storage result = taskResponses[0]; // <-- defaults to the first response
        uint256 highestScore = result.score;            // <-- 0 when there are no validations
        for (uint256 i = 1; i < taskResponses.length; i++) {
            if (taskResponses[i].score > highestScore) {
                highestScore = taskResponses[i].score;
                result = taskResponses[i];
            }
        }

        return result;
    }

This is problematic in two scenarios:

  1. When numValidations = 0: All responses have a score of 0. The loop's strict > comparison never replaces taskResponses[0], so the first response is always selected

  2. When numValidations <= 2: The mean and standard deviation used for scoring carry little information with so few samples. For example:

    • With 1 validation: The standard deviation is 0, so it cannot distinguish a reasonable score from an outlier

    • With 2 validations: Any two scores yield a well-defined mean and standard deviation, even if both scores are completely incorrect

    • As a rule of thumb, at least 3-5 samples are needed before these statistics become meaningful
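The first scenario can be reproduced in isolation. The following is a minimal sketch, not the Swan contract: `BestResponseSketch`, `pickBest`, and `allZeroExample` are hypothetical names, and the responses are reduced to a bare array of scores. It only mirrors the strict `>` comparison from getBestResponse().

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Hypothetical, simplified model of the selection loop (assumption:
///         responses are represented only by their scores, in submission order).
contract BestResponseSketch {
    /// @dev Mirrors the strict `>` comparison used in getBestResponse().
    /// @param scores Validation scores, one per response.
    /// @return best Index of the selected response (starts at 0).
    function pickBest(uint256[] memory scores) public pure returns (uint256 best) {
        require(scores.length > 0, "no responses");
        uint256 highestScore = scores[0];
        for (uint256 i = 1; i < scores.length; i++) {
            // A tie never replaces the current pick, so equal scores favour index 0.
            if (scores[i] > highestScore) {
                highestScore = scores[i];
                best = i;
            }
        }
    }

    /// @dev Three responses and no validations: every score is 0.
    function allZeroExample() external pure returns (uint256) {
        uint256[] memory scores = new uint256[](3); // zero-initialized
        return pickBest(scores); // returns 0, i.e. the first responder
    }
}
```

Since ties always resolve to the earliest index, the outcome in this case depends only on submission order.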

Impact

  • Users receive arbitrary responses that may not be of the best quality

  • The first responder is unfairly advantaged in no-validation scenarios

  • The protocol's quality assurance mechanism is bypassed

  • Users paying for multiple generations don't get the intended benefit of selecting the best response

  • The statistical scoring system becomes trivially gameable with 2 or fewer validations

Recommendations

  1. Allow users to select their preferred response when validation count is <= 2.

  2. Alternatively, enforce a minimum validation count before automated selection is used.
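One possible shape for the second recommendation is sketched below. It is an illustration under stated assumptions rather than a patch for the actual contract: the `MinValidationSketch` contract, the `MIN_VALIDATIONS` constant (set to 3 purely as an example), the per-task `numValidations` mapping, the simplified `TaskResponse` struct, and both custom errors are hypothetical. The task status check from the original function is omitted for brevity.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Hypothetical sketch of a minimum validation threshold.
///         Names and storage layout are assumptions, not the Swan code.
contract MinValidationSketch {
    // Simplified stand-in for the real response struct (assumption).
    struct TaskResponse {
        address responder;
        uint256 score;
        bytes output;
    }

    /// @dev Illustrative threshold: tasks with 2 or fewer validations are rejected.
    uint256 public constant MIN_VALIDATIONS = 3;

    mapping(uint256 => TaskResponse[]) internal responses;
    // Assumed per-task validation count.
    mapping(uint256 => uint256) internal numValidations;

    error NoResponses(uint256 taskId);
    error InsufficientValidations(uint256 taskId, uint256 have, uint256 need);

    function getBestResponse(uint256 taskId) external view returns (TaskResponse memory) {
        TaskResponse[] storage taskResponses = responses[taskId];
        if (taskResponses.length == 0) revert NoResponses(taskId);

        // Refuse automated selection when the scores are not backed by enough validations.
        uint256 have = numValidations[taskId];
        if (have < MIN_VALIDATIONS) {
            revert InsufficientValidations(taskId, have, MIN_VALIDATIONS);
        }

        // Same selection loop as the original function.
        TaskResponse storage result = taskResponses[0];
        uint256 highestScore = result.score;
        for (uint256 i = 1; i < taskResponses.length; i++) {
            if (taskResponses[i].score > highestScore) {
                highestScore = taskResponses[i].score;
                result = taskResponses[i];
            }
        }
        return result;
    }
}
```

Reverting means callers need a fallback path for under-validated tasks, which is where the first recommendation (letting the user choose) would fit.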

Updates

Lead Judging Commences

inallhonesty Lead Judge 10 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Return value of `getBestResponse` when no validators
