The LLMOracleCoordinator::getBestResponse function may not return the best performing result of the given task. This behaviour can lead to less than optimal purchases, especially in scenarios where response validation has not taken place.
The LLMOracleCoordinator::getBestResponse function aims to return the best-performing response for a given task, based on validation scores. However, a problem arises when the task requires no validations, i.e. when task.parameters.numValidations == 0 in LLMOracleCoordinator::respond. In that case every response keeps the default score of 0 (see [https://github.com/Cyfrin/2024-10-swan-dria/blob/main/contracts/llm/LLMOracleCoordinator.sol#L226]), no validation takes place, and the task status is set to Completed. This has the following consequences:
** The LLMOracleCoordinator::getBestResponse function selects the first response in the taskResponses array, which may not be the best or most accurate result. In fact, it could be an empty, irrelevant or false response. Since all scores are 0, the score carries no information about response quality.
** When the task parameter numValidations is set to 0, the system has no mechanism for validating responses. Consequently, even incorrect or irrelevant responses can be treated as acceptable output and returned as the best result by LLMOracleCoordinator::getBestResponse.
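The tie behaviour described above can be illustrated with a small off-chain model. This is a sketch, not the contract code: it assumes the selection loop starts from the first response and replaces it only on a strictly higher score, and the simplified TaskResponse fields and the oracle names are hypothetical.

```typescript
// Off-chain model of the selection logic described in the report.
// Assumption: the loop uses a strict ">" comparison, so ties keep the first entry.
interface TaskResponse {
  responder: string; // hypothetical identifier for the responding oracle
  output: string;
  score: number; // uint256 on-chain; number is enough for this sketch
}

function getBestResponse(taskResponses: TaskResponse[]): TaskResponse {
  if (taskResponses.length === 0) {
    throw new Error("no responses");
  }
  let result = taskResponses[0];
  for (let i = 1; i < taskResponses.length; i++) {
    // A tie never replaces the current pick.
    if (taskResponses[i].score > result.score) {
      result = taskResponses[i];
    }
  }
  return result;
}

// With numValidations == 0, every response keeps the default score of 0.
const unvalidated: TaskResponse[] = [
  { responder: "oracle-1", output: "", score: 0 },
  { responder: "oracle-2", output: "a relevant answer", score: 0 },
];

const picked = getBestResponse(unvalidated);
console.log(picked.responder); // "oracle-1"
```

Under these assumptions the empty output from the first responder wins purely because of its position in the array.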
As a proof of concept, paste the following test case into the LLMOracleCoordinator.test.ts file:
The test can be run with yarn test ./test/LLMOracleCoordinator.test.ts --verbose
Here are the logs:
As the logs show, if oracle response validation is not required, all generated responses keep the default score of 0 and the LLMOracleCoordinator::getBestResponse function returns the very first result pushed into the TaskResponse array of structs.
The core BuyerAgent::purchase function relies on the result of LLMOracleCoordinator::getBestResponse, which may not reflect response quality or relevance, so purchases can be based on inaccurate information. If users consistently receive poor oracle answers that are labelled as the best, they will not purchase the best items and will suffer financial losses, leading to frustration and reduced confidence in the system's capabilities.
Manual Code Review, Hardhat
Consider introducing a threshold for the number of validations required before considering any response valid. It can be set in LLMOracleManager.sol. This ensures that responses are subjected to some level of scrutiny before being considered for the best-performing result.
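A minimal sketch of such a threshold check, written as an off-chain model rather than contract code. The MIN_VALIDATIONS constant, its value of 1, the checkParameters helper and the shape of TaskParameters beyond numValidations are all assumptions made for illustration; the report does not prescribe them.

```typescript
// Hypothetical lower bound; the report does not prescribe a value.
const MIN_VALIDATIONS = 1;

// Simplified stand-in for the task parameters; only numValidations is taken from the report.
interface TaskParameters {
  difficulty: number;
  numGenerations: number;
  numValidations: number;
}

// Rejects requests that would let responses complete without any validation.
function checkParameters(
  params: TaskParameters,
  minValidations: number = MIN_VALIDATIONS
): void {
  if (params.numValidations < minValidations) {
    throw new Error(
      `numValidations ${params.numValidations} is below the minimum ${minValidations}`
    );
  }
}

// A request with at least one validation passes the check.
checkParameters({ difficulty: 1, numGenerations: 2, numValidations: 1 });
```

On-chain, the equivalent check would presumably sit wherever request parameters are already validated, so that a task with numValidations == 0 is rejected at request time instead of completing unvalidated.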
Revise the logic in the getBestResponse function to ensure that it only selects responses with non-zero scores. If all responses have a score of 0, the function should revert with a clear message indicating that no valid responses were found, or return a designated fallback value that signifies the absence of a suitable response.
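The revised selection could look roughly like the following off-chain model. It is a sketch under one explicit assumption: a score of 0 means "not validated". If validators can legitimately assign a score of 0, a separate validated flag would be needed instead. The ScoredResponse type and the getBestValidatedResponse name are hypothetical.

```typescript
// Hypothetical, simplified response type for this sketch.
interface ScoredResponse {
  responder: string;
  output: string;
  score: number; // uint256 on-chain; number is enough here
}

// Returns the highest-scoring response, ignoring responses whose score is 0.
// Throws (the on-chain analogue would be a revert) when nothing qualifies.
function getBestValidatedResponse(
  taskResponses: ScoredResponse[]
): ScoredResponse {
  let best: ScoredResponse | undefined = undefined;
  for (const response of taskResponses) {
    if (response.score > 0 && (best === undefined || response.score > best.score)) {
      best = response;
    }
  }
  if (best === undefined) {
    throw new Error("no validated response found");
  }
  return best;
}

const mixed: ScoredResponse[] = [
  { responder: "oracle-1", output: "", score: 0 },
  { responder: "oracle-2", output: "a relevant answer", score: 7 },
  { responder: "oracle-3", output: "another answer", score: 4 },
];

console.log(getBestValidatedResponse(mixed).responder); // "oracle-2"
```

Callers such as BuyerAgent::purchase would then have to handle the failure case explicitly instead of silently acting on an unvalidated response.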