Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert message IDs to unique hashes #203

Closed
wants to merge 19 commits into from
Closed

Conversation

michaelkaplan13
Copy link
Collaborator

@michaelkaplan13 michaelkaplan13 commented Dec 15, 2023

Note: The diff of this PR is relatively large due to the number of tests that needed to be updated to account for a type change from uint256 to bytes32. When reviewing, I recommend starting with ITeleporterMessenger.sol and TeleporterMessenger.sol as those files contain the only core changes.

Why this should be merged and how it works

Currently the TeleporterMessenger contract assigns a unique ID to each message sent by track how many messages have been sent to a given destination chain, and incrementing the "message ID" to be used messages sent to that chain by each time a message is sent to it. The message ID is used when TeleporterMessenger receives a message to provide replay protection by ensuring that the given message has never been received in the past.

While this is sufficient for the TeleporterMessenger contract itself, these message IDs are not unique across chains or TeleporterMessenger versions. The first message from one TeleporterMessenger instance to any other chain will alway have a message ID of 1, etc. On a global Avalanche level, in order to identify a specific Teleporter message, you must specify:

  • The origin blockchain ID
  • The destination blockchain ID
  • The Teleporter contract address
  • The message ID assigned to the message.

This makes indexing and querying for Teleporter messages more cumbersome.

In this PR, the unique messageID assigned to each message by the TeleporterMessenger contract is changed from a uint256 value to a bytes32 value. Instead of being a sequence number of the number of message that have been sent by the TeleporterMessenger instance to another blockchain ID, it is computed as:

sha256(<teleporter_contract_address>, <current_blockchain_id>, <message_nonce>)

The message_nonce above is tracked as the total number of message sent from the given TeleporterMessenger instance to any other blockchain ID. The resulting bytes32 message ID is guaranteed to be unique for each message sent by the contract given that the message_nonce value will be different for every message. It can be used in the same exact way as the previous message ID uint256 was to provide replay protection.

The new message ID hash is not included in the TeleporterMessage that is sent to the destination. Only the message nonce used to create the message ID hash is included in the message. When receiveCrossChainMessage is called to receive a message on the destination, the message ID hash is recomputed by the receiving chain using the source blockchain ID of the Warp message, and the message nonce included in the message. This ensures that message IDs received from different chains will be unique, even if a subnet is corrupted.

Since the message ID values no longer depend on the destination blockchain ID, the getNextMessageID function of the ITeleporterMessenger interface is also simplified to have no parameters.

The high level result of this change is that the message IDs assigned to messages by different TeleporterMessenger contract instances across different blockchains or at different contract addresses are unique. This allows for simplified data indexing and querying of messages offchain.

How this was tested

Unit tests, E2E test

How is this documented

TODO 😄

@michaelkaplan13 michaelkaplan13 changed the title Hash message Convert message IDs to unique hashes Dec 15, 2023
@michaelkaplan13 michaelkaplan13 added this to the production launch milestone Dec 15, 2023
Copy link
Contributor

@geoff-vball geoff-vball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I like it❗

Comment on lines 768 to 771
function _calculateNextMessageID() private view returns (bytes32) {
bytes32 blockchainID_ = blockchainID;
require(blockchainID_ != bytes32(0), "TeleporterMessenger: blockchainID not set");
return sha256(abi.encode(address(this), blockchainID_, messageNonce));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of a private view function, is it some sort of gas optimization?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a minute to remember why we have it structured this way 😅

The reason is that interface functions (such as getNextMessageID) are required to be external. In order to call those methods from within the contract itself, you have to use this.functionName(), which uses significantly more gas, and is generally not best practice.

So instead, we have a private view method for internal use and add on getNextMessageID which provides the external access point to it.

Copy link
Contributor

@cam-schultz cam-schultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments on TeleporterMessenger.sol, and will take another pass to review the downstream changes.

) public sentMessageInfo;

// Tracks the hash of messages that have been received but whose execution has never succeeded.
// Enables retrying of failed messages with higher gas limits. Message execution is guaranteed to
// succeed at most once. The first key is the blockchain ID of the sending chain, the second key is the message ID, and
// the value is the hash of the uniquely identified message whose execution failed.
mapping(bytes32 sourceBlockchainID => mapping(uint256 messageID => bytes32 messageHash)) public
mapping(bytes32 sourceBlockchainID => mapping(bytes32 messageID => bytes32 messageHash)) public
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the messageID encodes the blockchainID, I believe we can simplify this by removing sourceBlockchainID as a key, and just keeping the inner map. Same applies to _relayerRewardAddresses. However, we need to keep sentMessageInfo as is since the messageID does not encode the destinationBlockchainID

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to be careful here for a couple of reasons:

  • It is technically not correct to say that the message ID "encodes" the source blockchain ID because there is no way to decode what the source blockchain ID value is given a message ID because it's a one way hash.
  • While message IDs should be unique from every source blockchain ID now that the source blockchain ID is used as input to that hash that generates the message ID, a single corrupted source subnet can send arbitrary messageID values that conflict with other chains.

An additional step we could take here, is to verify the correctness of the message ID hash in receiveCrossChainMessage. We have the source blockchain ID from the warp message itself, and verify that the sender address is always the same as address(this) since Teleporter messages are always sent from/to the same address on other chains. The only additional piece of information needed to be able to re-compute the expected message ID hash is the messageNonce used by the sender.

If we have the message include the nonce instead of the message ID, and then re-compute the message ID on the receiving side, then we would be able to rely on its uniqueness across all source chains and could likely leverage that further.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are made in #209

Comment on lines 571 to 575
bytes32 blockchainID_ = blockchainID;
if (blockchainID_ == bytes32(0)) {
blockchainID_ = WARP_MESSENGER.getBlockchainID();
blockchainID = blockchainID_;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An internal or private initializeBlockchainID() would be good to include since we do this in multiple places now.

Comment on lines 581 to 582
// Increment the message nonce so the next message will have a different ID
++messageNonce;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't matter functionally, but can we move this to after bytes memory teleporterMessageBytes = abi.encode(teleporterMessage);?

This will be consistent with the pattern before, where we update the state variable pertaining to messageID after this line.

Additionally, I think it makes sense to update the messageNonce value only after it's used to contsruct the TeleporterMessage (even though now it's indirect)

) external view returns (address, uint256);

/**
* @dev Gets the next message ID to be used for a given chain ID.
* @return The next message ID to be used to send a message to the given chain ID.
* @dev Gets the message ID to be used for the next message sent from the contract instance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here explaining that the return value is not only guaranteed to match the message ID of the next message sent to the contract, regardless of the sender address, or recipient chain or address?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's accurate. The return value is guaranteed to match the message ID of the next message sent from the contract. I'm guessing what you're referring to is that a caller of this method may not be guaranteed to be the next sender of a message and thus may get a different message ID assigned to their message, but I think that's assuming a specific use case that doesn't need to be documented in the contract itself.

For instance, a contract could call getNextMessageID, and then sendCrossChainMessage in the same function such that it does know it is the next caller of sendCrossChainMessage. I'm not sure what the use case would be, but something along the lines of:

contract MyContract {
    function calculateReportSend() public {
        // Calculate the message ID before sending.
        bytes32 expectedMessageID = teleporterMessenger.getNextMessageID();
        
        // Logic here to do something with expectedMessageID.
        
        // Send the message
        bytes32 messageID = teleporterMessenger.sendCrossChainMessage(...);
        require(expectedMessageID == messageID);
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's what I meant, thanks. In my original comment, an extraneous "not" snuck its way in. What I meant to say was:

Can you add a comment here explaining that the return value is not only guaranteed to match the message ID of the next message sent to the contract, regardless of the sender address, or recipient chain or address?

michaelkaplan13 and others added 5 commits December 18, 2023 10:38
Co-authored-by: cam-schultz <78878559+cam-schultz@users.noreply.github.com>
Signed-off-by: Michael Kaplan <55204436+michaelkaplan13@users.noreply.github.com>
@michaelkaplan13
Copy link
Collaborator Author

I've updated this PR such that the new unique message ID is not included in the messages themselves.

Instead the messageNonce used to compute the unique message ID is included in the message, and the receiver side recomputes the unique message ID based on that nonce. This makes it such that the message IDs received by a given Teleporter instance are ensured to be unique across all subnets even if a subnet gets corrupted.

This allows for flattening of multiple data structures that were previously keyed on [sourceBlockchainID, messageID], because it is now guaranteed that messageID is unique across all sourceBlockchainIDs.

geoff-vball
geoff-vball previously approved these changes Dec 19, 2023
Copy link
Contributor

@geoff-vball geoff-vball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

Co-authored-by: Geoff Stuart <geoff.vball@gmail.com>
Signed-off-by: Michael Kaplan <55204436+michaelkaplan13@users.noreply.github.com>
) private returns (bytes32) {
// If the blockchain ID has yet to be initialized, do so now.
bytes32 blockchainID_ = _initializeBlockchainID();
require(blockchainID_ != bytes32(0), "TeleporterMessenger: blockchainID not set");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check seems unnecessary. We already check against the bytes32(0) case inside the function, so this should never occur.

If we want to keep this as a redundant check, can we also add this require in the same place in receiveCrossChainMessage?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove the redundant check

/**
* @dev Calculates the message ID given this contract's address and required input parameters.
*/
function _calculateNextMessageNonceAndID() private view returns (uint256, bytes32) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording of this function suggests that it calculates the message ID based on messageNonce + 1. Can we rename to _calculateMessageNonceAndID?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo we can remove this function in favor of _calculateMessageID, then for getNextMessageID we can add a check for whether blockchainID is set

@@ -123,9 +120,12 @@ contract TeleporterMessenger is ITeleporterMessenger, ReentrancyGuards {
bytes32 destinationBlockchainID,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-looking at this, do we even need to pass in destinationBlockchainID? It's already inside TeleproterMessage

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in #209 👍

// Get the message ID to use for this message.
uint256 messageID = _getNextMessageID(messageInput.destinationBlockchainID);
(uint256 messageNonce_, bytes32 messageID_) = _calculateNextMessageNonceAndID();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just use _calculateMessageID here? We already have the check that blockchainID != zero bytes

@@ -62,23 +61,21 @@ contract TeleporterMessenger is ITeleporterMessenger, ReentrancyGuards {
// The first key is the blockchain ID of the destination chain, the second key is the message ID, and the value is the info
// for the uniquely identified message.
mapping(
bytes32 destinationBlockchainID => mapping(uint256 messageID => SentMessageInfo messageInfo)
bytes32 destinationBlockchainID => mapping(bytes32 messageID => SentMessageInfo messageInfo)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can actually get rid of the first key, since messageID is now unique regardless of which destinationBlockchainID it's sending to, and is unique to messageNonce

Copy link

@minghinmatthewlam minghinmatthewlam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall like the changes so far, but I think we'd be able to remove the first key of destinationBlockchainID for sentMessageInfo, since each messageID is unique regardless of the destinationBlockchainID, and will match to a message nonce. Then in some functions that take the extra destination blockchain ID, it can also be removed as a parameter, i.e. getFeeinfo, getMessageHash, etc.

bytes32 sourceBlockchainID,
uint256 nonce
) private view returns (bytes32) {
return sha256(abi.encode(address(this), sourceBlockchainID, nonce));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we opt for sha256 vs keccak256

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A tiny bit of research seems to say keccak is both cheaper (marginally) and more secure (not so important here, but nice to get for free).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, was already use keccak256 else in the contract also, so I switched to it as a part of #209 to be consistent.

@michaelkaplan13
Copy link
Collaborator Author

Closing in favor of #209

@geoff-vball geoff-vball deleted the hash-message-id branch January 30, 2024 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants