Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow set max_id and min_id for MessageIter and SearchIter #304

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

amone-bit
Copy link

allow set max_id and min_id for MessageIter and SearchIter to filter messages like telethon.

@Lonami
Copy link
Owner

Lonami commented Jan 10, 2025

Why is the existing offset_id and breaking out of the loop early not enough?

@amone-bit
Copy link
Author

amone-bit commented Jan 11, 2025

If I use offset_id I need to filter messages in local, and will fetch more data from server. I think that will cost more time, and may not be very elegant and efficient. In the contrary, if I set min_id and max_id the filter will in telegram server, and will return less data from server. And meanwhile min_id and max_id maybe more intuitive than offset_id and loop.

@amone-bit
Copy link
Author

offset_id seems only can get older messages, can't get the new messages. I want to get messages from old to new rather than from new to old.

@amone-bit
Copy link
Author

It would be very nice, if SearchIter and MessageIter have not only next() method but also prev() method to iter from behind to front.

@Lonami
Copy link
Owner

Lonami commented Jan 12, 2025

I've discussed this before and also argued it wasn't needed. But I may reconsider if more people need it.

One of my worries is that some offsets seem to not always work, and exposing them here would mean people would expect them to work.

If we introduce max_id, I think we should remove offset_id. And when making the request, continue using offset_id internally unless min_id was also set.

Regarding prev, I'm not sure it's a good idea to provide both at the same time. If anything, there should be some sort of .reversed(), kind of like in Telethon. But we also need to be very careful here, to make sure it works in all edge cases. And it's more maintenance burden.

@aparo
Copy link

aparo commented Apr 6, 2025

I'm +1 for the changes. I implemented a downloader and I also had to implement these methods to be able to efficiently process the ranges. offset_id is not enought if you need to evaluate if you have gap in the messages.

@Lonami
Copy link
Owner

Lonami commented Apr 6, 2025

offset_id is not enought if you need to evaluate if you have gap in the messages

Can you help me understand why? A practical example perhaps.

@aparo
Copy link

aparo commented Apr 6, 2025

I need to check that all the message from a chat are downloaded. I start from offset 0 to x.
I keep track of the downloaded message number.
Because messages can be deleted or there can be gao issues, I don’t have message for example from 123 to 135.
the most perfomant check is to ask messages between 123 and 135 and che if the result is 0 means that I have all the messages. The query will no fetch messages if they are missing so it’s performing. Otherwise I should fetch a range and then skip all of them if not in range.

@Lonami
Copy link
Owner

Lonami commented Apr 7, 2025

Right, so the usecase is "going back" to a "failed gap", and offset_id + limit is not desirable because that might needlessly include messages outside of the expected range.

In that case, I think the implementation would make more sense if it had a single method that took a Range. Then the user can choose all the combinations, and the questions of "is this inclusive or exclusive" would be obvious. The library can then also take care of using offset_id or min_id/max_id to correctly fit the usecase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants