-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Persistence_matrix] move bar type outside #1083
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we have different stylistic tastes. I tend to favor short:
Persistence_interval(dimension_type dim, event_value_type birth, event_value_type death)
doesn't need any description (it should still appear in the doc), the declaration is self-documenting.- A comment at the beginning that Persistence_interval is usable like a tuple that stores birth, death and dim in this order with std::tuple_element, std::tuple_size, std::get and structured binding, may be clearer than explicitly and redundantly documenting each specialization and overload.
But we don't have to follow my tastes 😉
* | ||
* @brief Overload of `std::get` for @ref Gudhi::persistence_matrix::Persistence_interval. | ||
* | ||
* @tparam I Index of the value to return: 0 for the birth value, 1 for the death value and 2 for the dimension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems confusing that the constructor uses the order dim-birth-death while the access is in order birth-death-dim. Can't we pick one consistent order? Or are you trying to be consistent with 2 existing, incompatible types, and this is some compromise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially did all the matrix, zigzag etc. with the order dim-birth-death in mind. But when I was looking at making it consistent with the output of cohomology when adding the std::get
overloads, I saw that the tuples are in order birth-death-coeff. So I wanted the std::get
to stick closer to that. I thought that it won't be much of a problem as the users using the constructors them-selves would probably not use the std::get
to access the members and therefore not be confused. But I agree, it was mostly laziness on my side, as I didn't wanted to correct all the calls to the bars in the matrix to change the order... But I could do that if you find it really too confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like the new consistent order better (I don't mind that printing is in a different order because the []
around the dimension don't make it look like a tuple).
A drawback is that we are probably less likely to be able to use the short versions of the constructor with 1 or 2 elements 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I used the short version in zigzag and will have to use the long version now. But it is not the end of the world. Or I could go back to have three different constructors.
By experience I realized that the more redundant the documentation is, the better the people are understanding it... Even though I personally also find it often too much. So, I will try to make a mix of the two approaches by adding what you mentioned in the main description. |
@@ -0,0 +1,215 @@ | |||
/* This file is part of the Gudhi Library - https://gudhi.inria.fr/ - which is released under MIT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File should be named Persistence_interval.h
, cf. code convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not the only file which doesn't follow this convention in the module. Is it fine if I correct them all in this PR or should I do another one? Also the example_*
are not respected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't mix mass renaming and actual changes in the same branch, it makes it super hard to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but as the changes of this PR are mostly in just one file, I thought there could be an exception. But no problem in doing another PR for the renamings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as the changes of this PR are mostly in just one file
If you rename things all over the place in other files, it will take a reviewer a long time just to make sure that this file is the only one with non-trivial changes.
(I didn't look at what the renaming was about)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but I don't think there will be other reviewesr than you two, no? But, like I said, it makes sense to make another PR out of it, it was more of a "you never know" question. The new PR is here: #1106.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VincentRouvreau I am not sure anymore if I should really rename persistence_interval.h
to Persistence_interval
. The convention says:
- If a single class or function is provided in a file, its name (with the same letter case) should be used for the file name.
- If a file does not contain a single class, its name should not begin with a capital letter.
The file does not contain only the Persistence_interval
class, but also different global methods (the get etc.) and global structures (tuple_element). So, the second point of the convention should apply, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I never read the convention and don't really care)
I would argue that those are part of the interface of this class. Whether they are free functions, static members, etc is a detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, that is why they are in the same file. But I am not sure that is how I interpret the convention. I saw the two namings a way for the developer to search more easily: "if there is a upper case, don't need to look here for this one method related to it, it has to be somewhere else". Otherwise, I am not sure to see the utility to have two different naming convention for files...
inline friend std::ostream &operator<<(std::ostream &stream, const Persistence_interval &interval) { | ||
stream << "[" << interval.dim << "] "; | ||
stream << interval.birth << ", " << interval.death; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inf
values should be detected and print something else.
using Dimension_type = typename Matrix::dimension_type;
using Pos_index = typename Matrix::pos_index;
using Interval = Gudhi::persistence_matrix::Persistence_interval<Dimension_type, Pos_index>;
Interval bar_default;
std::cout << bar_default << std::endl;
// prints: [-1] 4294967295, 4294967295
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the same remark than Marc, but:
Interval bar(0,1,2);
std::cout << bar << std::endl;
// print: [0] 1, 2
std::cout << std::get<0>(bar) << ", " << std::get<1>(bar) << ", " << std::get<2>(bar) << std::endl;
// print: 1, 2, 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure to understand which remark of Marc you are referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the one about having a consistent order on the elements of the tuple between the constructor and get (and operator<<).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I already explained why it was like that, I was waiting for an answer. So, should I change the order in the constructor to match the std::get
s? But for the print, I like it better to have the dimension first. [0] 1, 2
is much easier to read than 1, 2 [0]
in particular when there are a lot of bars.
Side remark: I am surprised your intervals always come with a dimension. Can you comment on that? |
Until now, all use cases I had for barcodes needed the dimension somehow (to plot them separately for example), so I thought that adding the dimension is a natural thing to do. In particular when getting the dimension from the barcode using the matrix is not always possible (eg. when MatIdx != PosIdx). But it could be an option if you think that there are enough cases where the dimension is not needed and will therefore just take up useless space in the barcode. |
When computing usual persistence, the matrix doesn't need to know anything about dimensions. We can have a single matrix for all dimensions, or a separate matrix for each dimension. |
True, for computing the persistence the dimensions are not needed per se, but it can be useful for some optimization during reduction. Also with it, the matrix can be used a bit like a simplex tree by storing commonly needed information about your filtration the user don't have to manage anymore. But I agree that we could simplify the structure by removing completely the notion of dimension from it (or make it completely optional to keep the optimization of |
To be able to use it also for Zigzag persistence (PR #917) more easily.
I also added the possibility to use
std::get
andstd::tuple_size
for the bars for the compatibility with other modules.std::tuple_size
just doesn't work for structure inheriting from aPersistence_interval
.