-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Simplex tree] adds possibility to different insertion strategies #1176
base: master
Are you sure you want to change the base?
Conversation
…b.com:hschreiber/gudhi-devel into simplex_tree_filtration_value_generalization
…version of initialize_filtration + change is_floating_point to has_quit_NaN for more generality + additional unit tests
Co-authored-by: Marc Glisse <marc.glisse@inria.fr>
….com:hschreiber/gudhi-devel into simplex_tree_filtration_value_generalization
* - if \f$ \sigma \f$ existed already, then it and all of its potentially already inserted cofaces are pushed to the | ||
* intersection of their old value and \f$ f \f$. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine an algorithm changing its mind and wanting to insert a simplex at higher value than before. Which means that all its cofaces potentially also have to be pushed up, for the filtration to remain valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can make sense, but in my mind that's not an insertion anymore. If anything, it is closer to a removal (and the code and complexity are probably more similar to a removal as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then, even with other strategies, inserting an existing simplex is not really an insertion. The old version also updated the filtration values of everyone, just the other way around.
But yes, the complexity is going towards a removal, as it needs to iterate through the cofaces (just not always to the very end, as it can stop whenever it is not necessary anymore). But that is to expect if you want this type of strategy, I don't think the user would be surprised.
But if you really don't like it, I can remove it. I don't have a use for it for now, it just didn't hurt to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe to explain my perspective: if you think of a filtered simplicial complex as a set of simplicial complexes (indexed by the filtration value, and with an inclusion when filtration values are comparable), lowering the filtration value of a simplex means inserting it in some of the complexes where it was not yet present, while increasing the filtration value means removing it from some of the complexes.
I am perfectly happy with such functionality being added, but I'd much rather give it a different name than insert
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From an algebraic point of view, I agree that it is a removal and not an insertion. But from a data structural point of view, both strategies do the same, just on different nodes. As a compromise, I would propose to use Update_strategy
instead of Insertion_strategy
. What do you think? Because I am not convinced that it is a good idea to add a new method for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should ask for more opinions. It seems strange to me to repurpose a function called insert_simplex_and_subfaces to do something that
- corresponds to a removal
- increases the filtration value of a simplex and its cofaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is still a matter of point of view: I see the simplex tree as a structure which stores the simplices of a complex and when store_filtration
is enabled, it additionally stores the appearance time of the simplices when the complex is considered filtered. From this point of view, insert
is already wrong when a simplex is already inserted, as all the method is doing in that case, is updating the stored appearance time, not inserting it in the simplex tree. But if you allow the method to do so, updating the appearance time of cofaces seems fine too, as the idea the same. And I am pretty sure that a user with the same point of view will not be confused by it.
Things are different when you look at it as an algebraic object of course. But it is difficult to say which point of view is the more intuitive here.
But well, from the discussion below, this will probably not be a problem anymore anyway.
* - if \f$ \sigma \f$ was not inserted yet, then \f$ \sigma \f$ and all its non existing faces are inserted at the | ||
* first possible filtration value with minimum filtration value \f$ f \f$. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit strange that it still inserts missing faces: this means that a missing face and a face with filtration value +infinity have very different consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure to understand the comment? If you want to insert a simplex and some of its faces are missing, should insert_simplex_and_subfaces
not insert them? I thought that was what the purpose of the method (contrary to insert_simplex
). The strategies only apply on that method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a function called insert_*_subfaces
should insert subfaces, but the different behavior between missing faces and faces with a high filtration value feels inconsistent (I would understand better if the filtration value was increased to +inf when a face is missing, with the idea that a missing face is the same as a face with infinitely large filtration value).
How does David use the function exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one face is at +inf, then all non existing cofaces of that face (which are faces of the given simplex) will be inserted at +inf, the given filtration value is just a minimum in that case (that is what I meant with "first possible filtration value with minimum filtration value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a simple example: say you have a triangle {0,1,2}. The vertices {0}, {1} and {2} are already inserted at filtration value 0. Same for edge {0,1} that is inserted at value 5. Now, we use insert_simplex_and_subfaces({0,1,2}, 4, HIGHEST)
. This will insert edge {0,2} and {1,2} at filtration value 4, but the triangle {0,1,2} at filtration value 5.
Before, insert_simplex_and_subfaces({0,1,2}, 4)
(equivalent now to insert_simplex_and_subfaces({0,1,2}, 4, LOWEST)
) would have done the same for {0,2} and {1,2}, but would have pushed down {0,1} to 4 et inserted {0,1,2} also at 4.
And finally, insert_simplex_and_subfaces({0,1,2}, 4, FIRST_POSSIBLE)
is inserting {0,2} and {1,2} at value 0 and {0,1,2} at value 5.
If you have now "missing simplices" in the sense that they are included but at value +inf, then HIGHEST
will insert all their future cofaces also at infinity, while LOWEST
will force their appearances by changing their filtration value to the given one while inserting the cofaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a simple example: say you have a triangle {0,1,2}. The vertices {0}, {1} and {2} are already inserted at filtration value 0. Same for edge {0,1} that is inserted at value 5. Now, we use
insert_simplex_and_subfaces({0,1,2}, 4, HIGHEST)
. This will insert edge {0,2} and {1,2} at filtration value 4, but the triangle {0,1,2} at filtration value 5.
Would a 4th strategy that inserts 02, 12 and 012 all at filtration value 5 make sense?
Before,
insert_simplex_and_subfaces({0,1,2}, 4)
(equivalent now toinsert_simplex_and_subfaces({0,1,2}, 4, LOWEST)
) would have done the same for {0,2} and {1,2}, but would have pushed down {0,1} to 4 et inserted {0,1,2} also at 4.And finally,
insert_simplex_and_subfaces({0,1,2}, 4, FIRST_POSSIBLE)
is inserting {0,2} and {1,2} at value 0 and {0,1,2} at value 5.
Is insert(x,*,first_possible)
equivalent to insert(x,-inf,lowest)
?
I don't know if the names highest/lowest properly convey the main difference: lowest may modify the filtration value of existing simplices (so it can really insert it at the time requested), while the others all keep them fixed. Ah, but that property only exists in my head, since you have cases were insert increases the filtration value of existing cofaces...
If you have now "missing simplices" in the sense that they are included but at value +inf, then
HIGHEST
will insert all their future cofaces also at infinity, whileLOWEST
will force their appearances by changing their filtration value to the given one while inserting the cofaces.
I find it a bit confusing that, without 02, I get 012 at time 5, while with 02 at time +inf (supposedly equivalent to not having 02?), I get 012 at time +inf. But maybe that's still useful. Or maybe it could yield a 5th strategy.
I'd still be interested in a reminder of what cases David really cared about when he came up with a different strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a simple example: say you have a triangle {0,1,2}. The vertices {0}, {1} and {2} are already inserted at filtration value 0. Same for edge {0,1} that is inserted at value 5. Now, we use
insert_simplex_and_subfaces({0,1,2}, 4, HIGHEST)
. This will insert edge {0,2} and {1,2} at filtration value 4, but the triangle {0,1,2} at filtration value 5.Would a 4th strategy that inserts 02, 12 and 012 all at filtration value 5 make sense?
It would make sense as it is not wrong. It would just even more disregard the input value than before. But as an additional strategy, it is valid. Complexity wise, I don't know if it would be better, I would have to verify again how the traversal in the recursion is done.
Before,
insert_simplex_and_subfaces({0,1,2}, 4)
(equivalent now toinsert_simplex_and_subfaces({0,1,2}, 4, LOWEST)
) would have done the same for {0,2} and {1,2}, but would have pushed down {0,1} to 4 et inserted {0,1,2} also at 4.
And finally,insert_simplex_and_subfaces({0,1,2}, 4, FIRST_POSSIBLE)
is inserting {0,2} and {1,2} at value 0 and {0,1,2} at value 5.Is
insert(x,*,first_possible)
equivalent toinsert(x,-inf,lowest)
?
No, it is equivalent to insert(x,-inf,highest)
. But now that I see it, I could factorise both highest and first_possible.
I don't know if the names highest/lowest properly convey the main difference: lowest may modify the filtration value of existing simplices (so it can really insert it at the time requested), while the others all keep them fixed. Ah, but that property only exists in my head, since you have cases were insert increases the filtration value of existing cofaces...
The idea of highest/lowest was that one strategy inserts at highest choice (between given and potentially already existing value) and the other at lowest choice and then updates the filtration such that it remains valid. It also highlight the duality between both strategies, as one is a bit the "co" version of the other. But I have nothing against a name change.
If you have now "missing simplices" in the sense that they are included but at value +inf, then
HIGHEST
will insert all their future cofaces also at infinity, whileLOWEST
will force their appearances by changing their filtration value to the given one while inserting the cofaces.I find it a bit confusing that, without 02, I get 012 at time 5, while with 02 at time +inf (supposedly equivalent to not having 02?), I get 012 at time +inf. But maybe that's still useful. Or maybe it could yield a 5th strategy.
I don't see "not yet inserted faces" and "missing faces" (at +inf) as something equivalent. A missing face is purposely missing, while a not existing face is a face where you still have to indicate where it should go (at +inf for example).
Perhaps, it would be a good idea to describe me the strategies you would like to have and then I see, if I can replace one of my proposition with yours or if I can add this strategy to the existing ones?
I'd still be interested in a reminder of what cases David really cared about when he came up with a different strategy.
What mattered to David, is that if he constructed a filtration in the simplex tree and later wanted to add a simplex, but without really caring about where, he didn't wanted insert_simplex_and_subfaces(s)
to flat partially the filtration down to 0, but to insert the simplex at the first position the already existing filtration permits it. Then, in python, he would potentially modify some filtration values by hand. So, basically the FIRST_POSSIBLE
strategy. The idea behind HIGHEST
was just to give the possibility to give a lower bound on where to insert the simplex. The subcase of HIGHEST
where the simplex already exists was just a natural extension, as this case has to be covered (even though not useful for multipers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a 4th strategy that inserts 02, 12 and 012 all at filtration value 5 make sense?
It would make sense as it is not wrong. It would just even more disregard the input value than before.
I think it just does things in a slightly different order. It first asks: what is the earliest time I can insert 012? And then does insert_simplex_and_subfaces(simplex_012, that_time)
. Whereas your version recursively inserts the subfaces with the same strategy.
I mostly suggested it because I have trouble figuring out what strategies are likely to be useful.
Complexity wise, I don't know if it would be better,
Mine should be a bit slower, the simplest way I can see to implement it is to follow yours, inserting all the new Simplex_handle
in a vector, and updating their filtration value at the end once we know the final value for the main simplex.
I don't see "not yet inserted faces" and "missing faces" (at +inf) as something equivalent. A missing face is purposely missing, while a not existing face is a face where you still have to indicate where it should go (at +inf for example).
A philosophical difference of opinions...
Perhaps, it would be a good idea to describe me the strategies you would like to have and then I see, if I can replace one of my proposition with yours or if I can add this strategy to the existing ones?
I don't have particular strategies I want. We could easily add a dozen strategies. I am just afraid that at most a couple of them would be useful and we would confuse users into picking the wrong one, or at least waste time trying to figure out what to use. But maybe that's still what we should do.
What mattered to David, is that if he constructed a filtration in the simplex tree and later wanted to add a simplex, but without really caring about where, he didn't wanted
insert_simplex_and_subfaces(s)
to flat partially the filtration down to 0,
It is true that, when we don't explicitly specify a filtration value, this behavior can be annoying. I guess it wouldn't have been a problem if we had used +inf as the default filtration value (although David favors -inf...).
but to insert the simplex at the first position the already existing filtration permits it.
I guess that makes sense, although it isn't completely obvious in this case if we want each new subface to appear also as early as possible (equivalent to calling the function on each missing subface first) or if it is better to have everything appear at the same time (it is one insertion, turning it into multiple insertions at multiple times could be unexpected).
Then, in python, he would potentially modify some filtration values by hand.
I am unsure how that part works. One possibility, if you are going to overwrite the filtration values anyway, would be to insert at +inf. Then at least you can identify the new subfaces as those with +inf filtration value. Or you could be satisfied with the values that were computed for the subfaces and not modify them. But I suspect that maybe he was only using this function when all the subfaces were already present?
The idea behind
HIGHEST
was just to give the possibility to give a lower bound on where to insert the simplex.
Makes sense.
The subcase of
HIGHEST
where the simplex already exists was just a natural extension, as this case has to be covered (even though not useful for multipers).
I see... The way HIGHEST is defined in the normal case, if we try to see what should happen if the simplex is already there with a lower filtration value, the options seem to be
- do nothing (this violates the idea that the value should be a lower bound, but we could easily say that this lower bound only applies to new simplices, we don't increase the filtration value of faces that are already there after all)
- increase the filtration value of the simplex (and that of the cofaces if necessary)
- refuse to do it (throw an exception or something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have particular strategies I want. We could easily add a dozen strategies. I am just afraid that at most a couple of them would be useful and we would confuse users into picking the wrong one, or at least waste time trying to figure out what to use. But maybe that's still what we should do.
I agree, we shouldn't try to write all possible strategies here. We should choose among the strategies we think are most commonly useful. That is what I tried to do with those three options, as I now that they are useful for multipers (except for the modifications of the cofaces, as this method is never used on already existing simplices there). So let me reformulate: what do you think would be useful strategies to replace or complement the ones I suggested?
I guess that makes sense, although it isn't completely obvious in this case if we want each new subface to appear also as early as possible (equivalent to calling the function on each missing subface first) or if it is better to have everything appear at the same time (it is one insertion, turning it into multiple insertions at multiple times could be unexpected).
I know that David needed the first version, but I agree, the second one could also be interesting. I cannot say what would be better: implement both or chose one.
I am unsure how that part works. One possibility, if you are going to overwrite the filtration values anyway, would be to insert at +inf. Then at least you can identify the new subfaces as those with +inf filtration value. Or you could be satisfied with the values that were computed for the subfaces and not modify them. But I suspect that maybe he was only using this function when all the subfaces were already present?
I admit, I don't really remember the details, in particular because he changed his strategy several times. I just remember that for a while, he modified the method such that it just ignored the filtration values and then he overwrote everything in Python. Later, he asked me to modify the insertion method so that he can avoid having to do that in most cases.
I see... The way HIGHEST is defined in the normal case, if we try to see what should happen if the simplex is already there with a lower filtration value, the options seem to be
1. do nothing (this violates the idea that the value should be a lower bound, but we could easily say that this lower bound only applies to **new** simplices, we don't increase the filtration value of faces that are already there after all) 2. increase the filtration value of the simplex (and that of the cofaces if necessary) 3. refuse to do it (throw an exception or something)
Yes. I went for 2, but it does not have to be that way. If you think that this solution violates the idea of insertion, we could just go for 1, as the simplex would already be considered inserted in the filtered complex at filtration value "lower bound". Throwing would somehow go against the current strategy, where it does not throw when a simplex is already inserted in a lower complex than given by the argument filtration value (which would also be a removal otherwise). It just does nothing.
* - if \f$ \sigma \f$ was not inserted yet, then \f$ \sigma \f$ and all its non existing faces are inserted at the | ||
* first possible filtration value with minimum filtration value \f$ f \f$. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a 4th strategy that inserts 02, 12 and 012 all at filtration value 5 make sense?
It would make sense as it is not wrong. It would just even more disregard the input value than before.
I think it just does things in a slightly different order. It first asks: what is the earliest time I can insert 012? And then does insert_simplex_and_subfaces(simplex_012, that_time)
. Whereas your version recursively inserts the subfaces with the same strategy.
I mostly suggested it because I have trouble figuring out what strategies are likely to be useful.
Complexity wise, I don't know if it would be better,
Mine should be a bit slower, the simplest way I can see to implement it is to follow yours, inserting all the new Simplex_handle
in a vector, and updating their filtration value at the end once we know the final value for the main simplex.
I don't see "not yet inserted faces" and "missing faces" (at +inf) as something equivalent. A missing face is purposely missing, while a not existing face is a face where you still have to indicate where it should go (at +inf for example).
A philosophical difference of opinions...
Perhaps, it would be a good idea to describe me the strategies you would like to have and then I see, if I can replace one of my proposition with yours or if I can add this strategy to the existing ones?
I don't have particular strategies I want. We could easily add a dozen strategies. I am just afraid that at most a couple of them would be useful and we would confuse users into picking the wrong one, or at least waste time trying to figure out what to use. But maybe that's still what we should do.
What mattered to David, is that if he constructed a filtration in the simplex tree and later wanted to add a simplex, but without really caring about where, he didn't wanted
insert_simplex_and_subfaces(s)
to flat partially the filtration down to 0,
It is true that, when we don't explicitly specify a filtration value, this behavior can be annoying. I guess it wouldn't have been a problem if we had used +inf as the default filtration value (although David favors -inf...).
but to insert the simplex at the first position the already existing filtration permits it.
I guess that makes sense, although it isn't completely obvious in this case if we want each new subface to appear also as early as possible (equivalent to calling the function on each missing subface first) or if it is better to have everything appear at the same time (it is one insertion, turning it into multiple insertions at multiple times could be unexpected).
Then, in python, he would potentially modify some filtration values by hand.
I am unsure how that part works. One possibility, if you are going to overwrite the filtration values anyway, would be to insert at +inf. Then at least you can identify the new subfaces as those with +inf filtration value. Or you could be satisfied with the values that were computed for the subfaces and not modify them. But I suspect that maybe he was only using this function when all the subfaces were already present?
The idea behind
HIGHEST
was just to give the possibility to give a lower bound on where to insert the simplex.
Makes sense.
The subcase of
HIGHEST
where the simplex already exists was just a natural extension, as this case has to be covered (even though not useful for multipers).
I see... The way HIGHEST is defined in the normal case, if we try to see what should happen if the simplex is already there with a lower filtration value, the options seem to be
- do nothing (this violates the idea that the value should be a lower bound, but we could easily say that this lower bound only applies to new simplices, we don't increase the filtration value of faces that are already there after all)
- increase the filtration value of the simplex (and that of the cofaces if necessary)
- refuse to do it (throw an exception or something)
Adds different strategies for
insert_simplex_and_subfaces
to handle the filtration values of the simplices inserted.Includes PR #1122 so it should be easier to review once the latter is merged.
Necessary for the multi-simplex tree from #976.