Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix rule extraction to handle constant factors #870

Merged
merged 7 commits into from
Dec 3, 2024
Merged

Conversation

Zapiano
Copy link
Collaborator

@Zapiano Zapiano commented Oct 3, 2024

Note: This branch is based on #863 's branch. Closes #672 .


Updates ADRIA.analysis.cluster_rules so we can check for constant factors (and ignore them) before trying to extract rules. The parameters you need to pass to cluster_rules have changed - with that is one line less of code now.

The documentation still needs to be updated - once this is done, I'll mark this as ready to review.

Example

As an example, let's fix all SeedCriteriaWeights and only a few of the FogCriteriaWeights factors, run the model, cluster and target some cluster:

dom = ADRIA.load_domain("...")

ADRIA.fix_factor!(
    dom;
    # Seed criteria
    seed_out_connectivity=0.9,
    seed_heat_stress=0.8,
    seed_depth=0.7,
    seed_coral_cover=0.5,
    seed_in_connectivity=0.4,
    seed_wave_stress=0.0,
    # Fog criteria
    fog_coral_cover=0.5,
    fog_depth=0.7
)

# Create scenarios
num_scens = 2^8
scens = ADRIA.sample(dom, num_scens)

# Run model / Load ResultSet
rs = ADRIA.run_scenarios(dom, scens, "45")

# Find Time Series Clusters
s_tac = ADRIA.metrics.scenario_total_cover(rs)
num_clusters = 6
clusters = ADRIA.analysis.cluster_scenarios(s_tac, num_clusters)

# Target scenarios
target_clusters = ADRIA.analysis.target_clusters(clusters, s_tac)

Now if we select Intervention factors, everything works (as it would previously, nothing's c hanged here):

max_rules = 10
fields_iv = ADRIA.component_params(rs, [Intervention]).fieldname
rules_iv = ADRIA.analysis.cluster_rules(dom, target_clusters, scens, fields_iv, max_rules)

If we select FogCriteriaWeights, some are constant and some are not. Previously this would lead to an error but now it just filters off all constant factors and everything works fine:

max_rules = 10
fields_fog = ADRIA.component_params(rs, [FogCriteriaWeights]).fieldname
rules_fog = ADRIA.analysis.cluster_rules(
    dom, target_clusters, scens, fields_seeds, max_rules
)

And, finally, if we select SeedCriteriaWeights, because all of them are constant, this will raise an error:

max_rules = 10
fields_seeds = ADRIA.component_params(rs, [SeedCriteriaWeights]).fieldname
rules_seeds = ADRIA.analysis.cluster_rules(
    dom, target_clusters, scens, fields_seeds, max_rules
)

The error:

ERROR: ArgumentError: Factors of interest have to be non constant.
Stacktrace:
 [1] cluster_rules(domain::ADRIADomain, clusters::Vector{…}, scenarios::DataFrames.DataFrame, factors::Vector{…}, max_rules::Int64; seed::Int64, kwargs::@Kwargs{})
   @ ADRIA.analysis c:\Users\pribeiro\AIMS\Code\ADRIA.jl\src\analysis\rule_extraction.jl:148
 [2] cluster_rules(domain::ADRIADomain, clusters::Vector{…}, scenarios::DataFrames.DataFrame, factors::Vector{…}, max_rules::Int64)
   @ ADRIA.analysis c:\Users\pribeiro\AIMS\Code\ADRIA.jl\src\analysis\rule_extraction.jl:134
 [3] top-level scope
   @ c:\Users\pribeiro\AIMS\Code\ADRIA.jl\sandbox\rule_extraction\fixed_factors.jl:77
Some type information was truncated. Use `show(err)` to see complete types.

Below are the scatter plots for Interventions and FogCriteriaWeights rules (the plot functions haven't changed):

Intervention FogCriteriaWeights
rules_iv_scatter rules_fog_scatter

@Zapiano Zapiano added the bug Something isn't working label Oct 3, 2024
@Zapiano Zapiano self-assigned this Oct 3, 2024
@Zapiano Zapiano force-pushed the fix-rule-extraction branch from 41c5040 to 263a125 Compare October 4, 2024 02:22
@Zapiano Zapiano marked this pull request as ready for review October 4, 2024 02:22
@ConnectedSystems
Copy link
Collaborator

ConnectedSystems commented Oct 9, 2024

Haven't looked through the changes yet (will wait for the dependent PR to be merged first), but one minor request:

Change the error message to "Factors of interest cannot be constant".

@Zapiano Zapiano force-pushed the fix-rule-extraction branch from 263a125 to 6ff0cfb Compare November 1, 2024 00:51
@Zapiano
Copy link
Collaborator Author

Zapiano commented Nov 1, 2024

@ConnectedSystems this one is ready for review.

@Zapiano Zapiano force-pushed the fix-rule-extraction branch from 7007655 to 3140b75 Compare November 19, 2024 22:42
@Zapiano
Copy link
Collaborator Author

Zapiano commented Nov 19, 2024

@ConnectedSystems friendly reminder that this is ready for review

function cluster_rules(clusters::Vector{T}, X::DataFrame, max_rules::T;
seed::Int64=123, kwargs...) where {T<:Int64}
function cluster_rules(
result_set::ADRIA.ResultSet,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment - could this be simplified to ResultSet? If not, maybe we should consider exporting the ResultSet type...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that works.

Copy link
Collaborator

@ConnectedSystems ConnectedSystems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor question.

Otherwise this is ready to be merged.

@Zapiano
Copy link
Collaborator Author

Zapiano commented Nov 26, 2024

I still need to rebase this.

@Zapiano Zapiano force-pushed the fix-rule-extraction branch from 4ef0725 to a4c81a4 Compare December 2, 2024 22:45
@Zapiano
Copy link
Collaborator Author

Zapiano commented Dec 2, 2024

@ConnectedSystems this one is rebased and ready for review.

@Zapiano Zapiano merged commit 0f4f9a2 into main Dec 3, 2024
1 check passed
@Zapiano Zapiano deleted the fix-rule-extraction branch December 3, 2024 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ignore constant factors when applying SIRUS
2 participants