Various fixes #274

Merged Jan 10, 2025 · 30 commits (merged into main from olex/various-fixes)
The diff below shows changes from 13 of the 30 commits.

Commits
f089b7b
Do not perform arithmetic operations on i/e options
olejandro Jan 6, 2025
e1cde22
Add comments
olejandro Jan 6, 2025
8cdc973
Move dummy import switch to Config; remove default cost
olejandro Jan 6, 2025
7f6d687
Modify comments
olejandro Jan 6, 2025
8eaee4b
Populate defaults for limtype and timeslice cols in uc_t
olejandro Jan 6, 2025
482d603
Capitalise attribute names after melting
olejandro Jan 6, 2025
db36273
Correct mapping for PRC_REFIT
olejandro Jan 6, 2025
e1f287a
Correct mapping and expand defaults for FLO_FUNC
olejandro Jan 6, 2025
78c1f7d
Adjust more mappings
olejandro Jan 6, 2025
72ca7cf
Treat PEAK(CON) as an alias of NCAP_PKCNT
olejandro Jan 6, 2025
f600add
Mostly fix type errors
olejandro Jan 7, 2025
6cdba03
Escape dots
olejandro Jan 7, 2025
ecef06f
Closes #275
olejandro Jan 7, 2025
55f39a4
Only keep valid process / module combinations in tfm_ava
olejandro Jan 7, 2025
f9f5dbc
Remove based on process name instead of index
olejandro Jan 8, 2025
ad3c3d7
Simplify process_wildcards and the related code
olejandro Jan 8, 2025
be3aeea
Remove callable from _match_wildcards inputs
olejandro Jan 8, 2025
ddba76e
Update tests
olejandro Jan 8, 2025
eba3eb3
Simplify _match_wildcards
olejandro Jan 9, 2025
0e627b9
clean up...
olejandro Jan 9, 2025
db919a3
Try to pass the test
olejandro Jan 9, 2025
f2267e5
Try fixing the test once again
olejandro Jan 9, 2025
2e2b8ab
Mostly more types...
olejandro Jan 9, 2025
5984b50
Use difference
olejandro Jan 9, 2025
cd198df
Use difference instead of "-" throughout the code
olejandro Jan 9, 2025
69d0ee1
Strip white space when processing attributes with tilde
olejandro Jan 10, 2025
320f196
Do not install from PyPI for pytest
siddharth-krishna Jan 10, 2025
d933c1b
Merge branch 'main' into olex/various-fixes
olejandro Jan 10, 2025
3a010b5
Address review comments
olejandro Jan 10, 2025
fa630a8
And one more...
olejandro Jan 10, 2025
32 changes: 16 additions & 16 deletions xl2times/config/times-info.json
@@ -1619,15 +1619,15 @@
         "YEAR",
         "PRC",
         "CG",
-        "CG",
+        "CG2",
         "TS"
       ],
       "mapping": [
         "region",
         "year",
         "process",
         "other_indexes",
-        "other_indexes",
+        "commodity",
         "timeslice"
       ]
     },
@@ -2017,14 +2017,14 @@
       "indexes": [
         "ALL_R",
         "COM",
-        "ALL_R",
-        "COM"
+        "REG2",
+        "COM2"
       ],
       "mapping": [
         "region",
         "commodity",
-        "region",
-        "commodity"
+        "region2",
+        "commodity2"
       ]
     },
     {
@@ -2035,18 +2035,18 @@
         "YEAR",
         "PRC",
         "COM",
-        "ALL_R",
-        "COM",
-        "TS"
+        "REG2",
+        "COM2",
+        "TS2"
       ],
       "mapping": [
         "region",
         "year",
         "process",
         "commodity",
-        "region",
-        "commodity",
-        "timeslice"
+        "region2",
+        "commodity2",
+        "timeslice2"
       ]
     },
     {
@@ -2059,7 +2059,7 @@
         "COM",
         "TS",
         "IE",
-        "COM",
+        "COM2",
         "IO"
       ],
       "mapping": [
@@ -2069,7 +2069,7 @@
         "commodity",
         "timeslice",
         "other_indexes",
-        "commodity",
+        "commodity2",
         "other_indexes"
       ]
     },
@@ -3295,11 +3295,11 @@
       "indexes": [
         "REG",
         "PRC",
-        "PRC"
+        "PRC2"
       ],
       "mapping": [
         "region",
-        "process",
+        "other_indexes",
         "process"
       ]
     },
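Note: the corrections above de-duplicate index names (CG2, REG2/COM2/TS2, PRC2) so that every index position keeps its own mapped column. As a hypothetical illustration of why repeated names are fragile (this is not xl2times code), any name-keyed pairing of indexes and mapped columns silently drops one of the duplicates:

    # Hypothetical illustration: pairing index names with mapped columns by name.
    indexes = ["REG", "YEAR", "PRC", "CG", "CG", "TS"]  # before the fix
    mapping = ["region", "year", "process", "other_indexes", "commodity", "timeslice"]

    # A name-keyed pairing collapses the duplicate "CG" entries:
    by_name = dict(zip(indexes, mapping))
    print(by_name["CG"])  # -> "commodity"; the "other_indexes" pairing is lost

    # With distinct names, every index keeps its own mapped column:
    indexes_fixed = ["REG", "YEAR", "PRC", "CG", "CG2", "TS"]
    by_name_fixed = dict(zip(indexes_fixed, mapping))
    print(by_name_fixed["CG"], by_name_fixed["CG2"])  # -> other_indexes commodity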
15 changes: 15 additions & 0 deletions xl2times/config/veda-attr-defaults.json
@@ -646,6 +646,12 @@
     },
     "FLO_FUNC": {
       "defaults": {
+        "commodity": [
+          "commodity-in",
+          "commodity-out",
+          "commodity-in-aux",
+          "commodity-out-aux"
+        ],
         "ts-level": "ANNUAL"
       }
     },
@@ -883,7 +889,16 @@
       },
       "times-attribute": "NCAP_PKCNT"
     },
+    "PEAK(CON)": {
+      "defaults": {
+        "ts-level": "ANNUAL"
+      },
+      "times-attribute": "NCAP_PKCNT"
+    },
     "PKCNT": {
+      "defaults": {
+        "ts-level": "ANNUAL"
+      },
       "times-attribute": "NCAP_PKCNT"
     },
     "PKCOI": {
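Note: the new PEAK(CON) entry registers that VEDA spelling as an alias of NCAP_PKCNT with an ANNUAL timeslice-level default, matching the PKCNT entry. A rough sketch of how such an entry could be resolved — a plain dictionary lookup, not the actual xl2times implementation:

    # Illustrative resolution of a VEDA attribute alias (not xl2times code).
    veda_attr_defaults = {
        "PEAK(CON)": {"defaults": {"ts-level": "ANNUAL"}, "times-attribute": "NCAP_PKCNT"},
        "PKCNT": {"defaults": {"ts-level": "ANNUAL"}, "times-attribute": "NCAP_PKCNT"},
    }

    def resolve(attr: str) -> tuple[str, dict]:
        """Map a VEDA attribute spelling to its TIMES attribute and defaults."""
        entry = veda_attr_defaults.get(attr, {})
        return entry.get("times-attribute", attr), entry.get("defaults", {})

    print(resolve("PEAK(CON)"))  # -> ('NCAP_PKCNT', {'ts-level': 'ANNUAL'})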
2 changes: 2 additions & 0 deletions xl2times/datatypes.py
@@ -318,6 +318,8 @@ class Config:
     times_sets: dict[str, list[str]]
     # Switch to prevent overwriting of I/E settings in BASE and SubRES
     ie_override_in_syssettings: bool = False
+    # Switch to include dummy imports in the model
+    include_dummy_imports: bool = True

     def __init__(
         self,
91 changes: 41 additions & 50 deletions xl2times/transforms.py
@@ -885,6 +885,7 @@ def fill_in_missing_values(
     def fill_in_missing_values_table(table):
         df = table.dataframe.copy()
         default_values = config.column_default_value.get(table.tag, {})
+        mapping_to_defaults = {"limtype": "limtype", "timeslice": "tslvl"}

         for colname in df.columns:
             # TODO make this more declarative
@@ -900,30 +901,18 @@
                 ismat = df["csets"] == "MAT"
                 df.loc[isna & ismat, colname] = "FX"
                 df.loc[isna & ~ismat, colname] = "LO"
-            elif (
-                colname == "limtype"
-                and (table.tag == Tag.fi_t or table.tag.startswith("~TFM"))
-                and len(df) > 0
-            ):
-                isna = df[colname].isna()
-                for lim in config.veda_attr_defaults["limtype"].keys():
-                    df.loc[
-                        isna
-                        & df["attribute"]
-                        .str.upper()
-                        .isin(config.veda_attr_defaults["limtype"][lim]),
-                        colname,
-                    ] = lim
-            elif colname == "timeslice" and len(df) > 0 and "attribute" in df.columns:
+            elif colname in {"limtype", "timeslice"} and "attribute" in df.columns:
                 isna = df[colname].isna()
-                for timeslice in config.veda_attr_defaults["tslvl"].keys():
-                    df.loc[
-                        isna
-                        & df["attribute"]
-                        .str.upper()
-                        .isin(config.veda_attr_defaults["tslvl"][timeslice]),
-                        colname,
-                    ] = timeslice
+                if any(isna):
+                    key = mapping_to_defaults[colname]
+                    for value in config.veda_attr_defaults[key].keys():
+                        df.loc[
+                            isna
+                            & df["attribute"].isin(
+                                config.veda_attr_defaults[key][value]
+                            ),
+                            colname,
+                        ] = value
             elif (
                 colname == "tslvl" and table.tag == Tag.fi_process
             ):  # or colname == "CTSLvl" or colname == "PeakTS":
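Note: the rewritten branch is table-driven — mapping_to_defaults selects which defaults section (limtype or tslvl) applies to the column being filled, and attributes are matched exactly because they are already uppercased after melting. A self-contained sketch of the pattern, with made-up defaults and attribute names:

    import pandas as pd

    # Made-up defaults in the same shape as config.veda_attr_defaults.
    veda_attr_defaults = {
        "limtype": {"LO": ["ACT_BND"], "UP": ["CAP_BND"]},
        "tslvl": {"ANNUAL": ["NCAP_PKCNT"], "DAYNITE": ["FLO_SHAR"]},
    }
    mapping_to_defaults = {"limtype": "limtype", "timeslice": "tslvl"}

    df = pd.DataFrame(
        {"attribute": ["ACT_BND", "CAP_BND", "NCAP_PKCNT"],
         "limtype": [None, None, None],
         "timeslice": [None, None, None]}
    )

    for colname in ("limtype", "timeslice"):
        isna = df[colname].isna()
        if any(isna):
            key = mapping_to_defaults[colname]
            for value, attrs in veda_attr_defaults[key].items():
                df.loc[isna & df["attribute"].isin(attrs), colname] = value

    print(df)
    # ACT_BND gets limtype LO, CAP_BND gets UP, NCAP_PKCNT gets timeslice ANNUAL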
@@ -1829,7 +1818,6 @@ def generate_dummy_processes(
     config: Config,
     tables: list[EmbeddedXlTable],
     model: TimesModel,
-    include_dummy_processes=True,
 ) -> list[EmbeddedXlTable]:
     """Define dummy processes and specify default cost data for them to ensure that a
     TIMES model can always be solved.
@@ -1838,7 +1826,7 @@
     Significant cost is usually associated with the activity of these processes to
     ensure that they are used as a last resort
     """
-    if include_dummy_processes:
+    if config.include_dummy_imports:
        # TODO: Activity units below are arbitrary. Suggest Veda devs not to have any.
        dummy_processes = [
            ["IMP", "IMPNRGZ", "Dummy Import of NRG", "PJ", "", "NRG"],
@@ -1863,9 +1851,8 @@
         )

         process_data_specs = process_declarations[["process", "description"]].copy()
-        # Use this as default activity cost for dummy processes
-        # TODO: Should this be included in settings instead?
-        process_data_specs["ACTCOST"] = 1111
+        # Provide an empty value in case an upd table is used to provide data
+        process_data_specs["ACTCOST"] = ""

         tables.append(
             EmbeddedXlTable(
@@ -2030,6 +2017,8 @@ def is_year(col_name):
                 value_name="value",
                 ignore_index=False,
             )
+            # Convert the attribute column to uppercase
+            df["attribute"] = df["attribute"].str.upper()
             result.append(
                 replace(table, dataframe=df, tag=Tag(tag.value.split("-")[0]))
             )
@@ -2165,7 +2154,7 @@ def process_transform_availability(
     return result


-def filter_by_pattern(df: pd.DataFrame, pattern: str) -> pd.DataFrame:
+def filter_by_pattern(df: pd.Series, pattern: str) -> pd.Series:
     """Filter dataframe index by a regex pattern."""
     # Duplicates can be created when a process has multiple commodities that match the pattern
     df = df.filter(regex=utils.create_regexp(pattern), axis="index").drop_duplicates()
@@ -2174,18 +2163,18 @@
     return df.drop(exclude)


-def intersect(acc, df):
+def intersect(acc: pd.Series | None, df: pd.Series) -> pd.Series | None:
     if acc is None:
         return df
     return acc.merge(df)


 def get_matching_processes(
-    row: pd.Series, topology: dict[str, DataFrame]
+    row: pd.Series, topology: dict[str, pd.Series]
 ) -> pd.Series | None:
     matching_processes = None
     for col, key in process_map.items():
-        if col in row.index and row[col] not in {None, ""}:
+        if col in row.index and pd.notna(row[col]):
             proc_set = topology[key]
             pattern = row[col].upper()
             filtered = filter_by_pattern(proc_set, pattern)
@@ -2197,10 +2186,12 @@
     return matching_processes


-def get_matching_commodities(row: pd.Series, topology: dict[str, DataFrame]):
+def get_matching_commodities(
+    row: pd.Series, topology: dict[str, pd.Series]
+) -> pd.Series | None:
     matching_commodities = None
     for col, key in commodity_map.items():
-        if col in row.index and row[col] not in {None, ""}:
+        if col in row.index and pd.notna(row[col]):
             matching_commodities = intersect(
                 matching_commodities,
                 filter_by_pattern(topology[key], row[col].upper()),
@@ -2221,7 +2212,7 @@

 def generate_topology_dictionary(
     tables: dict[str, DataFrame], model: TimesModel
-) -> dict[str, DataFrame]:
+) -> dict[str, pd.Series]:
     # We need to be able to fetch processes based on any combination of name, description, set, comm-in, or comm-out
     # So we construct tables whose indices are names, etc. and use pd.filter

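Note: with these signature changes the topology dictionary holds pd.Series of names, each wildcard column filters one series by regex, and intersect() narrows the running result. A toy re-implementation of that flow — set intersection stands in for the real merge, and the column names and topology are made up:

    import pandas as pd

    def intersect(acc, s):
        # Keep only names present in both series of matches.
        if acc is None:
            return s
        return pd.Series(sorted(set(acc) & set(s)))

    # Toy topology: process names indexed by name and by output commodity.
    by_name = pd.Series(["COALPLT", "GASPLT"], index=["COALPLT", "GASPLT"])
    by_cout = pd.Series(["COALPLT", "GASPLT"], index=["ELCC", "ELCG"])

    # A row with wildcards "*PLT" (name) and "ELCC" (comm-out) matches COALPLT only.
    matches = None
    matches = intersect(matches, by_name.filter(regex=r"^.*PLT$"))
    matches = intersect(matches, by_cout.filter(regex=r"^ELCC$"))
    print(matches.to_list())  # -> ['COALPLT']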
@@ -2326,7 +2317,7 @@ def process_wildcards(
 def _match_wildcards(
     df: pd.DataFrame,
     col_map: dict[str, str],
-    dictionary: dict[str, pd.DataFrame],
+    dictionary: dict[str, pd.Series],
     matcher: Callable,
     result_col: str,
     explode: bool = False,
@@ -2362,12 +2353,7 @@
     # match all the wildcards columns against the dictionary names
     matches = unique_filters.apply(lambda row: matcher(row, dictionary), axis=1)

-    # we occasionally get a Dataframe back from the matchers. convert these to Series.
-    matches = (
-        matches.iloc[:, 0].to_list()
-        if isinstance(matches, pd.DataFrame)
-        else matches.to_list()
-    )
+    matches = matches.to_list()
     matches = [
         df.iloc[:, 0].to_list() if df is not None and len(df) != 0 else None
         for df in matches
@@ -2438,11 +2424,11 @@ def query(
     def is_missing(field):
         return pd.isna(field) if not isinstance(field, list) else pd.isna(field).all()

-    qs = []
-
-    for k, v in query_fields.items():
-        if not is_missing(v):
-            qs.append(f"{k} in {v if isinstance(v, list) else [v]}")
+    qs = [
+        f"{k} in {v if isinstance(v, list) else [v]}"
+        for k, v in query_fields.items()
+        if not is_missing(v)
+    ]

     query_str = " and ".join(qs)
     row_idx = table.query(query_str).index
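Note: the comprehension builds the same pandas query string as the old loop — one `<column> in <values>` clause per non-missing field, joined with `and`. For example (field values made up):

    import pandas as pd

    query_fields = {"process": "DEMO1", "commodity": None, "region": ["R1", "R2"]}

    def is_missing(field):
        return pd.isna(field) if not isinstance(field, list) else pd.isna(field).all()

    qs = [
        f"{k} in {v if isinstance(v, list) else [v]}"
        for k, v in query_fields.items()
        if not is_missing(v)
    ]
    print(" and ".join(qs))
    # -> process in ['DEMO1'] and region in ['R1', 'R2']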
@@ -2454,6 +2440,11 @@ def eval_and_update(table: DataFrame, rows_to_update: pd.Index, new_value: str)
     which can be a update formula like `*2.3`.
     """
     if isinstance(new_value, str) and new_value[0] in {"*", "+", "-", "/"}:
+        # Do not perform arithmetic operations on rows with i/e options
+        if "year" in table.columns:
+            rows_to_update = rows_to_update.intersection(
+                table.index[table["year"] != 0]
+            )
         old_values = table.loc[rows_to_update, "value"]
         updated = old_values.astype(float).map(lambda x: eval("x" + new_value))
         table.loc[rows_to_update, "value"] = updated
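Note: rows with year == 0 carry i/e (interpolation/extrapolation) option codes, so the new guard excludes them before an arithmetic update formula such as `*2.3` is evaluated. A condensed, runnable sketch of the behaviour with a made-up table:

    import pandas as pd

    table = pd.DataFrame({"year": [2020, 0, 2030], "value": [10.0, 5.0, 20.0]})
    rows_to_update = pd.Index([0, 1, 2])
    new_value = "*2.3"

    if isinstance(new_value, str) and new_value[0] in {"*", "+", "-", "/"}:
        # Skip i/e option rows, which carry year == 0.
        if "year" in table.columns:
            rows_to_update = rows_to_update.intersection(table.index[table["year"] != 0])
        old_values = table.loc[rows_to_update, "value"]
        table.loc[rows_to_update, "value"] = old_values.astype(float).map(
            lambda x: eval("x" + new_value)
        )

    print(table["value"].to_list())  # -> [23.0, 5.0, 46.0]; the option row is untouched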
@@ -2694,8 +2685,8 @@ def apply_transform_tables(
                 new_rows = table.loc[rows_to_update].copy()
                 # Modify values in all '*2' columns
                 for c, v in row.items():
-                    if c.endswith("2") and v is not None:
-                        new_rows.loc[:, c[:-1]] = v
+                    if str(c).endswith("2") and v is not None:
+                        new_rows.loc[:, str(c)[:-1]] = v
                 # Evaluate 'value' column based on existing values
                 eval_and_update(new_rows, rows_to_update, row["value"])
                 # In case more than one data module is present in the table, select the one with the highest index
@@ -2794,7 +2785,7 @@ def timeslices_table(

     # Ensure that all timeslice levels are uppercase
     timeslices = {
-        col.upper(): list(values.unique())
+        str(col).upper(): list(values.unique())
         for col, values in table.dataframe.items()
     }

6 changes: 4 additions & 2 deletions xl2times/utils.py
@@ -261,8 +261,10 @@ def create_regexp(pattern: str, combined: bool = True) -> str:
     pattern = pattern.replace(",", r"$|^")
     if len(pattern) == 0:
         return r".*"  # matches everything
-    # Handle substite VEDA wildcards with regex patterns
-    for substition in (("*", ".*"), ("?", ".")):
+    # Substitute VEDA wildcards with regex patterns; escape metacharacters.
+    # ("_", ".") and ("[.]", "_") are meant to apply one after another to handle
+    # the usage of "_" equivalent to "?" and "[_]" as literal "_".
+    for substition in ((".", "\\."), ("_", "."), ("[.]", "_"), ("*", ".*"), ("?", ".")):
Review thread on this line:

siddharth-krishna (Collaborator), Jan 10, 2025:
Sorry, I had #275 on my list but I didn't get around to it yet. Thanks for the fix, it's a smart one!

Btw, perhaps this is a good function to start writing unit tests for, as it's gotten reasonably complex/subtle?

Also, you can simplify this with `for old, new in ...` :)
Also, I would use a list of tuples over a tuple of tuples when it gets beyond 3 items, because tuples have some strange hardcoded implementation in Python if I remember correctly. :)

siddharth-krishna (Collaborator):
Ah, I see we have #222 already, I'll put it on my list :)

olejandro (Member, Author):
Thanks! I'll change it to a list before merging.
         old, new = substition
         pattern = pattern.replace(old, new)
     # Do not match substrings
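Note: worked by hand, the ordered substitutions turn VEDA wildcards into a regex, and the ("_", ".") then ("[.]", "_") pair keeps "[_]" as a literal underscore. A small sketch of the chain for a single comma-free pattern (the helper name is illustrative):

    import re

    def veda_pattern_to_regex(pattern: str) -> str:
        # Same substitution chain as create_regexp, minus the comma handling.
        for old, new in ((".", "\\."), ("_", "."), ("[.]", "_"), ("*", ".*"), ("?", ".")):
            pattern = pattern.replace(old, new)
        return pattern

    # "_" behaves like "?" (any single character):
    print(veda_pattern_to_regex("PRC_1*"))    # -> PRC.1.*
    # "[_]" survives as a literal underscore ("[_]" -> "[.]" -> "_"):
    print(veda_pattern_to_regex("PRC[_]1*"))  # -> PRC_1.*

    print(bool(re.fullmatch(veda_pattern_to_regex("PRC[_]1*"), "PRC_10")))  # True
    print(bool(re.fullmatch(veda_pattern_to_regex("PRC_1*"), "PRCX10")))    # True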