-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
column names hardcoded in fault modeler #66
Comments
I start with replying to the second issue. |
About the first comment, I fully agree that the code should be capable to handle non-standard header information. Indeed, I've also included the param_map variable in the class for parsing the geojson files (fault_database.import_from_geojson), and I will add it as well to the generic interfaces (build_fault_model and build_model_from_db ) as external argument or (as you suggested) as an ini file. |
I have applied all of the above in the new version of |
For the parameter mapping (not the defaults), I think it is fine to put the parameter mapping/translation in an interface rather than in the library code. When I wrote the library code, I had to build a first version that incorporated this functionality directly into the library because there was no interface yet. I also wanted to make the interface as simple as possible and move all of the complexity into the library. But either way is fine with me I don't think it's computationally less efficient--this is exactly the same as looping through all of the parameters for all of the faults when the translation is done. Classes in Python are just big, complicated dicts, so each time you access an attribute of a class you are doing dictionary lookups through a few layered dicts anyways. Regardless it's a very quick process (I just timed a dict lookup at 36 ns, and an attribute lookup at 40 ns] so it really doesn't matter from an efficiency standpoint how we do it. However we need to make sure that this is only done in one function somewhere. You mentioned having several interfaces in a comment above, and obviously you don't want to have to maintain a version of this code in each interface. This is one reason to keep it in the library, but if we are smart about how the interface code is written it's not a problem to have it there. |
As I wrote the code, it already did all of this.
This is incorrect. These are project default values and are only overridden for specific faults if those keys exist in the For more information, see how the If there are problems with this, it may because you have changed the |
As it was before before my changes, passing a default dictionary with only one parameter to be updated was causing the whole hardcoded dictionary to be overwritten, losing all the other parameters and creating an error. With my modification, only the user-provided key is replaced, while all other defaults setting are preserved. I suggest you to use different names for the hardcoded dictionary and the function argument. In my modification, I have changed the name of the argument, but you can alternatively change the name of the hardcoded dictionary, to prevent overwriting, and then update. |
This is also incorrect. In the script I provided that demonstrated how to use the library, both the This may have been unclear in the example script because they were imported with everything else (using I apologize for not providing enough documentation and writing code that was unclear. |
This issue pertains to work in progress on the fault modeler, in the 'fault_conversion_utils' branch that has not yet been merged into master.
@klunk386 @julgp @mmpagani please review:
The primary problem is that non-default geojson column names are being hardcoded into the
fault_source_modeler.py
script, in theparam_map
dictionary1.The new library was written so that project-specific settings that are not default (things such as column/attribute names) are not hardcoded in, but are specified in a project configuration file. This was done because attribute names can and will change between projects. There is no reason to force everyone who uses this library to have a specific set of attribute names, or to prevent us from changing the names as the fault database project evolves.
By hardcoding these values in the
fault_source_modeler.py
script, this repeats the old problem with brittle code, after I re-wrote the library to fix this problem. I don't understand why it is necessary or desirable.As a side note:
My opinion is that the
fault_source_modeler.py
should take as its primary argument a project configuration (.ini
) file, that contains all non-default parameters (not only column/attribute names, but any non-default parameters including scaling relationships, seismogenic depths, rupture mesh spacings, temporal occurrence models, or anything else. The GIS input and XML output files would also be specified here. This prevents all of these parameters from having to be a) hardcoded into the fault modeler each time, or b) having to be passed as command line arguments each time. It is very hard to keep track of all of these parameters, especially ones only passed on the command line, because you have to search through your bash history to find out what you did. Good luck with that after a few months or a few years! Also, version control can be used so that the parameters are tracked with the output file. This is impossible with the current system.Also, It is no more work (it is less work, in fact) to write them once into a
.ini
file than pass them on the command line.An example .ini file is below
Then, the
fault_source_modeler.py
file reads this file and does all of the necessary work based on the parameters here. I have provided a preliminary working version of this script to @klunk386 and @julgp as an example.Then, the XML file is built with
fault_source_modeler.py config.ini
instead of passing all of the arguments as command line arguments. I know this is different than was done in the past but it seems to me to be simpler.Just to be clear, Values only need to be written for non-default parameters, not for all of the parameters needed for a project other than the input gis file name.
If we want to discuss this further (and we should, as a group, come to a decision) I can make a separate issue for this topic.
Footnotes
These column names are found in the early versions of the fault database, and are inherited from the old Faulted Earth project. They are not used in the current Global Active Faults database and are not in the current CCARA faults, for example. However, because the old scripts used to make the .xml files were brittle and had these values hardcoded, I did not update those column names in the geojson files we were using for hazard modeling because this would break the scripts. However, these column names have been simplified to make them less confusing. ↩
The text was updated successfully, but these errors were encountered: