Scaling JPK data values correctly #118

derollins · 2025-02-25T11:49:44Z

As described in issue #116, the scaling factors currently applied to the data (z values) for most channels in JPK images are incorrect, and some channels cannot be read.

This pull request creates the function _get_z_scaling(), which extracts the correct scaling factor for each channel and uses these to scale the image.

It additionally corrects a typo in the example notebook.

This fixes #116
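
For readers unfamiliar with per-channel scaling, here is a minimal sketch of the idea, assuming the conversion from stored values to physical units is linear (a multiplier plus an offset). The function name `apply_z_scaling` and the numbers in `channel_scaling` are hypothetical and chosen only for illustration; they are not the code or values from this pull request.

```python
from __future__ import annotations


def apply_z_scaling(
    raw: list[list[int]], multiplier: float, offset: float = 0.0
) -> list[list[float]]:
    """Convert raw stored values to physical units with a linear transform."""
    return [[value * multiplier + offset for value in row] for row in raw]


# Hypothetical per-channel (multiplier, offset) pairs. In a real file these
# would be read from the channel's metadata rather than hard coded.
channel_scaling = {
    "height_trace": (2.0e-9, 1.0e-7),
    "error_trace": (5.0e-4, 0.0),
}

raw_pixels = [[0, 1], [2, 3]]
multiplier, offset = channel_scaling["height_trace"]
scaled = apply_z_scaling(raw_pixels, multiplier, offset)
```

The point of the sketch is that each channel needs its own pair of factors; applying one channel's factors to another gives values that look plausible but are wrong.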

derollins · 2025-02-25T12:02:19Z

I am working on improving the documentation. However, I think the test failure is due to the nature of the bug and that the test itself is incorrect: the height channel used in the test is one that is incorrectly scaled.

ns-rse

Thanks for the Pull Request @derollins

I've left some suggestions on style in place.

I'm wondering if the tags should be parameterised rather than hard coded so that if they change in the future it is easier to update them.

The values I've seen are...

JPK_TAGS = {
  "n_slots": 32896,
  "default": 32897,
  "tag_name": 32912,
  "scaling_type": 32931,
  "scaling_name": 32932,
  "offset_name": 32933,
}

This could be a dictionary passed in, although the default should be tags: dict[str, int] | None = None, as setting the default to a dictionary itself falls foul of a linting error (mutable default argument).

The dictionary could be defined globally and passed as an argument when calling on line 150.
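
A minimal sketch of the `None`-default pattern suggested above. The helper `get_tag` is hypothetical and exists only to show the pattern; the tag values are the ones listed in this comment.

```python
from __future__ import annotations

# Module-level defaults, using the values listed in the comment above.
JPK_TAGS = {
    "n_slots": 32896,
    "default": 32897,
    "tag_name": 32912,
    "scaling_type": 32931,
    "scaling_name": 32932,
    "offset_name": 32933,
}


def get_tag(name: str, tags: dict[str, int] | None = None) -> int:
    """Look up a tag number, falling back to the module-level defaults.

    Hypothetical helper: a None default avoids using a mutable dictionary
    as a default argument, which linters flag.
    """
    tags = JPK_TAGS if tags is None else tags
    return tags[name]
```

Callers that need different tags can pass their own dictionary, for example `get_tag("default", {"default": 1})`, without touching the shared defaults.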

AFMReader/jpk.py

examples/example_01.ipynb

AFMReader/jpk.py

Co-authored-by: Neil Shephard <n.shephard@sheffield.ac.uk>

derollins · 2025-02-25T14:29:28Z

The correct sum for the height_trace channel of the sample image is 219242202.8256843. I verified this by opening the image in Gwyddion, exporting the data, and summing the resulting NumPy array. This agrees with the output of this pull request.

derollins · 2025-02-25T14:42:22Z

> I'm wondering if the tags should be parameterised rather than hard coded so that if they change in the future it is easier to update them.
>
> The values I've seen are...
>
> JPK_TAGS = {
>   "n_slots": 32896,
>   "default": 32897,
>   "tag_name": 32912,
>   "scaling_type": 32931,
>   "scaling_name": 32932,
>   "offset_name": 32933,
> }
>
> This could be a dictionary passed in, although the default should be tags: dict[str, int] | None = None as setting the default to be a dictionary falls foul of a linting error.
>
> The dictionary could be defined globally and passed as an argument when calling on line 150.

Thanks for the suggestion @ns-rse, the tags are certainly the most confusing aspect of reading these files. I'm uncertain of the best approach, as many of the tags are 'relative' and found by adding multiples of 48 depending on the slot. Some are used as strings while others are used as integers.
Do you think the tags should be held and used as strings, as most are now, and converted to integers for manipulation and then back to strings for use? Or is it better to hold them as integers, as in your example, and convert them all to strings when used?
When you suggest the dictionary should be defined globally do you mean in the load_jpk() function rather than _get_z_scaling()?
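
To make the string/integer round trip concrete, here is a small sketch of computing a 'relative' tag. The stride of 48 comes from the comment above; the helper `slot_tag`, the zero-based slot numbering, the choice of 32912 as a per-slot base tag, and the `metadata` contents are all assumptions made for illustration.

```python
from __future__ import annotations

SLOT_STRIDE = 48  # per-slot offset between relative tags, as described above


def slot_tag(base_tag: str | int, slot: int) -> str:
    """Return the tag for a given slot as a string key.

    Hypothetical helper: accepts the base tag as a string or an integer,
    does the arithmetic on integers, and returns a string for lookups.
    """
    return str(int(base_tag) + slot * SLOT_STRIDE)


# Made-up metadata keyed by string tags, purely to show the lookup.
metadata = {"32912": "slot 0 name", "32960": "slot 1 name"}

first = metadata[slot_tag("32912", 0)]
second = metadata[slot_tag(32912, 1)]
```

Keeping the conversion in one helper means the rest of the code can hold tags in whichever type is used most often.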


AFMReader/jpk.py

ns-rse · 2025-02-25T15:58:23Z

Thanks for the prompt work @derollins 👍 Looked through and noticed one small thing, and also that default_slot_number still has _number which we can do away with I think.

> Do you think the tags should be held and used as strings, as most are now, and converted to integers for manipulation, then back to strings for use or is it better to hold them as integers like in your example and convert them all to strings when used?

Always a tricky one when you have to go back and forth, thankfully they're not floats or large integers so precision won't be a problem. I'd let the balance of how they are most frequently used determine their default type.

> When you suggest the dictionary should be defined globally do you mean in the load_jpk() function rather than _get_z_scaling()?

I'd define it near the top of jpk.py for now. I suspect over time we might look to set up a default_config.yaml that holds defaults across all files. I've created #119 to capture that so we don't forget.

AFMReader/jpk.py

derollins · 2025-02-25T16:19:27Z

> Thanks for the prompt work @derollins 👍 Looked through and noticed one small thing, and also that default_slot_number still has _number which we can do away with I think.

Hi @ns-rse I kept _number since default_slot is already used above and default_slot_number is the index that refers to the default slot. I don't think _number is too confusing since it is a number and used as one although it could also be default_slot_index.

> Always a tricky one when you have to go back and forth, thankfully they're not floats or large integers so precision won't be a problem. I'd let the balance of how they are most frequently used determine their default type.

In that case I think storing them as strings makes sense, as that is how they are primarily used.

> I'd define it near the top of the jpk.py for now. I suspect over time we might look to set a default_config.yaml that holds defaults across all files. I've created #119 to capture that so we don't forget.

Great, I'll have a go at this.

ns-rse · 2025-02-26T12:10:27Z

Thanks @derollins I think this is much easier to understand now than with the numbers hard coded.

One final suggestion on naming conventions made in-line, I'd be inclined to use the prefix approach and this will be good to merge.

You may want to look at setting up `pre-commit` locally to get faster ~~frustration~~ feedback than waiting on the CI to pass/fail/make changes.

There are some notes on this in the AFMReader Documentation if inclined to do so.

ns-rse · 2025-02-26T12:50:35Z

That was quick @derollins 👍 😁

You might want to check out the magic of git commit --amend when you realise you need to add something to the last commit you made. 😉

derollins · 2025-02-26T12:51:40Z

> One final suggestion on naming conventions made in-line, I'd be inclined to use the prefix approach and this will be good to merge.

I went with _default_slot, is this appropriate?

> You may want to look at setting up `pre-commit` locally to get faster ~~frustration~~ feedback than waiting on the CI to pass/fail/make changes.
>
> There are some notes on this in the AFMReader Documentation if inclined to do so.

I was using this, but the company security settings on the work laptop I was using were blocking it from running. I might try making a commit on a personal computer before I push in the future.

derollins · 2025-02-26T12:52:22Z

> That was quick @derollins 👍 😁
>
> You might want to check out the magic of git commit --amend when you realise you need to add something to the last commit you made. 😉

Thanks! I will do that. I had just forgotten to pull the changes that pre-commit had made.

ns-rse

Thanks for your patience and for taking the time to provide a solution @derollins

This will be merged and available. I'm not sure when we'll make a new release just yet, but you can install from the main branch to get these changes if you or a colleague start a new env.

derollins and others added 3 commits February 25, 2025 11:27

Fix JPK channel data reading and data scaling and offset

c59c0c2

Fix typo in examples notebook

c6a4fb3

[pre-commit.ci] Fixing issues with pre-commit

ca864d2

ns-rse requested changes Feb 25, 2025


derollins and others added 2 commits February 25, 2025 13:24

Update AFMReader/jpk.py

c96d1cd

Co-authored-by: Neil Shephard <n.shephard@sheffield.ac.uk>

Implementing review suggestions from Neil Shephard

0b820df

derollins added 5 commits February 25, 2025 14:44

added channel_idx parameter documentation

df0ed87

updated test_jpk.py with correct sum for height_trace of the sample image

4e5211b

fixing key error introduced when changing how slots were counted

4e83f1e

clarified documentation

02f71da

trying to fix pre-commit errors

e11ef61

ns-rse mentioned this pull request Feb 25, 2025

Default configuration to parameterise values #119

Open

ns-rse reviewed Feb 25, 2025


ns-rse reviewed Feb 25, 2025


derollins and others added 6 commits February 25, 2025 16:20

trying to placate pre-commit

3ed0e72

trying to placate pre-commit again

b2c7533

[pre-commit.ci] Fixing issues with pre-commit

9579ba7

added JPK_TAGS dictionary for storing tag names

8332ddf

added JPK_TAGS dictionary for storing tag names

9484f53

[pre-commit.ci] Fixing issues with pre-commit

5d2250b

derollins and others added 3 commits February 26, 2025 12:45

use _ prefix to distinguish default_slot variables

30b9c5d

use _ prefix to distinguish default_slot variables

2909bbd

[pre-commit.ci] Fixing issues with pre-commit

1a726a0

ns-rse approved these changes Feb 26, 2025
