Commit 4dfda81: "polishes"
Mayukhdeb committed Feb 25, 2024 (1 parent 8293454)

content/post/2024-02-23-magnitude-constrained-featurevis.md
The primary drawback of feature visualization has been its inability to generate interpretable features in deeper networks. In my own experience, I've seen that feature vis basically stops working once we go past the 3rd resnet block of a resnet18.

This paper fixes this issue by optimizing images in the phase spectrum while keeping the magnitude constant. I have explained the concept of a phase spectrum in this [other post](https://mayukhdeb.github.io/notes/posts/2024-02-24-phase-spectrum.html).

---

There are 2 main approaches for feature visualization:

1. Gradient ascent with a penalty on high frequencies in the Fourier domain, combined with data augmentation.
2. Gradient ascent on a subspace parameterized by a generative model.
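The first approach can be sketched in a few lines. This is a minimal illustration, not the paper's code: `model` and `unit` are placeholder names for any differentiable network and a target unit index.

```python
import torch

def feature_vis_step(img, model, unit, lam=0.01, lr=0.05):
    """One gradient-ascent step that maximizes a unit's activation
    while penalizing high-frequency Fourier components of the image."""
    img = img.detach().requires_grad_(True)
    act = model(img)[0, unit]                    # activation of the target unit
    spectrum = torch.fft.fft2(img)
    # radial frequency grid: larger radius = higher spatial frequency
    h, w = img.shape[-2:]
    fy = torch.fft.fftfreq(h).reshape(-1, 1)
    fx = torch.fft.fftfreq(w).reshape(1, -1)
    radius = (fx ** 2 + fy ** 2).sqrt()
    # weight each frequency's magnitude by its radius, so high
    # frequencies are penalized more strongly
    penalty = (spectrum.abs() * radius).mean()
    loss = -act + lam * penalty
    loss.backward()
    return (img - lr * img.grad).detach()
```

In practice this step would be wrapped in a loop with random augmentations (jitter, scaling) applied to `img` before each forward pass.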

The first method fails on large/deep models. The second method is not very useful since it is dependent on the generative model's own biases. The only way forward is to understand why method 1 fails for deeper models.

Unlike with shallow models like VGG, running feature vis on deeper models yields high-frequency components which are impossible for humans to interpret. To illustrate this, the authors ran feature vis on the logits of a ViT trained on ImageNet and compared its mean power spectrum (left) with the power spectrum of the ImageNet dataset (right).

<img src = "https://github.com/Mayukhdeb/notes/assets/53133634/c2c0133f-4e60-4eea-ace6-cad344176aaf" width = "80%">

This shows that feature vis images contain a much larger amount of high-frequency components. The solution to this problem would be to constrain the power spectrum to lower-frequency components only.
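The comparison above is easy to reproduce. A sketch, assuming the images arrive as a single batched tensor of grayscale images:

```python
import torch

def mean_power_spectrum(images):
    """Mean 2D power spectrum over a batch of (N, H, W) grayscale
    images, shifted so zero frequency sits at the center."""
    spectrum = torch.fft.fft2(images)
    power = spectrum.abs() ** 2
    # average over the batch, then center the zero-frequency bin
    return torch.fft.fftshift(power.mean(dim=0))
```

For natural images, power concentrates near the center (low frequencies); for feature vis images from deep models, it spreads toward the edges.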

Apart from constraining high frequencies, the method is also motivated by psychophysics experiments [1, 2] which have shown that, when viewing images, humans are more sensitive to differences in phase than in magnitude. The authors build an analogous mathematical constraint for feature vis which optimizes only the phase of the image and not the magnitudes of the frequency components.
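The classic demonstration behind this finding is the phase-swap experiment: combine one image's magnitude spectrum with another image's phase spectrum, and the hybrid perceptually resembles the phase donor. A minimal sketch:

```python
import torch

def swap_phase(img_a, img_b):
    """Reconstruct an image from img_a's magnitude spectrum and
    img_b's phase spectrum; the result tends to look like img_b."""
    magnitude_a = torch.fft.fft2(img_a).abs()
    phase_b = torch.fft.fft2(img_b).angle()
    hybrid = magnitude_a * torch.exp(1j * phase_b)
    return torch.fft.ifft2(hybrid).real
```

Swapping an image's phase with its own recovers the image exactly, which is a handy sanity check.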

# Method

First, they break down the Fourier spectrum into a magnitude and a [phase spectrum](https://mayukhdeb.github.io/notes/posts/2024-02-24-phase-spectrum.html). They then optimize the phase spectrum of the image while keeping the magnitude spectrum fixed at an average value computed over a set of natural images.

<img src = "https://github.com/Mayukhdeb/notes/assets/53133634/4419c6be-da5a-474d-95ae-9aaa9a6b82ab" width = "100%">
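This parameterization can be sketched as follows (the names are illustrative, not the authors' code): the image is generated from a learnable phase tensor and a frozen magnitude tensor, so gradients flow only into the phase.

```python
import torch

def image_from_phase(phase, fixed_magnitude):
    """Differentiable image built from a learnable phase spectrum
    and a fixed magnitude spectrum (e.g. the average over a set of
    natural images); only `phase` receives gradients."""
    spectrum = fixed_magnitude * torch.exp(1j * phase)
    return torch.fft.ifft2(spectrum).real
```

In an optimization loop, only `phase` would be handed to the optimizer (e.g. `torch.optim.Adam([phase])`), while `fixed_magnitude` stays constant.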

On a side note, this method also reduces the number of trainable parameters by half.

---

# References

[1] - [Image phase or amplitude? Rapid scene categorization is an amplitude-based process](https://comptes-rendus.academie-sciences.fr/biologies/articles/10.1016/j.crvi.2004.02.006/)

[2] - [On the role of spatial phase and phase correlation in vision, illusion, and cognition](https://www.frontiersin.org/articles/10.3389/fncom.2015.00045/full)