Hierarchical cluster of Java source files in a convenient Maven plugin

patrickdoc/auto-cluster-maven-plugin

auto-cluster-maven-plugin

Extract Java dependency information, run a hierarchical clustering algorithm, organize your code.
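As a rough illustration of that pipeline (not the plugin's actual implementation), the sketch below clusters classes by how similar their dependency sets are. Everything in it is an assumption made for the example: the class names, the use of Jaccard distance, and the naive single-linkage merge loop. The plugin itself relies on ClassGraph for extraction and a fastcluster-style algorithm, which this does not reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Naive single-linkage agglomerative clustering over class dependency sets (illustrative only). */
class DependencyClusterSketch {

    /** Jaccard distance: 1 - |A ∩ B| / |A ∪ B|. Two empty sets count as identical. */
    static double jaccardDistance(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return 1.0 - (double) intersection.size() / union.size();
    }

    /** Single linkage: the distance between the closest pair of members. */
    static double linkage(Set<String> left, Set<String> right, Map<String, Set<String>> deps) {
        double best = Double.MAX_VALUE;
        for (String l : left) {
            for (String r : right) {
                best = Math.min(best, jaccardDistance(deps.get(l), deps.get(r)));
            }
        }
        return best;
    }

    /** Repeatedly merges the two closest clusters and returns the merged clusters in merge order. */
    static List<Set<String>> cluster(Map<String, Set<String>> deps) {
        List<Set<String>> clusters = new ArrayList<>();
        for (String name : new TreeSet<>(deps.keySet())) { // sorted for deterministic ties
            Set<String> single = new TreeSet<>();
            single.add(name);
            clusters.add(single);
        }
        List<Set<String>> merges = new ArrayList<>();
        while (clusters.size() > 1) {
            int bestI = 0;
            int bestJ = 1;
            double bestDist = Double.MAX_VALUE;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = linkage(clusters.get(i), clusters.get(j), deps);
                    if (d < bestDist) {
                        bestDist = d;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            Set<String> merged = new TreeSet<>(clusters.get(bestI));
            merged.addAll(clusters.get(bestJ));
            clusters.remove(bestJ); // remove the higher index first so bestI stays valid
            clusters.remove(bestI);
            clusters.add(merged);
            merges.add(merged);
        }
        return merges;
    }

    /** Hypothetical dependency data: class name -> classes it depends on. */
    static Map<String, Set<String>> sampleDependencies() {
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("Parser", new HashSet<>(Arrays.asList("Token", "Lexer")));
        deps.put("Lexer", new HashSet<>(Arrays.asList("Token")));
        deps.put("HttpClient", new HashSet<>(Arrays.asList("Socket")));
        deps.put("HttpServer", new HashSet<>(Arrays.asList("Socket")));
        return deps;
    }

    public static void main(String[] args) {
        for (Set<String> merge : cluster(sampleDependencies())) {
            System.out.println(merge);
        }
    }
}
```

Each merge can be read as a candidate package: classes merged early share most of their dependencies, while later merges correspond to higher-level folders.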

Usage

To generate a dry-run folder (src/main/auto-cluster-maven-plugin1234...):

mvn io.github.patrickdoc:auto-cluster-maven-plugin:cluster

To delete your existing structure and fully embrace the plugin:

⚠️ WARNING: This will delete your existing files. Please be very careful, and also use version control.

mvn io.github.patrickdoc:auto-cluster-maven-plugin:cluster -DdryRun=false

Why?

I think dependencies are an under-examined aspect of code and we can do a lot more with them.

This plugin has two goals:

  • For individual projects, the goal is to make the internal dependency structure easy to analyze. By putting it front and center, you will hopefully be able to identify and resolve potential structural issues in your code.

  • For the general community, the goal is to provide a language for talking about code organization and style.

I don't like organizing interfaces into an inf package or enums into an enum package, but there is no productive conversation we can have about that.

On the other hand, if you submit a pull request to increase the effect of transitive dependencies on the clustering algorithm, then we can look at concrete examples of how it would work on any codebase. This seems like a much better starting point than "I don't like X".
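To make that kind of proposal concrete, here is a minimal sketch of one way transitive dependencies could be weighted. It is purely hypothetical and not how the plugin currently works: the `decay` parameter, the class names, and the weighted Jaccard distance are all assumptions chosen for illustration. A direct dependency gets weight 1.0, and each extra hop multiplies the weight by `decay`, so raising `decay` increases the influence of transitive dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical weighting of transitive dependencies (illustrative only). */
class TransitiveWeightSketch {

    /**
     * Breadth-first walk from {@code start}. A class first reached after n hops
     * gets weight decay^(n-1), so direct dependencies always weigh 1.0.
     */
    static Map<String, Double> weightedDependencies(
            String start, Map<String, Set<String>> directDeps, double decay) {
        Map<String, Double> weights = new HashMap<>();
        Set<String> visited = new HashSet<>();
        visited.add(start); // guards against cycles back to the start
        List<String> frontier = new ArrayList<>();
        frontier.add(start);
        double weight = 1.0;
        while (!frontier.isEmpty()) {
            List<String> next = new ArrayList<>();
            for (String current : frontier) {
                for (String dep : directDeps.getOrDefault(current, Collections.emptySet())) {
                    if (visited.add(dep)) { // only the shortest path sets the weight
                        weights.put(dep, weight);
                        next.add(dep);
                    }
                }
            }
            frontier = next;
            weight *= decay;
        }
        return weights;
    }

    /** Weighted Jaccard distance: 1 - sum(min) / sum(max) over all dependency names. */
    static double weightedJaccardDistance(Map<String, Double> a, Map<String, Double> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        double minSum = 0.0;
        double maxSum = 0.0;
        for (String key : keys) {
            double x = a.getOrDefault(key, 0.0);
            double y = b.getOrDefault(key, 0.0);
            minSum += Math.min(x, y);
            maxSum += Math.max(x, y);
        }
        return maxSum == 0.0 ? 0.0 : 1.0 - minSum / maxSum;
    }

    public static void main(String[] args) {
        // Hypothetical graph: A -> B -> C, and D -> C.
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("A", Collections.singleton("B"));
        deps.put("B", Collections.singleton("C"));
        deps.put("D", Collections.singleton("C"));
        for (double decay : new double[] {0.0, 0.5}) {
            double distance = weightedJaccardDistance(
                    weightedDependencies("A", deps, decay),
                    weightedDependencies("D", deps, decay));
            System.out.println("decay=" + decay + " distance(A, D)=" + distance);
        }
    }
}
```

In this toy graph, A and D share nothing directly, but both reach C. With `decay` at 0.0 they are maximally distant, and any positive `decay` pulls them closer, which is the kind of before-and-after comparison a pull request could demonstrate on a real codebase.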

For a longer form dev log and discussion, see here.

Example

The ClassGraph repo is as good an example as any of a medium-sized project with non-zero complexity in the code, so I've used it as an example here. Note that this is not a criticism of the existing structure. In fact, I'm quite pleased that the plugin reproduces some of the existing structure.

Running a dry run in the ClassGraph repo (with a parameter to handle multiple base packages):

mvn io.github.patrickdoc:auto-cluster-maven-plugin:cluster -DbasePackages=io.github.classgraph,nonapi.io.github.classgraph

You can see a side-by-side comparison of the original repo:

Original ClassGraph source

and the clustered code:

Clustered ClassGraph source

You can also browse the files in this fork.

Acknowledgements and References

ClassGraph: This project powers the dependency data extraction, but can also do much more

Hierarchical Clustering Primer: A useful introduction to hierarchical clustering

Clustering Algorithm: The base algorithm for clustering, available in Python and R packages as fastcluster
