OME-ZARR support for BigStitcher-Spark #40

Merged on Feb 5, 2025 (42 commits)
Commits
2a7e2a7
generalize pixeltype
StephanPreibisch Jan 27, 2025
7d86ea2
added first code for creating fusion container
StephanPreibisch Jan 27, 2025
de96b66
Merge branch 'main' into omezarr
StephanPreibisch Jan 27, 2025
87b7fe9
working on downsampling and other parameters
StephanPreibisch Jan 27, 2025
f1bdc18
Merge branch 'omezarr' of github.com:PreibischLab/BigStitcher-Spark i…
StephanPreibisch Jan 27, 2025
155db7c
fix downsampling string parsing
StephanPreibisch Jan 27, 2025
3f0b8aa
update mvr
StephanPreibisch Jan 27, 2025
0714354
progress towards creating fused dataset containers
StephanPreibisch Jan 27, 2025
44479cd
directly using mvr
StephanPreibisch Jan 28, 2025
ae86317
almost finished first version of dataset creation
StephanPreibisch Jan 28, 2025
149c1dc
adding saving of XML
StephanPreibisch Jan 28, 2025
339d81f
added metadata storage
StephanPreibisch Jan 28, 2025
36107b7
fix bug in array indexing
StephanPreibisch Jan 29, 2025
04cb2a9
add more parameters to the JSON
StephanPreibisch Jan 29, 2025
fc65dce
add one more abstraction layer for cmdline
StephanPreibisch Jan 29, 2025
a807768
start working to change fusion to use created containers
StephanPreibisch Jan 29, 2025
ce0c82b
transferring more metadata to fusion
StephanPreibisch Jan 29, 2025
04368d2
add min & max intensity to fusion definition
StephanPreibisch Jan 30, 2025
7a82298
getting closer, first prototype of fusion
StephanPreibisch Jan 30, 2025
f210d6a
fix a few Spark-related bugs
StephanPreibisch Jan 30, 2025
08513c7
fix more little bugs
StephanPreibisch Jan 30, 2025
c269ff4
added prefetching
StephanPreibisch Jan 31, 2025
8517747
clean up old code
StephanPreibisch Jan 31, 2025
02ab794
start putting masks back in
StephanPreibisch Jan 31, 2025
b3dc08c
added missing class
StephanPreibisch Jan 31, 2025
509e390
first potentially functioning mask creation code
StephanPreibisch Jan 31, 2025
a581027
fix a few bugs in mask creation, works now
StephanPreibisch Jan 31, 2025
293b9f7
add possibility to run partial fusions/mask creations
StephanPreibisch Jan 31, 2025
8cef690
fix text
StephanPreibisch Jan 31, 2025
61f68a3
add create fusion container
StephanPreibisch Jan 31, 2025
8b61518
fix bug in fusion, converter was null
StephanPreibisch Feb 2, 2025
25c542f
Update README.md
StephanPreibisch Feb 3, 2025
6bd6418
Update README.md
StephanPreibisch Feb 3, 2025
1573975
using multiview-reconstruction 5.0.4
StephanPreibisch Feb 3, 2025
65285e0
Update README.md
StephanPreibisch Feb 4, 2025
dfdd3dd
Update README.md
StephanPreibisch Feb 4, 2025
fbd8be4
Update README.md
StephanPreibisch Feb 4, 2025
abbbc6f
Update README.md
StephanPreibisch Feb 4, 2025
de9d441
Update README.md
StephanPreibisch Feb 4, 2025
9c13775
Update README.md
StephanPreibisch Feb 4, 2025
76464f1
Merge branch 'main' into omezarr
StephanPreibisch Feb 4, 2025
15ddddb
Merge branch 'omezarr' of github.com:PreibischLab/BigStitcher-Spark i…
StephanPreibisch Feb 4, 2025
23 changes: 21 additions & 2 deletions README.md
@@ -38,6 +38,8 @@ Additonally there are some utility methods:
* [Match Interest Points](#ip-match)
* [Solver](#solver)
* [Affine Fusion](#affine-fusion)
* [Create Fusion Container](#create-fusion-container)
* [Run Affine Fusion](#run-affine-fusion)
* [Non-Rigid Fusion](#nonrigid-fusion)

<img align="left" src="https://github.com/JaneliaSciComp/BigStitcher-Spark/blob/main/src/main/resources/bs-spark.png" alt="Overview of the BigStitcher-Spark pipeline">
@@ -73,7 +75,7 @@ Please ask your sysadmin for help how to run it on your **cluster**, below are h

`mvn clean package -P fatjar` builds `target/BigStitcher-Spark-0.0.1-SNAPSHOT.jar` for distribution.

For running the fatjar on the **cloud** check out services such as [Amazon EMR](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark.html). An implementations of image readers and writers that support cloud storage can be found [here](https://github.com/bigdataviewer/bigdataviewer-omezarr). Note that running it on the cloud is an ongoing effort with [@kgabor](https://github.com/kgabor), [@tpietzsch](https://github.com/tpietzsch) and the AWS team that currently works as a prototype but is further being optimized. We will provide an updated documentation in due time. Note that some modules support prefetching `--prefetch`, which is important for cloud execution due to its delays as it pre-loads all image blocks in parallel before processing.
BigStitcher-Spark is now fully "cloud-native". For running the fatjar on the **cloud** check out services such as [Amazon EMR](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark.html) and [Google Serverless Batches](https://cloud.google.com/dataproc-serverless/docs/quickstarts/spark-batch). Note that some modules support prefetching `--prefetch`, which is important for cloud execution due to its delays as it pre-loads all image blocks in parallel before processing. We will soon add detailed information on how to run the examples on both cloud platforms (it works - if you need help now, please contact @StephanPreibisch).

## Example Datasets<a name="examples">

@@ -217,7 +219,24 @@ When using interestpoints (for timeseries alignment with grouping all views of a

### Affine Fusion<a name="affine-fusion">

Performs **fusion using affine transformation models** computed by the [solve](#solver) (including translations) that are stored in the XML (*Warning: not tested on 2D*). By default the affine fusion will create an output image that contains all transformed input views/images. While this is good in some cases such as tiled stitching tasks, the output volume can be unnecessarily large for e.g. multi-view datasets. Thus, prior to running the fusion it might be useful to [**define a custom bounding box**](https://imagej.net/plugins/bigstitcher/boundingbox) in BigStitcher.
Performs **fusion using affine transformation models** computed by the [solve](#solver) (also supports translations, rigid, interpolated models) that are stored in the XML (*Warning: not tested on 2D*). By default the affine fusion will create an output image that encompasses all transformed input views/images. While this is good in some cases such as tiled stitching tasks, the output volume can be unnecessarily large for e.g. multi-view datasets. Thus, prior to running the fusion it might be useful to [**define a custom bounding box**](https://imagej.net/plugins/bigstitcher/boundingbox) in BigStitcher.

#### Create Fusion Container<a name="create-fusion-container">

The first step in the fusion is to create an empty output container that also contains all the metadata and still empty multi-resolution pyramids. By default an **OME-ZARR** is created, **N5** and **HDF5** are also supported, but HDF5 only if Spark is not run in a distributed fashion but multi-threaded on a local computer. A typical call for creating an output container for e.g. the **stitching** dataset is (e.g. [this dataset](https://drive.google.com/file/d/1ajjk4piENbRrhPWlR6HqoUfD7U7d9zlZ/view?usp=sharing)):

<code>./create-fusion-container -x ~/SparkTest/Stitching/dataset.xml -o ~/SparkTest/Stitching/Stitching/fused.zarr --preserveAnisotropy --multiRes -d UINT8</code>

By default, this will create an output container that contains a 3D volume for all channels and timepoints present in the dataset. In the case of OME-ZARR, it is a single 5D container, for N5 and HDF5 it is a series of 3D datasets. ***Note: if you do NOT want to export the entire project, or want to specify fusion assignments (which views/images are fused into which volume), please check the details below. In short, you can specify the dimensions of the output container here, and the fusion assignments in the affine-fusion step below.***

The fusion container for the [dataset that was aligned using interest points](https://drive.google.com/file/d/13b0UzWuvpT_qL7JFFuGY9WWm-VEiVNj7/view?usp=sharing) can be created in the same way, except that we choose to use the bounding box `embryo` that was specified using BigStitcher and we choose to save as a BDV/BigStitcher project using N5 as the underlying export data format:

<code>./create-fusion-container -x ~/SparkTest/IP/dataset.xml -o ~/SparkTest/IP/fused.n5 -xo ~/SparkTest/IP/dataset-fused.xml -s N5 -b embryo --bdv --multiRes -d UINT8</code>

#### Run Affine Fusion<a name="run-affine-fusion">

bla bla


A typical set of calls (because it is three channels) for affine fusion into a multi-resolution ZARR using only translations on the **stitching** dataset is (e.g. [this dataset](https://drive.google.com/file/d/1ajjk4piENbRrhPWlR6HqoUfD7U7d9zlZ/view?usp=sharing)):

1 change: 1 addition & 0 deletions install
@@ -124,6 +124,7 @@ install_command detect-interestpoints "net.preibisch.bigstitcher.spark.SparkInte
install_command match-interestpoints "net.preibisch.bigstitcher.spark.SparkGeometricDescriptorMatching"
install_command stitching "net.preibisch.bigstitcher.spark.SparkPairwiseStitching"
install_command solver "net.preibisch.bigstitcher.spark.Solver"
install_command create-fusion-container "net.preibisch.bigstitcher.spark.CreateFusionContainer"
install_command affine-fusion "net.preibisch.bigstitcher.spark.SparkAffineFusion"
install_command nonrigid-fusion "net.preibisch.bigstitcher.spark.SparkNonRigidFusion"

2 changes: 1 addition & 1 deletion pom.xml
@@ -100,7 +100,7 @@

<bigdataviewer-core.version>10.6.3</bigdataviewer-core.version>
<spim_data.version>2.3.5</spim_data.version>
<multiview-reconstruction.version>5.0.1</multiview-reconstruction.version>
<multiview-reconstruction.version>5.0.4</multiview-reconstruction.version>
<BigStitcher.version>2.3.1</BigStitcher.version>

<!--for the old bioformats to work properly (compared with old main branch to find out)-->