Added -chr option for calculating TAD-fusion score. Now you can use it with any genome you like! Just specify the number of chromosomes with the new option.
TAD-fusion score is a score to quantify deletions based on their potential disruption of the 3D genome structure. More specifically, TAD-fusion score is defined as the expected number of additional genomic interactions created as a result of the deletion.
Option 1: From the 5kb Hi-C data of GM12878 from Rao et al.
Compile the TAD-fusion score tool by running the script
Prepare the input deletion file (three-column format, as in the sample file, with hg19 as the reference)
Run the tool with default parameters:
./../src/cal_tad_fusion_score -md ../Model/GM_Rao_5kb -f ../Data/disease_del.dat -mnl 10000 -mxl 5000000 -w 100 -d 0.06 -o Output/disease_del_TAD_fusion_score.dat
The output file is "Output/disease_del_TAD_fusion_score.dat"; its last column is the TAD-fusion score.
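Once the output file exists, a common next step is to rank deletions by their score. The following is a minimal sketch, not part of the tool. It assumes the output is whitespace-separated and that the first three columns repeat the input (chromosome, start, end); the README only guarantees four columns with the score last. The function names and the sample rows, including the score values, are made up for illustration.

```python
def parse_scores(lines):
    """Parse 4-column rows; the last column is the TAD-fusion score."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        chrom, start, end, score = fields
        records.append((chrom, int(start), int(end), float(score)))
    return records


def rank_by_score(records):
    """Sort deletions from the highest TAD-fusion score to the lowest."""
    return sorted(records, key=lambda record: record[3], reverse=True)


# Made-up rows standing in for the contents of the output file.
sample = [
    "chr2 221278232 223014332 12.5",
    "chr1 1000000 1200000 0.0",
    "chr7 5000000 5600000 48.25",
]

ranked = rank_by_score(parse_scores(sample))
for chrom, start, end, score in ranked:
    print(chrom, start, end, score)
```

To use it on a real run, pass the lines of "Output/disease_del_TAD_fusion_score.dat" (for example from an open file object) in place of `sample`.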
Sample scripts are in the folder "Examples". The options for calculating the TAD-fusion score are:
-md   Model directory; model files must be named chr1.model, chr2.model, ..., chrX.model
-f    File listing the deletions to score, in three-column format (e.g. one row is "chr2 221278232 223014332")
-mnl  Minimum length (in base pairs); any deletion shorter than this threshold is skipped
-mxl  Maximum length (in base pairs); any deletion longer than this threshold is skipped
-w    Window length (in bins) around the deletion used to calculate the TAD-fusion score
-d    Delta threshold used to decide whether a bin pair is interacting
-o    Output file; it has four columns, the last of which is the TAD-fusion score
-chr  (new in this fork) Maximum number of chromosomes in the model directory. The default is 23.
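Because deletions outside the -mnl/-mxl range are skipped, it can help to check a deletion list before running the tool. The sketch below is a hypothetical helper, not part of the tool. It assumes whitespace-separated rows and that a deletion's length is end minus start; the tool's own length calculation may differ. The default thresholds mirror the values in the sample command, and every row except the chr2 one is made up.

```python
def parse_deletion(line):
    """Parse one 'chrom start end' row of the deletion file."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if end <= start:
        raise ValueError(f"end must be greater than start: {line!r}")
    return chrom, start, end


def split_by_length(deletions, min_len=10000, max_len=5000000):
    """Separate deletions inside [min_len, max_len] from those outside it."""
    kept, skipped = [], []
    for deletion in deletions:
        length = deletion[2] - deletion[1]  # assumed definition of length
        if min_len <= length <= max_len:
            kept.append(deletion)
        else:
            skipped.append(deletion)
    return kept, skipped


rows = [
    "chr2 221278232 223014332",  # example row from the option list
    "chr1 100000 105000",        # made up: shorter than 10000 bp
    "chr3 1000000 7000000",      # made up: longer than 5000000 bp
]
kept, skipped = split_by_length([parse_deletion(row) for row in rows])
print("kept:", kept)
print("skipped:", skipped)
```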
Fit the model with Hi-C data
a. Install CPLEX
b. Set variables CPLEX_INCLUDE and CPLEX_LIB (in file make_fit_hic_model) to the directory where CPLEX is installed
c. Compile the source by running the script
d. If the compilation is successful, an executable file "fit_hic_model" will be generated in the folder "src"
e. Options for fitting the model
-fn      Data file path
-ff      Data file format ("full_matrix_format" for Schmitt et al. data or "sparse_matrix_format" for Rao et al. data)
-res     Hi-C matrix bin resolution (e.g. 40kb, 10kb, 5kb)
-mn      Minimum distance (in bins); bin pairs closer than this threshold are not used for fitting the model
-mx      Maximum distance (in bins); bin pairs farther apart than this threshold are not used for fitting the model
-method  Fitting method: "full" fits the model to the whole Hi-C matrix at once; "segmentation" partitions the matrix into segments and fits a model for each segment
-sg      Segment length (in bins), used when the method is "segmentation"
-mso     Minimum overlap (in bins) between two segments, used when the method is "segmentation"
-zero    Constant that replaces zero values so that the log can be taken
-of      Output model file (four columns; alpha, beta, and the insulation are the 1st, 3rd, and 4th columns respectively)
f. Example: The script file "" (in folder "Examples") is to fit the model of chr22 of GM12878 from Schmitt et al. data
- Run the script by
cd Examples ./
- The output model file is "GM12878.40kb.chr22.model" in folder "Examples/Output".
- In the model file, 1st, 2nd and 4th columns are alpha, beta, and the insulator respectively.
g. For your convenience, we also provide models (in the folder "Model") that we fitted for GM12878 from Rao et al. data at 5kb resolution.
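To make the -mn, -mx and -zero options from step e more concrete, here is a small illustration of what they describe: which bin pairs are kept for fitting and how zero counts are handled before taking the log. It is only a sketch under stated assumptions and not the implementation in fit_hic_model. The function name, the use of the upper triangle, the natural log, and the toy matrix are all assumptions.

```python
import math


def log_pairs_for_fitting(matrix, min_dist, max_dist, zero_value):
    """Return (i, j, log(count)) for bin pairs with min_dist <= j - i <= max_dist."""
    n = len(matrix)
    pairs = []
    for i in range(n):
        # Only bins j whose distance from i lies in [min_dist, max_dist].
        for j in range(i + min_dist, min(n, i + max_dist + 1)):
            count = matrix[i][j]
            if count == 0:
                count = zero_value  # log(0) is undefined, so substitute a constant
            pairs.append((i, j, math.log(count)))
    return pairs


# Toy symmetric contact matrix with made-up counts.
toy = [
    [9, 4, 0, 1],
    [4, 8, 3, 0],
    [0, 3, 7, 2],
    [1, 0, 2, 6],
]
pairs = log_pairs_for_fitting(toy, min_dist=1, max_dist=2, zero_value=0.5)
for i, j, value in pairs:
    print(i, j, round(value, 3))
```

With these settings the pair (0, 3) is dropped because its distance exceeds the maximum, and the two zero entries are replaced by 0.5 before the log is taken.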
Run TAD-fusion score tool (with the new model) to get the TAD-fusion score (as the section above)
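The -md option expects model files named chr1.model, chr2.model, ..., chrX.model, while the fitting example produces names such as "GM12878.40kb.chr22.model". The sketch below copies fitted models into a model directory under the expected names. It is a hypothetical helper, not part of the tool, and it assumes that every fitted file name ends in ".chrN.model"; the function names and directory arguments are placeholders.

```python
import re
import shutil
from pathlib import Path

# Matches a trailing ".chr<name>.model", e.g. ".chr22.model" or ".chrX.model".
MODEL_NAME = re.compile(r"\.(chr[0-9A-Za-z]+)\.model$")


def target_name(filename):
    """Map 'GM12878.40kb.chr22.model' to 'chr22.model'; return None if no match."""
    match = MODEL_NAME.search(filename)
    return f"{match.group(1)}.model" if match else None


def collect_models(fitted_dir, model_dir):
    """Copy fitted model files into model_dir under the names -md expects."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(fitted_dir).glob("*.model")):
        name = target_name(path.name)
        if name is None:
            continue  # file name does not follow the assumed pattern
        shutil.copyfile(path, model_dir / name)
        copied.append(name)
    return copied


print(target_name("GM12878.40kb.chr22.model"))
```

For example, `collect_models("Examples/Output", "MyModel")` would fill a directory (here the made-up name "MyModel") that can then be passed to -md.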
If you have any questions about TAD-fusion score, please contact Linh Huynh or Fereydoun Hormozdiari.
Huynh L, Hormozdiari F. TAD-fusion score: discovery and ranking the contribution of deletions to genome structure. Genome Biology. 2019; 20:60.
See the LICENSE file for license rights and limitations (BSD-2).
This work is supported in part by the Sloan Research Fellowship number G-2017-9159 to Fereydoun Hormozdiari.