A Python 3 script that calculates CShM (Continuous Shape Measures) values for various shapes, geometry indices, including ฯโ, ฯโ', ฯโ , O (octahedricity) of automatically selected atoms (transition metal atoms by default) from a crystallographic information file (CIF). The CIF may contain one or more entries. It can also calculate these values from COD (Crystallography Open Database) entries by simply entering the COD ID. It also saves the XYZ coordinates of the central atom and its neighboring atoms, including those from CIFs with multiple entries or COD entries and calculates the polyhedral volume. Optionally, tables in Markdown format containing bond lengths and angles can be generated.
This script calculates the Continuous Shape Measures (CShM) and geometry indices (ฯโ, ฯโ', ฯโ , and O (octahedricity)) to assign coordination geometries to two-, three-, four-, five and six-coordinate transition metal atoms. The indices ฯโ and ฯโ are used to determine whether a compound adopts tetrahedral, trigonal pyramidal, square planar, or seesaw geometry (for four-coordinated compounds), and square pyramidal or trigonal bipyramidal geometry (for five-coordinated compounds). These assignments rely on the two largest angles enclosing the central atom. The octahedricity index, O, is calculated based on experimental X-M-X angles and assesses how close the geometry is to an ideal octahedron. The CShM value approaches zero when the shape closely matches the ideal geometry. Additionally, the polyhedral volume is calculated using the Convex Hull Algorithm (via SciPy and the Qhull Library). For more information, refer to the associated papers and references.
Square planar geometry:
Square planar geometry:
Square pyramidal geometry:
gemmi
, numpy
, scipy
, tabulate
, requests
For local CIFs with one or more entries start the script with:
python3 cshm-cc.py example.cif
The following output will be printed:
๐ฐ๐ฒ๐ป๐๐ฟ๐ฎ๐น ๐ฎ๐๐ผ๐บ ๐ฐ๐ผ๐ผ๐ฟ๐ฑ๐ถ๐ป๐ฎ๐๐ถ๐ผ๐ป ๐ป๐๐บ๐ฏ๐ฒ๐ฟ
โ โ
Fe1 (example): CN = 5, min dist. = 1.9799 ร
, max dist. = 2.3664 ร
โ โ โ
๐๐๐ ๐ฒ๐ป๐๐ฟ๐ ๐ผ๐ฟ ๐๐ข๐ ๐๐ ๐บ๐ถ๐ป๐ถ๐บ๐๐บ ๐ฑ๐ถ๐๐๐ฎ๐ป๐ฐ๐ฒ ๐บ๐ฎ๐
๐ถ๐บ๐๐บ ๐ฑ๐ถ๐๐๐ฎ๐ป๐ฐ๐ฒ (๐๐ผ ๐ฐ๐ฒ๐ป๐๐ฟ๐ฎ๐น ๐ฎ๐๐ผ๐บ)
| **compound** | **Fe1 (example)** | โ ๐๐ฒ๐ป๐๐ฟ๐ฎ๐น ๐ฎ๐๐ผ๐บ (๐๐๐ ๐ฒ๐ป๐๐ฟ๐)
|----------------|-------------------------|
| CN | 5 | โ ๐๐ผ๐ผ๐ฟ๐ฑ๐ถ๐ป๐ฎ๐๐ถ๐ผ๐ป ๐ป๐๐บ๐ฏ๐ฒ๐ฟ (๐๐ก)
| ฯโ
| 0.3053 | โ ๐๐ฒ๐ผ๐บ๐ฒ๐๐ฟ๐ ๐ถ๐ป๐ฑ๐ฒ๐
(๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฑ)
| V /ร
ยณ | 7.2032 | โ ๐ฃ๐ผ๐น๐๐ต๐ฒ๐ฑ๐ฟ๐ฎ๐น ๐๐ผ๐น๐๐บ๐ฒ
| | |
| PP-5 | 31.2864 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฑ)
| vOC-5 | 2.1704 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฑ)
| TBPY-5 | 3.1110 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฑ)
| SPY-5 | *0.9603* | โ ๐๐ฆ๐ต๐ (๐น๐ผ๐๐ฒ๐๐ ๐๐ฎ๐น๐๐ฒ ๐ต๐ถ๐ด๐ต๐น๐ถ๐ด๐ต๐๐ฒ๐ฑ)
| JTBPY-5 | 6.3056 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฑ)
Or to retrieve structural data from the COD (Crystallography Open Database), for example:
python3 cshm-cc.py 4110517
The following output will be printed:
Fe1 (4110517): CN = 6, min dist. = 1.8997 ร
, max dist. = 1.9875 ร
Co1 (4110517): Warning! Co1 (1_655) has been excluded from coordinating atoms.
Co1 (4110517): CN = 6, min dist. = 2.0824 ร
, max dist. = 2.1582 ร
| **compound** | **Fe1 (4110517)** | **Co1 (4110517)** | โ ๐๐ฒ๐ป๐๐ฟ๐ฎ๐น ๐ฎ๐๐ผ๐บ๐ (๐๐ข๐ ๐๐)
|----------------|---------------------|---------------------|
| CN | 6 | 6 | โ ๐๐ผ๐ผ๐ฟ๐ฑ๐ถ๐ป๐ฎ๐๐ถ๐ผ๐ป ๐ป๐๐บ๐ฏ๐ฒ๐ฟ๐ (๐๐ก)
| O | 5.1303 | 0.1262 | โ ๐๐ฒ๐ผ๐บ๐ฒ๐๐ฟ๐ ๐ถ๐ป๐ฑ๐ถ๐ฐ๐ฒ๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฒ)
| V /ร
ยณ | 9.7103 | 12.5312 | โ ๐ฃ๐ผ๐น๐๐ต๐ฒ๐ฑ๐ฟ๐ฎ๐น ๐๐ผ๐น๐๐บ๐ฒ
| | | |
| HP-6 | 30.9116 | 33.2437 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฒ)
| PPY-6 | 26.5838 | 30.2095 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฒ)
| OC-6 | *0.4782* | *0.0261* | โ ๐๐ฆ๐ต๐ (๐น๐ผ๐๐ฒ๐๐ ๐๐ฎ๐น๐๐ฒ ๐ต๐ถ๐ด๐ต๐น๐ถ๐ด๐ต๐๐ฒ๐ฑ)
| TPR-6 | 14.1660 | 16.7105 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฒ)
| JPPY-6 | 30.0476 | 33.6764 | โ ๐๐ฆ๐ต๐ (๐๐๐ถ๐๐ฎ๐ฏ๐น๐ฒ ๐ณ๐ผ๐ฟ ๐๐ก = ๐ฒ)
The central atom(s) or transition metal atom(s) and their coordination number(s) will be automatically determined,
and all related values will be calculated. There is no manual choice of atom(s); only a maximum bonding distance can be set (-d
option).
Start the script with:
python3 cshm-cc.py example.cif > example.md
will save the output in markdown format.
Convert markdown to docx (install PANDOC first):
pandoc example.md -o example.docx
This will convert the markdown file to a docx file. Open it with your favorite word processor. Convert the file to even more formats such as HTML, PDF or TeX with PANDOC.
filename or COD ID
(required): A CIF (Crystallographic Information File) with the .cif extension or a COD ID (an integer without an extension), e.g.,example.cif
or12345678
.-d
N
(optional): Excludes atoms with bond lengths larger thanN
ร from the central atom in the calculation (e.g.,-d 2.1
).-eh
(optional): Exclude all hydrogen atoms.-n
N
(optional): Specifies the number of trials (N
> 0) for the fast CShM calculation (default: 124).-ex
(optional): Uses a slower CShM calculation that guarantees finding the global minimum.-sxyz
(optional): Saves the XYZ coordinates of the central atom and its neighboring atoms. If multiple entries are provided, they are combined (filename:cif_name.xyz
orcod_id.xyz
).-v
(optional): Enables verbose output, printing all bond lengths and angles of the central atom with its neighboring atoms.-p
(optional): Plot bar graphs of CShM metrics for easier visual comparison.-pc
(optional): Plot color bar graphs of CShM metrics for easier visual comparison.
- All parameter calculations are based on the estimated coordination number(s).
- Parameters like angles and distances are calculated from the atomic coordinates using the
gemmi
module. Therefore, symmetry operations for bonded atoms may differ from those in the CIF. Estimated standard deviations (e.s.d.) on bond lengths and angles are not calculated. - The advantage of this method, compared to the one in the classic version, is that it now recognizes a much larger number of CIFs, especially those from the COD. A bonding section in the CIF file is no longer required.
- Possible bonds are now determined based on covalent radii. Several reported radii have been tested with varying success. In the end, the maximum radius of each element is used, with 10% added to the sum of the radii of the central and ligand atoms. This approach successfully identifies nearly all bonds to ligand atoms but also detects numerous metal-metal bonds, including those between the central atom and itself or other metals of the same element. These metal-metal bonds are excluded from further calculations.
- The central atom(s) or transition metal atom(s) and their coordination number(s) are determined automatically. There is no manual selection of atoms possible.
- Manual selection of atoms is possible with the script tau-calc.
- The XYZ coordinates of neighboring atoms are provided relative to the central atom, which is positioned at [0, 0, 0].
- The XYZ file (option:
-sxyz
) can be used for further analysis of coordination geometry. - The polyhedral volume should match the value calculated by Olex2.
- Reference shapes for CShM are from here (cosymlib).
- The script's results should match those of the online calculator or the Shape program when using default options.
- The CShM calculation employs a fast optimization process using the Hungarian algorithm. When only a small number of random trials is performed, the result may converge to a local minimum. For recommended number of trials, see below.
- The
-ex
option enables a slower CShM calculation, which ensures finding the global minimum. - The script downloads the CIF from the COD into a temporary folder. The CIF is deleted after the script finishes.
- In rare cases, the
gemmi
module does not return the correct coordinates for symmetry-equivalent positions. When this happens, the script exits. - If problems arise, the only available option is to reduce the number of bonds considered using the
-d
option. - The CShM method is rewritten from the C++ code and may still contain errors.
The CShM calculation employs a fast optimization process using the Hungarian algorithm. When only a small number of random trials are performed, the result may converge to a local minimum. To determine the optimal number of trials, 100 runs were conducted, each with trial counts ranging from 1 to 100 (a total of 100 ร 100 calculations, see statistics.py). The results are shown in the figure below.
The values for the global minimum solutions are displayed on the y-axis. Except for
python3 cshm-cc.py combined.cif
Ru1 (I): CN = 6, min dist. = 2.0645 ร
, max dist. = 2.0656 ร
Cu1 (anko4): CN = 3, min dist. = 1.8636 ร
, max dist. = 1.9991 ร
7718332: Warning! No metal atoms or '_atom_site_type_symbol' is missing.
Fe1 (av12_25): Warning! Fe1 (2_666) has been excluded from coordinating atoms.
Fe1 (av12_25): CN = 4, min dist. = 1.9890 ร
, max dist. = 2.2038 ร
| **compound** | **Ru1 (I)** | **Cu1 (anko4)** | **Fe1 (av12_25)** |
|----------------|---------------|-------------------|---------------------|
| CN | 6 | 3 | 4 |
| ฯโ | | | 0.9035 |
| ฯโ' | | | 0.8976 |
| ฯโ
| | | |
| O | 7.1226 | | |
| V /ร
ยณ | 11.5171 | 0.0415 | 4.5706 |
| | | | |
| TP-3 | | *2.1792* | |
| vT-3 | | 4.8766 | |
| fvOC-3 | | 12.5001 | |
| mvOC-3 | | 6.0484 | |
| SP-4 | | | 29.3605 |
| T-4 | | | *1.2134* |
| SS-4 | | | 8.3243 |
| vTBPY-4 | | | 3.0150 |
| PP-5 | | | |
| vOC-5 | | | |
| TBPY-5 | | | |
| SPY-5 | | | |
| JTBPY-5 | | | |
| HP-6 | 28.5605 | | |
| PPY-6 | 27.0546 | | |
| OC-6 | *0.9516* | | |
| TPR-6 | 13.6312 | | |
| JPPY-6 | 30.8553 | | |
python3 cshm-cc.py 1100756 -v -sxyz
Fe1 (1100756): CN = 6, min dist. = 1.9321 ร
, max dist. = 2.1414 ร
COD: 1100756, M: Fe1, CN: 6, Ligand atoms: O1 O2 O5 O7 N1 N2
| **Atoms** | **Bond length /ร
** |
|-------------|----------------------|
| Fe1-O1 | 1.9394 |
| Fe1-O2 | 1.9321 |
| Fe1-O5 | 2.0936 |
| Fe1-O7 | 1.9975 |
| Fe1-N1 | 2.1147 |
| Fe1-N2 | 2.1414 |
| **Atoms** | **Angle /ยฐ** |
|-------------|----------------|
| O1-Fe1-O2 | 92.13 |
| O1-Fe1-O5 | 174.32 |
| O1-Fe1-O7 | 92.81 |
| O1-Fe1-N1 | 85.28 |
| O1-Fe1-N2 | 96.52 |
| O2-Fe1-O5 | 93.54 |
| O2-Fe1-O7 | 98.78 |
| O2-Fe1-N1 | 167.91 |
| O2-Fe1-N2 | 84.63 |
| O5-Fe1-O7 | 86.65 |
| O5-Fe1-N1 | 89.10 |
| O5-Fe1-N2 | 83.70 |
| O7-Fe1-N1 | 93.15 |
| O7-Fe1-N2 | 169.95 |
| N1-Fe1-N2 | 83.93 |
Fe2 (1100756): CN = 6, min dist. = 1.9019 ร
, max dist. = 2.1397 ร
COD: 1100756, M: Fe2, CN: 6, Ligand atoms: O6 O3 N6 O4 N3 N4
| **Atoms** | **Bond length /ร
** |
|-------------|----------------------|
| Fe2-O6 | 2.1298 |
| Fe2-O3 | 1.9019 |
| Fe2-N6 | 2.1397 |
| Fe2-O4 | 1.9100 |
| Fe2-N3 | 2.1042 |
| Fe2-N4 | 2.1254 |
| **Atoms** | **Angle /ยฐ** |
|-------------|----------------|
| O6-Fe2-O3 | 93.74 |
| O6-Fe2-N6 | 75.28 |
| O6-Fe2-O4 | 164.47 |
| O6-Fe2-N3 | 86.67 |
| O6-Fe2-N4 | 86.82 |
| O3-Fe2-N6 | 90.19 |
| O3-Fe2-O4 | 95.39 |
| O3-Fe2-N3 | 87.07 |
| O3-Fe2-N4 | 172.35 |
| N6-Fe2-O4 | 92.13 |
| N6-Fe2-N3 | 161.53 |
| N6-Fe2-N4 | 97.32 |
| O4-Fe2-N3 | 106.30 |
| O4-Fe2-N4 | 85.80 |
| N3-Fe2-N4 | 85.35 |
| **compound** | **Fe1 (1100756)** | **Fe2 (1100756)** |
|----------------|---------------------|---------------------|
| CN | 6 | 6 |
| O | 6.1844 | 9.2859 |
| V /ร
ยณ | 11.1317 | 11.2190 |
| | | |
| HP-6 | 32.8092 | 32.9834 |
| PPY-6 | 25.3548 | 23.8228 |
| OC-6 | *0.4960* | *1.3337* |
| TPR-6 | 13.2723 | 10.3584 |
| JPPY-6 | 29.1733 | 27.1195 |
XYZ file saved to 1100756.xyz
XYZ file content:
7
1100756-Fe1
Fe 0.00000000 0.00000000 0.00000000
O -0.61659181 -1.06174100 -1.50121523
O 0.95997426 1.21200823 -1.15862156
O 0.57158964 1.00798188 1.74371741
O -1.72418244 1.00058396 0.12684006
N -0.70442517 -1.59688382 1.19403794
N 1.89597425 -0.97050214 0.22175439
7
1100756-Fe2
Fe 0.00000000 0.00000000 0.00000000
O -0.48516594 -2.07125725 0.10297420
O -1.20716320 0.47924102 1.38941634
N -1.64079225 -0.24189918 -1.35192126
O 0.31465477 1.78925625 -0.58946897
N 1.41570551 -0.38239341 1.50903978
N 1.54651075 -0.55071842 -1.34996022
python3 cshm-cc.py 1100600 -p
Cd1 (1100600): Warning! Cd2 (1_555) has been excluded from coordinating atoms.
Cd1 (1100600): Warning! Cd4 (1_555) has been excluded from coordinating atoms.
Cd1 (1100600): Warning! Cd3 (1_555) has been excluded from coordinating atoms.
Cd1 (1100600): CN = 4, min dist. = 2.2531 ร
, max dist. = 2.3998 ร
Cd2 (1100600): Warning! Cd4 (1_555) has been excluded from coordinating atoms.
Cd2 (1100600): Warning! Cd1 (1_555) has been excluded from coordinating atoms.
Cd2 (1100600): Warning! Cd3 (1_555) has been excluded from coordinating atoms.
Cd2 (1100600): CN = 4, min dist. = 2.2682 ร
, max dist. = 2.3968 ร
Cd3 (1100600): Warning! Cd2 (1_555) has been excluded from coordinating atoms.
Cd3 (1100600): Warning! Cd4 (1_555) has been excluded from coordinating atoms.
Cd3 (1100600): Warning! Cd1 (1_555) has been excluded from coordinating atoms.
Cd3 (1100600): CN = 4, min dist. = 2.2747 ร
, max dist. = 2.3846 ร
Cd4 (1100600): Warning! Cd2 (1_555) has been excluded from coordinating atoms.
Cd4 (1100600): Warning! Cd1 (1_555) has been excluded from coordinating atoms.
Cd4 (1100600): Warning! Cd3 (1_555) has been excluded from coordinating atoms.
Cd4 (1100600): CN = 4, min dist. = 2.2334 ร
, max dist. = 2.4005 ร
| **compound** | **Cd1 (1100600)** | **Cd2 (1100600)** | **Cd3 (1100600)** | **Cd4 (1100600)** |
|----------------|---------------------|---------------------|---------------------|---------------------|
| CN | 4 | 4 | 4 | 4 |
| ฯโ | 0.7378 | 0.7348 | 0.7131 | 0.7341 |
| ฯโ' | 0.7273 | 0.7064 | 0.6972 | 0.7251 |
| V /ร
ยณ | 5.5171 | 5.5015 | 5.5687 | 5.4022 |
| | | | | |
| SP-4 | 33.2281 | 31.6257 | 33.2891 | 33.0562 |
| T-4 | *3.6158* | *3.7698* | *3.8506* | *3.5041* |
| SS-4 | 7.5277 | 6.5626 | 7.2498 | 7.3999 |
| vTBPY-4 | 4.4485 | 4.5457 | 4.3667 | 4.1798 |
Cd1 (1100600) CN = 4:
T-4 : โโโโ 3.62
vTBPY-4: โโโโโ 4.45
SS-4 : โโโโโโโโโ 7.53
SP-4 : โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 33.23
Cd2 (1100600) CN = 4:
T-4 : โโโโ 3.77
vTBPY-4: โโโโโ 4.55
SS-4 : โโโโโโโ 6.56
SP-4 : โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 31.63
Cd3 (1100600) CN = 4:
T-4 : โโโโ 3.85
vTBPY-4: โโโโโ 4.37
SS-4 : โโโโโโโโ 7.25
SP-4 : โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 33.29
Cd4 (1100600) CN = 4:
T-4 : โโโโ 3.50
vTBPY-4: โโโโโ 4.18
SS-4 : โโโโโโโโ 7.40
SP-4 : โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 33.06
Or color bar graphs using:
python3 cshm-cc.py 1100600 -pc
If you use ฯ4, ฯ5, the O index or CShM to describe the coordination geometry of your compounds, please cite one or more of the following articles:
ฯ4:
"Structural variation in copper(i) complexes with pyridylmethylamide ligands: structural analysis with a new four-coordinate geometry index, ฯ4"
Lei Yang, Douglas R. Powell, Robert P. Houser, Dalton Trans. 2007, 955-964.
ฯ4' (ฯ4 improved):
"Coordination polymers and molecular structures among complexes of mercury(II) halides with selected 1-benzoylthioureas"
Andrzej Okuniewski, Damian Rosiak, Jarosลaw Chojnacki, Barbara Becker, Polyhedron 2015, 90, 47โ57.
ฯ5:
"Synthesis, structure, and spectroscopic properties of copper(II) compounds containing nitrogenโsulphur donor ligands; the crystal and molecular structure of aqua[1,7-bis(N-methylbenzimidazol-2โฒ-yl)-2,6-dithiaheptane]copper(II) perchlorate"
Anthony W. Addison, T. Nageswara Rao, Jan Reedijk, Jacobus van Rijn, Gerrit C. Verschoor, J. Chem. Soc., Dalton Trans. 1984, 1349-1356.
O:
"Structural, electrochemical and photophysical behavior of Ru(II) complexes with large bite angle sulfur-bridged terpyridyl ligands"
Christopher M. Brown, Nicole E. Arsenault, Trevor N. K. Cross, Duane Hean, Zhen Xu, Michael O. Wolf, Inorg. Chem. Front. 2020, 7, 117-127.
CShM:
"Continuous Symmetry Measures. 5. The Classical Polyhedra"
Mark Pinsky, David Avnir, Inorg. Chem. 1998, 37, 5575โ5582.
DOI: https://doi.org/10.1021/ic9804925
"Shape maps and polyhedral interconversion paths in transition metal chemistry"
Santiago Alvarez, Pere Alemany, David Casanova, Jordi Cirera, Miquel Llunell, David Avnir, Coord. Chem. Rev., 2005, 249, 1693โ1708.
The script uses the Gemmi library for CIF processing:
"GEMMI: A library for structural biology"
Marcin Wojdyr, Journal of Open Source Software 2022, 7 (73), 4200.
https://gemmi.readthedocs.io/en/latest/
https://github.com/project-gemmi/gemmi
The Crystallography Open Database: