Add the draft JOSS paper
seisman committed Apr 1, 2024
1 parent adf71bf commit 0ca5bc9
Showing 2 changed files with 163 additions and 0 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/draft-pdf.yml
@@ -0,0 +1,26 @@
#
# Build draft PDF that will be submitted to JOSS
#
on: [push]

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper/paper.md
      - name: Upload
        uses: actions/upload-artifact@v1
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper/paper.pdf
137 changes: 137 additions & 0 deletions paper/paper.md
@@ -0,0 +1,137 @@
---
title: 'HinetPy: A Python package for accessing and processing NIED Hi-net seismic data'
tags:
  - Python
  - geophysics
  - seismology

authors:
  - name: Dongdong Tian
    orcid: 0000-0001-7967-1197
    affiliation: 1
affiliations:
  - name: School of Geophysics and Geomatics, China University of Geosciences, China
    index: 1
date: 1 April 2024
bibliography: paper.bib
---

# Summary

HinetPy is a Python package designed to simplify access to and processing of seismic
data from the High-sensitivity Seismograph Network of Japan (Hi-net). Hi-net provides
high-quality seismic data, but its proprietary WIN32 format presents challenges for
data access and exchange. HinetPy aims to address these challenges by providing a
user-friendly interface for accessing, downloading, and processing Hi-net data, thereby
enabling researchers to more effectively utilize this valuable dataset.

# Statement of need

The National Research Institute for Earth Science and Disaster Resilience (NIED) operates
and maintains NIED Hi-net, a nationwide high-sensitivity seismograph network in Japan.
Since its establishment in October 2000, NIED Hi-net has grown to include approximately
800 seismic stations equipped with 3-component short-period seismometers. The Hi-net
website provides access to high-quality seismic data from 2004 onwards, including data
from other seismic networks such as F-net, S-net, and V-net. NIED Hi-net is renowned for
its contributions to seismological research.


# Challenges in accessing NIED Hi-net data

To access Hi-net data, users need to register a Hi-net account. Registration is
required to ensure data security and adherence to Hi-net's data usage policies.
Once registered, users can authenticate with their accounts within HinetPy to access and
download the data. NIED Hi-net data is freely accessible after user registration.
However, accessing Hi-net seismic data can still be challenging. The seismological
community has switched to standard web services, so users can request waveform data
using tools like ObsPy. Unfortunately, NIED Hi-net has not upgraded its server to
support these web services. Users have to log in to the NIED Hi-net website and request
data manually. More challenging still are the limits on data size and length in a
single request: the number of channels \* record length must be no larger than 12000
minutes, and the record length must be no larger than 60 minutes. For NIED Hi-net, which
contains 800 seismic stations and 2400 channels (3 channels per station), the record
length must therefore be no larger than 5 minutes. Thus, for a typical teleseismic event
for which we may require 30 minutes of data, we need to divide the time range into 6
subranges and post 6 requests separately. Note also that the NIED Hi-net website
doesn’t allow posting multiple data requests at the same time. Thus, we need to post
a request, wait for data preparation (which may take a few minutes), and then post
the next request. After downloading all the files, we then need to combine them
into a single file.
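
The request-size arithmetic above can be sketched in a few lines of Python. This is only an illustration of the two limits quoted in the text, not HinetPy's actual implementation; the function names `max_span_minutes` and `split_request` and the two constants are hypothetical names introduced for this example.

```python
from datetime import datetime, timedelta

# Limits quoted in the text above (the constant names are hypothetical).
MAX_CHANNEL_MINUTES = 12000  # number of channels * record length, in minutes
MAX_RECORD_MINUTES = 60  # record length of a single request, in minutes


def max_span_minutes(n_channels):
    """Return the longest record length (whole minutes) allowed in one request."""
    if n_channels <= 0:
        raise ValueError("n_channels must be positive")
    span = min(MAX_RECORD_MINUTES, MAX_CHANNEL_MINUTES // n_channels)
    if span < 1:
        raise ValueError("too many channels for a one-minute request")
    return span


def split_request(start, total_minutes, n_channels):
    """Split a time range into (start, span) pairs that respect both limits."""
    step = max_span_minutes(n_channels)
    chunks = []
    offset = 0
    while offset < total_minutes:
        span = min(step, total_minutes - offset)
        chunks.append((start + timedelta(minutes=offset), span))
        offset += span
    return chunks


# 800 stations with 3 channels each, 30 minutes of data for one event.
chunks = split_request(datetime(2010, 1, 1, 0, 0), 30, n_channels=800 * 3)
for chunk_start, span in chunks:
    print(chunk_start.strftime("%Y%m%d%H%M"), span)
```

For the full Hi-net network this yields six 5-minute subranges, matching the count given above. Each subrange would still have to be requested one at a time, since simultaneous requests are not allowed.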

# Challenges in processing NIED Hi-net data

Since 20XX, the seismological community has had standard data formats such as miniSEED
for waveforms, StationXML for station metadata, and QuakeML for earthquake catalogs.
NIED Hi-net, however, still uses its own proprietary WIN32 format, which is the format
used by its own WIN32 system. WIN32 is not widely supported by tools built around the
standard formats, which makes it difficult to exchange data with other seismograph
networks and researchers. As a result, researchers face barriers in accessing and using
the high-quality data provided by Hi-net, and the format hinders data exchange and
collaboration within the seismology community. In the WIN32 format, continuous waveform
data is divided into multiple one-minute segments. In addition, a companion text file,
called the “channels table”, is provided for instrumental metadata. NIED Hi-net also
provides a series of commands in its win32tools package to process WIN32 data and
convert WIN32 data to the SAC format, but there are no tools to convert the channels
table to a more commonly used format (e.g., SAC polezero files).
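
To give a sense of what building a polezero description involves, here is a minimal sketch. It assumes that the instrumental metadata includes a natural period and a damping constant for each short-period sensor, and it models the sensor as a standard damped oscillator. The function name `seismometer_poles` and the sample values are hypothetical, and this is not presented as the conversion HinetPy performs.

```python
import math


def seismometer_poles(natural_period, damping):
    """Return the two poles of an underdamped second-order seismometer.

    The poles are the roots of s^2 + 2*h*w0*s + w0^2 = 0, where
    w0 = 2*pi/natural_period and h is the damping constant (0 <= h < 1).
    """
    if natural_period <= 0:
        raise ValueError("natural_period must be positive")
    if not 0 <= damping < 1:
        raise ValueError("this sketch only handles underdamped sensors")
    omega0 = 2.0 * math.pi / natural_period
    real = -damping * omega0
    imag = omega0 * math.sqrt(1.0 - damping**2)
    return complex(real, imag), complex(real, -imag)


# Illustrative values only: a 1 s natural period and a damping constant of 0.7.
poles = seismometer_poles(natural_period=1.0, damping=0.7)
for pole in poles:
    print(f"{pole.real:.4f} {pole.imag:.4f}")
```

A complete SAC polezero file would also need the zeros and a scaling constant derived from the sensor sensitivity and digitizer settings, which this sketch leaves out.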


# HinetPy for easy data accessing and processing

HinetPy is a Python package developed to address the challenges of accessing and
processing NIED Hi-net data. The package provides a simple and intuitive interface for
accessing Hi-net data, allowing researchers to easily download waveform data and station
metadata. HinetPy also includes tools for processing seismic data, mainly converting
waveform data from the WIN32 format to the SAC format and building SAC polezero files
from the channels table.

Here is an example showing how to access and process NIED Hi-net waveform data.

```python
from HinetPy import Client, win32

# You need a Hi-net account to access the data
client = Client("username", "password")

# Let's request 20 minutes of data from the Hi-net network (with an internal
# network code of '0101') starting at 2010-01-01T00:00 (JST, GMT+0900)
data, ctable = client.get_continuous_waveform("0101", "201001010000", 20)

# The request and download process usually takes a few minutes
# waiting for data request ...
# waiting for data download ...

# Now you can see the data and corresponding channel table in your working directory
# waveform data (in win32 format) : 0101_201001010000_20.cnt
# channel table (plaintext file) : 0101_20100101.ch
# Let's convert data from win32 format to SAC format
win32.extract_sac(data, ctable)

# Let's extract instrument response as PZ files from the channel table file
win32.extract_sacpz(ctable)

# Now you can see several SAC and SAC_PZ files in your working directory
# N.NGUH.E.SAC N.NGUH.U.SAC N.NNMH.N.SAC
# N.NGUH.N.SAC N.NNMH.E.SAC N.NNMH.U.SAC
# ...
# N.NGUH.E.SAC_PZ N.NGUH.U.SAC_PZ N.NNMH.N.SAC_PZ
# N.NGUH.N.SAC_PZ N.NNMH.E.SAC_PZ N.NNMH.U.SAC_PZ
# ...
```

The package itself is platform-independent, but it requires the win32tools to be
compiled. HinetPy should therefore be easy to use on Linux and macOS, whereas Windows is
not tested. It is available from PyPI and can be installed using Python’s package
management tool `pip`.

# Conclusions

HinetPy provides a valuable tool for researchers working with Hi-net data, enabling them
to more easily access and process this high-quality dataset. By addressing the challenges
posed by the proprietary WIN32 format, HinetPy helps to facilitate data exchange and
collaboration within the seismology community, ultimately advancing our understanding of
seismic events and Earth's structure.

# Acknowledgments

The HinetPy package was initially developed in 2013, when the author was a graduate
student at the University of Science and Technology of China. The package doesn’t contain
any NIED Hi-net data, not even a small sample. Please also note that redistributing any
NIED Hi-net data is prohibited, and that users should renew their accounts and report any
publications that use NIED Hi-net data annually.

# References
