Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for converting from Hijri calendar to undate and undate interval #107

Merged
merged 40 commits into from
Dec 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
a2dfae6
Preliminary hijri date parser
rlskoeser Nov 21, 2024
ed23f6c
Test all Hijri months; assume 3+ digit years and use LALR parser
rlskoeser Nov 21, 2024
646f739
Clean up edtf parser code (remove old test case comments)
rlskoeser Nov 21, 2024
51850cc
Add transformer for hijri parser to convert parsed date to undate
rlskoeser Nov 21, 2024
50f2331
Rename test directories & files to be consistent & explicit
rlskoeser Nov 21, 2024
778c67b
Add an undate converter to wire in hijri date parsing capability
rlskoeser Nov 21, 2024
99c0611
Tell mypy to ignore that convertdate code is untyped
rlskoeser Nov 21, 2024
4a7a1d8
Merge branch 'develop' into feature/convert-hijri
rlskoeser Nov 21, 2024
454382f
Merge branch 'develop' into feature/convert-hijri
rlskoeser Nov 21, 2024
315ad7a
Clean up one more date and add more possible todos
rlskoeser Nov 21, 2024
18c8f25
Update src/undate/converters/calendars/hijri/transformer.py
rlskoeser Nov 21, 2024
f3ce58b
Update src/undate/converters/edtf/edtf.lark
rlskoeser Nov 21, 2024
11cc007
Update src/undate/converters/calendars/hijri/converter.py
rlskoeser Nov 21, 2024
2cc596e
Add more error cases for EDTF and Hijri parser tests
rlskoeser Nov 21, 2024
0aac63a
Add calendar field to Undate object
rlskoeser Nov 22, 2024
e2444ed
Partial refactor: initialize hijri dates as undate with hijri calendar
rlskoeser Nov 26, 2024
3aa462b
Use calendar converter to get max month/day and convert to gregorian
rlskoeser Nov 26, 2024
fe41545
Generate iso format date from native calendar date, not earliest/latest
rlskoeser Nov 26, 2024
3a43e6d
Include calendar name in undate repr
rlskoeser Nov 26, 2024
7c9ccb7
Support and test comparing undates across calendars
rlskoeser Nov 26, 2024
b6b6376
Work around StrEnum not being in python until 3.11
rlskoeser Nov 26, 2024
e91b7ba
Allow any Hijri year (drop 3+ digit year constraint and year-month-day)
rlskoeser Nov 26, 2024
6c6f09a
Confirm hijri dates + partially unknown date behavior
rlskoeser Nov 26, 2024
5cc19fd
Add calendar converter base class and document how to add calendars
rlskoeser Nov 26, 2024
d26574c
Implementing Hebrew Anno Mundi calendar converter based on Hijri
rlskoeser Nov 27, 2024
5660fa2
Fix mis-formatted docstring
rlskoeser Nov 27, 2024
c6ed817
Fix mis-formatted docstring
rlskoeser Nov 27, 2024
88e4d17
Adjust imports for hebrew calendar converter
rlskoeser Nov 27, 2024
f908cd5
Add comment about earliest Hebrew year in grammar
rlskoeser Dec 6, 2024
c24cd34
Test exceptions and parser type errors more specific
rlskoeser Dec 6, 2024
5773bf7
Run unit tests on pull request to any branch
rlskoeser Dec 6, 2024
3032785
Fix incorrect import
rlskoeser Dec 6, 2024
9137608
Force calendar converters to implement min/max month methods
rlskoeser Dec 6, 2024
920f736
Differentiate min/max month from first/last month
rlskoeser Dec 6, 2024
867e018
Merge branch 'feature/convert-hijri' into feature/convert-hebrew
rlskoeser Dec 6, 2024
333e740
Merge pull request #108 from dh-tech/feature/convert-hebrew
rlskoeser Dec 6, 2024
b7ae594
Rewrite gregorian calendar docstring that incorrectly ref'ed Hijri
rlskoeser Dec 7, 2024
759d0c7
Fix docstring typo caught by @coderabbitai
rlskoeser Dec 7, 2024
d9fd4ba
Include calendar converters in sphinx docs and add basic usage to readme
rlskoeser Dec 20, 2024
4372b23
Address coverage issues flagged by codecov
rlskoeser Dec 20, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .github/workflows/unit_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,8 @@ on:
- 'undate/**'
- 'tests/**'
pull_request:
branches:
- "**"
Comment on lines +11 to +12
Copy link
Contributor

@coderabbitai coderabbitai bot Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Testing against Python 3.13 alpha may cause unreliable CI results

The workflow is currently testing against Python 3.13, which is in alpha stage. The project has minimal version constraints (only for strenum < 3.11) but testing against pre-release Python versions can lead to:

  • False failures due to pre-release bugs
  • Unstable dependency behaviors with alpha Python releases
  • Increased CI maintenance overhead

Consider:

  • Moving Python 3.13 to an optional test matrix that's allowed to fail
  • Waiting until Python 3.13 reaches RC status before including it in the main test matrix

Example modification:

strategy:
  matrix:
    python: ["3.9", "3.10", "3.11", "3.12"]
  include:
    - python: "3.13"
      continue-on-error: true
🔗 Analysis chain

Consider the stability implications of Python 3.13.

Python 3.13 is currently in alpha/beta stage. While testing against future Python versions is good practice, including it in the test matrix this early might lead to false failures due to pre-release issues.

Let's check the package compatibility:

Consider adding Python 3.13 to an optional test matrix or waiting until it reaches release candidate status.

Also applies to: 24-24

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if dependencies specify Python version constraints
rg "python_version" pyproject.toml

Length of output: 131


Script:

#!/bin/bash
# Let's check the Python versions specified in the test workflow
cat .github/workflows/unit_tests.yml

Length of output: 1373

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a GenAI artifact since its training cutoff date is prior to October 2024? Python 3.13 is not in alpha

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ColeDCrawford, you're absolutely right—Python 3.13 is not in alpha as of December 2024. Including it in the CI test matrix is appropriate. Thank you for pointing out the oversight.


✏️ Learnings added
Learnt from: ColeDCrawford
PR: dh-tech/undate-python#107
File: .github/workflows/unit_tests.yml:11-12
Timestamp: 2024-12-06T16:47:25.988Z
Learning: As of December 2024, Python 3.13 is stable and should be included in CI test matrices without concerns about pre-release instability.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


env:
# python version used to calculate and submit code coverage
Expand Down
29 changes: 28 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ An `UndateInterval` is a date range between two `Undate` objects. Intervals can
```

You can initialize `Undate` or `UndateInterval` objects by parsing a date string with a specific converter, and you can also output an `Undate` object in those formats.
Available converters are "ISO8601" and "EDTF" (but only)
Currently available converters are "ISO8601" and "EDTF" and supported calendars.

```python
>>> from undate import Undate
Expand All @@ -156,6 +156,33 @@ Available converters are "ISO8601" and "EDTF" (but only)
<UndateInterval 1800/1900>
```

### Calendars

All `Undate` objects are calendar aware, and date converters include support for parsing and working with dates from other calendars. The Gregorian calendar is used by default; currently `undate` supports the Hijri Islamic calendar and the Anno Mundi Hebrew calendar based on calendar convertion logic implemented in the [convertdate](https://convertdate.readthedocs.io/en/latest/)package.

Dates are stored with the year, month, day and appropriate precision for the original calendar; internally, earliest and latest dates are calculated in Gregorian / Proleptic Gregorian calendar for standardized comparison across dates from different calendars.

```python
>>> from undate import Undate
>>> tammuz4816 = Undate.parse("26 Tammuz 4816", "Hebrew")
>>> tammuz4816
<Undate '26 Tammuz 4816 Anno Mundi' 4816-04-26 (Hebrew)>
>>> rajab495 = Undate.parse("Rajab 495", "Hijri")
>>> rajab495
<Undate 'Rajab 495 Hijrī' 0495-07 (Hijri)>
>>> y2k = Undate.parse("2001", "EDTF")
>>> y2k
<Undate 2001 (Gregorian)>
>>> [str(d.earliest) for d in [rajab495, tammuz4816, y2k]]
['1102-04-28', '1056-07-17', '2001-01-01']
>>> [str(d.precision) for d in [rajab495, tammuz4816, y2k]]
['MONTH', 'DAY', 'YEAR']
>>> sorted([rajab495, tammuz4816, y2k])
[<Undate '26 Tammuz 4816 Anno Mundi' 4816-04-26 (Hebrew)>, <Undate 'Rajab 495 Hijrī' 0495-07 (Hijri)>, <Undate 2001 (Gregorian)>]
```

* * *

For more examples, refer to the [example notebooks](https://github.com/dh-tech/undate-python/tree/main/examples/notebooks/) included in this repository.

## Documentation
Expand Down
35 changes: 29 additions & 6 deletions docs/undate/converters.rst
Original file line number Diff line number Diff line change
@@ -1,19 +1,25 @@
Converters
==========

Overview
--------

.. automodule:: undate.converters.base
:members:
:undoc-members:

Formats
--------

ISO8601
-------
^^^^^^^

.. automodule:: undate.converters.iso8601
:members:
:undoc-members:

Extended Date-Time Format (EDTF)
--------------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: undate.converters.edtf.converter
:members:
Expand All @@ -23,8 +29,25 @@ Extended Date-Time Format (EDTF)
:members:
:undoc-members:

.. transformer is more of an internal, probably doesn't make sense to include
.. .. automodule:: undate.converters.edtf.transformer
.. :members:
.. :undoc-members:

Calendars
---------

Gregorian
^^^^^^^^^

.. automodule:: undate.converters.calendars.gregorian
:members:

Hijri (Islamic calendar)
^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: undate.converters.calendars.hijri.converter
:members:

Anno Mundi (Hebrew calendar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: undate.converters.calendars.hebrew.converter
:members:
Comment on lines +33 to +52
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance calendar documentation based on PR discussions

Based on the PR objectives and discussions about calendar representation, consider adding documentation that addresses:

  1. How different calendars handle precision (year, month, day)
  2. The relationship between calendar types and Undate/UndateInterval
  3. How calendar conversions work between different systems

Example addition:

 Calendars
 ---------
+
+Calendar support in undate handles different dating systems while preserving
+precision and maintaining consistency in date representations. Each calendar
+type is represented as a property of the Undate class, allowing for:
+
+- Explicit calendar type specification
+- Precision preservation (year, month, day)
+- Consistent date comparisons across calendar systems
+

 Gregorian
 ^^^^^^^^^
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Calendars
---------
Gregorian
^^^^^^^^^
.. automodule:: undate.converters.calendars.gregorian
:members:
Hijri (Islamic calendar)
^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: undate.converters.calendars.hijri.converter
:members:
Anno Mundi (Hebrew calendar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: undate.converters.calendars.hebrew.converter
:members:
Calendars
---------
Calendar support in undate handles different dating systems while preserving
precision and maintaining consistency in date representations. Each calendar
type is represented as a property of the Undate class, allowing for:
- Explicit calendar type specification
- Precision preservation (year, month, day)
- Consistent date comparisons across calendar systems
Gregorian
^^^^^^^^^
.. automodule:: undate.converters.calendars.gregorian
:members:
Hijri (Islamic calendar)
^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: undate.converters.calendars.hijri.converter
:members:
Anno Mundi (Hebrew calendar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. automodule:: undate.converters.calendars.hebrew.converter
:members:


2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ readme = "README.md"
license = { text = "Apache-2" }
requires-python = ">= 3.9"
dynamic = ["version"]
dependencies = ["lark", "numpy"]
dependencies = ["lark[interegular]", "numpy", "convertdate", "strenum; python_version < '3.11'"]
authors = [
{ name = "Rebecca Sutton Koeser" },
{ name = "Cole Crawford" },
Expand Down
79 changes: 74 additions & 5 deletions src/undate/converters/base.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
"""
:class:`undate.converters.BaseDateConverter` provides a base class for
:class:`~undate.converters.BaseDateConverter` provides a base class for
implementing date converters, which can provide support for
parsing and generating dates in different formats and also converting
dates between different calendars.
parsing and generating dates in different formats.
The converter subclass :class:`undate.converters.BaseCalendarConverter`
provides additional functionaly needed for calendar conversion.

To add support for a new date format or calendar conversion:
To add support for a new date converter:

- Create a new file under ``undate/converters/``
- For converters with sufficient complexity, you may want to create a submodule;
Expand All @@ -18,6 +19,26 @@
The new subclass should be loaded automatically and included in the converters
returned by :meth:`BaseDateConverter.available_converters`

To add support for a new calendar converter:

- Create a new file under ``undate/converters/calendars/``
- For converters with sufficient complexity, you may want to create a submodule;
see ``undate.converters.calendars.hijri`` for an example.
- Extend ``BaseCalendarConverter`` and implement ``parse`` and ``to_string``
formatter methods as desired/appropriate for your converter as well as the
additional methods for ``max_month``, ``max_day``, and convertion ``to_gregorian``
calendar.
- Import your calendar in ``undate/converters/calendars/__init__.py`` and include in `__all__``
- Add unit tests for the new calendar logic under ``tests/test_converters/calendars/``
- Add the new calendar to the ``Calendar`` enum of supported calendars in
``undate/undate.py`` and confirm that the `get_converter` method loads your
calendar converter correctly (an existing unit test should cover this).
- Consider creating a notebook to demonstrate the use of the calendar
converter.

Calendar converter subclasses are also automatically loaded and included
in the list of available converters.

-------------------
"""

Expand Down Expand Up @@ -90,6 +111,54 @@ def available_converters(cls) -> Dict[str, Type["BaseDateConverter"]]:
"""
Dictionary of available converters keyed on name.
"""
return {c.name: c for c in cls.subclasses()} # type: ignore

@classmethod
def subclasses(cls) -> list[Type["BaseDateConverter"]]:
"""
List of available converters classes. Includes calendar convert
subclasses.
"""
# ensure undate converters are imported
cls.import_converters()
return {c.name: c for c in cls.__subclasses__()} # type: ignore

# find all direct subclasses, excluding base calendar converter
subclasses = cls.__subclasses__()
subclasses.remove(BaseCalendarConverter)
# add all subclasses of calendar converter base class
subclasses.extend(BaseCalendarConverter.__subclasses__())
return subclasses


class BaseCalendarConverter(BaseDateConverter):
"""Base class for calendar converters, with additional methods required
for calendars."""

#: Converter name. Subclasses must define a unique name.
name: str = "Base Calendar Converter"

def min_month(self) -> int:
"""Smallest numeric month for this calendar."""
raise NotImplementedError

def max_month(self, year: int) -> int:
"""Maximum numeric month for this calendar"""
raise NotImplementedError

def first_month(self) -> int:
"""first month in this calendar; by default, returns :meth:`min_month`."""
return self.min_month()

def last_month(self, year: int) -> int:
"""last month in this calendar; by default, returns :meth:`max_month`."""
return self.max_month(year)

def max_day(self, year: int, month: int) -> int:
"""maximum numeric day for the specified year and month in this calendar"""
raise NotImplementedError

def to_gregorian(self, year, month, day) -> tuple[int, int, int]:
"""Convert a date for this calendar specified by numeric year, month, and day,
into the Gregorian equivalent date. Should return a tuple of year, month, day.
"""
raise NotImplementedError
5 changes: 5 additions & 0 deletions src/undate/converters/calendars/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
from undate.converters.calendars.gregorian import GregorianDateConverter
from undate.converters.calendars.hijri import HijriDateConverter
from undate.converters.calendars.hebrew import HebrewDateConverter

__all__ = ["HijriDateConverter", "GregorianDateConverter", "HebrewDateConverter"]
51 changes: 51 additions & 0 deletions src/undate/converters/calendars/gregorian.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
from calendar import monthrange

from undate.converters.base import BaseCalendarConverter


class GregorianDateConverter(BaseCalendarConverter):
"""
Calendar converter class for Gregorian calendar.
"""

#: converter name: Gregorian
name: str = "Gregorian"
#: calendar
calendar_name: str = "Gregorian"

#: known non-leap year
NON_LEAP_YEAR: int = 2022

def min_month(self) -> int:
"""First month for the Gregorian calendar."""
return 1

def max_month(self, year: int) -> int:
"""maximum numeric month for the specified year in the Gregorian calendar"""
return 12

def max_day(self, year: int, month: int) -> int:
"""maximum numeric day for the specified year and month in this calendar"""
# if month is known, use that to calculate
if month:
# if year is known, use it; otherwise use a known non-leap year
# (only matters for February)
year = year or self.NON_LEAP_YEAR

# Use monthrange from python builtin calendar module.
# returns first day of the month and number of days in the month
# for the specified year and month.
_, max_day = monthrange(year, month)
else:
# if year and month are unknown, return maximum possible
max_day = 31

return max_day

Comment on lines +27 to +44
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Missing test coverage for Gregorian calendar implementation

The codebase lacks dedicated test files for the Gregorian calendar implementation. While other calendars (Hebrew, Hijri) have comprehensive test suites, there are no tests specifically covering the Gregorian calendar's max_day method and its edge cases.

  • Create test file at tests/test_converters/test_calendars/test_gregorian/test_gregorian_converter.py
  • Add test cases for:
    • Regular month/year combinations
    • February in leap and non-leap years
    • Edge case where month is unknown (should return 31)
    • Edge case where year is unknown but month is known
    • Edge case where both year and month are unknown
🔗 Analysis chain

Add test coverage for max_day edge cases

The fallback case where both year and month are unknown (line 41) is not covered by tests.

Let's verify the test coverage:

Would you like me to help generate test cases for this scenario?

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for tests of max_day method
rg -A 5 "def test.*max_day" tests/

Length of output: 36


Script:

#!/bin/bash
# Let's first check if the tests directory exists and find test files
fd "test.*\.py$" tests/

# Then let's search for any test files related to gregorian calendar
fd ".*gregorian.*\.py$" tests/

# Finally, let's look for any test class or function containing 'max_day' or 'maxday' case-insensitive
rg -i "max.?day" tests/

Length of output: 1040


Script:

#!/bin/bash
# Let's check if there are any test files specifically for gregorian calendar
fd "test_gregorian.*\.py$" tests/

# Let's also check the test_base.py as it might contain base calendar tests
cat tests/test_converters/test_base.py

# Let's also look for any calendar test base classes that might contain shared tests
rg -l "class.*TestCase" tests/test_converters/

Length of output: 2391

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 41-41: src/undate/converters/calendars/gregorian.py#L41
Added line #L41 was not covered by tests

def to_gregorian(self, year, month, day) -> tuple[int, int, int]:
"""Convert to Gregorian date. This returns the specified by year, month,
and day unchanged, but is provided for consistency since all calendar
converters need to support conversion to Gregorian calendar for
a common point of comparison.
"""
return (year, month, day)
3 changes: 3 additions & 0 deletions src/undate/converters/calendars/hebrew/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
from undate.converters.calendars.hebrew.converter import HebrewDateConverter

__all__ = ["HebrewDateConverter"]
78 changes: 78 additions & 0 deletions src/undate/converters/calendars/hebrew/converter.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
from typing import Union

from convertdate import hebrew # type: ignore
from lark.exceptions import UnexpectedCharacters

from undate.converters.base import BaseCalendarConverter
from undate.converters.calendars.hebrew.parser import hebrew_parser
from undate.converters.calendars.hebrew.transformer import HebrewDateTransformer
from undate.undate import Undate, UndateInterval


class HebrewDateConverter(BaseCalendarConverter):
"""
Converter for Hebrew Anno Mundicalendar.

Support for parsing Anno Mundi dates and converting to Undate and UndateInterval
objects in the Gregorian calendar.
"""

#: converter name: Hebrew
name: str = "Hebrew"
calendar_name: str = "Anno Mundi"

def __init__(self):
self.transformer = HebrewDateTransformer()

def min_month(self) -> int:
"""Smallest numeric month for this calendar."""
return 1

def max_month(self, year: int) -> int:
"""Maximum numeric month for this calendar. In Hebrew calendar, this is 12 or 13
depending on whether it is a leap year."""
return hebrew.year_months(year)

def first_month(self) -> int:
"""First month in this calendar. The Hebrew civil year starts in Tishri."""
return hebrew.TISHRI

def last_month(self, year: int) -> int:
"""Last month in this calendar. Hebrew civil year starts in Tishri,
Elul is the month before Tishri."""
return hebrew.ELUL

def max_day(self, year: int, month: int) -> int:
"""maximum numeric day for the specified year and month in this calendar"""
# NOTE: unreleased v2.4.1 of convertdate standardizes month_days to month_length
return hebrew.month_days(year, month)

def to_gregorian(self, year: int, month: int, day: int) -> tuple[int, int, int]:
"""Convert a Hebrew date, specified by year, month, and day,
to the Gregorian equivalent date. Returns a tuple of year, month, day.
"""
return hebrew.to_gregorian(year, month, day)

def parse(self, value: str) -> Union[Undate, UndateInterval]:
"""
Parse a Hebrew date string and return an :class:`~undate.undate.Undate` or
:class:`~undate.undate.UndateInterval`.
The Hebrew date string is preserved in the undate label.
"""
if not value:
raise ValueError("Parsing empty string is not supported")

# parse the input string, then transform to undate object
try:
# parse the string with our Hebrew date parser
parsetree = hebrew_parser.parse(value)
# transform the parse tree into an undate or undate interval
undate_obj = self.transformer.transform(parsetree)
# set the original date as a label, with the calendar name
undate_obj.label = f"{value} {self.calendar_name}"
return undate_obj
except UnexpectedCharacters as err:
raise ValueError(f"Could not parse '{value}' as a Hebrew date") from err

# do we need to support conversion the other direction?
# i.e., generate a Hebrew date from an abitrary undate or undate interval?
Loading
Loading