Python-Documents-Modifier

Description

Python-Documents-Modifier is a library for editing DOCX and ODT documents. It was developed to provide the editing functionality needed when checking documents for compliance with standards.

Features

The library supports the following operations:

  • Adding comments (DOCX & ODT)
  • Editing text style (DOCX & ODT)
  • Deleting comments (DOCX & ODT)
  • Editing comments (DOCX only)
  • Editing list style (DOCX only)

Requirements

lxml==5.1.0
python-docx==1.1.0
typing_extensions==4.9.0

Documentation

The documentation is available at the link

Installation

git clone https://github.com/LISA-ITMO/Python-Documents-Modifier.git

Getting started

Clone the repository, then import the redactor classes in your script and create an instance for each document you want to edit:

from src.docx.docx_redactor import DOCXRedactor
from src.odt.odtredactor import ODTRedactor

doc_docx = DOCXRedactor('your_document.docx')
doc_odt = ODTRedactor('your_document.odt')
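
Because installation is a `git clone` rather than a package install, the `src...` imports above presumably resolve only when the cloned repository root is on Python's import path. A minimal sketch, assuming the repository was cloned into a `Python-Documents-Modifier` directory under the current working directory (this location is an assumption; adjust `REPO_ROOT` to match your setup), to run before those imports:

```python
import sys
from pathlib import Path

# Assumed clone location; change this to wherever the repository actually lives.
REPO_ROOT = Path("Python-Documents-Modifier").resolve()

# Put the repository root first on the import path so that
# `from src.docx.docx_redactor import DOCXRedactor` can be resolved.
if str(REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(REPO_ROOT))
```

Running your script from inside the repository root should make this step unnecessary.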

Examples for using functions

  1. Add a comment
    paraId = '00F00080' # DOCX | ID of the paragraph to which you want to add a comment
    text = 'Research has shown that' # ODT | Text to which you want to add a comment
    
    doc_docx.add_comment_by_id(paraId, 'your_comment', 'author')
    doc_odt.add_comment_by_text(text, 'your_comment', 'author')
  2. Edit a comment
    commentId = '0' # DOCX | ID of the comment you want to edit
    doc_docx.edit_comment_by_id(commentId, 'new_comment_text', 'new_author')
  3. Delete a comment
    commentId = '0' # DOCX | ID of the comment you want to delete
    nameId = '1' # ODT | NAME_ID of the comment you want to delete
    
    doc_docx.delete_comment_by_id(commentId)
    doc_odt.delete_comment_by_id(nameId)
  4. Change text style
    from src.docx.enum.font_style import FontStyle
    from src.docx.enum.underline_style import UnderlineStyle
    from src.docx.enum.color import Color

    paraId = '00F00080' # DOCX | ID of the paragraph whose style you want to edit
    text = 'Scientists established back in 1984 that' # ODT | Text whose style you want to edit
    
    # The opening parenthesis must follow the method name on the same line
    doc_docx.edit_style_by_id(
        paraId,
        size=12,
        fontStyle=FontStyle.ARIAL,
        color=Color.RED,
        underline=UnderlineStyle.DOUBLE,
        italic=False,
        bold=True
    )
    
    doc_odt.edit_style_by_text(
        text,
        font_name='Arial',
        font_size=12
    )
  5. Edit list style
    from src.docx.enum.ListStyle import ListStyle

    paraIds = ['00F00080', '11D11171'] # DOCX | IDs of paragraphs that contain numPr (i.e. belong to a list)
    
    doc_docx.edit_list_style_by_paraIds(
        paraIds,
        list_style=ListStyle.bullet,
        bullet_symbol='@'
    )
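
The DOCX examples above take paragraph IDs and comment IDs as input, but do not show where to find them. The following is a minimal sketch using only the standard library. It assumes that the library's paragraph ID corresponds to the `w14:paraId` attribute in `word/document.xml` and that the comment ID corresponds to the `w:id` attribute in `word/comments.xml`. The helper names `list_paragraph_ids` and `list_comment_ids` are hypothetical and are not part of Python-Documents-Modifier.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespaces used inside a DOCX archive
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14_NS = "http://schemas.microsoft.com/office/word/2010/wordml"


def list_paragraph_ids(docx_file):
    """Return (paraId, text) pairs for paragraphs that carry a w14:paraId.

    Hypothetical helper; assumes the IDs expected by the library are the
    w14:paraId attributes. Paragraphs without the attribute are skipped.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    pairs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        para_id = para.get(f"{{{W14_NS}}}paraId")
        if para_id is None:
            continue
        # Concatenate all text runs of the paragraph
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        pairs.append((para_id, text))
    return pairs


def list_comment_ids(docx_file):
    """Return (comment id, author) pairs from word/comments.xml.

    Hypothetical helper; returns an empty list if the document
    has no comments part.
    """
    with zipfile.ZipFile(docx_file) as archive:
        try:
            data = archive.read("word/comments.xml")
        except KeyError:
            return []
    root = ET.fromstring(data)
    return [
        (c.get(f"{{{W_NS}}}id"), c.get(f"{{{W_NS}}}author"))
        for c in root.iter(f"{{{W_NS}}}comment")
    ]
```

Calling `list_paragraph_ids('your_document.docx')` prints nothing by itself; iterate over the returned pairs to pick the paragraph you need. Documents not saved by a recent version of Word may not contain `w14:paraId` attributes at all, in which case the first helper returns an empty list.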

Contacts

slavamarcin@yandex.ru
vlad-tershch@yandex.ru

Conferences

  1. XIII ITMO Congress of Young Scientists:
    • Shafikov M.A., Tereshchenko V.V., Martsinkevich V.I., Krylov M.M. Development of a Python library for editing electronic document objects - 2024.

Authors

Shafikov Maxim
Krylov Michael
Tereshchenko Vladislav
Martsinkevich Viacheslav