The training project "Difference Generator" on the Python Development course on Hexlet.io.
List of dependencies, without which the project code will not work correctly:
- python = "^3.8"
- pyyaml = "^6.0"
Difference Generator is a program that determines the difference between two data structures. This is a popular task for which there are many online services, for example: http://www.jsondiff.com/. A similar mechanism is used when outputting tests or when automatically tracking changes in configuration files.
The main question in the project: how to describe the internal representation of the diff between the files, so that it is as convenient as possible. Although there are many different ways to do this, only a few of them lead to simple code.
Working with trees and tree recursion is a very good way to build algorithmic thinking. This matters because real-world development involves constant data processing, various transformations, and output of collections.
To build a diff between two structures, many operations have to be done: reading files, parsing incoming data, building a tree of differences, and generating the necessary output.
Utility features:
- Supported file formats: YAML, JSON.
- Report generation as plain text, structured text or JSON.
- Can be used as CLI tool or external library.
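Support for both formats can be sketched as a small loader that picks a parser from the file extension. This is only an illustration, not the project's actual code: the names `parse_data` and `load_data` are hypothetical, and YAML parsing is assumed to go through PyYAML's `yaml.safe_load`.

```python
import json
from pathlib import Path


def parse_data(text, extension):
    """Parse raw text according to an extension such as '.json' or '.yml'."""
    extension = extension.lower()
    if extension == ".json":
        return json.loads(text)
    if extension in (".yaml", ".yml"):
        # PyYAML is imported lazily so JSON-only use needs no extra package.
        import yaml
        return yaml.safe_load(text)
    raise ValueError(f"Unsupported file format: {extension!r}")


def load_data(file_path):
    """Read a file and parse it with the parser matching its extension."""
    path = Path(file_path)
    return parse_data(path.read_text(encoding="utf-8"), path.suffix)
```

Keeping parsing separate from file reading means the parser can be exercised on plain strings, without touching the file system.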
Before installing the package, you need to make sure that you have Python version 3.8 or higher installed:
# Windows, Ubuntu, MacOS:
>> python --version # or python -V
Python 3.8.0+
# or, if the interpreter is installed as python3:
>> python3 --version
If you have an older version installed, update with the following commands:
# Windows: pip cannot upgrade the interpreter itself;
# download and run the latest installer from the official Python website.
# Ubuntu:
>> sudo apt-get upgrade python3.X
# MacOS:
>> brew update && brew upgrade python
# * X - version number to be installed
If you don't have Python installed, you can download and install it from the official Python website. If you are an Ubuntu or MacOS user, then it is better to do this procedure through package managers. Open a terminal and run the command for your operating system:
# Ubuntu:
>> sudo apt update
>> sudo apt install python3.X
# MacOS:
# https://brew.sh/index_ru.html
>> brew install python@3.X
# * X - version number to be installed
Operating system builds and versions differ widely, which makes it impossible to write one set of instructions for all of them. If you are running an OS other than those listed above, or you get errors from the suggested commands, search Stack Overflow for answers; someone else may have run into the same problem before you. Setting up the environment is not easy!
The project uses the Poetry manager. Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. You can read more about this tool on the official Poetry website.
Poetry provides a custom installer that will install poetry isolated from the rest of your system by vendorizing its dependencies. This is the recommended way of installing poetry.
# Windows (WSL), Linux, MacOS:
>> curl -sSL https://install.python-poetry.org | python3 -
# Windows (Powershell):
>> (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
# If you have installed Python through the Microsoft Store, replace "py" with "python" in the command above.
On some systems, python may still refer to Python 2 instead of Python 3. The Poetry Team suggests using the python3 binary to avoid ambiguity.
By default, Poetry is installed into a platform-specific directory:
- ~/Library/Application Support/pypoetry on MacOS.
- ~/.local/share/pypoetry on Linux/Unix.
- %APPDATA%\pypoetry on Windows.
If you wish to change this, you may define the $POETRY_HOME environment variable:
>> curl -sSL https://install.python-poetry.org | POETRY_HOME=/etc/poetry python3 -
Add Poetry to your PATH.
Once Poetry is installed and in your $PATH, you can execute the following:
>> poetry --version
To work with the package, you need to clone the repository to your computer. This is done using the git clone command. Clone the project on the command line:
# clone via HTTPS:
>> git clone https://github.com/IgorGakhov/python-project-lvl2.git
# clone via SSH:
>> git clone git@github.com:IgorGakhov/python-project-lvl2.git
Then change into the project directory and install the package:
>> cd python-project-lvl2
>> poetry build
>> python3 -m pip install --user dist/*.whl
# If you have previously installed a package and want to update it, use the following command:
# >> python3 -m pip install --user --force-reinstall dist/*.whl
Finally, we can move on to using the project functionality!
from gendiff import generate_diff
diff = generate_diff(file_path1, file_path2)
The utility provides the ability to call the help command if you find it difficult to use:
>> gendiff --help
usage: gendiff [-h] [-f {stylish,json,plain}] first_file second_file

Compares two configuration files and shows a difference.

positional arguments:
  first_file
  second_file

options:
  -h, --help            show this help message and exit
  -f {stylish,json,plain}, --format {stylish,json,plain}
                        set format of output (default: stylish)
Both absolute and relative paths to files are supported.
If the format option is omitted, the output is produced in the stylish format by default.
The diff is built based on how the files have changed relative to each other. Keys are displayed in alphabetical order.
The absence of a plus or minus indicates that the key is in both files, and its values are the same. In all other situations, the key value is either different, or the key is in only one file.
Example:
>> gendiff filepath1.json filepath2.json
{
  - follow: false
    host: hexlet.io
  - proxy: 123.234.53.22
  - timeout: 50
  + timeout: 20
  + verbose: true
}
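The plus/minus convention can be illustrated with a minimal renderer for a flat diff. This is a sketch under stated assumptions, not the project's formatter: the function names are hypothetical, the input is assumed to be a dictionary of nodes with "value" and "node type" fields, the two-space offset before the sign is an assumption, and nested values are not handled.

```python
# Sign shown before a key for each single-line node type (assumed mapping).
SIGNS = {"ADDED": "+", "REMOVED": "-", "UNCHANGED": " "}


def format_scalar(value):
    """Render a Python value the way it would appear in JSON/YAML text."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return "null"
    return str(value)


def render_stylish_flat(diff):
    """Render a flat diff (no nested dictionaries) in the stylish layout."""
    lines = ["{"]
    for key, node in diff.items():
        kind, value = node["node type"], node["value"]
        if kind == "UPDATED":
            # A changed key is shown twice: old value with "-", new with "+".
            lines.append(f"  - {key}: {format_scalar(value['old'])}")
            lines.append(f"  + {key}: {format_scalar(value['new'])}")
        else:
            lines.append(f"  {SIGNS[kind]} {key}: {format_scalar(value)}")
    lines.append("}")
    return "\n".join(lines)


example = {
    "follow": {"value": False, "node type": "REMOVED"},
    "host": {"value": "hexlet.io", "node type": "UNCHANGED"},
    "timeout": {"value": {"old": 50, "new": 20}, "node type": "UPDATED"},
    "verbose": {"value": True, "node type": "ADDED"},
}
print(render_stylish_flat(example))
```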
The text describes the changes as if the second object had been merged into the first.
- If the new property value is complex, then [complex value] is written.
- If the property is nested, then the entire path from the root is displayed, not just the name of the immediate parent.
Example:
>> gendiff --format plain filepath1.json filepath2.json
Property 'follow' was removed
Property 'proxy' was removed
Property 'timeout' was updated. From 50 to 20
Property 'verbose' was added with value: true
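The plain format rules can be sketched as a short recursive function. This is an illustrative sketch rather than the project's implementation: the function names are hypothetical, the input is assumed to be a dictionary of nodes with "value" and "node type" fields, and joining path segments with a dot and quoting string values are assumptions that the example above does not confirm.

```python
def format_plain_value(value):
    """Render a value for the plain report."""
    if isinstance(value, dict):
        return "[complex value]"
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return "null"
    if isinstance(value, str):
        return f"'{value}'"  # assumption: strings are shown in quotes
    return str(value)


def plain_lines(diff, parent=""):
    """Collect report lines, building the full path for nested properties."""
    lines = []
    for key, node in diff.items():
        path = f"{parent}.{key}" if parent else key
        kind, value = node["node type"], node["value"]
        if kind == "NESTED":
            lines.extend(plain_lines(value, path))
        elif kind == "ADDED":
            lines.append(
                f"Property '{path}' was added with value: {format_plain_value(value)}"
            )
        elif kind == "REMOVED":
            lines.append(f"Property '{path}' was removed")
        elif kind == "UPDATED":
            old = format_plain_value(value["old"])
            new = format_plain_value(value["new"])
            lines.append(f"Property '{path}' was updated. From {old} to {new}")
        # UNCHANGED nodes produce no line in the plain report.
    return lines


def render_plain(diff):
    """Join the collected lines into the final report string."""
    return "\n".join(plain_lines(diff))
```

Collecting lines in a list and joining once avoids blank lines when a nested section contains no changes.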
JSON (JavaScript Object Notation) is a standard text format for representing structured data based on JavaScript object syntax. It is usually used to transfer data in web applications (e.g. sending some data from the server to the client so that it can be displayed on a web page or vice versa).
Example:
>> gendiff --format json filepath1.json filepath2.json
{
    "follow": {
        "value": false,
        "node type": "REMOVED"
    },
    "host": {
        "value": "hexlet.io",
        "node type": "UNCHANGED"
    },
    "proxy": {
        "value": "123.234.53.22",
        "node type": "REMOVED"
    },
    "timeout": {
        "value": {
            "old": 50,
            "new": 20
        },
        "node type": "UPDATED"
    },
    "verbose": {
        "value": true,
        "node type": "ADDED"
    }
}
Node types:
- "ADDED": key was not present in the first file, but was present in the second file.
- "REMOVED": key was present in the first file, but not present in the second file.
- "UNCHANGED": key exists in both files and its values match.
- "UPDATED": key exists in both files, but its values do not match.
- "NESTED": similar to "UPDATED", but here both values are dictionaries.
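The node types above can be sketched as a recursive comparison of two dictionaries. This is a minimal sketch, not the project's own code: the function name `build_diff` is hypothetical, and two details are assumptions: the "value" of a "NESTED" node holds the child diff, and two equal dictionaries are reported as "UNCHANGED".

```python
def build_diff(first, second):
    """Compare two dictionaries and describe every key with a node type."""
    diff = {}
    # The union of keys, sorted so the report is in alphabetical order.
    for key in sorted(first.keys() | second.keys()):
        if key not in first:
            diff[key] = {"value": second[key], "node type": "ADDED"}
        elif key not in second:
            diff[key] = {"value": first[key], "node type": "REMOVED"}
        elif first[key] == second[key]:
            diff[key] = {"value": first[key], "node type": "UNCHANGED"}
        elif isinstance(first[key], dict) and isinstance(second[key], dict):
            # Both sides are dictionaries that differ: descend one level.
            diff[key] = {
                "value": build_diff(first[key], second[key]),
                "node type": "NESTED",
            }
        else:
            diff[key] = {
                "value": {"old": first[key], "new": second[key]},
                "node type": "UPDATED",
            }
    return diff
```

Because the result is built from plain dictionaries, it could be passed to `json.dumps` as is, which is one reason such a shape is convenient as an internal representation.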
List of dev-dependencies:
- flake8 = "^4.0.1"
- pytest = "^7.1.2"
- pytest-cov = "^3.0.0"
.
├── gendiff
│   ├── __init__.py
│   ├── cli.py
│   ├── file_processor
│   │   ├── __init__.py
│   │   ├── gendiff.py
│   │   ├── file_handler.py
│   │   ├── data_loader.py
│   │   └── diff_assembler.py
│   ├── formatters
│   │   ├── __init__.py
│   │   ├── tree_render.py
│   │   ├── stylish.py
│   │   ├── plain.py
│   │   └── json.py
│   └── scripts
│       ├── __init__.py
│       └── run.py
├── tests
│   ├── fixtures
│   │   ├── diff_requests
│   │   └── diff_responses
│   ├── test_cli.py
│   └── test_gendiff.py
├── Makefile
├── pyproject.toml
├── README.md
└── setup.cfg
The commands most used in development are listed in the Makefile:
- make package-install: installs the package in the user environment.
- make build: builds the distribution of the Poetry package.
- make package-force-reinstall: reinstalls the package in the user environment.
- make lint: checks the code with the linter.
- make test: runs the tests.
- make fast-check: builds the distribution, reinstalls it in the user environment, and checks the code with tests and the linter.
Thank you for your attention!
Author: @IgorGakhov