Merge branch 'main' into main
wangjc640 authored Mar 13, 2021
2 parents db8aff0 + f6cfd23 commit c6d52d6
Showing 5 changed files with 27 additions and 2 deletions.
25 changes: 25 additions & 0 deletions README.md
@@ -47,11 +47,36 @@ data = pd.DataFrame({
'PetalWidthCm':[0.2, 0.1, 0.2],
'Species':['Iris-setosa','Iris-virginica', 'Iris-germanica']
})

data_with_NA = pd.DataFrame({
'SepalLengthCm':[5.1, 4.9, 4.7],
'SepalWidthCm':[1.4, 1.4, 1.3],
'PetalWidthCm':[0.2, 0.1, None]
})

data_with_outlier = pd.DataFrame({
'SepalLengthCm':[5.1, 4.9, 4.7, 5.2, 5.1, 5.2, 5.1, 4.8],
'SepalWidthCm':[1.4, 1.4, 1.3, 1.2, 1.2, 1.3, 1.6, 1.3],
'PetalWidthCm':[0.2, 0.1, 30, 0.2, 0.3, 0.1, 0.4, 0.5]
})
```

The eda_utils_py package will help you to:
- Diagnose data quality: resolve skewed data by identifying missing values and outliers and providing a corresponding remedy.

```python
imputer(data_with_NA)
```
Output:

![imputer_output](images/imputer_output.png)
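
The README does not say which strategy `imputer` applies by default, so the following is only a minimal sketch of the general idea in plain pandas, assuming mean imputation. The helper name `mean_impute` is hypothetical and is not part of eda_utils_py.

```python
import pandas as pd

# Same toy data as in the README: one missing value in PetalWidthCm.
data_with_NA = pd.DataFrame({
    'SepalLengthCm': [5.1, 4.9, 4.7],
    'SepalWidthCm': [1.4, 1.4, 1.3],
    'PetalWidthCm': [0.2, 0.1, None]
})


def mean_impute(df):
    """Hypothetical helper: fill missing numeric values with the column mean."""
    # df.mean() skips NaN by default; fillna with a Series fills column by column.
    return df.fillna(df.mean(numeric_only=True))


imputed = mean_impute(data_with_NA)
print(imputed)
```

Here the missing `PetalWidthCm` entry is replaced by the mean of the two observed values in that column. Other strategies (median, a constant) would only change the value passed to `fillna`.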

```python
outlier_identifier(data_with_outlier, method = "median")
```
Output:

![outlier_output](images/outlier_output.png)
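
How `outlier_identifier` uses the median internally is not shown in this commit. As a rough illustration of one common median-based approach, the sketch below flags values whose modified z-score (based on the median absolute deviation) exceeds a threshold. The function name `flag_outliers_mad` and the cutoff of 3.5 are assumptions made for this example, not the package's behavior.

```python
import pandas as pd

# Same toy data as in the README: 30 in PetalWidthCm is the obvious outlier.
data_with_outlier = pd.DataFrame({
    'SepalLengthCm': [5.1, 4.9, 4.7, 5.2, 5.1, 5.2, 5.1, 4.8],
    'SepalWidthCm': [1.4, 1.4, 1.3, 1.2, 1.2, 1.3, 1.6, 1.3],
    'PetalWidthCm': [0.2, 0.1, 30, 0.2, 0.3, 0.1, 0.4, 0.5]
})


def flag_outliers_mad(df, threshold=3.5):
    """Hypothetical helper: boolean mask of values with a large modified z-score."""
    median = df.median()
    # Median absolute deviation per column; a zero MAD becomes NaN to avoid division by zero.
    mad = (df - median).abs().median().replace(0, float("nan"))
    modified_z = 0.6745 * (df - median) / mad
    # Comparisons against NaN are False, so columns with zero MAD flag nothing.
    return modified_z.abs() > threshold


flags = flag_outliers_mad(data_with_outlier)
print(data_with_outlier[flags.any(axis=1)])
```

On this data only the row containing `30` is flagged. A median-based rule is less sensitive to the outlier itself than a mean and standard deviation rule, which is the usual reason to prefer it on small samples.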

- This package can help you easily plot a correlation matrix along with its values to help explore data.
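
The package's own plotting function is not named in this part of the README, so the sketch below only shows how the values behind such a correlation matrix can be computed with pandas' built-in `DataFrame.corr`. The small data frame is made up for illustration.

```python
import pandas as pd

# Illustrative numeric data (not taken from the package's documentation).
measurements = pd.DataFrame({
    'SepalLengthCm': [5.1, 4.9, 4.7, 5.2, 5.1],
    'SepalWidthCm': [1.4, 1.4, 1.3, 1.2, 1.6],
    'PetalWidthCm': [0.2, 0.1, 0.2, 0.3, 0.4]
})

# Pairwise Pearson correlation between all numeric columns.
corr = measurements.corr()
print(corr.round(2))
```

The resulting square table is what a correlation-matrix plot annotates cell by cell: the diagonal is 1 and the matrix is symmetric.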

2 changes: 1 addition & 1 deletion eda_utils_py/__init__.py
@@ -1 +1 @@
-__version__ = '0.1.6'
+__version__ = '0.1.7'
Binary file added images/imputer_output.png
Binary file added images/outlier_output.png
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "eda_utils_py"
-version = "0.1.6"
+version = "0.1.7"
description = "Python package that contains util functions for eda process"
authors = ["Chuang Wang <chuangw.sde@gmail.com>"]
license = "MIT"