UBC-MDS · MCatherine1994 · Jan 23, 2025
diff --git a/README.md b/README.md
@@ -26,6 +26,22 @@ $ pip install pyeda
 
 ## Usage
 
+`pyeda` can be used to verify the format of data files and perform basic exploratory data analysis as follows:
+```python
+from pyeda.check_csv import check_csv
+from pyeda.pymissing_values_summary import missing_values_summary
+from pyeda.data_summary import get_summary_statistics
+
+# Check if the given data file is in CSV format
+data_file_path = "data.csv"  # path to your data file
+if not check_csv(data_file_path):
+    raise TypeError("The given file is not in CSV format. Please check your data file.")
+
+# Check if the data file has any missing values
+
+# Get data summary
+```
+
 ## Contributing
 
 Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

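The README snippet relies on `check_csv` returning a boolean. The docstring in `src/pyeda/check_csv.py` describes it as a check on the file extension. A minimal standard-library sketch of that idea is shown here; it is an illustration under that assumption, not the package's actual implementation, and `has_csv_extension` is a hypothetical name.

```python
from pathlib import Path


def has_csv_extension(file_path):
    """Hypothetical helper: True if the path ends in .csv (case-insensitive)."""
    return Path(file_path).suffix.lower() == ".csv"


print(has_csv_extension("data.csv"))   # True
print(has_csv_extension("data.xlsx"))  # False
```

An extension check of this kind does not open the file, so a mislabeled file would still pass.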
diff --git a/docs/example.ipynb b/docs/example.ipynb
@@ -1,45 +1,132 @@
 {
-    "cells": [
-        {
-            "cell_type": "markdown",
-            "source": [
-                "# Example usage\n",
-                "\n",
-                "To use `pyeda` in a project:"
-            ],
-            "metadata": {}
-        },
-        {
-            "cell_type": "code",
-            "execution_count": null,
-            "source": [
-                "import pyeda\n",
-                "\n",
-                "print(pyeda.__version__)"
-            ],
-            "outputs": [],
-            "metadata": {}
-        }
-    ],
-    "metadata": {
-        "kernelspec": {
-            "display_name": "Python 3",
-            "language": "python",
-            "name": "python3"
-        },
-        "language_info": {
-            "codemirror_mode": {
-                "name": "ipython",
-                "version": 3
-            },
-            "file_extension": ".py",
-            "mimetype": "text/x-python",
-            "name": "python",
-            "nbconvert_exporter": "python",
-            "pygments_lexer": "ipython3",
-            "version": "3.8.5"
-        }
-    },
-    "nbformat": 4,
-    "nbformat_minor": 4
-}
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Example usage\n",
+    "\n",
+    "Here we will demonstrate how to use `pyeda` to verify the format of data files and perform basic exploratory data analysis.\n",
+    "\n",
+    "## Imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import csv\n",
+    "from pyeda.check_csv import check_csv\n",
+    "from pyeda.pymissing_values_summary import missing_values_summary\n",
+    "from pyeda.data_summary import get_summary_statistics"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create a CSV file\n",
+    "\n",
+    "We'll first create a CSV file to work with."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Define file name\n",
+    "file_name = \"sample_data.csv\"\n",
+    "\n",
+    "# Create data with some empty values\n",
+    "data = [\n",
+    "    [\"Name\", \"Age\", \"City\"],\n",
+    "    [\"Alice\", \"25\", \"New York\"],\n",
+    "    [\"Bob\", \"\", \"Los Angeles\"],  # Missing age\n",
+    "    [\"Charlie\", \"30\", \"\"],       # Missing city\n",
+    "    [\"Emily\", \"22\", \"Chicago\"],\n",
+    "]\n",
+    "\n",
+    "# Write data to a CSV file\n",
+    "with open(file_name, mode=\"w\", newline=\"\") as file:\n",
+    "    writer = csv.writer(file)\n",
+    "    writer.writerows(data)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Check if the data file is in CSV format\n",
+    "\n",
+    "Before starting exploratory data analysis, verify that the given file is a CSV. This can be done by calling the `check_csv` function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if not check_csv(file_name):\n",
+    "    raise TypeError(\"The given file is not in CSV format. Please check your data file.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Check if the data file has any missing values\n",
+    "\n",
+    "After verifying the data file type, the next step is to check whether the data contains any missing values using `missing_values_summary`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Get data summary\n",
+    "\n",
+    "Next, call the `get_summary_statistics` function to get summary information about the data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.16"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
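The notebook leaves the missing-values cell empty, and this diff does not show the signature of `missing_values_summary`. As a rough, standard-library sketch of the kind of result such a step might produce, the snippet below counts empty fields per column in the same sample data. `count_missing` and `SAMPLE` are hypothetical names and are not part of `pyeda`.

```python
import csv
import io

# Same rows the notebook writes to sample_data.csv
SAMPLE = (
    "Name,Age,City\n"
    "Alice,25,New York\n"
    "Bob,,Los Angeles\n"
    "Charlie,30,\n"
    "Emily,22,Chicago\n"
)


def count_missing(csv_text):
    """Hypothetical helper: count empty fields in each column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    counts = {name: 0 for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            # An empty string (or None for a short row) counts as missing
            if not row[name]:
                counts[name] += 1
    return counts


print(count_missing(SAMPLE))  # {'Name': 0, 'Age': 1, 'City': 1}
```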
diff --git a/docs/sample_data.csv b/docs/sample_data.csv
@@ -0,0 +1,5 @@
+Name,Age,City
+Alice,25,New York
+Bob,,Los Angeles
+Charlie,30,
+Emily,22,Chicago
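The "Get data summary" cell in the notebook is also empty, and the behavior of `get_summary_statistics` is not shown in this diff. The sketch below is only an assumption about what a summary of the numeric `Age` column in this sample file could look like, using the standard library. `numeric_summary` and `SAMPLE` are hypothetical names.

```python
import csv
import io
import statistics

# Contents of docs/sample_data.csv as added in this diff
SAMPLE = (
    "Name,Age,City\n"
    "Alice,25,New York\n"
    "Bob,,Los Angeles\n"
    "Charlie,30,\n"
    "Emily,22,Chicago\n"
)


def numeric_summary(csv_text, column):
    """Hypothetical helper: count, mean, min and max of one numeric column.

    Empty fields are skipped; a column with no values raises StatisticsError.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader if row[column]]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


print(numeric_summary(SAMPLE, "Age"))
```

Skipping empty fields means the count reflects only rows with a value, which is one of several reasonable ways to treat missing data.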
diff --git a/src/pyeda/check_csv.py b/src/pyeda/check_csv.py
@@ -1,16 +1,17 @@
 import pandas as pd
 
 def check_csv(file_path):
-    """
-    Check if the given file is a CSV file by its extension.
+    """Check if the given file is a CSV file by its extension.
 
     Parameters
     ----------
-    file_path (str): Path to the file.
+    file_path : str
+        Path to the file.
 
     Returns
     -------
-    bool: True if the file is a CSV file, False otherwise.
+    bool
+        True if the file is a CSV file, False otherwise.
 
     Examples
     --------