# DQOps Data Quality Operations Center concepts
Follow this guide to learn the concepts behind DQOps Data Quality Operations Center and start measuring the data quality of your data sources.
## Introduction
### **What is DQOps?**
DQOps is a powerful open-source data quality and observability platform designed to address the entire data lifecycle,
12
+
from initial data assessment to advanced automation.
* Quickly start a local data quality environment.
* Configure data quality checks using the user interface or YAML files. Automate this process with the rule mining engine and built-in data quality policies.
* Run data quality checks directly from your data pipelines.
* Utilize the user interface for easy testing and issue review.
* Receive incident notifications via email or webhook, and create multiple notification filters to customize alerts for specific scenarios.
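To sketch what running checks from a data pipeline can look like, the snippet below gates a pipeline step on a data quality results summary. The shape of the `summary` dictionary and the `evaluate_gate` helper are illustrative assumptions, not the exact payload returned by the DQOps REST API:

```python
# Sketch: fail a pipeline step when data quality results are too severe.
# The summary dict shape below is an illustrative assumption, not the
# exact response format of the DQOps "run checks" job.

SEVERITY_ORDER = {"valid": 0, "warning": 1, "error": 2, "fatal": 3}


class DataQualityGateError(RuntimeError):
    """Raised when check results are too severe for the pipeline to continue."""


def evaluate_gate(check_results: dict, fail_at: str = "error") -> bool:
    """Return True when the pipeline may continue, raise otherwise."""
    highest = check_results.get("highest_severity", "valid")
    if SEVERITY_ORDER[highest] >= SEVERITY_ORDER[fail_at]:
        raise DataQualityGateError(
            f"data quality gate failed: highest severity is '{highest}'"
        )
    return True


# Example usage with a fabricated results summary: warnings alone
# do not stop the pipeline, errors and fatal issues do.
summary = {"checks_executed": 24, "highest_severity": "warning"}
print(evaluate_gate(summary))
```

A real integration would obtain the summary from the DQOps instance after triggering a check run, then call a gate like this before loading data downstream.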
### **Who needs DQOps?**
DQOps is designed to meet the diverse needs of various data stakeholders across different stages of the data platform lifecycle.
**Data engineers** need to integrate data quality checks directly into data pipelines, test the quality of data sources before they are transformed, and verify the data quality of target tables populated by the pipeline.
**Data stewards**, who ensure the trustworthiness and usability of data, need a robust data quality platform to validate the quality of data assets and manage data cleansing workflows to address any issues.
**Data consumers** (data scientists and data analysts) want to know the data quality score of tables and quickly assert their expectations about the data quality of essential data sources.
### **When do you need DQOps?**
DQOps is essential for organizations that:
* Need to assess the data quality of new data sources.
* Want to establish a robust data observability practice that monitors data ingestion, transformation, and storage processes to detect anomalies, errors, or deviations from expected behavior.
* Aim to demonstrate data quality issues to business sponsors using the user interface and data quality dashboards.
### **How does DQOps work?**
1. Download DQOps directly from PyPI.
2. Run DQOps locally without configuring any databases, or set it up in an on-premises environment.
3. Assess your data with basic statistics and automatically configure profiling checks using the rule mining engine.
4. Activate data observability by setting up monitoring checks that automatically detect new data quality issues in the future.
5. Receive notifications for critical issues and track their resolution.
DQOps does not use a database to store the configuration. Instead, all data quality configuration files are stored in
YAML files. This code-first approach allows the data quality check configuration to be stored in a source code repository
and versioned along with other pipeline or machine learning code.
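For illustration, a single table's configuration lives in a YAML file that can sit next to the rest of your code. The file below is a minimal sketch following the DQOps table YAML layout; the file name, check names, and rule threshold are assumptions to verify against the schema of your DQOps version:

```yaml
# customers.dqotable.yaml -- illustrative sketch of a DQOps table configuration.
# Check and rule names are assumptions; verify them against your DQOps version.
apiVersion: dqo/v1
kind: table
spec:
  columns:
    customer_id:
      monitoring_checks:
        daily:
          nulls:
            daily_nulls_percent:
              error:
                max_percent: 2.0   # raise an error when more than 2% of values are null
```

Because the whole configuration is plain text, a pull request that changes a threshold is reviewed and versioned like any other code change.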
## List of DQOps concepts
This article is a dictionary of DQOps terms. Click on the links to learn about every concept.
DQOps follows a two-stage data quality process. The first step is a [data quality assessment](data-quality-process.md#data-quality-assessment) using basic statistics and the [data profiler](definition-of-data-quality-checks/data-profiling-checks.md).
This step identifies confirmed data quality issues. In the second stage,
users configure [monitoring](definition-of-data-quality-checks/data-observability-monitoring-checks.md) and [partition checks](definition-of-data-quality-checks/partition-checks.md) that regularly verify data quality using Data Observability.
DQOps has two methods of data quality assessment. The first method is capturing [basic data statistics](working-with-dqo/collecting-basic-data-statistics.md).
When you know how the table is structured, you can use the [rule mining engine](dqo-concepts/data-quality-rule-mining.md) to automatically propose the configuration of [profiling data quality checks](dqo-concepts/definition-of-data-quality-checks/data-profiling-checks.md)
to detect the most common data quality issues.
[:octicons-arrow-right-24: Review data statistics](working-with-dqo/collecting-basic-data-statistics.md)
DQOps simplifies data quality management with [data policies that automatically activate checks](dqo-concepts/data-observability.md) on all imported tables and columns.
You have full control to enable, disable, or modify existing policies, and even create new ones.
There are other methods to activate data quality checks. You can:
[:octicons-arrow-right-24: Copy the checks activated by the rule mining engine](dqo-concepts/data-quality-rule-mining.md)
[:octicons-arrow-right-24: Manually activate checks using the check editor](working-with-dqo/run-data-quality-checks.md)
Over 50 built-in data quality dashboards let you drill down to the problem.
!!! success "Data quality KPIs"
Organizations often have separate operations teams that react to data quality incidents first, and engineering teams
that can fix the problems.
DQOps reduces alert fatigue by grouping similar data quality issues into **data quality incidents**.
You can receive incident notifications via email or webhook, and [create multiple notification filters](dqo-concepts/grouping-data-quality-issues-to-incidents.md#incident-notifications)
to customize alerts for specific scenarios.
[:octicons-arrow-right-24: Data quality incident workflow](dqo-concepts/grouping-data-quality-issues-to-incidents.md)
[:octicons-arrow-right-24: Sending notifications to Slack](integrations/slack/configuring-slack-notifications.md)
[:octicons-arrow-right-24: Sending notifications to any ticketing platform using webhooks](integrations/webhooks/index.md)
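As a sketch of the webhook side, a receiver only has to map the incoming JSON notification to a ticket in your own system. The payload fields used below (`incident_id`, `connection`, `highest_severity`) are illustrative assumptions, not the exact DQOps notification schema:

```python
# Sketch: turn an incident notification payload into a generic ticket record.
# Field names in the payload are assumptions; map them to the actual
# notification schema of your DQOps version.
import json


def format_ticket(payload: dict) -> dict:
    """Map an incident notification to a ticket for a ticketing platform."""
    return {
        "title": f"[DQ] Incident {payload['incident_id']} on {payload['connection']}",
        # Escalate fatal incidents; everything else gets normal priority.
        "priority": "high" if payload.get("highest_severity") == "fatal" else "normal",
        "body": json.dumps(payload, indent=2),
    }


# Example usage with a fabricated notification:
notification = {
    "incident_id": "a1b2c3",
    "connection": "sales_dwh",
    "highest_severity": "error",
}
print(format_ticket(notification)["title"])
```

In practice this function would sit behind the HTTP endpoint that DQOps is configured to call, and the returned record would be posted to the ticketing platform's API.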
---
What if your table contains aggregated data that was received from different suppliers, departments, vendors, or teams?
Data quality issues are detected, but who provided you with the corrupted data?
DQOps answers this question by running data quality checks with data grouping, supporting a hierarchy of up to 9 levels.
[:octicons-arrow-right-24: Use GROUP BY to measure data quality for different data streams](dqo-concepts/measuring-data-quality-with-data-grouping.md)
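For illustration, a data grouping can be declared on the table so every supplier gets its own set of data quality results. The property names below follow the DQOps data grouping layout but are a sketch to verify against your version's schema, and `supplier_id` is a hypothetical column:

```yaml
# Illustrative sketch: group check results by the column identifying the supplier.
spec:
  default_grouping_name: by_supplier
  groupings:
    by_supplier:
      level_1:
        source: column_value
        column: supplier_id   # each distinct supplier_id becomes a separate data stream
```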
---
A dashboard is showing the wrong numbers. The business sponsor asks you to monitor
it every day to detect when it shows the wrong numbers again.
You can turn the SQL query from the dashboard into a templated data quality check that DQOps shows in the user interface.
[:octicons-arrow-right-24: How to define a custom data quality check](working-with-dqo/creating-custom-data-quality-checks.md)
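To sketch what such a templated check can look like, a custom sensor is a Jinja2 SQL template that returns a single `actual_value` measure. The table-rendering macro and the `WHERE` filter below are illustrative assumptions to adapt to your DQOps version and dashboard query:

```sql
{# Illustrative sketch of a custom sensor template; the macro name and
   the filter condition are assumptions to verify against your version. #}
SELECT
    COUNT(*) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
WHERE analyzed_table.order_total < 0
```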
DQOps creators have written an eBook ["A step-by-step guide to improve data quality"](https://dqops.com/best-practices-for-effective-data-quality-improvement/)
that describes their experience in data cleansing and data quality monitoring using DQOps.
The eBook describes a complete data quality improvement process that allows you to reach a ~100% data quality KPI score within 6-12 months.
[Download the eBook](https://dqops.com/best-practices-for-effective-data-quality-improvement/) to learn the process of managing
an iterative data quality project that leads to fixing all data quality issues.