* enable streaming
* scaffolding for simpleExpr validation
* completed refactor -- tests outstanding
* refactor and enablement complete
* updated readme
* added implicit boolean
* added filter for summary report
* Update ValidatorTestSuite (#19)
* Update Validator tests with API changes.
* Add tests for implicit and explicit expression rules.
* imported outstanding spark sql functions
* Add test suite for Rules class.
* Add tests for RuleSet class.
* Add test for complex expressions on aggregates.
* Fix isGrouped bug when groupBys array is empty by default or explicitly set.
* Fix overloaded add function that merges 2 RuleSets.
* Add ignoreCase and invertMatch to ValidateStrings and ValidateNumerics rule types.
* Update documentation with latest features in categorical Rules.
Co-authored-by: Daniel Tomes [GeekSheikh] <10840635+geeksheikh@users.noreply.github.com>
* Update sbt (#23)
* simple update to build sbt
* Add scoverage.
Co-authored-by: Will Girten <will.girten@databricks.com>
* removed unused imports
* Accept expanded sequence of Rules to RuleSet Class.
* cleaning up (#30)
* cleaning up
* removed dependencies from assembly
* Fix whitespaces and special characters in Rule Names (#25)
* Parse white spaces and special characters in failure report.
* Update variable name with more meaningful name.
* Add method to remove whitespace and special characters from Rule names.
* Simplify ruleName public accessor.
* Change special character replacement to underscores.
* Update warning messages and assign private ruleName only once.
* Update demo notebook (#33)
* Update demo notebook with examples of latest features added.
* added scala demo example
Co-authored-by: Daniel Tomes [GeekSheikh] <10840635+geeksheikh@users.noreply.github.com>
* implemented new inclusive boundaries option (#32)
* implemented new inclusive boundaries option
* enhanced logic for upper and lower inclusivity
* readme updated
* Update validation logic for Bounds class. Add test case for inclusive boundary rules. (#35)
Co-authored-by: Will Girten <47335283+goodwillpunning@users.noreply.github.com>
Co-authored-by: Will Girten <will.girten@databricks.com>
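Two of the changes listed above, the inclusive boundaries option and the replacement of whitespace and special characters in Rule names with underscores, can be illustrated with a small sketch in plain Scala. This is not the library's implementation: `Bounds` here is a hypothetical stand-in, and the field names `lowerInclusive` and `upperInclusive`, their defaults, and the helper `sanitizeRuleName` are assumptions made for illustration only.

```scala
object RuleSketch {
  // Hypothetical stand-in for a bounds type; the field names and the
  // exclusive-by-default behaviour are assumptions, not the library's API.
  final case class Bounds(
      lower: Double = Double.NegativeInfinity,
      upper: Double = Double.PositiveInfinity,
      lowerInclusive: Boolean = false,
      upperInclusive: Boolean = false
  ) {
    // True when `value` lies inside the range, honouring each end's inclusivity flag.
    def contains(value: Double): Boolean = {
      val aboveLower = if (lowerInclusive) value >= lower else value > lower
      val belowUpper = if (upperInclusive) value <= upper else value < upper
      aboveLower && belowUpper
    }
  }

  // Hypothetical helper: trim the name, then replace every character that is
  // not an ASCII letter, digit, or underscore with an underscore.
  def sanitizeRuleName(name: String): String =
    name.trim.replaceAll("[^A-Za-z0-9_]", "_")
}
```

With these assumptions, `RuleSketch.Bounds(0.0, 10.0).contains(10.0)` is false, while setting `upperInclusive = true` makes the same check true. Keeping inclusivity as two independent flags lets a rule express half-open ranges such as "at least 0 but below 10".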
-// The validate method will return the rules report dataframe which breaks down which rules passed and which
-// rules failed and how/why. The second return value returns a boolean to determine whether or not all tests passed
-// val (rulesReport, passed) = RuleSet(df, Array("store_id"))
-val (rulesReport, passed) = RuleSet(df)
-  .add(specializedRules)
-  .add(minMaxPriceRules)
-  .add(catNumerics)
-  .add(catStrings)
-  .validate(2)
+// COMMAND ----------
+
+display(df)
+
+// COMMAND ----------
+
+// MAGIC %md
+// MAGIC # Rule Types
+// MAGIC There are several Rule types available:
+// MAGIC
+// MAGIC 1. Categorical (numerical and string) - used to validate if row values fall in a pre-defined list of values, e.g. lookups
+// MAGIC 2. Boundaries - used to validate if row values fall within a range of numerical values
+// MAGIC 3. Expressions - used to validate if row values pass expressed conditions. These can be simple expressions like a Boolean column `col('valid')`, or complex, like `col('a') - col('b') > 0.0`
+
+// COMMAND ----------
+
+// MAGIC %md
+// MAGIC ### Example 1: Writing your first Rule
+// MAGIC Let's look at a very simple example...
+
+// COMMAND ----------
+
+// First, begin by defining your RuleSet by passing in your input DataFrame
+val myRuleSet = RuleSet(df)
+
+// Next, define a Rule that validates that the `store_id` values fall within a list of pre-defined Store Ids
+// MAGIC Case-sensitivity is enabled by default. However, an optional `ignoreCase` parameter can be used to turn case sensitivity on or off when matching against a list of String values
+
+// COMMAND ----------
+
+// Numerical categorical rules. Create a list of values to be validated against.
+val catNumerics = Array(
+// Only allow store_ids in my validStoreIDs lookup
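The `ignoreCase` behaviour described in the notebook text, together with the `invertMatch` option named in the commit message, can be sketched in plain Scala without Spark. The helper `passes` below is hypothetical and is not the library's API. It assumes that `invertMatch` turns the allowed list into a deny list, which the source does not spell out.

```scala
object CategoricalSketch {
  // Hypothetical helper, not the library's API.
  // ignoreCase = false keeps matching case-sensitive (the documented default).
  // invertMatch = true is ASSUMED to mean "pass only when the value is NOT in the list".
  def passes(
      value: String,
      allowed: Seq[String],
      ignoreCase: Boolean = false,
      invertMatch: Boolean = false
  ): Boolean = {
    val found =
      if (ignoreCase) allowed.exists(_.equalsIgnoreCase(value))
      else allowed.contains(value)
    // Flip the outcome when the match is inverted.
    found != invertMatch
  }
}
```

Under these assumptions, `CategoricalSketch.passes("north", Seq("North", "South"))` fails because matching is case-sensitive by default, and the same call with `ignoreCase = true` passes.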