---
title: "Design"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{design}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
## Principles
These are the guiding principles for this package:
1. Functionality is as agnostic to data format as possible (e.g. can be
used with SQL or Arrow connections, in a data.table format, or as a
data.frame).
2. Functions have consistent inputs and outputs (e.g. the types of
inputs and outputs stay the same, regardless of specific conditions).
3. Functions have predictable outputs based on inputs (e.g. if an input
is a data frame, the output is a data frame).
4. Functions have consistent naming based on their action.
5. Functions have limited additional arguments.
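
As a minimal sketch of principle 3, the base R code below uses a hypothetical helper called `keep_columns()` (an illustration only, not a function from this package). It assumes an in-memory data frame and shows one way to guarantee that a data frame input always gives a data frame output, even when a single column is selected.

```r
# Hypothetical helper for illustration only; not part of the package.
keep_columns <- function(data, columns) {
  # Fail early if the inputs are not of the expected types.
  stopifnot(is.data.frame(data), is.character(columns))
  # `drop = FALSE` stops base R from simplifying a single column
  # to a vector, so a data frame in always gives a data frame out.
  data[, columns, drop = FALSE]
}

example_data <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
one_column <- keep_columns(example_data, "id")
is.data.frame(one_column)
#> [1] TRUE
```

Without `drop = FALSE`, selecting one column from a base data frame would return a plain vector, which is the kind of input-dependent output that principle 3 aims to avoid.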
## Use cases
We make these assumptions about how this package will be used, based on
our experiences and expectations for use cases:
- Used entirely on the servers of Statistics Denmark (DST) or the
Danish Health Data Authority (SDS), since that is where their data are
kept.
- Used by researchers within or affiliated with Danish research
institutions.
- Used specifically within a Danish register-based context.
Below is a set of "narratives" or "personas" with associated needs that
this package aims to fulfil:
- "As a researcher, ..."
    - "... I want to determine which registers and variables to
      request from DST and SDS, so that I am certain I will be able to
      classify diabetes status of individuals in the registers."
    - "... I want to easily and simply create a dataset that contains
      data on diabetes status in my population, so that I can begin
      conducting my research that involves persons with diabetes
      without having to tinker with coding the correct algorithm to
      classify them."
    - "... I want to be informed early and in a clear way whether my
      data fits with the required data type and values, so that I can
      fix and correct these issues without having to do extensive
      debugging of the code and/or data."
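
The last need above, being informed early and clearly about data problems, could be met by a check along these lines. This is a sketch only: `check_column_type()` and the example column names are hypothetical and not taken from the package, and it assumes the data are an in-memory data frame.

```r
# Hypothetical check for illustration only; not part of the package.
check_column_type <- function(data, column, expected_class) {
  if (!column %in% names(data)) {
    stop("Column `", column, "` is missing from the data.", call. = FALSE)
  }
  if (!inherits(data[[column]], expected_class)) {
    stop(
      "Column `", column, "` must be of class ", expected_class,
      ", not ", class(data[[column]])[1], ".",
      call. = FALSE
    )
  }
  # Return the data invisibly so the check can be used in a pipeline.
  invisible(data)
}

example_data <- data.frame(
  id = c("a", "b"),
  date = c("2020-01-01", "2020-02-01"),
  stringsAsFactors = FALSE
)
# The `date` column is a character vector, so the check fails
# with a message that names the column and the problem.
result <- tryCatch(
  check_column_type(example_data, "date", "Date"),
  error = function(e) conditionMessage(e)
)
result
#> [1] "Column `date` must be of class Date, not character."
```

Raising the error before any classification runs means the researcher sees which column to fix, instead of a failure deep inside later processing.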
## Core functionality
This is the list of functionality we aim to have in the osdc package:
1. Classify individuals' type 1 and type 2 diabetes status and create a
data frame with that information.
2. Provide helper functions to check and process individual registers
for the variables required to enter into the classifier.
3. Provide a list of required variables and registers in order to
calculate diabetes status.
4. Provide validation helper functions to check that variables match
what the algorithm expects.
5. Provide a common and easily accessible standard for determining
diabetes status within the context of research using Danish
registers.
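
To make items 3 and 4 more concrete, the sketch below stores the required variables as a small data frame and reports which ones are missing from a given register. The register names, variable names, and the `missing_variables()` helper are all placeholders invented for illustration; they are not the actual registers or functions used by the package.

```r
# Hypothetical register and variable names, for illustration only.
required_variables <- data.frame(
  register = c("register_a", "register_a", "register_b"),
  variable = c("person_id", "event_date", "person_id"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: which required variables are missing from the data?
missing_variables <- function(data, register_name,
                              required = required_variables) {
  # Look up the variables this register needs, then compare them
  # against the column names that are actually present.
  needed <- required$variable[required$register == register_name]
  setdiff(needed, names(data))
}

example_register <- data.frame(person_id = c("001", "002"))
missing_variables(example_register, "register_a")
#> [1] "event_date"
```

Keeping the requirements in one data frame would let the same table serve both as documentation for researchers requesting data and as the input to the validation helpers.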