From d7ba755e7fef54f8a4f2e0a2535651b4ddb6c6b7 Mon Sep 17 00:00:00 2001
From: David Gasquez
Date: Tue, 22 Oct 2024 11:25:51 +0200
Subject: [PATCH] :art:

---
 Data/Analytics Engineering.md | 1 +
 Incentives.md                 | 6 ++++++
 Open Data.md                  | 2 +-
 Organizations.md              | 3 +++
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Data/Analytics Engineering.md b/Data/Analytics Engineering.md
index 2b5dc2c..d80c71a 100644
--- a/Data/Analytics Engineering.md
+++ b/Data/Analytics Engineering.md
@@ -59,6 +59,7 @@
 - [Tuba](https://github.com/tuva-health/tuva)
 - [Department of Education for New South Wales](https://github.com/wisemuffin/nsw-doe-data-stack-in-a-box)
 - [OP Analytitcs](https://github.com/ethereum-optimism/op-analytics)
+- [Transfermarkt Datasets](https://github.com/dcaribou/transfermarkt-datasets)
 
 #### Dagster Resources
 
diff --git a/Incentives.md b/Incentives.md
index 6645c16..aaccaa8 100644
--- a/Incentives.md
+++ b/Incentives.md
@@ -13,6 +13,12 @@ Behavior is hard to fix. When people say they've learned their lesson they under
 - Two types of incentives:
     - Intrinsic incentives are internal—created by self-interest or desire.
     - Extrinsic incentives are external—created by outside factors (reward, punishment).
+- Designing incentives is hard. It is easy to design a bad incentive, which is worse than no incentive. It is especially hard to design a good incentive that relies on money. Money removes intrinsic incentives and attracts the wrong kind of behavior.
+- [Four components of effectively designed incentives are](https://dhruvmethi.substack.com/p/bureaucracy):
+    - Clear problem statement.
+    - Clear target metric to improve.
+    - Intentional system design.
+    - Commitment to study the metric.
 
 ## Incentive Framework
 
diff --git a/Open Data.md b/Open Data.md
index cfc1ee4..f2bf874 100644
--- a/Open Data.md
+++ b/Open Data.md
@@ -62,7 +62,7 @@ We could have a better data ecosystem if we **collaborate on open standards**! S
 - Portals are a GitHub repository with scripts to collect data from various sources, clean it, and join it, and publish useful datasets and artifacts for that community. Ideally, they are also simple to get started with and expose the best practices in data engineering for curating and transforming data.
 - Provide a declarative way of defining the datasets schema and other meta-properties like _relations_ or _tests/checks_.
 - Support for integrating non-dataset files. A dataset could be linked to code, visualizations, pipelines, models, reports, ...
-- **Reproducible and Verifiable**. People should be able to trust the final datasets without having to recompute everything from scratch. In "reality", events are immutable, data should be too. [Make datasets the center of the tooling](https://dagster.io/blog/software-defined-assets).
+- **[Reproducible](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285) and Verifiable**. People should be able to trust the final datasets without having to recompute everything from scratch. In "reality", events are immutable, data should be too. [Make datasets the center of the tooling](https://dagster.io/blog/software-defined-assets).
     - With immutability and content addressing, you can move backwards in time and run transformations or queries on how the dataset was at a certain point in time.
     - [Datasets are books, not houses](https://medium.com/qri-io/datasets-are-books-not-houses-760bd4736229)!
 - **Permissionless**. Anyone should be able to add/update/fix datasets or their metadata. GitHub style collaboration, curation, and composability. On data.
diff --git a/Organizations.md b/Organizations.md
index 65b2c3e..104daf1 100644
--- a/Organizations.md
+++ b/Organizations.md
@@ -69,6 +69,9 @@
 - [Big organizations develop strategic inefficiency to carry on doing what they're doing](https://youtu.be/v1eWIshUzr8?t=1147).
 - You can't sell something like versioning or encryption as the key feature, you need to put out a comparable product and have these features be the thing that tips the scale in your favour.
 - Your business may be thought of as a process. It has inputs, and it has outputs. One type of knowledge is understanding which inputs affect which outputs.
+- Bureaucracy develops through deliberate decisions that seemed logical at the time, not just as a natural consequence of growth. Two main causes of bureaucracy are:
+    - Poorly designed incentive systems that punish mistakes but don't reward success.
+    - Loose coupling between [[systems]]/[[processes]] and desired outcomes.
 
 ## Resources