🎨
davidgasquez committed Oct 22, 2024
1 parent 01e7cd2 commit d7ba755
Showing 4 changed files with 11 additions and 1 deletion.
1 change: 1 addition & 0 deletions Data/Analytics Engineering.md
@@ -59,6 +59,7 @@
- [Tuva](https://github.com/tuva-health/tuva)
- [Department of Education for New South Wales](https://github.com/wisemuffin/nsw-doe-data-stack-in-a-box)
- [OP Analytics](https://github.com/ethereum-optimism/op-analytics)
- [Transfermarkt Datasets](https://github.com/dcaribou/transfermarkt-datasets)

#### Dagster Resources

6 changes: 6 additions & 0 deletions Incentives.md
@@ -13,6 +13,12 @@ Behavior is hard to fix. When people say they've learned their lesson they under
- Two types of incentives:
    - Intrinsic incentives are internal: created by self-interest or desire.
    - Extrinsic incentives are external: created by outside factors (reward, punishment).
- Designing incentives is hard. It is easy to design a bad incentive, which is worse than no incentive. It is especially hard to design a good incentive that relies on money. Money removes intrinsic incentives and attracts the wrong kind of behavior.
- [Four components of effectively designed incentives are](https://dhruvmethi.substack.com/p/bureaucracy):
    - Clear problem statement.
    - Clear target metric to improve.
    - Intentional system design.
    - Commitment to study the metric.

## Incentive Framework

2 changes: 1 addition & 1 deletion Open Data.md
@@ -62,7 +62,7 @@ We could have a better data ecosystem if we **collaborate on open standards**! S
- Portals are a GitHub repository with scripts to collect data from various sources, clean it, and join it, and publish useful datasets and artifacts for that community. Ideally, they are also simple to get started with and expose the best practices in data engineering for curating and transforming data.
- Provide a declarative way of defining the dataset's schema and other meta-properties like _relations_ or _tests/checks_.
- Support for integrating non-dataset files. A dataset could be linked to code, visualizations, pipelines, models, reports, ...
- **Reproducible and Verifiable**. People should be able to trust the final datasets without having to recompute everything from scratch. In "reality", events are immutable, data should be too. [Make datasets the center of the tooling](https://dagster.io/blog/software-defined-assets).
- **[Reproducible](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285) and Verifiable**. People should be able to trust the final datasets without having to recompute everything from scratch. In "reality", events are immutable, data should be too. [Make datasets the center of the tooling](https://dagster.io/blog/software-defined-assets).
- With immutability and content addressing, you can move backwards in time and run transformations or queries on how the dataset was at a certain point in time.
- [Datasets are books, not houses](https://medium.com/qri-io/datasets-are-books-not-houses-760bd4736229)!
- **Permissionless**. Anyone should be able to add/update/fix datasets or their metadata. GitHub style collaboration, curation, and composability. On data.
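
The immutability and content-addressing point above can be illustrated with a minimal sketch. It is only an assumption-laden toy, not how any particular tool implements it: `SnapshotStore`, `commit`, `read`, and `at` are hypothetical names, the store lives in memory, and datasets are assumed to be small JSON-serializable lists of rows.

```python
import hashlib
import json


class SnapshotStore:
    """Toy content-addressed store (hypothetical): each dataset version is keyed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}    # digest -> serialized bytes, never overwritten
        self._history = []  # digests in commit order, oldest first

    def commit(self, rows):
        # Canonical serialization so identical data always gets the same address.
        data = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._history.append(digest)
        return digest

    def read(self, digest):
        data = self._blobs[digest]
        # Verifiable: anyone can recompute the hash and compare it to the address.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match its address")
        return json.loads(data)

    def at(self, index):
        # Move backwards in time by position in the commit history.
        return self.read(self._history[index])


store = SnapshotStore()
v1 = store.commit([{"player": "A", "goals": 1}])
v2 = store.commit([{"player": "A", "goals": 2}])

# Run a query against the dataset as it was at the first commit.
goals_then = sum(row["goals"] for row in store.read(v1))
```

Because the address is derived from the content, a reader who holds only the digest can check that the data was not altered, without recomputing the pipeline that produced it.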
3 changes: 3 additions & 0 deletions Organizations.md
@@ -69,6 +69,9 @@
- [Big organizations develop strategic inefficiency to carry on doing what they're doing](https://youtu.be/v1eWIshUzr8?t=1147).
- You can't sell something like versioning or encryption as the key feature, you need to put out a comparable product and have these features be the thing that tips the scale in your favour.
- Your business may be thought of as a process. It has inputs, and it has outputs. One type of knowledge is understanding which inputs affect which outputs.
- Bureaucracy develops through deliberate decisions that seemed logical at the time, not just as a natural consequence of growth. Two main causes of bureaucracy are:
    - Poorly designed incentive systems that punish mistakes but don't reward success.
    - Loose coupling between [[systems]]/[[processes]] and desired outcomes.

## Resources
