Commit b52cca5: 🎨
davidgasquez committed Aug 18, 2023
1 parent f28a171 commit b52cca5
Showing 9 changed files with 42 additions and 42 deletions.
12 changes: 6 additions & 6 deletions Data/Dashboards.md
- Purpose and explanation of the data being shown.
- Caveats and assumptions.
- Extra Context:
  - Why this dashboard exists.
  - Who it's for.
  - When it was built, and if and when it's set to expire.
  - What features it's tracking via links to team repositories, project briefs, screenshots, or video walk-throughs.
- Take-aways.
- Metadata (owner, related OKRs, TTL, …).
- Make them so it's easy to go one layer down (X went down in Y location, or for Z new users, etc.).
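The self-describing-dashboard checklist above can be captured as a small metadata block kept next to the dashboard definition, so the purpose, audience, and expiry travel with the dashboard itself. A minimal sketch; every field value here is hypothetical:

```python
# Hypothetical metadata block kept alongside the dashboard definition, so the
# "why / who / when / expiry" context travels with the dashboard itself.
dashboard_metadata = {
    "title": "Activation Funnel",
    "owner": "data-team",
    "purpose": "Track signup-to-activation conversion for the growth OKR",
    "audience": "Growth PMs",
    "built_on": "2023-08-18",
    "ttl_days": 90,  # explicit expiry: disposable dashboards get a TTL
    "caveats": ["Excludes users created before the tracking migration"],
    "links": {"project_brief": "https://example.com/brief"},
}
```

Keeping this next to the dashboard (rather than in a separate wiki) makes it much harder for the context to drift out of date.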
The value is that now discussions are happening about the data.
- [They can serve endless needs, but in doing so, rarely do they serve _particular_ needs perfectly](https://win.hyperquery.ai/p/analysis-or-dashboard).
- Dashboards shouldn't be single-use
- Ask this:
  - Can this new dashboard request be added to an existing one?
  - What are you going to do differently by looking at the dashboard? Focus on that [[Metrics|metric]] and add it to the main dashboard.
- Beware of death by 1,000 filters: after a dashboard has gone live, you'll be flooded with requests for new views, filters, fields, pages, everything ([can you just ...](https://richardswinbank.net/blog/can_you_just)).
- Dashboards are decision-making infrastructure, and infrastructure needs to be maintained. Be explicit about which Dashboards are disposable and add a TTL to them.
- The numbers and charts on a dashboard very rarely have any direct personal meaning to the people using it. There's tons of other work to do, and unless that dashboard is directly tied to your performance or compensation, there are probably more important things to look at. People are more likely to check stock prices when they actually own (and thus benefit from) the stock.
14 changes: 7 additions & 7 deletions Data/Data Culture.md
- Data is fundamentally a collaborative design process rather than a tool, an analysis, or even a product. [Data works best when the entire feedback loop from ideation to production is an iterative process](https://pedram.substack.com/p/data-can-learn-from-design).
- [To get buy in, explain how the business could benefit from better data](https://youtu.be/Mlz1VwxZuDs) (e.g: more and better insights). Start small and show value.
- Run *[Purpose Meetings](https://www.avo.app/blog/tracking-the-right-product-metrics)* or [Business Metrics Review](https://youtu.be/nlMn572Dabc).
  - Purpose Meetings are 30-minute meetings in which stakeholders, engineers, and data folks align on the goal of a release and on the best way to evaluate its impact and understand its success. Align on the goal, commit to metrics, and design the data.
  - A Business Metrics Review is a 30-to-60-minute meeting to chat about and explore key metrics and to teach how to think with data.
- Value of clear goals and expectations. Validate what you think your job is with your manager and stakeholders, repeatedly.
- [While the output of your team is what you want to maximize, you'll need some indicators that will help guide you day-to-day](https://data-columns.hightouch.io/your-first-60-days-as-a-first-data-hire-weeks-3-4/). Decide what's important to you (test coverage, documentation missing, queries run, models created, ...), and generate some internal reports for yourself.
- [Data teams should be a part of the business conversations from the beginning](https://cultivating-algos.stitchfix.com/). Get the data team involved early, have open discussions with them about the existing work, and how to prioritize new work against the existing backlog. Don’t accept new work without addressing the existing bottlenecks, and don’t accept new work without requirements. **Organizational [[politics]] matter way more than any data methods or technical knowledge**.
- The modern data team needs to have *real organizational power*—it needs to be able to say "no" and mean it. If your data team does not truly have the power to say no to stakeholders, it will get sent on all kinds of wild goose chases, be unproductive, experience employee churn, etc.
- Data should report to the CEO. Ideally at least with some weekly metrics split into (a) notable trends, (b) watching close, and (c) business as usual.
- If data is the most precious asset in a company, does it make sense to have only one team responsible for it?
- [People talk about data as the new oil, but for most companies it's a lot closer to uranium](https://news.ycombinator.com/item?id=27781286). It's hard to find people who can handle or process it correctly, there are nontrivial security liabilities if PII is involved, it's expensive to store, and the return on effort is generally underwhelming relative to the anticipated utility.
- [The pain in data teams comes from needing to influence PMs/peers while having little control over them. Data teams need to become really great internal marketers/persuaders](https://anchor.fm/census/episodes/The-evolution-of-the-data-industry--data-jobs-w-Avo-CEO-and-Co-founder-Stefania-Olafsdottir-e16hu1l). That said, it shouldn't be the data team's job to convince the organization to be data driven. That's not an effective way of spending resources.
- People problems are orders of magnitude more difficult to solve than data problems.
- **Integrate data where the decision is made**. E.g: Google showing restaurant scores when you're looking for somewhere to have dinner.
- Do weekly recaps in Slack to highlight key items, company-wide progress toward north stars, improvements in certain areas, and new customer highlights. All positive and fun stuff.
- How can we measure the data team impact?
- Making a [[Writing a Roadmap|roadmap]] can help you tell whether you are hitting milestone deadlines or letting them slip.
  - Embedded data team members need to help other teams build their roadmaps too.
- Also, having a changelog ([do releases!](https://betterprogramming.pub/great-data-platforms-use-conventional-commits-51fc22a7417c)) will help show the team impact on the data product across time.
- [Push for a *centralization of the reporting structure*, but keeping the *work management decentralized*](https://erikbern.com/2021/07/07/the-data-team-a-short-story.html).
- Unify resources (datasets, entities, definitions, metrics). Have one source of truth for each one and make that clear to everyone. That source of truth needs heavy curation. Poor curation leads to confusion, distrust, and lots of wasted effort.
- [Data ownership is a hard problem](https://www.linkedin.com/posts/chad-sanderson_heres-why-data-ownership-is-an-incredibly-activity-6904107936533114880-gw8n/). Data is fundamentally generated by services (or front-end instrumentation) which is managed by engineers. CDC and other pipelines are built by data engineers. The delineation of ownership responsibilities is very rarely established, with each group wanting to push 'ownership' onto someone else so they can do the jobs they were hired for.
- [Becoming a data-driven organization is a journey, which unfolds over time and requires critical thinking, human judgement, and experimentation](https://hbr.org/2022/02/why-becoming-a-data-driven-organization-is-so-hard). Fail fast, learn faster.
- [Path to create a data-driven organization](https://twitter.com/_abhisivasailam/status/1520274838450888704):
  - 1. Get a well-placed leader with influence to message, model, and demand data-driven execution.
  - 2. Hire/fire based on data aptitude and usage.
  - 3. Create mechanisms that force analytical conversations. Sometimes there is no way around spending an afternoon breaking down metrics by different segments until you find The Thing.
- [Start small. Don't try to wrangle data for the entire company until you have the tools and process down for one team](https://data-columns.hightouch.io/your-first-60-days-as-a-first-data-hire-weeks-3-4/).
- The difficulty of working with data scales exponentially with size.
- [Rule of thumb: your first customer as a data person should be growth](https://twitter.com/josh_wills/status/1577699871335010304).
4 changes: 2 additions & 2 deletions Data/Data Engineering.md
- Decouple producers and consumers by adding a layer in between. That can be something as simple as a text file or as complex as a [[Databases|database]].
- **Schema changes**. Most of the time you won't be there at the exact time of the change, so aim to save everything.
- Ideally, the schema will evolve in a backward compatible way:
  - Data types don't change in the same column.
  - Columns are either deleted or added, but never renamed.
- Create a few extra columns like `processed_at` or `schema_version`.
- Generate stats to provide the operator with feedback.
- Data coming from pipelines should be easily reproducible. If you re-run a process, it should always produce the same result. This can be achieved by enforcing the [Functional Data Engineering Paradigm](https://medium.com/@maximebeauchemin/functional-data-engineering-a-modern-paradigm-for-batch-data-processing-2327ec32c42a).
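The schema-evolution guidelines above can be sketched as a small ingestion step. A minimal sketch, assuming a row-per-record JSON pipeline; the field names and example payload are hypothetical:

```python
import json
from datetime import datetime, timezone

SCHEMA_VERSION = 2  # bump whenever producers change the payload shape

def process_record(raw: str) -> dict:
    """Parse a raw payload, keeping every field and tagging it with metadata."""
    record = json.loads(raw)
    # Save everything we received, even unknown fields: we won't be there
    # at the exact moment the upstream schema changes.
    record["processed_at"] = datetime.now(timezone.utc).isoformat()
    record["schema_version"] = SCHEMA_VERSION
    return record

# An upstream producer added a "plan" field; nothing breaks, and
# schema_version lets downstream models branch on payload shape.
row = process_record('{"user_id": 42, "event": "signup", "plan": "pro"}')
```

Because every record carries `processed_at` and `schema_version`, backfills and re-runs stay explainable even after the upstream schema drifts.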
22 changes: 11 additions & 11 deletions Data/Reverse ETL.md
## Why?

- It provides a source of truth for all the tools: **the data warehouse**.
  - Each tool can use and share the same definitions, events, and properties.
  - Tracking is less dependent on business rules.
  - Centralized tests can be added to validate assumptions.
  - It removes some tools' limitations (e.g. Customer.io's way of doing segmentation, Pendo's limitation on event cohorts).
  - SQL queries will return the same numbers as other BI tools like Mixpanel.
- You get to use all the data you have, improving your [[Data Culture]].
  - You can use the real source of truth for all the events and not rely on tracking only.
  - You can join sources like ChartMogul, Customer.io, etc.
  - You can create more interesting events by enriching the events and user profiles with extra properties/traits (e.g. Trial Started with a conversion probability attached). This makes product analytics much more powerful.
- It is much easier to reuse the data available in the warehouse than it is to import that data into any new tool you use in the future.
  - You can be much more flexible with the tools you want to use because the data is shared and owned by you.
  - You avoid being locked into BI tools like Mixpanel since the logic will be stored in your warehouse.
- As with any new tool, it gives more flexibility and power.
- The current state is the starting point! Start using it to fix some issues or add some interesting profile properties.
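The pattern above boils down to reading modeled data from the warehouse and upserting it into a downstream tool. A minimal sketch using an in-memory SQLite database as a stand-in warehouse; the table name, columns, and `sync_to_tool` sink are all hypothetical:

```python
import sqlite3

# Stand-in for the warehouse: one modeled table that already joins product
# events with billing data and attaches a conversion-probability trait.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE enriched_users "
    "(user_id INTEGER, email TEXT, conversion_probability REAL)"
)
warehouse.execute("INSERT INTO enriched_users VALUES (1, 'a@example.com', 0.72)")

def sync_to_tool(rows):
    """Hypothetical sink: in practice this would upsert profiles through the
    destination tool's API (Customer.io, Mixpanel, ...), keyed on user_id so
    that re-running the sync is idempotent."""
    return {user_id: {"email": email, "conversion_probability": p}
            for user_id, email, p in rows}

profiles = sync_to_tool(warehouse.execute("SELECT * FROM enriched_users"))
```

Keying the upsert on a stable identifier is what makes the sync safe to re-run: the warehouse stays the source of truth and the tool just mirrors it.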
18 changes: 9 additions & 9 deletions Data/Sharing Data Insights.md
# Sharing Data Insights

[Sharing your data insights across your organization facilitates collaboration and mutual learning – increasing data literacy across the company](https://locallyoptimistic.com/post/share-your-data-insights-to-engage-your-colleagues/). It also helps remind folks that members of the data team can be strategic partners, creating opportunities for proactive brainstorming that can drive innovation.

The aim is to answer the following questions each time:

1. **What am I looking at?** A **short-but-informative title** can tell people immediately what data is the focus of the insight.
2. **What should I learn from this?** Or: why should I care? Include the **most useful information** and/or a **clear takeaway**. For folks who only have a few seconds to scan the message, it should be **easy to spot** the **most valuable** bit of the insight, the **reason** this exploration was considered worth sharing.
3. **What caught my eye?** Share a chart or a related resource!
4. **What if I want to know more?** A **link to additional information** can be valuable for people who have time for more than a quick scan and want to understand how you developed the insight, or do some of their own related exploration.
5. **What if I have a question?** Explicitly **inviting questions** and responses is crucial. It's the best part of sharing an insight! This is where you get to learn about things your colleagues know that you don't, or what they're curious about that has not yet risen to the level of becoming a data request.
6. **What if posting this prompts a whole bunch of follow-up questions, or exposes incorrect assumptions?** If you have hit on something that's interesting to a lot of people, there will likely be questions that spin off, new ways to slice the data you're looking at, or assumptions you have made that need to be corrected.

## Slack Template

```md
![Chart]()
```
2 changes: 1 addition & 1 deletion Identity.md

- [Maintain a very small identity](http://www.paulgraham.com/identity.html). The act of labeling yourself is the act of restricting yourself to what you think fits that label. Don't have opinions on everything. Avoid forming an opinion at all about things that are not evident. Do not affiliate your identity with anything extrinsic - such as a religion, political party, country, company, profession, [[Programming]] language, social class, etc.
- Identity can be helpful in some cases. When we identify as something aligned with our [[Values]] that can self-correct (e.g: rationalism), it encourages us to behave better!
- [Try to affiliate more strongly with the communities whose core beliefs would be less dangerous if they turned out to be wrong](https://economicsdetective.com/2016/10/identity-mind-killer/).
- Identity labels are a way of [[Conceptual Compression]]. They help you infer some things about people that identify as something.
- You're not your opinions. Don't define yourself by what you work on or what you hate. Once a belief becomes part of your identity, any evidence that threatens the belief is a personal attack.
- The only constant in the world is that it changes. Identify as someone that changes their mind when the data changes!
6 changes: 3 additions & 3 deletions Journaling.md
- Review a set of recurrent prompts. Tweak them over time. For example:
- Consistency at your core [[habits]] this week ([[Fitness]], [[Routine]], [[Productivity]], etc.). How can you tweak them to be more consistent or more useful?
- What did you do this week that was a mistake, and how can you avoid repeating it?
- What would you like to accomplish next week?
- Do you need to clarify something?
- Which actions will move you closer to your [[goals]]?