diff --git a/README.md b/README.md
index d34c874e..542235cb 100644
--- a/README.md
+++ b/README.md
@@ -2,19 +2,19 @@
 [![Snyk](https://snyk.io/test/github/18F/analytics-reporter/badge.svg)](https://snyk.io/test/github/18F/analytics-reporter)
 [![Code Climate](https://codeclimate.com/github/18F/analytics-reporter/badges/gpa.svg)](https://codeclimate.com/github/18F/analytics-reporter)
 
-## Analytics Reporter
+# Analytics Reporter
 
 A lightweight system for publishing analytics data from the Digital Analytics Program (DAP) Google Analytics 4 government-wide property.
-For Universal Analytics, uses the [Google Analytics Core Reporting API v3](https://developers.google.com/analytics/devguides/reporting/core/v3/)
-and the [Google Analytics Real Time API v3](https://developers.google.com/analytics/devguides/reporting/realtime/v3/) (will be deprecated on July 1, 2024).
+This project uses the [Google Analytics Data API v1](https://developers.google.com/analytics/devguides/reporting/data/v1/rest) to acquire analytics data and then processes it into a flat data structure.
 
-This is used in combination with [analytics-reporter](https://github.com/18F/analytics-reporter-api) and [analytics.usa.gov](https://github.com/18F/analytics.usa.gov) to power the government analytics website, [analytics.usa.gov](https://analytics.usa.gov).
+The project previously used the [Google Analytics Core Reporting API v3](https://developers.google.com/analytics/devguides/reporting/core/v3/)
+and the [Google Analytics Real Time API v3](https://developers.google.com/analytics/devguides/reporting/realtime/v3/), also known as Universal Analytics, which has slightly different data points. See [Upgrading from Universal Analytics](#upgrading-from-universal-analytics) for more details. The Google Analytics v3 API will be deprecated on July 1, 2024.
 
-Available reports are named and described in [`reports.json`](reports/reports.json). For now, they're hardcoded into the repository.
+This is used in combination with [analytics-reporter-api](https://github.com/18F/analytics-reporter-api) to power the government analytics website, [analytics.usa.gov](https://analytics.usa.gov).
 
-### Installation
+Available reports are named and described in [`reports.json`](reports/reports.json). For now, they're hardcoded into the repository.
 
-### Docker
+## Docker Setup
 
 * To build the docker image on your computer, run:
 
@@ -30,7 +30,7 @@ Then you can create an alias in order to have the analytics command available:
 alias analytics="docker run -t -v ${HOME}:${HOME} -e ANALYTICS_REPORT_EMAIL -e ANALYTICS_REPORT_IDS -e ANALYTICS_KEY analytics-reporter"
 ```
 
-To make this command working as expected you should export the env vars as follows:
+To make this command work as expected you should export the env vars as follows:
 
 ```bash
 export ANALYTICS_REPORT_EMAIL="your-report-email"
@@ -38,7 +38,14 @@ export ANALYTICS_REPORT_IDS="your-report-ids"
 export ANALYTICS_KEY="your-key"
 ```
 
-### NPM
+## Local development setup
+
+### Prerequisites
+
+* NodeJS > v20.x
+* A postgres DB running
+
+### Running the application as a npm package
 
 * To run the utility on your computer, install it through npm:
 
@@ -46,9 +53,15 @@ export ANALYTICS_KEY="your-key"
 npm install -g analytics-reporter
 ```
 
-If you're developing locally inside the repo, `npm install` is sufficient.
+### Running the application locally
+
+#### Install dependencies
+
+```bash
+npm install
+```
 
-### Setup
+## Configuration and Google Analytics Setup
 
 * Enable [Google Analytics API](https://console.cloud.google.com/apis/library/analytics.googleapis.com) for your project in the Google developer dashboard.
 
@@ -64,7 +77,7 @@ If you're developing locally inside the repo, `npm install` is sufficient.
 
 ```bash
 export ANALYTICS_REPORT_EMAIL="YYYYYYY@developer.gserviceaccount.com"
-export ANALYTICS_REPORT_IDS="ga:XXXXXX"
+export ANALYTICS_REPORT_IDS="XXXXXX"
 ```
 
 You may wish to manage these using [`autoenv`](https://github.com/kennethreitz/autoenv). If you do, there is an `example.env` file you can copy to `.env` to get started.
@@ -141,8 +154,7 @@ There are cases where you want to use a custom object storage server compatible
 export AWS_S3_ENDPOINT=http://your-storage-server:port
 ```
 
-
-### Other configuration
+## Other configuration
 
 If you use a **single domain** for all of your analytics data, then your profile is likely set to return relative paths (e.g. `/faq`) and not absolute paths when accessing real-time reports.
 
@@ -163,7 +175,7 @@ This will produce points similar to the following:
 }
 ```
 
-### Use
+## Use
 
 Reports are created and published using the `analytics` command.
 
@@ -178,6 +190,8 @@ A report might look something like this:
 ```javascript
 {
   "name": "devices",
+  "frequency": "daily",
+  "slim": true,
   "query": {
     "dimensions": [
       {
@@ -194,7 +208,7 @@ A report might look something like this:
     ],
     "dateRanges": [
       {
-        "startDate": "90daysAgo",
+        "startDate": "30daysAgo",
         "endDate": "yesterday"
       }
     ],
@@ -205,15 +219,12 @@ A report might look something like this:
         },
         "desc": true
       }
-    ],
-    "samplingLevel": "HIGHER_PRECISION",
-    "limit": "10000",
-    "property": "properties/393249053"
+    ]
   },
   "meta": {
     "name": "Devices",
-    "description": "90 days of desktop/mobile/tablet visits for all sites."
-  },
+    "description": "30 days of desktop/mobile/tablet visits for all sites."
+  }
   "data": [
     {
       "date": "2023-12-25",
@@ -245,7 +256,7 @@ A report might look something like this:
 }
 ```
 
-#### Options
+### Options
 
 * `--output` - Output to a directory.
 
@@ -294,7 +305,7 @@ analytics --frequency=realtime
 analytics --publish --debug
 ```
 
-### Saving data to postgres
+## Saving data to postgres
 
 The analytics reporter can write data is pulls from Google Analytics to a Postgres database.
 The postgres configuration can be set using environment
@@ -313,6 +324,48 @@ server](https://github.com/18f/analytics-reporter-api) that consumes and publish
 
 To write reports to a database, use the `--write-to-database` option when starting the reporter.
 
+## Upgrading from Universal Analytics
+
+### Background
+
+This project previously acquired data from Google Analytics V3, also known as Universal Analytics (UA).
+
+Google is retiring UA and is encouraging users to move to their new version Google Analytics V4 (GA4).
+UA will be deprecated on July 1st 2024.
+
+### Migration details
+
+Some data points have been removed or added by Google as part of the move to GA4.
+
+#### Deprecated fields
+
+- browser_version
+- has_social_referral
+- exits
+- exit_page
+
+#### New fields
+
+##### bounce_rate
+
+The percentage of sessions that were not engaged. GA4 defines engaged as a
+session that lasts longer than 10 seconds or has multiple pageviews.
+
+##### file_name
+
+The page path of a downloaded file.
+
+##### language_code
+
+The ISO639 language setting of the user's device. e.g. 'en-us'
+
+##### session_default_channel_group
+
+An enum which describes the session. Possible values:
+
+'Direct', 'Organic Search', 'Paid Social', 'Organic Social', 'Email',
+'Affiliates', 'Referral', 'Paid Search', 'Video', and 'Display'
+
 ### Deploying to Cloud.gov
 
 The analytics reporter runs on :cloud:.gov. Please refer to the `manifest.yml`
@@ -365,7 +418,7 @@ Compose:
 docker-compose up
 ```
 
-# Running the unit tests
+## Running the unit tests
 
 The unit tests for this repo require a local PostgreSQL database. You can run a
 local DB server or create a docker container using the provided test compose
@@ -388,7 +441,7 @@ Run the tests (pre-test hook runs DB migrations):
 npm test
 ```
 
-### Public domain
+## Public domain
 
 This project is in the worldwide [public domain](LICENSE.md).
 As stated in [CONTRIBUTING](CONTRIBUTING.md):
diff --git a/reports/usa.json b/reports/usa.json
index 6f4a43e4..3b1e60c1 100644
--- a/reports/usa.json
+++ b/reports/usa.json
@@ -1675,6 +1675,45 @@
             "desc": true
           }
         ],
+        "dimensionFilter": {
+          "andGroup": {
+            "expressions": [
+              {
+                "notExpression": {
+                  "filter": {
+                    "fieldName": "unifiedScreenName",
+                    "stringFilter": {
+                      "value": "(other)",
+                      "caseSensitive": false
+                    }
+                  }
+                }
+              },
+              {
+                "notExpression": {
+                  "filter": {
+                    "fieldName": "unifiedScreenName",
+                    "stringFilter": {
+                      "value": "null",
+                      "caseSensitive": false
+                    }
+                  }
+                }
+              },
+              {
+                "notExpression": {
+                  "filter": {
+                    "fieldName": "unifiedScreenName",
+                    "stringFilter": {
+                      "value": "",
+                      "caseSensitive": false
+                    }
+                  }
+                }
+              }
+            ]
+          }
+        },
         "metricFilter": {
           "filter": {
             "fieldName": "activeUsers",
@@ -1694,7 +1733,7 @@
       }
     },
     {
-      "name": "top-domains-30-days",
+      "name": "top-10000-domains-30-days",
       "frequency": "daily",
       "query": {
         "dimensions": [
@@ -1711,21 +1750,6 @@
         "metrics": [
           {
             "name": "screenPageViews"
-          },
-          {
-            "name": "sessions"
-          },
-          {
-            "name": "activeUsers"
-          },
-          {
-            "name": "screenPageViewsPerSession"
-          },
-          {
-            "name": "averageSessionDuration"
-          },
-          {
-            "name": "bounceRate"
           }
         ],
         "orderBys": [
@@ -1756,7 +1780,7 @@
         "limit": "10000"
       },
       "meta": {
-        "name": "Top Domains (30 Days)",
+        "name": "Top 10000 Domains (30 Days)",
         "description": "Last 30 days' domains, measured by page views, for the top 10000 sites. (>10 active users)"
       }
     },