The GA4 Audit Playbook Part 3
GA4 Data Health Review
This is the largest, and arguably the most important, sub-audit area: if there are data problems in your GA4, it becomes challenging to understand your web & app performance. You do not need to go on your website and do any debugging here, as this sub-audit area focuses primarily on the reports/explorations area of GA4. I recommend performing this audit once every 6-12 months to ensure updates to your websites & apps and marketing channels have not impacted data health in your GA4. There are 6 areas: Data Anomalies, Campaign Tracking, Pages, Events, Custom Definitions and Ecommerce.
Time-series Anomaly Detection
This focuses on reviewing core metrics and events and checking whether any time-series anomalies are occurring. Review the last year’s worth of data (to account for seasonality as much as possible) and see whether there were any strange jumps or drops in these metrics in the last 28 days. Be careful of GA4’s 48-hour data processing delay, however: yesterday’s data might not be ready to review, so you might want to set the end date to two days ago. This is important to review because, if anomalies are detected in these metrics, there could be recent and ongoing tracking, website or app issues that need reviewing. However, it could just be your marketing team not telling you they launched a new campaign and everything is fine! In an ideal world, data anomalies should be checked every day, but if that’s not possible try to review them once a week.
Fortunately, GA4 provides an anomaly detection tool within the exploration reports. If you create a “Free form” report and select the “Line chart” visualization, you can then enable the “Anomaly detection” toggle at the bottom of this report and adjust the settings within this tool as you wish.

We advise checking anomalies for at least the following:
- Sessions, as this can show immediate tracking and website downtime as well as spikes in bot traffic
- Engaged Sessions, as this can show an engagement issue on your website
- Conversion Events, as these are your main KPIs and it is important to check they are healthy and tracking well.
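If you export daily counts for these metrics from GA4, a similar check can be scripted for a daily or weekly review. The sketch below is not GA4’s own anomaly detection model; it is a simple robust z-score over a trailing 28-day window, and the function name, threshold and sample data are all assumptions for illustration.

```python
from statistics import median

def flag_anomalies(daily_values, window=28, threshold=3.5):
    """Return indices of days that deviate strongly from the trailing window.

    Uses a robust z-score built from the median and the median absolute
    deviation (MAD) of the previous `window` days.
    """
    anomalies = []
    for i in range(window, len(daily_values)):
        history = daily_values[i - window:i]
        med = median(history)
        mad = median(abs(v - med) for v in history)
        if mad == 0:
            # Completely flat history: any change at all is worth a look.
            if daily_values[i] != med:
                anomalies.append(i)
            continue
        robust_z = 0.6745 * (daily_values[i] - med) / mad
        if abs(robust_z) > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical daily sessions: a steady weekly pattern, then a sudden drop.
sessions = [1000 + (day % 7) * 10 for day in range(40)]
sessions[35] = 200
print(flag_anomalies(sessions))  # -> [35]
```

Remember to leave the most recent two days out of the input because of the processing delay mentioned above.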
Campaign Tracking Health
Campaign tracking via UTMs is crucial in understanding channel, source, medium and campaign performance within your GA4. Therefore, making sure these UTMs have no gaps and are consistent and healthy is crucial. You can either review this data in reports > acquisition (or, depending on how the property was configured, it may be under “Generate leads”) > traffic acquisition, or build your own custom free-form exploration reports using the “Session channel grouping/source/medium/campaign” dimensions. It is recommended to use session-scoped dimensions instead of “First user” ones, as the session-scoped ones tend to surface more issues in your data.

- Check the percentage of “Unassigned” traffic you have. A large proportion (more than 2%) is not a good sign. There are three things to check within your Unassigned channel:
- How much Unassigned traffic does have UTMs (i.e. not only “(not set)” values). This means the UTMs do not follow GA4’s default channel group logic, and you will need to relabel them so that they fit in the relevant channel.
- How much Unassigned traffic has UTMs but also has a “(not set)” value in either the source or the medium. This is either a mistake in the UTMs being appended to the final URL, which needs to be fixed in the marketing tool that sets them, or a technical issue on the landing page where that UTM parameter is being stripped.
- How much Unassigned traffic does not have UTMs at all (i.e. only “(not set)” values). This is either due to no UTMs being used on certain marketing activity where the browser is not picking up the referral details, or due to the session timeout being set to an inappropriate length, so that sessions run beyond the timeout limit and trigger another session.
- Check the percentage of “Direct” traffic you have. A large proportion (more than 30%) could indicate issues in the attribution journey that are causing sessions & conversions to be attributed to Direct when they shouldn’t be. Use the user explorer report in the explorations section to help see if it is an attribution journey issue.
- Check your referral channel for the following:
- Make sure you have no email referrals under the “Referral” GA4 default channel group. If you do, it indicates you have not put UTM tracking on your email campaigns, which is a must. Search for “mail.” in “Session source” to review if this is occurring.
- Review referrals that have a very large number of sessions and/or very high conversion rates. These either need UTM tracking or need to be added to the referral exclusion list settings which we covered in Part 2 (self-referrals and 3rd party payment processors such as PayPal and Stripe are the most common).
- Check there are no internal site links with UTMs. Internal site links with UTMs break the session and will unfairly attribute conversions and events to that site link, even if users came from a different traffic source beforehand. There is no easy way to filter for this, but review your session sources and mediums to see if there are values that are not related to marketing activity, then review which landing pages these are on and test the links on these pages to see if UTMs get appended to the URL.
- Check the health of your session mediums data, as your mediums list should be the smallest and most concise. Setting the wrong values leads to data points that are difficult to interpret and most probably causes inaccurate channel grouping. We advise no more than 15 unique medium values. Check for the following mediums and group them better:
- UTM mediums greater than 20 characters.
- UTM mediums with less than 1% of traffic.
- Check for uppercase characters in UTMs. Unlike UA, GA4 does not have lowercase data filters. Therefore it is considered best practice to lowercase all your UTMs before launching them. Uppercase values might lead to accidental duplicate rows for the same source, which will make analysis more challenging.
- Check for consistent use of delimiters (“-”, “|” or “_”) in UTMs. Choose a delimiter to deal with spacing in campaign names and stick with it consistently throughout your UTMs. Inconsistent use of delimiters can make it difficult to find your campaigns.
- If Custom Channel Grouping has been configured, check how well this has been configured by reviewing how sources, mediums and campaigns have been falling into each channel group.
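Several of the checks above are simple enough to script against an export of the traffic acquisition report. The following is a minimal sketch, assuming you have already turned that export into plain dictionaries of value to session count; the function names and sample figures are hypothetical, while the thresholds mirror the ones suggested above.

```python
from collections import Counter

def channel_share(channel_sessions, channel):
    """Share of all sessions that fall into one default channel group."""
    total = sum(channel_sessions.values())
    return channel_sessions.get(channel, 0) / total if total else 0.0

def audit_mediums(medium_sessions, max_length=20, min_share=0.01, max_unique=15):
    """medium_sessions: dict mapping a session medium value to its sessions."""
    total = sum(medium_sessions.values())
    findings = {
        "too_many_unique": len(medium_sessions) > max_unique,
        "too_long": sorted(m for m in medium_sessions if len(m) > max_length),
        "low_share": sorted(
            m for m, n in medium_sessions.items() if total and n / total < min_share
        ),
        "has_uppercase": sorted(m for m in medium_sessions if m != m.lower()),
    }
    # Values that differ only by case split one medium across several rows.
    lowered = Counter(m.lower() for m in medium_sessions)
    findings["case_duplicates"] = sorted(m for m, c in lowered.items() if c > 1)
    return findings

def delimiters_used(campaign_names, delimiters="-|_"):
    """Return the set of delimiters that appear across campaign names."""
    return {d for name in campaign_names for d in delimiters if d in name}

# Hypothetical exported data.
channels = {"Organic Search": 600, "Direct": 350, "Unassigned": 50}
print(channel_share(channels, "Unassigned") > 0.02)  # flag if above 2%
print(channel_share(channels, "Direct") > 0.30)      # flag if above 30%

mediums = {"cpc": 5000, "CPC": 40, "email": 3000, "organic": 1900,
           "paid_social_retargeting_campaign": 60}
print(audit_mediums(mediums))
print(delimiters_used(["spring-sale", "summer_sale", "blackfriday"]))
```

More than one delimiter in the last result suggests campaign names are not following a single convention.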
To help future-proof your UTM tracking in general, we advise building a UTM builder in Google Sheets or Excel and asking your marketing teams to use it in their activities, so that the UTM parameters they append to their landing page URLs follow best practice standards.
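The same rules a spreadsheet builder would enforce can be expressed in a few lines of code. This is only a sketch of the idea, assuming a hyphen as the chosen delimiter; the function name and example URL are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def build_utm_url(landing_page, source, medium, campaign, delimiter="-"):
    """Append lowercase, consistently delimited UTM parameters to a URL."""
    def clean(value):
        # Lowercase and replace runs of whitespace with the chosen delimiter.
        return re.sub(r"\s+", delimiter, value.strip().lower())

    parts = urlsplit(landing_page)
    # Keep existing non-UTM parameters, drop any UTMs already on the URL.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    query += [
        ("utm_source", clean(source)),
        ("utm_medium", clean(medium)),
        ("utm_campaign", clean(campaign)),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

print(build_utm_url("https://www.example.com/sale?ref=1",
                    "Newsletter", "Email", "Spring Sale 2024"))
# -> https://www.example.com/sale?ref=1&utm_source=newsletter&utm_medium=email&utm_campaign=spring-sale-2024
```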
Page Data Health
One of GA4’s main reporting capabilities (and one of the reasons we use web analytics tools in general) is showing page engagement & performance data. Therefore, checking that this data is healthy is crucial. You can either review this data in reports > engagement (or, depending on how you initially created your property, it may be under “Generate leads” and “Examine user behavior”) > pages and screens/landing page, or build your own custom free-form exploration reports that combine the page and landing page dimensions with the sessions, engagement rate, bounce rate and views metrics.

Check for the following in these reports:
- Check the proportion of visits that are on your primary domain’s hostname. If other hostnames appear with more than 1% of the total traffic, you could be experiencing bot traffic problems or accidentally tracking on another domain that you shouldn’t be. Note that some website user journeys legitimately cross several separate domains and tracking on these is useful; if this is the case, just ensure your cross-domain tracking is activated and working.
- Check the volume of “(not set)” landing page traffic you have. A large proportion (usually more than 5%) is not a good sign and means your landing page data is not as accurate as it could be. This happens because users are triggering a “session_start” event on certain pages but not a “page_view” event. Test your website and check your session timeout setting (it could be set too low). To help narrow down your testing, you can review which pages are affected by creating the following custom exploration report:

- Check the “extremes” in your landing page’s bounce rate. Both an extremely high bounce rate (greater than 95%) and an extremely low bounce rate (lower than 5%) can indicate technical problems such as broken links, missing pages, slow-to-load pages, error messages or even tracking issues with those pages.
- Check for “cardinality” issues in any of your page reports. If you start seeing “(other)” in these reports, you are hitting cardinality limits. This can make reporting unusable, as it is essentially “hiding” data in the “(other)” bucket. The next check will help you reduce the risk of this happening.
- Check how many pageviews are affected by pages with query parameters. Too many pages with various query parameters can increase the risk of cardinality as well as make it challenging to view specific query parameters effectively and efficiently. To view this you must create a custom exploration report and add the “Page path + query string” or “Page location” dimension and add the metric “Views” and filter for pages that contain “?”. I advise storing key query parameters as custom dimensions and removing all query parameters in the page_location variable for all event hits.
- Check your page title data health by adding the page title dimension as a secondary dimension and reviewing how many “(not set)” and “404” pages there are. In general, you should not have any 404 or (not set) pages on your website. Therefore you need to ask your developer to fix these pages and their page titles.
- Check for any Personally Identifiable Information (PII) in query parameters and page titles. If you see things such as email addresses, names and phone numbers, then this has to be removed as soon as possible, as you are breaching GDPR and GA4’s policies. Follow these steps to remove it:
- Get your developers to remove these (you can use Google Tag Manager to remove them if developer support is not available, but this is not considered best practice)
- Then you have to submit a data deletion request in the GA4 admin settings > property settings > data collection and modification > data deletion requests:

Page data health checks help not only ensure your page data and landing page data are in good shape for reporting purposes but also help spot technical issues on your website as well as spot and identify PII.
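The hostname, query parameter and PII checks above can also be run over an export of the “Page location” dimension with its “Views” metric. The sketch below assumes that export has been loaded into a dictionary of URL to view count; the function name is hypothetical, and the regular expression only looks for email-like strings, so it will not catch names or phone numbers.

```python
import re
from urllib.parse import urlsplit

# Matches email-like strings, including ones where "@" is URL-encoded as %40.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+(?:@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def audit_page_locations(page_views, primary_host):
    """page_views: dict mapping a full page_location URL to its view count."""
    total = sum(page_views.values()) or 1  # avoid dividing by zero on empty input
    other_hosts, views_with_query, possible_pii = {}, 0, []
    for url, views in page_views.items():
        parts = urlsplit(url)
        if parts.hostname != primary_host:
            other_hosts[parts.hostname] = other_hosts.get(parts.hostname, 0) + views
        if parts.query:
            views_with_query += views
        if EMAIL_RE.search(url):
            possible_pii.append(url)
    return {
        "other_host_share": {host: v / total for host, v in other_hosts.items()},
        "query_param_share": views_with_query / total,
        "possible_pii": possible_pii,
    }

# Hypothetical export.
pages = {
    "https://www.example.com/": 900,
    "https://www.example.com/search?q=shoes": 60,
    "https://www.example.com/thanks?email=jane%40example.com": 10,
    "https://staging.example.com/": 30,
}
print(audit_page_locations(pages, "www.example.com"))
```

Here the staging hostname holds 3% of views, which is above the 1% guideline, and one URL would need a data deletion request.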
Events Data Health
This area does not cover how to check your events are firing correctly but focuses on data health checks that can be found within the GA4 UI. We will cover GA4 Custom Events Testing next week, but this will help you pinpoint which events need urgent reviewing and testing within your GTM. Here we will review the two types of events configured in your GA4, check whether there has been any major drop-off in these events (which could indicate they are redundant), and review the event naming structure. You will be able to see all your events and conversion data in reports > engagement (or, depending on how you initially created your property, it may be under “Generate leads” and “Examine user behaviour”) > events/conversions, or build your own custom free-form exploration reports using the event name dimension with the event count metric.

Automatically collected events
To see which events have been configured to “automatically” collect data in your GA4 property you can go to your admin settings > property settings > data collection and modification > data streams > select your data stream > under “Enhanced Measurement”:

- The page_view event, which in most cases should be initiated via the configuration tag. Check that the event count of the page_view event is greater than that of the session_start event. If it’s lower, it could indicate an issue with your configuration. This could happen if you have Single Page Applications (SPAs) on your website, in which case you need to either select “Page changes based on browser history events” in the advanced settings or configure this correctly in GTM.
- If site search functionality is available on the website and enabled in the “Enhanced measurement” section, then it’s useful to check that the view_search_results event is working correctly and populating search terms. We can check this by creating a custom exploration report and adding the “Search term” dimension with event count as a metric. If this is blank, it could be that the query parameter your website uses to hold the search term is not part of the default list of parameters GA4 automatically looks for (these are q, s, search, query, keyword). You can add additional query parameters for GA4 to pick up in the advanced site search settings in the “Enhanced measurement” section.
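Both checks above can be scripted as a quick sanity test. This is a rough sketch under stated assumptions: the function names are hypothetical, the event counts come from an export of the events report, and the URLs are a handful of search result pages copied from your own website.

```python
from urllib.parse import parse_qsl, urlsplit

# The default search term parameters listed above.
DEFAULT_SEARCH_PARAMS = {"q", "s", "search", "query", "keyword"}

def page_views_exceed_sessions(event_counts):
    """True when page_view has a higher event count than session_start."""
    return event_counts.get("page_view", 0) > event_counts.get("session_start", 0)

def uncovered_query_params(search_result_urls, configured=DEFAULT_SEARCH_PARAMS):
    """Query parameter names on search result URLs that are not configured.

    Not every name returned is a search term parameter (pagination and sort
    parameters show up too), so treat the result as a list of candidates.
    """
    seen = set()
    for url in search_result_urls:
        for name, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            seen.add(name)
    return sorted(seen - set(configured))

print(page_views_exceed_sessions({"page_view": 800, "session_start": 1000}))  # -> False
print(uncovered_query_params([
    "https://www.example.com/results?term=red+shoes&page=2",
    "https://www.example.com/results?q=hat",
]))  # -> ['page', 'term']
```

In this made-up example, “term” would be the parameter to add in the advanced site search settings.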
Manually collected custom events
- Check these events still work by comparing the last 7, 30 and 90 days. If an event dropped off in the last 7 or 30 days, review whether it is still needed: if it is, fix it; otherwise, remove it.
- Check your event naming structure. First of all, check there is no duplication across the event naming schema and secondly, make sure the structure is good. A good naming structure should follow a clear logical structure, be lowercase throughout, and utilise underscores (i.e. “_”) to bridge spacing. An example of a good naming structure would be {{action, i.e. click, submit, view}}_{{product, cta type, form type etc.}}.
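Both custom event checks lend themselves to a small script once you have exported event names and daily event counts. The sketch below is one possible take: the function names, the 20% drop-off ratio and the sample data are assumptions, not GA4 rules.

```python
import re
from collections import defaultdict

# Lowercase words joined by single underscores, e.g. "click_cta".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")

def audit_event_names(event_names):
    """Return (badly formatted names, groups of names that look like duplicates)."""
    bad_format = sorted(n for n in event_names if not SNAKE_CASE.match(n))
    groups = defaultdict(list)
    for name in event_names:
        # Ignore case, spaces and delimiters when comparing names.
        groups[re.sub(r"[^a-z0-9]", "", name.lower())].append(name)
    near_duplicates = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    return bad_format, near_duplicates

def dropped_off(daily_counts, recent_days=7, baseline_days=90, min_ratio=0.2):
    """True when the recent daily average is below min_ratio of the baseline."""
    recent = daily_counts[-recent_days:]
    baseline = daily_counts[-baseline_days:-recent_days]
    if not recent or not baseline or sum(baseline) == 0:
        return False
    recent_avg = sum(recent) / len(recent)
    baseline_avg = sum(baseline) / len(baseline)
    return recent_avg < min_ratio * baseline_avg

names = ["click_cta", "Click_CTA", "submit_form", "form submit", "clickCta"]
print(audit_event_names(names))
print(dropped_off([100] * 83 + [0] * 7))  # -> True
```

Calling `dropped_off` again with `recent_days=30` gives the 30 vs 90 day comparison.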
Custom Definitions Data Health
To review your list of custom definitions you need to go to admin settings > property settings > data display > custom definitions. It is important to know that you have the following custom definition limits, depending on whether you are a GA360 customer or not:

Therefore it is important to review whether these custom definitions are still needed and whether they are working robustly. Similar to the events section, this area does not cover how to check that your events are firing and therefore populating the event parameters and user properties correctly (we will cover this in Part 4), but focuses on data health checks that can be found within the GA4 UI. For the following checks, it is best to create a custom free-form exploration report; within this report, you will only need the custom dimensions and metrics, event name and event count. Two types of checks need to be performed:
- Check these custom dimensions and metrics still work by comparing the last 7, 30 and 90 days. If a custom dimension or metric dropped off in the last 7 or 30 days, review whether it is still needed: if it is, fix it; otherwise, archive it within the GA4 UI custom definition settings.
- Check these custom definitions are not populating blanks or (not set) values. Note: You must add an event name as a dimension filter on this report and when checking for each custom definition make sure to filter for only events that utilise this parameter. These values are being populated because either an empty string is being passed through or the parameter value is not present (i.e. ‘undefined’) which means that there is an issue and it needs to be fixed. Tip: to narrow down where this is occurring, add the page location dimension in this report and filter for blanks and (not set) values.

By incorporating these two checks, we aim not only to optimise the utilisation of custom definitions but also to enhance their overall efficiency and meaningful contribution to analysis pieces.
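The blank and (not set) check can be quantified from an export of the exploration described above. This is a minimal sketch, assuming the export has been filtered to the events that use the parameter; the function names and sample rows are hypothetical.

```python
BLANK_VALUES = {"", "(not set)"}

def blank_share(value_counts):
    """value_counts: dict mapping a custom dimension value to its event count."""
    total = sum(value_counts.values())
    if total == 0:
        return 0.0
    blanks = sum(n for value, n in value_counts.items() if value.strip() in BLANK_VALUES)
    return blanks / total

def pages_with_blanks(rows):
    """rows: iterable of (page_location, parameter_value, event_count) tuples.

    Returns pages ordered by how many events had a blank or (not set) value.
    """
    pages = {}
    for page, value, count in rows:
        if value.strip() in BLANK_VALUES:
            pages[page] = pages.get(page, 0) + count
    return sorted(pages.items(), key=lambda item: item[1], reverse=True)

print(blank_share({"header": 700, "footer": 200, "(not set)": 80, "": 20}))  # -> 0.1
print(pages_with_blanks([
    ("/home", "header", 700),
    ("/home", "", 20),
    ("/pricing", "(not set)", 80),
    ("/pricing", "footer", 200),
]))  # -> [('/pricing', 80), ('/home', 20)]
```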
Ecommerce Data Health
How much you need to check will depend on the extent of your ecommerce setup in GA4. At Know Analytics, if you have an online sales-type website/app, we advise you to have at least the following GA4 ecommerce events so you can measure not only sales and revenue but also see a checkout funnel:
- purchase (a must-have)
- add_to_cart
- begin_checkout
- add_payment_info
- view_item
We also recommend passing through the following item-level data in each ecommerce event:
- item_name and/or item_id (at least one of these is required)
- item_category (not a requirement but useful)
- item_variant (not a requirement but useful)
- price (required for purchase event to see item level revenue information in GA4)
- quantity (required for purchase event to see item-level revenue information in GA4)
Note that for the purchase event you must pass through a transaction_id as well as the value parameter (required for measuring revenue for the purchase event).
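To make these requirements concrete, here is a rough sketch of a validator for the parameters of a purchase event before it is sent. The function name and the flat dictionary shape are assumptions for illustration, and treating price and quantity as strictly positive is my own simplification (a free item would fail it).

```python
def is_number(value):
    """True for ints and floats, but not for booleans or numeric strings."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def validate_purchase(params):
    """Return a list of problems found in a purchase event's parameters."""
    problems = []
    if not params.get("transaction_id"):
        problems.append("missing transaction_id")
    if not is_number(params.get("value")):
        problems.append("value must be a number")
    items = params.get("items") or []
    if not items:
        problems.append("no items")
    for i, item in enumerate(items):
        if not (item.get("item_id") or item.get("item_name")):
            problems.append(f"items[{i}]: needs item_id or item_name")
        for field in ("price", "quantity"):
            field_value = item.get(field)
            if not is_number(field_value) or field_value <= 0:
                problems.append(f"items[{i}]: {field} missing or not a positive number")
    return problems

good = {
    "transaction_id": "T-1001",
    "value": 39.98,
    "items": [{"item_id": "SKU-1", "item_name": "T-shirt", "item_category": "Apparel",
               "item_variant": "blue", "price": 19.99, "quantity": 2}],
}
bad = {"value": "19.99", "items": [{"item_category": "Apparel", "price": 19.99}]}

print(validate_purchase(good))  # -> []
print(validate_purchase(bad))
```

The second call reports the missing transaction_id, the value passed as a string, the missing item identifier and the missing quantity.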
Now for the checks (we will use the Google Merchandise store GA4 demo property to demonstrate these):
- Specific purchase event data checks:
- Check for duplicate transactions. Make sure each transaction has a unique transaction ID. Show your developers these transaction IDs that are duplicates and they can help solve this issue. You can create the following custom exploration report to check this (note: in this example, no duplicates were found as it would have to be > 1 per transaction ID):

- Check that ‘purchase revenue’ has been configured correctly for each transaction. When the ecommerce revenue is 0 it indicates you have not utilized the ‘value’ parameter in your purchase hits or you have but the format is incorrect. Create the following report to view this (note: in this example, no issues were detected):

- Check if any items have 0 item revenue. This means the price and/or quantity item-level metadata is not correctly populated at the purchase event hit for certain items. Create the following report to view this (note: in this example, no issues were detected):

- Check if any transactions have 0 quantity. Capturing transactions with 0 items/quantity is a red flag and reveals a potential implementation issue regarding item tracking. Create the following report to view this (note: in this example, no issues were detected):

- Check the main funnel stages are configured correctly. If all the events listed above have been configured, we need to check that the user count on each step shows a funnel flow from product detail view (view_item) to add to cart to begin checkout to add payment info to purchase. The count of users should therefore decrease in that order. A funnel report is not the right report for this check, as it can only show whether the order of events is incorrect, not whether specific checkout journeys are missing some of the ecommerce event hits. We will need to create the following free-form report to be able to see this (in this case, the add_payment_info user count should be lower than the begin_checkout user count but is not, indicating an issue with the tracking setup):

- Check that, for all ecommerce events, the item-level metadata added to each has no “(not set)” values (i.e. the parameter is missing entirely) or blank values (i.e. the parameter is passed through but as an empty string). Review which pages and other item metadata these (not set) or blank values occur for, and see whether this is a tracking issue or whether this specific item parameter genuinely does not have a value for this item at that specific event. For example, let us check the item category information for just the purchase event (in this case 638 item quantities at the purchase event had a blank value):

Finally, ensuring your GA4 ecommerce data is healthy is essential in understanding not only your sales data but also checkout user journey and product level analysis.
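The duplicate transaction, zero revenue and funnel checks can be automated once the relevant exploration data is exported. The sketch below uses hypothetical function names and made-up figures; the funnel order is the one described in the checks above.

```python
from collections import Counter

# Expected order: user counts should not increase from one step to the next.
FUNNEL = ["view_item", "add_to_cart", "begin_checkout", "add_payment_info", "purchase"]

def duplicate_transactions(transaction_ids):
    """Return transaction IDs recorded more than once, with their counts."""
    counts = Counter(transaction_ids)
    return {tid: n for tid, n in counts.items() if n > 1}

def zero_revenue_transactions(revenue_by_transaction):
    """Return transaction IDs whose purchase revenue is 0 or missing."""
    return sorted(tid for tid, revenue in revenue_by_transaction.items() if not revenue)

def funnel_breaks(users_by_event, funnel=FUNNEL):
    """Return (earlier, later) step pairs where the later step has more users."""
    breaks = []
    for earlier, later in zip(funnel, funnel[1:]):
        if users_by_event.get(later, 0) > users_by_event.get(earlier, 0):
            breaks.append((earlier, later))
    return breaks

print(duplicate_transactions(["T1", "T2", "T2", "T3"]))  # -> {'T2': 2}
print(zero_revenue_transactions({"T1": 25.0, "T2": 0, "T3": None}))  # -> ['T2', 'T3']
print(funnel_breaks({
    "view_item": 5000, "add_to_cart": 1200, "begin_checkout": 600,
    "add_payment_info": 750, "purchase": 300,
}))  # -> [('begin_checkout', 'add_payment_info')]
```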
Conclusion
Overall, making sure your GA4 data is healthy is integral to maintaining tracking health and accurate attribution. If this is not maintained, your data will become difficult to use and, in extreme cases, unusable. Performing all the checks outlined above will thoroughly cover every facet of your GA4 data health.
Thanks for reading! Follow me next week for part 4 “GA4 Custom Events Testing”!