GA4 API Data 2 Excel
GA4 Data - API --> Python script in Google Colab --> Google Drive Excel file
In this tutorial I will explain how to connect to the Google Analytics 4 API, extract data from it and put it in a Excel file. The script is programmed in Python language and can be run on Google Colab and saves to Google Drive. You can easily follow the instructions even if you do not have Python coding experience.
Download/Access Google Colab file
Watch the video
Youtube video with explanation of this turorial.
Easy Colab notebook v2
This new Colab file allows you to configure all the variables in the first field, you only have to set the settings right and click on Run All, the script will do the rest. Very Easy :)
Go to: https://colab.research.google.com/drive/1kqmuj0AVgVq_UBQAVR_hqaPZ-NWFW4BH?usp=sharing
- Site with GA4
First step is to have a site with Google Analytics 4 installed on it.
2.Google Cloud Console
Go to the Google Cloud Consol ( https://console.cloud.google.com/ ), you do not have to start the trial, just click on CONSOLE.
On the top you can select a PROJECT and ther you have the option to create a new project.
A. Create a new project
B. Go to APIS & SERVICES --> + ENABLE APIC AND SERVICES --> enable the Google Analytics Data API
C. Go to CREDENTIALS and add a Service Account
D. Open Service account, go to Keys and create NEW KEYS
E. A json keyfile will be downloaded and stored in your download folder
3. Add user to GA4
Open the JSON file you just downloaded (in a text editor or in a browser window by drag and drop).
Copy the client_email right under the private key, like:
ga4-service-account@ga4-extract-data-388610.iam.gserviceaccount.com
In Analytics under Admin, go to the property, Property Access Management and add the e-mail as a user. (permissions Viewer are ok)
4. Google Drive add folder
Go to Google Drive and create a folder named "GA4-DATA" in the root. Upload the json file with the credentials from step 3.
5. Fire up Google Colab!
Open the shared notebook: Google Colab file or go to: colab.research.google.com.
Alternatively you can copy paste the Python code below.
Place every section: # ------- in a new cell, execute every cell with SHIFT+ENTER
# -----------------------------
# Official documentation: https://developers.google.com/analytics/devguides/reporting/data/v1/quickstart-client-libraries#python
# Options, Metrics and Dimensions: https://developers.google.com/analytics/devguides/reporting/data/v1/api-schema
# Install module first
!pip install google-analytics-data
# -----------------------------
# Import modules
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
DateRange,
Dimension,
Metric,
RunReportRequest,
)
from google.oauth2 import service_account
from google.protobuf.json_format import MessageToDict
import pandas as pd
import os
from google.colab import drive
# -----------------------------
# Mount Google drive and list content
drive.mount('/content/drive')
os.listdir('./drive/MyDrive')
# -----------------------------
# List content of the folder I created on Google Drive for this project
os.listdir('./drive/MyDrive/GA4-DATA')
# -----------------------------
# Set credentials and define the right GA4 property
credentials = service_account.Credentials.from_service_account_file('./drive/MyDrive/GA4-DATA/ga4-extract-data-388610-96abbf0fa466.json')
client = BetaAnalyticsDataClient(credentials=credentials)
property_id="376319614"
# -----------------------------
# Get the data from the API
request = RunReportRequest(
property=f"properties/{property_id}",
dimensions=[Dimension(name="city"), Dimension(name="country")],
metrics=[Metric(name="activeUsers"), Metric(name="sessions")],
date_ranges=[DateRange(start_date="2020-03-31", end_date="today")],
)
response = client.run_report(request)
response
# -----------------------------
# Turn the raw data into a Table
def ga4_result_to_df(response):
"""Original: print_run_report_response: Prints results of a runReport call. v2.1 changed by Bram to create DataFrame"""
result_dict = {}
for dimensionHeader in response.dimension_headers:
result_dict[dimensionHeader.name] = []
for metricHeader in response.metric_headers:
result_dict[metricHeader.name] = []
for rowIdx, row in enumerate(response.rows):
for i, dimension_value in enumerate(row.dimension_values):
dimension_name = response.dimension_headers[i].name
result_dict[dimension_name].append(dimension_value.value)
for i, metric_value in enumerate(row.metric_values):
metric_name = response.metric_headers[i].name
result_dict[metric_name].append(metric_value.value)
return pd.DataFrame(result_dict)
df = ga4_result_to_df(response)
df
# -----------------------------
# Export the table to an Excel file on the Google Drive
df.to_excel('./drive/MyDrive/GA4-DATA/GA4_Excel.xlsx')