Diagnose your Aggregation Jobs

The following tables detail common issues and error status codes, along with likely causes and actions you can take to resolve them in your deployment. For the full error specifications and mitigations for Aggregation Service, see our current public guidance.

Guide topics:

Permissions & Authorization Errors

Issue Permissions errors when executing terraform plan or terraform apply against your public cloud project.
Example Error Error: UnauthorizedOperation: You are not authorized to perform this operation.
Resolution

Check that you are properly authenticated in the command-line interface (CLI) of the public cloud that you are using.

Amazon Web Services

AWS requires your user to have permissions to create instances and the other services required for Aggregation Service. Once those permissions are applied, you should be able to run terraform plan and terraform apply without issues.

Google Cloud Platform

In Google Cloud, note that you must impersonate a service account to deploy the second half of the Terraform. It is the deployment service account, not your user account, that holds the permissions needed to create resources, so terraform apply may fail if you skipped this step. See step 4 in "Set up your deployment environment" in the GitHub documentation.

Privacy Budget Errors

Error PRIVACY_BUDGET_ERROR
Cause The service was unable to process the reports due to an error with the privacy budget service.
Check Retry the job to see if the error was intermittent. If it persists, reach out to us through the technical support form.
Error PRIVACY_BUDGET_AUTHORIZATION_ERROR
Cause You may be using a different reporting origin than the one you provided during onboarding.
Check

Ensure that the site you are submitting in the attribution_report_to field of the createJob request is the same site that was submitted during onboarding.

The site should match or be a subdomain of what was onboarded. Note that Aggregation Service onboarding is handled at the top level domain, and all subdomains are eligible to use the Aggregation Service once the top level domain is onboarded.
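As a quick illustration of the matching rule above, here is a hypothetical Python helper (the function name and inputs are not part of the service's API) that checks whether a reporting origin is the onboarded site or one of its subdomains:

```python
from urllib.parse import urlparse

def is_covered_by_onboarded_site(reporting_origin: str, onboarded_site: str) -> bool:
    """Return True if reporting_origin is the onboarded site or a subdomain of it."""
    # Fall back to the raw string when no scheme is present.
    origin_host = urlparse(reporting_origin).hostname or reporting_origin
    site_host = urlparse(onboarded_site).hostname or onboarded_site
    return origin_host == site_host or origin_host.endswith("." + site_host)

# A subdomain of the onboarded top-level domain is eligible.
print(is_covered_by_onboarded_site("https://ads.example.com", "https://example.com"))  # True
```

This is only a local sanity check before submitting a createJob request; the authoritative matching happens on the service side.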

Error PRIVACY_BUDGET_AUTHENTICATION_ERROR
Cause You may be using an outdated or incorrect ARN.
Check Google Cloud Platform

Check that the service account being used in your Aggregation Service deployment matches the service account that was provided during onboarding. It must match exactly, not just belong to the same project.

Amazon Web Services

Ensure that you are using the same coordinators that were provided to you by email. If you are still having issues, gather your auto.tfvars file and reporting origin information and reach out to us through the technical support form.

Error PRIVACY_BUDGET_EXHAUSTED
Example error:
            "result_info": {
              "return_code": "PRIVACY_BUDGET_EXHAUSTED",
              "return_message": "com.google.aggregate.adtech.worker.exceptions.AggregationJobProcessException:
              Insufficient privacy budget for one or more aggregatable reports. No aggregatable report can appear
              in more than one aggregation job. Information related to reports that do not have budget can be
              found in the following file:
              File path: //
              Filename: privacy budget exhausted debugging information  \n
              com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.consumePrivacyBudgetUnits(ConcurrentAggregationProcessor.java:525) \n com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor.process(ConcurrentAggregationProcessor.java:319) \n com.google.aggregate.adtech.worker.WorkerPullWorkService.run(WorkerPullWorkService.java:157)",
              "error_summary": {
                  "error_counts": [],
                  "error_messages": []
              },
              "finished_at": 
            }
          

Cause Privacy budget exhaustion happens when you try to batch a report whose shared ID has already been included in a previously successful batch. This error is due to the "No duplicates" rule: aggregatable reports may appear in only a single batch and can contribute to only one summary report.

Each report is assigned a "shared ID" consisting of the shared_info fields api, reporting_origin, destination_site, source_registration_time (truncated to the day), scheduled_report_time (truncated to the hour), and version. This means that multiple reports belong to the same shared ID when these shared_info attributes match.
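To illustrate the grouping above, here is a sketch in Python of how reports might be keyed by shared ID. The field handling and timestamp truncation are illustrative assumptions, not the service's actual implementation:

```python
from datetime import datetime, timezone

def shared_id_key(shared_info: dict) -> tuple:
    """Build an illustrative shared-ID key from a report's shared_info fields."""
    src = datetime.fromtimestamp(int(shared_info["source_registration_time"]), tz=timezone.utc)
    sched = datetime.fromtimestamp(int(shared_info["scheduled_report_time"]), tz=timezone.utc)
    return (
        shared_info["api"],
        shared_info["reporting_origin"],
        shared_info["destination_site"],
        src.replace(hour=0, minute=0, second=0, microsecond=0),  # truncate to the day
        sched.replace(minute=0, second=0, microsecond=0),        # truncate to the hour
        shared_info["version"],
    )

# Two reports scheduled within the same hour share the same ID.
a = {"api": "attribution-reporting", "reporting_origin": "https://example.com",
     "destination_site": "https://dest.example", "source_registration_time": "1700000000",
     "scheduled_report_time": "1700003000", "version": "0.1"}
b = dict(a, scheduled_report_time="1700003500")
print(shared_id_key(a) == shared_id_key(b))  # True
```

Because such reports collapse to one shared ID, including any of them in a second batch after a successful job triggers the "No duplicates" rule.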

Check

We recommend using the privacy budget exhausted debugging support provided in the job response to inspect and resolve the error. This provides a helper JSON file that gives visibility into which reports contributed to the error.

Note that if you are batching correctly, you may be eligible for budget recovery (explainer). Read the explainer and fill out the form; your request will need to be approved before you can recover the budget and run the job again.

Error DEBUG_SUCCESS_WITH_PRIVACY_BUDGET_EXHAUSTED
Cause This indicates you are running the job in debug mode: the job_parameters in the createJob request contains debug_run: true. When the debug_run flag is enabled, you can run the report multiple times for debugging purposes. This error message informs you that the job would have failed due to the report's privacy budget being exhausted if it had not been run in debug mode. This error applies only to releases v2.10.0 and earlier.
Check The createJob request body will contain debug_run in the job_parameters.
            {
              "job_request_id": "{job_request_id}",
              "input_data_blob_prefix": "{input_prefix}",
              "input_data_bucket_name": "{input_bucket}",
              "output_data_blob_prefix": "{output_prefix}",
              "output_data_bucket_name": "{output_bucket}",
              "job_parameters": {
                "output_domain_blob_prefix": "{output_domain_prefix}",
                "output_domain_bucket_name": "{output_domain_bucket}",
                "attribution_report_to": "{reporting_origin}",
                "debug_run": "true"
              }
            }
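The request body above can be assembled programmatically. This hypothetical Python helper uses placeholder bucket and prefix names; the point of the example is the debug_run flag in job_parameters:

```python
import json

def build_debug_create_job(job_request_id: str) -> str:
    """Assemble a createJob request body with debug mode enabled (placeholder values)."""
    body = {
        "job_request_id": job_request_id,
        "input_data_blob_prefix": "reports/",
        "input_data_bucket_name": "my-input-bucket",
        "output_data_blob_prefix": "summary/",
        "output_data_bucket_name": "my-output-bucket",
        "job_parameters": {
            "output_domain_blob_prefix": "output_domain/",
            "output_domain_bucket_name": "my-domain-bucket",
            "attribution_report_to": "https://example.com",
            # Note: debug_run is a string value, matching the request shown above.
            "debug_run": "true",
        },
    }
    return json.dumps(body, indent=2)

print(build_debug_create_job("0001"))
```

Checking the serialized body for the debug_run flag before submission makes it easy to see why a job returned DEBUG_SUCCESS_WITH_PRIVACY_BUDGET_EXHAUSTED rather than failing outright.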
          

Job Runtime Errors

Error INVALID_JOB
Endpoint createJob
Cause This can happen when the debug privacy epsilon provided is not within the bounds (0, 64], or when the job parameters fail validation.
Check What epsilon value was used? What job parameters were used in the createJob request, and do those match your environment? Are they formatted correctly? Make the corrections needed and retry the job.
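A minimal pre-submission sanity check for the epsilon bound, assuming the (0, 64] interval described above:

```python
def epsilon_in_bounds(epsilon: float) -> bool:
    """Return True if the debug privacy epsilon lies in the half-open interval (0, 64]."""
    return 0 < epsilon <= 64

print(epsilon_in_bounds(10))    # True
print(epsilon_in_bounds(0))     # False: epsilon must be strictly positive
print(epsilon_in_bounds(64.5))  # False: above the upper bound
```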
Error INTERNAL_ERROR
Endpoint getJob
Cause Can be a formatting issue that causes failed processing for output domain or reports. Can also be an issue with your Aggregation Service deployment.
Check Ensure the output domain location is a valid path. Retry the job. If the error persists, gather your auto.tfvars file and the Terraform plan output to troubleshoot your Aggregation Service deployment.
Error RESULT_WRITE_ERROR
Endpoint getJob
Cause This can happen when the write to the output directory fails, either transiently or due to lack of write permission on the directory. Note that write errors do consume privacy budget, and the job cannot be retried; retrying can result in a PRIVACY_BUDGET_EXHAUSTED error.
Check Is this error occurring on every job, or only intermittently? If this is occurring in every job, ensure that you have enabled write permissions on the output directory. If this is an intermittent failure, the permissions should be correct. It is a known issue that writing summary reports can fail but the privacy budget will still be consumed. In this case, you can request budget recovery (explainer).
Issue 403 errors occur while running a job and retrieving an attestation service token, and the job always returns with status "RECEIVED".
Error
            {
                "job_status": "RECEIVED",
                "request_received_at": "{utc timestamp}",
                "request_updated_at": "{utc timestamp}",
                "job_request_id": "0001",
                "input_data_blob_prefix": "reports/",
                "input_data_bucket_name": "{bucket_name}",
                "output_data_blob_prefix": "summary/",
                "output_data_bucket_name": "{bucket_name}",
                "postback_url": "",
                "job_parameters": {
                    "output_domain_bucket_name": "{bucket_name}",
                    "output_domain_blob_prefix": "output_domain/",
                    "attribution_report_to": 
                }
            }
          
Resolution

Jobs stuck in RECEIVED status and 403 errors commonly occur when the service account has not yet been onboarded. Verify that the service account you are using matches the one you provided in your onboarding request. If you have not completed an onboarding request, please fill out the onboarding and enrollment forms.

Once you have verified your enrollment and onboarding status, check what happened to your running job.

Amazon Web Services

When this happens, the AWS enclave may not be running or may have crashed, so jobs are not being picked up.

  1. Connect to the EC2 instance through Session Manager, following this AWS documentation.
  2. Go to AWS Console Manager > EC2 > Instances.
  3. Select the Instance ID of the running Aggregation Service instance.
  4. Select the "Session Manager" tab > "Connect" button. This connects you to your instance.
  5. Once the enclave instance is running, execute in the terminal:
    sudo nitro-cli describe-enclaves
    If this command does not show the expected output, execute the following before trying again:
    sudo nitro-cli run-enclave --cpu-count=2 --memory=7000 --eif-path=/opt/google/worker/enclave.eif
  6. To check whether the AWS enclave has crashed, run: sudo journalctl -u aggregate-worker.service
  7. You should see output logs such as:
    Starting aggregate-worker.service - Watcher script for nitro enclave.
    Any failures should be visible here.
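The enclave check in the steps above could be scripted. This sketch parses the JSON that nitro-cli describe-enclaves prints, assuming each enclave entry carries a State field; the subprocess call is shown commented out so the parsing can be exercised offline:

```python
import json

def any_enclave_running(describe_output: str) -> bool:
    """Return True if any enclave in the describe-enclaves JSON reports State RUNNING."""
    enclaves = json.loads(describe_output)
    return any(e.get("State") == "RUNNING" for e in enclaves)

# In practice you would capture the CLI output on the instance, e.g.:
# import subprocess
# out = subprocess.run(["sudo", "nitro-cli", "describe-enclaves"],
#                      capture_output=True, text=True, check=True).stdout
sample = '[{"EnclaveID": "i-abc-enc-123", "State": "RUNNING", "CPUCount": 2}]'
print(any_enclave_running(sample))  # True
```

If this check returns False, run the nitro-cli run-enclave command from step 5 and then re-check.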
Google Cloud Platform

The managed instance group (MIG) may not be healthy. If this is the first-time setup, or you destroyed and recreated the adtech_setup Terraform, confirm that your service account is onboarded. If the service account isn't onboarded, the MIGs won't be healthy.

  1. In the Cloud Console, navigate to Compute Engine > Instance groups.
  2. Check the Status column (a green check mark means healthy).
  3. Click one of the instance groups and look at the Errors tab to learn more about the issue. Click the instance name to access VM-level information.
  4. You can also use your terminal to get the same information from the instance group. Try the list-errors command:
    gcloud compute instance-groups managed list-errors --region=
    The following is an example output.
                      INSTANCE_URL: https://www.googleapis.com/compute/v1/projects/aggservice-sandbox/zones/us-central1-c/instances/collector-operator-demo-env-67hd
                      ACTION: VERIFYING
                      ERROR_CODE: WAITING_FOR_HEALTHY_TIMEOUT_EXCEEDED
                      ERROR_MESSAGE: Waiting for HEALTHY state timed out (autohealingPolicy.initialDelay=200 sec) for instance projects/aggservice-sandbox/zones/us-central1-c/instances/collector-operator-demo-env-67hd and health check projects/aggservice-sandbox/global/healthChecks/operator-demo-env-collector-auto-heal-hc.
                      TIMESTAMP: 
                      INSTANCE_TEMPLATE: https://www.googleapis.com/compute/v1/projects/aggservice-sandbox/global/instanceTemplates/operator-demo-env-collector
                      VERSION_NAME: primary
                    
If you continue to see issues, save this output and provide it to our team, then continue to the next steps.

Is your summary report converting as expected?

A situation may arise where your getJob call is successful, but there is an issue with the summary report returned by the Aggregation Service. The summary report is Avro formatted and needs to be converted to JSON. Once converted, it will look similar to the following.

{
  "bucket": "\u0005Y",
  "metric": 26308
}
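The bucket value above is a byte string rendered as escaped JSON text. Here is a sketch of recovering the numeric bucket key, assuming the bytes encode a big-endian unsigned integer (a common encoding for aggregation keys); the helper name is illustrative:

```python
def bucket_to_int(bucket: str) -> int:
    """Decode an escaped bucket string (e.g. "\u0005Y") into its integer key."""
    # latin-1 maps each code point 0-255 to the byte of the same value.
    return int.from_bytes(bucket.encode("latin-1"), byteorder="big")

print(bucket_to_int("\u0005Y"))  # 1369, i.e. 0x05 * 256 + 0x59
```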

If the Avro conversion has issues, try Avro Tools with the following command on the Avro report:

    java -jar avro-tools-1.11.1.jar tojson [report_name].avro > [report_name].json

Stable versions can be downloaded from here. If you require further assistance, continue to our next steps.

Next Steps

Check if anyone else has encountered the same issue on the Privacy Sandbox Status Dashboard or on the public GitHub repository.

If you do not see a resolution to your Aggregation Service issue, notify us by filing a GitHub issue or submitting the technical support form.