Live Indexing - Google Pub/Sub

How to enable Google Pub/Sub messaging for Live Indexing of objects in Google Cloud Storage buckets.

Google Pub/Sub is a fully managed publish/subscribe messaging service that enables you to decouple and scale microservices, distributed systems, and serverless applications.

What does ChaosSearch offer GCP Pub/Sub users?

ChaosSearch supports live indexing through an integration with Google Pub/Sub. With the Pub/Sub integration configured, ChaosSearch can identify new files that have been added to the queue for discovery and indexing.

Functionality of Live Indexing

With Live Indexing enabled for an object group, ChaosSearch monitors your Google storage bucket for object creation events via user-specified Pub/Sub subscriptions to automatically index new objects and make the data available for querying.

Add Pub/Sub Integration in ChaosSearch - Quick Option

🚧

GCP Service account

Please note that the Google Cloud Platform Service account configured for the ChaosSearch account must also be configured with appropriate access to the Pub/Sub subscription.

Add Pub/Sub Integration in ChaosSearch

  1. In the GCP Management Console, navigate to Pub/Sub.
  2. Click Create Topic. The Create a topic window appears. Note your GCP project ID value (cs-customer-success in this example).
  3. In the Topic ID field, type a name for the topic.
  4. Deselect (uncheck) Add a default subscription. Do not create the topic with this option selected.
  5. Click Create Topic. The topic is added; the next step is to create a subscription to associate with the topic.
  6. Click Subscriptions in the left menu.
  7. Click Create Subscription. The Create subscription window appears.
  8. In the Subscription ID field, type a name for the subscription that closely matches the topic name for ease of association.
  9. In the Select a Cloud Pub/Sub topic field, select the topic that you created in the previous steps.
  10. Scroll down to the Acknowledgement deadline section, and increase the time to 600 seconds.
  11. Scroll to the bottom of the page and click Create to add the subscription.

NOTE: As an alternative to the UI steps above, you can create the topic and subscription using the gcloud command-line utility. Some sample commands follow, with placeholder text that you must replace with the values for your account:

gcloud --project YOUR-PROJECT-ID pubsub topics create YOUR-TOPIC-NAME
gcloud --project YOUR-PROJECT-ID pubsub subscriptions create YOUR-SUBSCRIPTION-NAME --topic=YOUR-TOPIC-NAME --ack-deadline 600
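
To confirm from the command line that the subscription has the 600-second acknowledgement deadline, a check like the following may help. This is a minimal sketch and not part of the ChaosSearch procedure: the run helper, the DRY_RUN variable, and the check_ack_deadline function name are assumptions introduced here for illustration, and the placeholder values must be replaced with your own. By default the script only prints the command for review; set DRY_RUN=0 to execute it.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the values for your account.
PROJECT_ID="${PROJECT_ID:-YOUR-PROJECT-ID}"
SUBSCRIPTION="${SUBSCRIPTION:-YOUR-SUBSCRIPTION-NAME}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
# The printed line is for review only; shell quoting is not reproduced.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Ask Pub/Sub for the acknowledgement deadline of the subscription.
check_ack_deadline() {
  run gcloud --project "$PROJECT_ID" pubsub subscriptions describe "$SUBSCRIPTION" \
    --format="value(ackDeadlineSeconds)"
}

check_ack_deadline
```

When run with DRY_RUN=0, the command should report 600 if the deadline was set as described in the steps above.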

Granting Access to the Pub/Sub Service Topic

Make sure that the GCS service account is granted Publisher permission on the Pub/Sub topic that you just created for a live object group. Also, make sure that the ChaosSearch tenant service account is granted Subscriber permission on the Pub/Sub subscription. A summary of the steps follows; see the Google documentation for more information on the roles and GCS Pub/Sub access permissions.

📘

Your GCP account must have privileges to assign permissions.

Make sure that your user account is permitted to perform tasks like granting and changing access for Pub/Sub roles.

To grant the GCS service account Publisher permission:

  1. In Google Cloud, navigate to Cloud Storage > Buckets, and click Settings in the left menu.
  2. On the Settings page, scroll to the Cloud Storage Service Account section.
  3. Click the copy icon for the service account string. You need this value for the permission assignment steps that follow.
  4. Navigate to the GCP Pub/Sub area, and select the topic that you created for your live object group.
  5. Click Show Info Panel in the top right corner, then click Add Principal to grant the service account access to the live object group topic.
  6. In the Grant access to window, do the following:
    1. Paste the GCS service account ID that you copied in the earlier step into the New principals field.
    2. In the Assign roles field, filter and select the Pub/Sub Publisher role.
    3. Click Save.
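
The console steps above can likely also be performed from the command line. The following is a minimal sketch under stated assumptions: the run helper, the DRY_RUN variable, and the function names are introduced here for illustration, and every YOUR-... value is a placeholder. It uses gsutil kms serviceaccount to display the Cloud Storage service account for the project, and gcloud pubsub topics add-iam-policy-binding to grant the Publisher role (roles/pubsub.publisher) on the topic. By default the script only prints the commands so that you can review them; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the values for your account.
PROJECT_ID="${PROJECT_ID:-YOUR-PROJECT-ID}"
TOPIC="${TOPIC:-YOUR-TOPIC-NAME}"
# The Cloud Storage service account string shown on the Settings page.
GCS_SERVICE_ACCOUNT="${GCS_SERVICE_ACCOUNT:-YOUR-GCS-SERVICE-ACCOUNT}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Display the Cloud Storage service account for the project.
show_gcs_service_account() {
  run gsutil kms serviceaccount -p "$PROJECT_ID"
}

# Grant the Pub/Sub Publisher role on the topic to that service account.
grant_publisher() {
  run gcloud --project "$PROJECT_ID" pubsub topics add-iam-policy-binding "$TOPIC" \
    --member="serviceAccount:$GCS_SERVICE_ACCOUNT" \
    --role="roles/pubsub.publisher"
}

show_gcs_service_account
grant_publisher
```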

You can now create a notification on the bucket that stores the object files for the live object group. New messages are published to the topic when new objects are written to that storage location.

To grant the ChaosSearch service account Subscriber permission on the Pub/Sub subscription:

  1. In the GCP Pub/Sub area, navigate to the Subscriptions area.
  2. Select the subscription that you created for your live object group, then click Show Info Panel in the top right, and click Add Principal to grant access to the tenant service account.
  3. In the Grant access to window, do the following:
    1. Specify the ChaosSearch service account in the New principals field. (If you are not sure of the ChaosSearch account name, you can use the Service Accounts page to scan for the tenant and its name.)
    2. In the Assign roles field, filter and select the Pub/Sub Subscriber role.
    3. Click Save.
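
As a command-line alternative to the console steps above, the Subscriber role (roles/pubsub.subscriber) can be granted on the subscription with gcloud pubsub subscriptions add-iam-policy-binding. The following is a minimal sketch, assuming placeholder values throughout; the run helper, the DRY_RUN variable, the grant_subscriber function name, and the CHAOSSEARCH_SERVICE_ACCOUNT variable are introduced here for illustration only. By default the script prints the command for review; set DRY_RUN=0 to execute it.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the values for your account.
PROJECT_ID="${PROJECT_ID:-YOUR-PROJECT-ID}"
SUBSCRIPTION="${SUBSCRIPTION:-YOUR-SUBSCRIPTION-NAME}"
# The service account configured for your ChaosSearch tenant.
CHAOSSEARCH_SERVICE_ACCOUNT="${CHAOSSEARCH_SERVICE_ACCOUNT:-YOUR-CHAOSSEARCH-SERVICE-ACCOUNT}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Grant the Pub/Sub Subscriber role on the subscription to the tenant account.
grant_subscriber() {
  run gcloud --project "$PROJECT_ID" pubsub subscriptions add-iam-policy-binding "$SUBSCRIPTION" \
    --member="serviceAccount:$CHAOSSEARCH_SERVICE_ACCOUNT" \
    --role="roles/pubsub.subscriber"
}

grant_subscriber
```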

ChaosSearch can now read from the Pub/Sub subscription to identify when new storage objects are available for indexing by the live object group.

Adding the Pub/Sub Subscription to the ChaosSearch Object Group

You must associate your Pub/Sub Project ID and Subscription ID with the object group that you create to index the related Google Cloud Storage bucket.

  1. Create your object group following the steps described in Creating Object Groups.
  2. In the Create Object Group window, select the Live Indexing option, and paste the GCP Pub/Sub Project ID (cs-customer-success in this example) and the GCP Pub/Sub Subscription ID (cs-pwi-demo-subscription in this example) values into the two fields.
  3. Specify the retention values that you want to use for the index lifecycle.
  4. Click Create. The object group is created.
  5. In the Storage > Properties window, the Live badge on the right indicates that Live Indexing is configured.
  6. Click Start Indexing when you are ready to index the related objects in the cloud storage bucket.

Completing the Pub/Sub Live Indexing Setup

Make sure that you connect the Google Cloud Storage bucket to the Pub/Sub topic using Google’s gsutil command. A sample command format follows; you must replace the placeholder strings with the values for your GCP configuration.

gsutil notification create -f json -p YOUR-LOGS-PREFIX -e OBJECT_FINALIZE -t YOUR-TOPIC-NAME gs://YOUR-BUCKET-NAME
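
In the command above, -f json sets the notification payload format, -p limits notifications to objects whose names begin with the given prefix, -e OBJECT_FINALIZE selects object creation events, and -t names the topic. To confirm that the notification was created, gsutil notification list can be used. The following is a minimal sketch; the run helper, the DRY_RUN variable, and the list_notifications function name are assumptions introduced here for illustration, and the bucket name is a placeholder. By default the script prints the command; set DRY_RUN=0 to execute it.

```shell
#!/bin/sh
# Placeholder (assumption): replace with your bucket name.
BUCKET="${BUCKET:-YOUR-BUCKET-NAME}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List the notification configurations attached to the bucket.
list_notifications() {
  run gsutil notification list "gs://$BUCKET"
}

list_notifications
```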