How to Delete a Dataset Using Data API in Python

Introduction

This document describes how to delete a MongoDB dataset using the Corva Data API and how to handle datasets that are not empty.

Warning: Before deleting anything, double-check that you have chosen the right dataset.

How to delete an empty dataset

To delete an empty dataset, use the Delete dataset request of the Corva Data API.

If the dataset is not empty, all of its documents have to be deleted first.

“Dataset is not empty and cannot be deleted” error
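The check can be sketched as a small helper that sends the request and reports whether it succeeded. This is only an illustration: delete_dataset and its return value are made-up names, and the URL is passed in by the caller because the exact path of the Delete dataset request is not given here and should be taken from the Data API reference.

```python
def delete_dataset(session, dataset_url, headers):
    """Send a DELETE request for a dataset and report whether it succeeded.

    `session` is anything with a requests-style `delete` method, for example
    the `requests` module itself or a `requests.Session()`.
    `dataset_url` is a placeholder supplied by the caller; the real path of
    the Delete dataset request is not assumed here.
    """
    response = session.delete(dataset_url, headers=headers)
    if 200 <= response.status_code < 300:
        return True, ''
    # On failure the response body carries the reason, for example that the
    # dataset is not empty and cannot be deleted.
    return False, response.text
```

If the second value reports that the dataset is not empty, delete its documents first and then retry the request.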

How to delete multiple documents

How to delete documents for an asset

To delete all documents for a specific asset, use the Delete multiple Documents request of the Corva Data API. The query for this request must include asset_id. It is not possible to delete multiple documents by querying on, for example, company_id alone.

“Asset_is missing from the query” error
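A minimal sketch of building such a request for a single asset. The helper names build_delete_params and delete_asset_documents are illustrative, not part of any Corva library, and session stands for anything with a requests-style delete method, such as the requests module.

```python
import json


def build_delete_params(asset_id):
    """Return query-string parameters that select one asset's documents."""
    if asset_id is None:
        # The request is rejected without asset_id, so fail before sending it.
        raise ValueError('asset_id is required to delete multiple documents')
    # json.dumps produces valid JSON; int() keeps the id numeric in the query.
    return {'query': json.dumps({'asset_id': int(asset_id)})}


def delete_asset_documents(session, url, asset_id, headers):
    """Delete every document of one asset and return the raw response."""
    return session.delete(url, headers=headers,
                          params=build_delete_params(asset_id))
```

It could be called, for example, as delete_asset_documents(requests, 'https://data.corva.ai/api/v1/data/sample/test/', 123, auth), where auth holds your request headers and 123 is a sample asset_id.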

How to delete documents for multiple assets

There are two possible scenarios for deleting documents:

  • the dataset contains only documents for currently existing assets

  • the dataset contains documents whose asset_id belongs to an asset that has since been deleted
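Which scenario applies can be decided by comparing the asset_id values found in the dataset with the ids of the assets that currently exist. The sketch below does that with plain set operations; the function name and the sample ids are invented for illustration.

```python
def split_asset_ids(dataset_asset_ids, existing_asset_ids):
    """Split the asset_ids found in a dataset into existing and deleted ones."""
    in_dataset = set(dataset_asset_ids)
    existing = set(existing_asset_ids)
    # Ids present in both sets still have an asset; the rest are orphaned.
    return sorted(in_dataset & existing), sorted(in_dataset - existing)


still_existing, deleted = split_asset_ids([101, 102, 102, 250], [101, 102, 103])
print(still_existing)  # [101, 102]
print(deleted)         # [250]
```

An empty second list corresponds to the first scenario; any id in it means the second scenario applies.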

Scenario 1. If the dataset contains documents only for currently existing assets, loop over all assets of the company and delete the documents for each one.

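import json

import requests

# `auth` is assumed to be a dict of request headers carrying your Corva API credentials.

# Step 1. Get a list of assets
url = 'https://api.corva.ai/v2/assets'
company_id = 196  # your company_id
params = {
    'company_id': company_id,
    'types': 'well',
}
response = requests.get(url, headers=auth, params=params)
response.raise_for_status()  # stop here if the asset list could not be fetched
# int() keeps the ids numeric even if the API returns them as strings
assets = [int(asset['id']) for asset in response.json()['data']]

# Step 2. Delete documents for each asset
url = 'https://data.corva.ai/api/v1/data/sample/test/'
for asset in assets:
    # json.dumps builds a valid JSON query instead of concatenating strings
    params = {'query': json.dumps({'asset_id': asset})}
    response = requests.delete(url, headers=auth, params=params)
    print(response.content)  # print how many documents were deleted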

Step #1 scenario

Scenario 2. If the dataset contains documents whose asset_id belongs to a deleted asset, you first need to find all of those asset_id values in the dataset itself. The code below queries by company_id, which requires an index on company_id; see How to Create Custom Indexes for Custom Datasets.

import json

import requests

# `auth` is assumed to be a dict of request headers carrying your Corva API credentials.

# Step 1. Collect the asset_id values found in the dataset
url = 'https://data.corva.ai/api/v1/data/sample/test/'
params = {
    'sort': '{"timestamp": 1}',
    'limit': 10000,  # at most 10000 documents are returned per request
    'query': '{"company_id": 196}',  # your company_id
}
response = requests.get(url, headers=auth, params=params)
response.raise_for_status()  # without a successful response there is no asset list
assets = {record['asset_id'] for record in response.json()}

# Step 2. Delete documents for each asset
for asset in assets:
    # json.dumps builds a valid JSON query instead of concatenating strings
    params = {'query': json.dumps({'asset_id': asset})}
    response = requests.delete(url, headers=auth, params=params)
    print(response.content)  # print how many documents were deleted

Step #2 scenario
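Because Step 1 reads at most limit documents, a large dataset may still hold documents for other asset_id values after one pass. One way to handle this, sketched below under that assumption, is to repeat both steps until Step 1 finds nothing. delete_until_empty and its two callback arguments are hypothetical names; the callbacks are expected to wrap the Step 1 and Step 2 requests.

```python
def delete_until_empty(fetch_asset_ids, delete_for_asset, max_rounds=100):
    """Repeat the fetch and delete steps until no asset_id is left.

    `fetch_asset_ids()` returns the asset_ids found in the next batch of
    documents; `delete_for_asset(asset_id)` deletes that asset's documents.
    Returns the number of rounds that deleted something.
    """
    for completed in range(max_rounds):
        asset_ids = fetch_asset_ids()
        if not asset_ids:
            return completed  # nothing left for this company
        for asset_id in asset_ids:
            delete_for_asset(asset_id)
    # Guard against looping forever if deletions keep failing.
    raise RuntimeError('documents still remain after %d rounds' % max_rounds)
```

The max_rounds guard is a design choice: if a delete request keeps failing, the loop stops with an error instead of running indefinitely.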
