mariam_ayman
Employee

Introduction

As data volumes grow, whether from the proliferation of sensors and IoT devices or from long-term use of Incorta, it becomes essential to manage how much data is stored on disk. Unchecked growth can exhaust storage and raise costs. To address these challenges, Incorta introduces a new feature: Data Retention.

Applies to

Versions 2024.7.0 and later.

Key Features

Time-Window Configurations

  • Flexible Time Frames: Define retention policies using time windows such as Days, Weeks, Months, Quarters, or Years.
  • Date Column: Specify the date/timestamp column to be used for evaluating the retention policy.

Example:

  • Keep data from the last 6 months.
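A minimal sketch of how a time-window policy like this could be evaluated, written in plain Python for illustration only. The function names, the row format, and the rolling (not calendar-aligned) window are assumptions; the article does not describe how Incorta computes the cutoff internally.

```python
import calendar
from datetime import date, timedelta


def subtract_months(d, months):
    """Step back a number of months, clamping the day to the month's length."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def retention_cutoff(today, amount, unit):
    """Return the oldest date that is still inside the retention window."""
    if unit == "days":
        return today - timedelta(days=amount)
    if unit == "weeks":
        return today - timedelta(weeks=amount)
    months_per_unit = {"months": 1, "quarters": 3, "years": 12}
    if unit in months_per_unit:
        return subtract_months(today, amount * months_per_unit[unit])
    raise ValueError(f"unsupported unit: {unit}")


def apply_time_window(rows, date_column, today, amount, unit):
    """Keep rows whose date column falls on or after the cutoff."""
    cutoff = retention_cutoff(today, amount, unit)
    return [row for row in rows if row[date_column] >= cutoff]


# Hypothetical sample rows; "reading_date" stands in for the chosen date column.
readings = [
    {"sensor": "a", "reading_date": date(2024, 6, 1)},
    {"sensor": "b", "reading_date": date(2024, 12, 15)},
]
kept = apply_time_window(readings, "reading_date", date(2025, 2, 11), 6, "months")
```

With "today" set to 11 February 2025, the six-month cutoff is 11 August 2024, so only the December reading is retained.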

Custom Configuration

  • User-Defined Conditions: Create custom conditions, using the formula builder, that must be met for data to be retained.

Example:

  • Keep records that meet specific business logic requirements.
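Conceptually, a custom condition is a predicate evaluated per record: rows for which it is true are retained. The sketch below expresses that idea in plain Python. It is not formula builder syntax, and the column names and the business rule are hypothetical.

```python
def apply_custom_retention(rows, condition):
    """Return only the rows for which the retention condition holds."""
    return [row for row in rows if condition(row)]


def keep_order(row):
    # Hypothetical business rule: keep open orders and any order of 1000 or more.
    return row["status"] == "open" or row["amount"] >= 1000


orders = [
    {"id": 1, "status": "closed", "amount": 250},
    {"id": 2, "status": "open", "amount": 40},
    {"id": 3, "status": "closed", "amount": 5000},
]
retained = apply_custom_retention(orders, keep_order)
```

Here orders 2 and 3 satisfy the condition and are kept; order 1 would be purged.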

Data Purge Job (Newly Introduced Job Type)

Spark handles the deletion: it removes the purged rows from the compacted Parquet files and writes a new version to disk. Using the Cluster Management Console (CMC), you can customize the Spark configuration for the purge job, including application cores, application memory, executor memory, and driver memory, to tune performance for your data purge requirements.
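The filter-and-rewrite pattern described here can be illustrated with a small stand-in. This sketch does not use Spark or Parquet: it assumes a hypothetical layout of `v1`, `v2`, ... directories each holding a `data.jsonl` file, and it only shows the idea that a purge writes a new version and leaves the previous one untouched.

```python
import json
from pathlib import Path


def purge_rewrite(table_dir, keep):
    """Write a new version containing only the rows that `keep` accepts.

    Returns (new_version_number, rows_kept, rows_removed).
    The vN/data.jsonl layout is an assumption made for this illustration.
    """
    table_dir = Path(table_dir)
    versions = sorted(
        int(p.name[1:]) for p in table_dir.glob("v*") if p.name[1:].isdigit()
    )
    if not versions:
        raise FileNotFoundError(f"no versions found under {table_dir}")

    current = versions[-1]
    source = table_dir / f"v{current}" / "data.jsonl"
    target_dir = table_dir / f"v{current + 1}"
    target_dir.mkdir()

    kept = removed = 0
    with source.open() as fin, (target_dir / "data.jsonl").open("w") as fout:
        for line in fin:
            row = json.loads(line)
            if keep(row):
                fout.write(json.dumps(row) + "\n")
                kept += 1
            else:
                removed += 1
    return current + 1, kept, removed
```

Calling `purge_rewrite(path, lambda row: row["year"] >= 2024)` would create the next version directory with only the matching rows. Because the rewrite reads and writes the full file, the memory and core settings mentioned above matter most for large tables.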

How to Use This Feature

  1. Navigate to the table settings in Incorta.
  2. Choose between Time-Window Configurations or Custom Configuration.
  3. Define your retention policy using the provided options.
  4. Schedule the Data Purge job to run during off-peak hours or on weekends.
  5. Incorta handles the deletion process and rewrites the Parquet files to reflect the changes.
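For step 4, what counts as off-peak depends on your environment. As a rough sketch, the check below treats weekends and the hours between 22:00 and 06:00 as off-peak. Both the hours and the helper name are assumptions for illustration; the actual schedule is configured in Incorta, not in code like this.

```python
from datetime import datetime

# Assumed off-peak window: 22:00 up to (but not including) 06:00.
OFF_PEAK_START_HOUR = 22
OFF_PEAK_END_HOUR = 6


def is_off_peak(moment):
    """Return True on weekends or during the assumed overnight window."""
    if moment.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return moment.hour >= OFF_PEAK_START_HOUR or moment.hour < OFF_PEAK_END_HOUR
```

A check like this can help sanity-test a proposed schedule before entering it, for example confirming that a Tuesday noon slot is rejected while a Saturday slot is accepted.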

Conclusion

By setting data retention policies at the table level, Incorta users can significantly reduce disk usage and lower storage costs. This feature provides the flexibility needed to handle growing data volumes efficiently.

Last update:
‎02-11-2025 07:01 AM