Description
In Mad-Hatter, we started periodically flushing (checkpointing) datasets from memory to disk even when their memory budget is not full. This ensures that datasets with only a few mutations still persist the latest DCP state. The current default is to flush datasets every 10 minutes. We need to investigate the impact of this periodic flush on ingestion.
The flush interval can be configured through the Analytics service parameter txnDatasetCheckpointInterval, whose value is given in seconds. We need to run the ingestion performance experiments with txnDatasetCheckpointInterval set to a large value (e.g. 86400, which is 24 hours) so that the periodic flush is not triggered during the experiments.
To do that, after the Analytics service is up, we need to do the following:
- Use the Analytics service configuration API to change the value of txnDatasetCheckpointInterval to 86400.
- Restart the Analytics service using the Analytics service restart API for the service configuration change to take effect.
- Run the ingestion experiments.
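The first two steps above could be scripted roughly as follows. This is a minimal sketch, not a verified procedure: the endpoint paths (/analytics/config/service and /analytics/cluster/restart), the port 8095, the form-encoded request body, and the use of basic authentication are assumptions that should be checked against the Analytics REST API documentation for the build under test. The helper names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request

# Assumed Analytics endpoint paths; verify against the REST API docs.
CONFIG_PATH = "/analytics/config/service"
RESTART_PATH = "/analytics/cluster/restart"


def _with_auth(req, user, password):
    """Attach HTTP basic authentication to a request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req


def build_config_request(base_url, user, password, interval_seconds):
    """Build the request that sets txnDatasetCheckpointInterval (in seconds)."""
    body = urllib.parse.urlencode(
        {"txnDatasetCheckpointInterval": interval_seconds}
    ).encode()
    req = urllib.request.Request(base_url + CONFIG_PATH, data=body, method="PUT")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return _with_auth(req, user, password)


def build_restart_request(base_url, user, password):
    """Build the request that restarts the Analytics service."""
    req = urllib.request.Request(base_url + RESTART_PATH, data=b"", method="POST")
    return _with_auth(req, user, password)


def disable_periodic_flush(base_url, user, password, interval_seconds=86400):
    """Send the config change, then the restart, in that order."""
    for req in (
        build_config_request(base_url, user, password, interval_seconds),
        build_restart_request(base_url, user, password),
    ):
        with urllib.request.urlopen(req) as resp:
            print(req.get_method(), req.full_url, resp.status)
```

Calling disable_periodic_flush("http://localhost:8095", "Administrator", "password") (placeholder host and credentials) before starting the ingestion run would apply the change. The restart is sent only after the configuration request succeeds, since urlopen raises on an HTTP error status.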
Issue Links
- relates to MB-34185 Performance regression of ingestion when using HDDs (Closed)