Details
-
Bug
-
Resolution: Fixed
-
Major
-
7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.1.3
-
Untriaged
-
0
-
Unknown
Description
Problem
Index build progress is based on seqno. On clusters that have been around for a while, and on data sets with a high mutation rate, the high seqno can be much larger than the total number of items and tombstones in the buckets. Generally speaking, lower seqnos are also more likely to be dead.
As a result, the index build percentage jumps from 0 straight to 80 or 90% when in reality only about 5% of the mutations (the actual work) have been processed. The other problem this causes is that the indexer reports that it has to process 9 billion items when a bucket might hold only 200 million.
Both of these issues come up regularly with users:
Why does indexing always get stuck in the 90 percent range?
Why does it have to process 3 billion items when there are only 200 million documents?
I have stopped using the progress bar, as it is only correct for the most basic of use cases.
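To make the skew concrete, here is a minimal sketch in Python. It is a simplified model, not the indexer's actual formula: it assumes progress is reported as the last seqno seen divided by the high seqno, and it uses a hypothetical scaled-down bucket whose 200 live items all sit at seqnos 8,000 and above out of a high seqno of 9,000. All names and numbers are illustrative.

```python
def seqno_progress(last_seqno, high_seqno):
    """Progress modelled as the seqno reached relative to the high seqno (assumed formula)."""
    return 100.0 * last_seqno / high_seqno


def item_progress(processed, total_items):
    """Progress modelled as items processed relative to the live item count."""
    return 100.0 * processed / total_items


# Hypothetical bucket: the high seqno is 9,000, but only 200 items are still
# live, and all of them have high seqnos because older mutations are dead.
HIGH_SEQNO = 9_000
live_seqnos = [8_000 + 5 * i for i in range(200)]


def compare(processed):
    """Return (seqno-based %, item-based %) after `processed` live items."""
    last_seqno = live_seqnos[processed - 1]
    return (
        seqno_progress(last_seqno, HIGH_SEQNO),
        item_progress(processed, len(live_seqnos)),
    )


if __name__ == "__main__":
    for n in (1, 10, 100, 200):
        by_seqno, by_items = compare(n)
        print(f"{n:>3} items processed: seqno-based {by_seqno:5.1f}%, item-based {by_items:5.1f}%")
```

In this model the seqno-based figure is already above 88% after the very first item, while the item-based figure is at 0.5%, which mirrors the jump described above.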
Notes
This problem is not unique to indexing; all DCP consumers have it. The protocol does not know how many items/mutations are going to be sent. cbbackupmgr had this problem, and in the end we moved away from using the seqno for progress. Unfortunately, I don't think that is an option here.
Expectation
The progress bar should be more accurate, especially on initial builds and rollbacks to zero.
Suggestion
For initial builds, use the item count up to 99%, and for the final percentage wait until the stream is closed (or whatever signal the indexer uses today to decide that the build is done).
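The suggestion could look roughly like the following Python sketch. It is not the indexer's code: `build_progress`, `items_indexed`, `total_items` and `stream_done` are hypothetical names, and it assumes the indexer can obtain an approximate item count for the bucket and has some existing signal that the build has finished.

```python
def build_progress(items_indexed, total_items, stream_done):
    """Item-count-based progress for an initial index build (illustrative sketch).

    items_indexed: items processed so far (hypothetical counter)
    total_items:   approximate item count for the bucket at build start (assumed available)
    stream_done:   True once the existing "build is done" condition is met
    """
    if stream_done:
        # Only the completion signal is allowed to report 100%.
        return 100.0
    if total_items <= 0:
        # Nothing to measure against yet; avoid dividing by zero.
        return 0.0
    pct = 100.0 * items_indexed / total_items
    # The bucket keeps changing during the build, so the count is only an
    # estimate; cap at 99% until the build is actually finished.
    return min(pct, 99.0)
```

For example, `build_progress(10_000_000, 200_000_000, False)` reports 5%, and the value stays pinned at 99% if more items arrive than the initial count predicted, until the completion signal flips it to 100%.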