Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 4.5.0
Affects Version/s: 4.5.0
Component/s: tools
Labels: None
Triage: Untriaged
Is this a Regression?: Unknown

Description

The "tracks3" data set, from Couchbase training, has a schema that wreaks havoc with the schema inferencer. Each track has a subdocument called "reviews", and that subdocument has a different field for each review, where the field name is the id of the person who did the review. Thus there is a huge number of field names, resulting in a schema description hundreds of thousands of lines long. The inferencer needs to do something smarter in cases like this, perhaps having a parameterized maximum number of fields.
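The "parameterized maximum number of fields" idea can be sketched as follows. This is only an illustration under stated assumptions, not the inferencer's actual code: the function name `infer_schema`, the `max_fields` parameter, its default of 50, and the output layout are all hypothetical. When an object has more distinct field names than `max_fields`, the sketch stops listing fields individually and instead reports a single "dictionary" entry that summarizes the values across all keys.

```python
from typing import Any


def infer_schema(values: list[Any], max_fields: int = 50) -> dict:
    """Summarize sample JSON values into a compact schema description.

    Hypothetical sketch: `max_fields` is an assumed tuning parameter, not a
    documented option of any real tool.
    """
    objects = [v for v in values if isinstance(v, dict)]
    scalar_types = sorted({type(v).__name__ for v in values if not isinstance(v, dict)})

    schema: dict = {}
    if scalar_types:
        schema["types"] = scalar_types
    if not objects:
        return schema

    # Collect the child values seen under each field name.
    by_field: dict[str, list[Any]] = {}
    for obj in objects:
        for key, child in obj.items():
            by_field.setdefault(key, []).append(child)

    if len(by_field) > max_fields:
        # "Dictionary" pattern: the keys look like data (e.g. reviewer ids),
        # so describe the values once instead of once per key.
        all_children = [c for children in by_field.values() for c in children]
        schema["dictionary"] = {
            "distinct_keys": len(by_field),
            "values": infer_schema(all_children, max_fields),
        }
    else:
        schema["fields"] = {
            key: infer_schema(children, max_fields)
            for key, children in sorted(by_field.items())
        }
    return schema


# Made-up sample shaped like the description: each track has a "reviews"
# subdocument keyed by reviewer id.
tracks = [
    {"title": f"t{i}", "reviews": {f"user{i}_{j}": {"rating": j} for j in range(10)}}
    for i in range(20)
]
schema = infer_schema(tracks, max_fields=50)
```

With this sample, the 200 distinct reviewer ids under "reviews" collapse into one "dictionary" entry, while "title" is still listed as an ordinary field. Raising `max_fields` above 200 would restore the per-key listing.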

People

Assignee: Eben Haber

Reporter: Eben Haber

Votes: 0

Watchers: 1

Dates

Created: 04/Mar/16 1:46 PM

Updated: 31/May/16 10:16 AM

Resolved: 15/Apr/16 7:51 PM

Gerrit Reviews

There are no open Gerrit changes.

There is 1 closed Gerrit change:

MB-18536 - Handle the "dictionary" pattern in schema inferencing.: Gerrit Review:

Certain buckets can cause very large schema inferencing results
