  1. Couchbase Server
  2. MB-53286

[CX] Parquet fails with bucket names containing "dot" character between "numbers"


Details

    • Untriaged
    • 1
    • Unknown
    • Analytics Sprint 10, Analytics Sprint 11, Analytics Sprint 12

    Description

      Buckets whose names contain a "dot" character (.) between numbers, such as my-bucket-1.1, fail when reading Parquet data (currently with an internal error). This is due to how Hadoop internally converts the URI to path-style access, which leads Hadoop to not pick up the bucket name properly.
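      The failure mode can be sketched with plain java.net.URI. This is only an illustration, under the assumption that the bucket name is read from the host component of the URI (the description above does not spell out the exact code path). The class name BucketUriDemo and the s3a:// locations are made up for the example. When the last dot-separated label of the authority starts with a digit, java.net.URI does not treat the authority as a server-based hostname, so getHost() returns null while getAuthority() still returns the raw text.

```java
import java.net.URI;

final class BucketUriDemo {
    /** Host as parsed by java.net.URI; null when the authority is not a valid hostname. */
    static String hostOf(String location) {
        return URI.create(location).getHost();
    }

    /** Raw authority text, available even when no host could be parsed. */
    static String authorityOf(String location) {
        return URI.create(location).getAuthority();
    }

    public static void main(String[] args) {
        // Hypothetical locations: the first parses normally, the second has
        // a rightmost label ("1") that starts with a digit.
        String[] locations = {
            "s3a://my-bucket-1/data.parquet",
            "s3a://my-bucket-1.1/data.parquet"
        };
        for (String location : locations) {
            System.out.println(location
                    + " -> host=" + hostOf(location)
                    + ", authority=" + authorityOf(location));
        }
    }
}
```

      Code that relies on getHost() to obtain the bucket would see no bucket at all for the second location, which is consistent with the symptom described here.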

      This is a known issue in Hadoop and AWS, and the recommendation is to avoid bucket names containing a "dot" (.).

      This issue is to ensure that we don't return an internal error when encountering such scenarios.
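      One way to avoid an internal error is to validate the bucket name before handing it to Hadoop and to fail with a descriptive message. The sketch below is not the actual fix. BucketNameValidator and its message text are hypothetical, and it assumes that a bucket is usable only if java.net.URI can parse it back as the host of an s3a:// URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class BucketNameValidator {
    /**
     * Throws IllegalArgumentException with a user-facing message when the
     * bucket name cannot be parsed as a URI host (hypothetical check).
     */
    static void validate(String bucket) {
        String host;
        try {
            host = new URI("s3a://" + bucket + "/").getHost();
        } catch (URISyntaxException e) {
            host = null; // the name is not even a legal URI authority
        }
        if (host == null || !host.equals(bucket)) {
            throw new IllegalArgumentException(
                    "Unsupported bucket name '" + bucket
                    + "': it cannot be parsed as a host name. Avoid names "
                    + "with a '.' followed by a label that starts with a digit.");
        }
    }

    public static void main(String[] args) {
        validate("my-bucket-1"); // accepted
        try {
            validate("my-bucket-1.1"); // rejected with a clear message
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      Running the check up front turns the failure into a normal, explainable error instead of one raised deep inside the Parquet read path.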

      References:

      https://issues.apache.org/jira/browse/HADOOP-17241

      https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/

       


        Activity

          Till Westmann added a comment - Re-targeted to 7.1.3 - we should add a release note.

          Till Westmann added a comment - Moving 7.1.4 issues to 7.2.0 as 7.1.4 is currently not planned.

          People

            Hussain.Towaileb Hussain Towaileb
            Hussain.Towaileb Hussain Towaileb