Details
- Improvement
- Resolution: Unresolved
- Major
- master
- None
- 0
Description
What is the problem?
In a customer issue, it was noticed that Azure allows two storage types/engines for blob storage:
- Standard: normal blob storage semantics (i.e. essentially a KV store)
- DataLake Gen 2: more traditional filesystem semantics
Ideally we would support both. Currently, however, the IsDir method on ObjectAttrs returns false for directories in DataLake, because the LastModified field is not nil. Here is a dump of the response we get when listing a directory blob:
{
  "Deleted": null,
  "Name": "foo/a",
  "Properties": {
    "ETag": "0x8DB87A25BE51B09",
    "LastModified": "2023-07-18T15:19:19Z",
    "AccessTier": "Hot",
    "AccessTierChangeTime": null,
    "AccessTierInferred": true,
    "ArchiveStatus": null,
    "BlobSequenceNumber": null,
    "BlobType": "BlockBlob",
    "CacheControl": "",
    "ContentDisposition": "",
    "ContentEncoding": "",
    "ContentLanguage": "",
    "ContentLength": 0,
    "ContentMD5": null,
    "ContentType": "",
    "CopyCompletionTime": null,
    "CopyID": null,
    "CopyProgress": null,
    "CopySource": null,
    "CopyStatus": null,
    "CopyStatusDescription": null,
    "CreationTime": "2023-07-18T15:19:19Z",
    "CustomerProvidedKeySHA256": null,
    "DeletedTime": null,
    "DestinationSnapshot": null,
    "EncryptionScope": null,
    "ExpiresOn": null,
    "ImmutabilityPolicyExpiresOn": null,
    "ImmutabilityPolicyMode": null,
    "IncrementalCopy": null,
    "IsSealed": null,
    "LastAccessedOn": null,
    "LeaseDuration": null,
    "LeaseState": "available",
    "LeaseStatus": "unlocked",
    "LegalHold": null,
    "RehydratePriority": null,
    "RemainingRetentionDays": null,
    "ServerEncrypted": true,
    "TagCount": null
  },
  "Snapshot": null,
  "BlobTags": null,
  "HasVersionsOnly": null,
  "IsCurrentVersion": null,
  "Metadata": null,
  "OrMetadata": null,
  "VersionID": null
}
What is the solution?
Comparing the above listing to one for a file, it looks like we might be able to use Properties.ContentMD5 or Properties.ContentType to infer that the object is a directory. Unfortunately, I cannot find any documentation that states the best way to tell whether an object is a directory, so I am not sure whether we can rely on this.
Issue Links
- relates to MB-57913 [CBM] Support Azure DataLake Gen 2 (Open)