Object folders

Let’s answer a common question: why do we see a folder structure in Cyberduck and S3 Browser? In theory, we shouldn’t…

As we know from theory, one of the key differences between object and file storage is that objects are not organised in a hierarchy of folders and are not identified by their path in such a hierarchy. Each object is associated with a key, a string assigned when the object is created. As a result, all objects live in a flat namespace: there are no folders within folders, and there is no operation for moving a folder. This organisation eliminates dependencies between objects but retains the fundamental functionality of a storage system: storing and retrieving data.
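To make the flat namespace concrete, here is a minimal sketch in Python. It is a toy model only (a dictionary keyed by the full key string), not a description of how ECS or S3 store data internally; the names `bucket`, `put_object` and `get_object` are made up for this illustration.

```python
# Toy model of a flat object namespace: one dictionary, keyed by the full key.
# Illustration only; real object stores are far more involved.
bucket = {}

def put_object(key, data):
    """Store data under a key. The key is an opaque string, slashes included."""
    bucket[key] = data

def get_object(key):
    """Retrieve data by its exact key. There is no directory lookup step."""
    return bucket[key]

put_object("Folder1/file1.txt", b"hello, world\n")
put_object("Folder1/Folder2/file2.txt", b"hello, again\n")

# Both objects sit side by side in the same flat map.
print(sorted(bucket))
# Storing "Folder1/file1.txt" did not create any "Folder1/" entry.
print("Folder1/" in bucket)
```

The slash in a key has no special meaning for the store itself; it is just another character in the string.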

OK, that is clear. But why do we still see a folder structure when copying data with a popular GUI such as Cyberduck or S3 Browser? Good question. To answer it, let’s run several experiments.

  • Copy a simple folder structure to ECS using Cyberduck.
  • Check the folder structure with s3curl:

host:s3curl pantyv$ ./s3curl.pl --id=ecsid -- -v -s http://10.76.246.143:9020/TargetBucket/ |xmllint --format -
…
<Name>TargetBucket</Name>
<Prefix/>
<Marker/>
<MaxKeys>1000</MaxKeys>
<IsTruncated>false</IsTruncated>
<ServerSideEncryptionEnabled>false</ServerSideEncryptionEnabled>
<Contents>
<Key>Folder1/</Key>
<LastModified>2016-05-23T18:53:30Z</LastModified>
<ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
 <Size>0</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
<ID>targetuser</ID>
<DisplayName>targetuser</DisplayName>
</Owner>
</Contents>
<Contents>
<Key>Folder1/Folder2/</Key>
<LastModified>2016-05-23T18:53:30Z</LastModified>
<ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>
<Size>0</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
<ID>targetuser</ID>
<DisplayName>targetuser</DisplayName>
 </Owner>
</Contents>
<Contents>
<Key>Folder1/Folder2/file2.txt</Key>
<LastModified>2016-05-23T18:53:30Z</LastModified>
<ETag>"7bec9352114f8139c2640b2554563508"</ETag>
 <Size>13</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
 <ID>targetuser</ID>
<DisplayName>targetuser</DisplayName>
</Owner>
</Contents>
<Contents>
<Key>Folder1/file1.txt</Key>
<LastModified>2016-05-23T18:53:30Z</LastModified>
<ETag>"9d98eede4ccb193e379d6dbd7cc1eb86"</ETag>
<Size>13</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
<ID>targetuser</ID>
<DisplayName>targetuser</DisplayName>
</Owner>
</Contents>
</ListBucketResult>

From this simple experiment we can draw two conclusions:

  • All four objects have the same set of attributes, so the folders are ordinary objects of zero size.
  • The keys of the “nested” objects carry the folder names, separated by slashes, as a prefix.
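The same check can be done programmatically. The sketch below parses a trimmed copy of the listing above and separates the zero-size “folder” objects from the rest. Two assumptions: the root element is written without an XML namespace for brevity (a real response normally declares one, and the tag lookups would then need it), and “zero bytes plus a trailing slash” is only a heuristic used here, not a rule of the protocol. The function name `split_placeholders` is made up for this example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Trimmed copy of the listing above; only Key and Size are kept.
# Assumption: no XML namespace on the root element, for brevity.
LISTING = """
<ListBucketResult>
  <Name>TargetBucket</Name>
  <Contents><Key>Folder1/</Key><Size>0</Size></Contents>
  <Contents><Key>Folder1/Folder2/</Key><Size>0</Size></Contents>
  <Contents><Key>Folder1/Folder2/file2.txt</Key><Size>13</Size></Contents>
  <Contents><Key>Folder1/file1.txt</Key><Size>13</Size></Contents>
</ListBucketResult>
"""

def split_placeholders(xml_text):
    """Return (placeholders, regular) key lists from a bucket listing."""
    root = ET.fromstring(xml_text.strip())
    placeholders, regular = [], []
    for item in root.findall("Contents"):
        key = item.findtext("Key")
        size = int(item.findtext("Size"))
        # Heuristic, not a protocol rule: zero bytes and a trailing slash.
        if size == 0 and key.endswith("/"):
            placeholders.append(key)
        else:
            regular.append(key)
    return placeholders, regular

placeholders, regular = split_placeholders(LISTING)
print("folder placeholders:", placeholders)
print("regular objects:", regular)

# The ETag shown for both "folders" equals the MD5 digest of empty input.
empty_md5 = hashlib.md5(b"").hexdigest()
print(empty_md5 == "d41d8cd98f00b204e9800998ecf8427e")
```

The last line ties back to the listing: both zero-size objects report the same ETag, which is what an MD5 over no data at all produces.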

The Amazon documentation confirms our observation and explains it:

http://docs.aws.amazon.com/AmazonS3/latest/UG/FolderOperations.html

“Amazon S3 has a flat structure with no hierarchy like you would see in a typical file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. Amazon S3 does this by using key name prefixes for objects.

For example, you can create a folder in the console called photos, and store an object called myphoto.jpg in it. The object is then stored with the key name photos/myphoto.jpg, where photos/ is the prefix.”
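How a browser-style view can be derived from key prefixes alone may be sketched as follows. This is a hypothetical helper (`list_level` is not part of any client or SDK), similar in spirit to listing a bucket with a prefix and a delimiter; it is not a claim about how the Amazon console or Cyberduck are implemented.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Group keys one level below prefix, the way a folder view might.

    Returns (folders, files): folders are the common prefixes up to the next
    delimiter, files are keys with no further delimiter after the prefix.
    """
    folders, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the zero-size object standing for this level itself
        head, sep, _tail = rest.partition(delimiter)
        if sep:
            folders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(folders), sorted(files)

keys = [
    "photos/myphoto.jpg",
    "Folder1/",
    "Folder1/Folder2/",
    "Folder1/Folder2/file2.txt",
    "Folder1/file1.txt",
]

top_folders, top_files = list_level(keys)
sub_folders, sub_files = list_level(keys, prefix="Folder1/")
print(top_folders, top_files)
print(sub_folders, sub_files)
```

Note that `photos/` shows up as a folder at the top level although no object with the key `photos/` exists: the “folder” is inferred purely from the prefix of `photos/myphoto.jpg`.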

 

So folders are just an illusion that makes our storage look like a classical file system. Let’s prove that.

  • Create a new object with the key Folder3/file3.txt:
host:s3curl pantyv$ ./s3curl.pl --id=ecsid  --put=README -- -v -s http://10.76.246.143:9020/TargetBucket/Folder3/file3.txt

* We are completely uploaded and fine

 

  • Cyberduck shows us both the “folder” and the file.
  • But the truthful s3curl shows that only one new object was created. There is no such thing as Folder3 there.
host:s3curl pantyv$ ./s3curl.pl --id=ecsid -- -v -s http://10.76.246.143:9020/TargetBucket/ |xmllint --format - |grep -i "key>"

<Key>Folder1/</Key>
<Key>Folder1/Folder2/</Key>
<Key>Folder1/Folder2/file2.txt</Key>
<Key>Folder1/file1.txt</Key>
<Key>Folder3/file3.txt</Key>
  • Let’s rename Folder3 from the GUI.
  • The folder rename fails. Of course: we just don’t have such a thing as Folder3.
  • Let’s delete file3.txt from the GUI.
  • The object is deleted:
host:s3curl pantyv$ ./s3curl.pl --id=ecsid -- -v -s http://10.76.246.143:9020/TargetBucket/ |xmllint --format - |grep -i "key>"

<Key>Folder1/</Key>
<Key>Folder1/Folder4/</Key>
<Key>Folder1/Folder4/file2.txt</Key>
<Key>Folder1/file1.txt</Key>
  • Hmm… but Folder3 is still shown in the GUI.
  • Of course, the folder disappears from the list after a refresh.
  • Let’s rename another folder.
  • It works just fine.
  • Cyberduck changed the prefixes of both objects affected by the rename:
host:s3curl pantyv$ ./s3curl.pl --id=ecsid -- -v -s http://10.76.246.143:9020/TargetBucket/ |xmllint --format - |grep -i "key>"

<Key>Folder1/</Key>
<Key>Folder1/Folder4/</Key>
<Key>Folder1/Folder4/file2.txt</Key>
<Key>Folder1/file1.txt</Key>
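The listing above is consistent with a rename carried out as a rewrite of keys. Since there is no operation for moving a folder, a client presumably has to copy every affected object to a new key and delete the old one; whether Cyberduck does exactly this is an assumption. A minimal sketch on a plain dictionary standing in for the bucket (`rename_prefix` is a made-up name):

```python
def rename_prefix(bucket, old_prefix, new_prefix):
    """Rename a "folder" by rewriting every key that starts with old_prefix.

    bucket is a plain dict standing in for a flat object store. Each match is
    stored under its new key and removed from the old one.
    Returns the number of objects touched.
    """
    matches = [k for k in bucket if k.startswith(old_prefix)]
    for old_key in matches:
        new_key = new_prefix + old_key[len(old_prefix):]
        bucket[new_key] = bucket.pop(old_key)
    return len(matches)

bucket = {
    "Folder1/": b"",
    "Folder1/Folder2/": b"",
    "Folder1/Folder2/file2.txt": b"data",
    "Folder1/file1.txt": b"data",
}

moved = rename_prefix(bucket, "Folder1/Folder2/", "Folder1/Folder4/")
print(moved)
for key in sorted(bucket):
    print(key)
```

Two objects are touched here: the zero-size Folder2/ object and file2.txt. The cost of such a rename therefore grows with the number of objects under the prefix.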
  • Let’s remove the file2.txt object.
  • But Folder4 is still there.
  • A refresh doesn’t help 😉
  • The reason is that when we create a structure via the GUI, Cyberduck creates the “folders” as separate objects, so we need to delete them as well:
host:s3curl pantyv$ ./s3curl.pl --id=ecsid -- -v -s http://10.76.246.143:9020/TargetBucket/ |xmllint --format - |grep -i "key>"

<Key>Folder1/</Key>
<Key>Folder1/Folder4/</Key>
<Key>Folder1/file1.txt</Key>
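The difference between deleting a file and deleting a “folder” can be sketched in the same toy model. The dictionary again stands in for the bucket, and `delete_prefix` is a hypothetical helper, not a call offered by s3curl or Cyberduck.

```python
def delete_prefix(bucket, prefix):
    """Delete every object whose key starts with prefix, placeholder included."""
    doomed = [k for k in bucket if k.startswith(prefix)]
    for key in doomed:
        del bucket[key]
    return doomed

bucket = {
    "Folder1/": b"",
    "Folder1/Folder4/": b"",
    "Folder1/Folder4/file2.txt": b"data",
    "Folder1/file1.txt": b"data",
}

# Deleting only the file leaves the zero-size "folder" object behind.
del bucket["Folder1/Folder4/file2.txt"]
still_there = "Folder1/Folder4/" in bucket
print(still_there)

# Removing the "folder" means deleting everything under the prefix.
removed = delete_prefix(bucket, "Folder1/Folder4/")
print(removed)
print(sorted(bucket))
```

This mirrors the experiment: after file2.txt is gone, the Folder1/Folder4/ key is still a real object, so it keeps appearing until it is deleted explicitly.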

From the perspective of folder structure, all ECS object protocols are very similar… with one exception. In the Atmos protocol, directories are real objects. You can create an empty directory object, and directories can be created automatically along with objects. ACLs (read/write) can be set on a directory and apply to the objects within that directory. Of course, in S3 or Swift we can assign ACLs at the level of the namespace and the bucket, but not on “folders”.
