S3 vs Swift

One of our customers asked us to compare the Amazon S3 and OpenStack Swift protocols provided by ECS.

Below is my compilation from different sources. Don’t expect a really comprehensive analysis, so criticism is not accepted 😉

Object storage

Object storage is a way to organise data by addressing and manipulating discrete units of data called objects. Each object, like a file, is a stream of binary data. However, unlike files, objects are not organised in a hierarchy of folders and are not identified by their path in the hierarchy.

When an object is created, it is associated with a key (a string), and you can retrieve the object by querying the object storage with that key. As a result, all objects are organised in a flat name space (one object cannot be placed inside another object). Such organisation eliminates the dependency between objects but retains the fundamental functionality of a storage system: storing and retrieving data. The main benefit of such organisation is a very high level of scalability.

Both files and objects have metadata associated with the data they contain, but objects are characterised by their extended metadata. Each object is assigned a unique identifier which allows a server or end user to retrieve the object without needing to know the physical location of the data. This approach is useful for automating and streamlining data storage in cloud computing environments.

S3 and Swift are the most commonly used cloud object protocols.

S3 protocol Amazon S3 (Simple Storage Service) is an online file storage web service offered by Amazon Web Services.

 Swift protocol OpenStack is a free and open-source software platform for cloud computing. Swift is its object storage project.

Popularity

 S3 protocol The S3 protocol is the most commonly used object storage protocol. So, if you’re using third-party applications that use object storage, this is likely to be the most compatible protocol.

Swift protocol  Swift is a little less popular than S3, but it is still a very popular cloud object protocol.

 

Managed by

S3 protocol  The S3 protocol is developed by Amazon. Its API (application programming interface) is available free of charge to all third-party developers.

Swift protocol  Swift protocol is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012 to promote OpenStack software and its community. More than 500 companies have joined the project.

 

Launched

S3 protocol  Amazon launched S3, its first publicly available web service, in the United States in March 2006.

 Swift protocol OpenStack began in 2010 as a joint project of Rackspace Hosting and NASA. The platform started with just two projects. One of them was the Swift Storage Project.

 

Unique functional features

S3 protocol  Some unique features of the S3 protocol:

  • Bucket-level controls for versioning and expiration that apply to all objects in the bucket
  • Copy Object.  This allows you to do server-side copies of objects
  • Anonymous Access.  The ability to set PUBLIC access on an object and serve it via HTTP/HTTPS without authentication.

 Swift protocol The unique features of the Swift API are:

  • Unsized object create.  Swift is the only protocol where you can use “Chunked” encoding to upload an object whose size is not known beforehand.  S3 requires multiple requests to achieve this.
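
To make the “Chunked” encoding point more concrete, here is a minimal sketch of HTTP/1.1 chunked transfer framing in Python. The function name `chunked_body` is made up for this example, and the code only shows the framing on the wire; it does not talk to a Swift endpoint.

```python
from typing import Iterable, Iterator


def chunked_body(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Frame data pieces using HTTP/1.1 chunked transfer encoding."""
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would end the body early
        # Each chunk is: size in hex, CRLF, the data, CRLF.
        yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # A zero-sized chunk marks the end, so the total size is never declared.
    yield b"0\r\n\r\n"


wire = b"".join(chunked_body([b"hello ", b"object storage"]))
```

The sender never states the total length up front, which is why a single request is enough even when the data comes from a stream of unknown size.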

 

Large objects

S3 protocol S3 Multipart Upload allows you to upload a single object as a set of parts. After all of these parts are uploaded, the data will be presented as a single object.

 Swift protocol An OpenStack Swift large object consists of two types of objects: segment objects that store the object content, and a manifest object that links the segment objects into one logical large object. When you download a manifest object, the contents of the segment objects are concatenated and returned in the response body of the request.
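
The segment-and-manifest idea can be sketched with a toy in-memory model. This is only an illustration: the dictionaries, the `upload_large` and `download` helpers, and the segment naming scheme are all hypothetical and are not the Swift API.

```python
store: dict[str, bytes] = {}            # toy flat object store: key -> content
manifests: dict[str, list[str]] = {}    # manifest key -> ordered segment keys


def upload_large(name: str, data: bytes, segment_size: int) -> None:
    """Split data into segment objects and record them in a manifest."""
    keys = []
    for offset in range(0, len(data), segment_size):
        # Hypothetical naming scheme for the segment objects.
        key = f"{name}/segment-{offset // segment_size:05d}"
        store[key] = data[offset:offset + segment_size]
        keys.append(key)
    manifests[name] = keys


def download(name: str) -> bytes:
    """Return an object; for a manifest, concatenate its segments in order."""
    if name in manifests:
        return b"".join(store[key] for key in manifests[name])
    return store[name]


upload_large("backup.tar", b"0123456789", segment_size=4)
```

Here `download("backup.tar")` reassembles the three stored segments, while each segment can still be fetched on its own by its key.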

 

Authentication

Authentication is the process of proving your identity to the system. Requests are allowed or denied in part based on the identity of the requester.

S3 protocol  Amazon S3 uses an authorization header that must be present in all requests to identify the user (Access Key Id) and provide a signature for the request. An Amazon access key ID has 20 characters, but ECS does not have this limitation.

The signature is calculated from elements of the request and is based on the HMAC-SHA1 algorithm defined by RFC 2104 (Keyed-Hashing for Message Authentication). The output of HMAC-SHA1 is a byte string, called the digest. The Signature request parameter is constructed by Base64 encoding this digest. In the ECS data service, the namespace is also included in the HMAC signature calculation. The final signature is dynamic and depends on the current date and time.
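
As a rough illustration of the calculation, the sketch below follows the publicly documented layout of the AWS Signature Version 2 string to sign. The function name `sign_v2`, the credentials, and the request values are made up for the example, and how ECS folds the namespace into the calculation is not shown because the details are not covered here.

```python
import base64
import hashlib
import hmac


def sign_v2(secret_key: str, verb: str, date: str, resource: str,
            content_md5: str = "", content_type: str = "",
            amz_headers: str = "") -> str:
    """Return the Base64-encoded HMAC-SHA1 signature of a request."""
    # Elements of the request, joined in the Signature Version 2 order.
    string_to_sign = ("\n".join([verb, content_md5, content_type, date])
                      + "\n" + amz_headers + resource)
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical credentials and request, for illustration only.
access_key = "EXAMPLEACCESSKEY0001"
signature = sign_v2("example-secret", "GET",
                    "Tue, 27 Mar 2007 19:36:42 +0000", "/bucket/photo.jpg")
authorization = f"AWS {access_key}:{signature}"
```

Because the date is part of the signed string, the same request sent at a different time produces a different signature.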

In the ECS object data service, the user can be configured with 2 secret keys (passwords). The ECS data service will try to use the first secret key, and if the calculated signature does not match, it will try to use the second secret key. If the second key fails, it will reject the request. Secret keys are always generated by ECS and can’t be specified by a user. 

Both HTTP and HTTPS protocols are supported.

Swift protocol Authentication in Swift is quite flexible. It is done through a separate mechanism that creates a “token”, which can be passed around to authenticate requests. OpenStack Swift in ECS supports two versions of authentication.

V1 authentication.

The user name and password are specified by administrators, with no limits or strict requirements on password complexity. They are sent as plain text in the headers of the authentication request. The auth header is static (always the same until the user name or password changes).

If the user and password are validated by ECS, the storage URL and token are returned in the response header. Further requests are authenticated by including this token. The storage URL provides the host name and resource address. The generated token expires 24 hours after creation. If you repeat the authentication request within the 24-hour period using the same UID and password, OpenStack will return the same token. Once the 24-hour period expires, OpenStack will return a new token.
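
The token lifecycle can be modelled with a small in-memory sketch. Everything here is illustrative: the `TokenIssuer` class and the `tok_` prefix are invented, and the header names `X-Auth-User` and `X-Auth-Key` are the ones commonly documented for Swift V1 authentication rather than something confirmed above.

```python
import secrets
from typing import Optional

TOKEN_TTL = 24 * 60 * 60  # token lifetime in seconds (24 hours)


def auth_request_headers(user: str, password: str) -> dict[str, str]:
    """Static, plain-text headers of a V1 authentication request."""
    return {"X-Auth-User": user, "X-Auth-Key": password}


class TokenIssuer:
    """Toy model: same token within 24 hours, a new one afterwards."""

    def __init__(self, users: dict[str, str]) -> None:
        self._users = users
        self._tokens: dict[str, tuple[str, float]] = {}  # user -> (token, issued_at)

    def authenticate(self, user: str, password: str, now: float) -> Optional[str]:
        if self._users.get(user) != password:
            return None  # unknown user or wrong password
        token, issued_at = self._tokens.get(user, ("", float("-inf")))
        if now - issued_at >= TOKEN_TTL:
            token = "tok_" + secrets.token_hex(16)  # hypothetical token format
            self._tokens[user] = (token, now)
        return token


issuer = TokenIssuer({"alice": "s3cret"})
first = issuer.authenticate("alice", "s3cret", now=0)
```

Since the headers never change between requests, anyone who captures them over plain HTTP can replay them, which is a good reason to prefer HTTPS with V1 authentication.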

V2 (Keystone) authentication.

V2 authentication is performed in two steps. First, an unscoped token is used to query tenant information. Then the unscoped token, along with the tenant information, is used to request a scoped token. The scoped token and a service endpoint can be used to authenticate with ECS in the same way as in the V1 authentication described above.
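
The two steps map onto two request bodies. The sketch below builds them in the shape used by the Keystone v2.0 `POST /v2.0/tokens` call as I understand it; the helper names, user, tenant, and token values are hypothetical, and no request is actually sent.

```python
import json


def unscoped_token_request(username: str, password: str) -> str:
    """Step 1: ask for an unscoped token with user credentials only."""
    return json.dumps({
        "auth": {
            "passwordCredentials": {"username": username, "password": password}
        }
    })


def scoped_token_request(unscoped_token: str, tenant_name: str) -> str:
    """Step 2: exchange the unscoped token plus a tenant for a scoped token."""
    return json.dumps({
        "auth": {
            "tenantName": tenant_name,
            "token": {"id": unscoped_token},
        }
    })


# Hypothetical values, for illustration only.
step1 = unscoped_token_request("alice", "s3cret")
step2 = scoped_token_request("unscoped-token-id", "demo-tenant")
```

Between the two steps, the unscoped token is what you present when listing the tenants the user belongs to.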

Both HTTP and HTTPS protocols are supported.

 

Naming conventions

Two basic levels separate the name space of objects: namespace and bucket. Each namespace represents an accounting and billing identity, which is similar to a storage volume. Buckets can be created in a namespace to provide a primitive solution for grouping objects. A bucket doesn’t contain other buckets.

 S3 protocol Swift protocol    The following rules apply to the naming of ECS namespaces for both S3 and Swift:

  • Cannot be null or an empty string
  • Length range is 1..255 (Unicode char)
  • Valid characters are alphanumeric characters, hyphen (-) and underscore (_).

S3 protocol The following rules apply to the naming of S3 buckets in ECS:

  • Names must be between one and 255 characters in length.
  • Names can include dot (.), hyphen (-), and underscore (_) characters and alphanumeric characters ([a-zA-Z0-9]).
  • Names can start with a hyphen (-) or alphanumeric character.
  • Names cannot start with a dot (.), contain a double dot (..), or end with a dot (.).
  • Names must not be formatted as an IPv4 address.
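
The bucket rules above can be turned into a small validator. This is a sketch of my reading of the list, not ECS code: `is_valid_bucket_name` is a made-up name, and rejecting a leading underscore is an interpretation of the "can start with" rule.

```python
import ipaddress
import re

# 1..255 characters drawn from alphanumerics, dot, hyphen and underscore.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]{1,255}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the S3 bucket naming rules listed above."""
    if not _ALLOWED.fullmatch(name):
        return False
    # Interpretation: only a hyphen or an alphanumeric may come first.
    if name[0] in "._":
        return False
    if ".." in name or name.endswith("."):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True   # not an IPv4 address, so the name is acceptable
    return False      # parsed as an IPv4 address, so it is rejected
```

For example, `my-bucket_1.logs` passes, while `.hidden`, `a..b`, and `192.168.0.1` are all rejected.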

Swift protocol  In Swift terminology, buckets are called containers.

The following rules apply to the naming of Swift containers:

  • Cannot be null or an empty string
  • Length range is 1..255 (Unicode char)
  • Valid characters are alphanumeric characters, hyphen (-) and underscore (_).

The following rules apply to the naming of both S3 and Swift objects in ECS:

  • Cannot be null or an empty string
  • Length range is 1..255 (Unicode char)
  • No validation on characters.
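
The container and object rules are simpler than the S3 bucket rules and fit in a few lines. The helper names below are hypothetical, and treating "alphanumeric" as ASCII letters and digits is an assumption on my part.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
_CONTAINER = re.compile(r"[A-Za-z0-9_-]{1,255}")


def is_valid_container_name(name: str) -> bool:
    """Swift container (and ECS namespace) rule: 1..255 allowed characters."""
    return _CONTAINER.fullmatch(name) is not None


def is_valid_object_name(name: str) -> bool:
    """Object rule: 1..255 Unicode characters, no check on which ones."""
    # len() on a Python str counts Unicode code points, not bytes.
    return 1 <= len(name) <= 255
```

Note that a dot is allowed in an S3 bucket name but not in a Swift container name, so `my.data` is a valid bucket and an invalid container.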

 

Retention

ECS provides the ability to prevent data from being modified or deleted within a specified retention period. Retention periods and retention policies can be defined in metadata associated with objects and on buckets, and are checked each time a request to modify an object is made.
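
The check made on each modification request can be pictured as a simple date comparison. This is a sketch only: the function names are invented, and taking the longer of the object and bucket periods is my assumption, since how the two are combined is not spelled out here.

```python
from datetime import datetime, timedelta, timezone


def retention_end(created_at: datetime, object_retention: timedelta,
                  bucket_retention: timedelta) -> datetime:
    """Moment after which the object may be modified or deleted."""
    # Assumption: when both are set, the longer period applies.
    return created_at + max(object_retention, bucket_retention)


def may_modify(created_at: datetime, object_retention: timedelta,
               bucket_retention: timedelta, now: datetime) -> bool:
    """True once the retention period has passed."""
    return now >= retention_end(created_at, object_retention, bucket_retention)


created = datetime(2016, 1, 1, tzinfo=timezone.utc)
allowed = may_modify(created, timedelta(days=30), timedelta(days=365),
                     now=datetime(2016, 6, 1, tzinfo=timezone.utc))
```

In this example the object's own 30-day period has passed, but the bucket's one-year period has not, so the request would be refused.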

 S3 protocol Swift protocol Retention periods are supported on all object interfaces including S3 and Swift.

 

Audit buckets

 S3 protocol Swift protocol The controller API provides the ability to audit the use of the S3 and Swift object interfaces.

 

File access

 S3 protocol Swift protocol  ECS supports multi-protocol access, so that files written using NFS can also be accessed using S3 and Swift object protocols. Similarly, objects written using S3 and Swift object protocols can be made available through NFS.

In the same way as for the bucket itself, objects and directories created using object protocols can be accessed by Unix users and Unix group members by mapping the object users and groups.

 

ECS extensions for the object storage

S3 protocol Swift protocol A number of extensions to the object APIs are supported by ECS for both S3 and Swift.

  • Object Range Update – Uses Range header to specify object range to be updated.
  • Object Range Overwrite – Uses Range header to specify object range to be overwritten.
  • Object Range Append – Uses Range header to specify object range appended.
  • Object Range Read – Uses Range header to specify object byte range to read.
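
To show what the four range operations mean for the stored bytes, here is a toy model that works on an in-memory byte string. The function names are made up, the exact Range header values ECS expects for each operation are not shown, and the overwrite behaviour (write from an offset, extending the object if needed) is my assumption.

```python
def range_read(data: bytes, first: int, last: int) -> bytes:
    """Read bytes first..last (inclusive, as in an HTTP byte range)."""
    return data[first:last + 1]


def range_update(data: bytes, first: int, last: int, new: bytes) -> bytes:
    """Replace exactly the bytes first..last with new content."""
    if len(new) != last - first + 1:
        raise ValueError("new content must match the size of the range")
    return data[:first] + new + data[last + 1:]


def range_overwrite(data: bytes, first: int, new: bytes) -> bytes:
    """Write new content starting at an offset (assumed semantics)."""
    return data[:first] + new + data[first + len(new):]


def range_append(data: bytes, new: bytes) -> bytes:
    """Add new content after the current end of the object."""
    return data + new


obj = b"The quick brown fox"
```

With plain S3 or Swift you would normally have to upload the whole object again to achieve any of the write operations, which is what makes these extensions useful.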

S3 protocol  Metadata Search, which enables object search based on previously indexed metadata, is supported by S3 only.

 

Conclusion

Both the S3 and Swift protocols provide object data storage. In general, their functionality is quite similar.

The key differentiators when choosing a protocol for a particular case are:

  • Popularity and compatibility
  • Application support
  • Proprietary vs Open Source development model
  • Security
  • Unique functional features
  • Special extensions supported by EMC ECS
