Create an Open Source AWS S3 server

Amazon S3 (Simple Storage Service) is a very powerful online file storage web service provided by Amazon Web Services. Think of it as a remote drive where you can store files in directories, then retrieve and delete them. Companies such as Dropbox, Netflix, Pinterest, SlideShare, Tumblr and many more rely on it.

While the service is great, it is not open source, so you have to trust Amazon with your data. And even though Amazon provides free-tier access for a year, you must enter credit card information to create an account. Because S3 is a must-know for any software engineer, I want my students to gain experience with it and use it in their web applications, yet I don’t want them to pay for it. Some Holberton School students also work during their commutes, which means slow Internet connections and expensive bandwidth, or no Internet connection at all.

That’s why I started looking into open source solutions that emulate the S3 API and can run on any machine. As usual, the open source world did not disappoint me and provided several solutions. Here are my favorites:

The first one I ran into is Fake S3. Written in Ruby and available as a gem, it takes only a few seconds to install, and the library is very well maintained. It is a great tool to get started with, but it does not implement all S3 commands and is not suited for production use.

The second option is HPE Helion Eucalyptus, which emulates a wide range of AWS services (CloudFormation, CloudWatch, ELB…), including S3. It is a very complete, enterprise-oriented solution that only runs on CentOS, and it is unfortunately too heavyweight for individuals or small businesses.

The last option, and my favorite, is Scality S3 server. It is available as a Docker image, which makes it very easy to start and distribute. The software suits individuals, who can get started in seconds without any complicated installation, as well as enterprises, since it is production-ready and scalable. The best of both worlds.

Getting started with Scality S3 server

To illustrate how easy it is to emulate AWS S3 with Scality S3 server, let’s do it live!

Requirements: Docker and Ruby, both installed on your machine.

Launch the Scality S3 server Docker container:

$ docker run -d --name s3server -p 8000:8000 scality/s3server
Unable to find image 'scality/s3server:latest' locally
latest: Pulling from scality/s3server
357ea8c3d80b: Pull complete
52befadefd24: Pull complete
3c0732d5313c: Pull complete
ceb711c7e301: Pull complete
868b1d0e2aad: Pull complete
3a438db159a5: Pull complete
38d1470647f9: Pull complete
4d005fb96ed5: Pull complete
a385ffd009d5: Pull complete
Digest: sha256:4fe4e10cdb88da8d3c57e2f674114423ce4fbc57755dc4490d72bc23fe27409e
Status: Downloaded newer image for scality/s3server:latest
7c61434e5223d614a0739aaa61edf21763354592ba3cc5267946e9995902dc18
$

Check that the Docker container is properly running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
ed54e677b1b3        scality/s3server    "npm start"         5 days ago          Up 5 days           0.0.0.0:8000->8000/tcp   s3server
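
If you prefer to check from Ruby that something is actually listening on the published port, a minimal sketch could look like this. It only tests that a TCP connection to port 8000 succeeds, it does not speak the S3 protocol, and `port_open?` is a hypothetical helper name of mine, not part of Docker, the AWS SDK or Scality S3 server.

```ruby
require 'socket'

# Hypothetical helper: returns true if something accepts TCP
# connections on host:port, false if the connection is refused,
# times out, or the host cannot be resolved.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

if port_open?('127.0.0.1', 8000)
  puts 'S3 server is reachable on port 8000'
else
  puts 'Nothing is listening on port 8000, check `docker ps`'
end
```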

Install the AWS SDK for Ruby (v2) gem:

$ gem install aws-sdk

Now let’s create a file that we will upload to our bucket (we will use it later):

$ touch myfavoritefile

Using your favorite text editor, create a file containing your Ruby script and name it `s3_script.rb`:

#!/usr/bin/ruby
require 'aws-sdk'
s3 = Aws::S3::Client.new(
 :access_key_id => 'accessKey1',
 :secret_access_key => 'verySecretKey1',
 :region => 'us-west-2',
 :endpoint => 'http://127.0.0.1:8000/',
 :force_path_style => true
)
s3.create_bucket({bucket: "mybucket"})
File.open('myfavoritefile', 'rb') do |file|
 s3.put_object(bucket: 'mybucket', key: 'myfavoritefile', body: file)
end
resp = s3.list_objects_v2(bucket: 'mybucket')
puts resp.contents.map(&:key)

Run the script:

$ ruby s3_script.rb
myfavoritefile

Congratulations, you created your first S3 bucket and uploaded a file to it!

 

Let’s explain the code

The first two lines indicate that the script should be executed with Ruby and load the AWS SDK library:

#!/usr/bin/ruby
require 'aws-sdk'

We open a connection to the S3 server running in our Docker container. Note that `accessKey1` and `verySecretKey1` are the default access key and secret access key defined by Scality S3 server:

s3 = Aws::S3::Client.new(
 :access_key_id => 'accessKey1',
 :secret_access_key => 'verySecretKey1',
 :region => 'us-west-2',
 :endpoint => 'http://127.0.0.1:8000/',
 :force_path_style => true
)
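
The `force_path_style` option deserves a word. By default the SDK puts the bucket name in the hostname, which works with Amazon's DNS but not with a local IP address. The sketch below only illustrates the difference between the two URL shapes; `object_url` is a hypothetical helper I made up for the illustration, it is not how the SDK builds requests internally, and it does not URL-escape keys.

```ruby
require 'uri'

# Hypothetical helper: builds the URL a client would roughly request
# for an object, in path style or in virtual-hosted style.
def object_url(endpoint, bucket, key, path_style:)
  uri = URI.parse(endpoint)
  if path_style
    # Bucket name goes in the path: works with any host, including an IP.
    "#{uri.scheme}://#{uri.host}:#{uri.port}/#{bucket}/#{key}"
  else
    # Bucket name becomes a subdomain: needs DNS to resolve it.
    "#{uri.scheme}://#{bucket}.#{uri.host}:#{uri.port}/#{key}"
  end
end

puts object_url('http://127.0.0.1:8000/', 'mybucket', 'myfavoritefile', path_style: true)
# prints http://127.0.0.1:8000/mybucket/myfavoritefile
puts object_url('http://127.0.0.1:8000/', 'mybucket', 'myfavoritefile', path_style: false)
# prints http://mybucket.127.0.0.1:8000/myfavoritefile
```

The second URL has no chance of resolving on a laptop, which is why the script forces path style.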

Let’s create an S3 bucket named `mybucket`:

s3.create_bucket({bucket: "mybucket"})

Here, we upload the previously created file `myfavoritefile` to our bucket `mybucket`:

File.open('myfavoritefile', 'rb') do |file|
 s3.put_object(bucket: 'mybucket', key: 'myfavoritefile', body: file)
end
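
If you want to double-check that the upload arrived intact, one option is to compare the file's MD5 digest with the `etag` returned by `put_object`. This is only a sketch under an assumption: for a simple, single-part, unencrypted upload, S3 returns the quoted hex MD5 of the body as the ETag, and I am assuming the emulated server behaves the same way. `etag_matches?` is a hypothetical helper name.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper: true if the file's MD5 matches the ETag.
# Assumes a single-part, unencrypted upload, where the ETag is the
# hex MD5 of the body wrapped in double quotes.
def etag_matches?(path, etag)
  Digest::MD5.file(path).hexdigest == etag.delete('"')
end

# Standalone demo with an empty file, like the one created by `touch`.
Tempfile.create('empty') do |file|
  puts etag_matches?(file.path, '"d41d8cd98f00b204e9800998ecf8427e"')
  # prints true (this is the MD5 of empty content)
end
```

In the script, you would keep the response of the upload, for example `resp = s3.put_object(...)`, and call `etag_matches?('myfavoritefile', resp.etag)`.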

Finally, we list the objects stored in `mybucket` and print their keys to standard output:

resp = s3.list_objects_v2(bucket: 'mybucket')
puts resp.contents.map(&:key)
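
One thing to keep in mind once your bucket grows: a single `list_objects_v2` call returns at most one page of results (up to 1,000 keys on AWS). Here is a minimal sketch of how the following pages could be fetched with continuation tokens. `all_keys` is a hypothetical helper of mine, and it accepts any object that responds to `list_objects_v2` the way `Aws::S3::Client` does.

```ruby
# Hypothetical helper: collects every key in a bucket by following
# continuation tokens until the server says the listing is complete.
def all_keys(client, bucket)
  keys = []
  token = nil
  loop do
    params = { bucket: bucket }
    params[:continuation_token] = token if token
    resp = client.list_objects_v2(params)
    keys.concat(resp.contents.map(&:key))
    break unless resp.is_truncated
    token = resp.next_continuation_token
  end
  keys
end

# Usage with the client from the script:
#   puts all_keys(s3, 'mybucket')
```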

Want to interact more with S3-compatible AWS apps? Join the hackathon co-organized by Scality and Seagate from October 21st to 23rd at Holberton School in San Francisco. There is no admission fee, and food and drinks are provided. The goal will be to connect S3 Server to a Seagate Kinetic backend, including using an erasure coding library and writing a data placement algorithm.