LAMP on EC2

Published on Tuesday, 03 March 2009.

I have always hosted my websites on a local (self-managed) Linux or Microsoft web server. Lately I have taken great interest in Amazon's Cloud platform -- Elastic Compute Cloud (EC2). Although we have a T1 for internet access, I have been somewhat leery of the "fault tolerance" and "scalability" of our setup. I had heard a number of people talk about how cheap the Amazon EC2 solution was, but the stated characteristics of Amazon EC2 instances "lacking persistent storage" made running a dynamic database (such as MySQL) infeasible (changes such as database rows inserted, updated, or deleted would be lost forever if the EC2 instance went down).


Amazon's Web Cloud Services are a pay-as-you-go service so please realize anything you do may result in charges to your Amazon account.


Then along came Amazon's Elastic Block Store (EBS). EBS brought the idea of persistent volumes -- storage that would persist after an EC2 instance terminated. So the idea was -- run your operating system and static files on an EC2 instance and then store your dynamic files, databases, etc. on an EBS volume.

So Amazon offers a workable solution for me. But what really brings value are some of the tools that Amazon provides:

  • Persistent storage in the event of instance failure - If an EBS volume is used as the storage for a MySQL database, then the data is protected from instance termination or failure. You can simply attach/mount the volume on another instance and MySQL will run its normal recovery procedures to bring the database up to date with the binary logs.
  • Safety & Replication - According to Amazon, "EBS volume data is replicated across multiple servers". This makes your data safer than the default instance storage.
  • Improved performance - Early reports from studies on EBS disk IO performance indicate that EBS IO rates can be multiple times faster than ephemeral storage and even local disk IO. This has obvious benefits for databases which are often IO bound.
  • Large data storage capacity - EBS volumes can be up to 1TB in size. In theory you could go larger with LVM or RAID across EBS volumes, or by placing different databases or table files on different EBS volumes.
  • Instance type portability - If you find that your current small EC2 instance is not able to handle your growing demand, you could switch the EBS volume holding your MySQL database to a running extra large instance in a matter of seconds without having to copy the database across the network. Downgrade instance types later to save money.
  • Fast and easy backups - EBS snapshots alone could be a sufficiently attractive reason to move a database server to Amazon EC2. Being able to take live, consistent, binary snapshots of the database in just a few seconds is a thing of beauty. Add in the ability to create a new EBS volume from a snapshot so another EC2 instance can run against an exact copy of that database... and you've opened up new worlds of possibilities.

Setting up an Amazon Web Services Account

If you don't have one already, you will have to set up an Amazon account with a valid credit card so that they can get their $$$. Once you're active, you can set up an Amazon Web Services Account. Then follow the link to the Access Identifiers Page. Your Access Key ID and your Secret Access Key should be displayed. You may have to use the supplied Generate button to generate your Secret Access Key. Write these down or print them out -- you will need them later. The good news is that you can come back to this page to get this info in the future.

Lower on this page, use the Create New button to generate an X.509 certificate. Use the Download button to save this certificate locally and write down/print your private key.

Next, go to the Elastic Compute Cloud page and use the Sign up for Amazon EC2 button to sign up for EC2 and S3.

You can download a Firefox browser plug-in called Elasticfox or you can use Amazon's most excellent AWS Management Console. Currently the plug-in is at version 1.6. To complete the setup of the Elasticfox plug-in, you will need to click on the Credentials button and use the Access Key and Secret Access Key from our previous steps. Add the keys and click Close.

Setting up a Key Pair

The next step is to configure a key pair for use when starting up your Linux instances. This public / private key pair will allow you to log in as root to a new instance generated off of a public machine image without the use of a password. From the AWS Management Console, select Key Pairs from the left-hand Navigation column. The Key Pairs tab will be displayed. Click on the Create Key Pair button. Give this new key pair a name and click on the Create button. You should receive a message, something like "A key pair has been created for you with the name xxxxxxx. Your private key should begin downloading in a few seconds". You should then be given the opportunity to save the key file (with a *.pem file extension) to your local hard drive. Save it to a safe but memorable place because you will need to use this file later for remotely connecting to your instance (via SSH).

Setting up a Security Group

A Security Group in Amazon's Web Service is basically a firewall. By default all network ports are blocked. You have to specifically allow access to your instance by port and IP address (or range of IP addresses). First select the Security Group tab and then click on the green plus button to add a new security group. Enter an appropriate Group Name and Description. Remember this group will most likely contain all of the allowed network ports for accessing this instance.

Although it is beyond the scope of this post, you will want to make sure and enable SSH, HTTP, and most likely HTTPS for the IP address of your client PC. Eventually you will also want to allow access (especially HTTP access) for other IP address ranges (hint: you will want to add IP group 0.0.0.0/0 for HTTP once your site goes "production", otherwise people will not be able to access it).
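If you happen to have Amazon's command-line EC2 API tools installed on your PC (this post otherwise sticks to the console), the rules described above could be sketched roughly as follows. The group name web-lamp and the client address 203.0.113.10 are placeholders I made up, and the helper names are hypothetical. With DRY_RUN=1 the script only prints the ec2-authorize commands, so nothing is changed on your account.

```shell
#!/bin/sh
# Sketch: open SSH, HTTP and HTTPS for a single client address.
# GROUP and CLIENT_CIDR are made-up placeholders, not values from this post.
GROUP="${GROUP:-web-lamp}"
CLIENT_CIDR="${CLIENT_CIDR:-203.0.113.10/32}"
DRY_RUN="${DRY_RUN:-1}"

authorize_port() {
    # $1 = TCP port, $2 = source address range in CIDR notation
    cmd="ec2-authorize $GROUP -P tcp -p $1 -s $2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"        # print only
    else
        $cmd               # really call the EC2 API tools
    fi
}

open_lamp_ports() {
    for port in 22 80 443; do
        authorize_port "$port" "$CLIENT_CIDR"
    done
}

open_lamp_ports
```

When the site goes "production", the same helper could be called as authorize_port 80 0.0.0.0/0 to open HTTP to everyone.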

Setting up a Persistent Volume

To create a persistent volume (i.e., one capable of running a reliable instance of MySQL -- or any other form of dynamic data for that matter) we will first need to access your Amazon Web Service Account via the AWS Management Console. Click on Sign in to Amazon EC2 Console and enter your login information. Once logged in, the EC2 Dashboard should be displayed. In the left-hand column, select the Volumes link under Elastic Block Store.

Next, the EBS Volumes tab should be displayed. Select Create Volume and a popup should be displayed for the details on your new volume. You can create a volume from 1 gigabyte to 1 terabyte in size. You will also need to specify an Availability Zone.

Availability zones refer to the Amazon data centers. Currently there are three of them, all of which are on the east coast of the U.S. Make note of which availability zone you choose because you will need to match this with your instance availability zone. Instances can only reference persistent volumes located in the same availability zone.

You should also have the option to select a Snapshot. In our case we will be creating a blank volume. However, you could just as easily create a new volume based on a previously taken "snapshot" of a volume or even one of Amazon's pre-built volumes of data (Census, Economic, UGI, etc.). Sorry, no experience in using any of Amazon's canned data volumes -- I don't even know if they charge anything for their use.

Once you have defined your volume, click on the Create button. Amazon will start creating your volume as evidenced by "creating" displayed in the Status column. You can use the Refresh button to update the display until "available" is displayed in the status column. Once we have brought our Linux instance online, we will also use the Attach Volume button to join the instance and volume together.

Starting and Configuring the Instance

Now it's time to get our Linux instance up and configured on EC2. Log into the AWS Management Console and select AMIs from the left-hand Navigation column. The Amazon Machine Images tab will appear. Near the top will be an area labeled Viewing that you can use to find a suitable AMI to use to create your own instance. You can filter your selection based on platform, architecture, visibility, et al. I am a fan of the CentOS Linux distribution so I'll be looking for a platform of "CentOS". I type into the text box provided "Rightscale". Rightscale is a great "open friendly" company that has donated their AMIs for public consumption. Now my selection has been filtered down to a very workable list. I chose AMI ID: ami-cb52b6a2 which is a CentOS v5.2 i386 image. Click on the image to select it and then click on the Launch button.

A pop-up window will now appear to collect some information for your new instance.

  • AMI Name: gives a (very) brief description of the AMI.
  • Number of Instances: enter "1".
  • Instance Type (32 bit): select "m1.small". This is the smallest (cheapest) instance. Of course, money permitting, you can select a more powerful platform if you so desire. You can also choose, money permitting, to select a 64-bit AMI which will give different options as well for your Instance Type.
  • Key Pair Name: select the Key Pair Name you created in Part 1.
  • Security Groups: select the Security Group (or groups) you created in Part 1.
  • Under Advanced Options, select the Availability Zone you specified in Part 2.

Click on the Launch button. Amazon should then advise you that your instance is in the process of launching. Click the Close button.

Select Instances from the left-hand Navigation column. Find your instance and wait until the Status column shows "running". You may have to keep hitting the Refresh button to view the most current status.

Select your instance and click on the Connect button. A pop-up window should appear detailing how you can use SSH to connect to your instance. You should be able to cut-and-paste the SSH line into a terminal window and (providing that your Key Pair file matches the location and name specified) connect to your instance. Note: SSH may complain about your file permissions for your Key Pair file. Use chmod (maybe 600) to set the file permissions if needed.
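As a small sketch of the permission fix and the connection, the two helpers below tighten the key file to mode 600 and then call ssh with it. The function names and the example paths in the comments are my own placeholders; the real host name is whatever the Connect pop-up shows for your instance.

```shell
#!/bin/sh
# Sketch: make the private key readable by its owner only, then connect.
secure_key() {
    # $1 = path to the key pair file saved from the console
    chmod 600 "$1"
}

connect() {
    # $1 = key pair file, $2 = public DNS name of the instance
    secure_key "$1" && ssh -i "$1" "root@$2"
}

# Example with placeholder values:
#   connect ~/keys/mykey.pem ec2-xx-xx-xx-xx.compute-1.amazonaws.com
```

SSH refuses keys that other users can read, which is why the chmod comes first.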

Configuring your Instance

I have plenty of experience with Linux, but I would not consider myself an expert. Most of what I would consider "Configuring a Linux Instance" involves: (1) installing software necessary to get the job done, (2) updating the operating system and installed software from yum repositories, and (3) disabling unnecessary services.

I will rely heavily on other web sites to recommend optimized settings for software such as Apache, PHP, and MySQL and combine these recommendations onto this blog.

Let's SSH into your instance and do a basic update of installed operating system and software.

yum -y update

Now let's install some software.

yum -y install php
yum -y install php-gd php-imap php-mysql mysql-server
yum -y install php-mbstring php-mcrypt php-pear-DB php-mhash php-pear
yum -y install php-xml php-xmlrpc php-curl
yum -y install ImageMagick libxml2-devel perl-libwww-perl perl-DateManip
yum -y install git iksemel js ntp
yum -y install kmod-xfs.i686 xfsdump.i386
yum -y install mod_ssl

Let's add a user.

adduser [myuser]
passwd [myuser]

Let's enable our new user to use SFTP.

vi /etc/ssh/sshd_config

Make the following change:

PasswordAuthentication yes

Let's add the new user to the end of our sudoers file (edit it with visudo, which checks the syntax before saving):

[myuser]        ALL=(ALL)       ALL

Restart the sshd process:

/etc/rc.d/init.d/sshd restart

Let's disable some unneeded services (and make sure Apache starts at boot):

chkconfig autofs off
chkconfig avahi-daemon off
chkconfig httpd on
chkconfig netfs off
chkconfig nfslock off
chkconfig restorecond off
chkconfig xfs off
chkconfig rightscale off

Let's reboot and hope for the best.

reboot

Setting up a MySQL Database for a Ruby on Rails Application

Here are the steps necessary to set up a MySQL database for a Ruby on Rails project. This assumes you have given the root user a password. If not, do:

mysqladmin -u root password "new_root_password"

Once you have created secure access for your MySQL instance, do:


mysql -u root -p
DELETE FROM mysql.user WHERE NOT (host="localhost" AND user="root");
#
create database MyDB_dev;
grant all on MyDB_dev.* to 'rails_dev'@'localhost' identified by 'pw_dev';
#
create database MyDB_test;
grant all on MyDB_test.* to 'rails_test'@'localhost' identified by 'pw_test';
#
create database MyDB_prod;
grant all on MyDB_prod.* to 'rails_prod'@'localhost' identified by 'pw_prod';
#
DROP DATABASE test;
FLUSH PRIVILEGES;
exit;

Next, edit your database.yml file as such:

development:
  adapter: mysql
  database: MyDB_dev
  username: rails_dev
  password: pw_dev
  # the socket path must match the one set in /etc/my.cnf
  socket: /var/lib/mysql/mysql.sock
test:
  adapter: mysql
  database: MyDB_test
  username: rails_test
  password: pw_test
  socket: /var/lib/mysql/mysql.sock
production:
  adapter: mysql
  database: MyDB_prod
  username: rails_prod
  password: pw_prod
  socket: /var/lib/mysql/mysql.sock

Now go to your rails application directory and execute rake:

cd /home/username/my_rails_app
rake db:migrate

You should receive no errors.

Mounting a Persistent Volume

In this post we will be mounting our persistent share on our Linux instance. If all is going according to plan, your Linux instance should have come up in a more optimized fashion with many of the LAMP services we need functioning and many of the services (that we do not need) inactive.

Log on to the AWS Management Console and click on the Volumes link on the left-hand Navigation panel. The volume you created in Part 2 should show under EBS Volumes. Select your volume and click on the Attach Volume button. A pop-up window should request the following settings:

  • Volume: shows the Volume ID and Amazon Data Center associated with your volume.
  • Instances: allows you to select which instance you want to attach the volume to. There should only be one choice unless you are running more than one instance, but make sure this matches the intended instance.
  • Device: allows you to select the Linux device label under which your persistent volume will be exposed to your Linux instance. You should be able to select a device label of your choice. Remember your setting as it will be used later when we mount the volume.

Make your selections and click on the Attach button. The window should close and the status of your volume should now be "In Use".
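For those who prefer a terminal, the attach step can also be sketched with the ec2-attach-volume command from Amazon's EC2 API tools. The helper name, the volume and instance IDs, and the zone names below are placeholders of mine. The function only prints the command, and it refuses to do so when the volume and the instance sit in different availability zones, since that attach would fail anyway.

```shell
#!/bin/sh
# Sketch: check that the zones match, then print the attach command.
attach_volume_cmd() {
    # $1 = volume ID, $2 = volume zone, $3 = instance ID,
    # $4 = instance zone, $5 = device label (e.g. /dev/sdf)
    if [ "$2" != "$4" ]; then
        echo "zone mismatch: volume in $2, instance in $4" >&2
        return 1
    fi
    echo "ec2-attach-volume $1 -i $3 -d $5"
}

# Placeholder IDs; substitute the values shown in your console.
attach_volume_cmd vol-xxxxxxxx us-east-1a i-xxxxxxxx us-east-1a /dev/sdf
```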

Connect to your running Linux instance via SSH and switch to the root user once connected. In a previous step, we added XFS filesystem capabilities. We will now format the persistent volume with the XFS filesystem. Be sure and specify the device label you used in the "Attach Volume" steps detailed above. (we will use /dev/sdf for example purposes)

mkfs.xfs /dev/sdf
mkdir /mnt/persistent

Depending on the size of your volume, this may be quick or may take some time. In this example, we will be mounting the volume into the "/mnt/persistent" subdirectory, but you can mount it wherever you want. Next, edit your "/etc/fstab" file and add the following to the end of that file:

 
/dev/sdf /mnt/persistent xfs defaults 0 0

Again making sure that you specify the device label used in the "Attach Volume" steps detailed above. Save your changes and enter the following at the command prompt:

 
mount /mnt/persistent

If "all the stars are in alignment", you should have a clean mounted, persistent volume. Enter "df -h" at the command prompt and stare in amazement.
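If you would rather have a script confirm the mount than read the df output by eye, something along these lines should do. The function name is hypothetical. It looks for the mount point in the second column of the kernel's mount table (/proc/mounts on Linux), and the optional second argument is only there so the check can be tried against a sample file.

```shell
#!/bin/sh
# Sketch: succeed only if the given directory appears as a mount point.
is_mounted() {
    # $1 = mount point, $2 = mount table to read (default: /proc/mounts)
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example:
#   is_mounted /mnt/persistent && echo "persistent volume is mounted"
```

A check like this is handy at the top of any script that writes to /mnt/persistent, so that data never lands on the instance's own disk by accident.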

Configuring MySQL

In a previous post, we installed the binaries and associated libraries to run a MySQL server. Our goal all along was to get our MySQL server to use the persistent characteristics of an EBS volume. So our next task will be to make the configuration changes necessary to use our newly created EBS volume. We will also make some changes to make MySQL perform better on our EC2 Linux instance. We built our instance on the cheapest, entry level Instance Type. Depending on the processor(s), memory, and architecture selected for your instance, your MySQL configuration may require significant changes.

First, let's connect to your Linux instance via SSH, switch to the root user, and create a MySQL working directory for our new EBS volume:

mkdir /mnt/persistent/mysql

Next, we made the following changes in our "/etc/my.cnf" file:

[client]
port        = 3306
socket      = /var/lib/mysql/mysql.sock
[mysqld]
datadir=/mnt/persistent/mysql
port        = 3306
socket      = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 256M
max_allowed_packet = 50M
table_cache = 1024
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 64
tmp_table_size = 40M
join_buffer_size = 1M
query_cache_limit = 12M
query_cache_size= 32M
query_cache_type = 1
max_connections = 60
thread_stack = 128K
thread_concurrency = 4
log-bin=mysql-bin
server-id   = 1
innodb_data_home_dir = /mnt/persistent/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mnt/persistent/mysql/
innodb_log_arch_dir = /mnt/persistent/mysql/
innodb_buffer_pool_size = 256M
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 64M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
[myisamchk]
key_buffer = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M

I am sure there are many arguable issues surrounding this config (especially on thread_concurrency). Please make your comments and I will evaluate their merit.
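One way to sanity-check a config like this is to add up what it could use in the worst case. The sketch below applies a common rule of thumb (global buffers plus per-connection buffers times max_connections) to the values above. It is only a rough upper bound: it ignores tmp_table_size, the operating system, Apache and PHP, and the choice of which buffers to count is my own assumption.

```shell
#!/bin/sh
# Rough worst-case memory estimate for the my.cnf values above (in KB).
MAX_CONNECTIONS=60

# Global buffers in MB: key_buffer + innodb_buffer_pool_size
#   + innodb_additional_mem_pool_size + innodb_log_buffer_size
#   + query_cache_size
GLOBAL_KB=$(( (256 + 256 + 20 + 8 + 32) * 1024 ))

# Per-connection buffers: sort + read + read_rnd + join (in MB),
# plus thread_stack (128 KB)
PER_THREAD_KB=$(( (1 + 1 + 4 + 1) * 1024 + 128 ))

TOTAL_KB=$(( GLOBAL_KB + PER_THREAD_KB * MAX_CONNECTIONS ))

echo "global:     $(( GLOBAL_KB / 1024 )) MB"
echo "per thread: ${PER_THREAD_KB} KB"
echo "worst case: $(( TOTAL_KB / 1024 )) MB"
```

That comes to roughly 1 GB with all 60 connections busy, against the 1.7 GB of memory on an m1.small instance, so there is not a great deal of headroom left for the web server.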

Time to start our MySQL server:

service mysqld start

Your instance should start up, but make sure and evaluate any messages that are displayed. Next we set the MySQL root password:

/usr/bin/mysqladmin -u root password 'new-password'

Making sure, of course, to substitute your own suitable password. Now let's make sure MySQL will automatically start on a reboot:

chkconfig mysqld on

Your database should be good to go!

Configuring Apache

As with most software, there seems to be some mysticism involved in optimizing software -- especially for the web -- and even more so on a hosted web service such as Amazon's EC2. I will not pretend to be an expert but will try to bring together both my own experiences as well as the fine posts by others on the web.

Our Apache web server should already be running on our instance from our earlier instructions in Part 3. But just in case, we can check its status by connecting to our instance via SSH, switching to the root user, and then entering:

 
service httpd status

If it is not running, you will probably have some investigating to do -- check your logs.

Now it is time to make the following change to our Apache config file located at "/etc/httpd/conf/httpd.conf":

 
KeepAlive On
NameVirtualHost *:80

Note that we plan to use the Virtual Hosting (by Name) capabilities of Apache here. Depending on your plans, your configuration may vary. We would not only uncomment the NameVirtualHost directive, but we would also need to set up the VirtualHost section located at the bottom of this file:

 
<VirtualHost *:80>
        DocumentRoot /mnt/persistent/www
        ServerName www.your_domain.com
        <Directory /mnt/persistent/www>
                AllowOverride All
                allow from all
                Options +Indexes
        </Directory>
        ServerAlias your_domain.com
</VirtualHost>

Depending on the location and other properties of your web site, your configuration will most definitely be different. Let's restart our web server based on our changes:

 
service httpd restart

Our web server should be good to go! I invite any recommendations you may have as far as further optimizations.

Configuring PHP

As I stated in my previous posts, optimizing software for the web and especially for Amazon's EC2 service can be a mystical art. I will not pretend to be an expert and I invite any comments on further optimizations you may have.

The only optimizations I really have for PHP involve the following changes to your /etc/php.ini file:
 
max_execution_time = 90
max_input_time = 240 
memory_limit = 128M

You will want to restart your Apache web server to make use of the changes:

 
service httpd restart

Happy coding!

Setting up an Elastic IP Address

So now we have a Linux instance out on the Internet and an Apache web server running on it. We now need a static IP address to associate with our server so that the Internet can access our (web) services. Amazon offers what they call Elastic IP addresses. Elastic IP addresses are associated with your Amazon Web Services account, not specific instances. Any elastic IP addresses that you associate with your account remain associated with your account until you explicitly release them. Unlike traditional static IP addresses, however, elastic IP addresses allow you to mask instance or availability zone failures by rapidly remapping your public IP addresses to any instance in your account.

Some good news: Elastic IP addresses are free of charge while they are mapped to an instance. Amazon imposes a small hourly charge only when an address is allocated to your account but not mapped to an instance.

So let's log in to your AWS Management Console. Select Elastic IPs from the left-hand Navigation area. Click on the Allocate New Address button near the top. A pop-up window will be displayed to confirm your request for a new address. Click on "Yes Allocate". Write down your new IP address and select it with a click. Next, click on the Associate button. A pop-up window will be displayed, asking you to associate the IP address with your Instance ID. Select your Linux instance and click on Associate.

Your instance now has an IP address. You will need to add this new IP address to your registered DNS settings.

Setting up Snapshots

One of the benefits stated in Post 1 is "Fast and Easy Backups". Amazon provides the ability to perform immediate, incremental backups of EBS volumes. With the help of some scripts to place our MySQL database in a state suitable for backup, we can make nearly-immediate backups of our persistent storage.

We'll need some tools from Amazon, so log in to your instance via SSH and switch to the root user. Since the Rightscale AMI already has an older version of the EC2 AMI tools installed, we will download the newer tools and use rpm to update them to the current version:

 
cd /usr/src 
wget "http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.noarch.rpm" 
rpm -Uvh ec2-ami-tools.noarch.rpm 

You will most likely have to use a web browser and visit http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351&categoryID=88 to identify the url for the EC2 API tools. The Rightscale AMI we used also has a version of the tools installed. They are installed in /home/ec2. I did the following:

 
cd /usr/src 
wget "http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip" 
unzip ec2-api-tools.zip 
cd /home/ec2 
rm -rf bin 
rm -rf lib 
cd /usr/src/ec2-api-tools* 
mv bin /home/ec2 
mv lib /home/ec2 

We will need the X.509 certificate and private key (we downloaded these when we set up our Amazon Web Services account in Part 1). Use whatever method you feel comfortable with to upload the pk-*.pem and cert-*.pem to the /home/ec2/certs directory. In the /root/.bashrc file, add the following lines to make sure that the EC2 tools know where to find the certificate and key:

 
export EC2_CERT=/root/.pem 
export EC2_PRIVATE_KEY=/root/.pem 

The backup script that will run every hour will need to lock the MySQL database during the snapshot process, so create a /root/.my.cnf file that has the following format:

 
[client] 
    user=root 
    password=

Next we will install two scripts. The first is called takesnapshot and should be downloaded and placed in /etc/cron.hourly or /etc/cron.daily depending on your needs. Edit this file to insert the volume ID of your persistent store. This ID can be found in the AWS Management Console under the Volumes link. Make this script executable using chmod +x.

The second script is called ec2-snapshot-xfs-mysql.pl and is a modified version of Amazon's script. Move this script to /usr/bin, edit it to point to the proper file names of your X.509 certificate and private key, and make it executable.

Once everything is in place, you can manually try running the takesnapshot script. Once it finishes, check the Snapshots tab in your AWS Management Console for your backup.
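The two scripts are not reproduced here, but the sequence they have to go through can be sketched as follows. This is my own minimal outline, not the contents of takesnapshot or ec2-snapshot-xfs-mysql.pl. It assumes the mysql client's SYSTEM command (which runs a shell command from inside the session, so the read lock is held the whole time), the xfs_freeze utility, and ec2-create-snapshot from the EC2 API tools. The volume ID is a placeholder.

```shell
#!/bin/sh
# Sketch: print the statements for one consistent snapshot, in order:
# lock tables, freeze the filesystem, snapshot, unfreeze, unlock.
snapshot_sql() {
    # $1 = EBS volume ID, $2 = mount point of the XFS filesystem
    cat <<EOF
FLUSH TABLES WITH READ LOCK;
SYSTEM xfs_freeze -f $2
SYSTEM ec2-create-snapshot $1
SYSTEM xfs_freeze -u $2
UNLOCK TABLES;
EOF
}

# To run it for real, feed everything to a single mysql session
# (credentials come from /root/.my.cnf):
#   snapshot_sql vol-xxxxxxxx /mnt/persistent | mysql
```

Everything has to happen in one session because the read lock is released as soon as the session that took it disconnects.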

Generating a Custom AMI

So in our last post, we created scripts that would take snapshots of our persistent volume. So that covers our persistent datastore. But what about the changes we have made to our Linux image? In the event that the instance goes down, we would lose any configuration changes made to date. So we will create our own private AMI of our current Linux instance state. In the event of a failure, or even to spawn multiple instances, we can use this custom AMI to quickly restore our state.

To do this, first shut down MySQL and Apache and unmount your persistent store using the following commands:

 
/etc/rc.d/init.d/mysqld stop
/etc/rc.d/init.d/httpd stop
umount /mnt/persistent

Go to your AWS Management Console and retrieve your Owner ID from the running instance. Copy that and paste it within the following command, which creates the new AMI:

 
ec2-bundle-vol --fstab /etc/fstab \
     -c /home/ec2/certs/[certificate] \
     -k /home/ec2/certs/[private key] \
     -u [Owner ID]

This will create the image in the /tmp directory, but that image still needs to be uploaded. Upload it using the following command:

 
ec2-upload-bundle -b [bucket name] \
     -m /tmp/image.manifest.xml \
     -a [Access Key ID] \
     -s [Secret Access Key]

where the bucket name is a globally unique identifier. It can be the name of a bucket you already use or a new one, in which case the bucket will be created (if the name is available).

You will need to register this new AMI with the AWS Management Console by going to AMIs and clicking the Register New AMI button. The AMI manifest path that it will ask for is your bucket's name followed by /image.manifest.xml. The AWS Management Console should add your AMI to the list alongside the public ones (it will be marked "private"). If you don't see it right away, you can do a search for a substring within the name of your bucket.