Software Development Blog | Laurence Gellert's Blog - Part 12
Laurence Gellert's Blog » Re: Software Is My Life, So Make It A Good One

Launch Plans – Your Ticket to Excellence

by Laurence Posted on October 27, 2011

McDonald’s has a checklist for making a Big Mac, but surgeons also use checklists so they don’t cut off the wrong limb of a patient. Checklists sound mundane, but they ensure quality. In the case of a Big Mac, the checklist makes for a consistent low cost product. In the case of a surgical operation, checklists help to reduce infection rates and eliminate human error.

Memory is transient, especially the biological kind. For software launches, checklists make things go smoother at that critical moment when all the new features go live. Consistently getting the small details right separates the professionals from the amateurs. My philosophy is to have two written plans when it comes to a software release:

1) Deploy Plan

2) Back Out Plan

In a past life on a small team we kept our plans in a release notes folder – e.g. /release_instructions/v4.2/. Checklists kept the team in sync and made deploys go smoothly. This article covers what types of notes we had in the launch plan for a LAMP-based SaaS product.

Deploy Plan:

A deploy plan proves the team knows what it is doing. It is basically a set of steps. It could be a text file, a Word document, a set of issues in Jira, whatever… The deploy plan gives you a chance to rehearse on QA or staging. Have the least familiar team member run through the plan on their own to spot confusing or incomplete sections.

A typical deploy plan has several steps. For example, there might be a database schema change associated with an ETL script that needs to be executed. A verification step could be included. Along with the code refresh, it is time to apply security patches to the server. The plan looks like this:

************************************************
A) Pre Deploy Steps.
************************************************

1) Make sure tag for 4.2 gold release is in place.
2) Backup database before proceeding!
3) Copy database backup to another server and share where it is with team.
4) Set application into maintenance mode.

************************************************
B) Run system level OS updates on servers.
************************************************
1) Take snapshot of servers in hypervisor layer.
2) On web server and DB server:

$ sudo apt-get update
$ sudo apt-get upgrade

************************************************
C) Deployment instructions for the various database scripts in the 4.2 release.
************************************************

DDL is in the /documentation/databasemap/4.2/ folder.

Perform steps in the following order:

1) Run DDL and SQL updates as root:  

	a) create_4.2_tables.sql
	b) basic_4.2_data_seeding.sql

2) Run data conversion:

	a) Edit /utility/config to point to the right DB.
	b) Run script /utility/4.2/populate_section_tables
	c) Run script /utility/4.2/verify_new_tables
           (should get zero errors)

************************************************
D) Deploy the new code.
************************************************

1) Checkout 4.2 gold release tag.
2) Run unit tests. If they pass, continue.
3) Continue with ./deployment_instructions.txt, eg:
	$ app_deploy_script...

The exact steps and what tools are used will vary greatly depending on how the system is set up. This article is not about deployment tools – there are many to choose from. This just gives you an idea of what could be organized before a launch. Make sure to rehearse it at least once to shake out the bugs.
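As a rough illustration of the "plan as a checklist" idea, here is a minimal Python sketch that walks an ordered list of steps and stops at the first failure, so the back out plan can take over. The run_plan helper and the step names are hypothetical and are not part of the release process described above; each step is simply a callable that returns True on success.

```python
def run_plan(steps):
    """Run (name, action) pairs in order; stop at the first failure.

    Returns (completed, failed): the names that succeeded, and the
    name of the step that failed (None if every step passed).
    """
    completed = []
    for name, action in steps:
        print("Running: " + name)
        if not action():
            print("FAILED: " + name + " -> switch to the back out plan")
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Hypothetical stand-ins for real work; each returns True on success.
    demo_steps = [
        ("Confirm 4.2 gold release tag", lambda: True),
        ("Back up database", lambda: True),
        ("Set maintenance mode", lambda: True),
    ]
    done, failed = run_plan(demo_steps)
    print("Completed %d steps, failed step: %s" % (len(done), failed))
```

Even a tiny runner like this makes the order explicit and records how far a deploy got before something went wrong.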

Back Out Plan:

The back out plan is there in case something goes wrong. What if a last minute bug causes the deploy to fail? What if a library is different on the live server? Well you could wrestle with it and miss your dinner, or you could roll back safely and deal with it in the morning.

Sample back out plan:

1) Restore backed up database:

   a) Drop db
   b) Create new blank db
   c) Restore backup db
   {provide sample commands}

2) Revert to last checkout and re-run deployment script which replaces symbolic links and restarts necessary services:
   $ cd ../{last deployed tag}
   $ ./{run deploy script}

This is super simplified and will vary based on your situation. You might be able to revert a snapshot and reset the system clock. You might not have the luxury of dropping and restoring the database either. The point is to make backing out a plan, so in case you need to you can. No sense in working late and making even more mistakes, introducing last minute bugs, or accidentally causing data loss while making a cowboy tweak to the live server.

Taking this further:

Deploy and back out plans will vary considerably depending on the type of software, platform, hosting environment, etc. The example above was based on a pretty simple web application. It was running on just two servers. A command line driven deploy script was used to set up configuration files and remap symbolic links to make the application ‘live’.
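The deploy script itself is not shown in this article, so the following is only a sketch of how a symbolic link remap could work, assuming Python 3 on a POSIX system. The point_live_link helper and the directory names are made up for illustration. The idea is to create the new link under a temporary name and rename it over the live link, so the application never sees a moment with no link in place.

```python
import os
import tempfile


def point_live_link(release_dir, live_link):
    """Repoint live_link at release_dir by swapping in a new symlink."""
    tmp_link = live_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)          # clear a leftover from an aborted run
    os.symlink(release_dir, tmp_link)
    os.replace(tmp_link, live_link)  # rename over the old link (atomic on POSIX)


if __name__ == "__main__":
    # Throwaway directories standing in for two deployed release tags.
    root = tempfile.mkdtemp()
    old = os.path.join(root, "v4.1")
    new = os.path.join(root, "v4.2")
    os.mkdir(old)
    os.mkdir(new)
    live = os.path.join(root, "live")

    point_live_link(old, live)   # initial deploy
    point_live_link(new, live)   # deploy 4.2
    print(os.readlink(live))
    point_live_link(old, live)   # back out to the previous tag
    print(os.readlink(live))
```

Backing out is then the same call pointed at the previous release directory, which is what makes this approach attractive for a back out plan.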

A cool trick in the cloud is to spin up new instances, deploy onto those, test as if they are staging servers, then when ready do a DNS cut over. At that point, just spin down the old instances, or leave them hanging around until you are ready to terminate them. Data synchronization is something to pay attention to with this one.

Another trick with virtualization is to take snapshots of the servers so you have an extra safety net. Do that before applying OS updates just in case one goes south.

Some applications and even a few modern frameworks have a built in maintenance mode that can be enabled by an administrator which tells visitors that the site is undergoing maintenance.

When it comes to doing the build and generating packages / build artifacts I recommend Jenkins – http://jenkins-ci.org/ and Bamboo by Atlassian. In a past job we had a ‘One Click Deploy to QA’ process set up that any team member could trigger. Automating deploys is an investment and you’ll have to decide where to strike the balance economically. Once you try it though, you’ll never go back. An automated build becomes part of the deploy process and removes the element of human error present in the basic approach given above.

Posted in Application Development, For New Developers | Tagged scripting, software | Comments Off

Interview with a Software Lawyer

by Laurence Posted on October 9, 2011

I met Aaron Williamson at OSCON this year. Aaron works as a legal counselor at the Software Freedom Law Center in New York. He is an expert on software law and open source licensing. After one of his talks I approached him and asked if I could interview him and share the transcript online. Well, here it is! While we are on the subject of legal matters, here is a disclaimer: This article is not legal advice.

Understanding the legal aspects behind software copyright and licenses is extremely important to software makers. At the very least all software professionals should know the difference between Apache/MIT, LGPL, and GPL. What I learned from Aaron is that no matter what, a copyright is in place by default even if the author does not want the work. To give away the work the author must first acknowledge their copyright, then issue an appropriate license allowing others to use the work. A seemingly innocent copy and paste of a code snippet can be considered a copyright violation if you are not careful.

Q1) If I copy and paste a code snippet from a website and then modify it to fit into my system, who owns that section? What if I just use that code as a guide and only re-use the pattern it follows?

A1) That depends on: whether the code itself is copyrightable; if so, what the license to the code is; and whether you copied original expression. Many sites that post code snippets include a term in their terms of use specifying the license for the code. If the site doesn’t specify a license, or otherwise say how the code can be used, you should consider the code “all rights reserved.” While code must contain a certain degree of originality to be copyrightable, that degree is low, and there is no clear line. Since “originality” can be found in not only the literal code but also in its structure and other nonliteral elements, merely copying the code’s “pattern” doesn’t necessarily mean you’re not using copyrighted work. If you find code you want to use on a blog and a license isn’t specified, try contacting the author and requesting to use the code under your preferred license; you’ll find people are usually more than happy to do so.

Q2) Has there ever been a liability case against a software engineer for having a bug in their code that led to damages? I remember the case of the Therac-25 radiation therapy machine being responsible for injuries (http://en.wikipedia.org/wiki/Therac-25).

A2) I am not aware of any case against a free software developer for buggy code. I would guess that disputes have arisen between businesses and their software vendors over bugs in the vendors’ software, but I’m not familiar with any particular cases.

Q3) Are there any best practices from a legal standpoint that software developers should follow? Other skilled professions, like law, accounting, and medicine, have such tenets, but software is still immature in that regard. Is that changing?

A3) For free software developers and companies who use free software, the most important thing is license compliance. Keep track of what code you’re using, whose copyrights are in it, and what the license terms say. Always preserve copyright notices. More on this can be found in our primer and other documents on our website, including Maintaining Permissive-Licensed Files in a GPL-Licensed Project: Guidelines for Developers (http://softwarefreedom.org/resources/2007/gpl-non-gpl-collaboration.html).

Q4) Is it important to add disclaimers to code samples and/or blog posts?

A4) I assume that you’re referring to warranty disclaimers. In many jurisdictions (including individual states), certain warranties attach implicitly to any consumer good, for example a warranty that the product is generally “fit” for the purpose for which it is sold or distributed. There is sufficient concern that software—even community-produced free software distributed only in source code form—will be deemed subject to these implied warranties that most lawyers will recommend disclaiming them explicitly. There is probably a lower risk that code samples on a personal website will be considered “goods” subject to such warranties, but because consumer protection laws vary from state to state, I cannot say with certainty that no such warranty would apply. If you’re asking whether you should add explicit copyright licenses for code samples, I think it’s a good idea. If the code contains sufficient originality to be copyrightable (a relatively low bar) then visitors to your website will need a license to use it.

Q5) What do software developers need to know about patents?

A5) This one is easy. SFLC recently worked with the Debian community to produce an FAQ about patent issues for free and open source software developers. It explains what patents are, the risks they pose to free software developers, and how developers can limit their exposure. The FAQ can be found at: http://www.debian.org/reports/patent-faq. We also have a legal issues primer on our website, at http://softwarefreedom.org/resources/2008/foss-primer.html, that includes a section on patents. These two documents cover a lot of the same basic information, but both are worth reading.

Posted in Application Development | Tagged business, development | Comments Off

Good Source of Icons and Photos for Free

by Laurence Posted on September 28, 2011

There are four-year-olds with better drawing skills than me. Thankfully, there is the Open Icon Library project to take care of all my icon needs. It is full of useful icons, indicators, alerts, flags, etc… The images are professional and polished looking, and there are a number of available themes.

Statistics:

Total icons: 137,396
Unique icons: 10,787
Unique PNG: 10,489
Unique SVG: 3,723

Everything has been pre-saved in a variety of formats and sizes, making it easy to grab art and get it working with minimal fuss. Even though this is an ‘Open’ project, the licenses vary per icon! Be careful of this on commercial projects. From the footer of the site:

“All icons are under free/open licenses, however the licenses vary from image to image. Some are Creative Commons, some are GPL, some are Public Domain. Licenses for sources app-install, wiki_commons-BSD, wiki_commons-CC,wiki_commons-GPL, , wiki_commons-MIT, and wiki_commons-PD”

Aside from license pitfalls that might come up, one limitation is the inability to search.

For Pictures:

Sometimes a picture is also handy to pull into a project. In today’s world there are ways of searching only open source images. For example, the Flickr API supports searching for Creative Commons licensed images. Many other photo upload sites and content aggregators have jumped on the open source bandwagon.
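To give a feel for the Flickr case, here is a small Python 3 sketch that only builds the search URL, without calling the network. It assumes the flickr.photos.search method with its license parameter on the standard REST endpoint. The build_cc_search_url helper, the placeholder API key, and the license IDs 4 and 5 are illustrative assumptions; check the current ID list with flickr.photos.licenses.getInfo before relying on them.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_cc_search_url(api_key, text, license_ids):
    """Build a flickr.photos.search URL restricted to the given license IDs."""
    params = [
        ("method", "flickr.photos.search"),
        ("api_key", api_key),
        ("text", text),
        # comma-separated list of license IDs to allow in the results
        ("license", ",".join(str(i) for i in license_ids)),
        ("format", "json"),
        ("nojsoncallback", "1"),
    ]
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder; the license IDs are assumed examples.
    print(build_cc_search_url("YOUR_API_KEY", "chickens", [4, 5]))
```

Whatever the search returns, each photo still carries its own license, so attribution requirements need to be checked per image.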

Again, you are dealing with a hodgepodge of licenses. What I have noticed is that Creative Commons licensed content is increasingly becoming a high quality source of media. Enjoy!

Posted in Application Development | Tagged multimedia, software | Comments Off

Tech Ignite Presentation

by Laurence Posted on September 17, 2011

The Software Association of Oregon (SAO) held its first ever Tech Ignite event last Thursday, September 15th, 2011. My presentation “Building an ePortfolio system for NASA” was accepted! The event was taped and you can watch me in action here:

Direct link to video: http://youtu.be/3NXHBtbvNNQ

The venue was McMenamin’s Mission Theater. I had never been there. It is a cozy medium sized theater that seats about 300. I think about 240 people showed up for this event. I had never spoken ‘under the lights’ in front of so many people. It was a blast! It was a privilege to share the ideas behind our project with so many intelligent people in the audience. At the University we are working on making an ePortfolio system that promotes a positive learning cycle.

About Ignite Talks:

Ignite events were originally introduced in 2006 in Seattle by O’Reilly. An Ignite talk is a fast paced presentation that lasts exactly 5 minutes. The idea is similar to TED Talks. The event is usually focused on a particular subject area. Presenters must provide exactly 20 slides. The slides advance every 15 seconds on an automatic timer. This means the presenter is not in control of the progression. This makes it extra challenging to make smooth transitions. A successful Ignite talk requires an understanding of the audience and an economical use of words and visuals. The goal is to convey something useful, funny, and interesting that will be remembered – all in just 5 minutes. For more Ignite Talks visit the O’Reilly page.

Posted in Science and Math, Work | Tagged leadership, multimedia, software | Comments Off

CSS3 HTML5 Web Based Tools

by Laurence Posted on August 15, 2011

This post covers a few fun and useful browser based ‘CSS builder’ tools. They generate CSS3 effects on the fly and take the headache out of making them work across modern browsers. Check out http://caniuse.com/ for a complete source of information on which browsers support CSS3/HTML5 features. Alas, it looks like IE 9 is having trouble with the gradient and transition effects. Recent versions of Firefox, Chrome, and Safari should not have any issues.

On new projects or upgrades where I don’t have to worry about supporting old browsers it makes for a fresh look. The main thing I like about all the tools mentioned is, no image files are needed. All the visual effects are done with CSS. I suggest you use them sparingly, unlike how I am showing them off in this post. The worst thing a screen can do to a user is improperly draw the user’s eye away from where it should be looking. This usually happens when an engineer builds a UI – leave the UI to the artistic designers!

CSS driven gradients:
http://www.colorzilla.com/gradient-editor/
This tool generates the CSS for colored gradients such as:

A rainbow, all done with CSS.

/*  See why you want to generate this with a tool?  */
background: rgb(175,72,156);
background: -moz-linear-gradient(left, rgba(175,72,156,1) 0%, rgba(40,65,175,1) 23%, rgba(114,170,0,1) 47%, rgba(230,247,0,1) 74%, rgba(255,58,58,1) 100%);
background: -webkit-gradient(linear, left top, right top, color-stop(0%,rgba(175,72,156,1)), color-stop(23%,rgba(40,65,175,1)), color-stop(47%,rgba(114,170,0,1)), color-stop(74%,rgba(230,247,0,1)), color-stop(100%,rgba(255,58,58,1)));
background: -webkit-linear-gradient(left, rgba(175,72,156,1) 0%,rgba(40,65,175,1) 23%,rgba(114,170,0,1) 47%,rgba(230,247,0,1) 74%,rgba(255,58,58,1) 100%);
background: -o-linear-gradient(left, rgba(175,72,156,1) 0%,rgba(40,65,175,1) 23%,rgba(114,170,0,1) 47%,rgba(230,247,0,1) 74%,rgba(255,58,58,1) 100%);
background: -ms-linear-gradient(left, rgba(175,72,156,1) 0%,rgba(40,65,175,1) 23%,rgba(114,170,0,1) 47%,rgba(230,247,0,1) 74%,rgba(255,58,58,1) 100%);
filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#af489c', endColorstr='#ff3a3a',GradientType=1 );
/* the standard (unprefixed) syntax names the direction as "to right" */
background: linear-gradient(to right, rgba(175,72,156,1) 0%,rgba(40,65,175,1) 23%,rgba(114,170,0,1) 47%,rgba(230,247,0,1) 74%,rgba(255,58,58,1) 100%);

Rounded Corners: http://cssround.com/ http://www.css3.info/preview/rounded-border/ (Not a tool but a very in depth article.)

No Images On These Corners

/* can be as simple as: */
-moz-border-radius: 10px;
-webkit-border-radius: 10px;
border-radius: 10px;
/* the example above: */
-moz-border-radius: 10px 35px 10px 35px;
-webkit-border-radius: 10px 35px 10px 35px;
border-radius: 10px 35px 10px 35px;

CSS3Generator has over a dozen effects to play with: http://css3generator.com/ Most of these are basic but the shadow and transform examples are useful.

Now Includes Drop Shadow!

-webkit-box-shadow: 3px 3px 3px 3px #333333;
-moz-box-shadow: 3px 3px 3px 3px #333333;
box-shadow: 3px 3px 3px 3px #333333;

Transformations: Transformations can add a nice touch to the UI if done delicately. Here is a tool to experiment with all the switches: http://westciv.com/tools/transforms/index.html The following example, which I created, has a CSS3 transition for the hover event. Mouse over the tabs below and they will animate. Notice how it is nice and subtle.

-webkit-transition: background-color .5s linear, color .5s linear, 
       width .5s ease-out;
-moz-transition: background-color .5s linear , color .5s linear,
       width .5s ease-out;
-o-transition: background-color .5s linear , color .5s linear, 
      width .5s ease-out;
transition: background-color .5s linear , color .5s linear, 
      width .5s ease-out;

Other tools:

CSS Menu Builder:
http://www.cssmenubuilder.com/home
Could save some headaches for building a dynamic menu quickly.

Posted in Application Development | Tagged html5, webdev | Comments Off

Python Strategy Pattern

by Laurence Posted on August 6, 2011

The strategy pattern can be a nice way to improve flexibility when accessing external resources. For example an application might have images referenced in Flickr and a relational database. You want to be able to search both places, but you also want your API to be the same.

There are lots of ways to do this, since in Python functions can be passed around. This StackOverflow page has some details.
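Before getting to the class based versions, here is a minimal sketch of the "functions can be passed around" idea. The function names and the choice of the database finder as the default are hypothetical and only meant to show the shape of it.

```python
def find_in_flickr(image):
    # in reality, query the Flickr API for the image path
    return "Found image in Flickr: " + image


def find_in_database(image):
    # in reality, query the database for the image path
    return "Found image in database: " + image


def find_image(image, strategy=find_in_database):
    """Look up an image using whichever finder function is passed in."""
    return strategy(image)


if __name__ == "__main__":
    print(find_image("chickens", strategy=find_in_flickr))
    print(find_image("dogs"))  # falls back to the default strategy
```

This is the lightest option, but it gives you nowhere obvious to keep state or shared helpers, which is where the class based versions come in.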

I’ll show two approaches here that I think are the easiest to read. The first is a simple interface / abstract class (loosely the same thing in Python due to duck typing) with concrete classes that do the work. The second is a more complex setup that allows for a default strategy and the ability to maintain state on the top level strategy object. The code is here for download: Example 1, Example 2.

Basic Method – Simple Inheritance:

This should look very familiar to Java/C# programmers:

class ImageFinder:
    """ Interface / Abstract Class concept for readability. """

    def find(self, image):
        # explicitly set it up so this can't be called directly
        raise NotImplementedError('Exception raised, ImageFinder is supposed to be an interface / abstract class!')

class ImageFinderFlickr(ImageFinder):
    ''' Locates images in flickr'''

    def find(self, image):
        # in reality, query Flickr API for image path
        return "Found image in Flickr: " + image


class ImageFinderDatabase(ImageFinder):
    ''' Locates images in database. '''
    def find(self, image):
        #in reality, query database for image path
        return "Found image in database: " + image
    
    
    
if __name__ == "__main__" :

    finderBase = ImageFinder()
    finderFlickr = ImageFinderFlickr()
    finderDatabase = ImageFinderDatabase()

    # print() with a single argument runs on both Python 2 and Python 3
    try:
        #this is going to blow up!
        print(finderBase.find('chickens'))
    except NotImplementedError as e:
        print("The following exception was expected:")
        print(e)


    print(finderFlickr.find('chickens'))
    print(finderFlickr.find('rabbits'))
    print(finderDatabase.find('dogs'))
    print(finderDatabase.find('cats'))

Outputs:

$ python strategy_ex1.py
The following exception was expected:
Exception raised, ImageFinder is supposed to be an interface / abstract class!
Found image in Flickr: chickens
Found image in Flickr: rabbits
Found image in database: dogs
Found image in database: cats

Secondary Method – Default Strategy with Stateful Object:

This approach allows us to track the number of times the method has been called. Granted this is a trivial example but it demonstrates one way to go beyond the simple example above.

class ImageFinder(object):
    """ 
    In this example the base object ImageFinder keeps a copy
    of the concrete class (strategy).  You may also set
    a default strategy to use which might be convenient.
    In this case it is set to None which forces the caller
    to supply a concrete class.
        
    The concrete find method is supplied with an instance of
    this object so its state can be tracked.
    """
    
    def __init__(self, strategy=None):
        self.action = None
        self.count = 0
        if strategy:
            #get a handle to the object
            self.action = strategy()
    
    def find(self, image):
        if self.action:
            self.count += 1
            return self.action.find(image, self)
        else: 
            raise UnboundLocalError('Exception raised, no strategyClass supplied to ImageFinder!')

class ImageFinderFlickr(object):
    ''' Locates images in Flickr. '''

    def find(self, image, instance):
        # in reality, query Flickr API for image path
        return "Found image in Flickr: " + image + ", search #" + str(instance.count)


class ImageFinderDatabase(object):
    ''' Locates images in database. '''
    def find(self, image, instance):
        #in reality, query database for image path
        return "Found image in database: " + image + ", search #" + str(instance.count)
    
    
if __name__ == "__main__" :

    finderBase = ImageFinder()
    #these next two look a little convoluted don't they?
    #usage is a little more verbose in example 2 vs example 1
    #however, benefits include a default strategy type, and ability to track state
    finderFlickr = ImageFinder(strategy=ImageFinderFlickr)
    finderDatabase = ImageFinder(strategy=ImageFinderDatabase)

    # print() with a single argument runs on both Python 2 and Python 3
    try:
        #this is going to blow up!
        print(finderBase.find('chickens'))
    except Exception as e:
        print("The following exception was expected:")
        print(e)


    print(finderFlickr.find('chickens'))
    print(finderFlickr.find('bugs bunny'))
    print(finderFlickr.find('tweety'))
    print(finderDatabase.find('dogs'))
    print(finderDatabase.find('cats'))
    print(finderDatabase.find('rabbits'))

Outputs:

$ python strategy_ex2.py
The following exception was expected:
Exception raised, no strategyClass supplied to ImageFinder!
Found image in Flickr: chickens, search #1
Found image in Flickr: bugs bunny, search #2
Found image in Flickr: tweety, search #3
Found image in database: dogs, search #1
Found image in database: cats, search #2
Found image in database: rabbits, search #3

To make the Flickr strategy default:

#change: def __init__(self, strategy=None):
#to
def __init__(self, strategy=ImageFinderFlickr):

Other Thoughts:

Even if you know there will only be one implementation for the foreseeable future, it can be helpful to create a ‘Fake’ implementation that returns hard coded test data. This also gets you thinking in terms of interfaces, which is a good thing. The fake implementation is only wired up temporarily. This is useful if different team members are working on different aspects of the system. Once the interface and return type are agreed upon team members may work on the back end and front end without interfering with each other. A fake implementation can also be used to show the customer something quickly e.g. – “This is just test data, but here is how it will work.”
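A fake implementation in the style of Example 1 might look something like this. The class name, the canned paths, and the placeholder fallback are all invented for illustration; the only thing that matters is that it exposes the same find(image) method as the real finders.

```python
class ImageFinderFake(object):
    """Stand-in finder that returns hard coded test data."""

    # Invented test data; swap in whatever the front end needs to show.
    CANNED_RESULTS = {
        "chickens": "/test-data/chickens.jpg",
        "dogs": "/test-data/dogs.jpg",
    }

    def find(self, image):
        # Unknown names fall back to a placeholder so demos never break.
        path = self.CANNED_RESULTS.get(image, "/test-data/placeholder.jpg")
        return "Found image in fake store: " + path


if __name__ == "__main__":
    finder = ImageFinderFake()
    print(finder.find("chickens"))
    print(finder.find("unicorns"))
```

Because the calling code only depends on find(), swapping the fake for the Flickr or database finder later is a one line change.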

If you would like to download the examples in this article: Example 1, Example 2.

Posted in Code | Tagged architecture, python | Comments Off

MySQL Maintenance Tasks for InnoDB with MySQL 5.1

by Laurence Posted on July 22, 2011

From time to time, MySQL 5.1 databases need a little housekeeping. We found our production DB had a hard time running a simple join query between two tables with about 400k rows. It was taking between 30 and 100 seconds to run. On QA however, it was taking 58 milliseconds. The columns involved were already indexed. Thankfully it wasn’t impacting our users, but it still bugged me. The solution was simple, just run some cleanup commands. After the cleanup, on the live server the same query took just 4.8 milliseconds – that’s more like it!

Summary of solution:

Backup database
Check
Optimize
Analyze

$ mysqldump -u root -p --create-options --routines --triggers dbname > ./db.dmp
# note these cause LOCKS, so be careful on your production server!
$ mysqlcheck -u root -p --check --databases dbname
$ mysqlcheck -u root -p --optimize --databases dbname
$ mysqlcheck -u root -p --analyze --databases dbname
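If you find yourself running these four steps regularly, they can be wrapped in a small script. The following Python 3 sketch builds the same commands shown above and runs them in order, stopping at the first failure. The helper names are my own invention, and it assumes the mysqldump and mysqlcheck binaries are on the PATH and that it is run from a terminal so the -p password prompts can be answered.

```python
import subprocess


def build_maintenance_commands(dbname, user="root", dump_path="./db.dmp"):
    """Return the backup, check, optimize, analyze commands in run order."""
    dump = ["mysqldump", "-u", user, "-p", "--create-options",
            "--routines", "--triggers", dbname]
    checks = [
        ["mysqlcheck", "-u", user, "-p", "--" + action, "--databases", dbname]
        for action in ("check", "optimize", "analyze")
    ]
    return dump, dump_path, checks


def run_maintenance(dbname, user="root", dump_path="./db.dmp"):
    """Run each step in order; a failing step raises CalledProcessError."""
    dump, dump_path, checks = build_maintenance_commands(dbname, user, dump_path)
    with open(dump_path, "wb") as out:
        subprocess.check_call(dump, stdout=out)   # backup first
    for cmd in checks:
        subprocess.check_call(cmd)                # these lock tables, be careful


if __name__ == "__main__":
    # Print the commands only; call run_maintenance("dbname") to execute them.
    dump, path, checks = build_maintenance_commands("dbname")
    print(" ".join(dump) + " > " + path)
    for cmd in checks:
        print(" ".join(cmd))
```

Taking the backup first means a failed dump stops the script before anything touches the tables.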

Complete details about each step:

1) First make database backup with mysqldump:
Don’t forget the argument --routines if you have stored procedures or functions and --triggers if using triggers:

$ mysqldump -u root -p --create-options --routines --triggers dbname > ./db.dmp
# copy to another server
$ scp ./db.dmp user@somehost:~/

For bonus points, actually restore the database on another system to make sure you have a valid backup.

This step may be impractical if the database is huge. In that case you are probably already using replication and have a backup system worked out.

2) Check:
Checks table for integrity errors.
http://dev.mysql.com/doc/refman/5.1/en/check-table.html

To check a single table:

mysql> CHECK TABLE {table name};

To check all tables in a database, from command line:

$ mysqlcheck -u root -p --check --databases dbname

This seems like a really smart thing to do on a regular basis.

3) Optimize:

Like a defrag operation, the optimize tables command reclaims unused space. At least, that is what it does for MyISAM. With InnoDB it basically runs an ALTER TABLE statement that changes nothing but tells MySQL to rebuild the table and its indexes.
http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html

To optimize a single table:

mysql> OPTIMIZE TABLE {table name};

To optimize all tables in a database, from command line:

$ mysqlcheck -u root -p --optimize --databases dbname

If you get “Table does not support optimize, doing recreate + analyze instead”, that is normal for InnoDB.

4) Analyze:

Analyze refreshes the index statistics that the query optimizer relies on; specifically, it analyzes and stores the key distribution for a table. If you have a slow running query but indexes are in place, it might be time to run this. A read lock goes into effect while this is running. If you have only InnoDB tables, this is already taken care of by Optimize.

http://dev.mysql.com/doc/refman/5.1/en/analyze-table.html

To analyze a single table:

mysql> ANALYZE TABLE {table name};

To analyze all tables in a database, from command line:

$ mysqlcheck -u root -p --analyze --databases dbname

With InnoDB and ANALYZE TABLE, there are some oddities. In particular, the number of samples the analyzer takes can vary (configuration option is innodb_stats_sample_pages). The default is low, and this means running analyze tables repeatedly will produce slightly different results.

Read here for more information:
http://dev.mysql.com/doc/refman/5.1/en/innodb-restrictions.html

Posted in Data, Sys Admin | Tagged linux, scripting | Comments Off

Memory Based Simultaneous User Limit

by Laurence Posted on July 14, 2011

As a follow on to my last post about PHP memory consumption, I wanted to get some ideas out there about memory utilization. This post explores:

An equation for the maximum number of users an application can support on a given server.
What can happen when the maximum number of users is exceeded.
How memory consumption impacts the cost of scaling to huge numbers of users.

Rough equation for simultaneous user limit based on memory:

(Memory Available / Memory Needed Per Request) = Memory Based Simultaneous User Limit

Memory Available != Total Memory. The operating system and other processes consume memory too. Even though a server might have 2GB of RAM, maybe only 60% of that is available to the application when everything is idle. The same idea applies to the JVM where classes and singletons occupy memory permanently, leaving a portion of the memory allocated to the JVM available to process incoming requests.
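To put numbers on the equation, here is a quick sketch using the 2GB server with 60% available from the paragraph above. The per request sizes of 4MB, 8MB, and 32MB are assumed values for illustration, and the function name is my own.

```python
def simultaneous_user_limit(total_mb, available_fraction, mb_per_request):
    """Memory Available / Memory Needed Per Request, rounded down."""
    available_mb = total_mb * available_fraction
    return int(available_mb // mb_per_request)


if __name__ == "__main__":
    # 2GB server with 60% of RAM free for the application;
    # the per request sizes are assumed values for illustration.
    for mb_per_request in (4, 8, 32):
        limit = simultaneous_user_limit(2048, 0.60, mb_per_request)
        print("%2d MB per request -> about %d simultaneous users"
              % (mb_per_request, limit))
```

Rounding down is deliberate: a fraction of a request still needs a whole request's worth of memory.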

What happens when the limit is reached?

Keep in mind this is simultaneous users all hitting the site within say 20ms of each other.

The worst case is, free memory will be exhausted and the operating system will start to swap to disk. In that situation performance will degrade and all users will have to wait for the server to catch up. Eventually the server will completely crash. All users will be impacted negatively.
A better setup is to configure the application with a maximum user limit. When the limit is reached, new visitors will get a ‘page not available’ warning of some kind. This is better than crashing, and you can weather the rare surge of traffic that exceeds the limit.
The best case is to anticipate this limit, monitor the system constantly, and always stay comfortably ahead of maximum capacity limits. This is what you pay your top notch systems administrators to keep tabs on.
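The middle option, a hard cap on simultaneous requests, can be sketched in a few lines. This is only an illustration of the idea and not how any particular web server or framework implements it; the RequestLimiter class and the 503 style response are assumptions of mine.

```python
import threading


class RequestLimiter(object):
    """Reject work beyond a fixed number of in-flight requests."""

    def __init__(self, max_simultaneous):
        self._slots = threading.BoundedSemaphore(max_simultaneous)

    def handle(self, work):
        # Non-blocking acquire: if no slot is free, turn the visitor away
        # immediately instead of queueing and using even more memory.
        if not self._slots.acquire(False):
            return 503, "Page not available, please try again shortly."
        try:
            return 200, work()
        finally:
            self._slots.release()


if __name__ == "__main__":
    limiter = RequestLimiter(max_simultaneous=1)
    # The outer request holds the only slot, so the nested one is rejected.
    print(limiter.handle(lambda: limiter.handle(lambda: "inner page")))
    print(limiter.handle(lambda: "page"))
```

In practice the limit would come from the memory equation above, and most web servers expose an equivalent setting so you rarely need to write this yourself.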

This is not a purely technical decision. Some business models do not tolerate downtime. Other business leaders love taking risk and are not satisfied until something breaks. The business leaders on the team should be made aware of the issue and decide which approach they want.

The cost of scaling:

If your site is low traffic (less than 50 simultaneous users), then memory consumption probably doesn’t matter much. Using a framework and leveraging plug-ins saves a lot of development time. Down the road it would be possible to turn on caching or employ other techniques to manage additional traffic.

However, if the site needs to scale to millions of users, memory utilization should be at the top of your list. Scaling cheaply is in part about using memory effectively. At a certain point it becomes worth it to refactor to light weight frameworks, which might take more up front development time, but the site will perform better.

As a very simple example, let’s consider three sites, A, B, and C, that use 4MB, 8MB, and 32MB on average per page request. Based on the findings in my previous post it might be that site A uses FatFree, site B uses Symfony, and site C uses Drupal.

The AWS cost calculator reports that one large EC2 instance costs about $250/month. Site A has a huge competitive advantage over B and C when it comes to scaling. As an example, let’s say site A is reasonably successful and needs 10 servers to meet demand. If memory is the limiting factor, an equivalent amount of traffic would require 20 servers for site B and 80 servers for site C. In this example, the monthly server costs for A, B, and C are $2,500, $5,000, and $20,000 respectively. The difference between A and C is $17,500 per month. For a small business, that’s a fair amount of money. Now imagine the traffic continues to grow, say ten times. The difference between A and C is now $175,000 per month.

Did C paint themselves into a corner? Perhaps not. I don’t think comparing FatFree to Drupal is really that fair, since they are different animals built for different purposes. Time to market is extremely important. Not to mention, how many sites really get 10,000 simultaneous users? It is important to be aware of the difference and use each tool for its best purpose.
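
The server math for sites A, B, and C can be sketched in a few lines of PHP. The figures come from the example itself. The helper name `monthly_cost` is hypothetical, and it assumes the server count scales in direct proportion to memory per request, which only holds when memory is the bottleneck:

```php
<?php
// Figures from the example: memory per request for each site,
// 10 servers needed by site A, and roughly $250/month per large EC2 instance.
$memoryPerRequestMb = array('A' => 4, 'B' => 8, 'C' => 32);
$baselineServers = 10;   // servers site A needs
$costPerServer = 250;    // USD per month

// Hypothetical helper: assumes servers scale linearly with memory per request.
function monthly_cost($perRequestMb, $baselineMb, $baselineServers, $costPerServer) {
    $servers = (int) ceil($baselineServers * $perRequestMb / $baselineMb);
    return array('servers' => $servers, 'cost' => $servers * $costPerServer);
}

foreach ($memoryPerRequestMb as $site => $mb) {
    $r = monthly_cost($mb, $memoryPerRequestMb['A'], $baselineServers, $costPerServer);
    printf("Site %s: %d servers, %d USD/month\n", $site, $r['servers'], $r['cost']);
}
```

Changing `$baselineServers` to 100 reproduces the "ten times" scenario, where the gap between A and C grows to $175,000 per month.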

Memory is just one variable:

Focusing just on memory oversimplifies the picture. The real user limit could be lower depending on CPU, disk I/O (especially in the cloud), or other factors, like a user triggering a huge report! Request processing time is also important to pay attention to. Response time and bounce rate are positively correlated: the longer a page takes to load, the more likely users are to leave the site. I have worked on applications where the response time had to be below one second. This influences the architecture, to say the least. The only real way to get to the bottom of all this is to run load tests and analyze the data.

Posted in Application Development, Sys Admin | Tagged architecture, software, startup | Comments Off

Memory Usage in PHP Frameworks

by Laurence Posted on June 14, 2011

I ran some tests across different PHP frameworks to see how much memory a single page request uses (FatFree, Symfony 1.0x, WordPress, and Drupal). I also asked some colleagues to share what they got. This is based on PHP 5.2 and 5.3 on a mix of Linux, Windows, and Mac. In other words, the results are generalized. A trend is definitely there, but individual results may vary.

Here is a summary of what I found:

PHP by itself uses about 256 KB just to spin up and process “hello world”.
FatFree (http://fatfree.sourceforge.net/), a lightweight framework, uses just 1MB-3MB to process a basic page. This framework has a lot of potential. It does not taste like the other fat free products I’ve tried.
Symfony 1.0x (http://www.symfony-project.org/) uses 8-14MB per page request, depending on how complex the page is. My basic testing of Symfony 1.4 with a smaller test application showed similar results (7-9MB locally).
WordPress weighs in between 21MB and 33MB depending on how many plug-ins are running.
- WordPress 3.1 with minimal plug-ins and Platform theme (this blog) = 23.0 MB
- WordPress 3.1 with minimal plug-ins and custom lightweight theme (friend’s blog) = 20.8 MB
- WordPress 3.1 with plug-ins disabled and Platform theme (this blog) = 21.0 MB
- WordPress 3.1 with many plug-ins = 32.7 MB
Drupal gets the heavyweight belt coming in between 37.5MB and 76MB!
- Drupal 6 with a BUNCH of modules = 76MB
- Drupal 5 with a more limited collection of modules = 37.5MB
- Drupal 6 with a limited collection of modules = 43MB

The PHP minor version, operating system, and configuration did account for some differences, but nothing close to an order of magnitude. Most of the time the difference was about 5%. I did see one case where memory usage was consistently about 20% lower on a workstation than on the QA server. I believe these differences were due to which extensions were turned on in php.ini and to PHP 5.2 vs. 5.3.

Not PHP related, but a couple of our apps use a Django admin backend. They consume about 40MB per request. They are used only by our staff and get very low traffic. In this case we are running Python 2.6, Django 1.2.5, Apache 2.2, and mod_wsgi.

Find out how much memory your PHP application is using:

The function is memory_get_peak_usage().
http://www.php.net/manual/en/function.memory-get-peak-usage.php
I tried it and found the output is not so friendly: it returns a raw byte count.

xelozz contributed a useful snippet, which I added to the footer of the templates in each site:

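<?php
// Format a byte count as a human-readable string, e.g. 1536 -> "1.5 kb".
function convert($size) {
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    if ($size < 1) {
        return '0 b'; // log(0) is undefined, so handle empty sizes up front
    }
    // Pick the largest unit that fits, capped at the last entry in $unit.
    $i = min((int) floor(log($size, 1024)), count($unit) - 1);
    return round($size / pow(1024, $i), 2) . ' ' . $unit[$i];
}
// true = report memory allocated from the system, not just what emalloc() used
echo convert(memory_get_peak_usage(true));
?>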
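
Echoing into a template footer works, but it changes the page output. One alternative, sketched here as an assumption rather than what was used for these tests, is to write the peak to the error log once the request finishes. The names `peak_memory_line` and `log_peak_memory` and the log line format are hypothetical; `register_shutdown_function()`, `error_log()`, and `memory_get_peak_usage()` are standard PHP:

```php
<?php
// Hypothetical helper: build a log line with the peak memory in MB.
function peak_memory_line($bytes, $uri) {
    return sprintf('peak memory %.2f MB for %s', $bytes / 1048576, $uri);
}

// Runs after the script finishes, so the peak covers the whole request.
function log_peak_memory() {
    // REQUEST_URI is not set when running from the command line.
    $uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : 'cli';
    error_log(peak_memory_line(memory_get_peak_usage(true), $uri));
}

register_shutdown_function('log_peak_memory');
```

Including this once in a bootstrap file would record every request without touching the templates, which makes it easier to compare pages across a site.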

I’d like to run a contest to see which site out there uses the most memory per page request?!

Posted in Application Development, Sys Admin | Tagged architecture, software | Comments Off

Android Float Picker Widget Launched

by Laurence Posted on June 3, 2011

Made my first ever open source release tonight!

Android Float Picker Widget
Float Picker Widget is an open source Android-based widget for picking a number using plus and minus buttons, no typing required. It could be used for picking specific scientific values (pH, temperature), AM/FM radio stations, or a quantity of items to purchase in lots. It has several configuration options.

The source is up on BitBucket and includes a working demo activity.

Here is a screen shot of the demo:

Android Float Picker Widget screen shot

Released it under the Apache 2.0 license. Turns out that to do that correctly you have to add a LICENSE file and a NOTICE file, then add the license notice header to every single source file in your project. What a pain, but I wanted it done right from the start.

Hopefully somebody finds it useful. It feels good to give back to the community and get my hands dirty with Android at the same time. My prediction is a future version of the Android API will have a similar widget. I did my own because I couldn’t find anything that did floating point / decimal values.

I will say this, knowing I was going to commit to an open source release really focused my mind.

Posted in Application Development, Code | Tagged android, software | Comments Off
