How To Make Python Code Run on the GPU

As a software developer I want to be able to designate certain code to run on the GPU so it can execute in parallel. Specifically, this post demonstrates how to use Python 3.9 to run code on the GPU of a MacBook Pro with the Apple M1 Pro chip.

Tasks suited to a GPU are things like:

  • summarizing values in an array (map / reduce)
  • matrix multiplication, array operations
  • image processing (images are arrays of pixels)
  • machine learning which uses a combination of the above

To use the GPU I’ve chosen to render the Mandelbrot set. This post will also compare the performance of my MacBook Pro’s CPU vs its GPU. Complete code for this project is available on GitHub so you can try it yourself.

Writing Code To Run on the GPU:

In Python, running code on the GPU is not a native feature. A popular library for this is TensorFlow 2.14, and as of October 2023 it works with the MacBook Pro M1 GPU hardware. Even though TensorFlow is designed for machine learning, it offers some basic array manipulation functions that take advantage of GPU parallelization. To make the GPU work you need to install the TensorFlow-Metal package provided by Apple. Without it you are stuck in CPU land, even with the TensorFlow package installed.

Programming in TensorFlow (and GPU libraries in general) requires thinking a bit differently than conventional “procedural logic”. Instead of working on one unit at a time, TensorFlow works on all elements at once. Lists of data need to be kept in special Tensor objects (which accept numpy arrays as inputs). Operations like add, subtract, and multiply are overloaded on Tensors. Behind the scenes, when you add/subtract/multiply Tensors the data is broken into smaller chunks and the work is farmed out to the GPU's many cores in parallel. There is overhead to do this though, and the CPU bears the brunt of it. If your data set is small, the GPU approach will actually run slower. As the data set grows the GPU eventually proves much more efficient and makes tasks possible that were previously infeasible with the CPU alone.
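As a small illustration (my own example, not from the project), here is roughly what that elementwise style looks like once TensorFlow and tensorflow-metal are installed:

import numpy as np
import tensorflow as tf

# two one-million-element arrays, loaded into Tensors
a = tf.constant(np.random.rand(1_000_000), dtype=tf.float32)
b = tf.constant(np.random.rand(1_000_000), dtype=tf.float32)

# one statement operates on every element at once; TensorFlow decides how to
# split the work across the available device (the GPU, if one is visible)
c = a * b + 2.0

# a map/reduce style summary of the result
print(float(tf.reduce_sum(c)))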

How do you know your GPU is being used?

To view your CPU and GPU usage, open Activity Monitor, then Window -> GPU History (command 4), and then Window -> CPU History (command 3).
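Another quick sanity check is to ask TensorFlow itself whether it can see the GPU (assuming TensorFlow and tensorflow-metal are installed):

import tensorflow as tf

# on an M1 Pro with tensorflow-metal installed this should list one GPU device,
# e.g. [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print(tf.config.list_physical_devices('GPU'))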

Run the script in step 4 of the TensorFlow-Metal instructions which fires up a bunch of Tensors and builds a basic machine learning model using test data.

In your GPU history window you should see it maxing out like so:

M1 Pro GPU activity

The Code for Mandelbrot:

The Mandelbrot set is a curious mathematical discovery from 1978.  The Wikipedia article has a great description of how it works. Basically it involves checking every point in a Cartesian coordinate system to see if the value of that point is stable or diverges to infinity when fed into a “simple” equation. It happens to involve complex numbers (which have an imaginary component, and the Y values supply that portion), but Python code handles that just fine. What you get when you graph it is a beautiful / spooky image that is fractal in nature. You can keep zooming in on certain parts of it and it will keep revealing smaller copies of the larger structure, down as far as a computer can take it.

Full view of the Mandelbrot set, generated by the code in this project:

Mandelbrot large view

Here is the naive “procedural” way to build the Mandelbrot set. Note that it calculates each pixel one by one.

def mandelbrot_score(c: complex, max_iterations: int) -> float:
    """
    Computes the mandelbrot score for a given complex number.
    Each pixel in the mandelbrot grid has a c value determined by x + 1j*y   (1j is Python's notation for sqrt(-1))

    :param c: the complex number to test
    :param max_iterations: how many times to crunch the z value (z ** 2 + c)
    :return: 1 if the c value is stable, or a value 0 <= x < 1 that tells how quickly it diverged
            (lower means it diverged faster).
    """
    z = 0
    for i in range(max_iterations):
        z = z ** 2 + c
        if abs(z) > 4:
            # after it gets past abs > 4, assume it is going to infinity
            # return how soon it started spiking relative to max_iterations
            return i / max_iterations

    # c value is stable
    return 1

import numpy as np

# below is a simplified version of the logic used in the repo's MandelbrotCPUBasic class:

# setup a numpy array grid of pixels
pixels = np.zeros((500, 500))

# compute the divergence value for each pixel
for y in range(500):
    for x in range(500):
        # map the pixel coordinates onto the region of the complex plane we want to view
        # (the repo uses X_START / X_STEP style constants for this)
        real = -2.0 + (x / 500) * 3.0       # real axis runs from -2 to 1
        imag = -1.5 + (y / 500) * 3.0       # imaginary axis runs from -1.5 to 1.5

        # compute the 'constant' for this pixel
        c = real + 1j*imag

        # get the divergence score for this pixel
        score = mandelbrot_score(c, 50)

        # save the score in the pixel grid
        pixels[y][x] = score

 

Here is the TensorFlow 2.x way to do it. Note that it operates on all values at once in the first line of the tensor_flow_step function, and returns the input values back to the calling loop.

def tensor_flow_step(self, c_vals_, z_vals_, divergence_scores_):
    """
    The processing step for compute_mandelbrot_tensor_flow(),
    computes all pixels at once.

    :param c_vals_: array of complex values for each coordinate
    :param z_vals_: z value of each coordinate, starts at 0 and is recomputed each step
    :param divergence_scores_: the number of iterations taken before divergence for each pixel
    :return: the updated inputs
    """

    z_vals_ = z_vals_*z_vals_ + c_vals_

    # find z-values that have not diverged, and increment those elements only
    not_diverged = tf.abs(z_vals_) < 4
    divergence_scores_ = tf.add(divergence_scores_, tf.cast(not_diverged, tf.float32))

    return c_vals_, z_vals_, divergence_scores_

def compute(self, device='/GPU:0'):
    """
    Computes the mandelbrot set using TensorFlow
    :return: array of pixels, value is divergence score 0 - 255
    """
    with tf.device(device):

        # build x and y grids
        y_grid, x_grid = np.mgrid[self.Y_START:self.Y_END:self.Y_STEP, self.X_START:self.X_END:self.X_STEP]

        # compute all the constants for each pixel, and load into a tensor
        pixel_constants = x_grid + 1j*y_grid
        c_vals = tf.constant(pixel_constants.astype(np.complex64))

        # setup a tensor grid of z values initialized at zero
        # this holds the iterated z value for each pixel and is recomputed each step
        z_vals = tf.zeros_like(c_vals)

        # store the number of iterations taken before divergence for each pixel
        divergence_scores = tf.Variable(tf.zeros_like(c_vals, tf.float32))

        # process each pixel simultaneously using tensor flow
        for n in range(self.MANDELBROT_MAX_ITERATIONS):
            c_vals, z_vals, divergence_scores = self.tensor_flow_step(c_vals, z_vals, divergence_scores)
            self.console_progress(n, self.MANDELBROT_MAX_ITERATIONS - 1)

        # normalize score values to a 0 - 255 value
        pixels_tf = np.array(divergence_scores)
        pixels_tf = 255 * pixels_tf / self.MANDELBROT_MAX_ITERATIONS

        return pixels_tf

Results:

Here are the results of generating Mandelbrot images of varying sizes with TensorFlow, using the CPU vs the GPU. Note the TensorFlow code is exactly the same; I just forced it to use the CPU or GPU with the with tf.device() context shown above.

Time to Generate Mandelbrot at various resolutions CPU vs GPU

Between TensorFlow GPU and CPU, we can see they are about the same until 5000 x 5000. Then at 10000 x 10000 the GPU takes a small lead. At 15000 x 15000 the GPU is almost twice as fast! This shows how the marshalling of resources from the CPU to the GPU adds overhead, but once the data set is large enough the data processing portion of the task outweighs the extra cost of using the GPU.
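For reference, a comparison like this can be driven by calling compute() with different device strings and timing each run. Here is a minimal sketch (the MandelbrotTensorFlow class name is assumed; the repo's actual benchmark code may differ):

import time

mandelbrot = MandelbrotTensorFlow()  # assumed wrapper class containing the compute() method shown above

for device in ('/CPU:0', '/GPU:0'):
    start = time.perf_counter()
    pixels = mandelbrot.compute(device=device)
    elapsed = time.perf_counter() - start
    print(f'{device}: {elapsed:.2f} seconds for a {pixels.shape} image')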

Details about these results:

  • Date: 10/29/2023
  • MacBook Pro (16-inch, 2021)
  • Chip: Apple M1 Pro
  • Memory: 16GB
  • macOS 12.7
  • Python 3.9.9
  • numpy 1.24.3
  • tensorflow 2.14.0
  • tensorflow-metal 1.1.0

Alg / Device Type | Image Size | Time (seconds)
CPU Basic | 500×500 | 0.484236
CPU Basic | 2500×2500 | 12.377721
CPU Basic | 5000×5000 | 47.234169
TensorFlow GPU | 500×500 | 0.372497
TensorFlow GPU | 2500×2500 | 2.682249
TensorFlow GPU | 5000×5000 | 13.176994
TensorFlow GPU | 10000×10000 | 42.316472
TensorFlow GPU | 15000×15000 | 170.987643
TensorFlow CPU | 500×500 | 0.265922
TensorFlow CPU | 2500×2500 | 2.552139
TensorFlow CPU | 5000×5000 | 12.820812
TensorFlow CPU | 10000×10000 | 46.460504
TensorFlow CPU | 15000×15000 | 328.967006

Note: with the CPU Basic algorithm, I gave up after 5000 x 5000 because the 10000 x 10000 image was going super slow and the point was well proven that TensorFlow’s implementation is much faster.

Curious how it will work on your hardware? Why not give it a try? Code for this project is available on GitHub.

Other thoughts about running Python code on the GPU:

Another project worth mentioning is PyOpenCL. It wraps OpenCL which is a framework for writing functions that execute against different devices (including GPUs). OpenCL requires a compatible driver provided by the GPU manufacturer in order to work (think AMD, Nvidia, Intel).

I actually tried getting PyOpenCL working on my Mac, but it turns out OpenCL is no longer supported by Apple. I also came across references to CUDA, which is similar to OpenCL but a bit more mature, except it is for Nvidia GPUs only. If you happen to have an Nvidia graphics card you could try using PyCUDA.

CUDA and OpenCL are to GPU parallel processing as DirectX and OpenGL are to graphics. CUDA, like DirectX, is proprietary but very powerful, while OpenCL and OpenGL are “open” in nature but lack certain built-in features. Unfortunately on MacBook Pros with M1 chips, neither of those is an option. TensorFlow was the only option I could see as of October 2023.  There is a lot of outdated information online about using PyOpenCL on Mac, but it was all a dead end when I tried to get it running.



Visualizing Relationships in ChatGPT

The other day I asked ChatGPT for some recommendations for new piano pieces based on one I had just finished. To my astonishment the list it provided was pretty good and led me to a new piece I started working on. This got me thinking: it would be fun to build a graph of these recommendations and visualize them. This morphed into a project I open sourced called Visualizing Relationships in ChatGPT.

The idea is to be able to ask ChatGPT for recommendations about various topics and visualize the relationships as a graph. Basically it can peek into ChatGPT’s “head”.  This allows us to see a few interesting things:

  1. What nodes are the most central to a given topic?
  2. Is the network graph it builds fairly redundant (self referencing) or vast and sparse?
  3. Does it make sense to a human or is it just hallucinating?

Topics analyzed:

I coded up a generalized structure so any topic can be quickly explored by implementing a new Conversation class (a rough sketch of the idea follows the list below). Here are the topics I have set up so far:

  • 80sMovies – what 1980’s Movies might I also enjoy?
  • FastFood – what Fast Food Restaurants might I also enjoy?
  • PianoPieces – what are some good Classical Piano Pieces to study that are related/similar?  This was the most challenging of the four to clean up and make consistent (see details in the project readme for more notes).
  • PrescriptionDrugs – what does GPT think people are prescribed in combination the most?
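To give a sense of the structure, here is a hypothetical sketch of a Conversation class (class names and prompts are illustrative only, not the repo's actual code):

class Conversation:
    """Base class: each topic supplies its own prompts (hypothetical sketch)."""
    topic = 'Generic'

    def opening_prompt(self) -> str:
        # ask for an initial list of items for this topic
        raise NotImplementedError

    def follow_up_prompt(self, item: str) -> str:
        # ask for items related to a specific item, which become edges in the graph
        raise NotImplementedError


class EightiesMoviesConversation(Conversation):
    topic = '80sMovies'

    def opening_prompt(self) -> str:
        return ('Tell me the top 20 movies from the 1980s. Provide the response with one '
                'answer per line, no leading numbers or symbols, and no other commentary.')

    def follow_up_prompt(self, item: str) -> str:
        return (f'If I enjoyed the 1980s movie {item}, what other 1980s movies might I also enjoy? '
                'Provide the response with one answer per line, no leading numbers or symbols, '
                'and no other commentary.')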

 

Results for ChatGPT 3.5 Turbo 5/23/2023:

80s Movies:

 

Fast Food Restaurants:

 

Prescription Drugs:

 

Top 10 nodes by centrality:

$ python main.py -command topnodes -topic all
Top 10 nodes found in the topic of 80sMovies:
1. The Terminator
2. Back to the Future
3. Die Hard
4. The Princess Bride
5. Blade Runner
6. Ghostbusters
7. Ferris Bueller's Day Off
8. The Breakfast Club
9. Beverly Hills Cop
10. E.T. the Extra Terrestrial

Top 10 nodes found in the topic of FastFood:
1. Taco Bell
2. KFC
3. Subway
4. Burger King
5. Wendy's
6. McDonald's
7. Popeyes
8. Arby's
9. Panera Bread
10. Hardee's

Top 10 nodes found in the topic of PrescriptionDrugs:
1. Lipitor
2. Crestor
3. Zoloft
4. Synthroid
5. Nexium
6. Plavix
7. Norvasc
8. Zocor
9. Singulair
10. Cymbalta

Top 10 nodes found in the topic of PianoPieces:
1. Clair de Lune - Claude Debussy
2. Moonlight Sonata - Ludwig van Beethoven
3. Prelude in C Major - Johann Sebastian Bach BWV846
4. Sonata in C Major - Wolfgang Amadeus Mozart K545
5. Prelude in E Minor - Frederic Chopin
6. Prelude in D-flat Major Raindrop - Frederic Chopin
7. Sonata in A Major - Wolfgang Amadeus Mozart K331
8. Fur Elise - Ludwig van Beethoven
9. Nocturne in E-flat Major - Frederic Chopin
10. Rhapsody in Blue - George Gershwin
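For the curious, a ranking like the ones above can be reproduced with networkx once the recommendation edges have been harvested. Here is a minimal sketch (the edge list is made up, and this is not the project's actual code):

import networkx as nx

# hypothetical edge list: (item, recommended item) pairs harvested from ChatGPT
edges = [
    ('The Terminator', 'Blade Runner'),
    ('The Terminator', 'RoboCop'),
    ('Back to the Future', 'Ghostbusters'),
]

graph = nx.Graph()
graph.add_edges_from(edges)

# rank nodes by degree centrality and print the top 10
centrality = nx.degree_centrality(graph)
for rank, node in enumerate(sorted(centrality, key=centrality.get, reverse=True)[:10], start=1):
    print(f'{rank}. {node}')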

 

Challenges:

  • Along the way I ran into some challenges in getting ChatGPT to spit out data in a consistent format that is machine parsable. It wants to be chatty and it wants to seem “human”, so there is a lot of variation in its return format. It also varies in how it describes what should end up being the same node in the graph (KFC vs Kentucky Fried Chicken for example).  If you are interested in more details on that see my post on Getting back lists of data from the ChatGPT API.
  • The API is SLOW… it takes about 4 minutes to get the complete data download per topic.
  • NOTE: An OpenAI API key is required, and to get one you have to input a credit card. However, it is pretty affordable at this scale. So far on this entire project with all the test calls and trial and error I’ve spent a grand total of just $0.32.

Next Steps:

When ChatGPT-4 API is enabled for my account I will re-run the program and compare results.

Build out a web front end so conversations can be had on the fly.

ChatGPT doesn’t like to provide long lists of data, but it will provide lists of things 20 at a time, which is enough to build a nice looking graph. In a sense you have to coax it into paginating the output.

My goal at this point was to launch the tool with fairly benign topics. Other fun topics might be:

  • TV shows from 2010-2020 (or other decades)
  • Good places to work
  • Rock bands
  • Popular careers and related careers
  • Programming languages
  • Sci-fi books

Use for AI research / controversial topics:

Many AI researchers have discovered clever ways of tricking ChatGPT into saying inappropriate things or providing instructions on how to harm others (aka “jailbreaking”).  I find many of these pretty hilarious, although the real world implications of an AI teaching us how to harm each other are not so great. For now I stayed away from such topics, but this tool could potentially be used in combination with a jailbreak to extract noteworthy data.

AI researchers are also interested in what biases are inherent in ChatGPT (racial, gender, cultural, etc). This tool could also be used to help uncover biases in its data relationships.

Here are some example controversial topics that could be added later:

  • Street drugs (may take some jail breaking).
    • Hey ChatGPT, what are the top 10 street drugs in use in the USA?  If a person is on {crack}, what other street drugs might they enjoy??
  • Explosives (may take some jail breaking).
    • Hey ChatGPT, what are the top 10 explosives?   If I like {dynamite}, what other explosives might I enjoy??
  • Historical figures (see if they are all white men).
    • Hey ChatGPT, who are the top 10 historical figures during the civil war era? If I like {Abe Lincoln}, who else might you recommend during the civil war era??
  • Famous people (see if it is USA centric or includes global celebrities, people of color, etc).
    • Hey ChatGPT, who are the top 10 celebrities from the 1990s?  if I like {Patrick Stewart}, who else might you recommend from the 1990s??
  • Travel destinations to safe countries (all white Christian countries?).
    • Hey ChatGPT, what are the top 10 travel destinations?   If I like going to {France} where else might you recommend for a vacation??

Getting back lists of data from the ChatGPT API

I’m working on a project where I’m trying to harvest lists of data from ChatGPT via its API. This post applies to the gpt-3.5-turbo model called via the API. I don’t have access to the GPT 4.0 API yet, but I’m on the waitlist.

When I first looked into it, I was excited when I learned that ChatGPT can return JSON (as opposed to regular text). This is really cool when it works. You can even tell it the specific format you want and it will do it.

An example prompt for getting a list of piano pieces:

Tell me the top 10 classical piano pieces played today. Provide the response in JSON only, following this format:
[{
"name": {name of the piece},
"composer": {composer},
"difficulty_level": {difficulty level}
}, ...]

ChatGPT returning JSON

However, in practice when calling the API, it complies about 80% of the time; the other 20% of the time it says something like:

“Sorry, as an AI language model, I am not able to provide a JSON response only.”

This is in spite of the fact that I TOLD IT to return JSON and NOTHING else. So it acts a bit difficult at times, and you never know which one you will get… which seems very non-computer-like, but it is designed to vary its output to seem more human.

For a script like this that makes repeated calls, having the output format alternate between text and JSON is unacceptable.
So the solution was to tell it to return a plain list.

Tell me the top 10 classical piano pieces played today. Provide the response with one answer per line, no leading numbers or symbols, and no other commentary

This approach provides consistent results of simple lists.

ChatGPT list results

This way the format is consistent, but the data still needs cleansing. Sometimes it numbers the items (1., 2., 3…) or adds a hyphen at the start of each line. Sometimes it adds a helpful Clippy-like message before the list (“Sure I can help you with that”), which needs to be filtered out.

Here is a simplified version of code I used, written in Python:

import unicodedata

# excerpted from a class; self refers to the Conversation instance
# (the method name below is illustrative, the original is part of a larger class)
def get_initial_items(self):
    try:
        # this code calls the ChatGPT API
        result = call_open_ai(self.opening_conversation)
    except Exception as e:
        print('Unable to get initial data from OpenAI')
        raise e

    items = list()
    try:
        # get the content of the response from ChatGPT
        raw_answer = result['choices'][0]['message']['content']

        for line in raw_answer.splitlines():
            line = self.clean_line(line)

            if self.skip_line(line):
                continue

            items.append({'name': line})

        if not items:
            print('Nothing came back?')
            print(raw_answer)

        return items

    except Exception as e:
        print('Unable to parse the list from the response')
        print('Raw result was ' + str(result))
        raise e

def clean_line(self, line):
    """
    Cleanup string data from ChatGPT.
    :param line: string from ChatGPT
    :return:
    """

    line = line.strip()

    # clean up the 'bullets' it adds to lists
    if line.startswith('- '):
        line = line[2:]

    # clean up the numbering even though WE TOLD IT NOT TO!
    for i in range(1, 100):
        if line.startswith(str(i) + '. '):
            x = len(str(i)) + 2
            line = line[x:]

    # if the last character is a ' or ", remove it
    if line.endswith('"') or line.endswith("'"):
        line = line[:-1]

    # replace all accents with regular ASCII characters to smooth out variations
    str_normalized = unicodedata.normalize('NFKD', line)
    str_bytes = str_normalized.encode('ASCII', 'ignore')
    line = str_bytes.decode('ASCII')

    return line


def skip_line(self, line):
    """
    Helper method to tell if this line is chatter from ChatGPT that can be ignored.

    :param line: string returned from ChatGPT
    :return:
    """

    # skip empty lines
    if not line:
        return True

    # it is trying to be friendly, but we don't want this garbage in the data
    if line.startswith('Sure, I') or \
            line.startswith('Sure I') or \
            line.startswith('Sure, here') or \
            line.startswith('Sure here') or \
            line.startswith('Here are') or \
            line.startswith('Okay, here') or \
            line.startswith('Okay here'):
        return True

    return False
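The call_open_ai() helper is not shown above. A minimal version of it, using the pre-1.0 openai package's ChatCompletion API that was current at the time, might look like this (the real helper passes a richer conversation structure):

import openai

openai.api_key = 'sk-...'   # your OpenAI API key

def call_open_ai(prompt, model='gpt-3.5-turbo'):
    """Send a single user message and return the raw API response (illustrative only)."""
    return openai.ChatCompletion.create(
        model=model,
        messages=[{'role': 'user', 'content': prompt}],
    )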

One thing I found is if you ask it for too much data, it won’t return a result. You have to trick it into paginating, asking for a chunk at a time.
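Here is a rough sketch of that pagination trick, reusing the helper above (the prompt wording, limits, and cleanup are illustrative only):

def fetch_in_chunks(topic_prompt, total_wanted=60, chunk_size=20):
    """Illustrative sketch: ask for a small chunk at a time until enough items are collected."""
    items = []
    while len(items) < total_wanted:
        already_have = ', '.join(items) if items else 'nothing yet'
        prompt = (f'{topic_prompt} Give me {chunk_size} answers, excluding these: {already_have}. '
                  'Provide the response with one answer per line, no leading numbers or symbols, '
                  'and no other commentary.')
        result = call_open_ai(prompt)
        raw_answer = result['choices'][0]['message']['content']
        # the same clean_line() / skip_line() cleanup shown above would be applied here
        new_items = [line.strip() for line in raw_answer.splitlines() if line.strip()]
        if not new_items:
            break
        items.extend(new_items)
    return items[:total_wanted]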

This is part of an open source project I started called Visualizing Relationships in ChatGPT.  Stay tuned for more posts about my findings.


How to Stop/Start an AWS EC2 instance using AWS CLI command line

In my experience the smaller the AWS EC2 instance the more often it freezes / completely locks up. Something goes wrong inside the AWS infrastructure and poof, no ability to SSH or do anything with it. This is regardless of system load, memory status, the type of application, etc. The frequency of the crash is inversely proportional to the instance size.

  • t3.micro -> 1-8 times a month (free tier instances)
  • t3.medium -> 1-3 times a month (on a commercial Django product I maintain)
  • t2.xlarge -> maybe once a year if ever (another client of mine running a high traffic website)

One “solution” is to log in to the AWS Console and stop / start the instance. That is a pain because 1) I have MFA set up, 2) the AWS console is a beast to navigate.  It can also take time for the instance to stop (15 minutes worst case) or require a force stop, before it can be restarted.

So I decided to figure out how to query the server status with the AWS CLI, and if needed stop/start it from the command line. This procedure is locked down to my IP, so even if the IAM user credentials leaked, it would have to be from my machine. I’m pretty happy with this balance of ease and security.  If your EBS volume is encrypted (which it should be) there are a few important yet obscure things to pay attention to as you proceed with the configuration.

In this guide I’ll show you how I use AWS cli to stop/start an EC2 instance from the command line using an IAM user with the right permissions, including support for encrypted volumes.

1) Install the AWS CLI tool if you haven’t already.

2) In the AWS Console, under IAM, create an IAM user, name it something like ec2-stop-start. Get the new user’s credentials (Access Key ID and Secret Access Key) and save them somewhere safe.

If your EBS Volume is encrypted, find the key it uses under the EC2 -> EBS Volume configuration. If this is a customer managed key, under AWS KMS Console, add the new IAM user to the key under Key Users.

3) In the AWS Console, under IAM create a policy that allows querying EC2 status and stop/starting instances.

This policy will be locked down to your local IP!  For me this works great because my ISP changes my IP once every couple of years. The IP will also change if the MAC address on my wifi router changes (this is a trick to force a new IP to be allocated).

IAM policy outline:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "0",
            "Effect": "Allow",
            "Action": [
                "ec2:StartInstances",
                "ec2:StopInstances"
            ],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "xyz.0.0.1/32"
                }
            }
        },
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceStatus"
            ],
            "Resource": "*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "xyz.0.0.1/32"
                }
            }
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:DescribeKey",
                "kms:GenerateDataKey*",
                "kms:GetPublicKey",
                "kms:GetKeyPolicy"
            ],
            "Resource": "arn:aws:kms:us-west-2:0000:/xyz",
            "Condition": {
                "StringEquals": {
                    "kms:ViaService": [
                        "ec2.us-west-2.amazonaws.com"
                    ]
                }
            }
        }
    ]
}

Summary of each block:

Sid 0 -> EC2 allow stop/start from YOUR IP
Sid 1 -> EC2 allow describe instances from YOUR IP
Sid 2 -> Grant user access to encryption keys via EC2

Make sure to change the following:

  • Change the aws:SourceIP to be YOUR PUBLIC IP, which is in two places.
  • If your EBS volume is encrypted, change the region under the kms:ViaService block to the region your EC2 instance is actually in. Also change the `arn:aws:kms:us-west-2:0000:key/xyz` value to be the ARN of the actual key that goes with the EBS volume.
  • Note – you don’t need the section with "Sid": "2" if your EBS volume is not encrypted. This section gives the IAM user the ability to work with the encryption key when requesting on behalf of another service (that is what the kms:ViaService condition does).
  • Note – this policy gives access to ALL EC2 instances under the account, but maybe that is way too much? Customize the Resource blocks as needed for your situation.

4) Associate the policy you just created with your ec2-stop-start user in the IAM control panel.

5) Add the IAM user credentials to your local AWS credentials file (~/.aws/credentials), using a profile named aws_ec2_stop_start:

[aws_ec2_stop_start]
aws_access_key_id = ******
aws_secret_access_key = ******

6)  Find your server’s instance ID in the EC2 control panel (something like i-00000).

7) Try querying the instance status:

$ aws ec2 describe-instances --region us-west-2 --profile aws_ec2_stop_start --output text

$ aws ec2 describe-instance-status --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start --output text

(change the region, and instance-ids parameters for your case, output can be text, table or json)

8) Try running stop start with the dry-run argument:

$ aws ec2 stop-instances --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start --dry-run

$ aws ec2 start-instances --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start --dry-run

(again, change the region, and instance-ids parameters for your case)

 

9) And for real, without the --dry-run argument:

# STOP INSTANCE FOR REAL - MAKE SURE YOU KNOW WHAT YOU ARE DOING!!
$ aws ec2 stop-instances --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start

# if you need to force it
$ aws ec2 stop-instances --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start --force

# at this point you can query the status
$ aws ec2 describe-instances \
    --region us-west-2 --output text --profile aws_ec2_stop_start \
    --query 'Reservations[*].Instances[*].{Instance:InstanceId,State:State.Name}'

# START FOR REAL
$ aws ec2 start-instances --instance-ids i-00000 --region us-west-2 --profile aws_ec2_stop_start --output text

 

Troubleshooting:

If the instance is not starting (stuck in pending), describe-instance-status shows:

"StateReason": {
  "Message": "Client.InternalError: Client error on launch",
  "Code": "Client.InternalError"
},

This might be because the EBS volume is encrypted. In that case double check the Sid 2 block in the IAM policy above, and see this AWS post for details on fixing that, and more about IAM policies for keys and the kms:ViaService configuration setting. The notes above work for me, but you may need to fiddle with the settings to get it to work. Again, if your volume is encrypted with a customer managed key, under AWS KMS Console, add the IAM user you created in step 2 to the key under Key Users.

Links to AWS CLI EC2 command docs:

AWS CLI has a lot of extra command line options you might want to get familiar with.  See the docs here:


Querying Complex Data in Django with Views

Django has a powerful ORM, but like any ORM it suffers when querying data that has complex relationships. Django supports foreign keys and many to many relationships out of the box with ease. Sometimes a report or data feed has half a dozen joins and calls for representing the data slightly differently than the way the models are defined.

I’m thinking of a summary report where you need to group rows by month or year, but also pull in related rows that are several joins away.

Choice 1 – bend the Django ORM to your will, but accept the limitations.

By default the Django ORM does not do any joins. When application code accesses related entities or child sets it will make additional calls to the database on the fly. This can really slow down an app, especially if there are loops involved.

To improve on that, the Django ORM has `select_related()`, which joins foreign key and one-to-one relationships in the same query. It also has `prefetch_related()` for sets (many-to-many and reverse relations), where it issues a second query that fetches the associated matches.

Under the hood Django pulls all columns in the SELECT clause it builds. It is trying to hydrate a model instance for each row so it grabs everything. Pulling unneeded data slows down the call and expands the memory footprint needed for the request. To solve that you can use `values()` on the query to limit what columns are returned.
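For example, with hypothetical Order and Customer models (Order has a ForeignKey to Customer), those pieces look like this:

# foreign key / one-to-one relationships joined in a single query
orders = Order.objects.select_related('customer').filter(status='paid')

# related sets fetched with one extra query instead of one query per row
customers = Customer.objects.prefetch_related('order_set')

# limit the SELECT to just the columns needed, returning dicts instead of model instances
totals = Order.objects.values('id', 'total', 'customer__name')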

Django also allows `GROUP BY field_x HAVING …` queries, which I’ve written about.

For most things you can get it working with some trial and error and Stack Overflow as your co-pilot.

 

Choice 2 – write your own SQL query and run it with cursor.execute():

Wait a sec… what is SQL? This is the year 2020. Nobody really knows SQL anymore. All they know is JavaScript.

I’m half joking here but plenty of people would argue writing SQL in your application code is a big mistake. In 2005 I don’t think anybody would have guessed that is how the world would be 15 years later.

Raw SQL can be hard to maintain. After all, it’s a huge magic multi-line string. From a maintenance perspective it exists independently of the application’s models and migrations. It is kind of hanging out, like a rat in the wall. It is great for one-off scripts and rare corner cases, but I would not want to rely on raw SQL for day to day Django work.

Note that with raw SQL there is also a small security risk in the form of the SQL injection attack. This can happen if you (or future you, or the new intern) forget to parameterize any inputs.
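If you do go this route, letting Django's cursor substitute the parameters is the main defense; a minimal sketch (the table and column names are made up):

from django.db import connection

def orders_for_customer(customer_name):
    # %s placeholders are filled in by the database driver, never by string formatting
    with connection.cursor() as cursor:
        cursor.execute(
            'SELECT id, total FROM app_order WHERE customer_name = %s',
            [customer_name],
        )
        return cursor.fetchall()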

 

Choice 3 – go with a database level view:

A database view (supported in MySQL and Postgres, among others) is basically a SELECT statement that gets saved in the database. Views operate like a read only table. A view can do all sorts of clever things like joins, generate new columns based on the row level values, aggregate functions etc.

The nice thing about setting up a view and pointing Django at it is that Django can query it like an ordinary table. In fact Django doesn’t even know it is really a view under the hood. You cannot edit the rows in the view though.

 

How to get Django to work with views:

Note this is working for me with Django 2.2 and Python 3.7.

First design your view in SQL to get the bugs worked out.

Next, set up a new model that mirrors the view’s columns. For views, you can put them in models.py, but I use models_views.py so it is clear it is a view and not a regular model.

Make sure to set managed = False.

from django.db import models, NotSupportedError

class ExampleView(models.Model):
    name = models.CharField(primary_key=True, editable=False, max_length=100, blank=True, null=True)
    status = models.CharField(max_length=25, null=False)
    amount = models.DecimalField(max_digits=24, decimal_places=6, blank=True, null=True)

    def save(self, *args, **kwargs):
        raise NotSupportedError('This model is tied to a view, it cannot be saved.')
    
    class Meta:
        managed = False
        db_table = 'example_view'    # this is what you named your view
        verbose_name = 'Example'
        verbose_name_plural = 'Examples'
        ordering = ['name']

Create the migration:

$ python manage.py makemigrations

This will generate a migration file under that app’s migrations directory, something like migrations/00xx_exampleview.py.

Open that file and add your view’s create statement. This is an example for MySQL:

dependencies = [....]

# put your SQL to create (replace) the view here, below the dependencies line
sql = """CREATE OR REPLACE SQL SECURITY INVOKER VIEW example_view AS
SELECT
    name,
    status,
    sum(total) as amount
FROM some_table 
GROUP BY name, status;"""

operations = [
   migrations.RunSQL(sql),   # add this line so it runs as part of the migration
   migrations.CreateModel(
   name='ExampleView',
.....

Run your migrations as normal:

$ python manage.py migrate

 

Use your view model as normal:

example_rows = ExampleView.objects.filter(name='test').all()

 

When to use each choice in my opinion:

Most of the time I try to get the Django ORM to do the job. If that fails or is getting really crazy, I opt for choice 3, the database view, since that keeps things so much cleaner. I use raw SQL for temporary scripts, prototypes, and build-and-throw-away kind of stuff where nobody cares about maintenance or clean code.

Pros of using views:

  • Can greatly simplify application code where nested loops or complex joins would be required.
  • Leverage the power of the database engine to provide data to your application.

Cons of using views:

  • Any rows retrieved from a view cannot be saved back to the database, they need to be treated as read only objects.
  • Anecdotally, I’ve seen comments in Stack Overflow where people complain that views are slower than running the equivalent query. So your mileage may vary. If you do run into that, dropping back to raw SQL is a feasible work around.

Mac vs Dell for Software Development with Cost Breakdown 2020

My 2015 Mac Book Pro is getting a little old and tired. I recently joined a project that uses a docker-compose configuration that spins up 8 separate containers. It barely works on my Mac. It takes a long time to start and performance is terrible system wide while it is running. So it was time for me to bite the bullet and either get a new Mac, or look into a Windows or Linux laptop.

2015 Mac Book Pro 13

TLDR;

  • Get the Mac Book Pro if you have the money (or it is your employer’s money, hehe).
  • Get the Dell if you want maximum power for the price and care about replaceable parts.
  • Reformatting the Dell to Linux is a sweet spot for computing power and ease of development if your use case supports it.
  • If you go with the Dell don’t kid yourself that you are saving money. Your time is very valuable as a software developer. Windows will waste it here and there (windows update churning in the background, command line quirks, hard to find certain packages, etc). That isn’t to say that macOS is free of these annoyances, they are there, but to a lesser degree.

You know, I really hate this trend of bigger and bigger virtual images to run what amounts to a web server and a database.  For a large team on a project with dozens of dependencies it does make sense. However, when I’m developing solo I get by just fine with local packages.

I’ve been holding off on a new Mac because in 2016 Apple went a few steps backwards. The controversial touch bar and the redesigned keyboard have gotten horrible reviews. At the same time their prices keep going up but performance lags. I will really miss the magnetic charging port. It’s pure genius, why remove it? It has saved my laptop from hitting the floor a few times. Did you know a replacement charger for a 2015 Mac Book Pro is about $75!?

In bargaining with myself, yes I could live without a physical escape key and a crappy keyboard because most of the time I hook up to an external monitor and use the Apple Magic Keyboard and Magic Trackpad 2. These are 2-3x the price of Windows peripherals, but they are really awesome. Every Windows touchpad I’ve tried jumps around like crazy and has a rough texture. The Magic Trackpad 2 is as accurate as a mouse and is smooth like glass.

apple magic keyboard and trackpad

Faced with the prospect of buying all new dongles and having to fight through the bugs involved with macOS Catalina (which I’m currently holding off on), I took a look at my old friend the Dell Outlet.

The Dell Outlet sells machines that have been returned for whatever reason. Dell is just trying to get rid of them. They are discounted way below retail. The outlet runs specials on a regular basis and offers free shipping. I used to work at a company where everyone ran Dell Outlet hardware. We purchased from them over 50 times. Most of the stock is labeled as “scratch and dent”, but I never saw one that I could tell had any problems.

When looking at Dell the first thing I did was rule out the Inspiron class completely, which is the cheapest level. I looked closely at XPS and Precision, but the prices really jump up. I ended up going with a middle of the road business line, the Vostro. It comes in a 14” model which is about perfect. Mine came with a regular Intel graphics chip but if you dig around on the outlet you can find ones with Nvidia or Radeon graphics on board which is a nice bonus if you do the occasional gaming session.

In terms of OS, you can generally reformat a Dell to run Linux which I recommend. Sometimes you’ll run into a boot issue or device driver error. If you are buying on the outlet that model has probably already been out for long enough that you can get help by googling.

When it comes to Windows the Pro version is the way to go. With the Pro version BitLocker is included, which offers full drive encryption. As a developer you’ll want to activate that if you have anything beyond cat pictures on your machine. Most of the Dell business machines come with Windows Pro by default.

Here is the breakdown between my new Vostro and a middle of the road 2019 Mac Book Pro:

Spec | Dell Vostro 14” – 5481 | 2019 Mac Book Pro 13”
CPU | Intel i7-8565U 1.8 – 4.6 GHz Quad Core | i5-8279U 2.4 – 4.1 GHz Quad Core
RAM | 16GB 2666MHz DDR4 | 8GB 2133MHz LPDDR3
Storage | 512GB SSD | 512GB SSD
Screen | 14 inch FHD (1920 x 1080) Anti-Glare LED-Backlit Non Touch Display | Retina Display
Ports | Ports galore (USB, HDMI, SD Card, Headphones, RJ-45) | Four Thunderbolt 3 Ports
Replaceable / Upgradable Parts | Yes | No
Realistic Battery Life While Doing Software Development | 3-4 hours | 7-10 hours
Price | $646.00 | $1,999.00

Here is the score card:

Category | Winner | Notes
CPU | Mac | The Mac’s i5 is actually a bit faster than the Dell’s i7 according to this breakdown.
RAM | Dell | The Dell has twice as much memory which is super important for running virtual machines.
Storage | Dell | The Dell’s drive is replaceable while the Mac’s is soldered to the board.
Screen | Mac | Retina displays are awesome, but if you dock and leave the lid closed it is moot.
Ports | Dell | Dell has the old school USB / HDMI ports, the Mac requires dongles which you have to purchase on your own.
Replaceable / Upgradable Parts | Dell | The Dell is designed to have the hard drive, battery, and even RAM upgraded. The Mac is a sealed product.
Battery | Mac | The Mac battery is way better. This is moot when you dock. Still I do get “range anxiety” when I’m on battery.
Price | Dell | You could buy a new Dell every year, compared to a new Mac every 3-4 years.
Development Experience | Mac | I have to admit the Mac experience is a lot smoother. Dell with Linux comes close for some use cases.
Good for developing on complex containerized projects | Dell | I doubt 8GB of RAM is enough.

So which is better, Mac or Dell?

The Mac wins, begrudgingly, but it depends. Honestly they both work and a good developer should be able to get their job done on Mac, Linux, or Windows without a problem.

However, I’m not so sure the 13” MBP above with only 8 GB of RAM would handle the huge dockerized development environment I mentioned at the beginning of this post. For an extra $200 you can get a 13″ model with 16GB of RAM. Or you could jump up to the 16” MBP (which replaced the 15”), which starts at $2,399. Personally I don’t want to lug around a $2,400 machine nor a 16” laptop.

Is the Mac really worth an extra $1353?

Yes. If the Mac “experience” saves you 1 minute a day it will pay for itself. Here’s the math:

  • A software developer’s fully loaded cost is $100/hr – benefits, payroll taxes, retirement plan, paid time off, training, hourly wage, etc.
  • In terms of developer time, it would take a savings of 13.5 hours over the life of the laptop to make up for the extra price of the Mac. (1999 – 646) / 100 = 13.5
  • The laptop lasts 3 years.
  • There are 261 working days in a year.
  • [ 13.5 hours / (261 (work day / year) * 3 years) ] * 60 minutes / hour = 1.035 minutes / work day
  • Windows updates alone will rob you of at least 1 minute per day.

Then why did I buy a Dell?

For one thing, I already have a Mac I really like. I needed something cheap and powerful for this one particular project. That is where the Dell comes in. For $646 I’m able to allocate 2 cores and 8GB to the docker-compose instance which makes it just as fast as regular local development (even though the fans do come on frequently).

This situation is causing me to jump between keyboard layouts, but I just can’t let go of my Mac on my other projects! Turns out as a contractor I need both Windows and Mac in my toolbag.

In terms of overall budget for your workstation also consider a sit-stand desk and nice chair. I’ve shared my setup which I still love, but there are many new sit stand desk companies on the market and they are getting more and more affordable all the time.

In my review of Windows development I am glad to say it is getting a lot better for Python / PHP projects. Microsoft is building an open source terminal app. Then there is Windows Subsystem for Linux (WSL), which is like having a tightly integrated Linux VM running all the time under the hood. Visual Studio gets a lot of great reviews. I’m still using IntelliJ (WebStorm / PyCharm, etc) but I look forward to trying it out soon.

What I’ve learned is: Apple has everyone where they want them, even a pragmatist like myself.


Why is WordPress so Popular?

WordPress is arguably the most successful “killer app” of the web in the last 10 years.  Since it is written in PHP and has a history of security vulnerabilities [4], most “software brains” dismiss it as a toy. While it is certainly not a monument to computer science, it does its job really well. It has grown up from a simple blogging platform into a powerful CMS capable of running the most complex and high traffic websites.

Consider these astounding stats – as of 2019 WordPress runs 33.5% of all websites, and 60.4% of all CMSs [1].

Given its success WordPress has some valuable lessons to offer for anyone working on a web based platform. Here is why I think WordPress is such a huge hit:

WordPress Why So Popular?

1) They got into a hot area:

Back in 2003 they hit on an area that was prime for growth – online publishing and marketing. To some extent there is still a good deal of growth to be had in that market.

2) The plugin and theme architecture was genius:

From the early days the WordPress architecture included extensible plugins and themes.  It allowed other developers to build sites that do what they want using WordPress as a foundation. Over time a large community of theme and plugin developers grew around WordPress. Presently there are over 55,000 plugins [2] and 31,000 themes [3] available.

3) WordPress is a cheap and easy option that scales well:

A big factor in WordPress’s success is the price – it is free. The WordPress core itself is open source. Themes are abundantly available for under $40. Many plugins are free or relatively inexpensive compared to development efforts.

Getting a website going that looks professional and has a few bells and whistles is fast and dirt cheap.

Since WordPress uses PHP and MySQL, hosting is also simple and relatively cheap. Plans under $10/month are able to power sites with significant traffic.

At the same time WordPress’s low maintenance costs (no renewal fees) and its robustness makes it a good fit for high profile websites looking to spend up to seven figures on development.

4) WordPress is open source, but unusual in that it cares about its users:

I think the main reason WordPress has done so well is due to how strict they are about keeping backward compatibility.  It is one of the few platforms I’ve ever seen that accounts for their existing user base when determining what features to add, what to deprecate, and how to refactor.

The lesson all software developers can learn from WordPress is that their huge success stems from a simple philosophy – they put their users first. This is opposed to the more common philosophy of building overly complex technology “solutions” at everyone else’s expense.

Many open source projects are downright nasty to users… The typical mentality for an open source project is “it’s free, screw you”.  They constantly refactor or even wholesale re-write their code.  Nobody enjoys keeping up with that. Popular web frameworks like Rails and Symfony barely resemble their initial versions. This alienates people fast.  Keeping up with refactoring takes time away from building features paying customers need.  My observation is that a lot of the time the refactors appear to be a matter of personal preference or a whim vs. something that could be tied to a functional requirement.

Yes, WordPress kept a lot of the cruft around. But they were right to. They correctly realized that refactoring code for the sake of gold plating it runs counter to what their customers actually need – working software that doesn’t require much upkeep.

5) As a CMS, WordPress is pretty amazing:

WordPress isn’t just a blogging platform anymore, it is a fully featured CMS.

The recent WordPress 5.0 release included the Gutenberg editor that allows customizable content modules. Prior to this, and now in combination with it, the Advanced Custom Fields (ACF) plugin can be used to create on the fly CRUD forms in the admin.

WordPress also shines in how it handles media uploads. Getting a correctly sized graphic uploaded is still a huge pain point for most non-technical users. WordPress comes with a built in crop / resize tool. It also has a way of configuring the crop types for an image (mobile, tablet, desktop) and allows overrides so any user can make a page look good on desktop, tablet and mobile without needing a designer.

6) WordPress dances to its own beat:

Another reason they did so well is the WordPress team consistently ignores the criticism from the technical purist camp. Sure they are based in PHP, they use tons of global functions, the database design is horribly inefficient, but so what, it’s a killer app!  If they got really into performance or cleaning up all the messy bits, they would have lost sight of their customer base.

 

References:

[1] https://w3techs.com/technologies/overview/content_management/all
[2] https://wordpress.org/plugins/
[3] https://sceptermarketing.com/how-many-wordpress-themes/
[4] https://en.wikipedia.org/wiki/WordPress#Vulnerabilities


Basic Git Commands I Use Everyday

My opinion is very strong on this point – if you are using git, you should be using it on the command line.

Git is the best and most popular tool for doing source control.  There used to be competitors, but why waste bytes listing them…  Git was created by the infamous Linus Torvalds (the founder of Linux). Linus was fed up with everything else so he wrote his own tool to help him work on the Linux kernel more effectively.

Command Line vs GUI Tools for using git:

Git is a powerful tool. As a software developer you need to understand what git is doing. GUI tools are just wrappers over the command line tool and they hide or simplify the details which is bad.  I’ve found GUI tools are a) slow, b) sometimes screw up merges, and c) sometimes make branches from a starting point you didn’t want.

In other words git GUI tools are more trouble than they are worth. For beginning developers, my thinking is, you should understand how to use git via command line so you know what the GUI tool is doing under the hood (and then uninstall the GUI tool).

Besides, it only takes a few minutes to learn the basics of git command line. If you use it everyday you’ll get good at it pretty quick! This translates into having more confidence on the command line, which is great for many other reasons.

On Mac or Linux, your terminal is built in. On Windows, install the git bundle, then you can right click on any folder and go to “Git Bash Here” which will get you a bash prompt with the proper paths and environment variables set up.

The exception to my command line focused dogma is when I review commits from other team members or look back through my own commit history. For that I use the github (or bitbucket) web interface and not the command line.  When looking at code changes it really helps to have a visual diff with the red/green colored lines. It also helps for collaboration to be able to comment on the commit or even a specific line of code.

The top git commands I use:

  • Checkout a repo:
    • git clone {url to repo}
  • Working on code:
    • git diff     (see what I changed)
    • git add [files]    (stage files for commit)
    • git diff --cached      (shows me the changes I’m about to commit, I always do this as a final personal code review before committing)
    • git commit -m "A useful message" (see my post on writing a proper commit message)
    • If I screwed up the commit, I can do git commit --amend to change the message or contents of the commit.
    • git push (send it to the remote repo for all to see, at this point, going back can still be done with the force flag -f but use with caution because if another developer pulled your change it gets very messy)
  • Undoing changes:
    • git checkout [files] (note this will erase local changes from your system, reverting files to the last commit. This is useful if you went down a rat hole, or accidentally broke something.)
    • git reset HEAD [files]  (this will unstage files from the next commit, useful if you accidentally added a file and need to move it back down to the list of files that won’t be committed. This leaves changes intact.)
    • If your repo is hopelessly screwed up, you can always clone a fresh repo with git clone {url to repo}. Beforehand, take a backup of the folder your broken repo lives in just in case you need something in there.
  • Get code pushed by my team members:
    • git pull  (I use an alias git pl, see my .gitconfig below, which does a rebase and recurses any submodules. This way my commits are moved after any incoming commits which avoids annoying merges on pull.)
  • Viewing commit history:
    • git log (shows a lot of commits on your current branch, can be useful as a sanity check)
    • Again, I use github / bitbucket to view history, but the command line tool has a rich set of features in this area that I don’t use much. The aliases below, git l and git lg, can be helpful.
  • Make a new feature branch and push it to the remote repo:
    • git checkout {branch I want to start from} (typically master, but could be a feature branch)
    • git pull (make sure I’m up to date on the starting branch)
    • git checkout -b {new_branch_name} (create my new branch)
    • git push -u origin {new_branch_name} (push my new branch to the remote repo, initially with no changes)
  • Changing between branches:
    • As a developer you’ll be working on several things at a time, which means several open branches.  Here is how I switch branches:
    • 1) Disable any background servers / processes like npm, grunt, gulp, react, angular, etc. If you have a front end watch script, or a javascript build process running when you switch branches it will get really confused.
    • 2) Make sure your checkout is clean (no pending changes) using git st
    • 3) git checkout {branch_name} (switch to the branch)
  • Other cool commands:
    • git rebase {source_branch}    (integrate changes from source branch into current  branch, and replay commits on feature branch. This is the secret to fixing merge conflicts before a merge!)
    • git cherry-pick {commit hash}  (pull in a single commit from another branch, can be a life saver if you need just one commit but not the whole branch)

Related to git command line is your .gitconfig file, which lives in your home folder. It has many powerful options, but the one I want to show you is the [alias] section. These are shortcut commands that make git command line much easier to use. I can’t take credit for any of these, I cobbled them together from various other developers / blog posts along the way.

[alias]
       	st = status
       	c = commit
       	co = checkout
       	br = branch
       	tg = tag -a
       	df = diff
        dfc = "diff --cached"
       	dt = difftool
       	lg = "log -p"
       	l = "log --pretty=oneline"
       	pl = "pull --rebase --recurse-submodules"
       	ph = push
       	phf = "push -f"
       	brls = "branch -a"
       	tgls = "tag --list"
       	stls = "stash list"
       	chp = "cherry-pick"
       	ch = "cherry -v"
       	chh = "cherry -v HEAD"
       	subup = "submodule update"
       	subst = "submodule status"
       	cl = "clone --recursive"
       	notmerged = "branch --no-merged"
        graph = "log --oneline --graph"

With these aliases,  git diff --cached becomes git dfc, or git commit becomes git c which saves a few keystrokes every day.

The graph alias above is very cool if you have a project with commit history and branches.

Here are some additional resources for learning git.

Let me know if I missed something and I’ll add it here and give you props.


Full Stack Developer Skills and Technologies for 2019

Here are the specific technologies and skills that go with being a full stack developer in 2019.  Full stack development has two sides, front end and back end.  A full stack developer is someone who can work on both sides. This does not imply they suck at everything because they are too general or spread too thin.  Everyone has specific strengths and weaknesses based on their brain, education and experience.  Some people have the bandwidth to do both, others prefer to be masters of a specific area.  I like to say I have experience on the front end but mainly do back end work because I enjoy that the most.

Front End Development – Fundamental Skills and Technologies:

The front end on the web is built with HTML, CSS, and JavaScript to varying degrees. A webpage can work without JavaScript, or it could have several megabytes of minified JavaScript doing all kinds of fancy stuff.  A basic knowledge of computer graphics (images, svgs, fonts) also comes in handy.

Front end work also involves setting up all the user interface interactions that can happen (like loading external data, making dialogs appear, showing validation errors, etc).

Here is my list of front end development concepts:

  • HTML5 / CSS – this is your entry point to start doing basic coding.
    • HTML (hyper text markup language) defines the page structure and content. The elements on the page make up the document object model (DOM).
    • CSS (cascading style sheets) addresses formatting and positioning of the elements in the html.
    • A popular front end CSS library is Bootstrap but there are many to choose from.  Here is a list of popular front end css frameworks.
    • Some people do without a CSS framework and instead start with a more basic “CSS reset” which normalizes how things look across browsers.
    • CSS itself is rather clunky. It does not support variables or nesting of elements.  Most modern sites use a CSS pre-processor such as SASS or LESS, which gives handy syntax, variable support, mixins, and other neat things so writing CSS is much cleaner. During development they require a task running in the background that rebuilds the SASS or LESS into CSS each time a file is changed. That is because browsers only understand CSS.
  • Request/Response, AJAX, Sockets and HTTP – the web runs over HTTP (hyper text transfer protocol).  Full stack developers understand the difference between request/response, AJAX (asynchronous loading), and web sockets.
    • When you visit a website for the first time and the page loads in the browser, that is a request -> response. The entire page’s HTML, CSS, JavaScript, and images are pulled in.  If you click another link, the browser makes a new request to the server and the response is another full page load.
    • With AJAX, JavaScript calls the backend and gets a piece of data to add to or replace on the current page (a minimal backend sketch of such an endpoint follows this list).
    • A full page refresh is expensive and slow whereas AJAX calls are lightweight and fast, but if several are happening at once it can really bog down the page (and the server).
    • Web sockets are a way of leaving a connection open to the server and streaming updates (like stock quotes or a feed).
    • It helps to understand DNS, basics of the http protocol, and request headers.
  • Forms – web forms have been around since the beginning of the web and are still used in great numbers for submitting data to the backend.
    • Front end developers set up the layout and behaviors for form controls including text inputs, checkboxes, radio buttons, drop downs, etc. Getting web forms to look good and function the same across all platforms, screen sizes, and desktop vs mobile devices is tricky.
    • Front end form validation involves checking to make sure inputs are valid and showing appropriate warning messages before they are sent to the server.
    • Date pickers, auto complete drop downs, and other fancy widgets can be pretty fun to setup.
  • jQuery – a JavaScript library that is great for simple sites where you need to sprinkle in a few dynamic behaviors like event handling, front end form validation, etc.
    • jQuery is pretty old but still viable for simple sites. Lots of web applications use it, and many frameworks have it as a dependency.
    • jQuery makes it easy to get at elements in the HTML either by id or class name, and manipulate them (show, hide, change their content) and wire click/keypress events.
    • jQuery can also do ajax calls, animations, and has tons of community add ons (although finding one that is actively maintained is getting harder).
    • Knowing jQuery is useful because it isn’t as magical or obfuscated as the more advanced javascript frameworks I’ll talk about next.
  • React / Angular / Vue – these JavaScript frameworks are useful for complex pages involving multiple interactions on the page at one time or SPAs (single page apps).
    • Attempting to use jQuery to build a complex page results in a lot of spaghetti code. With one of these frameworks, if you do it right, everything gets organized into classes / components that are reusable.
    • React is the most popular of the three and has the best plugin ecosystem. It is more of a library than a framework and can work with things in the jQuery world if it has to.
    • Angular is the biggest of the three. It is a fully fledged framework. You must do things its way or beat your head against the wall until you cave in. Angular wants to take over the entire page and isn’t as easy to get working with existing systems.
    • Vue is the lightweight newcomer and popular with developers because it lets you do things your way.
    • With any of these, as the version count relentlessly climbs, keeping up with changes can become a burden.
  • Npm (node package manager) – where you configure the JavaScript dependencies for your application, handle asset minification, etc.
  • Graphics – Images (png, gif, jpeg) and SVG (scalable vector graphics) are important to know about.
    • As a developer I don’t produce art or designs, but I do end up resizing, cropping, color correcting, and compressing images all the time.
    • Photoshop is your friend.  Paint.NET is a free alternative for Windows that mostly does the job.  Gimp is also free but I find it to be a major pain to use.
  • Fonts – your web app will likely need to load external fonts unless you want to stick with the vanilla fonts that are installed across Windows, Mac and Linux (Arial, Times New Roman, etc).
    • Fonts can also be used for icon libraries – Font Awesome and Glyphicons map the letters in a font to symbols (plus, minus, etc.) and provide a huge number of icons in one download to the browser instead of having to create individual SVGs / PNGs for each icon.
  • Responsive design – using one set of HTML/CSS to make a site (or app) look good from 320 pixels wide up to 4,000 pixels wide.
    • Responsive design is done with CSS media queries which apply styles to an element based on the browser window size.
  • Other things to know about on the front end:
    • ADA compliance – it is 2019 and everybody should be building sites that work for people with visual impairments, but a lot of developers are not aware of this.
    • SEO (search engine optimization) / Schema.org tags
    • Page load time considerations – YSlow, CDNs, site speed testing tools.
    • Linting – command line tools that check to make sure your code meets the quality and formatting rules for the project.
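
To make the AJAX item above a little more concrete from the backend side, here is a minimal sketch in Python using Flask (the /api/cart-count endpoint and its payload are made up for illustration). The JavaScript on the page would fetch this URL and swap the returned JSON into the current page without a full reload.

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/cart-count")
def cart_count():
    # In a real app this value would come from the session or the database.
    return jsonify({"items": 3})

if __name__ == "__main__":
    app.run(debug=True)

The front end would call it with something like fetch('/api/cart-count') or $.getJSON and update the DOM with the result.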

 

Back End Development – Fundamental Skills and Technologies:

The backend handles the flow of data, business logic, authentication and authorization, connection to third party data feeds, etc.  You’ll love backend coding if you like data, organizing things, algorithms, boolean logic, command line tools, etc.  The backend is my bread and butter (though I do a lot of front end work too).

Here is my list of backend development concepts:

  • Design patterns (OOP, encapsulation, abstraction) – how to structure code so it models the domain (subject area) you are working on.
    • For the big picture think in terms of the entities (people, objects), their properties (name, address), and what operations can happen (user signs up, user initiates process X…). See the first sketch after this list.
    • At the coding level think in terms of the basic inputs and outputs to a function.
  • Quality Coding – write code that works in the real world.
    • Code defensively in case inputs are not what you expect.
    • Create functions that do small amounts of work vs huge monolithic functions that do everything. This is called the “single responsibility principle”.
    • Unit tests are your friend. Writing good unit tests hopefully makes your system design better because it helps you think in terms of “one function one task”.
    • Don’t take external resources for granted. When you connect to a resource (database, API, file system, etc), always set up try/catch or promise failure blocks so your code gracefully handles problems on the other end (see the defensive coding sketch after this list).
    • Add logging as appropriate so it is possible to piece together what went wrong when an error happens.
    • Use transactions when doing multiple writes that need to happen all at once or not at all (like transferring money between bank accounts).
  • MVC pattern  – model, view, controller – a popular way of organizing websites.
    • Model – classes that allow you to interface with the tables / rows in the database
    • Views (aka Actions, Templates) – nobody can seem to agree on the name for this, but there are two parts: the backend action and the front end template that renders the output.
    • Controller – handles the relationships between the layers and has a set of routes that map URLs to actions on the backend. For example, going to /myreport calls the report action, which gets some data and passes it to the template layer.
    • Forms – not always a fully fledged part of an MVC framework (though Django has an excellent forms module). On the backend, forms need to be validated before they are fed into a save/update operation. Generally forms mirror models in terms of the associated fields, but it is not always one to one, and a lot of glue code is needed to process form data and map it to the database (see the Django form sketch after this list). Unit tests come in very handy here.
  • Data Storage – full stack developers need to know how to store and retrieve data, typically done using a database. There are three main types of databases: relational, NoSQL, and graph.
    • Relational databases excel at storing things that need to be related (for example a customer and an invoice).  The vast majority of business systems use relational databases. Most MVC frameworks are designed to sit on top of a relational database.
      • SQL (structured query language) is the language used to interface with a relational database. Each platform has its own ‘flavor’ of SQL with custom functions and special tricks, but the core of the SQL language (SELECT, INSERT, UPDATE, DELETE) works the same across most platforms.
      • Popular relational databases: MySQL, PostgreSQL, Microsoft SQL Server, Oracle.  Here is a list of relational databases.
    • Non relational / NoSQL databases have different performance characteristics from relational databases (redundancy, replication, raw speed, flexibility).  They are best suited for free form data that can change or evolve.
    • Graph databases use nodes, edges, and properties to store data, where the relationships between the nodes (the edges) are the primary thing of interest. Relational databases are good at relating A -> B, but if you need to relate A -> B -> C -> D -> E… in a relational database the SQL gets really ugly really fast.  Graph databases are useful for analyzing large data sets to find patterns in the relationships, for example product recommendation systems, medical databases, and social media.
  • API Integrations – full stack / backend developers are often tasked with interfacing between systems using APIs. There are two main technologies used for APIs:
    • REST (restful, representational state transfer) – typically uses JSON as the data format.  REST is common for AJAX calls from the client and also between more modern backend systems. The JSON format can be adhoc (one off) or support a more formal set of operations against an entity (get, create, update, delete, list). With REST your system talks directly to a URL and the data is passed as URL parameters, query string arguments, and/or the request body (see the requests sketch after this list).
    • SOAP – uses XML as the data format. SOAP is much more rigid than REST but is also seen as secure because the XML is ‘typed’, meaning the structure of the data and the values are ‘guaranteed’ to adhere to a predetermined format. With SOAP your system calls a ‘wsdl’ endpoint with a specially formatted SOAP envelope, header and body.
    • Other common integration patterns:
      • OAuth – a standard that allows a user to give permission for application A to access their data on platform B.
      • ETL jobs – ETL stands for (extract, transform, load) and is used for moving data around and manipulating it in between if needed.  You can DIY code an ETL job, use libraries to help, and there are countless enterprise grade suites available.
      • HTTP POST – this is like a crude form of REST where your code programmatically submits a web form.
      • Flat files – using CSV or other text based data format to transfer data. Newer systems don’t do it this way (I hope), but a lot of older systems require it.
  • Message Queues / Asynchronous Processing – large systems that have a lot of different types of events happening benefit from a message queue approach.
    • Message Queues allow sending messages on one end and subscribing to them as events on the other. A message can be anything, typically an event with a bit of data attached to it. For example, a user signed up and here is their information.
      • Queues can be configured in a number of ways to support one or many clients reading messages, message expiration, and message redundancy (durability) in case of a system crash.
      • Flexibility is gained because the number of systems pushing or consuming messages can be adjusted and messages can potentially be leveraged by other systems.
      • Popular message queue systems are RabbitMQ, Kafka, and AWS SQS (see the RabbitMQ sketch after this list).
      • Message queues are generally platform agnostic so many types of systems can participate in the message flow.  Service oriented architecture (SOA) and enterprise service bus (ESB) technology can rely heavily on message queues (although some implement “messages” as API calls or flat file drops).
      • It is beneficial to think about a system in terms of the messages that go through it and how those messages are processed.
    • With asynchronous processing, the client is not waiting for the action to finish. It can be ‘fire and forget’ which means the client sends it and moves on (regardless of the outcome), or it can have a callback attached that gets triggered when the action completes.
  • Background Jobs – doing work behind the scenes to keep data moving, up to date, backed up, etc.
    • Background processing / task frameworks are similar to message queues but they integrate directly into a programming language / framework.  Django/Celery, Java/Spring Batch, and Rails/Sidekiq are examples. With these libraries a function that looks synchronous can easily become asynchronous. The framework handles creating the message and processing it elsewhere. A dependency such as Redis or a full scale message queue backing the background processing framework is generally needed for a complete solution.
      • A good use case for this is a web server sending an email.  Small scale sites block the client response while an email is being sent, but that can be slow or even fail. It is better to queue the email and return success to the client immediately. The job that sends the email can be set up to retry if the email server is unavailable (see the Celery sketch after this list).
    • Cron jobs / scheduled tasks – these run on a schedule (every minute or once a month) to clean up and process data. While not as scalable as a distributed message queue they come with simplicity and ease of setup. However, making sure they do not stack on top of each other, that they log properly, and that they recover from problems is key to keeping them running smoothly.
    • Daemonized processes – some systems need a custom program that runs in the background continuously. Memory management (preventing memory leaks) is important here.
  • Language and Framework Specifics – backend developers must have a solid grasp of best practices for the language and framework they are using!
    • Here are a few popular backend languages / frameworks in 2019. You can browse the full list at the Stack Overflow Developer Survey.
      • Python – (Django, Flask)
      • Ruby – (Rails)
      • PHP – (Laravel, WordPress)
      • JavaScript / Node.js (Express)
      • .NET (C#, MVC.NET, etc)
      • Java – (Spring, Grails, Blade, GWT)
    • In general you’ll find .NET and Java being used in established corporate environments, while Python, Ruby and PHP get used more by agencies and smaller businesses. Python is also good for data analytics and research projects. Node.js has been the hot item for startups but companies across the board use it.  Up and coming languages like Go and Rust get a lot of attention these days.
    • The core concepts show up in each language and framework in different ways. There is an endless variation of jargon and structure as things are “re-invented” every ten years or so.
    • Knowing multiple languages and frameworks gives full stack developers the context needed to sift the core knowledge out from the jargon and trivial design decisions that go with a specific language and framework.
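
To make a few of the bullets above more concrete, here are some short Python sketches. First, the design patterns item: a toy example of thinking in terms of an entity (a user), its properties, and one operation (signing up). The class and field names are purely illustrative.

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class User:
    name: str
    email: str
    signed_up_at: Optional[datetime] = None

    def sign_up(self) -> None:
        # The operation changes the entity's state; persistence, welcome
        # emails, etc. would be layered on elsewhere.
        self.signed_up_at = datetime.utcnow()

user = User(name="Ada", email="ada@example.com")
user.sign_up()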
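
Next, the quality coding item: a small sketch of defensive coding around an external resource (a config file in this case), with a try/except block and enough logging to piece together what went wrong later.

import json
import logging

logger = logging.getLogger(__name__)

def load_config(path: str) -> dict:
    """Load a JSON config file, returning {} instead of blowing up."""
    if not path:
        raise ValueError("path is required")
    try:
        with open(path) as fh:
            return json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        logger.error("Could not load config %s: %s", path, exc)
        return {}

One function, one task also makes the unit test trivial: assert load_config("missing.json") == {}.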
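
For the forms item, a rough sketch of the glue code in Django (the Customer model and field names are made up): validate the posted data, then map it onto a model and save it.

from django import forms
from django.http import HttpResponse, HttpResponseBadRequest

from myapp.models import Customer  # hypothetical model with name/email fields

class CustomerForm(forms.Form):
    name = forms.CharField(max_length=100)
    email = forms.EmailField()

def signup(request):
    form = CustomerForm(request.POST)
    if not form.is_valid():
        return HttpResponseBadRequest(form.errors.as_json())
    # The form happens to mirror the model here, but it often is not one to one.
    Customer.objects.create(
        name=form.cleaned_data["name"],
        email=form.cleaned_data["email"],
    )
    return HttpResponse("ok")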
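
For the API integrations item, a minimal sketch of calling a REST API from the backend with the requests library (the URL and fields are placeholders, not a real service):

import requests

BASE_URL = "https://api.example.com"

# GET a list, passing a query string argument.
resp = requests.get(BASE_URL + "/widgets", params={"page": 1}, timeout=10)
resp.raise_for_status()
widgets = resp.json()

# POST to create a record, sending JSON in the request body.
resp = requests.post(BASE_URL + "/widgets", json={"name": "sprocket"}, timeout=10)
resp.raise_for_status()
new_widget = resp.json()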
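
For the message queue item, a bare-bones sketch of publishing and consuming an event with RabbitMQ via the pika library (the queue name and message contents are made up):

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="user_signed_up", durable=True)

# Producer side: publish an event with a bit of data attached.
channel.basic_publish(
    exchange="",
    routing_key="user_signed_up",
    body=json.dumps({"user_id": 42, "email": "ada@example.com"}),
)

# Consumer side: subscribe and handle each message as it arrives.
def handle(ch, method, properties, body):
    event = json.loads(body)
    print("send welcome email to", event["email"])

channel.basic_consume(queue="user_signed_up", on_message_callback=handle, auto_ack=True)
channel.start_consuming()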
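
Finally, the background jobs item: a sketch of the “queue the email and return to the client immediately” pattern using Celery (the broker URL and task body are illustrative). The web view calls .delay() and returns right away; a separate worker process sends the email and retries if the mail server is down.

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task(bind=True, max_retries=3)
def send_welcome_email(self, address):
    try:
        print("sending welcome email to", address)  # the real mail call goes here
    except Exception as exc:
        # Put the task back on the queue and try again in a minute.
        raise self.retry(exc=exc, countdown=60)

# In the web request handler:
send_welcome_email.delay("ada@example.com")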

 

Closing Thoughts:

Nobody can master everything mentioned above. However, I think the best full stack developers have at least a general grasp of all of these topics plus a genuine interest in everything software related (except enterprise software, which you have to be paid $$$ to use or work on).  Pure dev skills aside, all full stack developers should have a good chair and a sit-stand desk.

The person who grunts every time someone mentions JavaScript, cloud hosting, or NoSQL is going the way of the dinosaur.

If I left something out please let me know.  In my original post from 2012 I didn’t mention any specific technologies because I wanted it to be buzzword free and timeless. This time I took the opposite approach.

If you’d like more in depth explanation of the topics above, please check out my book on fundamental software skills.

 

Posted in Application Development, Code, For New Developers | 1 Comment

ADA Compliance Tips for Web Developers

After working on projects involving ADA / WCAG compliance I can assure you, the internet sucks for people who are visually impaired.

Unless you are on a government website (required by law to be compliant) or a website that is specifically designed for visually impaired users, odds are it is difficult if not impossible to navigate with a screen reader or other aid.

I’m not an expert in ADA / WCAG compliance. I’m a software developer who sometimes works on the front end. My first suggestion is to go read the actual specs a few times. They are pretty easy to grok in an afternoon: Web Content Accessibility Guidelines Overview

Here are my observations and tips about making a website ADA / WCAG compliant from a software development perspective:

Contents:

  1. History of web standards for accessibility and current situation
  2. Run a scanning tool
  3. Make sure the site can be navigated using the keyboard
  4. How to enable keyboard navigation in your local browser
  5. Do not disable outlines!
  6. Code your HTML with WCAG in mind using ARIA and linting
  7. Actually test to make sure the site is usable by people with disabilities
  8. Stop taking good vision for granted

History of web standards for accessibility and current situation:

There are a number of laws and standards impacting web accessibility (partial list below):

Americans with Disabilities Act (ADA) of 1990 – “prohibits discrimination against individuals with disabilities in all areas of public life, including jobs, schools, transportation, and all public and private places that are open to the general public”. Websites and some apps can also be considered “public places”, but whether they qualify has become a hotly debated legal question.

ADA Amendments Act of 2008 (ADAAA) – makes changes to the definition of the term “disability,” clarifying and broadening that definition

Section 508 amendment to the Rehabilitation Act – requires government websites (and IT programs) to be accessible to people with disabilities. Includes public schools, federally funded non-profits, and federal websites.

The Web Content Accessibility Guidelines (WCAG) – a set of guidelines focusing on websites and HTML. There are three tiers of compliance (A, AA, and AAA). The standard has a version history, 1.0 (1998), 2.0 (2008) and 2.1 (2018). As a developer WCAG will make the most sense to you!

Recently there has been a flood of lawsuits in this area, claims of frivolous lawsuits, and bills in Congress to limit said lawsuits.

Business-wise, the fear of being sued into the ground is driving executives to invest in ADA / WCAG compliance.

Legal motivation aside, from a software development perspective I think making sites accessible is the right thing to do.

 

Run a scanning tool:

There are countless scanning tools out there that look for common problems like color contrast issues, missing alt tags, empty headings, etc. Some are free, some are paid, and some are so expensive they make you contact their sales team to get a quote.

Getting the scanning tools to pass should be pretty easy for any web developer. The fixes are usually about an hour per line item (alt tags, title tags, labels, etc). However, some code bases may be set up in ways that resist even these simple fixes.

Here are some of the testing tools I found useful, no affiliate links here.

Accessibility Insights – a Chrome extension that runs on the current page. It only looks at what is currently on the screen, so if you have elements that appear at different responsive breakpoint widths you’ll need to test at each of them. It struggles with popovers and dialogs, and it annoyingly opens a new Chrome window after each run.

ARC Toolkit – a Chrome extension that runs on the current page inside the dev tools area.

WAVE Web Accessibility – a web based tool that visually shows you where the problems are on your page. The UI for presenting the problems takes a little getting used to. The nice thing is you can browse around the site inside their wrapped window and check other pages without having to paste in each URL.

Power Mapper – a web based scanning tool that catches a lot of ADA issues the Chrome extensions above do not. It also checks a wide spectrum of problems including broken links, usability issues, privacy issues, etc. Not free, but it has a free trial, with cloud and desktop versions.

Accessible Colors – useful for fixing color contrast issues. The tools above flag contrast problems, but they don’t tell you how to fix them. Accessible Colors shows you the nearest compliant color pair with a visual preview and hex codes.
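
If you are curious about the math those checkers are applying, here is a quick Python sketch of the WCAG relative luminance and contrast ratio formulas (AA requires at least 4.5:1 for normal text, 3:1 for large text). The sample hex values are just illustrations.

def relative_luminance(hex_color: str) -> float:
    # Convert an sRGB channel to linear light, then weight per the WCAG formula.
    def channel(c: int) -> float:
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b)

def contrast_ratio(fg: str, bg: str) -> float:
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(contrast_ratio("#777777", "#ffffff"))  # ~4.48, just misses AA
print(contrast_ratio("#767676", "#ffffff"))  # ~4.54, passes AA

Tools like Accessible Colors presumably run this same calculation and then nudge one of the colors until the ratio clears the threshold.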

Countless other WCAG tools can be found at W3’s database of Web Accessibility Evaluation Tools.

A fun thing to try is to run the compliance scanner against major websites like Apple, Google, or even run the scanner against itself!

 

Make sure the site can be navigated with the keyboard:

Keyboard navigation is probably one of the most overlooked UI/UX features of modern websites. With the emphasis on mobile, responsive design, and touch interfaces, it is easy to forget about the clunky old keyboard, but keyboard access is part of the WCAG 2.0 spec.

Goals:

  • Ensure every button, field, and link is accessible by tab key.
  • When a field is focused, it should be obvious via a visible outline effect.
  • Make sure there are no tab sinks where the focus gets stuck, trapping the user.
  • Include a ‘skip navigation’ button at the top of the page.

As a developer, you need to know the difference between:

  • tabindex="0" – allows a <div> or other page element to receive focus when tabbing through the page.
  • tabindex="-1" – allows a <div> or other page element to receive focus via script, e.g. $('#thing').focus(), but it is not in the keyboard tab order. Useful for resetting the focus to a helpful spot without showing an outline on anything.
  • tabindex="{1 or higher}" – sets an explicit tab order for fields, but is generally a no-no. On my own blog I set tabindex="1" on the hamburger menu since it needs to be focused first, but it gets injected at the bottom of the DOM.

For more:

 

How to enable keyboard navigation in your local browser:

Not all browsers have full keyboard navigation turned on by default.

Chrome 74 automatically allows you to tab between links, input fields and buttons. It defaults to a dotted outline.

Firefox 67 on Mac by default only lets you tab into inputs. To allow it to tab between links, you have to enable a special setting. See this Stack Overflow answer. Note the MacOS setting mentioned in the SO answer had no effect for me.

Safari requires you to turn on keyboard navigation under Preferences, Advanced, ‘Press Tab to highlight each item on a webpage’.

 

Do not disable outlines!

Online tutorials point out how to disable the annoying focus outline by doing:

*:focus { outline: none; }

This kind of ‘free advice’ fails to point out the implications: it makes your site impossible to navigate by keyboard, and therefore not compliant with WCAG!

 

Code your HTML with WCAG in mind using ARIA and linting:

Accessible Rich Internet Applications (ARIA) attributes give hints to the browser and screen readers. They are fairly easy to add to divs, labels, inputs, etc.

For example, a custom radio button should have ARIA attributes like so:

<span role="radio" 
	aria-label="Option A"
	aria-checked="false"
	aria-disabled="false"
	tabindex="0">
</span>

ARIA resources:

With front end frameworks like Angular and React, linting tools can be configured to require ARIA tags, labels, titles, etc. The A11Y Project is a good resource to look into.

Open source projects for accessibility scanning during development:

 

Actually test to make sure the site is usable by people with disabilities:

WCAG has a long list of requirements. Some of the requirements are subjective. Just because your site passes a scanner and supports keyboard navigation doesn’t mean the job is done.

Making a site ADA / WCAG compliant adds several new dimensions to QA testing. Not only should you be testing cross platform + screen size, now you need to test cross platform + screen size + keyboard input + screen reader mode + zoom level.

A partial list of items to watch for:

  • Provide alt tags for all images.
  • Provide text captions for audio and video.
  • Provide text alternatives for audio and video (sign language, text description, etc).
  • Do not force a horizontal or vertical device orientation.
  • Clearly label form fields.
  • Let users turn off animations and sounds.
  • Do not design content in a way that is known to cause seizures.
  • Support alternative input devices and screen readers.
  • Allow the user to zoom the page without breaking anything.
  • Give links a title if their inner HTML is something like an image or SVG.
  • Provide a sitemap page or search functionality.
  • Ensure the color contrast on all text, buttons, inputs, etc is at least 4.5:1 between foreground and background (3:1 for large text).
  • Make sure text spacing is adequate.
  • Allow lots of time to complete tasks, and if the user must re-authenticate, put them back where they were.
  • Allow the user to turn off interruptions, like those annoying ‘signup for my free crap’ popup dialogs.

I get the feeling that making a site fully compliant for everybody is a never ending project. It requires a team effort among designers, UX’ers and software developers.

 

Stop taking good vision for granted:

My 20/20 (on a good day 20/15) vision is something I take for granted. My grandmother (a former COBOL programmer) is nearly blind due to macular degeneration. It follows that my vision will likely someday fail me as well. One thing I am looking forward to in my golden years is having lots of time to surf the internet. But if I’m partially blind it probably won’t be worth it. I usually have half a dozen tabs open. I would not be happy listening to a computer voice stutter its way through the source of some website, reading all the links and titles as I feebly tab my way around… Maybe I’d have to switch to listening to podcasts, but I doubt I have the patience for that.

Posted in Application Development, Code | 1 Comment