Webservice API design tips – correct pagination and exposing deleted rows

After working with dozens of REST, SOAP, and ‘ad-hoc’ web services / APIs, I’ve noticed a similar set of design problems at companies big and small. One gotcha I almost always see left out of an API is an easy way to determine which records were deleted or moved on the backend. Another gotcha is implementing pagination and sorting in a helpful way, including ‘feed’ style listing APIs where the underlying data changes constantly. I’ll explain the solutions below after a brief introduction.

Overview of a ‘books’ listing endpoint:

Let’s say we have a web app, a native iOS app and a 3rd party system that need to look-up books in a database. A RESTful API is perfect for this!

Let’s make the API a decent one by allowing keyword filtering, pagination, and sorting.

# listing of book records, default sort, page 1 implied, default page size of 10
GET /books
{
record_count: 24178,
page: 1,
results: [
	{title: "Calculus 1st Edition", publisher: "Mathpubs", id: "15878"},
	{title: "Geometry 4th Edition", publisher: "Heath", id: "65787"}
	....
]
}
# listing of book records that contain 'python' as a search match
GET /books?q=python
{
record_count: 147,
page: 1,
results: [
	{title: "Python", publisher: "O'Reilly", id: "74415"},
	{title: "Fluent Python", publisher: "O'Reilly", id: "99865"}
	....
]
}
# listing of book records, sorted by title
GET /books?sort=title
{
record_count: 24178,
page: 1,
results: [
	{title: "Aardvark's Adventures", publisher: "Kids books", id: "124789"},
	{title: "Aardvark's Explained", publisher: "Zoolabs", id: "988741"}
	....
]
}
# get the 10 most recently updated books related to python
# note the minus (-) sign in front of updated_at; that is a Django convention, but in your API do it however you want (perhaps "NewestFirst") - just keep it consistent
GET /books?q=python&sort=-updated_at&page_size=10
# next get the 11 - 20 most recently updated books related to python
GET /books?q=python&sort=-updated_at&page_size=10&page=2

My notes on sorting a webservice listing endpoint:

  • By default, sort the results by something natural like title or date created if the sort parameter isn’t supplied.
  • Allow the client to specify a sort order. Validate it against the list of options the server allows, and reject invalid values with a 400 (Bad Request) error.
  • An essential sort option is the time a record was last updated, newest first (typically updated_at desc). With it, a client can crawl through the pages until it hits a date it has already processed and stop there. So many APIs I’ve worked with overlook sorting by updated_at desc. Without it, a client is forced to crawl the entire listing to find anything new or updated, which is very inefficient for large databases with a relatively small number of regular changes or additions.
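To make the validation step concrete, here is a minimal sketch; the ALLOWED_SORTS whitelist and the validate_sort function are my own names, not from any particular framework:

```python
# Whitelist of sort keys the server is willing to order by.
# A leading "-" means descending, following the Django convention.
ALLOWED_SORTS = {"title", "-title", "created_at", "-created_at",
                 "updated_at", "-updated_at"}

def validate_sort(sort_param, default="title"):
    """Return a safe sort key, or raise ValueError so the view can send a 400."""
    if sort_param is None:
        return default
    if sort_param not in ALLOWED_SORTS:
        raise ValueError("invalid sort option: %r" % sort_param)
    return sort_param
```

Because the whitelist is checked before the value ever reaches a query, this also prevents clients from ordering by sensitive or unindexed columns.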

My notes on paginating a webservice listing endpoint:

If your data set has more than, say, 10 rows, adding pagination is a good idea. For very large data sets it is essential, because too much data in a single response can crash the server or the client.

  • Implementing pagination is a matter of the proper LIMIT / OFFSET queries on the backend, though that varies by ORM and data store.
  • One annoying thing that may dissuade you is that the server should return the total count of matching records in addition to the slice of rows for the current page and page size. This is so the appropriate page links {1,2,3,4…} can be generated. Getting the overall count can be a performance hit because it involves an extra query. If you want solid pagination, you just have to bite the bullet on the count query.
  • The client should be able to tell the backend the page size it wants, but it should be validated (say between 1 and 100 most of the time).
  • Really good REST frameworks like Django REST Framework offer ‘next-page’ and ‘previous-page’ URLs inside the JSON response – very handy for paging!
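The notes above boil down to a LIMIT / OFFSET query plus the extra count query. Here is a sketch using Python’s built-in sqlite3; the books table and its columns are illustrative, and fetch_page is my own name:

```python
import sqlite3

def fetch_page(conn, page=1, page_size=10):
    """Return (total_count, rows) for one page of the books listing."""
    page_size = max(1, min(page_size, 100))   # validate client-supplied size
    offset = (page - 1) * page_size
    # The extra query that makes page links {1,2,3,4...} possible:
    total = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
    rows = conn.execute(
        "SELECT id, title FROM books ORDER BY title LIMIT ? OFFSET ?",
        (page_size, offset)).fetchall()
    return total, rows
```

An ORM hides the LIMIT / OFFSET arithmetic for you, but the count query still runs under the hood, which is where the performance hit comes from.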

My notes on paginating a ‘feed’ style listing:

Some data sets are a lot more wild than books and change constantly. Let’s take the example of a twitter style feed, where bots, celebrities, teenagers, and software developers waiting on unit tests are tweeting their heads off in real time.

In this case, the database needs to organize records by a natural sort. Twitter has the concept of an ‘id’ that is sortable. Yours might be the updated_at flag or some naturally sorting hash that goes on each record (maybe the primary key). When the client loads the feed, the first call asks for a page of data with a given number of rows (say 50). The client notes the maximum ID and the minimum ID it got (typically on the first and last rows respectively). For the next API call, the minimum ID gets passed back to the server. The server then returns the next 50 rows after the minimum ID value the client saw. The server could also return the number of ‘new rows’ on a periodic basis with an ID higher than the maximum ID the client initially got. It has to be done this way because while the user was reading their tweets and scrolling down, it is possible many new tweets were created. That would cause everything to slide down and screw up traditional pagination.
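A sketch of that cursor scheme in plain Python; fetch_feed and crawl are hypothetical names, and a real server would query its data store rather than filter a list:

```python
def fetch_feed(rows, limit=50, max_id=None):
    """Server side: return up to `limit` rows with id <= max_id, newest first.
    `rows` stands in for the data store, as a list of dicts with an 'id' key."""
    visible = [r for r in rows if max_id is None or r["id"] <= max_id]
    visible.sort(key=lambda r: r["id"], reverse=True)
    return visible[:limit]

def crawl(rows, limit=50):
    """Client side: page through using the minimum id seen so far as the cursor."""
    page = fetch_feed(rows, limit)
    while page:
        yield page
        min_id = page[-1]["id"]               # oldest row on this page
        page = fetch_feed(rows, limit, max_id=min_id - 1)
```

Because each request anchors on an id the client has already seen, new rows arriving at the top of the feed cannot shift the pages underneath the reader.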

Twitter has a more in depth tutorial here:
https://dev.twitter.com/rest/public/timelines

What about deleted or moved records??

Getting at deleted records in an API is a practical problem I’ve had to solve several times. Think of the case where a background process scrapes an API and keeps tabs on what changes. For example, social media posts or content records in a CMS.

Let’s say an hour ago, the listing API was scanned and all data was retrieved and our copy is in perfect sync with the other side. Now imagine the book with ID 789 gets deleted on the server. How do we know that 789 got deleted?

Invariably, I have to ask the people who made the API, and they write back and say something like, “it can’t do that, you have to page through the entire set of data or call for that individual book by ID”. What they are saying is: on a regular basis, do a full scan of the listing, compare it to what you have, and anything you have that the server doesn’t was deleted on the server.

This situation is particularly painful with very large data sets. It can make nightly syncs unfeasible because there is just too much data to verify (rate limits are quickly exceeded or the sheer processing time is too high). Let’s say you are forced down that road anyway. You have to be very careful when triggering deletes on your side, since a glitch in the API could cause accidental deletes. When the API goes down or responds with an empty result set, the scraping program might think “great, I’ll delete everything on this side just like you asked, since it looks like nothing exists anymore!”. To prevent that kind of disaster, in the past I’ve limited the maximum number of deletes per run and alerted when a run found an excessive number of deletes.
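That safety valve can be as simple as a hard cap plus treating an empty response as an error. A sketch; MAX_DELETES_PER_RUN, the threshold value, and compute_deletes are all my own inventions:

```python
MAX_DELETES_PER_RUN = 100  # tune for your data set

def compute_deletes(local_ids, remote_ids):
    """Return the ids that are safe to delete locally, or raise if the
    result looks suspicious. An empty remote listing is treated as an
    API failure, never as a legitimate mass delete."""
    if not remote_ids:
        raise RuntimeError("remote returned no rows - refusing to delete")
    to_delete = set(local_ids) - set(remote_ids)
    if len(to_delete) > MAX_DELETES_PER_RUN:
        raise RuntimeError("%d deletes exceeds cap of %d - alert a human"
                           % (len(to_delete), MAX_DELETES_PER_RUN))
    return to_delete
```

When the cap trips, the sync run should stop and page someone rather than proceed; a stale mirror is recoverable, a mass delete often is not.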

Fundamentally a RESTful API isn’t a great way to mirror data that changes all the time. The reality is, often it is all you have to work with, especially given mobile apps and cross platform connectivity, security requirements, etc.

Here is what I do regarding server side deletion of records in a listing API:

First of all, as a general principle, I almost never design a database to allow immediate physical deletion of records. That is like driving without a seat belt. Instead, I add a deleted column of type tinyint/bool/bit default 0 to every single table. The front end and all APIs are programmed to filter out deleted rows. This way, if something is accidentally deleted, it can easily be restored. If a row has been deleted for more than a given period of time, say 12 months, a cleanup script will pick it up and physically trash it and its associated child rows. Remember – disk space is cheap but data loss is costly.
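Framework-free, the pattern looks roughly like this; in Django the flag and the filtering would live on an abstract base model and a custom manager, and Record, live, and purgeable are names I made up for the sketch:

```python
import datetime

RETENTION = datetime.timedelta(days=365)  # keep soft-deleted rows ~12 months

class Record(object):
    """Minimal stand-in for a table row with a soft-delete flag."""
    def __init__(self, id):
        self.id = id
        self.deleted = False
        self.date_deleted = None

    def soft_delete(self, when=None):
        self.deleted = True
        self.date_deleted = when or datetime.datetime.utcnow()

def live(rows):
    """What the front end and APIs see: deleted rows filtered out."""
    return [r for r in rows if not r.deleted]

def purgeable(rows, now=None):
    """What the cleanup script physically removes: deleted long enough ago."""
    now = now or datetime.datetime.utcnow()
    return [r for r in rows if r.deleted and now - r.date_deleted > RETENTION]
```

The key property is that deletion is just an UPDATE, so “undelete” is also just an UPDATE, and the date_deleted timestamp feeds the deleted-records endpoint described below.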

Another way to do this is to keep a DeletedBooks table. Whenever a Book is deleted, make an entry in that table via a trigger or hook or whatever your framework fires off after a record is deleted. I don’t like that as much as the deleted bit column solution because with hooks / triggers things get complicated and data loss can happen unless they are truly ‘transactional’. However, a DeletedBooks table may be easier to put in place in a legacy system that constantly stymies your efforts to make a good API.

Now that our data layer has knowledge of deleted records, we can add a new endpoint that returns only books that were deleted. This API should be paginated, allow filtering, etc. Note that it includes a date_deleted field in the results, which may be useful to the client. In most cases date_deleted may be substituted for updated_at.

# listing of deleted book records!
GET /books_deleted
{
record_count: 50,
page: 1,
results: [
	{title: "Algebra II", id: "29898", date_deleted: "2016-08-20T18:25:43.511Z" },
	{title: "Trig for Kids", id: "59788", date_deleted: "2016-08-17T07:54:44.789Z" },
	....
]
}

You could also add a deleted parameter to the original listing API to filter for deleted records:

GET /books?deleted=1

A similar implementation can be created for records that disappear for whatever reason – moved to a different account, re-classified, merged, or tossed around like rag dolls. The basic idea is to expose data so clients can decipher what the heck happened instead of having to page through the entire listing API to piece it together.

All the other ‘best practices’ for REST APIs:

If you’ve read this far you are probably committed to building a good API. Thank you. It is a thankless job like many in ‘backend’ software, but let me again say Thank You. Unfortunately, people usually don’t notice when things go smoothly, but a bad API is very easy to notice. Perhaps a few developers have suffered permanent IQ degradation from being forced to write code against poorly designed, undocumented, and janky APIs. Together, we can ensure this is a thing of the past.

All the docs I’ve read say a good API should emit JSON and XML. Your framework should handle that for you, so I won’t say anything more about that.

Eg:

GET /books.json -> spits out JSON
GET /books.xml -> spits out XML

Successful requests should return the HTTP status code 200.

Here are some other status codes you’ll want to use in your API.

  • 400 – bad request (inputs invalid, something screwed up on their end)
  • 401 – unauthorized (user is not authenticated)
  • 403 – forbidden (user is authenticated but can’t access this particular thing)
  • 404 – not found (just like a page not found error on the web)
  • 405 – method not allowed (eg, client tried to POST to an endpoint that only allows GET requests)
  • 500 – internal server error (something screwed up on your end, I hope you logged the details?)

For a complete list of HTTP status codes see:
http://www.restapitutorial.com/httpstatuscodes.html

Other good tips I’ve seen include: Versioning your API, use verbs correctly (GET, POST, DELETE, PUT, …), use SSL, document it, etc.

For more best practices for RESTful APIs see:
http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
http://blog.mwaysolutions.com/2014/06/05/10-best-practices-for-better-restful-api/


How to Deftly Handle Pushy People and Succeed on Software Projects

Working in the software profession you will often run into a situation where someone is pushing your estimates down, asking for the moon, or increasing scope without adding resources or time. There is no instruction manual for handling that sort of thing. The only thing you can really control in that situation is how you respond. I’ve learned a few skills over the years that help.

The fundamental law governing all projects – the Iron Triangle:


Probably the most widely accepted concept from project management is the ‘iron triangle’. The corners are:

  • Scope – what does the project consist of? Includes QUALITY of the work!
  • Schedule – when will the project be released?
  • Cost/resources – how many dollars and or hours are available?

The law of the iron triangle: when one corner is reduced, at least one other corner must expand to compensate. Every conceivable way to cheat this law has been tried.

Another way of looking at the iron triangle is: better, faster, cheaper – pick two and you’ll be fine.

Feeling the squeeze and reacting with empathy:

Normally balancing the triangle is in direct conflict with a project being seen as profitable or successful by at least one if not multiple stakeholders on a project. Invariably someone is going to want to push on one of the corners. You’ll be asked if you can do it sooner, or if you can do it with one less team member, or if you can launch it with just one extra feature. The important thing is not to take it personally. It may be their job to do so, perhaps a performance evaluation or paycheck is on the line? In the famous Getting to Yes book, one of the things that has always stuck with me is separating the people from the problem.

The naive and closed-off way of thinking of those who push you around might be:

  • The CEO with the personality disorder
  • The sales person who lied about the system’s functionality
  • The project manager gunning for a promotion
  • The team member who insists on using language X
  • The team member who insists on staying with older technology

Instead of using labels, the wiser path is to see them as people who have shared goals with you, who want what they think is most important.

  • The CEO who is self assured and wants to ‘win’ (which is good for you)
  • The sales person who is always upbeat and optimistic (without sales, the software is irrelevant)
  • The project manager who bases their self worth on their accomplishments
  • The brilliant and eager developer who wants to use language X
  • The experienced and cautious developer who trusts what they know

Negotiation skills for getting out of the squeeze:

For most senior software professionals, it is second nature to refer back to estimates, bring up concerns of quality, or tactfully point out how rushing in the short term leads to bad outcomes later on. If all you offer is ‘pick two’, or ‘it is impossible’, you are right, BUT whoever you are talking to is coming to you for a solution not a dismissal. Here are some techniques that have helped me deftly get out of pressure situations while making the customer happy:

a) Soft launch / beta release: Release the software in a partially complete state to a very small number of trusted users. Finish up everything else as more users come on board. This allows the schedule to flex some, and possibly even the resources, but keeps the scope intact.

b) Start a road map: Set up a long-term plan (1-2 years) which covers all the necessary features. Illustrate any changes to resources, scope, or schedule on the road map so the trade-offs are apparent. One advantage of having a road map is that everyone in the company can set up the 1.0 features in a way that leaves room for the 2.0 features down the line. Of course, leave room to clean up bugs and technical debt along the way, and make it clear that when these get left out they will come back to steal resources later on.

c) Primary deliverables and secondary deliverables: Primary deliverables are must have items. Secondary deliverables are things that are needed, but don’t take immediate priority. Usually items like report pages, admin screens, data exports, and print friendly pages make great secondary deliverables. Coming to an understanding over what items are the most important can be a huge breakthrough in communication.

d) Make room for technical items: Every release, include at least one or two technical cleanup items. Politely insist on these items at every planning meeting. Explain the consequences of not following through. An example – the SSL certificate on the API is going to expire in 6 weeks. Unless that is updated all users will get locked out of the application.

e) Be honest about your limitations: It can be hard to admit you need some help or that a specific part of the project isn’t suited to your skill set. For rock star developers it is tempting to take on everything all at once. I always tell people – I can’t pick colors or draw… for the sake of the product let’s get a designer on board sooner than later so I can implement what they come up with and we can stay on schedule.

Another tool – Non Violent Communication:

This post was inspired by the book Nonviolent Communication (NVC) by Marshall Rosenberg. NVC explains a formula for effective communication.

As a software developer I liked the way it was presented as a ‘recipe for communication’ with lists of wording choices.

The basic formula is:

  1. State observations that are objective in nature and seek clarification.
  2. State how those observations are making you feel using specific language.
  3. State your needs.
  4. Make specific requests so your needs can be met.

Here is a list of descriptive ‘feeling’ words: https://www.cnvc.org/sites/default/files/feelings_inventory_0.pdf

I don’t know if statements like “I’m feeling enchanted” or “I’m feeling wretched” are 100% work appropriate, but the idea is to be very specific about how you feel so the other side opens up.

NVC Applied during an ambush:

One day early in my career I recall being cornered in front of a white board by multiple senior managers. They insisted I launch a product by a certain date with a certain set of features. I told them the original estimate of four months was what it would take to get it done. They kept asking me questions like “what can we cut?”, “how can we do this cheaper?”, “is your estimate for that already padded?”.

We looked at every aspect of the project together. It went on for an entire afternoon. Every design decision was scrutinized. Fundamental items were scrapped. I walked out of there feeling drained and wondered what kind of people I worked for. In hindsight they were struggling to deliver on a bad promise they had made, all with the best intentions. The project ended up working out fine. I didn’t have to work evenings and weekends to get it delivered. Later I went on to clean up some of the technical debt in subsequent releases. Then I took another job (where I stayed for a long time) and washed my hands of the whole code base.

On that fateful day, had I known about the tools from Nonviolent Communication, I could have done the following:

1) Make observations:

Wow, hang on a minute here, let me catch my breath! I’ve noticed everyone is pushing hard to make this project as cheap and fast as possible.

1b) Seek clarification:

What changed? Did we lose funding, or did a customer back out?

Are you sure you want to do without feature X? My concern is that it is an essential element of the project.

I’d like to know what Joe from down the hall thinks about changing feature Y.

Maybe we should call in project manager Jane to provide some input, because my expertise is in software, not project management?

2) State my feelings in a clear manner:

I’m feeling lost because our commitment to quality and customer satisfaction can’t be upheld when we rush through our work. I’m also feeling flustered because non-technical people are overriding the technical details of a product I am responsible for.

3) State my needs:

My needs are to deliver work that is complete, done to the best of my abilities, and aligned with what my manager and the business expects of me.

4) State my requests:

Would the leadership team be willing to phase in everything in my original spec over a 4 month period, with a soft launch in 3 months? Would the leadership team be willing to allow key technical items that ensure quality as part of each release over the 4 month period?


Django Automatic Slug Generator Abstract Class

Using Django, an example of how to auto-populate a slug field based on the name or title of whatever the thing is. It correctly handles duplicate values (slugs are unique) and truncates the slug if the value is too long. Includes sample unit tests!

The built-in Django model.SlugField() is basically the same thing as a model.CharField(). It leaves the work of populating the slug up to you. Do not override the save() method of your concrete model and copy and paste that code into every model in your project that needs the behavior (that causes dementia).

Django makes it possible to elegantly abstract the ‘fill in my slug’ behavior into an abstract base model. Once you set up the model, the slug generation is entirely automatic. Just leave the slug field blank in your app code, call .save(), and the slug will get populated.

This is compatible with Django 1.8 and 1.9 for me on Python 2.7, and probably earlier versions of Django too.

What is a slug?

A slug is a unique, URL-friendly label for something, usually based on the name of whatever that thing is. For example, the book Fluent Python’s slug might be ‘fluent-python’. This blog, powered by WordPress, makes extensive use of slugs; every post gets one, as you can see in the URL bar.

What about corner cases?

If the title is too long for the slug field, the value will be truncated. In the case of a duplicate name (which may be okay depending on the model), the slug will get suffixed with a number, eg ‘fluent-python-2’.  There are some unit tests below so you can carry these into your project too and be confident.

My field names are not ‘name’ and ‘slug’!

That is okay. It is set up so you can customize the source field name and the slug field name on a per-model basis. See the LegacyArticle example in the Gist.
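The heart of the abstract model is the truncate-and-suffix loop. Here is a framework-free sketch of that core logic; slugify below is a rough stand-in for django.utils.text.slugify, and unique_slug is my own name for the collision loop:

```python
import re

def slugify(value):
    """Lowercase, replace runs of non-alphanumerics with hyphens
    (a rough stand-in for django.utils.text.slugify)."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")

def unique_slug(name, existing, max_length=100):
    """Build a slug from `name`, truncated to max_length, suffixing
    -2, -3, ... until it does not collide with the `existing` set."""
    base = slugify(name)[:max_length]
    slug, i = base, 2
    while slug in existing:
        suffix = "-%d" % i
        # make room for the suffix without exceeding max_length
        slug = base[:max_length - len(suffix)] + suffix
        i += 1
    return slug
```

In the abstract model, this runs inside save() whenever the slug field is blank, with `existing` replaced by a database uniqueness check.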

Example usage in a TestCase:

row = Article()
row.name = 'The Computer Is Talking Sense'
row.save()
self.assertEqual(row.slug, 'the-computer-is-talking-sense')

# create another one with the same name, should get -2 added
row2 = Article()
row2.name = 'The Computer Is Talking Sense'
row2.save()
self.assertEqual(row2.slug, 'the-computer-is-talking-sense-2')

# change the name, the slug should change too
row2.name = 'Change is Good'
row2.save()
self.assertEqual(row2.slug, 'change-is-good')

# slug gets truncated
row = Article()
row.name = '0123456789' * 25
row.save()
self.assertEqual(row.slug, ('0123456789' * 10))

# slug gets truncated, accounts for suffix
row = Article()
row.name = '0123456789' * 25
row.save()
self.assertEqual(row.slug, ('0123456789' * 9) + '01234567-2')

# loop to trigger integrity error
for i in range(1, 10):
    row = Article()
    row.name = 'loop'
    row.save()

row = Article()
row.name = 'loop'
# hijack the local attribute just this once
setattr(row, 'slug_max_iterations', 10)

try:
    row.save()
    self.fail('Integrity exception should have been fired')
except IntegrityError as e:
    self.assertEqual(e.message, 'Unable to locate unique slug')

My Answer To: I want to learn programming, should I attend a code school?

I recently had a reader ask me if they should attend a coding academy because they want to get into programming. Here is my answer:

There are many success stories involving code schools. In fact my grandmother was one of those success stories, but more about her later. Still, I’d be careful of code boot camps, code schools, hack schools, code academies and the like. Read the reviews and talk to grads who are a year or so ahead of you.  One code school in my home town recently shut down suddenly with no warning. Anecdotally, I’ve heard of code schools hiring their own grads as tutors or instructors just so they can claim a high percentage of their grads are employed in the field. At best that is called academic inbreeding, at worst it is a pyramid scheme. Aside from avoiding getting ripped off, you want to make sure of the quality of the curriculum. These schools often jump directly to frameworks and web development without providing proper fundamentals like what is a function, what is a variable, etc.

So yes, this equation is sometimes true:

$12k Code camp + $2k Mac Book Pro = Job in software @ $60-110k/yr

but this makes more sense:

The Knack + Enjoyment = A career you love that pays well

Software takes a special knack and to be good you must love it:

To succeed in software you need many things, but at the top of my list is an innate knack and natural enjoyment for it. I’d rule those out first before dropping money on tuition.

My first criterion is that you must have the gift for coding, and that is probably genetic. What goes into the innate ability to code has been studied and blogged about. The bottom line is, there is no way to teach everyone to write code. It doesn’t work in the same way that almost all of us absorb language as infants or grow up and join Facebook if we want.

My second criterion is that you must enjoy writing code. Know anybody who can concentrate on abstract details for long periods of time? A lot of people I know can’t believe I sit in a room for 10-12 hours a day doing what I do. They say it would drive them totally nuts. Even very gifted and intelligent people struggle and end up hating it. I once had a calculus teacher who despised writing code. He couldn’t get his program to run, even though he said it was mathematically perfect… It was probably just a syntax error.

How many people have ‘The Knack’ as a percentage of the population?

The short answer is somewhere between 1.5% and 3%, but the numbers are pretty fuzzy.


Based on Bureau of Labor Statistics data, in 2014 there were 2.3M jobs in the US directly related to writing code. According to this infographic, of the 17M people in the ‘nerd economy’ worldwide, 44% were in the US. That is ~7.5M, but may include some non-programmers. Let’s take a guess and say in the US it is 5M. Out of a population of roughly 320M in the US, that comes out to ~1.5% of people who write code for a living.

If you walk past the average family on the street, you would see 1.8 children. If you walk past 100 people on the street, 1.5 of them would be employed in software development. Except in the Bay Area, where it would probably be closer to 50! The 1.5% estimate only reflects those who are active, not those who have the knack + enjoyment but do something else, nor those who were downsized. As a planet we can get that number higher if age bias goes away and more opportunity is provided for minorities, women, and people of low socioeconomic status.

What goes into ‘The Knack’ for writing code?

The following traits appear to be closely associated with coders:

  • Analytical skills
  • Problem solving
  • Rapid comprehension (fast learner)
  • Mathematical aptitude
  • Musical proficiency
  • More interest in truth than appearances
  • Good memory
  • Creativity
  • Do-It-Yourself (DIY) hobbies
  • Obsession with science fiction
  • T-shirt collections
  • Difficulty picking good looking color combinations


Software is a journey, it is cyclical, and the learning never stops:

The idea that anybody with $12k can become a great programmer in a matter of 5 months is so wrong. I’ve been programming for almost 20 years and I’m still improving. Who you work with and what you are working on matters. It may take a decade for everything to really start clicking.

We live in a golden age of technology expansion. Right now the world is experiencing another technology bubble. This one may not be as big or as violent as the dot com boom, but programmer demand is out of control. Overall I think demand for software will continue to grow for many years while being bridled by boom and bust business cycles. That is until self aware artificial intelligence gets loose and kills us all (software developers first no doubt).

I recall during the dot com boom my wages were artificially boosted, which I thought was permanent at the time. I also found myself working around a bunch of yahoos who had no business in software. They were ultimately weeded out of the field. That pattern is peaking yet again.

A CS degree, or some kind of complementary degree from an accredited university, should be on your road map. To test the waters you might start with a free online course.

In software it is entirely possible to start off being self taught – like I was. My first paying gig was at age 16. I was literally the kid down the street. At the time I was very rough around the edges. Side projects and eventual part time employment allowed me to pay my own way through college. It was hard, I clocked 20-30 hours per week and took 8-16 credits per term including summers. I got into the habit of running through flash cards every night before I went to bed. Side note – it turns out memories are best formed right before going to sleep, so studying before going to bed helps with retention. What I learned in college wasn’t as immediately valuable as my software skills, but it ended up being the perfect complement to my life. I learned how to write, how to analyze information, and grow up some. I also met my wife in college.

Your assignment:

To find out if software is the life for you, my advice is to get a cheap PC laptop, install Linux on it (Ubuntu or Red Hat), and start with Python, Ruby, or PHP, plus JavaScript and SQL. Online outlet stores like the Dell outlet and the Lenovo outlet have good deals on refurbished hardware (which is basically an extended burn-in test).

Start going to local meetups and hack nights. Get in the habit of learning all the time. Whenever you see a word or acronym you don’t know, google it and make a flashcard for it. Flip through video presentations from past software conferences like OSCON, InfoQ, etc, much of the content is made available for free!

Check out some books on programming from the library. The web is great for bits and pieces, but a published book typically has more depth. The first chapter of most programming books will be about setting up your computer and installing the right programs (called your environment). Then you will write a program that prints ‘hello world’ on the screen. Note how you feel and how smoothly it went. If you are totally flummoxed by this, you may need some face-to-face help, which brings me to the next section.

Get a mentor:

There are many people out there willing to share their knowledge. Some will charge anywhere from $10-$100/hr, others ask nothing in return, and some work for pizza. Mentoring is something I plan to do for the rest of my life, especially in my twilight years to keep my mind healthy and to give back.

I wish I had had more mentoring earlier in my career. My bosses were gracious enough to introduce me to a few senior people. I met with them every few months and emailed more often. I should have taken more advantage though!

It was a simpler world back then. There were fewer frameworks and languages vying for attention. In today’s world, the ‘stack’ or the ‘tree’ of technology is really getting out of hand, with dozens of options in each category. Talk through this with your mentor.

Some encouragement:

Anybody can get into the field of software, not just white guys like me. In the 1970s my grandmother took a programming course, perhaps similar to today’s boot camps. She started on punch cards and later wrote COBOL for the IBM mainframe. They tried to bring her back out of retirement in the late ’90s to help fix Y2K bugs but she wisely declined. I suppose I got the knack from her. As a female, she was a pioneer in the tech industry. I’m really proud of her. Her department had a few female coders. I’ve always noticed companies hold onto their female coders. There is a huge movement out there to get more women and minorities in tech. I fully support that kind of thinking. Yes, at bad companies there is a glass ceiling, harassment, and the old boys club to put up with. Screw those kinds of places. Be like Grandma and go for it.

Posted in Code, For New Developers | Tagged , , , | Comments Off on My Answer To: I want to learn programming, should I attend a code school?

Three Failings of Scrum and the Old School Solution

Everywhere I look I see software rapidly falling into obsolescence that will need to be re-written soon. A good example is how rapidly JavaScript (ECMAScript) is evolving. In a few years nobody will get excited about maintaining an ECMA5 code base… this is already true for some early Angular 1.x apps, am I right??? Another driver of obsolescence is the continual forced march of updates and upgrades from big corporations like Microsoft, Apple and Oracle. In that case, it is all about money. It recently dawned on me that Scrum is also a driver of early obsolescence! In a nutshell, Scrum leads to technical debt and conceptual debt, but it can be managed if a) you can take a medium- to long-term view, technology-wise, and b) you have a healthy level of skepticism about anything that is hyped up.

Call me crazy, but I think software should last more than 2-5 years. There is a LOT of old software out there that needs maintenance. Some of it is built really well and stands up to the test of time (abstraction, configuration, logging, not locked into a particular vendor, etc). However, a percentage of that software starts off as, or morphs into, a totally unmaintainable rat’s nest. I’m talking about the kind of spaghetti that makes developers run and hide. Low quality software leads directly to higher costs and expensive fires like data breaches and glitches that release prisoners early. CEOs hate having to deal with technology risk because they think they can’t control it. I’ll explain below why Scrum feeds into this problem.

An overview of Scrum:

Scrum is a development methodology, a way of organizing work on a software project. The most commonly implemented idea in Scrum is a daily stand-up, where all team members gather together for a quick status report. Each team member lists what they did yesterday, what they will do today, and anything they are stuck on.  Work is broken up into time boxed segments called sprints (usually 1-2 weeks long).  To the team, the current sprint is all that matters. Each sprint has to produce something of value to the end user, so it can’t be all UML diagrams and refactoring.  What gets worked on during the sprint is selected by the product owner, via a backlog of tasks they have ranked in order of importance to them. These tasks should already have estimates associated with them. Ideally, those estimates were reviewed by two or more technical people who are actually doing the work.  The unit of effort can be hours, or more abstract things like story points, or even bananas.

What, estimating in bananas??? It sounds nuts, but it gets into the fact that estimates are often misconstrued, wrong, and used as political footballs in the game that is middle management. When you say 8 hours, what does that really mean? Is it 8 ideal hours? Is it one work day with room for meetings and water cooler sessions? Does it mean the work will be ‘done’ tomorrow? When an estimate is first uttered, it magically turns to stone and becomes a commitment to those who heard it. They may try to talk it down, which causes the stone to shrink, again magically. Estimates are rarely changed unless extraordinary information comes to light. At the end of the day, as a middle manager, with bananas you have more leeway (if your boss will still take you seriously). A login page might take 4 bananas, but that upgrade to the reporting module might be 40 bananas. How many hours is a banana worth? In Scrum, that is determined by looking at past sprints. Measure how many bananas got done vs how many person hours were available on the team that sprint. To tell how much ‘work’ you can get done each sprint, compare estimated bananas vs actual completed bananas, adjusted for variations in team size. That figure is called velocity. A team’s velocity might be 100 bananas or 1000. The actual number doesn’t really matter.
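To make the velocity bookkeeping concrete, here is a minimal Python sketch; every banana figure below is invented for illustration:

```python
# Velocity bookkeeping sketch -- all numbers are made up for illustration.
def velocity(completed_estimates):
    """A sprint's velocity: the sum of estimates for tasks actually finished."""
    return sum(completed_estimates)

# Estimates (in bananas) of the tasks completed in three past sprints.
sprints = [[8, 5, 13, 3], [8, 8, 5], [13, 5, 5, 3, 2]]
per_sprint = [velocity(s) for s in sprints]   # [29, 21, 28]
average = sum(per_sprint) / len(per_sprint)   # 26.0 bananas per sprint
```

A team planning the next sprint would then pull roughly 26 bananas worth of tasks off the backlog, whatever a banana happens to mean.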

There is an ongoing circular debate about how to solve the estimation problem. Topics include relative vs absolute estimates, the estimation process, the units to use in estimation, who should do the estimating, estimate poker sessions, etc. Estimates by nature are probably wrong, so why fuss about them so much? Have the person doing the work make the estimate and then get out of their way. I’ve never seen actual work get done while philosophizing about estimates.

Speaking of philosophy, here is a better question to ponder: what is your team’s definition of done? Does it include writing some documentation? Unit tests? Peer review? QA testing? Deployment? Don’t forget those very important aspects of software development in your estimates. Skip these and somebody will be disappointed later on. The strategy most immature companies use is to entice their staff to become heroes, which makes things even worse later on. They say it takes a start-up about 2 years to paint itself into the corner…

I’ve also found a) tasks have a funny way of getting completed within their estimates (but to varying degrees of quality and done-ness), b) it can be much faster to just do the task rather than sit around estimating.

To be fair, some estimating is a necessary part of the planning phase of any project. No approach is perfect. It is wise to double or triple estimates to allow padding and get the job done right. You don’t know how long it actually took until after the work is done. This bothers middle managers who need to file reports to their bosses showing how efficient and frankly amazing they are at herding cats and seeing into the future. To cover up that huge gap in managerial control, the Scrum people invented banana velocity to explain what is going on in a quasi-economic sense. I suppose banana velocity sounds sufficiently geeky so non-technical people just accept it. In repressive office settings dominated by cubicles, bananas go by the term ‘story-points’, which sounds more official, but not concrete enough to be an hour or a dollar, which still leaves the all important leeway.

Scrum’s Strengths:

As a developer, I enjoy working in a Scrum environment because of the balance it strikes between structure and being left alone to work. Agile software development and its offspring, Scrum and Kanban, achieve balance by making the software development process flexible. Work is based on real goals. Output is measurable. Features are launched regularly, sometimes with a quick turnaround. Scrum’s iterative releases are especially satisfying for the team. The stand-ups keep everyone in sync and problems are surfaced right away. Communication between technical and non-technical stakeholders is baked into the process!

On a green field system, where the code is new, Scrum is fantastic because you can fly through features and put things in the hands of customers quickly. For one-off contractor-style projects Scrum does a great job.

Where Scrum falls apart:

Developers enjoy Scrum because they are left alone to work, but that enjoyment is a form of moral hazard. As a wise master put it: ‘the untroubled ease of programming beneath [Scrum’s] sheltering branches‘ (see The Tao of Programming section 7.1). Moral hazard is when one party is insulated from risk (in this case, the happy programmer), and they behave in a way they enjoy that may cause another party to suffer a loss (in this case the business they work for). A happy developer, working on small tasks week by week, does not see or have reason to care about the big picture. With Scrum, the big picture takes a back seat to the prioritized tasks on the backlog. The lack of big picture thinking and focus on the short term is what makes Scrum so dangerous to software quality.

As the sprints wear on, things begin to change. The velocity begins to drop and estimates begin to creep up, but that’s okay because we are talking bananas and there is room to fudge it. Eventually weaknesses begin appearing in the system design and a few fires break out. Perhaps a scapegoat will be blamed for the headaches? It is the system, not the people, that is causing the actual problem.

There are three main ways in which Scrum causes the code base to be more expensive to maintain as it progresses, which leads to obsolescence.

1) Scrum is like playing drunk jenga in reverse:

Scrum says nothing of technical details, leaving that ‘to you’, as in, an exercise you can do on your own. Seriously, unless we are talking about homework assignments, nobody does those. Given the lack of specific direction there, and the fact that the Scrum product owner in a purely technical sense is likely clueless, I don’t think most Scrum teams address technical details as well as they could, or at all. The new bits continually accumulate on top of the older bits forming a kind of upside-down pyramid that is prone to collapse. Imagine building a 2000 square foot house, where every two weeks living space has to be delivered to the customer. Every two weeks the customer immediately moves into the new living space. For the next sprint, you either start with a small room and build a larger story on top, or plop down a new disjointed foundation and knock holes in the walls to join the rooms. Even with a blueprint the decisions made in the early sprints invariably end up being poor decisions later on.

2) Non-technical product owner steers the ship:

At the start of each sprint, called the sprint planning meeting, there is a little dance that happens between the technical and non-technical staff as to how much customer facing work gets done vs how many backend fixes and improvements are made during the sprint. The product owner controls the money, so they get what they want. Unfortunately, the product owner may not understand technical jargon or care that the RAID array needs its drives updated. System improvements and maintenance tasks usually get pushed to the next sprint until they manifest themselves as a fire which interrupts the sprint and defeats the process. I like to call that spontaneous self combusting technical debt. When that happens, sometimes it is a random surprise, but it is often due to poor management. Scrum seems to bring out the worst decisions in non-technical managers.

Over the life of the software system, Scrum is sort of like driving with a blindfold on. Sure, the first few blocks might be okay, but you’ll be in the ditch a mile or so down the road. I’ve seen this unfold many times. A non-technical product owner (someone who I often respect and work well with) ignores the overall technical health of the system for a short term gain (need to get paid, need to make client X happy ASAP, etc). On a lot of software projects, especially under Scrum, the non-technical leadership has too much sway on technical matters. They may not trust technical people, they may want to assert their dominance, or they may be blindly following Scrum to the letter. Whatever the reason, we need a methodology that tells non-technical leaders to listen, but that hasn’t been invented yet. Only a few are smart enough to listen and lead. I’m not saying each sprint should be 100% ruled by geeks. That is just silly since technical people often have no clue about business. It needs to be a healthy balance, and I’ll get to that later.

Related to estimation and non-technical product owners, consider an example.  Let’s say the backlog has twenty tasks each taking 10 bananas (200 bananas total) and at the very bottom some technical task estimated at 100 bananas that refactors the system making everything else easier. The 100 banana task delivers nothing to the customer, but reduces the effort of the ten banana tasks to just one banana. The net cost of everything is 120 bananas, but requires halting new deliverables for 100 bananas worth of work.
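Spelling that arithmetic out in code (all figures come from the hypothetical backlog above):

```python
# Hypothetical backlog from the example above, measured in bananas.
small_tasks = 20          # customer-facing tasks on the backlog
cost_each = 10            # bananas per task today
refactor_cost = 100       # the big technical task at the bottom
cost_after_refactor = 1   # bananas per task once the refactor lands

naive_total = small_tasks * cost_each                               # 200 bananas
refactor_total = refactor_cost + small_tasks * cost_after_refactor  # 120 bananas
savings = naive_total - refactor_total                              # 80 bananas
```

The refactor path is 40% cheaper overall, yet it requires the product owner to accept 100 bananas of work with nothing customer-visible to show for it.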

I’ve observed that tasks with smaller estimates are universally made higher priority. This is the ‘get the cheap stuff done first’ school of thought. I have rarely seen a non-technical product owner opt for the 100 banana task even though the payoff is clearly there. With Scrum, the product owner gets hooked on immediate results from each sprint. They are turned off from devoting two or three entire sprints to something ‘big’ and ‘risky’. Thus, the big important technical stuff never gets done. This leads to problems and the technology gets blamed instead of the real culprit – the process and people managing the technology.

3) Commoditization of developers:

Others have pondered whether Scrum is the McDonald’s of software development. Comparing any software development process to McDonald’s is a bit rough. Though it is true: Scrum views developers the same way McDonald’s views burger flippers. Scrum treats developers as interchangeable resources capable of producing bananas.

Career-wise, under Scrum, developers rarely get a chance to do big picture thinking beyond the current sprint. The backlog, which is prioritized by the product owner, dictates the engineering effort each sprint. No work is supposed to span multiple sprints. There is a limit to career growth that can happen in this setting. The post Why Agile and especially Scrum are terrible goes into more detail on that subject. It hypothesizes that some companies use Scrum to turn low wage C players into more than the sum of their parts. Bizarrely, this may be seen as a ‘profit’ for the company using that strategy. In reality the best people in software either find a culture they fit with, program in comfort under the shade of the corporate tree branches (and could give a shit about banana velocity), or they create their own companies.

How to use Scrum correctly:

So, what can we do about the fact that Scrum leads to problems down the road? In the software world, as usual, the answers lie in The Mythical Man Month, which was originally published in 1975, way before Agile was a thing (roots in the mid 80’s at the earliest).

In software, there are three main wheels turning: the business, the architect/designer, and the implementer (aka coder). Scrum places the business in control of the implementer and kills the architect. This problem can be remedied with the following:

To address weakness #2 Non-technical product owner steers the ship:

a) Get a product owner who understands the business AND the technology.

or

b) The product owner is replaced by a committee made up equally of business types and technology types who prioritize the back log on a regular basis.

To address weaknesses #1 drunk jenga design and #3 commoditization of developers:

The architect/designer skill needs to be resurrected and given prominence. Work will still be broken into sprints, but that work must adhere to the big picture technology-wise. In other words, there is a plan in place, a road map that lives outside whatever happened to bubble up to the top of the backlog this week. That plan should influence what bubbles to the top of the backlog.

This role could be filled by a lead or senior dev. It should involve junior devs on a rotating basis to give them a career boost and learning opportunities.

The Mythical Man Month advocates for an empowered designer. In reality, many software architects are powerless ‘technical leaders’, firing off PowerPoint email attachments in excess of 10MB. Software developers are largely free to ignore all the ivory tower bullshytt. If the software architect has no teeth, you might as well just laugh behind their back with everyone else. They gave up their coding skills for PowerPoint anyway, didn’t they?

To ensure quality and consistency, many open source projects nominate a benevolent dictator for life – BDFL. Hey, if you want to get big things done, history has shown a dictatorship is the way to go. The people’s feelings be damned! The BDFL makes decisions that sometimes screw over a few small groups, but they are working towards the greater good for the project.

All Scrum projects would benefit from a technically oriented benevolent dictator (not necessarily for life). This role is not the Scrum master. This person must understand the business. They must also be insulated from petty politics and short-term incentives. The equivalent might be giving the CTO a 25% stake in the company, and starting with the understanding that sometimes the business won’t get what they want right away. The business side needs to trust the CTO’s judgment implicitly like Kirk trusted Spock.

Posted in Application Development, Business, Code, Work | Tagged , , , | 2 Comments

Is localhost development obsolete?

Topic explored: Someday soon developers will only need a basic Chrome Book and a wifi connection to do their work. No software will be installed locally other than a browser.

I’m not so sure this trend will pan out across the board, but there are several reasons it makes sense.

Software development traditionally requires a high end machine capable of running everything the server needs plus an array of development tools. That translates to a non-trivial setup process and leads to subtle variations in what packages are installed. Some languages try to make life easier, for example Python with virtualenv, or Ruby with rvm, but it is rarely a 100% perfect match between all team members and the production servers.

Why is localhost bad?

Using the exact same system libraries in dev, QA, staging and production is a smart thing to do because it eliminates bugs related to differences between versions. As a contract developer with multiple clients, I often have several projects going at once on the same development laptop. Keeping all the dependencies wired correctly gets annoying sometimes, but I’ve kept good notes and for the most part it doesn’t get in the way.

Dependency hell is a real place and I’ve been down there too many times.

In the modern world we solve problems by outsourcing them to the cloud. So why not outsource localhost to the cloud?

The winning combination as I see it is:

  • Web Shell for Vagrant / Git
  • GitHub (or BitBucket) for collaboration
  • Web based IDE
  • Slack – not required but might as well publicly get on the Slack bandwagon now, ’cause it does make my life better.

In a nutshell, this new solution allows developers to edit code in a browser tab, click a button to launch a vagrant instance on AWS,  access shell commands in another browser tab, and integrate perfectly with source control. No need for any development libraries or tools installed locally. This lends itself heavily to the LAMP / MEAN stacks, but I don’t see why Java, C++, or any platform wouldn’t work with this approach.

Vagrant makes localhost as a server obsolete:


Vagrant is a utility for spinning up virtual machines that run your application. Vagrant is heavily configurable. The config file lives in your project’s source code, typically in the root directory. With Vagrant all team members run the exact same virtual environment. Vagrant integrates with VirtualBox by default, but also Amazon Web Services, VMWare, and allows custom providers. Vagrant links your source code into the app directory it is hosting. When you make edits to your code the VM is automatically updated.

As of Vagrant 1.6 (April 2014), Vagrant started supporting Windows as the server environment. This was a smart business move for Vagrant (if I dare use the word business in the same sentence as an open source project). With 1.6, supporting Windows virtual machines is a major step for Vagrant to be universal and not just a *nix tool for all the l33t people working LAMP / MEAN variants on Macs and Linux.

Web Based IDEs to challenge local development:

A Web Based IDE will have to be downright amazing to get developers to switch in large numbers. It has to have a super fast search feature, auto complete, syntax highlighting, code formatting, and lots of flexibility. Remember, software development is like herding cats, so it has to work with everyone’s finicky little idiosyncrasies. Editing code aside, it will flop without a powerful plugin architecture. I would expect a rich ecosystem of utilities including a database explorer, command line tools, XML / JSON viewers, web services, test suite runners, file comparison, etc.

I have PyCharm, Eclipse, IntelliJ, PHPStorm, and Sublime Text currently installed on my Ubuntu development laptop. I have all that plus MSSQL Studio and Visual Studio on my Windows desktop (because some of my work does require Windows). That might be a low number of IDEs for a typical developer. For brevity, I didn’t mention text editors… That is a lot of functionality to cram into a browser, but people are out there working on it.

Here are some of the current contenders (in no particular order):

I’m not seeing an extensive plugin architecture from any of them… Maybe JetBrains can pull it off? They don’t seem to be working on anything publicly yet. From a business perspective they have no real incentive to cannibalize their current products. Besides, JetBrains integrates with Vagrant via a plugin, and that solves most of the issues.

That feeling when you are stuck without a tool you need:

Developer A: The application code, the server environment, and the IDE are now in the cloud. Yes I can finally buy a Chrome Book!!!

Developer A: Wait…. what about the database??

Developer B: On the Vagrant instance or in the cloud, duh…

Developer A: Yeah, let’s all buy Chrome books!

[A trip to Best Buy, and a few minutes later…]

Developer A: Cool, the app is loading! But wait…. I want to run a query. How do you access the database?

Developer B: Umm… command line, duh…

*music from Psycho plays*

Developer A: Nooooooooooooo!!!!!!

The command prompt is not a tool I like to use for data exploration:

Don’t get me wrong, I can navigate the SQL command prompt with the best of them. But let’s be honest, it SUCKS for wading through complex data. When there are enough columns to cause line wrapping per row it gets impossible to read. What about pasting huge queries? Every mature app has at least a few queries that span multiple screens, amirite? The SQL command line REALLY SUCKS for debugging lengthy queries written by ‘intelligent’ ORM frameworks or the bastard who writes SQL using string concatenation with inline if/thens, redundant joins, wanton disregard* for formatting, and overuse of sub-queries {IN(), EXISTS(), etc}.

* Wanton Disregard – legal term meaning severe negligence, extreme recklessness, not malicious but more serious than carelessness, can be evidence of gross negligence, can result in punitive damages depending on severity.

There are many examples out there of web based data explorers but they are clunky at best (take PHPMyAdmin for example). A good web based SQL explorer supports multiple tabs, allows saving of SQL, and shows a basic picture of the database entities. MySQL Workbench, HeidiSQL and MSSQL Studio are the three tools I mainly use today. In the past I’ve used Toad, Navicat and DbVisualizer. They are great tools as well. In fact paid tools are generally better.

Side note – I was really hoping the Oculus Rift DK2 would be a good platform to build an app for data exploring, but it makes me sea-sick…

What’s the actual payoff?

If we are going to outsource something, we expect to save some money too. Economically, unless I’m missing something, the payoff this new approach provides for run of the mill software development isn’t really that big.

  • If your company already has QA + staging environments, in theory you’ll catch bugs related to environment differences anyway.
  • If you don’t have QA + staging, you’ve got bigger problems to worry about than minor package differences on some contractor’s laptop.
  • Bugs come in a wide range of shapes and sizes. Even if there is a bug due to environment differences, it is a small percentage of overall bugs that happen.
  • Vagrant alone solves the issue of keeping everyone’s server environment the same, and it is free.
  • The cost savings of an ‘automatic’ environment setup is a rounding error compared to a developer’s cost per year. Crappy developers take ages to get their environment going because they don’t understand $PATH or other basics. For me it is typically under an hour to get up and running. Good software shops have scripts that assist the developer in obtaining database dumps and the like.
  • If developers all require cloud instances to be spun up during development that is an added cost on top of licenses / subscriptions for the IDE.
  • If the infrastructure running the Web Based IDE goes down, all your programmers are idle.

Where a Web Based IDE does make sense:

For certain applications, like cluster computing, or big data (where localhost is just too small), I think it makes perfect sense. In situations where high security is needed, a locked down Web IDE also makes sense (no data or code on localhost at all). This might put an end to developing over a VPN through RDC – thank god for that!

Cloud-based software development tools can work in theory for just about any style of programming, even 3D game developers. Nvidia offers a cloud gaming grid which houses an array of GPUs in the cloud, renders HD video in the cloud, and streams it back to the client. If you can develop Ruby in the cloud, why can’t you do OpenGL or DirectX? At least, that is what Nvidia is saying. Sounds like fun!

>>> "there's no place like localhost... " * 3
Posted in Application Development, Work | Tagged , , | Comments Off on Is localhost development obsolete?

Example Django Model Count and Group By Query

The Django framework abstracts the nitty gritty details of SQL with its Model library (the M in MVC). It is straightforward for standard lookups involving WHERE clauses. But 10% of the time I need it to do something fancy, like a GROUP BY aggregate query. That required checking the docs to see ‘how the heck do I make Django do a GROUP BY with a count(*)?‘. I’ll explain in detail below with examples. Django has a method called raw() for running custom SQL, but that is a last resort. Thankfully Django supports this case really well and did exactly what I was expecting.

This information applies to Django 1.8.2.

Example ‘Bike’ model:

In this example, the Bike model has paint color, seat color, and category:

from django.db import models

class Bike(models.Model):
    name = models.CharField(max_length=50)
    paint_color = models.CharField(max_length=255)
    seat_color = models.CharField(max_length=255)
    category = models.CharField(max_length=255)
    active = models.BooleanField()

The SQL I wanted Django’s Model to run for me:

SELECT paint_color, count(*) 
FROM bike
WHERE 
  paint_color IS NOT NULL AND
  paint_color != '' AND
  active = 1
GROUP BY paint_color
ORDER BY paint_color;

-- same thing for seat_color and category
SELECT seat_color, count(*) 
FROM bike
WHERE 
  seat_color IS NOT NULL AND
  seat_color != '' AND
  active = 1
GROUP BY seat_color
ORDER BY seat_color;

SELECT category, count(*) 
FROM bike
WHERE 
  category IS NOT NULL AND
  category != '' AND
  active = 1
GROUP BY category
ORDER BY category;

My report needs a count of all the active bikes by paint_color, by seat_color, and by category. Note that the columns allow null and empty string, so those need to be filtered out of the report.

How to do the GROUP BY / count(*) with Django:

from django.db.models import Count

(Bike.objects.filter(active=1)
    .exclude(paint_color__exact='')
    .exclude(paint_color__isnull=True)
    .values('paint_color')
    .annotate(total=Count('paint_color'))
    .order_by('paint_color'))

For more details see the documentation page on Django Aggregation.

The call returns a list of dictionaries like so:

[
 {'paint_color': u'Green', 'total': 15},
 {'paint_color': u'Blue', 'total': 19},
 {'paint_color': u'Yellow', 'total': 4}
]

Getting fancy – allowing dynamic column substitution by variable name:

The code above is a start, but I don’t want to have three copies of that lengthy model query floating around in my code. This calls for converting ‘paint_color’ into a parameter. I also opted to go with a static method, which I can do like so on the Bike model:

@staticmethod
def summary_report(fieldname):
  allowed_fields = ('paint_color', 'seat_color', 'category')
  if fieldname not in allowed_fields:
    return {}

  return (Bike.objects.filter(active=1)
             .exclude(**{fieldname + '__exact': ''})
             .exclude(**{fieldname + '__isnull': True})
             .values(fieldname)
             .annotate(total=Count(fieldname))
             .order_by(fieldname))

Now the parameter fieldname takes the place of the hard coded string. In the spirit of defensive coding, the method checks to make sure that fieldname is an authorized property on the Bike model in this context. It could also throw an exception, log an error, etc, but it is kept simple for this example. From there, the exclude() calls use **kwargs (keyword arguments) to pass in the dynamic value.
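The unpacking trick is plain Python, independent of Django. Here is a minimal sketch, where fake_exclude is a hypothetical stand-in I made up for Django’s exclude():

```python
# Build keyword arguments dynamically from a string, exactly as the
# exclude() calls above do. fake_exclude is a hypothetical stand-in for
# Django's exclude() -- it just echoes the keyword arguments it receives.
def fake_exclude(**kwargs):
    return kwargs

fieldname = 'paint_color'
result = fake_exclude(**{fieldname + '__exact': ''})
# result == {'paint_color__exact': ''}
```

The dictionary key is computed at runtime, so one method body can serve any column name that passes the whitelist check.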

The data for the Bike report can be obtained as follows:

summary_paint_color = Bike.summary_report('paint_color')
summary_seat_color = Bike.summary_report('seat_color')
summary_category = Bike.summary_report('category')

How to see what SQL Django Query generated?

As I was working on this, I needed an easy way to see what SQL Django was generating behind the scenes. Django Debug Toolbar does it nicely.

To install the Django Debug Toolbar it takes just two steps:

$ pip install django-debug-toolbar

Then add ‘debug_toolbar’ to your INSTALLED_APPS. It requires django.contrib.staticfiles. Refresh your page, and you’ll see the debug toolbar come up:

[Screenshot: the Django Debug Toolbar showing the generated SQL]

Hope this helps!

 

Posted in Application Development, Code | Tagged , , , | Comments Off on Example Django Model Count and Group By Query

Design your own GitHub activity graph, mine is a DNA spiral

I recently turned my GitHub activity graph into an 8-bit looking DNA spiral!

[Screenshot: my GitHub contribution graph drawn as a DNA spiral]

By setting GIT_AUTHOR_DATE and GIT_COMMITTER_DATE it is possible to log a commit at any point in time. The tool I wrote allows you to draw a pattern, sort of like a mashup of Minesweeper and MS Paint for Windows 3.1. Then it generates the commits that match that pattern.
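A minimal sketch of the underlying trick, assuming git is installed locally; the helper name and the example date are mine, not part of the tool:

```python
import os
import subprocess

# Hypothetical helper: git reads GIT_AUTHOR_DATE and GIT_COMMITTER_DATE
# from the environment, so overriding them records the commit at an
# arbitrary point in time -- which is all the graph tool really does.
def commit_with_date(repo_dir, message, date_str):
    env = dict(os.environ,
               GIT_AUTHOR_DATE=date_str,
               GIT_COMMITTER_DATE=date_str)
    subprocess.run(['git', 'commit', '--allow-empty', '-m', message],
                   cwd=repo_dir, env=env, check=True)
```

Each ‘pixel’ in the drawing becomes one or more backdated commits like this, and GitHub colors the contribution graph accordingly.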

Here is the page where you can build your own. If you come up with something cool to put on your profile I’d love to see it. For the record, this wasn’t an original idea. Others have done similar projects (which I reference at the bottom of the tool), but none let you draw right there in the page. This was purely for fun and only took me a few hours to knock out and test.

Posted in Fun Nerdy | Tagged , , | Comments Off on Design your own GitHub activity graph, mine is a DNA spiral

Mastery over Negativity – Dealing with Negative Geeks

I think it is okay to be negative about a given software technology, but it has to be for the right technical reasons in the context of the problem at hand. For the most part what goes on is bashing with scant substance behind it. Thankfully that sort of bashing can safely be ignored, but it is not always easy. We software developers take pride in our work. Hey, I even claim that ‘software is my life’.

A fellow Portland software developer wrote a post on negativity in the software profession, why it is lame, and some steps to address it.

“PHP, possibly the most made fun of language, doesn’t even get a reason most of the time. It is just ‘lulz php is bad, right gaise?’” – Wraithan

This inspired me to break down where the negativity comes from and how to address it in a positive way. As a software developer I am compelled to categorize and organize things, so here goes…

Why are they snickering at that technology and how can I help them see their folly?

Mono-lingual programmers – It is natural to see your first language as the best in the world. It is also the ONLY language you know, so by default it is the best. My advice is to get familiar with multiple languages. That way you can contrast the pros and cons of each language. Now you have a shot at being a master programmer.

Distrust of the unfamiliar – It is human nature to distrust the unfamiliar. This is true no matter how many languages a person knows. Bashing something because you don’t know it is forgivable but screams low emotional intelligence and a weak mind. If I’m pretty sure someone doesn’t know what they are talking about, I try to point out a couple really cool things about what they are bashing and hopefully get them excited about it.

Hubris and self confirmation bias – Again, human nature at play: overconfidence can cause bias. Programmers build up deep specializations spanning many years of experience in a given area. They may even get fancy titles like Principal or Lead, and consider themselves a ‘master’. It is easy to fall into the trap of thinking the skills you’ve worked so hard to attain are the ‘best’ skills. When an alpha geek is bashing something, what I like to do is point out that what they are saying may very well be the case for a given set of problems at the moment, or with a specific version. Nothing in software stays the same for very long. Ignoring that is a failure to recognize how fast technology changes. A good alpha geek will appreciate that point. Take JavaScript for example: when it started, everybody completely hated it! Now JavaScript is everywhere and has gotten a lot better than it used to be. In fact, some of the highest paying jobs as of 2015 are for JavaScript engineers, not Java engineers or C++ engineers as it was in 2005. In 2025 who knows what it will be?

People trying to sound smart – This news article talks about how negative people tend to be viewed as more intelligent. There is a trick to seeing through that. Are they pointing out drawbacks relevant to a task? Okay, that is fine. For example, PHP sucks at building flight control software because it isn’t multi-threaded. Agreed! Or are they pointing out weaknesses that may amount to personal preference or fail to address a specific situation? PHP sucks because it uses globals. Yeah, that isn’t perfect, but you are not forced to use globals in PHP. Every language has pitfalls that should be avoided. If they are not being specific, call it out; make them be specific so they can be more helpful.

Jerks and Gits – The haters be hating… I avoid these people when possible. Some are truly too smart for their own good. Others are frustrated sub-geniuses who feel the world owes them fame. You might be able to learn a trick or two from their criticism. Getting to know them is rarely worth the effort because sooner or later they’ll start hating on you. It amuses me when people publicly (and permanently) reveal this trait on social media or forums, thinking they are being clever.

Concluding Thoughts:

It is wise to see all languages / technologies for what they are: tools.

A software tool is not an extension of one’s identity or ego… unless you actually wrote it. Even then it is best to keep emotional distance from it. If you did write something that became famous I hope for your sake the online bashing and endless stream of bug fix and feature requests did not get to your soul.

Master software developers know that everything has limitations, and they also know what gets the job done. No software is perfect. To launch software on time within budget requires artful compromises.

Posted in For New Developers, Work | Tagged , , | Comments Off on Mastery over Negativity – Dealing with Negative Geeks

Sending emails through Comcast on Ubuntu using ssmtp

ssmtp is a lightweight mail package that is easy to configure and suitable for my needs during local development. It is basically a mail forwarder: it can’t receive email, and it has very few settings relative to a program like sendmail.

Comcast is notorious for requiring email sent on its network to go through its smtp server. Not doing that can get your IP blacklisted and your legitimate emails flagged as spam. I resisted but was assimilated. These settings should work for most ISPs, not just Comcast.

Install ssmtp:

sudo apt-get install ssmtp

Configure ssmtp for Comcast:

You must set up an account with your ISP / email provider and enter the email/password below. I use a dedicated email account for development.

sudo vi /etc/ssmtp/ssmtp.conf

ssmtp.conf content:

root=postmaster
mailhub=smtp.comcast.net:587
UseSTARTTLS=YES
UseTLS=YES
AuthUser=myaccount@comcast.net
AuthPass=****
hostname=mymachine
FromLineOverride=YES

To test it out:

First, save a test message in the ssmtp format; here is how my file looks:

$ cat testmessage.txt
To: youremail@gmail.com
From: you@comcast.net
Subject: test message

Test message for ssmtp.

To send the message:

ssmtp youremail@gmail.com < testmessage.txt

For PHP compatibility:

Edit php.ini, look for the sendmail section, set the following:

sendmail_path = /usr/sbin/ssmtp -t

Last step: restart apache
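On Ubuntu of this era, restarting Apache is typically one of the following (depending on whether your system uses upstart or systemd):

```shell
# Restart Apache so PHP picks up the new sendmail_path
sudo service apache2 restart
```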

Posted in Sys Admin | Tagged , | Comments Off on Sending emails through Comcast on Ubuntu using ssmtp