Django Rest Framework How To Whitelist (Safelist) IP Addresses

Here is how to set up a list of IP addresses / subnets that are allowed to call a Django Rest Framework endpoint. All other IP addresses will be blocked. Using an IP safe list is much easier than dealing with username or token authentication for a REST endpoint. This works great in cases where the API is only used internally by a handful of clients.

First add a list of IP addresses, or IP patterns, to the settings.py file:

REST_SAFE_LIST_IPS = [
    '127.0.0.1',
    '123.45.67.89',   # example IP
    '192.168.0.',     # the local subnet, stop typing when subnet is filled out
]

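Next set up a class that extends DRF’s BasePermission class:

from django.conf import settings
from rest_framework import permissions


class SafelistPermission(permissions.BasePermission):
    """
    Ensure the request's IP address is on the safe list configured in Django settings.
    """

    def has_permission(self, request, view):
        # REMOTE_ADDR can be missing (for example in some test setups), so default to ''
        remote_addr = request.META.get('REMOTE_ADDR', '')

        for valid_ip in settings.REST_SAFE_LIST_IPS:
            # exact match, or a 'starts with' match for entries ending in a dot;
            # without the dot check '127.0.0.1' would also let in 127.0.0.10
            if remote_addr == valid_ip:
                return True
            if valid_ip.endswith('.') and remote_addr.startswith(valid_ip):
                return True

        return False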

This code will do an exact match or a ‘starts with’ match.

The ‘starts with’ match is a quick way to allow a /24 subnet (255.255.255.0). This logic doesn’t support CIDR notation. If the requester’s IP is 192.168.0.101, and 192.168.0. is in REST_SAFE_LIST_IPS  above, the function will return True.
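If CIDR notation is needed, Python's standard ipaddress module can do the matching instead of string prefixes. Here is a minimal sketch of that idea, not the code used in this post: the function name ip_is_safelisted and the habit of writing entries like '192.168.0.0/24' in the list are assumptions for illustration.

```python
import ipaddress


def ip_is_safelisted(remote_addr, safe_list):
    """Return True if remote_addr falls inside any address or CIDR block in safe_list."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address, so refuse it.
        return False

    for entry in safe_list:
        # strict=False tolerates entries with host bits set, e.g. '192.168.0.5/24'.
        # A bare address such as '127.0.0.1' becomes a single-host network.
        network = ipaddress.ip_network(entry, strict=False)
        if addr in network:
            return True
    return False


# Hypothetical safe list written in CIDR notation
SAFE_LIST = ['127.0.0.1', '123.45.67.89', '192.168.0.0/24']
```

Inside has_permission the loop could then be swapped for a single call such as ip_is_safelisted(remote_addr, settings.REST_SAFE_LIST_IPS). A side benefit is that a single address only matches itself, so '127.0.0.1' does not match 127.0.0.10.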

 

Wiring this all together:

Django Rest Framework has a setting called DEFAULT_PERMISSION_CLASSES which can be configured to use the little class above, if you want ALL endpoints to default to this permission logic.

REST_FRAMEWORK = {
    ...
    'DEFAULT_PERMISSION_CLASSES': (
        'yourapp.SafelistPermission',   # see REST_SAFE_LIST_IPS
    )
}

Or you can apply it view by view using the permission_classes attribute. See the Django Rest Framework -> API Guide -> Permissions section for details.

* A point of political correctness here – blacklist / whitelist could be seen as culturally insensitive, similar to the master/slave vs primary/replica issue in database and storage technology.  Blocked list / safe list is much better, although other terms like allow / deny work well too, more ideas here. That is how I coded it in the examples above and I hope you will too.

 

Posted in Application Development

Full Stack Developer Retrospect

Back in 2012 I wrote a post called “What is a Full Stack developer”. Somehow it ended up being the top hit in Google for “full stack” for a number of years. It has had over 1.1 million views since 2015 when I started tracking analytics.

The post has been referenced by many other bloggers which I appreciate.

The post was also blatantly plagiarized dozens of times. The worst offender was a fly-by-night code school that used it for promoting their curriculum without my knowledge or consent. All this gave me a real-world lesson in copyright law but not so much in terms of SEO. Reaching the top of the search results was part luck, part timing, and part good content.

Like it or not the term is sticking:

According to the 2019 Stack Overflow developer survey:

About 50% of respondents identify as full-stack developers

 

Never read the comments:

My post was used in threads on Reddit and Hacker News. In reading the comments (which always leads to a feeling of emptiness), I’ve noticed there are three main camps: supporters, people on a non-web stack where the idea doesn’t fit as well, and critics.

Supporters of the idea have worked with a variety of technologies (mostly web), know multiple programming languages, have worked in varied roles, and don’t take it too seriously. Others want to become full stack developers because they see it as more interesting or better paying. Personally I’m a supporter because I’d rather not be pigeonholed into language X nor have my career tied to a single vendor.

The non-web software developers bring up some good points about how different software stacks can be. Example areas are C++ / assembly, robotics, hardware controllers, AI, etc. This stuff is really cool, very specific, and the opposite of web development. Current web stacks still have the same sort of things I mentioned in 2012 (hosting, data modeling, business logic, API / action layer, user interface, user experience), but not all software systems map well to those layers (like a robotic vacuum cleaner, the Linux kernel, a chat bot, or a smart TV).

Critics tend to doubt that it is possible to have meaningful skills across a wide spectrum, dismiss full stack devs as either unicorns or liars, and claim it is just hype designed to cut costs. There are always trolls online no matter what the subject is.

At least some have a sense of humor about it:

Maybe full stack is just another term for a generalist?

There is nothing wrong with having several technologies listed on your resume, but does that imply a lack of expertise? If you want a Java developer, and they list Java, Python and Assembly, are they diluting themselves or showing off their super powers? Now what if they list Java, HTML and CSS, oh no, now they look more like a “front end developer”.

I don’t think any of this matters as long as they can do the job well.

From personal experience it is possible to be good at SQL, business logic, unit tests, API integrations, JavaScript and HTML/CSS. There are countless frameworks to choose from that boost productivity!

That said, it is impossible to do the work of more than a few people without cutting corners and incurring massive technical debt. Personally I doubt that full stack developers are unskilled because they are spreading themselves too thin. It is more likely that the environment they work in is too crazy.

 

I agree with the critics in some circumstances:

When writing a system that someone’s life depends on (like a medical device or flight control system) then I agree, deep expertise is absolutely required. When a bug could lead to an accidental death, “full stack” generalists and newcomers alike should not be allowed to work on something unsupervised. This is where senior developers need to do the job of directing work, reviewing work, sharing their knowledge and mentoring new developers.

The problem is most corporations have a culture of treating their profit-driven mission as if it is life and death. Meanwhile they are actually building a restaurant reservation system or helping someone buy a shoe online.

 

 

 

 

Posted in Application Development

Why Software Should Not Be Grouped Under Information Technology (IT)

My first programming instructor told the class one day:

“Technology is like floating on a river, and every so often you have to open your wallet and pay someone to stay afloat.”

Partly it is nobody’s fault. Change is constant in this rapid technological boom we are living through. Hardware is getting “better” all the time. Software is to blame just as much as hardware and the two are interconnected. Some change may be attributed to products designed for obsolescence that more or less serve the same function from version to version, with slightly different menus. Companies that make those products must keep up with change too and pass the cost to their customers.

In spite of the cost, many a fanboy/fangirl lies awake at night, excited for the new features coming in the next version of X!! Businesses on the other hand are squeezing everything they can out of their current legacy systems and would prefer not to upgrade. Experience teaches that businesses will eventually be forced to upgrade or risk drowning in the river.

A business leader who is not a technology expert will recognize the ‘technology river’ and fall prey to a serious mistake. Businesses have Information Technology (IT) departments that run their ‘technology’. Typical IT responsibilities include printers, desktops, networking, and a hodgepodge of applications, databases and platforms. Miles of cables, blinking LED lights, email, and the occasional beep or fan noise are what IT has domain over. Keeping all that running is an ongoing cost. Therefore, IT takes away from profit. Even more insulting, IT is a barrier to growth because more IT is always needed for each hire, each facility, each process. IT is overly complex and can cause huge fires for the business, so it can’t be trusted. IT does nothing to help the business get an edge. The leaders think the tactics that make their business succeed should be applied to IT and software.

The serious mistake non-technical business leaders make, as I so nonchalantly committed at the end of the previous paragraph, is to group IT and software into the same area.

The natural role of software in a business is to make money and beat out the competition. The impact to the bottom line from software is completely opposite that of IT! For example, an optimization to a workflow process may take a few days to code, test, document, and deploy. Its unborn potential is to shave a few seconds off millions of orders yet to be processed. The economics of software done right is a pleasant sight to behold. The raft is not sinking anymore, it is flying across the water!

Lowering costs through software innovation is welcome. That says nothing of the greater potential of software to give a company a market edge or to reduce risk.

One of the reasons I love software and get so excited about it is that it is one of the few ideas known to our civilization that can so easily magnify its initial investment. There are no physical atoms to move, no chemical process to undergo, and if you are lucky no regulations or ideological debate in the way. Sure there are electrons, packets, and physical storage, but those work out to be rounding errors. Many intellectual works share this property, such as music, writing, art… A bleak world it would be without these things. At this point in history software can be applied to just about any business and make an immediate impact!

What the world needs is a bright glowing line that delineates between initiatives that keep things running, and initiatives that magnify themselves to make profit. If that line were there leaders would stop grouping software into IT. In my career I have fought to stay on the profit generating side of the line, no matter what kind of work I was doing for whom. The reason is, if you follow the money, the more interesting and better paying projects will be found there. Working in a “cost center” is depressing. Working in a “profit center” is awesome.

I have seen this confusion happen over and over again in my career (software grouped with the cost center, vs software being treated independently as the profit center it should be). Software developers themselves often don’t see the distinction, or are not expected to care, which is unfortunate. I wish it were an amendment to the constitution, or at least added to the Joel Test: “Are developers grouped under IT?”

Again, can we all please stop grouping software under IT?

Posted in Business, Work

Making Django’s Database Connection More Secure for Migrations

Since Django 1.7 (September, 2014) and the introduction of schema migrations, it has always bugged me that Django needs to connect to the database with a user that has pretty close to ALL privileges. That is because when migrate runs it needs to be able to make changes to the schema with commands like CREATE TABLE and ALTER TABLE.

99% of Django database configuration examples I see online show a single default database connection:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'db_name',
        'HOST': 'myhost',
        'USER': 'django_user',
        'PASSWORD': 'XYZ*****'
    }
}

This configuration implies django_user would need at least ALTER and CREATE permissions for migrate to work.

What typically happens is newcomers set it up like the above, then run into a permissions error the first time they run migrate. That leads to Google searches where the “solution” is to run a GRANT ALL for django_user. That solves the problem but it is a big shortcut and opens a security hole. The ALL permission includes potentially destructive privileges like DROP and EXECUTE. MySQL offers a number of problematic permissions that Postgres doesn’t, including FILE, LOCK TABLES and SHUTDOWN.

Ordinarily, connecting to the database with a user that has ALL rights would only be a problem if:

  1. There were a SQL injection vulnerability in Django’s ORM.
    or
  2. The application used poorly coded raw SQL functions (without escaping inputs properly).

In my opinion it is better to grant each layer of the system only the access it needs. When fulfilling standard web requests Django should only be able to SELECT, INSERT, UPDATE and DELETE the tables it works with. The only time it needs the extra rights is when it runs the actual migrate step during a deploy.

The Solution Is Two Database Connections That Differ By User:

Turns out there is a little-known option to the migrate command (--database) that lets you specify which database connection to use.

To take advantage of that, enter two connections under DATABASES in the settings file. They both point to the same database. One uses the basic user that is more locked down. The other is used when running the migrate command.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'db_name',
        'HOST': 'myhost',
        'USER': 'django_user',
        'PASSWORD': 'XYZ*****'
    },
    'default_with_migration_rights': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'db_name',
        'HOST': 'myhost',
        'USER': 'django_migration_user',
        'PASSWORD': 'ABC***'
    }
}

The migrate command then accepts a database argument like so:

python manage.py migrate --database=default_with_migration_rights

NOTE! This example shows the password being entered directly into the config file (and presumably checked into source control). It is better to store passwords / secrets on the server as an environment variable or in a properties file. This Stack Overflow page has examples.
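One way to keep the passwords out of source control is to read them from environment variables in settings.py. The sketch below assumes two variables named DJANGO_DB_PASSWORD and DJANGO_MIGRATION_DB_PASSWORD and a helper called build_databases. Those names are made up for this example, so adjust them to whatever your servers use.

```python
import os


def build_databases(env=os.environ):
    """Build the DATABASES dict, pulling both passwords from the environment."""
    base = {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'db_name',
        'HOST': 'myhost',
    }
    return {
        # Locked down user for normal web requests
        'default': dict(
            base,
            USER='django_user',
            PASSWORD=env['DJANGO_DB_PASSWORD'],  # raises KeyError if unset
        ),
        # Privileged user, only used by `migrate --database=...`
        'default_with_migration_rights': dict(
            base,
            USER='django_migration_user',
            # May be left unset on servers that never run migrations
            PASSWORD=env.get('DJANGO_MIGRATION_DB_PASSWORD', ''),
        ),
    }


# In settings.py this would be used as: DATABASES = build_databases()
```

Failing loudly when the main password is missing is a design choice here. A silent empty password tends to surface later as a confusing connection error.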

 

Setting up the database user:

Configuring the database user will vary by platform.

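Example for Postgres (we have to worry about connecting, and the sequences). CONNECT is a database-level privilege, so it gets its own GRANT rather than being mixed in with the table privileges. The schema is written as db_name here; on a default install it is usually public.

GRANT CONNECT ON DATABASE db_name TO django_user;
GRANT SELECT, UPDATE, INSERT, DELETE
   ON ALL TABLES IN SCHEMA db_name
   TO django_user;
GRANT USAGE, SELECT
   ON ALL SEQUENCES IN SCHEMA db_name
   TO django_user;
GRANT ALL PRIVILEGES
   ON ALL TABLES IN SCHEMA db_name
   TO django_migration_user;
GRANT ALL PRIVILEGES
   ON ALL SEQUENCES IN SCHEMA db_name
   TO django_migration_user;

Two Postgres details to keep in mind: only the owner of a table can run ALTER TABLE on it, so django_migration_user should be the user that owns the tables, and GRANT ... ON ALL TABLES only covers tables that already exist, so it has to be re-run after a migration adds a table.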

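Example for MySQL (the users are created first because MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT):

CREATE USER 'django_user'@'127.0.0.1' IDENTIFIED BY 'XYZ***';
GRANT SELECT, INSERT, UPDATE, DELETE ON db_name.*
   TO 'django_user'@'127.0.0.1';

CREATE USER 'django_migration_user'@'127.0.0.1' IDENTIFIED BY 'ABC***';
GRANT ALL ON db_name.*
   TO 'django_migration_user'@'127.0.0.1';

FLUSH PRIVILEGES;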

Is This Really Worth It?

In reality, even with the extra precaution, if there were a SQL injection vulnerability in your Django app, you are so screwed. 

Unfettered UPDATE access could be used to mix up record ownership, inject content, change prices, etc.

SELECT may be the most dangerous as it would allow hackers to harvest your entire database.

Those would both require guessing table names though.

But at least they couldn’t change the schema!

 

Posted in Application Development, Code

Tips and Tools for Securing Django

Django has a number of built in security measures which are really helpful. In fact, as frameworks go, by default it does a lot of really smart things security-wise. There are a few less commonly known settings that improve security even more which I’ll share below. This information is up to date as of Django 1.11 / 2.1.

Securing Django is part server configuration, part Django settings, and part not being a fool:

There’s an old Russian proverb “Simplicity is worse than thievery” that loosely translates to: “A well meaning fool can do more damage than an enemy or a criminal would intentionally”.

To avoid being a well meaning fool when it comes to Django, start by reading Django’s official documentation on security.  Django handles the following out of the box:

  • Cross site scripting (XSS) attacks. By default the Django template engine escapes HTML special characters in all content that gets outputted to the client, so script tags and other potentially malicious markup are displayed as text instead of being executed. You have to mark content ‘safe’ for it to be rendered without the escaping step.
  • Cross site request forgery (CSRF). With the CSRF middleware enabled (the default) Django requires a secret token on form posts, which stops another site from submitting a form on behalf of one of your logged in users.
  • SQL Injection. Django’s ORM parameterizes all query values so you don’t have to think about it. WARNING: if you are writing custom queries (which may fall under the category of ‘fool’) then you need to take extra precautions yourself.
  • Clickjack protection. Django’s clickjacking middleware sets the X-Frame-Options header.
  • HTTPS / SSL. Django has a solid understanding of SSL and I’ll list the settings I use in combination with SSL. Require SSL from day one on your project and you’ll thank yourself later.
  • Host header validation. Django has an ALLOWED_HOSTS setting so you can ensure the URL you want users to see in their browser’s address bar is the only valid way to access the site (and not by IP, some stray domain name, or a subdomain).
  • Built In Password Hashing. Django includes a User model and a related Permissions system. When using this part of Django, by default passwords are stored securely in the database (via configurable hashing logic). This means only your users know their passwords. The database stores the hash of the password (a salted, one way digest of it, not an encrypted copy). The design allows for flexibility and easy upgrades to future hashing functions.
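
To make the password hashing point concrete, here is a rough sketch of salted, one way hashing using only Python's standard library. It is not Django's code (Django's own hashers live in django.contrib.auth.hashers). The function names, the storage format and the iteration count are assumptions picked for illustration. PBKDF2 is used because it is the algorithm family Django defaults to.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return 'iterations$salt$hexdigest' for storage; the password itself is never stored."""
    salt = salt or secrets.token_hex(16)
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), salt.encode('utf-8'), iterations
    )
    return f'{iterations}${salt}${digest.hex()}'


def check_password(password, stored):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    iterations, salt, _ = stored.split('$')
    candidate = hash_password(password, salt, int(iterations))
    return hmac.compare_digest(candidate, stored)
```

Storing the iteration count next to the hash is what makes upgrades easy: old hashes can still be checked after the default work factor is raised.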

How To Secure Django – Configuration Steps:

  • Enable SSL, and redirect all non-SSL requests to SSL on the web server level.
  • Run a firewall so only ports 80 and 443 (SSL) are open to the world. The only purpose of port 80 is to redirect to 443. The database port, memcache, etc. should be locked down and inaccessible to the outside world. SSH (port 22) should only be accessible to trusted IPs or better yet through a VPN. For goodness sake – shut off FTP!
  • Don’t host the admin site at /admin, set up a custom URL. Yes this is security through obscurity, but it will thwart all the script kiddies scanning for /admin.
    # In urls.py set a custom URL for the admin
    url(r'^make-up-your-own-secret-admin-path/', admin.site.urls),
  • In my live settings.py file I have the following. The full list of Django settings is here.
    DEBUG = False   # required for live environments
    
    SECRET_KEY = '...'  # to get a new value try this generator
    
    # lock down allowed hosts to just the domains I'm okay with
    ALLOWED_HOSTS = ['mydomain.com', 'www.mydomain.com']
    
    # security settings
    CSRF_COOKIE_SECURE = True
    SESSION_COOKIE_SECURE = True
    SECURE_HSTS_SECONDS = 3600
    SECURE_HSTS_INCLUDE_SUBDOMAINS = True
    SECURE_SSL_HOST = 'www.mydomain.com'
    SECURE_SSL_REDIRECT = True
    SECURE_CONTENT_TYPE_NOSNIFF = True
    SECURE_BROWSER_XSS_FILTER = True
    X_FRAME_OPTIONS = 'SAMEORIGIN'
    
    # see also
    # SECURE_HSTS_PRELOAD
    # CSRF_USE_SESSIONS
    # CSRF_FAILURE_VIEW
    
    # only available in Django 2.1+
    # SESSION_COOKIE_SAMESITE
    # CSRF_COOKIE_SAMESITE
    
    
  • Many of the options above rely on apps / middleware being enabled, here is an abridged version of the associated configuration:
    INSTALLED_APPS = (
        'django.contrib.admin.apps.SimpleAdminConfig',
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.messages',
        'django.contrib.staticfiles',
        'django.contrib.sites',
        'django.contrib.sitemaps',
        'django.contrib.redirects',
         ...
    )
    
    MIDDLEWARE = [
        'django.contrib.sessions.middleware.SessionMiddleware',
        'django.middleware.common.CommonMiddleware',
        'django.middleware.csrf.CsrfViewMiddleware',
        'django.contrib.auth.middleware.AuthenticationMiddleware',
        # 'django.contrib.auth.middleware.SessionAuthenticationMiddleware',  # Django 1.x only, it was removed in Django 2.0
        'django.contrib.messages.middleware.MessageMiddleware',
        'django.middleware.clickjacking.XFrameOptionsMiddleware',
        'django.middleware.security.SecurityMiddleware',
        ...
    ]
  • In your database connections, set up a separate user for migrations that has ALL permissions and use a less privileged user for general web requests.

Useful tools for verifying the security of your Django app:

  • I have 46 dependencies in my largest Django project’s requirements.txt file! That is a lot of ghetto software to worry about 😉 I use a tool called safety to check the project’s dependencies for known vulnerabilities like so:
    $ safety check -r requirements.txt


  • Once your site is live, Sasha’s Pony Checkup will scan your Django app for common vulnerabilities and misconfigurations.
  • I find this SSL certificate checker tool useful.

Areas to be careful of:

  • User Uploads are a potential area of concern. Anytime you are allowing someone on the internet to upload files to your server a door the size of a castle gate is opened for potential hacking.  Unfortunately Django doesn’t have a lot of built in things to assist with this. So you need to put in your own control mechanisms such as:
    • Moderation – it would be a bad idea to let User A upload something and let User B see it without an admin checking it first.
    • Validate uploaded files for size, file extension (which tells you squat but blocks common user mishaps), file MIME type, etc.
    • Don’t allow uploads to go into a folder that can be executed by your web server via a web request (think .py files, .php files, etc).
  • Storing passwords / secrets related to the production environment in the application config file is convenient, but they shouldn’t be in source control. The options are 1) set up a config file on the server that the settings file reads, or 2) use environment variables. This Stack Overflow page has examples.
  • Make sure to filter out sensitive data from the logs. Django apps are commonly configured to email the admins when a 500 error is thrown. That email will contain all details of the request including GET and POST parameters entered by the user. To protect against exposing sensitive data in the logs Django has built in decorators sensitive_variables() and sensitive_post_parameters(). Consider a login form that does a POST with fields ‘username’ and ‘password’. If an exception fires during the request that bubbles up causing a 500 error, these decorators will ensure the password value appears as ***** in the log output.
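    # imports needed by both examples
    from django.utils.decorators import method_decorator
    from django.views import View
    from django.views.decorators.debug import sensitive_post_parameters

    # example in a standard view
    @sensitive_post_parameters('password')
    def login_view(request):
        ...

    # example in a class based view
    class LoginView(View):
        ...

        @method_decorator(sensitive_post_parameters('password'))
        def post(self, request, *args, **kwargs):
            ...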
    

    Note that if you are doing your own logging statements then scrubbing for sensitive data is still up to you.
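
For your own logging statements, one option is a logging.Filter that masks sensitive keys before the record is written. This is a minimal sketch: the class name MaskSensitiveFilter, the SENSITIVE_KEYS set, the logger name, and the convention of passing a dict as the log argument are all assumptions for this example, not something Django provides.

```python
import logging

SENSITIVE_KEYS = {'password', 'token'}  # hypothetical list of field names to hide


class MaskSensitiveFilter(logging.Filter):
    """Replace sensitive values when a dict is passed as the log message argument."""

    def filter(self, record):
        if isinstance(record.args, dict):
            record.args = {
                key: '*****' if key in SENSITIVE_KEYS else value
                for key, value in record.args.items()
            }
        return True  # never drop the record, only scrub it


logger = logging.getLogger('myapp.auth')  # hypothetical logger name
logger.addFilter(MaskSensitiveFilter())
```

A call such as logger.warning('login failed: %(username)s / %(password)s', {'username': 'bob', 'password': 'hunter2'}) would then be written with the password masked. It only helps when values are passed as arguments; anything already formatted into the message string goes through untouched.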

Other ideas:

Did I miss anything? If so please let me know in the comments below or through my contact page and I’ll update!

 

 

Posted in Application Development

Working for Equity as a Software Developer

As a software developer I’ve been involved in a number of equity conversations over my career. Mostly what you’ll run into is dreamers with no experience asking you to work for free while they waste your time. Occasionally you’ll encounter a real business person who is worth talking to. It is thrilling to negotiate for equity – simultaneously weighing the adventure, challenge, and potential reward of a new job. Beware, it is a literal “shark tank” in this area of business and you’ll need a lot more than this blog post to guide you!

Below I’ll share some of my perspectives as the “tech person” being recruited into a startup with equity as part of the compensation package.

Disclaimer – I am not a lawyer and this is not legal advice. The information in this post comes from personal experiences and those shared by my colleagues. Examples provided have been anonymized and generalized into a work of fiction.

What is Equity:

Impulse Power vs Warp Power and the allure of equity:

Selling your time is like traveling at impulse power, great for short trips around the solar system.

Owning a substantial chunk of equity is like having warp power. Only warp power can take you to other star systems you can only dream of (e.g., getting really rich).

I was once told “Equity is a funny thing” and I agree with that.

Equity is ownership in a company. It has “value” but it acts more like red matter from Star Trek than regular money. Your job will be to help grow that value. With equity you can’t easily tap into it, transfer it, or even get rid of it. That is until an exit event where the company gets sold and all owners tap into the value at once.

Equity can also be valuable in a different way if you are the majority owner. Majority ownership provides control which gives you access to the company’s cash flow, hiring and firing decisions, and total freedom. Being in the position of a majority owner generally requires you to start the business and put in your own money. This is much riskier up front, but the potential rewards are much higher too.

 

How much equity can you expect?

There are Founders and Later Hires:

If you are coming in at the very beginning of the company you can get a lot of equity and possibly control but don’t expect to get paid right away.

If you are coming in as a later hire you can get a decent wage and a little equity (anywhere from a few percent to a fraction of a percent). Typically that equity will vest over a number of years (you have to stick around to earn it).

Working for a ‘Minority’ share:

A minority share is ownership that doesn’t offer control, typically anything less than 50%. When you are working for someone else who is funding the project, you will be a minority owner.

Working for free?

If you are going to build it for free, ask for at least 51% equity, or 100% of the voting stock and 50% of the common stock.

Equity changes with time:

If you don’t have control, expect to get diluted along the way. New shares will be issued, making your shares less valuable.

 

Watch out for bad people:

I’d say about half of the self proclaimed entrepreneurs I’ve met were good people and the other half were narcissistic sociopaths.

If you get in bed with a scum bag….

For a software developer / co-founder CTO to get anything positive out of startup equity everything has to go perfectly. So much of that is completely out of your hands. Plus there is the chance you are being lied to by a sociopath Founder-CEO. So even if things do go perfectly for the company you may still get crumbs and an unfavorable legal position. Scam artists tend to blow all their money quickly, funneling it into cars, drugs, and the next scam. That makes recovery problematic assuming you can even find them.

They drink their own Kool-Aid….

Some startup founders get so obsessed with their idea it becomes dangerous. At first nobody else believes in them or their idea which is psychologically taxing. So they invent a persona to carry on.  But it can get out of control. In a few cases I’ve seen startup founders justify breaking their word, screwing over their customers, and treating people like trash all for the sake of the company’s mission.

To me, just because you have a startup, you are in debt, and you haven’t had a paycheck in six months it doesn’t mean you are allowed to lie to people, be unethical, or act like you are above the law.

Tips for checking out potential co-founders / startup employers:

  • Weed out the dreamers by asking what their budget is, what their time commitment is, what their plans are for dealing with competition.
  • Watch how they treat others, someday they will treat you the same.
  • In the United States – check out their UCC filings. These are state level publicly registered debts. These kinds of debts are important in bankruptcy.
  • See if they are operating any other businesses.
  • Check their residence and see if the purchase date, location, etc lines up with their story.
  • Check their social media presence, do a Google search, etc. Look for any red flags.
  • Ask who their attorney is. If they don’t have one, they have failed the test.
  • Other names will likely come up in your searches, perform similar lookups on their “known associates”.

Understand what Control and Profit are:

If your partner buys a Tesla with company funds, do you get to drive it?

Control directly translates to personal enrichment even if the company is struggling and taking on debt.

Let’s say you own 20% and your partner owns 80%. Your partner (who has control) can lease a Tesla on company money and keep the car all to themselves. Even though you have 1/5th of the stock you don’t get to drive it. You don’t even get the spare tire! All you can do is look at it with contempt.  In fact, you can be fired by your “partner” at any time since they have control.

Profits are not for minority owners:

Perhaps your agreement clearly says if the company makes a “profit” this year you are entitled to 25% of it! When it comes to zeroing out profit for tax purposes consider how useful control is. Your partner may decide it is essential to now have a yacht for entertaining prospective customers, to hire their family members, and to give themselves a raise, etc, etc. There goes all the profit. This may mean the end of your relationship with them, but what do they care, they now have the money to easily replace you.

Navigating the “shark tank” of equity for software development:

Kinds of crappy offers out there:

  • 1.3% vesting over 4 years, starting salary $900/month.
    • This is below minimum wage which is illegal.
    • Anything below 5% should command a market salary, in exchange for the career risk the software developer is taking on by accepting the assignment!
  • 20% equity for developer, 80% equity for CEO/founder, no wages
    • Developer is allowed to freelance 12 hours/week on the side to pay their bills. Wow that’s really generous.
    • Due to possibility of dilution and lack of control this arrangement leaves nothing but crumbs on the table for the developer.
    • What if the CEO/founder gets bored or loses focus or hires their stupid nephew who “knows computers”? The software developer cannot recover their investment!
  • Zero pay and no paperwork, just come and “be a part of the madness”. WTF!!!
  • Developer was asked to join as a “co-founder” and contribute $24k to a company which had a launch pending in 4 weeks. Turns out the splash page had been there for months, and after 6 weeks nothing changed. This was a scam.

You have two choices when evaluating an offer:

  1. Get your own lawyer to review the documents. It is worth every penny (provided you set limits and expectations up front). I’ve found when negotiating, saying “my lawyer is asking for X” is much more credible and likely to be accepted than simply saying “I would like X”.
  2. Sign the paperwork blindly and risk that years later you’ll suddenly get screwed out of something you love and poured countless hours into.

Avoid complex deals:

I used to think creativity was useful to apply everywhere, including legal agreements.

What I have learned is when it comes to business arrangements, keep it simple and stay with established patterns. If the agreement needs to be out of the ordinary then there is probably something else majorly wrong with the situation.

Use an LLC or incorporate:

I’ve casually used the term “partner” in this post, but you never want to be in a “legal partnership” which means joint liability in the eyes of the law. In a “legal partnership”, if your partner wrecks the Tesla, say smashing it into a liquor store, you are liable for the damages too.

Always do business through an LLC (limited liability company) or other corporate entity that provides personal protection.  LLCs are easy to create online at the Secretary of State’s office for your state without a lawyer.

Then when a project goes sour, you are not a “partner” who is jointly liable for the debts.

Use each LLC once and never again (like Kleenex, use and throw away).  This is because the problems of any past venture will continue to follow the LLC.

This basic step will easily and cheaply insulate you from most of the problems of a failed venture.  Doing business through an LLC means everything you sign should be as “joe developer LLC” and all paychecks, etc. should read the same.

 


I hope you enjoyed this post. If you have a horror story related to startup equity that you’d like to share please contact me or leave a comment below.

Posted in Business | Tagged , | Comments Off on Working for Equity as a Software Developer

Django Tricks for Processing and Storing JSON

In this post I’ll show a few tricks I use to make JSON fit into Django more seamlessly. The first is a lesson on coding that everybody should know.

Parsing external JSON:

Whenever you take in JSON from “strange computers” (which is basically any computer) it works most of the time.

As C-3PO said: R2-D2, you know better than to trust a strange computer.

The following example code does two things that are super important when it comes to processing JSON in python.

  1. Handles the exception in case the JSON is not valid. This happens all the time with unescaped characters, server errors, etc. In the code below raw_data is a string that allegedly contains valid JSON. Not if, but WHEN raw_data is some random invisible UTF-8 character, you want to recover from it, and you want to log what you know about the problem.
  2. The parser keeps the keys in order. Without the object_pairs_hook=OrderedDict argument, json_data’s internal dictionary keys will be in whatever order your Python interpreter felt like that day. I’ve found for some types of JSON data order does matter and most systems that claim to emit ‘ordered JSON keys’ don’t realize that isn’t how JSON works.
import json
import logging
from collections import OrderedDict

logger = logging.getLogger(__name__)
try:
    json_data = json.loads(raw_data, object_pairs_hook=OrderedDict)
except json.JSONDecodeError:
    logger.exception('Error when parsing JSON, raw data was ' + str(raw_data))
    raise ExternalAPIException('Unable to do my work! Invalid JSON data returned.')
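
To try the pattern outside of a project, here is a minimal self-contained sketch. The names parse_external_json and ExternalAPIException are hypothetical stand-ins for whatever your app defines, and catching TypeError as well is my assumption, to cover the case where raw_data is None rather than a string.

```python
import json
import logging
from collections import OrderedDict

logger = logging.getLogger(__name__)


class ExternalAPIException(Exception):
    """Hypothetical app-level error for bad responses from an external API."""


def parse_external_json(raw_data):
    """Parse untrusted JSON text, keeping object keys in document order."""
    try:
        return json.loads(raw_data, object_pairs_hook=OrderedDict)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError;
        # TypeError covers non-string input such as None
        logger.exception('Error when parsing JSON, raw data was %r', raw_data)
        raise ExternalAPIException('Invalid JSON data returned.')


good = parse_external_json('{"b": 1, "a": 2}')
print(list(good.keys()))  # ['b', 'a']
```

Callers then only need to handle one exception type, and the log entry carries the raw payload for debugging.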

Now that you’ve seen a basic example of processing JSON, the rest of this post will be about storing JSON and integrating it nicely into the Django admin.

Some background on Django + JSON:

If you are using Postgres, life is easy with JSON and Django because way back in Django 1.9 (December 2015) the Postgres-only JSONField (django.contrib.postgres.fields.JSONField) came out.  Prior to that you’d be using a TEXT field to store JSON.

MySQL introduced a JSON type in version 5.7 but Django doesn’t support it out of the box as of 2.1.  Django requires you to implement JSON storage in a TextField (MySQL type longtext) and leave it at that.  I’m ‘guilty’ of using MySQL TEXT to store JSON on a few older projects and it works fine for me.

With the MySQL native JSON field you get additional JSON validation at the database level (which you should be doing at the API level already), optimized storage (yes please!), and ability to use non-standard SQL to index and query values inside the JSON. The last part is interesting but also makes me cringe a little because it is NoSQL wrapped inside an SQL database… Too much rope to get tangled up in.

For MySQL users there is another way…. if you really want to use the native JSON type in MySQL the Django-Mysql package is available.

Consider the following contrived model (models.py):

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models
from django_mysql.models import JSONField


class BookExample(models.Model):
    id = models.BigAutoField(primary_key=True, editable=False)
    name = models.CharField(max_length=100)
    detail_text = models.TextField()
    detail_json = JSONField()  # requires Django-Mysql package

    class Meta:
        managed = True
        db_table = 'book_example'
        verbose_name = 'Book Example'
        verbose_name_plural = 'Book Examples'

For illustrative purposes it has a TEXT based column and a JSON based column.

 

Running makemigrations + migrate will generate the following MySQL table:

CREATE TABLE `book_example` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL,
  `detail_text` longtext NOT NULL,
  `detail_json` json NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

 

Django admin wiring (admin.py):

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.contrib import admin

from example.models import BookExample


class BookExampleAdmin(admin.ModelAdmin):
    list_display = ('name',)


admin.site.register(BookExample, BookExampleAdmin)

 

Now let’s fill in some sufficiently complex (contrived) JSON:

{
  "title": "Anathem",
  "authors": ["Neal Stephenson"],
  "publication_year": 2008,
  "description_sort": "Anathem is a science fiction novel by Neal Stephenson, published in 2008. Major themes include the many-worlds interpretation of quantum mechanics...",
  "chapters": [{
      "title": "Extramuros",
      "number": 1,
      "summary": "The story takes place on Arbre, a planet similar to Earth...",
      "page_count": 23
    },
    {
      "title": "Cloister",
      "number": 2,
      "summary": "Erasmas describes several buildings of the Concent, namely the Scriptiorium...",
      "page_count": 14
    },
    {
      "title": "Aut",
      "number": 3,
      "summary": "Erasmas describes the Mynster, which is a building housing the Concent's clock...",
      "page_count": 34
    }
  ],
  "language": "English",
  "page_count": 937
}

 

Here is how it would look in the admin using default settings:
[Screenshot: Django admin with JSON default]

Both fields render as <textarea> fields (which are fully editable) and it is really hard to read the contents.  The first field is a plain textarea and it will accept any data. The second field has JSON validation wired to it so the form won’t go through unless the JSON is valid.

Let’s make the JSON look better in the admin and make it read only:

When I’m storing JSON in a database it is typically either:

  • 3rd party data associated with the record that needs to be kept but is rarely used.
  • A document format used by a front end tool or mobile app (designer tool, dashboard layout, or mobile app data sync).

It normally wouldn’t make sense to edit the raw JSON data in the admin. But it would make sense to edit other properties on the row and at the same time see a nicely formatted version of the JSON next to the other fields.

Here is how to make the JSON read only in the Django admin and format nicely.

First, in the model, wire up a function to output the formatted JSON. This relies on the Python Pygments package.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models
from django.utils.safestring import mark_safe
from django_mysql.models import JSONField

# new imports!
import json
from pygments import highlight
from pygments.formatters.html import HtmlFormatter
from pygments.lexers.data import JsonLexer


class BookExample(models.Model):
    id = models.BigAutoField(primary_key=True, editable=False)
    name = models.CharField(max_length=100)
    detail_text = models.TextField()
    detail_json = JSONField()  # requires Django-Mysql package

    def detail_json_formatted(self):

        # dump the json with indentation set

        # example for detail_text TextField
        # json_obj = json.loads(self.detail_text)
        # data = json.dumps(json_obj, indent=2)

        # with JSON field, no need to do .loads
        data = json.dumps(self.detail_json, indent=2)

        # format it with pygments and highlight it
        formatter = HtmlFormatter(style='colorful')
        response = highlight(data, JsonLexer(), formatter)

        # include the style sheet
        style = "<style>" + formatter.get_style_defs() + "</style><br/>"

        return mark_safe(style + response)

    detail_json_formatted.short_description = 'Details Formatted'

    class Meta:
        managed = True
        db_table = 'book_example'
        verbose_name = 'Book Example'
        verbose_name_plural = 'Book Examples'

And in the admin file:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.contrib import admin

from example.models import BookExample


class BookExampleAdmin(admin.ModelAdmin):
    fieldsets = [
        (None, {'fields': ['name']}),
        ('JSON', {'fields': ['detail_json_formatted']}),
    ]
    list_display = ('name',)
    readonly_fields = ('detail_json', 'detail_json_formatted')


admin.site.register(BookExample, BookExampleAdmin)

Now the admin has a nice easy to read formatted representation of the JSON complete with inline styles:

[Screenshot: Django admin JSON pretty print]

 


Posted in Application Development | Tagged , , | Comments Off on Django Tricks for Processing and Storing JSON

My Answer To: How To Stay Current in Software?

I had a developer write to me a few weeks ago asking how to stay current. With their permission I’ve published their question and my answer with personal details omitted.

Question from a new developer:

How do I stay current? I am a new developer at a small software company. In my current position, I am a support developer, mainly database scripting. I worry that my skills from college involving object oriented coding are falling behind. What is a good way to keep practiced? I have a drive to be very good at all types of projects, yet I know very little of the full stack experience.

My Reply:

Side projects are a great way to “practice” coding. Pick an interest of yours and go for it. Don’t let your employer be the only source of skills you gain. After a while your work tasks should get easier. You’ll end up with free mental cycles to read articles and view videos during your work day (or on your lunch break) that fit into your side projects.

If you work for a place that discourages side projects or personal learning it is time to get another job.  Do yourself a favor by being respectful and tactful when making your move.

Their reply:

Thank you. I figured as much. I have picked up side projects that relate to my work, like automation tools for uploading report to a reporting server. As well as an XML generator specifically for our programs report import process. Sounds like I am doing what I can. I appreciate the feedback.

My follow up:

Hmmm…. I was thinking of something less work related and more for the joy of it – like a new language or framework.

They say you are what you eat.  In software it is the same thing – you are what you do.

You won’t get hired for 3D graphics if you’ve never done it before. If you are really good with Crystal Reports and that is what you mess with in your free time, chances are that is all you can get hired for. If you yearn to build 3D games, mess with 3D games in your free time and see where it takes you.

I was always drawn to Python but wasn’t allowed to use it at work for years. I tried to slip it in here and there but when you are working on a mature code base it isn’t possible.  Slowly but surely I tinkered with it in my free time. Then I taught myself Django which meshed really well with my other web experience and the professional work started coming.

Fast forward to current times and I use Python on about half of my professional projects!  I strongly considered Ruby/Rails too. I spent a lot of time experimenting with RoR but it never went anywhere in terms of gigs. It wasn’t time wasted because my overall mental context of coding and web frameworks was expanded.

In terms of the future in 2018… “Go” is an upcoming language that looks interesting. There’s always mobile apps and Node/JavaScript/React/Angular to look at.

Make your side projects feel fun, you know, like you want to code all weekend 🙂

Posted in For New Developers, Work | Tagged , | Comments Off on My Answer To: How To Stay Current in Software?

Windows 10 Updates Failing with MEMORY_MANAGEMENT due to DiskCryptor

Having problems updating to Windows 10 1703, 1709 or 1803? If you have DiskCryptor installed the updates will fail with the unhelpful message `MEMORY_MANAGEMENT`. Then you’ll be trapped in a kind of Windows Update purgatory:

  • Windows downloads update (30 minutes – 1 hour)
  • Windows installs the update (30 minutes)
  • Windows reboots, chugs for a long time, then the update fails (30 minutes)
  • Windows reboots and rolls back the update (30 minutes)
  • Windows reboots… then goes back to step 1 and starts to download the update again!!! Ahhh!!!

On my Windows box which I use for internal QA this was taking about 2.5 hours round trip. It only affords about 30 minutes of usable time on the desktop. I tried various fixes and hacks to get it to go through. Nothing worked until I stumbled across this and this that mentioned DiskCryptor was the problem.

As it turns out DiskCryptor hasn’t been updated in almost 4 years so it is probably time to find an alternative.

Here is how I was able to get around the MEMORY_MANAGEMENT error blue screen while installing Windows 10 update 1703, 1709 and 1803:

  1. FIRST – TAKE A BACKUP OF THE CONTENTS OF YOUR ENCRYPTED DRIVE!!! I ran into problems getting DiskCryptor to re-install which was more than a little scary but I did get it to work… If you follow these instructions, do so at your own risk. Make sure you have a backup before starting. These steps worked for me given the unique set of programs & hardware I have but may not work the same for you. You could get locked out of your encrypted disk if it fails to reinstall correctly. Note – I’m not encrypting my boot partition, nor my entire drive, I only encrypt a small portion of my drive which I mount when I need it.
  2. Note the current version of DiskCryptor, something like 1.1.846.118.
  3. Download that version of DiskCryptor here so you are ready to reinstall it later.
  4. Uninstall DiskCryptor (Control Panel -> Programs and Features -> Uninstall a Program…) – this is the point of no return – READ STEP 1!
  5. Stop the Windows Update service (Control Panel -> Administrative Tools -> Services): scroll down to Windows Update, right click on it, then click Stop. This can take several minutes. You may need to reboot first and try this, or set it to disabled, then reboot, but it needs to be off for the next step. If you do disable it make sure to re-enable it after everything is done.
  6. With the Windows Update service stopped, delete everything in C:\Windows\SoftwareDistribution, which is where Windows downloads all the update files before it installs them. These downloads can get corrupted. I had 61K files in that folder and it took several minutes to delete them. So at this point, maybe take a breather, stretch, eat a cookie – do something to help relieve all the stress Windows is putting you through at the moment.
  7. Reboot and run the Windows Update Troubleshooter (which you can download here).
  8. Then run the Windows Update Assistant (which is also a small download found here) and let it do the update.
  9. After a few reboots and tense moments – the update finally went through for me!!!
  10. Next re-install DiskCryptor, but right click on the installer and select Run as Administrator. It may come up with an error like “error occurred when installing driver”…
  11. Reboot.
  12. Now run DiskCryptor like you would normally, but it may ask you to reboot (WTF????- PANIC!!!)…. select reboot.
  13. When the desktop comes back up – uninstall DiskCryptor again. Do not start it again, as it will just ask you to reboot.
  14. Now install it as Admin again, reboot… cross fingers.
  15. When the desktop comes back up, run DiskCryptor, and for me it worked….!

Steps 12 – 15 seem kind of crazy, but I went through this process on two machines with DiskCryptor and it worked both times.

Given this fiasco I’ll be looking for an alternative to DiskCryptor – here is a list of disk encryption software, many look promising. On Mac OS and Ubuntu encryption is built in, all you need to do is enable it. Unfortunately on Windows you are on your own.

In the meantime, if you are interested in protecting your online identity, check out this post on some of the basics.

Posted in Sys Admin | Tagged , | 1 Comment

Django – Correctly Wiring to AWS CloudFront for Static and Media Files

When it comes to “How To Setup a CDN for Django” most people suggest the following “half way” setup that can lead to stale cache problems and slowness in the admin.

  1. Configure Django to upload media and static files to S3. That is done by installing django-storages and boto, then setting DEFAULT_FILE_STORAGE to the S3 storage backend and pointing it at your bucket.
  2. Configure your AWS credentials.
  3. Setup your CloudFront distribution to pull from your S3 bucket.
  4. Set Django’s MEDIA_URL and STATIC_URL settings to point to the public URL of the CloudFront distribution.

For simple sites this can be an okay way to go. Wiring it up takes just a few minutes and you get automatic backups of all uploaded media. This approach is required in some hosting environments, such as Heroku, where you don’t get a permanent local disk to store files on.

I’m not sold on it though…

Using S3 as the Media / Static Root Can Be a Bad Idea:

1) CDNs cache everything aggressively AND pass expires headers to clients that tell them to cache aggressively too.

Unless you are actively managing the state of the CDN cache (which takes forethought with CloudFront) if the content of a file changes some of your users are guaranteed to get the stale version and see a messed up version of your site.

This will happen pretty much every time you push new code.

There is a mechanism in CloudFront to invalidate files, but if you do over 1000 operations per month it starts costing you, and isn’t instantaneous.

For the kind of high profile Django sites I work on, it would be detrimental if the user got to a page that was a franken-build mix of new and old assets because of stale cache.


2) Write performance to S3 is slow.

Many of the sites I’ve worked on use VersatileImageField to generate a handful of cropped / resized versions of images after they get uploaded through the admin (large, medium, thumbnail). When S3 is the default file storage the admin runs slow because of the intensive IO work needed to generate the images and push them to S3. With VersatileImageField, every time you save the model, it revalidates all those files, which causes it to scan the S3 bucket (really slow).

So if you are doing a lot of interesting things with your media files, your site will bog down while Django talks back and forth to S3.

 

How I Setup CloudFront with Django:

1) Don’t use S3 to store files…

Instead configure MEDIA_ROOT and STATIC_ROOT to be on your server like you would normally.

There’s a twist for STATIC_ROOT, see below.

2) Setup your server as normal, like you would without CloudFront, so it serves /media/ and /static/ on its own, but do set some extra headers.

Configure your server so /media/ and /static/ are being served as normal from your domain. It is helpful to set the CORS header Header set Access-Control-Allow-Origin "*" so things like custom fonts work correctly.

Example Apache 2.4 configuration for serving media and static files, with CORS header:

# STATIC DIRECTORY
Alias /static /var/mysite/static
<Directory "/var/mysite/static/">
   Options +FollowSymLinks -Indexes -MultiViews
   Require all granted
   # add the CORS header so the AWS CloudFront distribution can serve all files, including fonts
   Header set Access-Control-Allow-Origin "*"
</Directory>

# MEDIA DIRECTORY
Alias /media /var/mysite/media
<Directory "/var/mysite/media/">
   Options +FollowSymLinks -Indexes -MultiViews
   Require all granted
   # add the CORS header so the AWS CloudFront distribution can serve all files, including fonts
   Header set Access-Control-Allow-Origin "*"
</Directory>

Note: this is only an option if your hosting configuration has access to a local partition, not always the case (Heroku for example).

3) Setup two CloudFront distributions using your server as the origin.

Create one distribution for the media files and a second distribution for the static files. This improves download performance during page load.

4) Add a variable into the static path / url that busts the cache for each deploy.

AWS calls this strategy Using Versioned Object Names.

The trick is to configure Django’s STATIC_ROOT and STATIC_URL so they include a configuration variable that gets updated every deploy.

STATIC_ROOT = '/var/mysite/static/' + BUILD_VERSION + '/'
STATIC_URL = 'https://{static-distribution-id}.cloudfront.net/' + BUILD_VERSION + '/'

When the deploy happens and collectstatic is executed all the files go into a new folder just for that version of the app. Those files are totally new to the CDN and all your users. When your users hit your site, they are guaranteed to get the new version of all static files.

Note, for media files I’m not setting up a dynamic path. This is okay for most use cases. If a user uploads a file or image with the same name Django will take care of giving it a unique name.

Example settings.py file:

BUILD_VERSION = 'mysite-1.0'

#############################################
# MEDIA & STATIC FILES
#############################################

# store media files locally on the server
MEDIA_ROOT = '/var/mysite/media/'

# Serve media files from the CloudFront distribution
MEDIA_URL = 'https://{media-distribution-id}.cloudfront.net/'

# When collectstatic runs, it will copy the files to a path ending in BUILD_VERSION
# this way each deploy gets a fresh URL folder, no need to worry about invalidating stale caches in the CDN
STATIC_ROOT = '/var/mysite/static/' + BUILD_VERSION + '/'

# serve static files from the CloudFront distribution
STATIC_URL = 'https://{static-distribution-id}.cloudfront.net/' + BUILD_VERSION + '/'

This requires updating BUILD_VERSION before each deploy. I like having that as a pre-deploy step.  I update that value at the same time I minify the CSS/JavaScript files, then commit it all. Makes a nice marker in the commit history along with a new version tag.

For systems with automated builds updating that variable can easily be part of the overall build / test / deploy process.
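
One way that could look, sketched under assumptions: the build system exports an environment variable (called BUILD_VERSION here, an assumed name) and settings.py reads it, falling back to a hard-coded value for local work. The helper static_paths and the host example-id.cloudfront.net are placeholders, not part of the setup above.

```python
import os


def static_paths(build_version, cdn_host, static_base='/var/mysite/static/'):
    """Build (STATIC_ROOT, STATIC_URL) for one deploy's version string."""
    version = build_version.strip('/')
    if not version:
        raise ValueError('BUILD_VERSION must not be empty')
    static_root = static_base.rstrip('/') + '/' + version + '/'
    static_url = 'https://' + cdn_host + '/' + version + '/'
    return static_root, static_url


# in settings.py: the build job would export BUILD_VERSION before collectstatic runs
BUILD_VERSION = os.environ.get('BUILD_VERSION') or 'mysite-1.0'
STATIC_ROOT, STATIC_URL = static_paths(BUILD_VERSION, 'example-id.cloudfront.net')
```

Deriving both settings from one value keeps the folder on disk and the URL on the CDN from drifting apart.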

How It Works:

Initially, your site’s CSS file would live at:
https://mysite.com/static/mysite-1.0/css/mysite.min.css

But it would be served through the CDN at:
https://{static-distribution-id}.cloudfront.net/mysite-1.0/css/mysite.min.css

As part of the next deploy, you’d change BUILD_VERSION to mysite-1.1. After the deploy the next page request to your site would generate a request for the CSS file at:
https://{static-distribution-id}.cloudfront.net/mysite-1.1/css/mysite.min.css

The CDN would pull it off your server as a totally fresh file and cache it across the entire network.

Why I like it:

I opt for a simple solution that works well all of the time.

Other solutions to busting the cache do things like adding query string parameters to CSS / JS files. For example:
https://mysite.com/static/css/mysite.min.css?version=mysite-1.0.1

In practice, that works okay but not 100% of the time because some proxy servers ignore the query string for the purposes of the cache! This is true even if you are forwarding query strings with CloudFront. That is the kind of bug you only run into once in a blue moon and it is impossible to replicate.

One minor drawback is that each time the site is deployed, collectstatic copies all the files to a new folder. At some point someone or something will need to clean up the old folders, but only when you know it is safe to do so. Storage is cheap and the static directory is usually pretty small so I don’t see this as a huge problem.
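
The cleanup could be a small script along these lines. This is a sketch, not something from my deploy process: prune_old_builds is a hypothetical helper, and it assumes every sub-folder of the static base directory is a build folder. Deciding when it is safe to delete (no cached pages still pointing at the old version) is still up to you.

```python
import shutil
from pathlib import Path


def prune_old_builds(static_base, keep=3, current=None):
    """Delete all but the `keep` most recently modified build folders.

    The folder named `current` (the live BUILD_VERSION) is never deleted.
    Returns the names of the removed folders.
    """
    folders = [p for p in Path(static_base).iterdir() if p.is_dir()]
    # newest first, by modification time
    folders.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    removed = []
    for folder in folders[keep:]:
        if folder.name == current:
            continue
        shutil.rmtree(folder)
        removed.append(folder.name)
    return removed
```

Usage would be something like prune_old_builds('/var/mysite/static/', keep=3, current=BUILD_VERSION) as a post-deploy step.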

Where not to use this idea:

If you have load balancers with multiple servers this solution may not be for you.

If your media files are only changed through the admin then it may work, but you still have to make sure the distribution’s origin points to a location on your side that will always see the newly uploaded files. That can defeat the purpose of load balancing…

If you have lots of users uploading media content across all your servers it would be better to send the new uploads directly into S3 (or save locally at first then push to S3 in the background). In that case it would be important to avoid file name collisions – where user A uploads cat.gif and user B uploads another cat.gif in the same instant, and user A’s cat.gif gets overwritten.  In theory Django will automatically check for file name collisions when uploading new media, but S3 has such high latency I’d take extra precautions.
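
One possible extra precaution, sketched here as an assumption rather than something I run in production, is to stop relying on the uploaded file name at all and generate the stored name from a UUID. Django accepts a callable for upload_to that receives the model instance and the original filename; the function name unique_upload_path and the uploads/ prefix are placeholders.

```python
import os
import uuid


def unique_upload_path(instance, filename):
    """upload_to callable: keep the extension, replace the name with a UUID."""
    ext = os.path.splitext(filename)[1].lower()
    return 'uploads/' + uuid.uuid4().hex + ext


# hypothetical usage in a model:
#   image = models.ImageField(upload_to=unique_upload_path)
print(unique_upload_path(None, 'cat.gif'))
```

With this in place two users uploading cat.gif in the same instant end up with different keys, so neither upload can overwrite the other.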

Another approach would be to setup an NFS share that all your load balanced servers tie into for media storage. Then you could continue with the setup above. But setting up an NFS adds complexity and another potential point of failure.

Thank you Alberto for bringing up this scenario.

Closing thoughts…

No matter what it is absolutely essential to setup a CDN. It is dirt cheap, it will make the site load very fast for all your users no matter where they are, and doing so is based on best practices. Just don’t cause bugs you will never see or be able to replicate due to a stale cache…


 

Posted in Application Development | Tagged , , | Comments Off on Django – Correctly Wiring to AWS CloudFront for Static and Media Files