Learn SQL

Dear new developer,

It’s a good idea to learn SQL (which stands for structured query language). This is the language used to query the vast majority of the data that most companies store. The reason for this is that relational databases (for which SQL is the main interface) are very good at a wide variety of data storage. Sure, at the edges of speed, scale and functionality there are other solutions, but you should reach for them when the relational database falls short, not at first.

You don’t need to be an expert at SQL, though it’s a mindbending way to interact with data, so you might want to put studying it on your list. Instead of being procedural or functional, SQL is set based. I confess, I’ve been using it for decades and still haven’t mastered it.

If you are using a modern language, there are often frameworks that sit between you and the database (for example, ActiveRecord for Rails, Hibernate for Java, SQLAlchemy for Python). These are helpful because they make simple operations simpler. If you want to look something up via primary key or a simple query, these tools can help. But if things get harder (joining across multiple tables, database specific functions) the abstraction breaks down. This is where knowing some SQL can be helpful.

There are also times when you are running queries that are punishing using a framework. For example, if you wanted to sum across a set of orders in a day to get a daily total, a naive framework would have to load all the data for the orders and then sum up the order value in memory. A more sophisticated framework would be able to generate SQL summing up the values in the database for you. Unfortunately, it’s hard to know whether the framework you are using is naive or sophisticated. But dropping into SQL will always work.
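To make the daily total example concrete, here is a minimal sketch using the sqlite3 command line shell, assuming you have it installed. The orders table and its columns (ordered_on, amount_cents) are hypothetical names I made up for illustration, so your schema will look different. The point is that SUM and GROUP BY run inside the database, and only one row per day comes back.

```shell
# Hypothetical schema and data, for illustration only.
daily_totals=$(sqlite3 :memory: <<'SQL'
CREATE TABLE orders (id INTEGER PRIMARY KEY, ordered_on TEXT, amount_cents INTEGER);
INSERT INTO orders (ordered_on, amount_cents) VALUES
  ('2019-07-26', 1500),
  ('2019-07-26', 2500),
  ('2019-07-27', 1000);
-- The database does the summing; only one row per day is returned.
SELECT ordered_on, SUM(amount_cents) AS total_cents
FROM orders
GROUP BY ordered_on
ORDER BY ordered_on;
SQL
)
echo "$daily_totals"
```

With the default sqlite3 output settings, each line is a day and its total separated by a pipe character. The naive alternative would be to select every order row and add up the amounts in your application code.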

I have also found that some systems have a lot of non intuitive operations, but that at the end of the day, the magic is built on code and data storage. By looking at the data storage, you can understand some of the operations that these frameworks take care of for you. For instance, for a long time, Rails migrations were magical to me. When I took a look at the database, it became clear that a fundamental piece of Rails migrations was the datetime portion of the migration name stored in the database. When I got into a weird state because of running migrations, then switching branches, then re-running migrations, this understanding of the data structure behind them helped me out.

Some good resources to learn SQL:

One final note. People have very strong opinions on the type of SQL database they use (a commercial offering like SQL Server or Oracle, or an open source solution like MySQL or PostgreSQL). As a new developer, you want to learn whatever your company is using. Honestly, the difference between them at the basic SQL level just isn’t that large. They start to differ in more advanced SQL functions and other performance and administration concerns. But that’ll matter later in your career.

Sincerely,

Dan

Use copy/paste as much as you can

Dear new developer,

Use copy and paste as much as you can. Not so much for code snippets from Stackoverflow, though that will save you some time hunting down mismatched parentheses.

But this is especially useful whenever you are searching for errors or moving information between systems.

For example, recently I had to find where java was installed and set the JAVA_HOME environment variable. I could have typed it in, but that is error prone. More than once I’ve transposed or omitted letters. Whenever I’m doing this kind of adhoc software configuration, I use copy/paste.

I also use this a lot when preparing to file an issue or ticket. First, I’ll just copy and paste the error message (removing the particulars of my system like file paths) into Google to see what pops up.

Then, if it is still an issue and I don’t see any answers, I’ll pop open a text editor and cut and paste all my notes about the issue, what I’ve tried (which often leads to more avenues to explore, rubber ducking in action), logfile lines. All this would be hellish to type, so I copy and paste away.

I hear there are systems out there with more than one clipboard spot, but I’ve always gotten by with the system clipboard. I did recently discover the magic of pbpaste and pbcopy on the mac, which let you copy file contents to the system clipboard (and it looks like there are analogs for other systems).

Anyway, make sure you know and love your system clipboard. When you are trying to copy complicated text, you’ll thank me.

Sincerely,

Dan

Learn a little jq, awk and sed

Dear new developer,

You are probably going to be dealing with text files sometime during your development career. These could be plain text, csv, or json. They may have data you want to get out, or log files you want to examine. You may be transforming from one format to another.

Now, if this is a regular occurrence, you may want to build a script or a program around this problem (or use a third party service which aggregates everything together). But sometimes these files are one offs. Or you use them once in a blue moon. And it can take a little while to write a script, look at the libraries, and put it all together.

Another alternative is to learn some of the unix tools available on the command line. Here are three that I consider “table stakes”.

awk

This is a multi purpose line processing utility. I often want to grab lines of a log file and figure out what is going on. Here are a few lines of a log file:

54.147.20.92 - - [26/Jul/2019:20:21:04 -0600] "GET /wordpress HTTP/1.1" 301 241 "-" "Slackbot 1.0 (+https://api.slack.com/robots)"
185.24.234.106 - - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/02 HTTP/1.1" 200 87872 "http://www.mooreds.com" "DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)"
185.24.234.106 - - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/08 HTTP/1.1" 200 81183 "http://www.mooreds.com" "DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)"

If I want to see only the ip addresses (assuming these are all in a file called logs.txt), I’d run something like:

$ awk '{print $1}' logs.txt
54.147.20.92
185.24.234.106
185.24.234.106

There’s lots more, but you can see that you’d be able to slice and dice delimited data pretty easily. Here’s a great article which dives in further.
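Here is a slightly bigger sketch of that slicing and dicing, using the same three log lines. It assumes the fields are separated by whitespace, which puts the requested path in $7 and the status code in $9 for this log format; check your own logs before trusting those positions. The scratch directory and variable names are just for illustration.

```shell
# Recreate the three sample log lines in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/logs.txt" <<'EOF'
54.147.20.92 - - [26/Jul/2019:20:21:04 -0600] "GET /wordpress HTTP/1.1" 301 241 "-" "Slackbot 1.0 (+https://api.slack.com/robots)"
185.24.234.106 - - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/02 HTTP/1.1" 200 87872 "http://www.mooreds.com" "DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)"
185.24.234.106 - - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/08 HTTP/1.1" 200 81183 "http://www.mooreds.com" "DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)"
EOF

# Count requests per ip address, busiest first.
# count is an awk associative array keyed by the first field.
requests_per_ip=$(awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' "$workdir/logs.txt" | sort -rn)
echo "$requests_per_ip"

# Print the requested path ($7) only for lines whose status code ($9) is 200.
ok_paths=$(awk '$9 == 200 {print $7}' "$workdir/logs.txt")
echo "$ok_paths"
```

The first command prints a count next to each ip address, and the second prints only the two archive paths, since the /wordpress request was a 301 redirect.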

sed

This is another line utility. You can use it for all kinds of things, but I primarily use it to do search and replace on a file. Suppose you had the same log file, but you wanted to anonymize it by stripping out the ip address and the user agent. Perhaps you’re going to ship the logs off for long term storage or something. You can easily remove both with a couple of sed commands.

$ sed 's/^[^ ]* //' logs.txt | sed 's/ "[^"]*"$//'
- - [26/Jul/2019:20:21:04 -0600] "GET /wordpress HTTP/1.1" 301 241 "-"
- - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/02 HTTP/1.1" 200 87872 "http://www.mooreds.com"
- - [26/Jul/2019:20:20:50 -0600] "GET /wordpress/archives/date/2004/08 HTTP/1.1" 200 81183 "http://www.mooreds.com"

Yes, it looks like line noise, but this is the power of regular expressions. They’re in every language (though with slight variations) and worth learning. sed gives you the power of regular expressions at the command line for processing files. I haven’t found a great sed tutorial, but googling turns up a number of them.
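If you would rather mask the ip address than delete it, here is a small sketch of the same search and replace idea. It assumes every line starts with an IPv4 address, which is true of the sample log but may not be true of yours. The x.x.x.x placeholder is just something I picked.

```shell
# One of the sample log lines, piped in so the example stands alone.
# -E turns on extended regular expressions, so + means "one or more".
# \. matches a literal dot rather than "any character".
masked=$(printf '%s\n' '54.147.20.92 - - [26/Jul/2019:20:21:04 -0600] "GET /wordpress HTTP/1.1" 301 241 "-" "Slackbot 1.0 (+https://api.slack.com/robots)"' |
  sed -E 's/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/x.x.x.x/')
echo "$masked"
```

The rest of the line comes through untouched, because the pattern is anchored to the start of the line with ^.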

jq

If you work on the command line with modern software at all, you have encountered json. It’s used for configuration files and data transmission. Sometimes you get an array of json and you just want to pick out certain attributes of it. Tools like sed and awk fail at this, because they are used to newlines separating records, not curly braces and commas. Sure, you could use regular expressions to parse simple json, and there are times when I’ve done this. But a far better tool is jq. I’m not as savvy with this as with the others, but have used it whenever I’m dealing with an API that delivers json (which is most modern ones). I can pull the API down with curl (another great tool) and parse it out with jq. I can put these all in a script and have the exploration be repeatable.

I did this a few months ago when I was doing some exploration of an Elasticsearch system. I crafted the queries with curl and then used jq to parse out the results so that I could make some sense of them. Yes, I could have done this with a real programming language, but it would have taken longer. I could also have used a gui tool like postman, but then it would not have been replicable.
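Here is a minimal sketch of picking attributes out of an array of json, assuming jq is installed. The json below is a made up stand in for an API response (the id, title and state attributes are hypothetical); in real life it would come from curl rather than a shell variable.

```shell
# Hypothetical API response: an array of json objects.
response='[{"id": 1, "title": "Fix login", "state": "open"},
           {"id": 2, "title": "Update docs", "state": "closed"},
           {"id": 3, "title": "Add search", "state": "open"}]'

# .[] walks each element of the array; -r prints raw strings without quotes.
all_titles=$(printf '%s' "$response" | jq -r '.[].title')
echo "$all_titles"

# select() keeps only the objects that match a condition.
open_titles=$(printf '%s' "$response" | jq -r '.[] | select(.state == "open") | .title')
echo "$open_titles"
```

The first command prints all three titles, one per line, and the second prints only the two with a state of open. Swapping the printf for a curl call gives you the repeatable exploration script described above.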

sed and awk should be on every system you run across; jq is non standard, but easy to install. It’s worth spending some time getting to know these tools. So next time you are processing a text file and need to extract just a bit of it, reach for sed and awk. Next time you get a hairy json file and you are peering at it, look at jq. I think you’ll be happy with the result.

Sincerely,

Dan

Trade Money For Time

Dear new developer,

Don’t be penny wise, pound foolish. Your time is worth a lot, and it’s worthwhile to spend some money to accelerate toward your goals. I heard a client say once that their time was essentially free. I understood the sentiment, but the reality is that if you can be working on tasks that are higher value, it makes sense to spend the money.

Here are some ways to spend money that may not strike you as “worth it”:

  • Buying a book or video course instead of just reading the free docs. I remember one time for a consulting gig I needed to integrate with Stripe. I found a $30 technical ebook that I could read a few pages of and do the exact integration I needed (take money from a ruby on rails web application). The alternative would have been an hour or two reading the docs and figuring out how to do the same thing.
  • Buying and using tools. I use vim, but I know others who swear by IDEs like the ones from JetBrains. Buying and learning these tools has saved them hours and hours of development and debugging time.
  • Opening a support ticket with a service provider. When I run into a strange situation, if I’m paying someone money, I open a support ticket. I had a colleague who had an issue with images getting corrupted across a number of places in an application. He spent a lot of time looking at our code, but eventually the issue turned out to be caused by the service provider.
  • Paying for commercial software. The alternative is stringing together open source solutions. Now, open source is great and can often be a good value. But there are times when it just makes sense to pay for a solution. I often use the criteria “is this core to the business” or “what would happen if this paid service went away” to ward myself away from the idea that my time is free.
  • Paying for consulting or training. Sometimes a day with a consultant (even if it is expensive) can save you weeks or months. You gain the benefit of their mistakes and experience.

Now not all of these will apply to you, new developer. You may have no budget to spend at your place of work. But you can still apply this heuristic to your own choices. Get that subscription to Udemy or Safari. Buy that book. Explore that tool and see if you can recommend it.

Realize that your time is precious and you can leverage it through spending some money on tools.

Sincerely,

Dan

Read the documentation

Dear new developer,

Reading the docs is so important. It is so easy, when you are confronted with a task, to just jump in and start doing. It feels right. It feels natural. It feels like progress.

The problem is that it may be motion, but it probably is not progress. You may be spinning in circles rather than moving towards your goals.

So, the solution is to read the docs. Documents are key ways of transmitting knowledge and will let you reduce effort or reuse solutions. There are a couple of different kinds of documents that are worth reading:

Requirements/high level project docs: These are typically written specifically for the project, and will help give you a sense of direction. It will help you find how the work you are doing fits in. Depending on the size and maturity of your organization, you may find these documents in various levels of detail and completion.

If you don’t find any at all, take the time to write one, even if it is just a one page overview that answers “what are we trying to accomplish”. Send this to a senior member of your team (or of the business, if there aren’t senior technical team members) and ask “hey, did I document what we are trying to accomplish here?” If not, revise until everyone is on the same page.

Writing down these requirements can save tons of time, as they can bring new members of the team up to speed as well as bring the team into alignment. If you are working on a project with human interaction, clickable prototypes can also be useful in determining the functioning of what the team is building.

Try to keep these documents up to date, though that is always a struggle. Whenever I start a new project, these types of docs are the first thing I look for, and if they don’t exist, I start writing them. They can take many forms and can include things such as overarching goals and terms (especially if they are not common vernacular).

Even a paragraph in slack that is pinned to the channel is better than nothing, but I typically like to put them in a google doc (if the keeper of the doc is non technical) or a readme in git (if the keeper of the doc is technical). Having these kinds of docs available will keep you from heading down errant pathways that aren’t moving toward the end goal. It reduces your effort.

Platform and library documents: These are the user manuals for the tools you are going to use. Oftentimes they’ll be provided by an outside source (an open source project or a company) and are general in nature. As a new developer, hopefully you’ll have some internal guidance on these tools (even if it is just a conversation on why language X was chosen). But no matter how you arrive at the platform/library/framework, it’s a good idea to learn as much as you can about the tools you are going to be using. I tend to bounce back and forth between experimentation and documentation, but find the learning style that works for you.

A thorough read of the docs will save you time. Recently I was using a snap in CMS for Ruby on Rails, a web framework. I wanted to customize the back end system and jumped immediately into prototyping code. Later I was reading through the docs and saw that there was official support for my customization. I burned a few hours of time figuring out the wrong way to do what I accomplished, then had to spend time doing it the right way.

One of the difficulties of reading these docs is that sometimes you don’t know what you need to know, nor how to look for it. I can think of a few times where I was working in AWS. I scanned the documentation and proceeded to work. Later, running into an issue, I went back and re-read the documentation and, lo and behold, the solution to my issue was right there. I just didn’t know enough to know that I needed that piece of knowledge. There’s no way to avoid such situations. But having scanned the documentation for the tools you are using to solve your problem will let you be aware of any prebuilt snap in solutions, and also may point out extension points that you’ll want to be aware of as you build out your solution. Reusing code and concepts will save you time or money.

However, you do want to be careful not to spend too much time reading docs and thinking about the problems. I’m often confronted with a problem that is newish, whether in a domain that I’m unfamiliar with or combining two or more existing pieces of software in a novel way. Sometimes there’s no way forward but to just start thinking and coding, and documentation is no guide.

But knowing the bounds of the problem and information about the tools you have to solve it will help you determine when you are at such a place, and when and where you’re on well trodden ground.

Sincerely,

Dan Moore

Learn to use a debugger

Dear new developer,

When you are fixing a bug in a program you are working on, a key thing to do is to get an understanding of the state of the system. This can include user input, stored values from a persistent data store, and non recurring information like the current time. But the most important piece of state is that of the program in memory. What function or procedure the

Reproducing a problem with a test or sequence of steps is crucial for being able to solve it. You should take every step you can to make sure that your debugging environment is the same as the environment that the problem is appearing in. I remember one program I was debugging that worked fine in development, but failed miserably in production. It used Google Web Toolkit, which compiled java down to javascript. In development, even when I compiled it, the obfuscated variable names were different. That ended up being the issue–there was a variable name collision between the compiled javascript and another javascript that wasn’t namespaced correctly. I tore my hair out and was reduced to putting in console.log statements on production.

And that’s how a lot of debugging happens–printing out log statements to a file. You can solve many problems that way. It’s extremely portable and customizable, and it gives you some insight into program state.

However, a far better solution is to use a real debugger. They’ve been around since the 80s, at least, and give you far more insight into a program’s state than log statements. You can see the state of any variable. You can run commands interactively. You can stop anywhere, and restart the program. If you pair an interactive debugger with an automated test, you can have an extremely tight feedback loop that will help you zero in on the issue at hand.

Most of the major languages have such interactive debuggers (in fact, that’s one way to decide to avoid a language; a development language without a real debugger is likely to have other language level issues, like a poor dependency management story). Some languages even have standard protocols where you can connect to remote servers with a debugger. If you ever have to debug a production issue and can enable that, it’s going to be super helpful.

Debuggers are often integrated with an IDE, but some are runnable on the command line. Whatever your language, just google for “<language> debugger” and find out more about this valuable resource.

Sincerely,

Dan

The best code is no code

Dear new developer,

It’s paradoxical, but sometimes the best thing you can do is not write code. Remember, the value you provide is to solve the problem you are faced with (the outcome), not to write code. Custom code has value, but comes with costs. It needs to be deployed, maintained and upgraded. It has bugs. It requires a developer to change. It also has opportunity cost. Writing custom code to accomplish task A means that you won’t have time to accomplish task B, which may be either more urgent, more important or both.

There are a couple of ways in which you might solve a business problem without writing a lick of code.

  • Use a library or framework. For instance, I worked at a place where they had written their own database connection pool. Why? I never got a great answer, but it wasn’t clear to me that one of the open source solutions wouldn’t have worked. You need to have an awareness of such libraries to be able to propose this.
  • Use a third party SaaS tool. I’ve seen people run their own in house git repositories. There may be good reasons to do this (including security or privacy concerns). But Github is going to give you a far better experience and unless you have a big team, probably better security and privacy. You need to know what the problem, the solution and the cost are to make an effective suggestion.
  • Deprioritize the work. I was at a meeting with a CEO and we were talking about continuing an effort to integrate a set of outside data sources. I asked why, and we discussed it a bit further. It became clear that the only reason we were thinking about doing it was inertia. There was no real business reason to do it, and we prioritized other work instead. A clear roadmap and the willingness to question requirements are helpful with this path.
  • Do it manually. I was working on a startup and we had the need to occasionally refund customers. I could have integrated with the payment provider and had this case be handled automatically, but it was much much easier to document the process and handle it manually. Refunds happened rarely enough that there was no value in automating them. Here it is helpful to know how often the problem arises, how long it takes to fix manually, and how often it will arise in the foreseeable future.

Now, sometimes you may not understand the larger context of your work. You may propose a solution that isn’t the right fit, and there’s certainly nothing wrong with writing custom code to solve a problem.

In cases where I’m not sure I have full understanding, I always preface my questions with “I am not sure I have the full picture, but I think we could solve the business problem using solution A or project B, rather than writing custom code.” If you are working directly with the client, they likely won’t care, as long as the problem is solved. If you are on a team, the engineer or project manager running your project should have a good understanding of alternatives and why custom code might be the right solution. Most folks will be happy to share that reasoning with you.

In short, it’s better to keep your eyes on solving the business problem and be aware that custom code isn’t always the right answer.

Dan

PS No, this isn’t an April Fools Day post 🙂