Compiled vs. Interpreted Languages

When discussing programming languages, the topic of compiled vs. interpreted languages often comes up. When explaining this to non-technical folks, I tend to use an analogy.

Imagine you are a businessperson who works for a multinational corporation. For work you must travel to another country where they speak a different language.

When you go, you are expected to communicate with the people there in their native tongue. This generally leaves you with two options. First, you can learn the foreign language ahead of time, thereby being able to communicate at will once you arrive there. Second, you can hire an interpreter to travel with you, who then will translate on-the-fly whenever you wish to communicate.

So what is the difference? Learning a foreign language up front takes time, delaying the trip. But in the end this leads to much quicker communication when traveling, and there is no dependence on another person. An interpreter, on the other hand, requires no up-front preparation time, allowing you to leave immediately. However, if you are at a business party in the foreign country, where you tell the same story a dozen times to different people, the interpreter will have to re-translate the story each time you tell it. (Today you might use Google Translate or similar to try to get by without a human interpreter. This would lower the bar even further, as you do not even need time to find a human interpreter to travel with you. However, the dependence on someone or something else to interpret for you still remains, and it slows down communication.)

This analogy, albeit simple, holds reasonably well for programming languages. Learning the foreign language is akin to using a compiled language. Using an interpreter is, well, like using an interpreted language. So when discussing different programming languages, it is sometimes helpful to think in these terms.

In the early days, most applications were written in compiled languages. You wrote code in some high-level, English-like language such as COBOL, FORTRAN, Pascal, or later C/C++. Then you fed that text into a compiler, which translated your code into the binary machine language of a particular hardware chip, usually packaged for a particular operating system running on that chip. This meant the binary application could only run on that one OS/chip combination. (For example, if you compiled an application to run on macOS on an Intel-based system, that same application binary would not run on MS Windows.)

While writing an application in a compiled language, the development process tends to take longer. You write code in an editor, save it, feed it to a compiler, wait for the compilation to complete, then run the binary to test. Rinse, repeat. (Depending on the language, the compilation might involve several steps, including compiling source code into object files, then linking those object files together into the final binary.)

Compare this to an interpreted language, where you write code, save it, and then run it straight through the interpreter. Generally speaking, the iterative process of development goes quicker with interpreted languages, especially when they offer a REPL (Read-Eval-Print Loop), as languages such as Python do.
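To make the REPL idea a little more concrete, here is a minimal sketch of the read, evaluate, print cycle written in Python itself. The name `tiny_repl` is my own invention for illustration; a real REPL reads from the keyboard and handles far more cases than this.

```python
def tiny_repl(lines):
    """Read each line, evaluate it, and collect what a REPL would print."""
    namespace = {}   # variables persist between lines, like a live session
    printed = []
    for line in lines:                        # "read"
        try:
            # Expressions (such as x * 7) produce a value worth showing.
            value = eval(line, namespace)     # "evaluate"
            if value is not None:
                printed.append(repr(value))   # "print"
        except SyntaxError:
            # Statements (such as x = 6) have no value; just run them.
            exec(line, namespace)
    return printed                            # feeding more lines is the "loop"


session = tiny_repl(["x = 6", "x * 7", "name = 'Ada'", "name.upper()"])
print(session)
```

Each line gets an answer as soon as it is entered, which is why experimenting in a REPL feels so immediate compared with the edit, compile, run cycle.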

But the final compilation of an application is done just once. The resulting binary can then be run any number of times. And compiling a program before using it gives you the opportunity to catch certain common types of errors before the code is ever run. That is, you can “check your work” before running it.
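As a rough illustration of "translate once, run many times" and "check your work first," here is a small sketch using Python's built-in `compile()` and `exec()`. Take it as an analogy only: Python is normally described as interpreted, and `compile()` produces an internal code object rather than a machine-language binary. The names `run` and `check`, and the little order calculation, are made up for the example.

```python
source = "total = price * quantity"

# Translate the text once, up front. Nothing has been executed yet.
program = compile(source, "<order>", "exec")


def run(price, quantity):
    """Run the already-translated program with fresh inputs."""
    scope = {"price": price, "quantity": quantity}
    exec(program, scope)   # no re-translation of the source text here
    return scope["total"]


# The same translated program is reused for every run.
results = [run(3, 4), run(10, 2), run(5, 5)]


def check(text):
    """Return an error message if the text cannot be translated, else None."""
    try:
        compile(text, "<check>", "exec")
        return None
    except SyntaxError as err:
        return err.msg


# A missing closing parenthesis is caught without running a single line.
problem = check("total = price * (quantity")
```

The translation step only catches some kinds of mistakes (here, a malformed line). Others, such as dividing by zero, still only show up when the program actually runs.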

Now most compiled languages tend toward certain characteristics which benefit the compiler writer. That is, the more the compiler writer knows ahead of time, the more they can program optimizations into the compiler. These language characteristics might include features such as static typing, where each variable must be declared before use, along with the kind of information it holds (e.g., integer, float, character, string, etc.). From then on, that variable can only ever store that one type of information. Why would you want static typing? It allows the compiler to go through your entire program and make sure you did not inadvertently try to stick a string into an integer variable, etc.
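Here is a toy sketch, in Python, of the kind of bookkeeping a compiler does for static typing. It is not how any real compiler is written; `declared`, `assignments`, and `check_types` are hypothetical names I made up to show the idea of comparing each assignment against the declared type before anything runs.

```python
# Hypothetical declarations: each variable is tied to one type up front.
declared = {"count": int, "name": str}

# Assignments found while reading through the program text.
assignments = [("count", 10), ("name", "Ada"), ("count", "ten")]


def check_types(declared, assignments):
    """Report assignments whose value does not match the declared type."""
    errors = []
    for variable, value in assignments:
        expected = declared.get(variable)
        if expected is None:
            errors.append(f"{variable}: used before being declared")
        elif type(value) is not expected:
            errors.append(
                f"{variable}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


errors = check_types(declared, assignments)
print(errors)
```

The third assignment tries to put the string "ten" into an integer variable, so the checker flags it without the program ever being run.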

Oversimplifying, there was a general sense that programmers who used compiled languages knew what they were doing. And because of this, the early compiled languages also typically did not offer features such as garbage collection (fancy term for cleaning up memory that is not in use). The onus was typically on the programmer to allocate/free memory as needed in their code.

On the other hand, applications written in interpreted languages such as BASIC could be run on any system which understood that dialect of BASIC.¹ These programs, however, ran slower due to the need to interpret the code from pseudo-English to the language of the machine each and every time the application was run.

However, interpreted languages, in part due to their late execution model, often picked up features difficult to accomplish in a compiled language. And in the last few decades, interpreted languages such as Perl, PHP, Python, and Ruby have become very popular in part due to such features. These include things such as dynamic variable type assignment (where a variable can hold an integer one minute, and a user’s name the next) and garbage collection (where the programmer does not have to worry about memory leaks, as the interpreter will clean up any unused memory over time, like a maid going through a teenager’s room), etc. Many of these features made programming “easier,” as programmers did not have to worry about certain mundane issues common in compiled languages.
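Both features can be seen in a few lines of Python. This is only a sketch: the `Report` class and the function names are invented for the example, and exactly when memory gets cleaned up depends on the particular Python implementation, which is why the code asks the collector to run explicitly.

```python
import gc
import weakref


def dynamic_typing_demo():
    """The same variable holds an integer, then a user's name."""
    value = 42
    kinds = [type(value).__name__]
    value = "Ada Lovelace"          # no declaration needed to change type
    kinds.append(type(value).__name__)
    return kinds


class Report:
    """A stand-in for some object that takes up memory."""


def garbage_collection_demo():
    """Drop the only reference to an object and see whether it is cleaned up."""
    report = Report()
    watcher = weakref.ref(report)   # lets us peek without keeping it alive
    report = None                   # the program no longer needs the object
    gc.collect()                    # ask the collector to tidy up now
    return watcher() is None        # True once the object has been reclaimed


kinds = dynamic_typing_demo()
cleaned_up = garbage_collection_demo()
```

Notice that nothing in the code explicitly frees the memory used by the report; the programmer simply stops referring to it and the language runtime does the rest.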

It should be noted, of course, that “imitation is the sincerest form of flattery.” And just as different OS makers copy from each other, over time programming language developers have copied ideas from other languages. Today this includes compiled languages such as Golang having garbage collection and variable type inference (where you set a variable to a value without declaring its type, allowing the compiler to “infer” the type from the first thing assigned to it). But I am getting ahead of myself.

The other difference is that, generally speaking, compiling a program prevents the average person from being able to see the source code, or how the program logic actually functions. You get a binary blob that you run and just hope it does what the developer claims it will do. (Yes, there are tools such as disassemblers, but they lie beyond the reach of less technical folks, and this goes well beyond this simple comparison.) Interpreted programs, however, are mostly just text files of source code which are then fed through an interpreter when run. This means the source code is usually exposed/available, whether the developer wishes it or not.

And so it has been. The general notion, which really has not changed in 50 years, is that compiled languages provide developers with the option to hide their coding logic. (I say “option to hide” as nothing prevents developers from also providing the source code to their binary programs. And many do. Hence the entire FLOSS–Free/Libre Open Source Software–movement.) And compiled languages tend to run much faster. But compiled applications are limited to a single OS/architecture.

On the other hand, interpreted languages tend to be more open with their code, run slower, but can often run on multiple OSes/architectures. The only requirement is that the interpreter for the language, possibly along with any additional modules/libraries used, be installed on each system where the application runs.

And then Java came along.

¹ I say “dialect” as BASIC did not have a true standard. Each version tended to have its own unique commands for certain bits, meaning if you used such unique commands, your BASIC program would not run on other systems.