6. What is a Programming Language?


We've gotten started writing our first programs using the Python computer language. But what is a computer language? What other languages are there? How do they work?

This section will briefly dive into how computers work. We won't spend more than a chapter on this subject because you don't need to understand any of this to get started programming. But once you become a more advanced programmer, that's no longer true. If you want to get the best performance, if you want to debug complex problems, if you work on building a platform for your servers to run on, you need to understand what happens "under the hood."

Remember in the last chapter how RGB values were specified from 0-255? The reason for that choice comes from how the computer is built. Understanding how computers work helps us understand why 255 is a special number.
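Here is a short sketch of where 255 comes from. It assumes, as is standard for RGB colors, that each color component is stored in one byte made of 8 bits. The variable names are only for illustration.

```python
# Assumption: each RGB component is stored in a single byte (8 bits).
BITS_PER_COMPONENT = 8

# Each bit can be 0 or 1, so 8 bits give 2 ** 8 different patterns.
number_of_values = 2 ** BITS_PER_COMPONENT

# Counting starts at 0, so the largest value is one less than that.
largest_value = number_of_values - 1

print(number_of_values)    # 256
print(largest_value)       # 255
print(bin(largest_value))  # 0b11111111 (all eight bits switched on)
```

So 0-255 is simply every value that fits in eight on/off switches.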

Learning to drive a car is a good analogy. You don't need to understand how an engine works to drive a car, but it helps if you want performance, reliability, and to know if you are getting a good deal from the repair shop. In this chapter we'll introduce a few concepts to get started.

6.1. Central Processing Unit

Computers have a chip called the Central Processing Unit (CPU). The CPU functions as the main "brain" of the computer. For example, right now you might have a CPU called an Intel i7 or an AMD FX in your computer. Your phone might have the Qualcomm Snapdragon 855 as its CPU. A dishwasher could have an ARM Cortex-M3. Desktop CPUs emphasize speed, phone CPUs emphasize low power consumption, and dishwasher CPUs emphasize low cost.

Note

The CPU is the "brain" of the computer.

Figure: Intel i7 CPU (source: Wikimedia Commons)

The CPU knows what to do by reading in a sequence of instructions. Each instruction the computer reads is a number. For example the number "4" might be an instruction to add two other numbers together.

Everything stored on the computer is saved as a long sequence of numbers. Some numbers are instructions. Some numbers represent data, such as text, photos, and movies.
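As a small illustration of data stored as numbers, the sketch below uses Python's built-in ord() and chr() functions to show the numbers behind a piece of text. The variable names are made up for this example.

```python
# Each character of text is stored as a number.
message = "Hi"
numbers = [ord(character) for character in message]
print(numbers)  # [72, 105]

# The same numbers can be turned back into text.
text_again = "".join(chr(number) for number in numbers)
print(text_again)  # Hi
```

Photos and movies work the same way, just with far more numbers.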

6.1.1. Graphics Processing Unit

In addition to a CPU, computers often have a Graphics Processing Unit (GPU). The GPU is a processor whose primary purpose is to run graphics displays. In fact, high-end graphics cards can have not just one processor, but 2,500 processors! We call each processor a "core," and a GPU is often made of many cores. The more cores we have, the more calculations we can run at the same time.

GPUs aren't just used for graphics anymore. They are also very useful for any type of task with simple calculations that can be broken into many parts. Physics simulations, artificial intelligence, and data analytics can often make use of a computer's GPU.
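The idea of breaking a task into many parts can be sketched in ordinary Python. This example does not use a GPU at all. It runs on the CPU with a few worker threads from the standard concurrent.futures module, and it only illustrates the "split the work, then combine the answers" pattern. The function name and chunk size are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """A simple calculation carried out on one part of the data."""
    return sum(value * value for value in chunk)


numbers = list(range(1, 101))

# Break the work into four equal parts of 25 numbers each.
chunk_size = 25
chunks = [numbers[start:start + chunk_size]
          for start in range(0, len(numbers), chunk_size)]

# Hand each part to a separate worker, then combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(sum_of_squares, chunks))

total = sum(partial_results)
print(total)  # 338350
```

A GPU applies the same pattern, but with hundreds or thousands of cores instead of four workers.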

6.2. Computer Languages

Computer languages are divided into three broad categories: first generation, second generation, and third generation languages.

6.2.1. First Generation Languages - Machine Code

In the early days of computing, programmers entered sequences of numbers that represented commands for the CPU. Programmers also entered sets of numbers as data for the computer to process.

Note

Machine code is the native language of any computer.

We call these numbers that are CPU instructions machine code. All machine code is made of numbers, but not all numbers are machine code. Some of the numbers might be data to hold text or images. Machine code is also called a First Generation Language (1GL).
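To make this concrete, here is a toy "machine" written in Python. The instruction numbers are invented for this example and do not match any real CPU. The sketch only shows how a list of numbers can hold both instructions and data.

```python
# A made-up instruction set for a toy machine (not a real CPU):
#   1 = load the next number into the accumulator
#   4 = add the next number to the accumulator
#   0 = stop
LOAD, ADD, HALT = 1, 4, 0


def run(program):
    """Run a list of numbers as a program and return the accumulator."""
    accumulator = 0
    position = 0
    while program[position] != HALT:
        instruction = program[position]
        operand = program[position + 1]  # the number right after it is data
        if instruction == LOAD:
            accumulator = operand
        elif instruction == ADD:
            accumulator += operand
        else:
            raise ValueError(f"Unknown instruction: {instruction}")
        position += 2  # skip past the instruction and its data
    return accumulator


# Instructions and data are mixed in one list of numbers.
program = [1, 10, 4, 32, 0]
print(run(program))  # 42
```

Notice that the same number can be an instruction in one position and data in another. In the program [1, 4, 4, 4, 0], the first and third 4 are data, while the middle 4 means "add," so the result is 8.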

Below is an image of the Altair 8800, the first personal computer that regular people could buy. Notice that it is missing a monitor and a keyboard! The first computers loaded instructions by flipping switches. A pattern of switches represented a machine instruction. So you'd flip switches, hit store, flip more switches, hit store, and keep at it until all instructions and data were entered. When you were finally done you would hit the "Run" button. And the lights would blink.

While this may not seem very useful (and quite frankly, it wasn't) it was very popular in the hobbyist community. Those people saw the potential.

Computers still run on machine code. You can still code by punching in numbers if you want. But you'd be crazy to, because hand-coding these numbers is so tedious. There's something better: assembly language.

6.2.2. Second Generation Languages - Assembly

In order to make things easier, a computer scientist named Kathleen Booth came up with something called assembly language. Assembly language is a Second Generation Language (2GL). Assembly language looks like this:

Don't worry! We aren't coding in assembly language for this class.

Assembly language allows a programmer to edit a file and type in codes like LDA, which stands for "Load Accumulator." The programmer types these commands into a source file. We call the commands source code. The computer can't run the source code as-is. The programmer runs a compiler that simply translates each command, like LDA, into the corresponding number of the machine language instruction. (A compiler for assembly language is more specifically called an assembler.)

Note

A compiler turns human-readable code into machine code.

After the programmer compiles the source code into machine code, the programmer can run the compiled code. The compiled code can be given to someone else and they can run it. They do not need the source code or the compiler.
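The translation step can be sketched in a few lines of Python. This is a toy, not a real compiler: LDA comes from the text above, but the ADD and HLT commands and every number in the OPCODES table are invented for this example.

```python
# Made-up numbers for each command; a real CPU defines its own.
OPCODES = {"LDA": 1, "ADD": 4, "HLT": 0}


def assemble(source_code):
    """Translate lines such as 'LDA 10' into a flat list of numbers."""
    machine_code = []
    for line in source_code.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, operands = parts[0], parts[1:]
        machine_code.append(OPCODES[command])
        machine_code.extend(int(operand) for operand in operands)
    return machine_code


source_code = """
LDA 10
ADD 32
HLT
"""
print(assemble(source_code))  # [1, 10, 4, 32, 0]
```

The output list plays the role of the compiled code: it can be handed to someone else without the source code or the translator.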

Assembly language is an improvement over machine language. But it isn't that much of an improvement. Why? Assembly language instructions are very low-level. There are no commands like "draw a building here." Or even "print hi." There are only mind-numbingly simple commands that move bits from one spot to another, add them, and shift them.

6.2.3. Third Generation Languages

Third Generation Languages (3GL) emerged in the 1950s with languages such as FORTRAN and COBOL. Grace Hopper's work on early compilers and on the FLOW-MATIC language heavily influenced COBOL. There are many, many different third generation languages now. These languages often specialize in certain tasks. For example, the language C is great at creating small, fast programs that can run on minimal hardware. PHP is an easy-to-use language that can build websites.

Note

Many of the pioneers of computing were women. See Grace Hopper, Hedy Lamarr, and Ada Lovelace for examples. If you want to find other female programmers who code in Python, check out @PyLadies, @DjangoGirls, and @WomenWhoCode.

Third generation languages usually fall into one of three categories.

  • Compiled: The computer takes the original source code, and uses a compiler to translate it to machine code. The user then runs the machine code. The original source code is not needed to run the program. "C" is an example of a language that works this way. So is the 2GL assembly language we just talked about.

  • Interpreted: The computer looks at the source code and translates/runs it line-by-line. The compile step is not needed, but the user needs both the source code and an interpreter to run the program. Python is an example of an interpreted language.

  • Runtime Environment: Languages such as Java and C# take source code and compile it to a machine language. But it is not the language of your actual machine. Instead, they compile to the language of a virtual machine. This is a separate program that acts as a layer between the real machine and the compiled code. This allows for better security, portability, and memory management.

Working with a compiled language is like taking a book in Spanish and translating it to English. You no longer need the Spanish book, and you don't need the translator. However, if you want to edit or change the book you have to re-translate everything.

Working with an interpreted language is like working with an interpreter. You can communicate back and forth with a person that knows both English and Spanish. You need the original Spanish, the English, and the interpreter. It is easier to make ad-hoc changes and carry out a dialog. Interpreters often help prevent computers from running commands that will cause major crashes or common security issues. Kind of like having a human interpreter that says, "You don't really want to say that."

Using a runtime environment is hard to explain in human terms. It is a hybrid of the two systems. You need source code. You need a compiler. Instead of the compiler making machine code for the CPU, it makes machine code for a virtual machine.
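The standard Python interpreter (CPython) offers a handy way to peek at the virtual machine idea. Behind the scenes, it first translates source code into "bytecode" for its own virtual machine and then runs that bytecode. The sketch below uses the built-in compile() and exec() functions to show the two steps separately. The file name "<example>" and the variable names are placeholders.

```python
source_code = "total = 2 + 3"

# Step 1: translate the source code into bytecode for Python's virtual machine.
code_object = compile(source_code, "<example>", "exec")
print(type(code_object.co_code))  # <class 'bytes'>

# Step 2: have the virtual machine run that bytecode.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 5
```

Normally Python does both steps for you every time you run a program, which is why it still feels like a purely interpreted language.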

6.3. Python as a Computer Language

Python is a great language to start programming in. Python consistently ranks among the top five languages in the TIOBE Index. Compared with Java, another widely used language, it is easier to read and learn, and less work is required to do graphics. And everything you learn in Python you can also apply when you learn other popular languages, such as C# or Java.

Python is a great language for people interested in automating boring things, because you can program repetitive tasks to happen automatically. Python is also extremely popular in data analytics. Typically researchers will use add-on libraries and tools like Pandas and Jupyter Notebooks.

6.3.1. Python 2.x vs. Python 3.x

There are two main versions of Python. When Python moved to version 3, there were changes that didn't work with all the currently written Python 2 programs. It was too much work to suddenly rewrite thousands of Python 2 programs. So both Python 2 and Python 3 were being developed simultaneously for a while.

We are using Python 3. Why does Python 2 matter to us?

  • If you search for examples on the web, you might find incompatible Python 2 examples.

  • Some systems, such as older versions of the Mac and Linux, have Python 2 installed by default.

If you see a Python example on the web that has a print statement that looks like:

# A "print" statement with Python Version 2.x
print "Hi"

Instead of:

# A "print" statement with Python Version 3.x
print("Hi")

Then you have a Python 2 example and it won't run with what we install and use in this class.

On the Mac and Linux, it is important to use Python 3 and not Python 2. If Python 2 is installed by default on your system, it can be a bit of a hassle to make sure you are running Python 3.
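If you aren't sure which version is running your code, Python can tell you. This is a minimal sketch using the standard sys module. The message text and variable name are just examples.

```python
import sys

# sys.version_info holds the version of the Python that is running this file.
major_version = sys.version_info.major
print("Running Python", major_version)

if major_version < 3:
    print("This is Python 2. The examples in this class need Python 3.")
```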

6.4. Review

In this chapter we learned what a CPU is, and that computer instructions are simply numbers fed into the CPU. We learned about first, second, and third generation computer languages. Second and third generation languages have programmers write source code that is saved into source files. Those files are used by either a compiler or an interpreter to turn the source code into machine language.

Some languages compile code to a set of instructions for a virtual machine, and the virtual machine can run on multiple different types of systems.

The language we are using for this class is called Python. It is one of the top five computer languages in use today.

6.5. Review Questions

  1. What do we call the main "brain" of the computer where all the processing happens?

  2. Instructions for a CPU are made up of a long sequence of what?

  3. What is the name of the native language for CPUs?

  4. What is the difference between a CPU and a GPU?

  5. A GPU can process commands using hundreds or thousands of what?

  6. If machine language is a first-generation language, what is the second-generation language?

  7. What do we call the file that programmers type commands into?

  8. What is the name of the program that turns assembly language into machine language?

  9. Third-generation languages usually fall into what three categories?

  10. What is the difference between a compiler and an interpreter?

  11. What generation of language is Python?

  12. What are some of the most popular languages in use today, according to the TIOBE Index?