First known programmable device (Turkish inventor Al-Jazari's water-powered Castle Clock)
English mathematician Charles Babbage describes what he calls the "Analytical Engine", a fully programmable mechanical computer, able to perform calculations automatically. In the years leading up to the Analytical Engine, Babbage had built simpler mechanical calculators (which he called "Difference Engines"), but financial issues with those led him to conceive of this more ambitious machine capable of general-purpose computation. Sadly, he was unable to actually build the device at the time, but the designs included the ability to store and process data, with input and output accomplished through punched cards inspired by Jacquard looms. Instructions for the machine to follow (i.e., programs) then took the form of sequences of these cards.
Ada Lovelace (daughter of poet Lord Byron), as part of her work translating Luigi Menabrea's paper on the Analytical Engine, writes notes on how to program the device to calculate Bernoulli numbers [which appear in the general formula for the sum of the m-th powers of the first n positive integers]. Consequently, she is considered the "first programmer".
Babbage finally builds a piece of his "Analytical Engine"; however, limited finances, the complex mechanical engineering required, and other obstacles kept him from ever completing the rest of the device.
A computer is a machine, consisting of both hardware and software, that manipulates data according to a list of instructions.
Modern computers generally have the following (hardware) components:
These components talk to one another through various pathways of wires and circuits, each called a "bus". The system bus connects the CPU, memory, and I/O devices and serves as the main highway for data exchange within the system. It is usually described in three parts: the address bus carries the memory address that says where data should be written to or read from; the data bus carries the actual data being moved between components; and the control bus transmits commands and status signals between the CPU and devices.
The computer's memory is responsible for (temporarily) storing data and program instructions for the CPU. There are various kinds of memory. RAM (Random Access Memory) is volatile (i.e., data stored in it is lost when the computer is powered off); it is fast, and is used for actively running programs and for data the CPU needs quickly. ROM (Read-Only Memory) is non-volatile; it holds essential startup instructions (firmware/BIOS), changes infrequently, and is used for booting the computer. Cache memory is a very small, extremely fast memory (SRAM) built into or near the CPU; it holds frequently used data for near-instant access, which significantly speeds up processing.
Storage devices like hard drives, DVDs, SSDs, etc., offer a more permanent means of remembering data even after the machine is turned off.
Programs and data need to be loaded into main memory (RAM) before execution and use by the CPU.
When data is stored in memory, everything is coded as a series of bits.
A "bit" can be expressed as a binary digit taking the value of 0 or 1, although its physical manifestation in a computer can take a variety of forms. All of these, however, have only two states (e.g., the presence or absence of a small capacitive charge in DRAM memory; a small region on an HDD platter that is magnetized in one of two directions; a tiny transistor on an SSD (1/300th the width of a human hair) that stays in either a charged or an uncharged state; the presence or absence of a microscopic "dimple" in a DVD that diffuses or reflects a laser beam into a light sensor; etc.).
A "byte" is a sequence of 8 "bits" (a convenient power of 2 with enough bits to encode the alphabet and other necessary characters). Fortunately, programmers using a high-level programming language like Java generally don't need to be concerned with encoding and decoding data to and from a series of bits; this is performed automatically. However, some understanding of how this encoding happens can explain certain unavoidable weird behaviors many high-level languages exhibit.
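As a small illustration of the kind of surprise that binary encoding can cause, the sketch below adds two decimal fractions that cannot be stored exactly in binary, and pushes an int past its largest value. The class and method names are made up for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class EncodingSurprises {

    // Adds 0.1 and 0.2 as doubles. Neither value can be stored exactly
    // in binary, so the sum is not exactly 0.3.
    static double sumOfTenths() {
        return 0.1 + 0.2;
    }

    // Adds 1 to the largest int. An int has only 32 bits, so the result
    // "wraps around" to the smallest int instead of growing.
    static int pastTheMaximum() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(sumOfTenths());          // 0.30000000000000004
        System.out.println(sumOfTenths() == 0.3);   // false
        System.out.println(pastTheMaximum());       // -2147483648
    }
}
```

Neither result is a bug in Java; both follow from storing values in a fixed number of bits.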
You can fairly easily convert between binary and decimal number systems - instead of expressing numbers in the "decimal way" (i.e., base 10) as sums of multiples (0-9) of powers of 10, we write them in "binary" as sums of multiples (0-1) of powers of 2. In both cases, the coefficients on the powers give the digits needed.
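The conversion described above can be sketched in a few lines of Java. The class and method names here are hypothetical, chosen only for illustration; the last two lines of main use the standard library methods Integer.parseInt and Integer.toBinaryString, which do the same job.

```java
// Hypothetical helper class for converting between binary and decimal.
class BinaryConversion {

    // Converts a string of binary digits to its decimal value, working
    // from the leftmost digit to the rightmost.
    static int toDecimal(String bits) {
        int value = 0;
        for (int i = 0; i < bits.length(); i++) {
            int digit = bits.charAt(i) - '0';   // '0' -> 0, '1' -> 1
            value = value * 2 + digit;          // shift earlier digits one place left
        }
        return value;
    }

    // Converts a non-negative integer to binary digits by repeatedly
    // dividing by 2 and collecting the remainders (lowest digit first).
    static String toBinary(int n) {
        if (n == 0) {
            return "0";
        }
        StringBuilder digits = new StringBuilder();
        while (n > 0) {
            digits.insert(0, n % 2);   // put each new digit in front
            n /= 2;
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        // 10001 in binary = 1*16 + 0*8 + 0*4 + 0*2 + 1*1 = 17
        System.out.println(toDecimal("10001"));             // 17
        System.out.println(toBinary(17));                   // 10001
        // The standard library offers the same conversions:
        System.out.println(Integer.parseInt("10001", 2));   // 17
        System.out.println(Integer.toBinaryString(17));     // 10001
    }
}
```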
In a DRAM (dynamic RAM) memory cell, a 0 or a 1 is represented by a paired transistor and capacitor. The capacitor holds the bit of information, while the transistor acts as a switch that lets the control circuitry on the memory chip read the capacitor or change its state. Think of a capacitor as a small bucket that stores electrons. To store a 1, the bucket is filled with electrons. To store a 0, it is emptied. Unfortunately, capacitors leak. Without the intervention of the memory controller, within a few milliseconds a capacitor's bucketful of electrons becomes empty. As such, to store information for any length of time, the computer must (very frequently) recharge all of the capacitors holding a 1 before they discharge. To do this, the memory controller reads the memory and then writes it right back again. This refresh operation happens automatically thousands of times per second.
You can envision memory as a list of positions (identified by different, sequential memory addresses) where data in the form of some fixed number of ones and zeros is stored at each address, as the table below suggests. Note that there is nothing about an individual byte of data that identifies it as a letter, number, or anything else. How we should interpret the ones and zeros at a particular memory address, in terms of the type of data they are meant to represent, must additionally be stored (someplace else). Consider the addresses 2003 and 2004 below. The data at 2003 is interpreted as the letter "a", while the data at 2004 is interpreted as the value 17. However, nothing in the memory contents shown tells us to do that. Without some information elsewhere, we would not be sure how to interpret those ones and zeros!
| Memory Address | Memory Content | Decoded As |
| --- | --- | --- |
| . | . | . |
| . | . | . |
| 2000 | 01001010 | character "J" |
| 2001 | 01100001 | character "a" |
| 2002 | 01110110 | character "v" |
| 2003 | 01100001 | character "a" |
| 2004 | 00010001 | number "17" |
| . | . | . |
| . | . | . |
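To see that the interpretation lives outside the bits themselves, the following sketch reads the bit patterns from addresses 2003 and 2004 in the table and decodes them in different ways. The class and method names are assumptions made for this example.

```java
// Hypothetical demo: the same bits mean different things depending on
// how the program chooses to decode them.
class SameBitsDifferentMeaning {

    // Interprets an 8-bit pattern as an unsigned number.
    static int asNumber(String bits) {
        return Integer.parseInt(bits, 2);
    }

    // Interprets the same 8-bit pattern as a character code.
    static char asCharacter(String bits) {
        return (char) Integer.parseInt(bits, 2);
    }

    public static void main(String[] args) {
        String at2003 = "01100001";   // contents of address 2003 in the table
        String at2004 = "00010001";   // contents of address 2004 in the table

        System.out.println(asCharacter(at2003));  // a
        System.out.println(asNumber(at2003));     // 97
        System.out.println(asNumber(at2004));     // 17
    }
}
```

The pattern 01100001 is the letter "a" or the number 97 depending only on which method is asked to decode it.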
A computer program is essentially just a set of instructions the computer can execute via a Fetch-Decode-Execute Cycle (sometimes just called the Fetch-Execute Cycle) based on the current program counter, a critical register which holds the memory address of the next instruction to be executed:
Fetch: The CPU retrieves an instruction from main memory (RAM) based on the program counter, and loads it into the current instruction register.
Decode: A control unit interprets the instruction (which is stored as a bunch of 1s and 0s) as some operation to be conducted or applied to some data.
Execute: An arithmetic logic unit conducts the operation. This may involve calculation or movement/storage of data.
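The cycle above can be imitated with a toy simulator. This is only a sketch under loud assumptions: the four opcodes, the two-cells-per-instruction layout, and the single "accumulator" register are invented for this example and do not correspond to any real processor.

```java
// A toy machine that illustrates the fetch-decode-execute cycle.
// The instruction set below is hypothetical, not a real CPU's.
class ToyCpu {
    static final int HALT = 0;   // stop and report the accumulator
    static final int LOAD = 1;   // accumulator = memory[operand]
    static final int ADD = 2;    // accumulator = accumulator + memory[operand]
    static final int STORE = 3;  // memory[operand] = accumulator

    // Runs the program stored in memory, starting at address 0, and
    // returns the final value of the accumulator register.
    static int run(int[] memory) {
        int programCounter = 0;
        int accumulator = 0;
        while (true) {
            // Fetch: read the instruction (opcode and operand) that the
            // program counter points at, then advance the counter.
            int opcode = memory[programCounter];
            int operand = memory[programCounter + 1];
            programCounter += 2;

            // Decode and execute: pick the operation named by the opcode.
            switch (opcode) {
                case LOAD:
                    accumulator = memory[operand];
                    break;
                case ADD:
                    accumulator += memory[operand];
                    break;
                case STORE:
                    memory[operand] = accumulator;
                    break;
                case HALT:
                    return accumulator;
                default:
                    throw new IllegalStateException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] memory = {
            LOAD, 8,     // addresses 0-1: accumulator = memory[8]
            ADD, 9,      // addresses 2-3: accumulator += memory[9]
            STORE, 10,   // addresses 4-5: memory[10] = accumulator
            HALT, 0,     // addresses 6-7: stop
            17, 25, 0    // addresses 8-10: data
        };
        System.out.println(run(memory));   // 42
        System.out.println(memory[10]);    // 42
    }
}
```

Note that the program and its data sit in the same array, which echoes the earlier point that memory contents carry no label saying whether they are instructions or data.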
Computer programs include general-purpose application software (which end users use) and operating systems. There are other specialized types of programs as well, including device drivers that allow the operating system (OS) to communicate with specific hardware; language processors, including compilers and interpreters; utility software that helps maintain and optimize the computer system (antivirus tools, disk cleaners, backup tools); and programming software used to create other software (IDEs, debuggers, code editors).
There are various types of operating systems. The most common include Windows, macOS, and Unix/Linux variants. The operating system for a computer is the program that manages and controls the computer's activities and all the programs (browsers, word processors, games, etc.) that it might run, including the programs that you might write!
The tasks of the OS include:
Controlling and monitoring system activities
Allocating and assigning system resources
Scheduling operations
A program's set of instructions is specified using a programming language. There are different types of programming languages. For example:
Machine language
Assembly language
ADDF3 R1, R2, R3
High-level languages were developed to make programming even easier still, both to learn and to use. They do a better job of mimicking our natural language, and are hence far more readable. Generally, the closer a programming language gets to natural language, the higher "level" that language is considered.
Similar to how assembly source files must be "assembled" with an assembler to create the machine code file, source code written in a high-level language must either be "interpreted" by an interpreter or "compiled" with a compiler. The difference between these two is that an interpreter reads the source code one statement at a time, translates the statement into machine code (or virtual machine code), and then immediately executes it. A compiler, on the other hand, translates all of the source code into a machine language program called an "object program". This object program can then be linked with supporting libraries using a "linker" to create an executable file you can run on the computer.
Note that supporting libraries keep programmers from having to reinvent the wheel every time they write a program. For example, lots of programs need to display a dialog box to the user. Rather than every programmer having to write his or her own instructions for how to display a dialog box, a common set of these instructions sits inside a library that can be called upon by the programmer. In this way, the programmer can focus more on the details unique to his or her particular program.
In addition to Java, popular higher-level languages include the following (be aware, some are a higher level than others):
Java was developed by James Gosling and the "Green Team" (a.k.a. "FirstPerson") group from Sun Microsystems, who originally were trying to take advantage of the coming trend of consumer device and computer convergence.
The language was originally named "Oak" for a tree outside Gosling's office.
The team developed the language to create a demo of one of these convergent devices: an interactive, handheld home entertainment device controller with an animated touch screen. After failing to successfully pitch the idea to the cable companies, the team focused their attention on using the language to develop a similar content delivery system for the internet.
The team created a Java-technology-based clone of Mosaic named "WebRunner" (named after the movie "Blade Runner"), which later became the "HotJava" browser.
Marc Andreessen of Netscape made an agreement to integrate Java into the Netscape browser and the rest is history.
Why the name Java? There were initially ten names considered (i.e., Silk, Lyric, Pepper, NetProse, Neon, Java, DNA, WebDancer, WebSpinner, WRL for "WebRunner Language"). There are various disputes and failed recollections as to who actually suggested it be named "Java".
Java involves two primary elements:
The Java Language Specification -- which consists of the syntax and semantics of the Java language
The Java API (Application Programming Interface) -- which consists of predefined classes for developing Java programs
Additionally, there is the Java Development Kit (JDK), which programmers use to write Java programs. There are different editions of the JDK:
Java Standard Edition (Java SE) - the core Java platform, providing the foundational APIs for general-purpose desktop and server environments. This is the base for all other editions. (We will use this edition!)
Java Enterprise Edition (Java EE / Jakarta EE) - used to develop large-scale, distributed, server-side applications and web services
Java Micro Edition (Java ME) - used to develop applications for resource-constrained environments such as mobile phones and embedded systems (printers, set-top boxes, etc.)
These editions of the JDK have evolved over time. The first version, JDK 1.02, was released in 1995. The latest version released (as of September, 2025) is JDK 25.
If desired, one can create a Java program using only a text editor, and compile and run it using only the command-line programs javac and java, respectively. javac is the Java compiler and produces class files containing Java bytecode. java launches the Java Virtual Machine (JVM), which executes the Java bytecode instructions contained in the class files produced by javac. Compiling and running things from the command line is perhaps not the most efficient way of doing things, but it will work.
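As a concrete sketch of this command-line workflow, here is a minimal program together with the two commands that compile and run it. The file name, class name, and greeting method are assumptions chosen for illustration.

```java
// Assumed to be saved in a file named HelloWorld.java
//
// Compile:  javac HelloWorld.java     (produces HelloWorld.class)
// Run:      java HelloWorld           (note: no .class extension)
class HelloWorld {

    // Builds the text to display; kept separate so it is easy to check.
    static String greeting() {
        return "Hello, World!";
    }

    // The JVM starts executing the program here.
    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Running the second command prints Hello, World! to the console.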
The process of writing programs is generally easier if one works within an IDE (Integrated Development Environment). This is a software application that provides comprehensive facilities for software development, such as:
Popular IDEs for Java include:
However you plan on writing, compiling, and running your programs -- there is a certain order to the process, as suggested below:
Create/Edit Source Code
You can use a text editor, like "notepad" or "edit" in Windows, or "vi", "vim", or "gedit" in Unix, to type your Java source code. Several of these editors are designed to run from a command line interface (CLI) through a shell window. Each program file should have a "*.java" extension.
Compile Source Code
When the program is written, it is then compiled with the Java compiler, "javac <my sourcefile>", which is included in the JDK.
Fix Syntax Errors
If there were errors in your source files that keep the compiler from compiling your program (these types of errors are called "syntax errors"), re-edit the source files to eliminate them and then recompile. If there were no such errors, then a Java bytecode file will be produced (i.e., a file with a "*.class" extension).
Run Bytecode
The Java bytecode file is a set of instructions that can be executed by the Java Virtual Machine (JVM). To start the JVM, one can type "java <my classfile>" from the command line. This runs the program.
The *.class files are not meant to be directly read by humans. If you really want to know what is inside them, you can use "javap -c -s -verbose <classfile>" from the command line to translate a class file into bytecode you can actually read (which is printed to "stdout" by default). This is called "disassembling" the class file. In this class, you should never have to do this - just know that it can be done. Expert programmers disassemble class files to discover how they might tweak the associated programs into running more efficiently. Note: If you do execute the above command, leave off the .class extension just like you do when running "java".
Fix Runtime and Logical Errors
There might still be errors present in your program, despite the fact that it compiled. Sometimes, a program crashes in the middle of running. This is called a "runtime error". Sometimes the program runs, but doesn't do what you want it to do. This is called a "logical error". If you encounter a runtime or logical error, then, just like before, go back to the source files, fix the error, recompile, and run it again. Repeat this process until all of the errors are gone.
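The difference between these two kinds of errors can be seen in a short sketch. All names below are hypothetical. Every method compiles without complaint, yet one crashes when given a particular input and another quietly returns the wrong answer.

```java
// Hypothetical examples of errors that the compiler does not catch.
class ErrorKinds {

    // Runtime error: this compiles, but integer division by zero throws
    // an ArithmeticException when count is 0.
    static int average(int total, int count) {
        return total / count;
    }

    // Logical error: this compiles and runs without crashing, but the
    // formula is wrong. It was meant to compute the average of a and b.
    static int buggyAverageOfTwo(int a, int b) {
        return a + b / 2;      // divides only b by 2
    }

    // Corrected version: parentheses force the sum to be computed first.
    static int averageOfTwo(int a, int b) {
        return (a + b) / 2;
    }

    public static void main(String[] args) {
        System.out.println(buggyAverageOfTwo(10, 20));  // 20 (wrong)
        System.out.println(averageOfTwo(10, 20));       // 15 (intended)
        System.out.println(average(30, 0));             // crashes here
    }
}
```

The last line of main stops the program with an ArithmeticException, while the logical error produces no message at all, which is why testing with inputs whose correct answers are known matters.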