Computer Science

An Introduction

Tagged: Artificial Intelligence, Math

People often ask me what I do or about what I am studying. Many have certain misconceptions and stereotypes that render the simple answer of “Computer Science” insufficient. For example, the vast majority of non-technical people with whom I’ve talked seem to think that learning new programming languages and writing programs are the primary areas of study for computer-related university majors. That’s like believing literature majors go to university to learn the intricacies of using pens and typewriters. In the ~7 years—and counting (gasp!)—in which I’ve been in higher education, I haven’t been taught a single programming language.

The following is an attempt on my part to answer these questions, in the hopes that I can hereafter simply refer people to this page instead of having to explain this for the thousandth time.

What is Computer Science?

The vast majority of people who call themselves “Computer Scientists” are occupied with solving problems. This raises the question: What is a problem? In this context, a problem is a language. Do I mean a language in the sense of “English,” “Russian,” or “Pig Latin”? In a way, yes. Now, let me explain.

Languages and problems

A language is really just a set of grammar rules and a set of words that constitute a vocabulary. For example, everything I’ve written so far is part of the English language because (1) all of the sentences conform to the English grammar, and (2) all of the words in the sentences are in the English vocabulary. Although the sentence “Evan writing” contains two words that are both in the English vocabulary, the lack of a conjugated verb (e.g., “Evan is writing”) means that it does not conform to the English grammar and is therefore not an English sentence. Likewise, the sentence “Boy ball threw” is not in English because the grammar’s word-order rules make it unclear which noun is the subject and which is the object.

It turns out that lots of real-world problems—and virtually all computational problems—can be represented in terms of languages. For example, say you want to check whether a list of numbers is sorted. That is the same as defining a language in which the vocabulary is the set of all numbers and the grammar only allows a number to be followed by numbers at least as large; checking whether the list is sorted then becomes testing whether the list is a member of that language.
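
To make that recasting concrete, here is a minimal sketch in Python (the function name and list representation are my own choices, nothing standard): membership in the “sorted lists” language reduces to a single pass over the sentence.

```python
def in_sorted_language(sentence):
    """Return True iff `sentence` (a list of numbers) is a member of the
    sorted-list language, i.e. its numbers appear in nondecreasing order."""
    return all(a <= b for a, b in zip(sentence, sentence[1:]))

print(in_sorted_language([1, 2, 2, 5]))  # True: a valid "sentence"
print(in_sorted_language([3, 1, 4]))     # False: 1 may not follow 3
```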

The majority of the work in Computer Science is concerned with automatically and efficiently testing whether a given sentence conforms to a given language.

Example: Sudoku

Let’s consider the popular game of Sudoku. A “language” for Sudoku might consist of:

  • Vocabulary: 1, 2, 3, 4, 5, 6, 7, 8, 9
  • Grammar: The set of all sentences that are exactly $9 \times 9 = 81$ characters long (representing the 81 Sudoku squares) and that conform to the Sudoku property.

It’s relatively easy to test whether a given sentence conforms to the Sudoku grammar; all we have to do is make sure that the Sudoku property holds (i.e., there are no repeats in the columns, rows, or boxes). This can be done relatively quickly by a human, and almost instantaneously by a computer. A computer scientist, however, might try to solve the following problem: “Given a partially complete sentence, either complete the sentence such that it conforms to the Sudoku grammar or prove that such a completion does not exist.” Such a problem is much harder to solve.
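
The “easy” direction can be sketched in a few lines of Python. This is only an illustration under my own grid representation (a list of 9 rows of 9 digits); a completed grid conforms to the Sudoku grammar exactly when the check below passes.

```python
def satisfies_sudoku_property(grid):
    """Return True iff the completed 9x9 `grid` (a list of 9 lists of 9
    digits, each 1-9) has no repeats in any row, column, or 3x3 box."""
    def no_repeats(cells):
        return len(set(cells)) == len(cells)

    rows = [list(row) for row in grid]
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [[grid[3 * br + dr][3 * bc + dc]
              for dr in range(3) for dc in range(3)]
             for br in range(3) for bc in range(3)]
    return all(no_repeats(group) for group in rows + cols + boxes)
```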

The majority of computer science consists of creating automated algorithms that can solve such problems. The question is: how expensive is the algorithm? Since testing whether a sentence is in the grammar is easy, one might just enumerate all of the possible Sudoku grid layouts and find the first one that is a part of the grammar. The problem is that, in the worst case, there will be on the order of $9^{81} =$ 196,627,050,475,552,913,618,075,908,526,912,116,283,103,450,944,214,766,927,315,415,537,966,391,196,809 (i.e., roughly $2 \times 10^{77}$) possibilities to test! Even if a computer could test a trillion sentences per second, such an algorithm would take about 6 octodecillion years to finish (that’s 6 with 57 zeros afterward)! What we really want are what are called polynomial time algorithms.
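
The arithmetic behind those estimates is easy to reproduce. Here is a sketch (the constants and function name are mine) that computes, for a Sudoku built from $n \times n$ sub-squares, the base-10 exponents of the candidate count and of the brute-force running time:

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ~3.15e7
TESTS_PER_SECOND = 10**12              # "a trillion sentences per second"

def brute_force_log10_cost(n):
    """For a Sudoku built from n-by-n sub-squares (n**4 cells, each holding
    one of n**2 symbols), return the base-10 exponents of the number of
    candidate grids and of the brute-force running time in years."""
    log10_candidates = (n ** 4) * math.log10(n * n)
    log10_years = log10_candidates - math.log10(TESTS_PER_SECOND * SECONDS_PER_YEAR)
    return log10_candidates, log10_years

print(brute_force_log10_cost(3))  # ~(77.3, 57.8): about 2e77 grids, 6e57 years
```

For what it’s worth, brute_force_log10_cost(4) gives an exponent of about 308, matching the $16^{256}$ figure that comes up in the next section.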

Polynomial time

Let’s say a given problem is of size $n$. For example, our Sudoku problem would be of size $n = 3$ because the grid is composed of nine 3-by-3 sub-squares. Now let’s say we devise an algorithm (i.e., solution method) that allows us to solve any given Sudoku problem in no more than 1000 steps. That’s pretty quick! Using this algorithm a computer could solve any $n=3$ Sudoku problem almost instantaneously. Now, let’s say we get bored with $n=3$ Sudokus (since they’re so easy to solve now) and we want to solve Sudoku problems on larger grids—say an $n=4$ Sudoku (i.e., a 16-by-16 grid composed of sixteen 4-by-4 sub-squares). As I mentioned before, the number of possibilities for $n=3$ was about $9^{81}$; for $n=4$ there are on the order of $16^{16\times 16} = 16^{256} \approx 10^{308}$ possibilities! That dwarfs even the number of atoms in the observable universe (roughly $10^{80}$)! The question is: Given that our algorithm is so efficient at solving problems where $n = 3$, how fast will it be able to solve problems when we just increase $n$ by 1?

If, as the problem size $n$ grows, the amount of time the algorithm needs grows no faster than some fixed power of $n$, then we say the algorithm runs in polynomial time. In other words, if we have a problem of size $n$ then a polynomial time algorithm for that problem will run in at most roughly $n^c$ steps, where $c$ is some constant.
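
To make the distinction concrete, here is a toy comparison (the particular formulas are arbitrary illustrations, not drawn from any specific problem):

```python
# Polynomial step counts grow gently with n; exponential ones explode.
for n in range(1, 7):
    print(f"n={n}: n^3 = {n**3:>4}, 2^(2^n) = {2 ** 2**n}")
```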

Why polynomial time matters

Moore’s law states that processor speed doubles every two years. (More accurately, Moore’s law speaks of the number of components per integrated circuit.) This means that if we have a polynomial time algorithm that is able to quickly solve a problem of size $n$ on current hardware, we will only have to wait about $2c\log_2\left(\frac{n+1}{n}\right)$ years until computer hardware becomes fast enough to solve a problem of size $n+1$: the larger problem takes $\left(\frac{n+1}{n}\right)^c$ times as many steps, and hardware speed multiplies by $2^{t/2}$ after $t$ years. For most values of $n$ and $c$, that’s a very short time!
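
A quick sanity check of that figure (the function name is mine; it just solves $2^{t/2} = \left(\frac{n+1}{n}\right)^c$ for $t$):

```python
import math

def years_to_wait(n, c, doubling_period_years=2.0):
    """Years until hardware catches up with the jump from size n to n+1,
    assuming an n**c-step algorithm and speed doubling every two years:
    2**(t/2) == ((n+1)/n)**c  =>  t == 2*c*log2((n+1)/n)."""
    return doubling_period_years * c * math.log2((n + 1) / n)

print(years_to_wait(100, 3))  # ~0.086 years: about a month
```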

Why Computer Science isn’t just a science

Depending on one’s discipline, Computer Science is really an amalgam of mathematics, science, and engineering. One way of thinking about the various computing disciplines is in terms of what types of problems they aim to solve. There is a whole field of “Computer Science” concerned with discovering new efficient (e.g., polynomial time) algorithms; this field is rooted in pure mathematics. One can devise an algorithm and formally prove its computational efficiency; no “experimentation” is required.

It turns out, however, that there are some classes of languages/problems that are so hard that it is probably impossible to devise a polynomial time algorithm to solve them! Sudoku (generalized to arbitrary $n$) is actually one such problem. There is therefore an entire discipline within Computer Science concerned with finding algorithms that give an approximate solution whose quality is not necessarily optimal but provably close to it. This field is also rooted in pure mathematics.

Then there are problems that are so complex that either no approximation can be found or, perhaps, one doesn’t even know the grammar for the problem’s language. For example, consider the game of poker. Since poker is both non-deterministic (i.e., there is chance involved) and unobservable (i.e., one doesn’t necessarily know what is going on in other players’ hands/heads), an optimal poker-playing algorithm/strategy will be a function of the other players’ strategies. How can one prove that one’s poker playing algorithm is not only efficient, but “optimal”? How does one know that it will always play well, regardless of the opponents’ strategies? Creating the algorithm may require building heuristics, which is engineering. Evaluating the algorithm requires experimentation: playing one’s algorithm against a number of opponents over a number of games and observing the outcome, which is science.

The latter field generally falls under the term “Artificial Intelligence” (AI), which I usually define as “the creation of systems capable of finding solutions to arbitrarily difficult problems.” The practitioners of AI generally fall into two camps: the neats and the scruffies. The neats like to design systems using formal logic and mathematics; since such methods can be proven correct, once a method is produced the work is done. The scruffies, on the other hand, like to engineer systems that tend to get a job done and then, retrospectively, examine the system to figure out why it worked.

I study AI and I generally fall into the “neat” camp.
