Building an artificial general intelligence begins by asking 'what is intelligence?'


What we typically refer to as intelligence is the set of processes associated with intellectual functions like playing games (e.g., chess), picking stocks to buy, composing symphonies or doing theoretical physics. These are processes that people must learn deliberately, and doing so requires a concentrated effort. Some of these are tasks that computers can perform well, but others are more of a challenge.

Seemingly basic tasks like walking across a crowded room, folding laundry, or recognizing emotion in faces are also challenging for computers. Take a child once to the zoo and buy him or her cotton candy, and they will then expect cotton candy every time they go to the zoo.

People often make seemingly irrational choices. When offered an early registration discount for a conference, only 67 percent of the graduate students took advantage of the offer. When told that there would be a penalty for late registration, 93 percent of the students registered early, even though the costs and the cost differences were identical in the two situations ($50 discount or $50 penalty). We can think about decisions like these as being somehow abnormal, but they are very common and, more importantly, demonstrate just how people use heuristics to achieve their intelligence.

When intelligence has been studied by psychologists, the focus has generally been on identifying individual differences. Intelligence testing started with Alfred Binet and Theodore Simon’s efforts to identify French school children who might require special help. Their focus was on those factors that would allow a child to do well in school. Their tests included evaluations of language skills, memory, reasoning and the ability to follow commands. They chose tasks like these because they believed that these were the constituents of intelligence.

Charles Spearman recognized that students’ performance on these tests was correlated across tasks. Students who tended to do well on one task also performed similarly on others. With that, he developed new mathematical analyses that allowed him to separate the performance on each task into components representing each specialized skill and a general component, called “g,” that represented the general capacity that he assumed caused the correlation: general intelligence.

It is not clear whether the general intelligence measured by Spearman’s analysis is anything more than a statistical curiosity. If it represents some kind of general cognitive capacity, then the nature of that capacity is elusive. For example, it could be due to some factor unrelated to intelligence, such as fear of testing (the idea that people who are less afraid of testing perform better than those who are on the verge of panic). Alternatively, it could be due to better memory or faster neural processing.

It is also not clear just how “general” so-called general intelligence is. Einstein, for example, is renowned for his ability in theoretical physics, but he was also known to not be very strong in mathematics, and, although he played the violin, he was far from an expert violinist. William Shockley won the Nobel prize for his work as one of the inventors of the transistor, but he was far from competent when, later in life, he promulgated racist pseudoscience.

If the definition of an artificial general intelligence includes that it can perform any intellectual task as well as any human, then we have a problem. By this definition, humans do not have general intelligence. So, the analogy is, at best, imperfect.

Some people are undoubtedly smarter than others. Those whom we identify as more intelligent have typically achieved a certain level of success in a particular field. Expertise requires a great amount of practice, often practice of a particular type.

Experts are much more successful than novices at solving problems in their specific domain of expertise. Experts represent problems differently from novices, are more abstract in their thinking, and surprisingly, perhaps, they are less dependent on reasoning. Novices depend more than experts on formal rules, while experts seem to rely more on internalized intuitions. Expert ‘go’ players, for example, claim that they choose moves based on their aesthetic appeal, rather than basing their choice on an exhaustive analysis of the game tree.

In one study on memory in chess, experts were shown a configuration of about 25 chess pieces on a board for about 5 to 10 seconds. If the placement of these chess pieces could be reached in an actual game, then experts were much better than novices at being able to recall and reproduce the configuration. If the chess pieces were randomly arranged, however, the experts were no better than novices at remembering the positions of each piece.

The experts’ better memory for sensible chess piece arrangements was consistent with the idea that they viewed the board as “chunks” of pieces, each of which constituted a pattern. Some of these patterns even had names, such as “fianchettoed bishops.” Novices did not have access to this learned pattern information.

Experts also differ from novices in how they solve physics problems. Experts are guided more by the physical principles in the problems (such as conservation of energy), while novices are guided more by the terms used in the problem’s description (does the statement refer to a spring?). Additionally, experts used deeper, more abstract representations of the problems than novices did. Similar patterns of expertise have been found in many other domains.

Achieving an expert level of performance seems to depend strongly on a certain kind of practice, called deliberate practice. Expertise does not come simply from doing things like playing chess, but rather requires a focus on the components of the task.

At the height of his career, Tiger Woods, for example, would perform a putting drill where he would put two golf tees in the green about four feet from the hole, separated by the length of his putter’s head. He would put a ball between the two tees and putt from there into the hole until he had sunk at least 100 balls in a row. Elite levels of performance seem to require about 10 years’ worth of this kind of deliberate practice.

Deliberate practice allows past experience to be transferred to new situations. For example, Abraham Luchins (1942) studied transfer between problems. In these problems, there are three jars, each of which holds a certain amount of water. The goal is to fill or empty the jars to end up with a specific amount of water in one of the jars.

In one problem of this type, the three jars hold 29, 3 and 21 liters respectively. The goal is to end up with one jar containing exactly 20 liters of water. Think about how you could accomplish this. None of the jars can measure out exactly 20 liters, but if you fill the 29-liter jar and pour water from it into the 3-liter jar, 26 liters remain in the larger jar. Empty the 3-liter jar and repeat that twice more, and you end up with exactly 20 liters of water in the 29-liter jar.
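The arithmetic of this example can be traced with a short sketch. The function name `pour_out` and its parameters are illustrative assumptions, not part of Luchins' procedure. It fills one jar and then repeatedly removes a smaller jar's worth of water from it, recording the level after each pour.

```python
def pour_out(source_capacity, scoop_capacity, times):
    """Fill the source jar, remove `times` full scoops, and return the level after each pour."""
    level = source_capacity  # start with the source jar full
    levels = []
    for _ in range(times):
        if level < scoop_capacity:
            raise ValueError("not enough water left to fill the smaller jar")
        level -= scoop_capacity  # fill the smaller jar from the source, then empty it
        levels.append(level)
    return levels


# The 29-liter jar, scooped three times with the 3-liter jar
levels = pour_out(29, 3, 3)
print(levels)  # [26, 23, 20]
```

The 21-liter jar is never needed here, which is part of what makes the example a puzzle rather than a simple measurement.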

After this example, Luchins gave participants a series of 10 similar problems:

Problem #	Capacity of Jar A	Capacity of Jar B	Capacity of Jar C	Goal quantity
1	21	127	3	100
2	14	163	25	99
3	18	43	10	5
4	9	42	6	21
5	20	59	4	31
6	23	49	3	20
7	15	39	3	18
8	18	48	4	22
9	14	36	8	6
10	28	76	3	25

As you might expect, people, being intelligent, get faster at solving these problems as they move through the list. The first nine problems can all be solved using the pattern B – 2C – A: fill Jar B, dump water from Jar B into Jar C twice, and then dump water from Jar B into Jar A once. Problems 6 and 9 can be solved more simply using the pattern A – C, but 83 percent of people used the original pattern, which they had learned on problems 1 – 5, on these problems as well.

Problems 7 and 8 can be solved more simply using the pattern A + C, but 79 percent of participants solved these problems using the original method. Perhaps most surprising, 64 percent of the participants given problems 1-9 did not solve problem 10 at all (A – C), but 95 percent of a different group of participants who got only problem 10 were able to solve it.
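These claims can be checked against the table above. The sketch below is only an arithmetic check, under the assumption that each pattern can be read as a formula over the jar capacities (B – 2C – A, A – C, A + C). The names `PROBLEMS`, `PATTERNS` and `working_patterns` are illustrative, not taken from Luchins.

```python
# (A, B, C, goal) for problems 1-10, copied from the table above
PROBLEMS = [
    (21, 127, 3, 100),
    (14, 163, 25, 99),
    (18, 43, 10, 5),
    (9, 42, 6, 21),
    (20, 59, 4, 31),
    (23, 49, 3, 20),
    (15, 39, 3, 18),
    (18, 48, 4, 22),
    (14, 36, 8, 6),
    (28, 76, 3, 25),
]

# Each pattern expressed as arithmetic on the jar capacities (an assumption of this sketch)
PATTERNS = {
    "B - 2C - A": lambda a, b, c: b - 2 * c - a,
    "A - C": lambda a, b, c: a - c,
    "A + C": lambda a, b, c: a + c,
}


def working_patterns(a, b, c, goal):
    """Return the names of the patterns whose result equals the goal quantity."""
    return [name for name, formula in PATTERNS.items() if formula(a, b, c) == goal]


for number, (a, b, c, goal) in enumerate(PROBLEMS, start=1):
    print(number, working_patterns(a, b, c, goal))
```

Run as written, the check shows the long pattern matching problems 1 through 9, a shorter pattern also matching problems 6 through 9, and only A – C matching problem 10.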

The point of Luchins’ water jar problem in the present context is that transfer from one problem to another does not always improve performance, but can actually hinder it. Because the same strategy works for problems 1 through 9, participants never had a reason to discover that a simpler solution was possible for the later problems. And, because they tried to solve problem 10 with the same method that they had been using, many of them failed on it. Transfer is not always a good thing.

All of this means that general intelligence is neither simple nor well understood. As Yogi Berra used to say, “if you don’t know where you’re going, you might not get there.”

Whatever the challenges of artificial general intelligence, the chances of us actually achieving it will be greatly improved if we have a better idea of just what we are trying to create. So far, that means better understanding human intelligence. We may not need human-like intelligence for solving specific problems, but it looks like it could be critical for developing artificial general intelligence.
