
01 – Introduction to Knowledge-Based AI

01 – Introductions

>> We have had a lot of fun putting this course together. We hope you enjoy it as well. We think of this course as an experiment as well. We want to understand how students learn in online classrooms. So if you have any feedback, please share it with us.

02 – Preview

So welcome to 7637 Knowledge Based AI. At the beginning of each lesson, we’ll briefly introduce a topic as shown in the graphics to the right. We’ll also talk about how the topic fits into the overall curriculum for the course. Today, we’ll be discussing AI in general, including some of the fundamental conundrums and characteristics of AI. We will describe four schools of AI, and discuss how knowledge based AI fits into the rest of AI. Next, we’ll visit the subtitle of the course, Cognitive Systems, and define an architecture for them. Finally, we’ll look at the topics that we’ll cover in this course in detail.

03 – Conundrums in AI

Let's start our discussion today with some of the biggest problems in AI. We obviously are not going to solve all of them today, but it's good to start with the big picture. AI has several conundrums; I'm going to describe five of the main ones today. Conundrum number one: all intelligent agents have limited computational resources, processing speed, memory size, and so on, but most interesting AI problems are computationally intractable. How then can we get AI agents to give us near real-time performance on many interesting problems? Conundrum number two: all computation is local, but most AI problems have global constraints. How then can we get AI agents to address global problems using only local computation? Conundrum number three: computational logic is fundamentally deductive, but many AI problems are abductive or inductive in their nature. How can we get AI agents to address abductive or inductive problems? If you do not understand some of these terms, like abduction, don't worry about it; we'll discuss them later in the class. Conundrum number four: the world is dynamic and knowledge is limited, but an AI agent must always begin with what it already knows. How then can an AI agent ever address a new problem? Conundrum number five: problem solving, reasoning, and learning are complex enough, but explanation and justification add to the complexity. How then can we get an AI agent to ever explain or justify its decisions?

05 – Characteristics of AI Agents

In addition to AI problems having several characteristics, AI agents too have several properties. Property number one: AI agents have only limited computing power, processing speed, memory size, and so on. Property number two: AI agents have limited sensors; they cannot perceive everything in the world. Property number three: AI agents have limited attention; they cannot focus on everything at the same time. Property number four: computational logic is fundamentally deductive. Property number five: the world is large, but AI agents' knowledge of the world is incomplete relative to the world. So, the question then becomes, how can AI agents with such bounded rationality address open-ended problems in the world?

06 – Exercise What are AI Problems

Now that we have talked about the characteristics of AI agents and AI problems, let us talk a little about what kind of problems you might build an AI agent for. On the right are several tasks. Which of these are AI problems? Or, to put it differently, for which of these problems would you build an AI agent?

07 – Exercise What are AI Problems

>> I agree. In fact, during this class we’ll design AI agents that can address each of these problems. For now, let us just focus on the first one: how to design an AI agent that can answer Jeopardy questions?

08 – Exercise AI in Practice Watson

>> What is event horizon?

09 – Exercise AI in Practice Watson

>> That’s right. And during this course, we’ll discuss each part of David’s answer.

10 – What is Knowledge-Based AI

Let us look at the processes that Watson may be using a little more closely. Clearly Watson is doing a large number of things. It is trying to understand natural language sentences. It is trying to generate some natural language sentences. It is making some decisions. I'll group all of these things broadly under reasoning. Reasoning is a fundamental process of knowledge-based AI. A second fundamental process of knowledge-based AI is learning. Presumably Watson is learning also: it perhaps gets a right answer to some question and stores that answer somewhere. If it gets a wrong answer, then once it learns the right answer, it stores that right answer somewhere as well. Learning too is a fundamental process of knowledge-based AI. A third fundamental process of knowledge-based AI is memory. If you're going to learn something, the knowledge that you're learning has to be stored somewhere, in memory. If you're going to reason using knowledge, then that knowledge has to be accessed from somewhere, from memory. The memory process thus stores what we learn, and provides access to the knowledge needed for reasoning. These three processes of learning, memory, and reasoning are intimately connected. We learn so that we can reason. The results of reasoning often result in additional learning. Once we learn, we can store it in memory. However, we need knowledge to learn: the more we know, the more we can learn. Reasoning requires knowledge that memory can provide access to, and the results of reasoning can also go into memory. So here are three processes that are closely related. A key aspect of this course on knowledge-based AI is that we will be talking about theories of knowledge-based AI that unify reasoning, learning, and memory, instead of discussing any one of the three separately, as sometimes happens in some schools of AI. We're going to try to build unified theories. These three processes put together, I will call deliberation. This deliberation process is one part of the overall architecture of a knowledge-based AI agent. This figure illustrates the overall architecture of an AI agent. Here we have input in the form of percepts of the world, and output in the form of actions in the world. The agent may have a large number of processes that map these percepts to actions. We are going to focus right now on deliberation, but the agent architecture also includes metacognition and reaction, which we'll discuss later.

12 – Exercise What is KBAI

>> What do you think? Do you agree with David?

13 – Exercise What is KBAI

>> So the autonomous vehicle may really belong to the acting rationally side of the spectrum. At the same time, looking at the way humans drive might help us design a robot. And looking at the robot design might help us reflect on human cognition. This is one of the patterns of knowledge-based AI.

14 – Exercise The Four Schools of AI

Let us do an exercise together. Once again, we have the four quadrants shown here, and at the top left are four artifacts. I'm sure you're familiar with all four of them; C-3PO is a fictitious artifact from Star Wars. Can we put these four artifacts in the quadrants to which they best belong?

15 – Exercise The Four Schools of AI

>> So if you'd like to discuss where these technologies belong on these spectrums, or perhaps discuss where some other AI technologies that you're familiar with belong on these spectrums, feel free to head on over to our forums, where you can bring up your own technologies and discuss the different ways in which they fit into the broader schools of AI.

16 – What are Cognitive Systems

I'm sure you have noticed that this class has a subtitle, cognitive systems. Let's talk about this term and break it down into its components. Cognitive, in this context, means dealing with human-like intelligence. The ultimate goal is to develop human-level, human-like intelligence. Systems, in this context, means having multiple interacting components, such as learning, reasoning, and memory. Cognitive systems, then, are systems that exhibit human-level, human-like intelligence through interaction among components like learning, reasoning, and memory. Thus, on a spectrum, what we'll discuss in this class will definitely lie on the right side of the spectrum, on the human side. We will be talking about thinking and acting, but we will always be concerned with human cognition.

17 – Cognitive System Architecture

So let us take a look at what a cognitive system is. Notice that I'm using the term cognitive system and not the term knowledge-based AI agent. I could have used that term also. When we talk about a knowledge-based AI agent, we could take two views. One view is that we are going to build a knowledge-based AI system, which need not be human-like. Another view is that the knowledge-based AI agent that we will build will be human-like. The cognitive system is situated in the world. Here, by the world, I mean the physical world. For example, the world that I am interacting with right now, with this screen in front of me and this microphone. This world contains percepts. As an example, a percept might be something being a straight line, or the color of some object, or the smoothness of the texture of some object. These percepts are out in the world, and the cognitive system uses sensors to perceive them. That's the input to the cognitive system. The cognitive system also has some actuators. So, for example, I have fingers that I'm using right now to point to things. A cognitive system uses actuators to carry out actions on the world. The cognitive system, then, takes percepts as input and gives actions as output. So far, we've talked about a single cognitive system. But of course one can have multiple cognitive systems, and these multiple cognitive systems can interact with each other. Just as a cognitive system is situated in a physical world, it is also situated in a social world. Let us now zoom into the inside of a cognitive system. What is the architecture of a cognitive system? The cognitive system takes as input certain percepts about the world. It has the task of giving as output actions on the world. The question then becomes, how can these percepts be mapped into actions? One way of mapping them is a direct mapping: the percepts are directly mapped into actions. Let's take an example. Imagine that you're driving a car, and the brake lights of the car in front of you become bright red. Should that happen, you will press on the brakes of your car. That is an example of a reactive system. The percepts were that the brake lights on the car in front of you became bright red, and the action was that you pressed on your own brakes. In doing so, you may not have planned at all. This is a direct mapping of percepts into actions. Alternatively, consider a slightly different problem. Again you're driving your car on the highway, but this time your task is to change lanes. Now, in order to change lanes, again you may look around and look at the percepts of the road. There are other cars on the road, for example, and you need to take some action that will help you change lanes. This time you may actually deliberate: you may look at the goal that you have as well as the percepts of the environment, and come up with a plan that will tell you what action to take. As we discussed in the last lesson, deliberation itself has a number of components in it. Three of the major components that we'll be studying in this class are learning, reasoning, and memory. These three components interact with each other in many interesting ways that we will decipher as we go along. Now, deliberation was reasoning about the world around us. So if I take that example again of changing lanes as I'm driving on the highway, then I'm reasoning about the world around me. Where are the other cars? Should I change lanes to the left or to the right? Metacognition, on the other hand, the third layer here, has to do with reasoning about the internal mental world. So metacognition reasons about the deliberation. Or metacognition can also reason about the reaction. Let us take an example of metacognition as well. Imagine again that I had to change lanes, and I did. As I changed lanes to the left, the cars behind me honked because I did not leave enough space for the car that was already moving in the left lane. In that case I know that the lane change did not go very smoothly. I may now think about my own actions in the world, about the deliberation that led to those actions, and I may then decide to change, or reconfigure, or repair the deliberation that led to that suboptimal plan for changing lanes. That is an example of metacognition. So now I have this three-layered architecture: reaction, deliberation, metacognition. Note that we have defined intelligence in a way: intelligence here is about mapping percepts in the world into actions in the world. Intelligence is about selecting the right kind of action given a particular state of the world. But there are many different ways in which we can map percepts into actions: purely reactively, deliberatively, or also entailing metacognition on the deliberation and the reaction. This, then, is the overall architecture of the cognitive system. This is called a three-layered architecture. We'll be returning to this architecture many times in this course.
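To make this three-layered architecture a bit more concrete, here is a minimal Python sketch. All class, method, and percept names are illustrative assumptions, not anything prescribed in the lecture; the only point is that percepts can be mapped to actions reactively, deliberatively (through reasoning, learning, and memory), or with a metacognitive layer that repairs the deliberation.

```python
class CognitiveSystem:
    """Minimal sketch of the three-layered architecture; names are illustrative."""

    def __init__(self):
        self.memory = {}  # knowledge the agent has stored so far

    # Reaction: direct mapping from percepts to an action (brake-light example)
    def react(self, percepts):
        if percepts.get("lead_car_brake_lights") == "bright_red":
            return "press_brakes"
        return None

    # Deliberation: reasoning + learning + memory produce a plan toward a goal
    def deliberate(self, percepts, goal):
        plan = self.reason(percepts, goal)   # e.g., plan a lane change
        self.learn(goal, plan)               # store the episode for later reuse
        return plan

    def reason(self, percepts, goal):
        # Placeholder: look up a previously stored plan, else use a default one
        return self.memory.get(goal, ["check_mirrors", "signal", "steer"])

    def learn(self, goal, plan):
        self.memory[goal] = plan

    # Metacognition: reason about the deliberation itself and repair it
    def metacognize(self, goal, feedback):
        if feedback == "other_cars_honked":
            # The stored plan was suboptimal; adjust the knowledge deliberation uses
            self.memory[goal] = ["check_mirrors", "wait_for_gap", "signal", "steer"]
```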

18 – Topics in KBAI

We have organized the materials in this course into eight major units; this chart illustrates those eight units. Starting from the top left, the first unit has to do with Fundamentals of representation and reasoning, then Planning, Common Sense Reasoning, Analogical Reasoning, Metacognition, which we just talked a little about, Design & Creativity, Visuospatial Reasoning, and Learning. Now let's look at each of these circles, one at a time. In the first part, dealing with the Fundamentals of this course, we'll be dealing with certain knowledge representations and reasoning strategies. Two of the major knowledge representations that we'll discuss in the first part of this course are called Semantic Networks and Production Systems. Three of the reasoning strategies are called Generate and Test, Means-Ends Analysis, and Problem Reduction. Note that the arrows here imply an ordering. There is an ordering in that, when we are discussing Production Systems, we might allude to things that we discussed in Semantic Networks. Similarly, when we discuss Means-Ends Analysis, we might allude to things that we discussed in Generate and Test. However, it is important to note also that these three methods are completely independent from each other; it's just that we are going to discuss them in the order shown here. Similarly, these two knowledge representations are independent from each other; it's simply that in this course we'll discuss them in this order. The second major unit in this course pertains to Planning. Planning is a kind of problem-solving activity whose goal is to come up with plans for achieving one or more goals. Before we discuss Planning, we'll discuss Logic as a knowledge representation. This knowledge representation will then enable us to discuss Planning in a systematic way. The third major unit in this course is Common Sense Reasoning. Common Sense Reasoning pertains to reasoning about everyday situations in the world. As an example, I may give you the input, "John gave the book to Mary." Note that the input does not specify who has the book at the end, but you can draw that inference easily. That is an example of Common Sense Reasoning. In our course, we'll discuss both knowledge representations like frames, as well as methods for doing Common Sense Reasoning. As we discussed earlier, when we were talking about the architecture of a cognitive system, Learning is a fundamental process within deliberation, and therefore we will be visiting the issue of Learning many, many times throughout this course. However, we also have a unit on Learning, which has several topics in it. There are other topics in Learning that do not show up in this particular circle here but are distributed throughout the course. Another major unit in our course is Analogical Reasoning. Analogical Reasoning is reasoning about novel problems or novel situations by analogy to what we know about familiar problems or familiar situations. As I mentioned earlier, Learning is distributed throughout this course; therefore Learning comes up here in Analogical Reasoning also. In fact, Learning by Recording Cases appears in the Learning topic as well as here, and you can see Explanation-Based Learning occurring here. Visuospatial Reasoning is another major unit in our course. Visuospatial Reasoning pertains to reasoning with visual knowledge. As an example, I might draw a diagram and reason with the diagram. That's an example of Visuospatial Reasoning.
In the context of Visuospatial Reasoning, we'll talk both about Constraint Propagation and about using it to do Visuospatial Reasoning. Design & Creativity is the next topic in our course. We want to build AI systems that can deal with novel situations and come up with creative solutions. Design is an example of a complex task which can be very, very creative, and we'll discuss a range of topics in the context of Design & Creativity. The next topic in our course is Metacognition. We have already come across the notion of Metacognition, when we were talking about the architecture of the cognitive system. Metacognition pertains to thinking about thinking. We'll discuss a range of topics there, and then we will end the course by talking about Ethics in Artificial Intelligence. This figure illustrates all eight major units once again, as well as the topics within each major unit. I hope this will give you a mental map of the organization of the course as a whole. In preparing this course, we came up with an ordering of the topics which interleaves many of these topics. So we will not do the entire first unit before we go to the entire second unit, and so on. Instead, we will do some parts of the first unit, then go to some other part that follows conceptually from it, and so there will be some interleaving among these topics. One aspect of the personalization is that you are welcome to go through these topics in your own chosen order. You don't have to stay with the order that we'll be using. This is an exciting agenda. I hope you are as excited as I am. There are very few opportunities where we can talk about exotic topics like Analogical Reasoning, and Creativity, and Metacognition, and in this particular course we'll talk about all of them together.

19 – Wrap Up

So at the end of every lesson, I will briefly recap what we talked about during that lesson and try to tie it into some future topics. Today, we started off by talking about the central conundrums and characteristics of AI. This may have connected with some of your previous experience with other AI classes, like machine learning or AI for robotics. We then talked about the four schools of AI, and we talked about knowledge-based AI more specifically: what is it, and where does it fit in with the other schools? Then we talked about cognitive systems and how cognitive systems are always concerned with human-like intelligence. Lastly, we talked about the overall structure of the course, which is broken up into eight large categories, like learning, planning, and analogical reasoning. Next time we'll talk a little more specifically about this class in particular: the goals, the outcomes, the learning strategies, and what projects you'll complete.

21 – Final Quiz

This brings us to the first quiz. After every lesson in this course we'll have a short quiz, in which we'll ask you to write down what you learned in this lesson in this blue box here. These quizzes have two goals. The first goal is to help you synthesize and organize what you have learned. The process of writing down what you learned may help you, in fact, learn it more deeply. The second goal is to provide us with feedback. Perhaps we could have been clearer or more precise about some of the concepts, or perhaps we left some misconceptions unaddressed. Note that these quizzes are completely optional.

22 – Final Quiz

Great. Thank you so much for your feedback.

02 – Introduction to CS7637

01 – Preview

In this lesson, we'll talk more specifically about what you should expect from CS7637. We'll start by talking about the learning goals, the learning outcomes, and the learning strategies that we'll use for this class. Then we'll discuss the class projects and assessments. That will lead us to talking about something called Computational Psychometrics, which is one of the motivating principles behind the projects in this class. Next we'll talk about the Raven's Progressive Matrices test of intelligence. I think you're going to find it fascinating. The Raven's Progressive Matrices test is the most commonly used test of human intelligence, and that test is the target of the projects in this class. Very ambitious; you're going to enjoy it. Finally, we'll discuss some commonly recurring principles in this class that you should be on the lookout for.

02 – Class Goals

There are four major learning goals for this class. First, you'll learn about the core methods of knowledge-based AI. These methods include schemes for structured knowledge representation, methods for memory organization, methods for reasoning, methods for learning, agent architectures, as well as methods for metareasoning. Metareasoning is reasoning about reasoning. Second, you'll learn about some of the common tasks addressed by knowledge-based AI, such as classification, understanding, planning, explanation, diagnosis, and design. Third, you will learn ways AI agents can use these methods to address these tasks. Fourth, you'll learn the relationship between knowledge-based AI and cognitive science: using theories of human cognition to inspire the design of human-level, human-like AI, and using AI techniques to generate testable hypotheses about human cognition.

03 – Class Outcomes

What are the learning outcomes of this course? At the conclusion of this class, you will be able to do three primary things. First, you'll be able to design, implement, evaluate, and describe knowledge-based AI agents. The design and description of knowledge-based AI agents is really the first learning goal; in order to be able to design an agent, you need knowledge of the methods of knowledge-based AI. Second, you will also be able to use these strategies to address practical problems. This learning outcome addresses the second learning goal, where you will be able to see the relationship between AI agents and real-world problems. Third, you'll also be able to use the design of knowledge-based AI agents to reflect on human cognition, and vice versa. This addresses the fourth learning goal.

04 – Class Assignments

During this course, you'll complete a variety of different kinds of assessments. These assessments play different roles. First, they help you learn by demonstrating and testing what you know. Second, they help you reflect on what you've learned. Third, they help us understand what material is being taught well and what is not being taught well. The main assessments are the projects: you'll complete a series of programming projects, designing AI agents that address a pretty complex task. We'll talk a little more about it in a few minutes. Second, written assignments: you'll complete a number of written assignments that tie the course material to the projects. Third, tests: there will be two tests in this class, using the content of this class to introduce a broad variety of problems. Fourth, exercises: throughout the lessons there will be a number of exercises to help you evaluate and manage your own learning. Fifth, interactions: we'll be looking at the interactions on the forum and other places to get a feel for how everyone is doing and how we can help improve learning.

06 – Introduction to Computational Psychometrics

Let us talk about Computational Psychometrics a little bit. Psychometrics itself is the study of human intelligence, of human aptitude, of human knowledge. Computational Psychometrics, for our purposes, is the design of computational agents that can take the same kind of tests that humans do when they are tested for intelligence or knowledge or aptitude. Imagine that you design an AI agent that can take an intelligence test. After designing it, you might want to analyze how well it does compared to humans on that test. You might also want to compare the errors it makes with the errors that humans make. If it does as well as humans do, and if its behavior and its errors are the same as those of humans, you might conjecture then that perhaps its reasoning mirrors that of humans. In this class, we are going to be designing AI agents that can take the Raven's Test of Intelligence. In the process, we will want to use these agents to reflect on how humans might be addressing the same intelligence tests.

07 – Ravens Progressive Matrices

The class projects will be based on the Raven's Progressive Matrices test of intelligence. This test was developed in the 1930s to examine general human intelligence. It consists of 60 multiple-choice visual analogy problems. The Raven's test is unique among intelligence tests in that all problems in the Raven's test are strictly visual; no words. It is the most widespread, the most commonly used, and the most reliable test of intelligence. The Raven's test consists of two kinds of problems: two-by-two matrix problems and three-by-three matrix problems. We'll also consider a special case of two-by-one matrix problems. They're not part of the original test, but they will provide a baseline for starting to construct AI agents. Your project will be to implement AI agents that can solve problems like those that appear in the Raven's Test of Intelligence. Let's look at a few sample problems right now.

08 – 2×1 Matrices I

Let us consider an example. We are shown initially three images, A, B and C. And you have to pick a candidate for the D image here on the top right. And it can be one of these six candidates that would go here in the D image. Given that A is to B, as C is to D, what would you pick among the six choices at the bottom to put into D?

09 – 2×1 Matrices I

>> Very good, that is in fact the correct answer for this problem. Now, of course, here's a situation where a human being, David, answered this problem. The big question for us would be, how do we write an AI agent that can solve this problem?

10 – 2×1 Matrices II

The previous problem was pretty simple. Let's try a slightly harder problem. Once again, we're given A, B, and C. Given that A is to B, what would we pick among 1, 2, 3, 4, 5, and 6 to put into D?

11 – 2×1 Matrices II

>> This, of course, raises another issue. How do we do it? How do you solve the problem? Why was it so easy for you? Why is it so hard for AI? You remember the question. When David was trying to solve this problem, he looked at the relationship between A and B and then mapped it to C and some image here. But one could have gone about it the other way. We could have picked any one of these images, put it in D, and asked whether it would be a good fit. So in one case, one can start from the problem and propose a solution. In the other case, one could take one of these candidate solutions at a time and see if it matches. Two different strategies.

13 – 2×1 Matrices III

What do you think is the correct answer, David? >> So on the left, we have the same two frames we had in the first problem. So first, I thought that the circle in the middle disappears, so the triangle should disappear. But none of these options match that. So then I went back and looked and said, the other way we can think about this is to say the circle on the outside disappeared but the circle on the inside grew. Since they're both circles, we can't really tell the difference between those, but once we know that the correct answer is not just the big square, we can say the only logical conclusion is that the square disappeared and the triangle grew. So the answer has to be three, the big triangle. >> That's the correct answer, David. But notice something interesting here. This is an example of generate and test. You initially generated an answer and then tested it against the available choices. The test failed, so you rejected that solution and generated another solution. For that one, the test succeeded, and you accepted it.

14 – 2×1 Matrices IV

I like this problem. This one is really interesting. Everyone, try to solve this one.

33 – Final Quiz

Great. Thank you so much for your feedback.

03 – Semantic Networks

01 – Preview

Okay, let's get started with knowledge-based AI. Today we'll talk about semantic networks. This is a kind of knowledge representation scheme. This is the first lesson in the fundamental topics part of the course. We'll start by talking about knowledge representations, then we'll focus on semantic networks. We'll illustrate how semantic networks can be used to address two-by-one matrix problems. You can think of this as a represent-and-reason modality: represent the knowledge, represent the problem, then use that knowledge to address the problem. As simple as that. At the end, we'll close this lesson by connecting this topic with human cognition and with modern research in AI.

04 – Exercise Constructing Semantic Nets I

>> Okay, very good. Here is C, and I've just chosen one of the six choices, 5, here. So we're going to try to build a semantic network for C and 5, just the way we built it for A and B. For C and 5, I have already shown all the objects. Now, your task is to come up with the labels for the links between these objects here, as well as labels for the links between the objects for 5.

05 – Exercise Constructing Semantic Nets I

>> Now David made an important point here. He said that the vocabulary he's using here, of inside and above, is the same as the vocabulary that I had used, of inside and above, here. And that's a good point, because we want to have a consistent vocabulary throughout the representation for this class of problems. So here we have decided that for representing problems of this kind in semantic networks, we will use a vocabulary of inside and above, and we will try to use it consistently.

06 – Exercise Constructing Semantic Nets II

Let's go one step further. Now we have the semantic network for C, and the semantic network for 5. But we have yet to capture the knowledge of the transformation from C to 5. So we have to label these three links.

08 – Structure of Semantic Networks

Now that we have seen some examples of semantic networks, let us try to characterize semantic networks as a knowledge representation. A knowledge representation will have a lexicon, which tells us something about the vocabulary of the representation language; a structure, which tells us how the words of that vocabulary can be composed together into complex representations; and semantics, which tells us how the representation allows us to draw inferences so that we can in fact reason. In the case of semantic networks, the basic lexicon consists of nodes that capture objects: so, x, y, z. What about the structural specification? The structural specification here consists of links, which have directions. These links capture relationships and allow us to compose these nodes together into complex representations. What about the semantics? For the semantics, we are going to put labels on these links, which will then allow us to draw inferences and do reasoning over these representations.
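As a purely illustrative sketch, the lexicon, structure, and semantics of a small semantic network could be stored in Python as nodes plus labeled, directed links. The node names x, y, z come from the lecture; the list-of-triples encoding and the sample inference at the end are assumptions.

```python
# Lexicon: nodes for the objects in one image (e.g., x, y, z)
nodes_A = {"x", "y", "z"}

# Structure + semantics: directed, labeled links between nodes.
# Each triple is (from_node, relation_label, to_node).
links_A = [
    ("y", "inside", "x"),
    ("z", "above",  "x"),
]

# A trivial inference over the representation: which objects are inside x?
inside_x = [src for (src, rel, dst) in links_A if rel == "inside" and dst == "x"]
print(inside_x)   # ['y']
```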

09 – Characteristics of Good Representations

Now that we have seen semantic networks in action, we can ask ourselves an important question: what makes a knowledge representation a good representation? Well, a good knowledge representation makes relationships explicit. So in the two-by-one matrix problem, there were all these objects, circles and triangles and dots, and there were these relationships between them, left-of and inside, and the semantic network made all of them explicit. It exposed the natural constraints of the problem. A good representation works at the right level of abstraction, so that it captures everything that needs to be captured and yet removes all the details that are not needed. So a good representation is transparent; concise, capturing only what is needed; but complete, capturing everything that is needed. It is fast, because it doesn't carry all the details that are not needed. And it is computable: it allows you to draw the inferences that need to be drawn in order to address the problem at hand.

10 – Discussion Good Representations

>> What do you think? Do you think David, is right?

12 – Guards and Prisoners

Let us now look at a different problem, not a two-by-one matrix problem but a problem called the guards and prisoners problem. Actually this problem goes by many names: the cannibals and missionaries problem, the jealous husbands problem, and so on. It was first seen in a math textbook in about 880 and has been used by many people in AI for discussing problem representation. Imagine that there are three guards and three prisoners on one bank of a river, and they must all cross to the other bank. There is one boat, just one boat, and it can only take one or two people at a time, not more, and the boat cannot travel empty. On either bank, prisoners can never outnumber the guards; if they do, they will overpower the guards. So the number of guards must be at least equal to the number of prisoners on each bank. We'll assume these are good prisoners: they won't run away if they're left alone, although they might beat up the guards if they outnumber them. That's the beauty of this class. We lead with real problems, practical problems. We also make up problems to help illustrate specific things. I think you're going to have fun with this one.

13 – Semantic Networks for Guards Prisoners

Let us try to construct a semantic network representation for this guards and prisoners problem, and see how we can use it to do the problem solving. In this representation, I'm going to say that each node is a state in the problem solving. In this particular state, there happens to be one guard and one prisoner on the left side. The boat is on the right side, and two of the prisoners and two of the guards are also on the right side. So this is a node, one single node. The nodes capture the lexicon of the semantic network. Now we'll add the structural part. The structural part has to do with the transformations that connect different nodes into a more complex sentence. We'll label the links between the nodes, and these labels will then capture some of the semantics of this representation, which will allow us to make interesting inferences when it comes time to do the problem solving. Here is a second node, and this node represents a different state in the problem solving. In this case, there are two guards and two prisoners on the left side. The boat is also on the left side. There is one guard and one prisoner on the right side. So this now is a semantic network: a node, another node, a link between them, and the link is labeled. Note that in this representation, I used icons to represent objects, as well as icons to represent labels of the links between the nodes. This is perfectly valid. You don't have to use words. You can use icons, as long as you're capturing the nodes and the objects inside each state, as well as the labels on the links between the different nodes.
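One simple way to encode each node (state) of this network in code is as a tuple counting the guards, prisoners, and boat position on the left bank, with everything else implicitly on the right. This encoding is an assumption chosen for illustration, not the only possible one.

```python
from collections import namedtuple

# guards_left, prisoners_left: 0..3; boat_left: True if the boat is on the left bank
State = namedtuple("State", ["guards_left", "prisoners_left", "boat_left"])

initial = State(guards_left=3, prisoners_left=3, boat_left=True)   # everyone on the left
goal    = State(guards_left=0, prisoners_left=0, boat_left=False)  # everyone on the right
```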

14 – Solving the Guards and Prisoners Problem

There's an old saying in AI that goes, if you have the right knowledge representation, problem solving becomes very easy. Let's see whether that also works here. We now have a knowledge representation for this problem of guards and prisoners. Does this knowledge representation immediately afford effective problem solving? So, here we are in the first node, the first state: there are three guards, three prisoners, and the boat, all on the left-hand side. Let us see what moves are possible from this initial state. Using this representation, we can quickly figure out that there are five possible moves from the initial state. In the first move, we move only one guard to the right. In the second move, we move a guard and a prisoner to the right. In the third and fourth moves, we can move two guards or two prisoners. Or, in the fifth move, just one prisoner to the right. Five possible moves. Of course, we know that some of these moves are illegal and some of them are likely to be not very productive. Will the semantic network allow us to make inferences about which moves are productive and which moves are not? Let's see further. So, let's look at the legal moves first. We can immediately make out from this representation that the first move is not legal, because we are not allowed to have more prisoners than guards on one side of the river. Similarly, we know that the third move is illegal for the same reason. So we can immediately rule out the first and the third moves. The fifth move, too, can be ruled out. Let's see how. We have one prisoner on the other side, but the only way to go back would be to take that prisoner back to the previous side. And if we do that, we reach the initial state, so we did not make any forward progress. Therefore, we can rule out this move as well. This leaves us with two possible moves that are both legal and productive; we have already removed the moves that were not legal or not productive. Later, we will see how AI programs can use various methods to figure out which moves are productive and which are unproductive. For the time being, let's go along with our problem solving.
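Continuing the sketch above, the constraint that prisoners may never outnumber guards on either bank becomes a small test, and the five candidate moves become the possible boat loads of one or two people. Again, this is a minimal sketch that assumes the State tuple defined earlier.

```python
def is_legal(s):
    """Guards must not be outnumbered on either bank (0 guards on a bank is fine)."""
    guards_right, prisoners_right = 3 - s.guards_left, 3 - s.prisoners_left
    left_ok  = s.guards_left == 0 or s.guards_left >= s.prisoners_left
    right_ok = guards_right == 0 or guards_right >= prisoners_right
    return left_ok and right_ok

# The five possible boat loads: (guards, prisoners), one or two people per trip
BOAT_LOADS = [(1, 0), (1, 1), (2, 0), (0, 2), (0, 1)]
```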

15 – Exercise Guards and Prisoners I

>> Write the number of guards on the left coast in the top left box, just as a number zero, one, two, or three. The number of prisoners on the left coast in the bottom left box, the number of guards on the right coast in the top right box, and the number of prisoners on the right coast in the bottom right box.

17 – Exercise Guards and Prisoners II

Let us take this problem solving a little bit further. Now that we’re in this state, let us write down all the legal moves that can follow. It will turn out that some of these legal moves will be unproductive, but first, let’s just write down the legal moves that can follow from here.

18 – Exercise Guards and Prisoners II

>> In fact, David, most of us have the same difficulty. So the power of this semantic network as a representation is arising because it allows us to systematically solve this problem because it makes all the constraints, all the objects, all the relationships, all the moves very explicit.

20 – Exercise Guards and Prisoners III

>> So we’ve not yet talked about, how an AI method, can determine which states are productive and which states are unproductive. We’ll revisit this issue in the next couple lessons.

21 – Represent Reason for Analogy Problems

Now that we have seen how the semantic network knowledge representation enables problem solving, let us return to the earlier problem that we were talking about: the problem of A is to B as C is to 5. Recall that we have worked out the representations for both A is to B and C is to 5. The question now becomes whether we can use this representation to decide whether or not 5 is the correct answer. If we look at the two representations in detail, then we see that part of the representation here is the same as the representation here, except that this part is different from this part. Here we have y expanded, and right here it remains unchanged. So this may not be the best answer. Perhaps there is a better answer, where the representation on the left will exactly match the representation on the right.

24 – Exercise How do we choose a match

Let us do another exercise. This is actually an exercise we've come across earlier; however, this exercise has an interesting property. Often the world presents input to us for which there is no one single right answer. Instead, there are multiple answers that could be right. The world is ambiguous. So, here we again have A is to B as C is to D, and we have six choices. What choice do you think is the right choice here?

25 – Exercise How do we choose a match

>> That’s a great question. Let’s look at this in more detail.

27 – Discussion Choosing a Match by Weight

>> What does everyone think about David’s answer? Did David give the right answer with two?

31 – Wrap Up

So let's recap what we've talked about today. We started off today by talking about one of the most important concepts in all of knowledge-based AI: knowledge representations. As we go forward in this class, we'll see that knowledge representations are really at the heart of nearly everything we'll do. We then talked about semantic networks, which are one particular kind of knowledge representation, and we used those to talk about the different criteria for a good knowledge representation. What do good knowledge representations enable us to do, and what do they help us avoid? We then talked about an abstract class of problem-solving methods called represent and reason. Represent and reason really lies under all of knowledge-based AI; it's a way of representing knowledge and then reasoning over it. We then talked a little bit about augmenting that with weights, which allows us to come to more nuanced and specific conclusions. In the next couple of weeks, we are going to use these semantic networks to talk about a few different problem-solving methods. Next time, we'll talk about generate and test, and then we'll move on to a couple of slightly different ones called means-ends analysis and problem reduction.

32 – The Cognitive Connection

How are semantic networks connected with human cognition? Well, we can make at least two connections immediately. First, semantic networks are a kind of knowledge representation. We saw how knowledge is represented as a semantic network; given the right representation, we can use that knowledge representation to address the problem. We can now say, similarly, that the human mind represents problems, it represents knowledge, and then it uses that knowledge to address the problem. So representation becomes the key. Second, and more specifically, semantic networks are related to spreading activation networks, which is a very popular theory of human memory. Let me give you an example. Suppose I told you a story consisting of just two sentences: John wanted to become rich. He got a gun. Notice that I did not tell you the entire story, but I'm sure you all made up a story based on what I told you: John wanted to become rich. He decided to rob a bank. He got a gun in order to rob the bank. But how did you complete this story? How did you draw the inferences about robbing a bank, which I did not tell you anything about? Imagine that you have a semantic network that consists of a large number of nodes. When I gave you the first sentence, John wanted to become rich, the nodes corresponding to John and wanted and become and rich got activated, and the activation started spreading from those nodes. And when I said he got a gun, then the gun node also got activated, and that activation also started spreading. As these activations spread, they merged, forming a pathway. All the nodes on that pathway now become part of the story, and if you happen to have nodes like "rob a bank" along the pathway, now you have an understanding of the story.
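For readers who want a feel for the idea, here is a tiny, illustrative sketch of spreading activation over a toy semantic network. The graph, the node names, the decay factor, and the two-step spread are all assumptions; real spreading-activation models of human memory are considerably richer.

```python
# Adjacency list for a toy semantic network (all names are made up for illustration)
network = {
    "John": ["person", "wanted"],
    "wanted": ["rich"],
    "rich": ["money"],
    "money": ["bank"],
    "gun": ["weapon", "rob"],
    "rob": ["bank"],
    "bank": ["rob", "money"],
}

def spread(sources, steps=2, decay=0.5):
    """Activate the source nodes, then spread a decaying activation outward."""
    activation = {node: 1.0 for node in sources}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, value in frontier.items():
            for neighbor in network.get(node, []):
                gained = value * decay
                if gained > activation.get(neighbor, 0.0):
                    next_frontier[neighbor] = gained
                    activation[neighbor] = gained
        frontier = next_frontier
    return activation

# "John wanted to become rich" and "he got a gun" activate paths that meet at "rob"/"bank"
print(spread(["John", "wanted", "rich", "gun"]))
```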

34 – Final Quiz

Great. Thank you so much for your feedback.

04 – Generate & Test

02 – Guards and Prisoners

Knowledge-based AI is a collection of three things: knowledge representations, problem-solving techniques, and architectures. We have already looked at one knowledge representation, semantic networks. We have not so far looked at problem-solving methods or architectures. Today, I'd like to start by talking about a problem-solving method. Let us illustrate the problem-solving method of generate and test with the same example that we discussed earlier. When we were discussing this example in the case of semantic networks, we simply came up with various states and pruned some of them, without saying how an AI agent would know which states to prune. So imagine that we have a generator that takes the initial state and, from that initial or current state, generates all the possible successor states. For now, imagine it's not a very smart generator; it's a dumb generator. So it generates all the possible states. The generate and test method not only has a generator but also has a tester. The tester looks at all the possible states the generator has generated and removes some of them. For now, let's also assume that the tester is dumb as well. So the tester removes only those states that are clearly illegal based on the specification of the problem, namely, that one cannot have more prisoners than guards on either bank. So the first and the third states are removed by the tester.
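A minimal sketch of this dumb-generator, dumb-tester version of generate and test is below, reusing the State, initial, is_legal, and BOAT_LOADS sketches from the semantic networks lesson; everything here is illustrative rather than the lecture's own code.

```python
def dumb_generate(s):
    """Generate every state reachable by one boat trip, with no intelligence at all."""
    direction = -1 if s.boat_left else 1          # boat carries people away from its bank
    successors = []
    for g, p in BOAT_LOADS:
        candidate = State(s.guards_left + direction * g,
                          s.prisoners_left + direction * p,
                          not s.boat_left)
        if 0 <= candidate.guards_left <= 3 and 0 <= candidate.prisoners_left <= 3:
            successors.append(candidate)
    return successors

def dumb_test(states):
    """Keep only states that satisfy the problem specification; nothing smarter."""
    return [s for s in states if is_legal(s)]

# From the initial state, five states are generated and two are removed as illegal
print(dumb_test(dumb_generate(initial)))
```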

03 – Exercise Generate and Test I

Let us continue with this exercise one step further. So now we have three successor states to the initial state. Given these three successor states, what states might the dumb generator generate next?

04 – Exercise Generate and Test I

>> So from the top state we have three possible next states: we can move both of them, we can move just the prisoner, or we can move just the guard. From this one we can either move one prisoner or two prisoners, and from this one all we can really do is move the prisoner back over to the left. Remember that David is not generating these successor states himself; David is saying that the dumb generator will generate these successor states.

05 – Exercise Generate and Test II

So now that we have all of these states that the generator has generated, given that we have a dumb tester, what states will the dumb tester dismiss?

06 – Exercise Generate and Test II

>> So the only one of these six states that disobeys our one rule against having more prisoners than guards on either shore is this state over here. So that's the only state that's going to get thrown out. These five states are all legal according to our dumb tester's understanding of the problem. After we dismiss that state, though, we'll notice that we only have two unique states: everyone on the left coast, and one prisoner on the right coast. So like we did earlier, we can collapse these down into only these two states. It won't matter how we got there, once we're there.

07 – Dumb Generators, Dumb Testers

Now we can continue to apply this method of generate and test iteratively. So we can apply it on this state and that state and see what successor states we get. If we do so, then we get a very large number of successor states. This is a problem called combinatorial explosion: we started with a small number of states, but the number of successor states keeps increasing very rapidly. Now, the reason it is occurring here, and did not occur when we were dealing with semantic networks, is because here we have states like this one, which has three guards and three prisoners on the same bank, exactly the same state that was the initial state to begin with. This is because we have a dumb generator and a dumb tester, so this state never got pruned away, although it is identical to the initial state that we started from. This method of generate and test, even with a dumb generator and a dumb tester, if applied iteratively, could finally lead to the goal state. In that case, we will have a path from the initial state all the way to the goal state, but it will be computationally very inefficient. This is because we have a dumb generator and a dumb tester. So the question now becomes, can we make a smarter generator and a smarter tester? Before we do, we should note that generate and test is a very powerful problem-solving method.

08 – Smart Testers

So suppose that we have a smarter tester, a tester which can detect when a state is identical to a previously visited state. In that case the tester may decide that this, this, and this state are identical to the initial state and therefore dismiss them. The tester also dismisses this state, as usual, because of the problem specification that one cannot have more prisoners than guards on any one bank. This leaves the following state of affairs. Note also that this particular state has no successor states; all successor states of this one have been ruled out. Therefore this particular path clearly is not a good path to get to the goal state. If we notice also that these two states are identical, then we can merge them. If we do so, then we get exactly the same kind of configuration of states that we had when we were dealing with the semantic network in the previous lesson. There is something to note here. We had this semantic network in the last lesson, but the knowledge representation of semantic networks, while very useful, in and of itself doesn't solve any problems. You need a problem-solving method that uses the knowledge afforded by the knowledge representation to actually do the problem solving. Generate and test is one of those problem-solving methods. In general, when we do problem solving or reasoning, there is a coupling between a knowledge representation and a problem-solving method, like semantic networks and generate and test. What we did so far had a dumb generator, but we made the tester smarter: the tester started looking for which states had been repeated. Alternatively, we can shift the balance of responsibility between them and make the generator smarter. Let's see how that might happen.
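A smarter tester only needs a memory of previously visited states. The sketch below assumes the dumb_generate and is_legal helpers from the earlier sketch and simply adds a visited set; it is an illustration, not the lecture's own code.

```python
def smart_test(states, visited):
    """Dismiss states that are illegal OR identical to a previously visited state."""
    return [s for s in states if is_legal(s) and s not in visited]

# One level of expansion with the smarter tester
visited = {initial}
frontier = smart_test(dumb_generate(initial), visited)
visited.update(frontier)
print(frontier)
```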

09 – Smart Generators

Instead of the generator generating all the successor states and then the tester finding out that this state, this state, and this state are identical to the initial state, one could make the generator itself smarter, so that it will not even generate these three states, because it knows that it should not generate states that have already appeared. This means that we can provide the generator with some additional abilities, or the tester with some additional abilities, or both. If the generator were smarter, it would not even generate these three states, because they are nonproductive. We would still keep the tester, which determines that this state is illegal and therefore dismisses it. We could even go one step further and make the generator smarter still, so that it would not generate this particular state either. Thus, the balance between the generator and the tester can shift, depending on where we try to put the knowledge. For this relatively simple and small problem, the balance of responsibility between the generator and the tester might look like a trivial matter. But imagine a problem in which there are a million such states. Then whether the generator is very smart, or the tester is very smart, or both, can become an important issue. Despite these issues, generate and test is a very popular method used in some schools of AI. Genetic algorithms, for instance, can be viewed as generate and test: given a number of states, they try to find all the potential successor states that are possible, given some simple rules of recombination, and then a fitness function acts as the tester. Genetic algorithms, therefore, are an effective method for a very large number of problems. They are also a very inefficient method, because neither the generator nor the tester in genetic algorithms is especially smart.

10 – Discussion Smart Generators and Testers

>> What does everyone else think? Is David right about this?

11 – Discussion Smart Generators and Testers

>> That sounds like a good answer to me. So once again, we are back to the issue of where we draw the balance of responsibility between the generator and the tester. The important thing to note here, however, is that generate and test, when endowed with the right kind of knowledge, can be a powerful method.

12 – Generate Test for Ravens Problems

Let us return to our problem from the intelligence test to see how generate and test might apply as a problem-solving method. Again, here is a problem that we encountered earlier. Notice that this is a more complicated problem than the guards and prisoners problem. Here is why. In the case of the guards and prisoners problem, each transformation from one state to another was a discrete transformation: one could take a certain number of guards to the other side, one could take a certain number of prisoners to the other side, or one could take a certain number of guards and prisoners to the other side. In this case, if I look at the transformation between A and B, I notice that the diamond inside the circle is now outside the circle and is larger. Now suppose I were to try the same transformation from C to D. I can look at the circle inside the triangle, put it outside, and also make it larger. I notice that when I put it outside, I can put it right next to the triangle, a little bit farther, a little bit farther, a little bit farther away. I can make it the same size, or a little larger, or a lot larger; increase its size by 50% or 51% or 52%. So the space of possibilities here is very large. For problems of this kind, the need for a smarter generator and a smarter tester is critical, because this space of possibilities can become very large, very quickly.

19 – Final Quiz

Great. Thank you so much for your feedback.

05 – Means-Ends Analysis & Problem Reduction

02 – Exercise The Block Problem

To understand the method of means-ends analysis, let us look at this blocks world problem. This is a very famous problem in AI; it has occurred again and again, and almost every textbook in AI has this problem. You're given a table on which there are three blocks: A is on the table, B is on the table, and C is on A. This is the initial state. And you want to move these blocks to the goal state, in this configuration, so that C is on the table, B is on C, and A is on B. The problem looks very simple, doesn't it? Let's introduce a couple of constraints. You may move only one block at a time, so you can't pick up both A and B together. And second, you may only move a block that has nothing on top of it. So, you cannot move block A in this configuration, because it has C on top of it. Let us also suppose that we're given some operators in this world. These operators essentially move some object to some location. For example, we could move C to the table, or C onto B, or C onto A. Not all the operators may be applicable in the current state; C is already on A. But in principle, all of these operators are available. Given these operators, this initial state, and this goal state, write a sequence of operations that will move the blocks from the initial state to the goal state.
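As an illustrative sketch (not the lecture's own code), the blocks world state can be stored as a mapping from each block to whatever it rests on, with a single move operator that enforces the two constraints. One valid three-move solution is shown at the bottom.

```python
# State: block -> the thing it rests on ("Table" or another block). Names are illustrative.
initial_state = {"A": "Table", "B": "Table", "C": "A"}
goal_state    = {"C": "Table", "B": "C", "A": "B"}

def is_clear(state, block):
    """A block is clear if nothing rests on it."""
    return all(support != block for support in state.values())

def move(state, block, destination):
    """Move a clear block onto the table or onto another clear block."""
    assert is_clear(state, block), f"{block} has something on top of it"
    assert destination == "Table" or is_clear(state, destination)
    new_state = dict(state)
    new_state[block] = destination
    return new_state

# One valid sequence: C to the table, B onto C, A onto B
s = move(initial_state, "C", "Table")
s = move(s, "B", "C")
s = move(s, "A", "B")
assert s == goal_state
```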

03 – Exercise The Block Problem

>> That's a good answer, David; that's a correct answer. Now the question becomes, how can we make an AI agent that will come up with a similar sequence of operations? In particular, how does the method of means-ends analysis work on this problem and come up with a particular sequence of operations?

07 – Exercise Block Problem I

To understand more deeply the properties of means-ends analysis, let us look at another, slightly more complicated example. In this example, there are four blocks instead of the three in the previous example: A, B, C, D. In the initial state, the blocks are arranged as shown here. The goal state is shown here on the right; the four blocks are arranged in a particular order. Now if you compare the configuration of blocks on the left with the configuration of blocks in the goal state on the right, you can see there are three differences. First, A is on the table here, whereas A is on B there. B is on C; that's not a difference. C is on the table here, whereas C is on D there. And D is on B here, whereas D is on the table there. So there are three differences. This is a heuristic measure of the difference between the initial state and the goal state. Once again, we'll assume that the AI agent can move only one block at a time. Given the specification of the problem, what states are possible from the initial state? Please write down your answers in these boxes.
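The heuristic the lecture uses, a count of how many blocks differ from the goal configuration, can be sketched as follows, assuming the same block-to-support mapping as in the earlier sketch. The four-block configurations are read off the description above, so treat them as illustrative.

```python
def difference(state, goal):
    """Number of blocks whose support differs from the goal configuration."""
    return sum(1 for block in goal if state.get(block) != goal[block])

# Four-block example from this exercise (an assumption based on the description above)
goal4    = {"A": "B", "B": "C", "C": "D", "D": "Table"}
current4 = {"A": "Table", "B": "C", "C": "Table", "D": "B"}
print(difference(current4, goal4))   # 3, matching the three differences noted above
```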

08 – Exercise Block Problem I

>> That’s good David.

10 – Exercise Block Problem II

>> Good, David. So in each state David is comparing the state with the goal state and finding differences between them.

11 – Exercise Block Problem III

Given these three choices, which operation would means-ends analysis choose?

12 – Exercise Block Problem III

>> That's correct, David.

13 – Exercise Block Problem IV

Given this current state, we can apply means-ends analysis repeatedly. Now, if we apply means-ends analysis to this particular state, the number of choices here is very large, so I will not go through all of them here. But I'd like you to write down the number of possible next states, as well as how many of those states reduce the difference to the goal, which is given here.

14 – Exercise Block Problem IV

>> That’s good, David.

18 – Problem Reduction

Let us now turn to the third problem-solving method under this topic, called problem reduction. The method of problem reduction actually is quite intuitive; I'm sure you use it all the time. Given a hard, complex problem, reduce it: decompose it into multiple easier, smaller, simpler problems. Consider, for example, computer programming or software design, which I'm sure many of you do all the time. Given a hard problem to address, you decompose it into a series of smaller problems: how do I read the input? How do I process it? How do I write the output? That itself is a decomposition. In fact, one of the fundamental roles that knowledge plays is that it tells you how to decompose a hard problem into simpler problems. Then, once you have solutions to these simpler, smaller problems, you can think about how to compose the sub-solutions to the sub-problems into a solution for the problem as a whole. That's how problem reduction works.
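The shape of the method can be sketched in a few lines of Python; decompose, solve_directly, compose, and is_primitive stand in for the domain knowledge the lesson mentions, and are assumptions of this sketch rather than anything specified in the lecture.

```python
# A schematic sketch of problem reduction: decompose a hard problem into
# subproblems, solve the easy ones directly, and compose the sub-solutions.

def reduce_and_solve(problem, decompose, solve_directly, compose, is_primitive):
    if is_primitive(problem):
        return solve_directly(problem)
    subproblems = decompose(problem)              # knowledge-guided split
    subsolutions = [reduce_and_solve(p, decompose, solve_directly,
                                     compose, is_primitive)
                    for p in subproblems]
    return compose(problem, subsolutions)         # knowledge-guided merge
```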

20 – Exercise Problem Reduction I

So given this as the current state, what successor states are possible if we were to apply means-ends analysis? Please fill in these boxes.

21 – Exercise Problem Reduction I

>> That looks right, David.

22 – Exercise Problem Reduction II

Let us now calculate the difference from each of the states to the goal state.

23 – Exercise Problem Reduction II

>> So note that both the state at the top and this state at the bottom have an equal amount of difference compared to the goal state. We could've chosen either state to go further. For now, we're going to go with the one at the bottom. The reason, of course, is that if I put A on D, that will get in the way of solving the rest of the problem. For now, let us go with this state. Later on we will see how an AI agent can decide that this is not a good path to take and that this is the better path to take.

24 – Exercise Problem Reduction III

So if we make the move that we had at the end of the last shot, we’ll get this state. So now we need to go from this state to the goal state. Please write down what is the sequence of operators which might take us from the current state to the goal state.

30 – Wrap Up

So let's wrap up what we've talked about today. We started off today by talking about state spaces, and we used this to frame our discussion of means-ends analysis. Means-ends analysis is a very general-purpose problem-solving method that allows us to look at our goal and try to continually move towards it. We then used means-ends analysis to try and address a couple of different kinds of problems. But when we did so, we hit an obstacle. To overcome that obstacle, we used problem reduction. We can use problem reduction in a lot of other problem-solving contexts, but here we used it specifically to overcome the obstacle we hit during means-ends analysis. Problem reduction occurs when we take a big, hard problem and reduce it into smaller, easier problems. By solving the smaller, easier problems, we solve the big, hard problem. Next time we're going to talk about production systems, which are the last part of the fundamental areas of our course. But if you're particularly interested in what we've talked about today, you may wish to jump forward to logic and planning. Those build specifically on the types of problems we talked about today. And in fact, in planning, we'll see a more robust way of overcoming the kinds of obstacles that we hit during our exercise with means-ends analysis earlier in this lesson.

32 – Final Quiz

We’re at the end of this lesson. Please summarize what you learned in this lesson, inside this box.

33 – Final Quiz

And thank you for doing it.

06 – Production Systems

01 – Preview

Today, we'll talk about production systems. I think you're going to enjoy this, because part of production systems has to do with learning, and this is the first time in the course we'll be talking about learning. Production systems are a kind of cognitive architecture in which knowledge is represented in the form of rules. This is the last topic under the fundamental topics part of the course. We'll start by talking about cognitive architectures in general, then focus on production systems, then come to learning, and a particular mechanism of learning called chunking.

02 – Exercise A Pitcher

To illustrate production systems, let us imagine that you are a baseball pitcher. This illustration comes from a game between the Atlanta Braves and the Arizona Diamondbacks. Here is a pitcher on the mound, and the pitcher has to decide whether to pitch a ball to the batter or whether to walk the batter. To be more specific, take a look at the story here on the left. Read it, and decide what the pitcher would do. What would you do? What would an intelligent agent do? If you don't know much about baseball, don't worry about it. Part of the goal here is to see what someone who does not know a lot about baseball might do.

03 – Exercise A Pitcher

>> David, it’s clear that you know more about baseball than I do. So I assume that your answer is the right one. But notice what is happening here. David has a lot of knowledge about baseball, and he’s using that knowledge to make a decision. How is he using his knowledge to make a decision? What is the architecture? What is the reasoning that leads him to make that specific decision? This is one of the things we’ll learn in this lesson. How might intelligent agents make complex decisions about the world?

08 – Assumptions of Cognitive Architectures

The school of AI that works on cognitive architectures makes certain fundamental assumptions about the nature of cognitive agents. First, that cognitive agents are goal-oriented, or goal-directed: they have goals, and they take actions in pursuit of those goals. Second, that these cognitive agents live in rich, complex, dynamic environments. Third, that these cognitive agents use knowledge of the world in order to pursue their goals in these rich, complex, dynamic environments. Fourth, that this knowledge is at a particular level of abstraction that captures the important things about the world and removes the details, and at that level of abstraction, knowledge is captured in the form of symbols. Fifth, that cognitive agents are very flexible: their behavior is dependent upon the environment, and as the environment changes, so does the behavior. And sixth, that cognitive agents learn from their experiences; they're constantly learning as they interact with the world.

09 – Architecture Content Behavior

We can capture the basic intuition behind work on cognitive architectures by a simple equation: architecture plus content equals behavior. Let us look at this equation from two different perspectives. First, imagine that you want to design an intelligent machine that exhibits a particular kind of behavior. This equation says that, in order to do that, you have to design the right architecture, and then put the right kind of knowledge content into that architecture, to get the behavior that you want from it. That's a complicated thing. But suppose that I could fix the architecture for you. In that case, if the architecture is fixed, I simply have to change the knowledge content to get different behaviors, which is a really powerful idea. From a different direction, suppose that we were trying to understand human behavior. Now we could say, again, that the architecture is fixed, and that different behavior arises because the knowledge content is different. We can now map behavior to content, because the architecture is fixed. That simplifies our understanding of how to design machines, and of how to understand human cognition. By the way, the same thing happens in computer architecture. I'm sure you are familiar with computer architecture. A computer architecture has stored programs in it (that's the content), and the running of a stored program gives you different behaviors. The computer architecture doesn't change; the stored program keeps on changing to give you different kinds of behaviors. Same idea with cognitive architectures: keep the architecture constant, change the content. Now, of course, the big question becomes, what is a good architecture? And that's what we'll examine later.
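As a toy illustration of the equation, the interpreter below plays the role of a fixed architecture, and swapping the rule set (the content) changes the behavior; the rules and percepts are invented purely for this sketch.

```python
# Architecture + content = behavior: the interpreter (architecture) is fixed;
# different rule sets (content) produce different behaviors.

def interpreter(rules, percepts):
    """Fixed architecture: fire the first rule whose condition matches."""
    for condition, action in rules:
        if condition(percepts):
            return action
    return 'no-op'

pitching_rules = [(lambda p: p.get('first_base_open'), 'walk-the-batter'),
                  (lambda p: True, 'throw-a-pitch')]
driving_rules  = [(lambda p: p.get('light') == 'red', 'brake'),
                  (lambda p: True, 'drive-on')]

percepts = {'first_base_open': True, 'light': 'red'}
print(interpreter(pitching_rules, percepts))  # walk-the-batter
print(interpreter(driving_rules, percepts))   # brake
```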

11 – Return to the Pitcher

Let us now go back to the example of the baseball pitcher who has to decide on an action to take in a particular circumstance. So we can think of this pitcher as mapping a percept history into an action. Now imagine that this pitcher is embodying a production system. We are back to a very specific situation, and you can certainly read it again. Recall that David had given the answer that the pitcher will intentionally walk the batter. So we want to make a theory of how the pitcher, or an AI agent, might come to this decision. Recall the very specific situation that the pitcher is facing, and recall also that David had come up with this answer. So here is a set of percepts, and here is an action. And the question is, how do these percepts get mapped into this action? We are going to build a theory of how the human pitcher might be making these decisions, as well as a theory of how an AI agent could be built to make this decision. So let's go back to the example of the pitcher having to decide on an action in a particular situation in the world. The pitcher has several kinds of knowledge. Some of this knowledge is internal; he already has it. Some of it he can perceive from the world around him. As an example, the pitcher can perceive the various objects here, such as the bases: first, second, third base. The pitcher can perceive the batter here. The pitcher can perceive the current state of the game: the specific score, the inning, the specific batter. The pitcher can perceive the positions of his own teammates. So all these things the pitcher can perceive, and these then become specific kinds of knowledge that the pitcher has. The pitcher also has internal knowledge; the pitcher has knowledge about his goals and objectives here.

12 – Action Selection

So imagine that Kris Medlen from the Atlanta Braves is the pitcher, and Martin Prado from the Arizona Diamondbacks is at bat. Kris Medlen has the goal of finishing the inning without allowing any runs. How does Kris Medlen decide on an action? We may conceptualize Medlen's decision making like the following. Medlen may look at the various choices that are available to him. He may throw a pitch, or he may choose to walk the batter. If he walks the batter, then there are additional possibilities that open up: he'll need to face the next batter. If he chooses to pitch, then he'll have to decide what kind of ball to throw: a slider, a fastball, or a curveball. If it is a slider, then a next set of possibilities opens up. There might be a strike, or a ball, or a hit, or he may just strike the batter out. Thus, Medlen is setting up a state space. Now, what we just did informally can be stated formally. We can imagine a number of states in the state space. The state space is the combination of all the states that can be achieved by applying various combinations of operators, starting from the initial state. Each state can be described in terms of some features, f1, f2, and so on, and each feature can take on some values, for example v1; there might be a range of values here. So initially, the pitcher is at state S0, and the pitcher wants to reach some state, say S101. And at state S101, presumably, the pitcher's goal has been accomplished. So we may think of the pitcher's decision making as finding some kind of path from his current state to this particular goal state. This is an abstract space. The pitcher has not yet taken any action. The pitcher is still thinking. The pitcher is setting up an abstract state space in his mind and exploring that state space.

13 – Putting Content in the Architecture

Okay, now in order to go further, let us start thinking in terms of how we can put all of these percepts and the goal into some feature-value language, so that we can store it inside Soar. Here is one attempt at capturing all of this knowledge. I can say that it's the 7th inning: inning is 7th. It's the top of the 7th inning: it's the top here. Runners are on 2nd and 3rd base. And so on and so forth. Note that at the bottom I have the goal, which is to escape the inning, which I think means, in this particular context, to go to the next inning without letting the batter score any more points. So now that we have put all of these percepts coming from the world, and the goal, into some kind of simple representation which has features and values in it, the fun is going to begin.
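One possible rendering of that feature-value representation is a simple dictionary like the one below; the slot names and exact values are illustrative (this is not Soar syntax), with only the features mentioned in the lecture filled in.

```python
# A sketch of the percepts and the goal as feature-value pairs, the kind of
# content that would sit in working memory. Names and values are illustrative.

working_memory = {
    'inning': 7,
    'half': 'top',
    'runners-on': ['2nd', '3rd'],
    'batter': 'Prado',
    'goal': 'escape-the-inning',
    # ...further percepts from the situation would be added in the same way
}
```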

18 – Exercise Production System in Action II

>> That was right, David. Thank you. Let's summarize some of the things that David noted. Based on the contents of the working memory, some rules get activated. As these rules get activated, some consequences get established. As these consequences get established, they get written onto the working memory. So the contents of the working memory are constantly changing. As the contents of the working memory change, new rules can get activated. So there is a constant interaction between the working memory and the long-term memory. The contents of the working memory change quite rapidly; the contents of the long-term memory change very, very slowly.
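The interaction described here can be sketched as a small match-fire loop: rules in long-term memory test the working memory, and when they fire, their consequences are written back, possibly activating further rules. The rule contents below are illustrative stand-ins, not the actual rules from the lesson.

```python
# A minimal sketch of the production-system cycle: match rules against working
# memory, fire them, write consequences back, and repeat until nothing changes.

working_memory = {'batter': 'Prado', 'first-base': 'open',
                  'goal': 'escape-the-inning'}

# Long-term (procedural) memory: (condition, consequence) pairs.
rules = [
    (lambda wm: wm.get('first-base') == 'open',
     {'suggested-operator': 'intentionally-walk-the-batter'}),
    (lambda wm: wm.get('suggested-operator') == 'intentionally-walk-the-batter',
     {'selected-action': 'intentionally-walk-the-batter'}),
]

changed = True
while changed:
    changed = False
    for condition, consequence in rules:
        already_there = consequence.items() <= working_memory.items()
        if condition(working_memory) and not already_there:
            working_memory.update(consequence)   # consequences written back
            changed = True

print(working_memory['selected-action'])  # intentionally-walk-the-batter
```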

21 – Chunking

So for this situation, the Soar cognitive architecture selected not one operator but two. In Soar theory, this is called an impasse. An impasse occurs when the decision maker cannot make a decision, either because not enough knowledge is available, or because multiple courses of action are being suggested and the agent cannot decide among them. In this case, two actions have been suggested and the agent cannot decide between them: should the pitcher throw a curveball or a fastball? At this point, Soar will attempt to learn a rule that might break the impasse. If the decision maker has a choice between the fastball and the curveball and it cannot decide, might there be a way of learning a rule that decides what to throw in this particular situation, given the choice of the fastball and the curveball? For this, Soar will invoke episodic knowledge. Let's see how Soar does that, and how it can help Soar learn a rule that results in the breaking of the impasse. So imagine that Soar had episodic knowledge about a previous event, a previous instance of an event. In this previous instance, in another game, it was the bottom of the fifth inning, the weather was windy, but it was the same batter, Parra, who bats left-handed. It was a similar kind of situation, the pitcher threw a fastball, and Parra hit a home run off it. Now we want to avoid that; the current pitcher wants to avoid it. So given this episodic knowledge about this event that occurred earlier, Soar has a learning mechanism that allows it to encapsulate knowledge from this event in the form of a production rule that can be used as part of the procedural knowledge. And the learned rule is: if two operators are suggested, and throw-fast-ball is one of those operators, and the batter is Parra, then dismiss the throw-fast-ball operator. This process of learning is called chunking. So chunking is a learning technique that Soar uses to learn rules that can break impasses. First, note that chunking is triggered when an impasse occurs. In this situation, the impasse is that two rules got activated and there is no way of resolving between them. The impasse immediately tells the process of chunking what the goal of chunking is: find a rule that can break the impasse. Soar now searches the episodic memory and finds an event that has some knowledge that may break the impasse. In particular, it looks at the percepts of the current situation that we had in the previous shot, compares them to the percepts of previous situations in the episodic memory, and finds whether any information is available about the current batter. If some information is available that tells Soar the result of some previous action that also occurs in the current impasse, then Soar picks that event. It then tries to encapsulate the result of the previous event in the form of a rule. In this case, it wants to avoid the result of a home run, and therefore the rule says dismiss that particular operator. If it had wanted that particular result, the rule would have said select that particular operator. We said earlier that in cognitive systems, reasoning, learning, and memory are closely connected. Here is an example of that. We're dealing with memory: memory for procedural knowledge and for episodic knowledge. We're dealing with reasoning: decision making. And we're dealing with learning: chunking.
If you want to learn more about chunking, then the reading by Lehman, Laird, and Rosenbloom, and the further readings at the end of this lesson, give many more details.
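A very rough sketch of that chunking cycle appears below: detect the impasse (two suggested operators), look for a matching episode, and build a rule that dismisses the operator that led to the bad outcome. The data structures are simplified stand-ins for Soar's memories, so treat every name here as an assumption of the sketch.

```python
# A rough sketch of chunking: on an impasse, consult episodic memory and learn
# a rule that dismisses the operator associated with a bad past outcome.

episodic_memory = [
    {'batter': 'Parra', 'action': 'throw-fast-ball', 'result': 'home-run'},
]

def chunk(suggested_operators, working_memory):
    if len(suggested_operators) < 2:
        return None                              # no impasse, nothing to learn
    for episode in episodic_memory:
        if (episode['result'] == 'home-run'
                and episode['action'] in suggested_operators
                and episode['batter'] == working_memory.get('batter')):
            bad_action, batter = episode['action'], episode['batter']
            # Learned chunk: if two operators are suggested, one of them is
            # bad_action, and the batter matches, dismiss bad_action.
            def learned_rule(ops, wm):
                if len(ops) > 1 and bad_action in ops and wm.get('batter') == batter:
                    return ops - {bad_action}
                return ops
            return learned_rule
    return None

wm = {'batter': 'Parra'}
rule = chunk({'throw-fast-ball', 'throw-curve-ball'}, wm)
print(rule({'throw-fast-ball', 'throw-curve-ball'}, wm))  # {'throw-curve-ball'}
```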

22 – Exercise Chunking

Let's do one more exercise on the same problem. Note that I have added one more rule into the procedural knowledge. This is the rule that was the result of chunking: if two operators are suggested, and the throw-fast-ball operator is suggested, and the batter is Parra, then dismiss the throw-fast-ball operator. Okay, given these rules and the same situation, what do you think will be the operator that gets selected?

24 – Fundamentals of Learning

This is the first time we have come across the topic of learning in this course, so let us examine it a little more closely. We are all interested in asking the question, how do agents learn? But this question is not isolated from a series of other questions. What do agents learn? What is the source of their learning? When do they learn? And why do they learn at all? For the purpose of addressing what goal or what task? Now here is the fundamental stance that knowledge-based AI takes. It says that we'll start with a theory of reasoning. That will help us address questions like what to learn, when to learn, and why to learn. And only then will we go to the question of how to do the learning. So we reason first, and then work backwards to learning. This happened in production systems: when the production system reached an impasse, it said, let's learn from episodic knowledge in order to resolve this impasse. So once again, we are trying to build a unified theory of reasoning, memory, and learning, where the demands of memory and reasoning constrain the processing of learning.

25 – Assignment Production Systems

So how would you use a production system to design an agent that can solve Raven's Progressive Matrices? We could think about this at two different levels. At one level, we could imagine a production system that's able to address any incoming problem. It has a set of rules for what to look for in a new problem, and it knows how to reply when it finds those things. On the other hand, we can also imagine production rules that are specific to a given problem. When the agent receives a new problem, it induces some production rules that govern the transformation between certain figures, and then transfers those to other rows and columns. So in that way, it's able to use the same kind of production system methodology to answer these problems, even though it doesn't come into the problem with any production rules written in advance. Inherent in this idea, though, is the idea of learning from the problem that it receives. How is this learning going to take place? How is it actually going to write these production rules based on a new problem? And what's the benefit of doing it this way? What do we get out of having these production rules that are written based on individual problems?

28 – Final Quiz

So we are now at the final quiz for this particular lesson. What did you learn in this lesson?

29 – Final Quiz

And thank you for doing it.

07 – Frames

02 – Exercise Ashok Ate a Frog

We started this unit by saying that frames are a useful knowledge representation for enabling common sense reasoning. But what is common sense reasoning? You can do it, I can do it. How do we make a machine do it? To illustrate common sense reasoning, let us consider a simple sentence: Ashok ate a frog. All right, you understand the sentence. You understand the meaning of the sentence. But what is the meaning of the meaning? What did you just understand? Try to answer the questions on the right.

04 – How do we make sense of a sentence

Let us look at the meaning of the sentence, Ashok ate a frog. When I say the sentence, you understand its meaning immediately. But what did you understand? What is the meaning of the meaning of the sentence? How can I capture that meaning? How can we capture it in a machine? Let us focus for now on the verb in the sentence, which is ate. We'll associate a frame with the verb in the sentence. Later on we will see that frames can also be associated with the objects or the nouns in the sentence, but for now we will focus on the verb. Now, what is it that we know about the stereotypical action of eating? What happens when people eat, when you and I eat? Usually there is an agent that does the eating, and that particular agent corresponds to the subject of the sentence. Usually something is being eaten; that's the object. There is often a location where the eating is done, or a time when the eating is being done. Someone might use a utensil to do the eating; you might eat with a fork or a spoon, for example. There might be other things that we know about the stereotypical action of eating. For example, what is being eaten typically is not alive, at least not when humans eat it. Now, this particular slot, object-is, concerns the location of the object: where is the object after it has been eaten? And you might say, well, it's inside the subject's body. What might be the mood of the subject? Well, after people have eaten, typically they are happier. So here is a list of slots that we associate with the stereotypical action of eating. This is not an exhaustive list; you can add some more. Each of the slots may take some values. We'll call these values fillers. So, slots and fillers. Some of the fillers are there by default. Some of the fillers may come from parsing the sentence. So we know that in this particular sentence, the subject is Ashok and the object is a frog. Okay, so a frame, then, is a knowledge structure. Note the word structure. There are a number of things happening in this knowledge representation. If I may take an analogy with something with which I'm sure you are familiar: consider the difference between an atom of knowledge representation and a molecule of knowledge representation. Some knowledge representations are like atoms; other knowledge representations are like molecules. An atom is a unit by itself; a production rule is like an atom. On the other hand, frames are like molecules; they have a structure. There are a large number of things happening. These molecules can expand or contract. You can do a lot more with frames than you can do with a simple production rule. So a frame is a knowledge structure which has slots, and which has fillers that go with those slots. Some of these fillers are there by default. A frame deals with a stereotypical situation. Consider now a different sentence. Suppose we had the sentence, David ate a pizza at home. Here, I have filled out what a frame for this particular sentence would look like. The subject is different, the object is different. This time, there is some information about location; in the previous sentence there was no information about location. Let us compare these two frames for another second. Note that the slots, in the case of both frames, are exactly the same, because the frame corresponds to the action of eating. The fillers, on the other hand, are different, at least some of them, because these fillers correspond to the particular input sentences.
The only fillers that are the same are those fillers which have to do with default values for particular slots.
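To make the slot-and-filler idea concrete, here is a small sketch of the ate frame as a dictionary of slots with default fillers, where parsing a sentence simply fills in the remaining slots; the slot names follow the lesson, but the code itself is only an illustration, not a parser.

```python
# A frame as slots and fillers: defaults capture the stereotype of eating, and
# fillers parsed from a sentence override or complete the open slots.

ATE_FRAME_DEFAULTS = {
    'subject': None, 'object': None, 'location': None, 'time': None,
    'utensil': None,
    'object-alive': False,              # default: the thing eaten is not alive
    'object-is': 'inside-subject-body', # default: where the object ends up
    'subject-mood': 'happy',            # default: eating makes one happier
}

def make_ate_frame(**fillers):
    """Start from the stereotype and override slots with sentence-specific fillers."""
    frame = dict(ATE_FRAME_DEFAULTS)
    frame.update(fillers)
    return frame

ashok_frame = make_ate_frame(subject='Ashok', object='a frog')
david_frame = make_ate_frame(subject='David', object='a pizza', location='home')
```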

05 – Exercise Making sense of a sentence

Okay, let us do an exercise. On the left I have shown you a sentence. On the right is a frame for Ate. Please write down the slots for the frame for Ate, as well as the fillers that will go in these slots for this particular input sentence.

09 – Exercise Interpreting a Frame System

Let us do an exercise together. Imagine that there is a set of frames here that is capturing some conceptual knowledge. What sentence is expressed by these frames?

10 – Exercise Interpreting a Frame System

>> That's good, David. But here's something interesting to note. It could have been that this was the input sentence and that this frame representation got constructed from this input sentence. So Haruto became the subject and the person, and ate became the verb, and so on. Alternately, this could have been the frame representation, and perhaps the sentence is what we get through language generation from this frame representation. So the frame representation could potentially act as an intermediate representation both for sentence comprehension and for sentence generation. Of course, there's a lot more to sentence generation and sentence comprehension than what we have shown so far.

11 – Frames and Semantic Nets

We can also use frames to address the Raven's matrices problems that we have been talking about throughout this course. In fact, as we do so, we'll note another interesting fact: frames and semantic networks are closely related. So let's do this problem. Here is a particular image, and here is a semantic network for this particular image that we had come across earlier. I could rewrite this semantic network in the language of frames, first of all by building a frame for each of these specific objects: a frame for x, a frame for y, and a frame for z. So here are the frames for the three objects, x, y, and z. Let's look at the frame for z in more detail for just a second. Here are the slots: the name is z, the shape is a circle, the size is small, and it is filled; you can see it here. We can also capture the relationships between these objects. So let's consider a relationship as an example. Here, y is inside x. We can capture that through this slot for the object y. Here is the slot for inside, for the object y, and it is pointing to x, indicating that y is inside x. Note again the equivalence between the semantic network and the frame representations: the three objects, and the three frames corresponding to the three objects; the relationships between the objects, and the relationships being captured by these blue lines here between the frames. While we can capture relationships between frames through lines like this, where one frame points to another frame, we could also capture them more directly by actually specifying the other frame's name as the filler. So, for example, for the frame y, we might say inside: x, which captures the same idea that we were capturing by drawing a line between them. In fact, this is the notation we'll use in the rest of the exercises in this lesson.
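A sketch of those three frames, with the inside relationship written directly as a filler naming another frame, might look like the following; only z's attributes come from the lesson, so the fillers for x and y are illustrative guesses.

```python
# Frames for the three objects; the relationship 'y is inside x' is captured by
# putting the other frame's name in the 'inside' slot, as described above.

frames = {
    'x': {'name': 'x', 'shape': 'circle',   'size': 'large', 'filled': False},
    'y': {'name': 'y', 'shape': 'triangle', 'size': 'small', 'filled': False,
          'inside': 'x'},
    'z': {'name': 'z', 'shape': 'circle',   'size': 'small', 'filled': True},
}
```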

12 – Exercise Frames and Semantic Networks

Let us do an exercise together to make sure that we understand frame representations for images like this. So consider the image shown here on the top left. Can you write down all the slots and the fillers for these three frames?

14 – Frames and Production Systems

We have actually come across the notion of frames earlier, when we were talking about production systems. You may recall we had a diagram like this, where we had procedural, semantic, and episodic knowledge, and the working memory contained a structure like this. You can see this is really a frame: here are the slots, and here are the values for the slots. We can think of these frames as capturing conceptual knowledge that is stored in the semantic memory. So let's take an example. Suppose the input is, a shark ate a frog. Consider the word ate there. That verb ate gets written into working memory, and the entire frame for ate gets pulled out. Once this frame is pulled out of semantic memory, it immediately generates expectations. So we now know that ate is likely to have a subject, an object, and a location, and perhaps a time, utensils, and so on. So we can ask ourselves the question, well, what will go under subject here? What will go under object here? And in the sentence a shark ate a frog, this frame tells us what to look for. As a result, the processing is not just bottom-up, coming from natural language or the world in general and going into the mind. The mind also provides knowledge structures like frames, structured knowledge representations, which generate expectations and make the processing partially top-down.

16 – Exercise Frames Complex Understanding

>> Or, if you're interested in reading more about this now, you can go ahead and watch those lessons.

19 – The Cognitive Connection

Frames relate closely to human cognition. Let us consider three specific ways. First, frames are a structured knowledge representation. We can think of production rules as being atoms of knowledge representation, and frames as being molecules of knowledge representation. A production rule captures a very small amount of information; a frame can capture a large amount of information, organized as a packet. Second, frames enable us to construct a theory of cognitive processing which is not entirely bottom-up, but is partially top-down. I do get a lot of data from the world, but not all of the cognitive processing is bottom-up. The data results in the retrieval of knowledge from memory. That knowledge, in the form of frames, then helps to make sense of the data by generating expectations about the world. So the processing becomes not just bottom-up, but also top-down. Third, frames capture the notion of stereotypes: stereotypes of situations, stereotypes of events. Now, stereotypes can sometimes lead us to incorrect inferences. Yet you and I have stereotypes of various kinds of events and situations. So why do we have stereotypes? Because they're cognitively efficient. And why are they cognitively efficient? Because, instead of requiring us to reason about the world anew each time, frames already have default values associated with them. That's a property of frames. All the default values then enable me to generate a certain number of expectations very rapidly. That's cognitively efficient. So here are three connections between frames and human cognition. There are a lot more that we'll get into slowly.

20 – Final Quiz

Please fill out what you learned in this lesson in this box.

21 – Final Quiz

Great. Thank you very much.

08 – Learning by Recording Cases

01 – Preview

Today we’ll talk about learning by recording cases. This is our first topic of learning, one of the fundamental elements of knowledge-based AI. It is also a first topic in analogical reasoning, often considered to be a core process of cognition. We’ll start talking about recording cases in a general sense. Then we’ll discuss a specific method for recording cases called the nearest neighbor method. We’ll generalize this method into the k-nearest neighbor method and end by talking about complex cases in the real world.

02 – Exercise Block World I

To see how learning by recording cases might work, consider a world of blocks: colored blocks with various shapes and sizes, six blocks in all. Now let us suppose that I were to give you a question. Based on your experiences in this world, what do you think is the color of this block?

04 – Learning by Recording Cases

>> That's a good example, David, and we could even try to generalize it to medical diagnosis. Imagine that you went to a medical doctor with a set of signs and symptoms, so the doctor is faced with a new problem: what is the diagnosis for your signs and symptoms? The doctor may have a number of cases recorded in her memory; these are the cases she has encountered during her experience. So the doctor may select the most similar case, the most closely resembling case, which in this example might be B, and apply to A exactly the same diagnosis that applied to B. A case, then, is an encapsulation of a past experience. And learning by recording cases is a very powerful method that works in a very large number of situations, ranging from tying your shoelaces to medical diagnosis.

06 – Exercise Retrieval by Nearest Neighbor

Let us do an exercise together. Given the block shown here with the width of 0.8 and the height of 0.8, what do you think is the color of this block?

07 – Exercise Retrieval by Nearest Neighbor

>> So in this problem, we are dealing with a two-dimensional grid, because here two coordinates, x and y, are enough to represent any one point. In the real world, of course, problems are not that easy to represent, and one might need a multi-dimensional space in order to be able to represent all the cases and the new problem. Let's examine a problem like that now.

10 – Nearest Neighbor for Complex Problems

Now we can try to calculate the most similar case to the new problem based solely on the origin. The two-dimensional grid here represents both all the cases and the new problem. Of course, we can also calculate the similarity of the new problem with the old cases based on the destination. This two-dimensional grid captures the cases and the problem based on the destination. You can compute the Euclidean distance from Q to all the cases based on the origin, shown here. And you can do the same thing with the destination, shown here. If we focus only on the origin, then the B case seems the closest. If we focus solely on the destination, the E case seems the closest. However, the B case is not very good when we look at the destination, and the E case is not very good when we look at the origin. How then might an AI agent find out which is the best route of all of these choices? How might it decide that D is the best route?

11 – Nearest Neighbor in k-Dimensional Space

Earlier we had this formula for calculating the Euclidean distance in two dimensions. Now we can generalize it to many dimensions. So here is a generalization of the previous formula for computing the nearest neighbor. In this new formula, both the case and the problem are defined in k dimensions, and we find the Euclidean distance between them in this k-dimensional space. This table summarizes the Euclidean distance between the cases and the new problem in this multidimensional space, where we are dealing with both the origin and the destination, and where the origin as well as the destination are specified by x and y coordinates. Looking at this table, we can very quickly see that D, and not B or E, is the closest case, the most similar case, to the new problem Q. This method is called the k-nearest neighbor method, where NN stands for nearest neighbor. This is a powerful method, as simple as it is. Of course, it also has limitations. One limitation is that, in the real world, the number of dimensions in which we might want to compute the distance between the new problem and the old cases might be very large, a high-dimensional space. In such a situation, deciding which of the stored cases is closest to the new problem may not be as simple as it appears here. A second difficulty with this method is that even if the new problem is very close to an existing case, that does not mean that the existing case's solution can or should be directly applied to the new problem. So we need both alternative methods of retrieving cases from memory, and methods for adapting past cases to fit the requirements of the new problem. That is called case-based reasoning, and we will discuss it in the next lesson.
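Here is a small Python sketch of the k-dimensional distance and nearest-neighbor retrieval; the case coordinates are made up for illustration and are not the actual numbers from the lesson's table.

```python
# Nearest neighbor in k dimensions: each case and the new problem are points,
# and we retrieve the stored case with the smallest Euclidean distance.

from math import sqrt

def euclidean(case, problem):
    """sqrt of the sum of squared differences across all k dimensions."""
    return sqrt(sum((c - p) ** 2 for c, p in zip(case, problem)))

def nearest_neighbor(cases, problem):
    return min(cases, key=lambda label: euclidean(cases[label], problem))

# Each route case as (origin_x, origin_y, destination_x, destination_y).
cases = {'B': (1.0, 4.0, 6.0, 3.0),
         'D': (2.0, 2.0, 7.0, 1.0),
         'E': (5.0, 1.0, 7.5, 0.5)}
problem = (2.5, 2.5, 7.0, 0.5)
print(nearest_neighbor(cases, problem))  # 'D' in this made-up example
```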

12 – Assignment Learning by Recording Cases

For this assignment, talk about how you might use the notion of recording cases to design an agent that can solve Raven's Progressive Matrices. You might think of cases in a variety of different ways here. For example, each figure in a problem could be a case. Each transformation between figures could be a case. Or, more broadly, each problem that your agent has encountered in the past could be a case. As part of this, you'll also need to think about how to evaluate similarity. If you're using figures, how do you evaluate the similarity between two figures in a problem? Or how do you evaluate the similarity between two transformations in a problem? Or, more broadly, how do you find which problem that you faced in the past is most similar to the new one you're facing now?

13 – Wrap Up

So today we discussed a learning method called learning by recording cases. In learning by recording cases, we file away individual cases we have encountered in the past in order to use them for future problem solving. We talked about the nearest neighbor method as a way of finding the stored case most similar to the current problem we face. But in the real world, this can often be very difficult. So we talked about using nearest neighbor to find similar cases to our current problem even for complex problems, such as our navigation example. However, there are still a lot of limitations to this method. Oftentimes, just executing a solution we've used in the past doesn't work. And oftentimes, we have to store cases based on qualitative labels instead of numeric labels. These weaknesses will be addressed in our next lesson when we talk about case-based reasoning. There we'll add adaptation and evaluation into our process, and start to be able to use cases in a much more thorough and robust way.

14 – The Cognitive Connection

Learning by storing cases in memory has a very strong connection to cognition. Cognitive agents like you and I are situated in a world. Our interactions with the world have certain patterns of regularity. The world offers us the same problems again and again. If we think about it, the kinds of problems that you and I deal with on a routine, everyday basis are the same problems that occurred yesterday and the day before. Tying shoelaces is a good example of that. When we have to tie shoelaces, none of us thinks a lot about how to do it; memory supplies us with the answer. We don't think as much as we think we do. If you recall, we drew a cognitive architecture earlier that had three components in it: reasoning, memory, and learning. When we think of intelligence, we typically focus on the reasoning component. We think intelligence has to do with reasoning, with solving problems, with decision making. To some degree, that is true. But learning by recording cases shifts the balance between the components. It says that learning is very important, and so is memory. We record things in memory, and then memory supplies us with the answers, so that we don't actually have to reason as much as we think we need to.

15 – Final Quiz

Please write down what all you learned in this lesson, in this box.

16 – Final Quiz

And thank you for doing it.

09 – Case-Based Reasoning

01 – Preview

Today we will talk about case-based reasoning. In case-based reasoning, the cognitive agent addresses new problems by tweaking solutions to similar, previously encountered problems. Case-based reasoning builds on the previous lesson on learning by recording cases. In learning by recording cases, the new problem is identical to a previous problem. In case-based reasoning, the new problem is similar to a previously encountered problem. Case-based reasoning typically has several phases: case retrieval, case adaptation, case evaluation, and case storage. We'll also discuss certain advanced processes of case-based reasoning, which will include new methods for case retrieval.

04 – Recording Cases to Case-Based Reasoning

To examine a more realistic problem, let's revisit the problem that we had in our last lesson. Once again, this is a map of a part of Long Island, and the problem is to go from this initial location to this end location; I'll call it the Q problem. We'll retrieve from memory the D case, which takes us from this initial location to this goal location. Clearly, this D case is potentially useful for addressing the Q problem, but it is not useful as is. The initial location of the D case is not the same as the initial location of the Q problem, and the end location of the D case is not the same as the end location of the Q problem. So we can start with this D case, but we need to adapt it. This leads us to the overall process of case-based reasoning. The basic process of case-based reasoning consists of four steps. The first step is retrieval, and we have already considered this when we were discussing learning by recording cases; k-nearest neighbor is one way of retrieving cases from memory. Once we have retrieved a case from memory that is relevant to the current problem, we need to adapt it. For example, in the previous problem we had the D case and the Q problem, and we needed to adapt the D case to the Q problem. There are many similar examples. All of us program, and all of us, as computer programmers, sometimes use case-based reasoning. We are given a new problem to address, and we often look at the design of a program that we have come across earlier. So we're retrieving a case, and we're adapting the particular design of the old program to solve the new problem. Once we have adapted the case to meet the requirements of the new problem, we have a candidate solution for the new problem. Next, the candidate solution has to be evaluated. For example, in the navigation problem, once we have a candidate solution for the Q problem, we can evaluate whether it would actually take us to the end location. We can do a simulation; we can walk through it. As we walk through it, we will be able to evaluate whether the solution actually succeeds in meeting the requirements of the problem. For the programming problem, once we have a new program that we obtained by adapting the old program, we can actually run the program to see whether or not it will meet the requirements of the new problem. Let us suppose for a moment that we evaluate a candidate solution and it succeeds. Then we can encapsulate the new problem and the new solution into a case, and store it back into the case memory, so that the case memory is constantly increasing. Notice that this case-based reasoning process unifies memory, reasoning, and learning. There is a case memory that contains a large number of cases, and that's where we retrieve cases that are relevant to the current problem. We reason when we adapt and evaluate. And we learn when we store the new case back into the case memory.
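The four-step cycle can be sketched as a single loop with pluggable pieces; the function names below are placeholders for whatever retrieval, adaptation, evaluation, and storage methods an agent actually uses.

```python
# A schematic sketch of the case-based reasoning cycle: retrieve, adapt,
# evaluate, store. Each step is a hypothetical pluggable function.

def case_based_reasoning(problem, case_memory, retrieve, adapt, evaluate, store):
    case = retrieve(case_memory, problem)        # e.g., k-nearest neighbor
    candidate = adapt(case, problem)             # tweak the old solution
    if evaluate(candidate, problem):             # simulate, run, or critique it
        store(case_memory, problem, candidate)   # the case memory keeps growing
        return candidate
    return None                                  # failure: repair or re-retrieve
```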

05 – Assumptions of Case-Based Reasoning

>> That's a good example, David. So we have at least two examples now of similar problems that have quite different solutions. Nevertheless, this assumption is valid most of the time. Most of the time, two problems that are quite similar will end up having two solutions that are quite similar as well.

06 – Case Adaptation

>> That's a good point, David. In fact, in the design community there is an old cliché which says that all designers redesign. Design is fundamentally evolutionary: we take old designs and we evolve them slightly, and that's how we get a new design. And the same thing is happening in case-based reasoning here. It is saying that, often, the particular solutions that we come up with are evolutionary in nature, in the sense that they are small tweaks over previous solutions. So the next question becomes, how can we adapt an old case to meet the requirements of a new problem? There are potentially several ways of doing it. We will discuss three important ways, perhaps the three most common ways of adapting a case. They are called the model-based method, the recursive case-based method, and the rule-based method.

09 – Case Adaptation by Rules

>> David, to generalize your answer to design: designers often use heuristics of the kind that you mentioned. For example, if you want to make an artifact lighter, try a different material. That's a heuristic, expressed as a rule.

10 – Case Evaluation

>> In design more generally, we can simulate a design, or we can actually prototype a design. Another method for evaluating a design could be to share it with other designers and let them critique it. So there are a number of different methods that are possible for evaluation as well.

11 – Case Storage

So we just talked about how the evaluation step in the case-based reasoning process decides whether a candidate solution in fact meets the requirements of the given problem. Now that we have the new problem and the solution for it, we can encapsulate them as a case and store them in the case memory. We saw the advantages of this kind of storage earlier, when we went from home to the restaurant: we stored that case in memory so that when we wanted to go back from the restaurant to home, we could retrieve that case and try to adapt it. So case storage is an important way of learning; we are constantly accumulating and assimilating new cases. We'll talk about two kinds of storage mechanisms: indexing and discrimination trees.

12 – Case Storage by Index

>> That's an important point. We want to use an indexical structure which allows for effective and efficient retrieval, because we are storing things only because we want to be able to retrieve them at a later time. In the case of design more generally, people have developed indexical structures that have to do with functions, with operating environments, with performance criteria, and so on.

13 – Exercise Case Storage by Index I

But for now, let's go back to our original navigation microworld. Imagine that we have a new case Y. Given our indexing scheme here, of the x-y coordinates of the initial location, what do you think are the indices of the case Y?

14 – Exercise Case Storage by Index I

>> Precisely.

15 – Exercise Case Storage by Index II

Let's consider a different case. Suppose we have a case Z of going back from the restaurant to the home. Let's also suppose that we change our indexing scheme: now we are indexing things by the x-y coordinates of the destination, not the origin. What will be the indices for the case Z?

20 – Exercise Storage by Discrimination Tree II

But note that A and Y are in this same branch. So now we need to find a way of discriminating between A and Y. How could we do that?

21 – Exercise Storage by Discrimination Tree II

>> So, for those of you familiar with Big O Notation, you’ll notice that the efficiency of searching the case library organized by indices was linear, whereas here, it’s logarithmic.

22 – Case Retrieval Revisited

Now that we have considered storage, let's revisit retrieval. We talked about two different ways of organizing the case memory: a tabular way and a discrimination tree. How can we retrieve the case relevant to a given problem? We assume here that the new problem has the same features in its description as the cases stored in memory. Earlier, when we were storing a case in memory, we navigated this tree to find where in the tree we should store the new case. This time, we'll use the problem to navigate the tree and find out which case is most similar to the problem.
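One way to picture this double use of the tree is the sketch below, where each internal node asks a question about a feature and retrieval simply follows the answers down to a leaf; the particular questions and cases are invented for illustration.

```python
# A small sketch of a discrimination tree: internal nodes ask questions about
# features; retrieval navigates the same questions used for storage.

class Node:
    def __init__(self, question=None, branches=None, case=None):
        self.question = question        # function mapping a problem to a branch key
        self.branches = branches or {}
        self.case = case                # at a leaf: the stored case

def retrieve(node, problem):
    """Follow the questions down the tree to the most similar stored case."""
    while node.question is not None:
        node = node.branches[node.question(problem)]
    return node.case

# Example: discriminate first on the origin, then on the destination.
tree = Node(question=lambda p: p['origin-x'] < 5,
            branches={True:  Node(case='case-B'),
                      False: Node(question=lambda p: p['dest-y'] < 2,
                                  branches={True:  Node(case='case-D'),
                                            False: Node(case='case-E')})})

print(retrieve(tree, {'origin-x': 7, 'dest-y': 1}))  # case-D
```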

24 – Exercise Retrieval by Index

>> That’s right, David.

25 – Exercise Retrieval by Discrimination Tree

Let's repeat this exercise, but this time using a discrimination tree for organizing the case memory. So here is a discrimination tree containing the cases currently in the case memory. And here is the new problem: to go from this initial location to this goal location. Given this problem, what case would be retrieved from this discrimination tree?

27 – Advanced Case-Based Reasoning

>> Failures are great opportunities for learning. When failures occur, we can try to repair the failure by going back from the evaluation step to the adaptation step. Or we can try to recover from the failure by going from the evaluation step all the way back to the retrieval step. In addition, we can store these failures in the case memory. When we store them in the case memory, these failures can help us anticipate failures that might occur with new problems. There's a flip side to this. Just as it is useful to store failed cases, it is not useful to store every successful case. If we stored every successful case, then very soon the case memory would become very, very large, and the retrieval step would become less efficient. This is sometimes called the utility problem. We want to store only those successful cases that in fact help us cover a larger span of problems. This means that even when a case succeeds, we want to store it only if there is something interesting or noteworthy about that case.

28 – Assignment Case-Based Reasoning

In this assignment, discuss how you’d use case-based reasoning to develop an agent that can answer Raven’s Progressive Matrices. Make sure to describe how this is different from learning by recording cases alone. Where is your adaptation phase? How are you adapting past solutions to the new problem? What is evaluation in this context? How are you evaluating the strength of your answer? Are you going to record the cases that your agent encounters as they’re solving the test, or are you going to equip them with past cases beforehand for them to use to solve new problems?

29 – Wrap Up

So today we talked about the broad process of case-based reasoning. Learning by recording cases gave us a method for case retrieval called the nearest neighbor method. So we went ahead and jumped into the adaptation phase: given an old solution to a problem, how do we adapt that old solution to a new problem? We talked about three ways of doing that: we can do it by a model of the world, we can do it by rules, or we can do it by recursion. Then, once we've adapted that old case, how do we evaluate how good it was for our problem? Then, after we evaluated how good it is, we looked at storing it back in our memory. We want to build up a case library of past solutions, so if we've solved a new problem, we will now store that back into our case library. Then, based on that, we revisited the notion of case retrieval: based on how our case library is organized, how do we retrieve a prior case that's most similar to our new problem? Now, there are a lot of open issues here. For example, should we store failed cases? Should we store failed adaptations? Do we want to store them so we can avoid failing in the future? Should we ever forget cases? Can our case library ever get so big that it's intractable, and we can't really use it efficiently? Should we abstract over cases? That is, should we use these individual cases to develop a more abstract understanding of a concept, or should we stick with the individual cases and adapt them from there? If you're interested in these questions, you can head over to our forums and we'll talk about them there. But we'll also be revisiting these questions throughout the rest of the course. Next time we'll talk about incremental concept learning, which takes individual cases and abstracts over them to learn some kind of higher-level concepts.

31 – Final Quiz

Please write down what you learned in this lesson.

32 – Final Quiz

And thank you for doing it.

10 – Incremental Concept Learning

02 – Exercise Identifying a Foo I

Let us try to do a problem together on incremental concept learning. I'm going to give you a series of examples, and we will see what kind of concept one can learn from them. I'll not tell you what the concept really is; for the time being, I'm just going to call it foo. Here is the first example. In this first example, there are four bricks: a horizontal brick at the bottom, a horizontal brick at the top, and two vertical bricks on the sides. Here is a second example, and this time I'll tell you that this particular example is not a positive instance of the concept foo. Once again, we have four bricks: a brick at the bottom, a brick at the top, and two bricks on the sides. This time, the two vertical bricks are touching each other. Here's a third example of the concept foo. This is a positive example; this is a foo. Again we have four blocks. This time there are two bricks, and instead of having two vertical bricks, we have two cylinders, and they are not touching each other. So I showed you three examples of the concept foo, and I'm sure you learned some concept definition out of it. Now I'm going to show you another example and ask you: does this example fit your current definition of the concept foo? What do you think?

03 – Exercise Identifying a Foo I

>> And in coming up with this answer, David used some background knowledge. The background knowledge that he used was that the bricks that were in the vertical position in the first example, the cylinders that were in the vertical position in the third example, and the special blocks that are in the vertical position in this example are all examples of something called a block. They can be replaced by each other: instead of having a brick, one could have a cylinder, or some other thing that is vertically placed here. Now, someone else in the class may have different background knowledge, and he or she may not consider this to be an example of a block, in which case the answer might be no. The point here is that background knowledge plays an important role in deciding whether or not this is an example of foo.

04 – Exercise Identifying a Foo II

Let’s try another example. This time, again, there are four blocks. There are two bricks, at the bottom and the top. And the two cylinders, both vertical, but they are touching each other. Is this an example of the concept foo based on what we have learned so far?

05 – Exercise Identifying a Foo II

>> Once again, David is using his background knowledge. In his background knowledge, he says that the bricks are like the cylinders: the vertical bricks are like the vertical cylinders. So what holds for the vertical bricks, that they must not be touching, also holds for the vertical cylinders; they too must not be touching. Again, someone else may have different background knowledge and may come up with a different answer.

06 – Exercise Identifying a Foo III

Let's try one last example in this series. In this example, there are four blocks again: three bricks, one at the bottom and two on the sides, not touching each other, and this time a wedge at the top, not a brick. Is this an example of the concept foo?

08 – Incremental Concept Learning

>> That’s good, David. It connects things with our everyday lives.

09 – Variabilization

Let us look at the algorithm for incremental concept learning more systematically and in more detail. This time, imagine that there is an AI program, and there is a teacher who is going to teach the AI program the concept of an arch. Suppose the teacher gives a first example which has four bricks in it: two vertical bricks that are not touching each other, a third brick on top of them, and a fourth brick on top of that. To the AI program, the input may look a little bit like this: there are four bricks, A, B, C, and D, and there are some relationships between these four blocks. Brick C is to the left of brick D. Brick C supports brick B. Brick D supports brick B as well. And brick B supports brick A. This, then, is the input. What may the AI program learn from this one single example? Not very much. From this one single example, the AI program can only variabilize. There were these constants here: brick A, brick B, brick C, brick D. The AI program may be able to variabilize these constants and say, well, brick A is an instance of a brick, and therefore I just have a brick here. Brick B is an instance of a brick, and therefore I'll just have a brick here. So now I can have any brick in these places; as long as these relationships hold, it's an example of an arch. Note that this first example was a positive example. Now we are going to see a series of positive and negative examples, and each time we see an example, the AI program will either generalize or specialize. If it sees a positive example, then it may generalize, if the positive example is not covered by the current concept definition. If it sees a negative example, it may specialize the current definition of the concept to exclude that negative example.
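The first step can be sketched as follows: the positive example is a set of relations over specific bricks, and variabilization replaces the constants with typed variables; the tuple representation is an assumption of this sketch, not the lesson's notation.

```python
# A rough sketch of variabilization: replace specific bricks in the relational
# description of the first positive example with variables of type 'brick'.

example_1 = {('left-of', 'brick-C', 'brick-D'),
             ('supports', 'brick-C', 'brick-B'),
             ('supports', 'brick-D', 'brick-B'),
             ('supports', 'brick-B', 'brick-A')}

def variabilize(example):
    names = {}
    def var(constant):
        kind = constant.split('-')[0]                 # e.g., 'brick'
        names.setdefault(constant, f'?{kind}{len(names) + 1}')
        return names[constant]
    return {(relation, var(a), var(b)) for relation, a, b in example}

concept = variabilize(example_1)
# Each constant is consistently replaced, e.g. 'brick-C' becomes the same
# '?brick...' variable in every relation in which it appears.
```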

11 – Specialization to Require Features

So here now is the current concept definition of an arch that the AI program has. Now the teacher shows a new example, shown here: there are three bricks, but the third brick here is not on top of the first two. This is the input to the AI program for this example, and the teacher tells the AI program that this is not a positive example of an arch. So here is the current concept definition, here is the representation of the input example, and here is the information that this is a negative instance. What may the AI program learn from it? The AI program must refine its current definition of the arch in such a way that the new negative example is ruled out. But how can it do that? One way is to put extra conditions on these links: these support links must be there; they are not optional. We'll call this the require-link heuristic. The require-link heuristic says that if the structure of the representation of the concept and the structure of the representation of the negative example have some things in common, but there are also some differences, then revise the current definition in such a way that the links missing from the negative example become required links.

12 – Specialization to Exclude Features

Let us continue this exercise a little further. Imagine that the teacher gives this as the next example to the AI program. This time again there are three bricks, but the two vertical bricks are touching each other. So here is the representation of this input example: three bricks, with the two vertical bricks supporting the brick at the top; however, the two vertical bricks are touching each other. Recall that this is the current concept definition that the AI program has, with the must-support links here. And here is the representation of the new example, and the AI program knows that this is a negative example. How might the AI program refine, or specialize, its current definition so that this negative example is excluded? Well, the AI program may say that the current definition can be revised in such a way that, for these two bricks, where one is left of the other, the two bricks cannot touch each other. This particular symbol means not. So it is saying that this brick does not touch that one, and we have bidirectional links, because this one cannot touch the other one and that one cannot touch this one. This is called the forbid-link heuristic. So here some particular link, in this case touches, is being forbidden.

13 – Generalization to Abstract Features

Now let us look at examples that are even more interesting than the previous ones. Recall that earlier we were talking about background knowledge. Let's see what role background knowledge plays more explicitly. So imagine this is the fourth example that the teacher gives to the AI program, and this is a positive example. The AI program may have this as the input representation: there are two bricks, this brick is left of the other brick, there's a wedge on top, and the two bricks are supporting the wedge. So now the AI program has this as the current definition, recall the not-touches links here, and this is the new example, which is a positive example. How may the AI program revise its current concept definition to include this positive example? Well, the simplest thing the AI program can do is to replace this brick here, in the current concept definition, with brick or wedge. That makes sure that the new example is included in the definition of the concept. We'll call this the enlarge-set heuristic. This particular set here, which had only brick as an element, now has two elements in it: brick or wedge.

15 – An Alternative Visualization

I hope that the algorithm for incremental concept learning makes sense to you. Here is another way of visualizing that algorithm. Imagine that an AI agent was given a positive example, and the AI agent came up with a concept definition that covers that positive example. Now let us suppose that the AI agent is given a negative example, and this negative example is covered by the current concept definition. In that case, the current concept definition must be refined in such a way that the negative example is excluded, while still including the positive example. So you can visualize a new concept definition which includes the positive example but excludes the negative example. Now let us suppose that the AI agent is given another positive example; in that case the AI agent must revise its definition of the concept so that the new positive example is also included, so that it too is covered. So we may revise the concept definition something like this. And we can repeat this exercise many times. Imagine there is a negative example and the current concept definition covers it. Well, we can refine the definition in such a way that the new negative example is excluded, and so on. We can imagine going through several of these iterations of positive and negative examples. Eventually we'll get a concept definition that covers all the positive examples and excludes all the negative examples. So again, the problem is the same: given a small set of positive and negative examples, the number of dimensions in which the algorithm can do generalization and specialization is very large. How do we constrain the learning in this complex learning space? That's where the heuristics and background knowledge come in. The heuristics guide the algorithm so that it revises the concept definition in an efficient manner, and the background knowledge helps in that process.
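One way to picture this loop in code is the schematic Python sketch below, where covers, generalize, and specialize stand in for the heuristics and background knowledge discussed in this lesson; all of the names here are assumptions for illustration.

    def learn_incrementally(labeled_examples, covers, generalize, specialize):
        # labeled_examples yields (example, is_positive) pairs, one at a time.
        # covers(concept, example) -> bool; generalize/specialize return a revised concept.
        concept = None
        for example, is_positive in labeled_examples:
            if concept is None:
                if is_positive:
                    concept = generalize(None, example)   # e.g. variabilize the first positive example
                continue
            if is_positive and not covers(concept, example):
                concept = generalize(concept, example)    # e.g. drop-link, enlarge-set, climb-tree
            elif not is_positive and covers(concept, example):
                concept = specialize(concept, example)    # e.g. require-link, forbid-link
        return concept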

16 – Heuristics for Concept Learning

Here is a summary of the kinds of heuristics that an AI agent might use in incremental concept learning. We have already come across five of them: require-link, forbid-link, drop-link, enlarge-set, and climb-tree. Here is another one called close-interval; let's look at it briefly. Let's go back to David's example of a child learning about dogs, and suppose that the child has only come across dogs that were very small in size. Now the child comes across a large dog. In that case the child might change the concept definition, expanding the range of values that the dog's size can take so that the larger dog is included. The difference here is that the values involved can be continuous, like the size of a dog, instead of discrete as in the other heuristics.
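A tiny sketch of the close-interval heuristic, using made-up numbers for the dog-size example:

    def close_interval(current_range, observed_value):
        # Widen a continuous attribute's range just enough to cover a new positive example.
        low, high = current_range
        return (min(low, observed_value), max(high, observed_value))

    size_range = (20, 40)                          # assumed sizes (in cm) of the small dogs seen so far
    size_range = close_interval(size_range, 90)    # a large dog is encountered
    print(size_range)                              # (20, 90)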

17 – Exercise Re-Identifying a Foo I

Let us do a series of exercises together. This time, the concept that the AI agent is going to learn about, I'll call it foo. Here is the first example the teacher gives to the AI program. All right, because there is only one example, the only kind of learning that can occur here is variabilization. What do you think will be the values that can go in these boxes here, that will variabilize these four bricks? Initially, they are brick one, brick two, brick three, brick four.

19 – Exercise Re-Identifying Foo II

So how would we reflect the relationship with the concepts on the right?

20 – Exercise Re-Identifying Foo II

>> Now we’ll give you some more examples, some positive, some negative, and we’ll ask you how the AI agent can go about refining its concept definition.

22 – Exercise Re-Identifying Foo III

>> Good job, David.

23 – Exercise Re-Identifying Foo IV

Here is the next example. This is a positive example. How would the current definition be refined?

24 – Exercise Re-Identifying Foo IV

>> That looks good to me, David. And you are right. While humans may have a lot of background knowledge, we have not yet ascribed any background knowledge to the AI agent. So the AI agent might be able to simply say brick or cylinder and nothing more than that.

25 – Exercise Re-Identifying Foo V

With this next example, suppose that the AI agent does have some background knowledge, which tells it that bricks and cylinders are both subclasses of blocks. In that case, how might the AI program refine this current concept definition?

27 – Exercise Re-Identifying Foo VI

Let us consider one last example in this series. Let us suppose the teacher gives this as a negative example of foo. Note: a negative example. How do you think the AI agent may refine this concept to exclude this negative example?

29 – Final Concept of a Foo

So, given the input series of examples, and the background knowledge, this is the final concept definition for foo that this particular AI agent will learn. Notice that there are no must-support links here, because the input series of examples did not require them. Note also that we did not generalize these bricks into something else, or further generalize these blocks into something else, because there was no background knowledge to do that. So the result of learning here depends not just on the input examples, but on the background knowledge that the AI agent has. This method of incremental concept learning differs quite a bit from some of the standard algorithms in machine learning. Often in machine learning, the AI agent is given a large number of examples to begin with, and the learning begins with that large set of examples, where the number of examples could be in the thousands or millions or more. When you have a large number of examples to begin with, one can apply statistical machine learning methods to find patterns of regularity in the input data. But if the number of examples is very small, and if the examples come one at a time, then the learning is incremental. Then it becomes harder to apply those statistical methods to detect patterns of regularity in the input data. Instead, in that case, the algorithm must make use of its background knowledge to decide what to learn and how to learn it.

33 – Final Quiz

So please write in this box once again what you learned from this particular lesson.

34 – Final Quiz

Great. Thank you very much.

11 – Classification

01 – Preview

Today we'll talk about one of the most ubiquitous problems in AI, called classification. Classification is mapping sets of percepts in the world into equivalence classes, so that we can take actions in the world in an efficient manner. These equivalence classes could be learned through incremental concept learning. We'll talk about the nature of these equivalence classes and how they can be organized into a hierarchy of concepts. We'll talk about different kinds of concepts, like axiomatic concepts and prototypical concepts. Given a classification hierarchy, we'll talk about multiple processes for doing the classification, including both bottom-up and top-down processes.

02 – Exercise Concept Learning Revisited

In the previous lesson, we examined some techniques for learning about concepts from examples. But those were simple concepts that we learned from a few examples: concepts like arch or foo, which was our imaginary, hypothetical concept. Real-world concepts can be much more complicated than that. Consider, as an example, the eight animals shown here. Each picture shows a very cute animal. How many of these do you think are birds? Which ones are birds?

03 – Exercise Concept Learning Revisited

>> That was a good answer David, thank you. Note that David was able to classify these eight animals: he decided which of these animals belong to the class of birds and which ones do not. Notice also that he used some criteria to decide on that. He has some notion, it seems, of what a typical bird is and what kind of features a typical bird has. He has some notion of what basic conditions something must satisfy in order to be considered a bird, and if those conditions are not satisfied, he rejects them and says those are not birds.

04 – Classifying Birds

So here are four of the animals that David classified as birds. Let us look at the kind of features that he examined in order to classify whether or not an animal was a bird. He may have used several features, some of which he articulated and others that he may not have articulated: whether an animal has wings, whether it has feathers, whether it has a beak, and so on and so forth. One could add more features here if one wished to do so. We do classification all the time, and AI agents need to do classification all the time as well. Why? Why is classification so ubiquitous?
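As a toy illustration of classifying from such features, here is a small Python sketch; treating these three features as jointly sufficient for birdhood is an assumption made only for this sketch, and the rest of the lesson shows why real concepts are rarely this clean.

    BIRD_FEATURES = {"has_wings", "has_feathers", "has_beak"}

    def looks_like_a_bird(animal_features):
        # Classify an animal as a bird if it has every feature in the set above.
        return BIRD_FEATURES.issubset(animal_features)

    print(looks_like_a_bird({"has_wings", "has_feathers", "has_beak", "flies"}))  # True
    print(looks_like_a_bird({"has_wings", "has_scales"}))                         # False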

07 – Exercise Equivalence Classes

So the next question becomes, given a set of animals, or in general, a set of objects or elements, and a set of percepts for each of those animals, how can we decide what’s a good equivalence class for those animals? Consider, for example, the three animals shown here, eagle, bluebird, and penguin. Let us suppose that we knew that there are six percepts that are important for each one of these animals. Lays eggs, has wings, has talons, and so on. But in order to decide what might be a good equivalence class for these three animals, we first have to decide on what might be the right values for each of these percepts for each animal. So I’m going to ask you to use your background knowledge, to fill in the values of the percepts that applies to each of the animals.

08 – Exercise Equivalence Classes

>> Good, David. You know more about those animals than I do. Imagine that the three animals were given as examples, one after the other. So we are back to incremental concept learning from the previous lesson. One could use the techniques that we learned in the previous lesson to learn some equivalence class, to learn some concept definition from the three animals. But that's not the point here. The point here is not about the learning of the concept; it is much more about the nature of the concepts, and how they get organized relative to each other.

09 – Concept Hierarchies

>> So, one of the benefits of this establish-and-refine approach is that it helps us figure out which variables we need to actually key in on and focus on. So, for example, we saw earlier that eagles are large, but bluebirds are small. So birds can come in different sizes. If we're trying to establish whether something is a bird, or a reptile, or a mammal, we know that mammals also come in different sizes, like tiny mice and large elephants. So size doesn't really impact our decision whether it's a reptile, bird, or mammal. But once we've established it's a bird, and we're trying to decide between eagle and bluebird, for example, we know that size actually can be something that helps us differentiate between these two. So now we'll pay attention to size as a variable that matters. So note that there are several things going on here. On one side, there is knowledge of these different classes. But there's also organization of these different classes in a hierarchy of a particular kind. This organization is very powerful. In some ways, if knowledge is power, so is organization; organization too is power. This organization provides power because the organization tells you what the control of processing should be: establish one node, refine it, establish that node, refine it.
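Here is a minimal Python sketch of that establish-and-refine control of processing over a tiny, assumed hierarchy; the node tests and percept names are illustrative only.

    # Each node has a membership test over a dict of percepts and a list of child classes.
    hierarchy = {
        "vertebrate": {"test": lambda a: a.get("has_backbone", False),
                       "children": ["bird", "mammal"]},
        "bird":       {"test": lambda a: a.get("has_feathers", False),
                       "children": ["eagle", "bluebird"]},
        "mammal":     {"test": lambda a: a.get("has_fur", False), "children": []},
        "eagle":      {"test": lambda a: a.get("size") == "large", "children": []},
        "bluebird":   {"test": lambda a: a.get("size") == "small", "children": []},
    }

    def establish_and_refine(node, animal):
        # Establish the current node; if it holds, try to refine it to a more specific child.
        if not hierarchy[node]["test"](animal):
            return None
        for child in hierarchy[node]["children"]:
            refined = establish_and_refine(child, animal)
            if refined is not None:
                return refined
        return node

    print(establish_and_refine("vertebrate",
                               {"has_backbone": True, "has_feathers": True, "size": "small"}))  # bluebird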

10 – Exercise Concept Hierarchies

Let us return to the exercise that we were trying to do earlier, where we had eagle and bluebird and penguin. And we had features, and we had values for the eagle and the bluebird and the penguin. Well, given these three sets of values for the eagle, bluebird and penguin. And given that bird is a superclass of these three classes. What would be the features that you would put in the bird node in that classification hierarchy?

11 – Exercise Concept Hierarchies

>> That's a good answer, David. But I should quickly note that this idea, that we can decide on the features that should go into a superclass given the features that are shared among the subclasses, works only for certain kinds of concepts. It doesn't work for all concepts. It works for concepts that have a formal nature, as we will see in just a minute.

12 – Types of Concepts

We can think of concepts as lying on a spectrum. On one extreme end are very formal concepts, for which we can define logical conditions that are necessary and sufficient for the concept. We'll examine that in more detail in just a minute. On the other end of the spectrum are less formal concepts, for which it's hard to define necessary and sufficient conditions. Now here are three points on the spectrum: axiomatic concepts, prototype concepts, and exemplar concepts. There can be other types of concepts as well; we're just going to consider these three because they are the most common ones, and we'll look at each one of them in turn. In general, humans find it easier to communicate about axiomatic concepts because they are well defined: there is a set of necessary and sufficient conditions that we all agree on. Examples are mathematical concepts. Humans find it harder to communicate about prototype concepts, but most of the time we do quite well. It's even harder to talk about exemplar concepts like, let's say, beauty or freedom. Similarly, it's easier to teach or program axiomatic concepts into computers. It's much harder to program or teach prototype concepts, and much, much harder to teach or program exemplar-type concepts.

14 – Prototype Concepts

The notion of axiomatic concepts is the classical view in cognitive systems. Here's an alternative view, called prototypical concepts. A prototypical concept is a base concept defined by some typical properties that can sometimes be overridden. An example is a chair. You and I have a notion of a prototypical chair; it may include, for example, that there is a back, there are four legs, and so on. So here might be your and my notion of a prototypical chair: it has a back, it has a seat, it has four legs, and so on. Now I can represent this notion of the prototypical chair in the language of frames. This is something we have come across earlier in the class. A frame has slots and fillers, as you may recall, and we used frames to represent stereotypes. Here we're talking about prototypes, which are very closely related. So the concept is the content we want to represent; the frame is the form in which we can represent it. The notion of the prototypical chair might be: it has four legs, the material is metal, it has a back, it does not have arms, and it is not cushioned. Note that these are the typical properties of a chair. Of course, some chairs need not necessarily satisfy all of these properties; that is why these are not necessary and sufficient conditions. For example, we may come across a chair which is made of wood. We would still consider it a chair, even though it does not strictly satisfy this particular definition. Thus, these properties can be overridden in the case of specific instances. But we still have the basic notion of a prototypical chair, so that we can in fact communicate with each other about what a chair is, despite the fact that we can override these properties. So the relationship between concepts and frames is actually quite close. Recall that when we were talking about frames, we were also talking about inheritance and defaults. The notion of a default in frames is closely connected to the notion of typical properties in concepts. So the chair has this prototypical notion with some typical properties, and we can think of these as default values: by default, we assume the number of legs is four, the material is metal, and so on. Here is a stool, and this stool is a kind of chair, which means that it inherits all the slots and values from the chair frame, except for those that are explicitly different. In this example, it overrides the notion that a chair has to have a back: the stool does not have a back.
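Here is a small Python sketch of the chair prototype as a frame with default slot values, and a stool that inherits from it while overriding one slot; the slot names follow the lesson, but the code itself is only an illustration.

    chair_prototype = {"legs": 4, "material": "metal", "has_back": True,
                       "has_arms": False, "cushioned": False}

    def make_frame(prototype, **overrides):
        # Inherit every default slot from the prototype, then apply any overrides.
        frame = dict(prototype)
        frame.update(overrides)
        return frame

    stool        = make_frame(chair_prototype, has_back=False)   # overrides the back default
    wooden_chair = make_frame(chair_prototype, material="wood")  # overrides the material default
    print(stool)
    print(wooden_chair)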

15 – Exemplar Concepts

So axiomatic concepts were defined by necessary and sufficient conditions, and prototypical concepts by typical conditions. What about exemplar concepts? Exemplar concepts don't even have typical conditions, let alone necessary and sufficient conditions. In the case of exemplar concepts, I can give you examples. Perhaps I can do some implicit abstraction over those examples, but that's about as far as I can go. Consider the example of beauty for a second. Here are four examples of something beautiful: a painting by Van Gogh, a beautiful sunset, a beautiful flower, a beautiful dance, and so on. While I can give examples of the concept of beauty, it's really hard to come up with the typical conditions of beauty. Exemplar concepts are very hard to define, and for that reason they are also very hard to communicate to each other, or to teach to an AI program. Exemplar concepts can be culture specific, sometimes even individual specific.

16 – Order of Concepts

To summarize then, concepts can be of many different kinds, from very formal concepts, like axiomatic concepts, to less formal concepts, like exemplar concepts. Of course, we can go even further toward the informal end than exemplar concepts: philosophers often talk about concepts called qualia, Q-U-A-L-I-A. Qualia are the raw sensations that we may get from our sensors. An example of a quale is bitterness. I'm sure you've come across some bitter fruit at some time or the other, and you can even taste it inside your mouth right now if you wanted to. But it's very hard to communicate what a quale is to anyone else, to convey your notion of bitterness to anyone else.

17 – Exercise Order of Concepts

Let us do an exercise together. On the left again is a spectrum from very formal to less formal. On the right here are six concepts: inspirational, reptile, foo (foo here is the same concept that we came across when we were talking about incremental concept learning), right triangle, holiday, saltiness. Can you rank the six concepts according to the notion of formality that we have studied so far?

18 – Exercise Order of Concepts

>> So part of the point of looking at these different kinds of concepts is that, depending on the kind of concept we are dealing with, we may come up with a different knowledge representation and a different inference method. Let me explain. Suppose we're dealing with concepts like foo or holiday or inspirational. Then the case-based reasoning method might be a very good method for dealing with things of that kind. We may have experience with specific holidays, but we cannot abstract them out into a concept with prototypical conditions. On the other hand, if we were dealing with concepts like a right triangle or a reptile, for which we can define necessary and sufficient conditions, then there are alternative methods available that might be more suitable than case-based reasoning. So instead of thinking in terms of one method that is going to work for all conditions and all concepts, we might want to think in terms of an array of methods, where each method is more or less useful for different kinds of conditions or different kinds of concepts. David, I am sure you recall Mendeleev, the Russian chemist who came up with the basic notion of the chemical periodic table. I'm sure all the students in the class know about the chemical periodic table, which organizes all the elements according to certain properties, like hydrogen, oxygen, calcium, and so on. Now, Mendeleev came up with this notion of a chemical periodic table, and in some sense what we're trying to do in this course is to build a similar kind of periodic table, except that this is a periodic table of the elements of mind. It's a periodic table of the basic, fundamental elements that compose intelligence. Instead of talking about elements and valences and atoms and so on, what we are going to be talking about are methods and representations. So it is as if we are discovering the fundamental knowledge representations, and the organizations and reasoning methods that go with them. Case-based reasoning was one reasoning method that went with certain kinds of concepts that are hard to abstract into conditions, whether typical conditions or necessary and sufficient logical conditions.

19 – Bottom-Up Search

Let us build on this metaphor of the periodic table a little further. Earlier we came across one method of dealing with classification, which we called top-down, or establish-and-refine. In that method we had a classification hierarchy: we start with a concept, establish it, then refine it, and refine it further if needed. That particular control of processing is very well suited for one kind of organization of concepts. It's very well suited for situations where we already know something is, say, a vertebrate, and we're trying to establish whether it's a bird or a bluebird. For a different kind of classification task, a better control of processing is to go bottom-up. Let's look at this a little more carefully. Here are a number of leaf nodes, and the agent knows something about the value for each of these leaf nodes. The task is to make a prediction at the root node. So in this particular case, imagine the task of the AI agent is to predict the value of the Dow Jones Industrial Average tomorrow. It would be great to have an AI agent like that; if we had a good AI agent like that, you and I could both become very rich. Now how could this AI agent make a prediction about the Dow Jones Industrial Average tomorrow? Well, one way in which it could do it is to look at the information it has about the GDP, the inflation, and the employment today. But how does it know the values of GDP, inflation, or employment today? Well, it can then look at the values of the overtime hours, the consumer sentiment index, the new orders index, and so on and so forth. Now the processing is largely bottom-up. We know something about the values of the features that go into this concept, and this concept, and this concept. We can abstract them and find the value of the GDP, and similarly for these, and then abstract further. So the control of processing in this particular case we might call identify-and-abstract: identify, then abstract. This is bottom-up control of processing, rather than the top-down processing of the previous case. We have just defined two more elements of our growing periodic table of intelligence. In this latter element, bottom-up classification, the conditions of application are different.
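A sketch of that identify-and-abstract, bottom-up control of processing for the Dow Jones example follows; the tree, the leaf values, and the averaging rule are all illustrative assumptions, not real economics.

    tree = {
        "dow-jones":  ["gdp", "inflation", "employment"],
        "gdp":        ["overtime-hours", "consumer-sentiment", "new-orders"],
        "inflation":  [],
        "employment": [],
    }
    leaf_values = {"overtime-hours": 0.6, "consumer-sentiment": 0.8,
                   "new-orders": 0.7, "inflation": 0.4, "employment": 0.5}

    def abstract_up(node):
        # Identify the values of the children, then abstract them into the parent's value.
        children = tree.get(node, [])
        if not children:
            return leaf_values[node]
        return sum(abstract_up(child) for child in children) / len(children)

    print(abstract_up("dow-jones"))   # a prediction abstracted upward from the leaf percepts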

21 – Wrap Up

So today we've talked about classification, which is one of the biggest problems in AI. We started by revisiting incremental concept learning and reminding ourselves how it allowed us to take examples and abstract away a concept. We then looked at the idea of equivalence classes and how we can group sets of percepts into equivalence classes to establish a particular instance of a concept. Within this are hierarchies of concepts, such as the animal kingdom, where animals can be organized into vertebrates, birds, and penguins. We then discussed the idea of different types of concepts, like axiomatic or exemplar concepts, and how each of them has different definitions and different affordances. Finally, we discussed bottom-up search, where instead of establish-and-refine, we look at the lower-level variables and abstract up from them. Next, we're going to move on to logic, which is a little bit unrelated to this. But if you're interested in classification, you can look ahead to our different lessons on design, such as diagnosis and configuration; they're going to heavily leverage our idea of classification.

22 – The Cognitive Connection

One could say a lot about the connection between classification and cognition. This is because classification is ubiquitous in cognition. You're driving a car on the street, you see a friend driving his car, you take a look at the car, and you see a Porsche: classification. You run a computer program, and the output is faulty. You look at the output, decide on the bug, you name the bug: classification. You go to a doctor with certain signs and symptoms, and the doctor names a disease category: classification. The reason classification is so ubiquitous is that it allows us to select actions. Once the doctor knows what the disease category is, he can suggest a therapy. Once you know what the bug is, you can decide on a repair for that bug. If action selection indeed is a very productive characterization of intelligence, then we can see why classification is central to cognition.

23 – Final Quiz

All right. Please write down what you learned in this lesson in this box for us to peruse later.

24 – Final Quiz

And thank you for doing it.

12 – Logic

01 – Preview

Today we'll discuss logic. Logic is a formal language that allows us to make assertions about the world in a very precise way. We learn about logic both because it is an important topic and also because it forms the basis of additional topics such as planning. We'll start by talking about a formal notation for writing sentences in logic. This formal notation will have things like conjunctions and disjunctions. Then we'll talk about truth tables, a topic that you probably already know. We'll talk about rules of inference like modus ponens and modus tollens. Finally, we'll discuss methods for proving theorems, such as proof by refutation. One of those methods is called resolution theorem proving.

04 – Exercise Inferences About Foos

Before we represent those sentences in the language of logic, let us consider another example of conceptual knowledge and its relationship to logic first. So here is the concept of foo; we have come across this earlier. There is a brick at the bottom and a brick at the top, with a block and a block in between, and some relationships between these objects. Given this conceptual knowledge about foo, we can ask ourselves: what are the sufficient conditions for something to be a foo? Here are several choices. Please mark all of those choices that together make for sufficient conditions for the concept of foo.

05 – Exercise Inferences About Foos

>> That's good David. So what we are finding here is, given conceptual knowledge, how we can translate it into the language of logic.

06 – Predicates

Recall that we said that a logical AI agent will have two parts to it: a knowledge base and the rules of inference. We'll come to the rules of inference a little later. First let us look at how we can construct a knowledge base in the language of logic. So what we are saying is that an AI agent has some knowledge about the world, and it is going to express it in the scheme of logic. In earlier schemes of knowledge representation, we discussed how there were objects and relationships between objects, and any knowledge representation scheme needs to capture both objects and the relationships between those objects. Logic has a particular way of doing this. We'll define something called a predicate, which is a function that maps object arguments to either true or false. Let us consider an example. Here we have bluebird as the object and feathers as the predicate on this object. Feathers is now a function that can map either into true or into false: either the bluebird has feathers or the bluebird doesn't have feathers. In this particular case, feathers of bluebird would be true, because bluebirds do have feathers. Now, just like we had bluebird as the object in the previous example, here we have animal as the object, with the same predicate. Of course, not all animals will have feathers, so this particular predicate may be true or false depending on the choice of the animal. In this next sentence there are two predicates and one object, still the animal, but there is a predicate feathers and a predicate bird, and the sentence captures a relationship between the two predicates: if feathers of the animal is true, then bird of the animal is also true. If the animal has feathers, then the animal is a bird. In logic we say sentences like this have an implication; this is an implicative relationship. So in logic, we'll read this as Feathers(animal) implies Bird(animal), or: if the animal has feathers, then it implies that the animal is a bird.

07 – Conjunctions and Disjunctions

Now, consider another sentence that we have come across earlier: if an animal lays eggs and it flies, then it is a bird. How do we write this in the language of logic, given that there is a conjunction here? This time we again have two predicates: the predicate lays-eggs, coming from here, and the predicate flies, coming from here. And we can denote a conjunction between them, which in the language of logic is often written in this form. Now we can rewrite the sentence in the following form: if the animal lays eggs and the animal flies, then the animal is a bird. Remember, this symbol here is really denoting an implication for now. Consider a slightly different sentence. Suppose the sentence was: if an animal lays eggs or it flies, it is a bird. In that case, again, we'll have two predicates, but this time we'll have a disjunction between them, and the sentence would become: if the animal lays eggs or the animal flies, then the animal is a bird. Again, this is an implication. Let us continue with our exercise in which we are learning how to write sentences in the language of logic. Consider the sentence: if an animal flies and is not a bird, then it is a bat. So there is a negation here. How do we write that in logic? Let us start with the antecedent of this particular sentence: the animal flies is one conjunct, because there is an and here, and we have the negation symbol on the predicate bird for the other. Now we can write the complete sentence: the animal flies, and the animal is not a bird, implies the animal is a bat.
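To see these sentences as something executable, here is a small Python sketch; treating each implication as a definition of its consequent is a simplification made only for this illustration, and the attribute names are assumptions.

    def lays_eggs(animal): return animal.get("lays_eggs", False)
    def flies(animal):     return animal.get("flies", False)

    def bird(animal):
        # "If an animal lays eggs and it flies, then it is a bird" (read here as a definition)
        return lays_eggs(animal) and flies(animal)

    def bat(animal):
        # "If an animal flies and is not a bird, then it is a bat"
        return flies(animal) and not bird(animal)

    bluebird = {"lays_eggs": True, "flies": True}
    print(bird(bluebird), bat(bluebird))   # True False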

08 – Implies

Now, I have been talking a little about implication. Let's see how we actually write implication in logic. So here is a sentence: if the animal lays eggs and the animal flies, the implication is that the animal is a bird. In logic we write this using the arrow symbol for implication: if the animal lays eggs and the animal flies, that implies the animal is a bird. So here is the left-hand side of the implication, and here is the right-hand side; the left-hand side of the implication implies the right-hand side.

09 – Notation Equivalency

Generally speaking, you won't have these symbols on your keyboard. You can find them in your character map, and you are welcome to use them if you'd like to. But for the exercises in the rest of this lesson and in the next lesson, feel free to use the symbols given over here. These are the symbols for AND, NOT, OR, and EQUALS that come from Java or Python. So feel free to use these when you are doing the exercises that you'll come across in the rest of this lesson.

10 – Exercise Practicing Formal Logic

So remember we are still trying to learn how to build a knowledge based on the language of logic. To put it all together, consider four exercises. Here is the sentence. Please put it in the language of logic. Similarly for this sentence, this sentence, this sentence.

11 – Exercise Practicing Formal Logic

>> Good, David, that looks right to me. To wrap this part up, let us note that when we defined what a predicate was, we said a predicate like flies can map into true or false. Okay, so a predicate can map into true or false. What about complicated sentences like this, which have multiple predicates as well as implications? How do we find out whether the sentence as a whole maps into true or false? That's what we're going to look at next: truth tables.

12 – Truth Tables

So we'll now build truth tables for conjunctions, disjunctions, and negations of sentences, so that we can find the truth of complex sentences stated in logic. Many of you are probably familiar with truth tables, and if you are, then you can skip this part and go directly to implication elimination. If you're not familiar with this, then please stay with me, but even so I'm going to go through this quite rapidly. So here is the truth table for A or B. If A is true and B is true, then A or B is true. If A is true and B is false, then A or B is still true, because A was true. If A is false and B is true, then A or B is true, because B was true; one of them being true makes this true. If A is false and B is false, then A or B is false.
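If you want to check these tables mechanically, here is a short generator; it is only a sketch and is not required for the lesson.

    from itertools import product

    def truth_table(expression, variables):
        # Print the value of the expression (a function of booleans) for every assignment.
        for values in product([True, False], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            print(assignment, "->", expression(**assignment))

    truth_table(lambda A, B: A or B, ["A", "B"])   # the table for A or B described above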

13 – Exercise Truth Tables I

Let us try a couple of simple exercises. So here we have A, B and we want to find a truth value of A or not B. Given these values for A and B, can you please write down the truth values for A or not B. And similarly, for not A and not B

14 – Exercise Truth Tables I

>> So for A or not B, I got that if A is ever true, then this has to be true, because it’s A or not B. When A is false the negation flips the value of B, so it makes it true when B is false, but keeps it false when B is true. For not A and not B, that means that any time either A or B is true, then this is all false. So when A is true, this is false. When B is true, this is false. When both are false, this becomes true, because those negations flip the values of both A and B.

15 – Exercise Truth Tables II

Now we can play the same game for ever more complex sentences. So here I have, again, three predicates, A, B and C. And here's a more complicated sentence that involves all three of those predicates: A or B, and, within parentheses, B and not C. And we can find the truth values for this particular sentence, given the truth values for the predicates A, B and C. Why don't you give it a try and write down the values here?

18 – Exercise Commutative Property

>> That's good, David. And as you know, this property is called the commutative property. The commutative property says that the truth value of A and B is the same as the truth value of B and A. So whenever I have A and B, I can rewrite it as B and A.

19 – Exercise Distributive Property

Let us try a slightly more complicated exercise. This time, we have three variables, A, B, and C. And here are the combinations of the truth values of A, B, and C. Here on the right are two formulas. The first one says, A and parenthesis B or C parenthesis closed. The second says parenthesis A and B parenthesis closed or parenthesis A and C parenthesis closed. Please write down the truth values for these two formulas.

20 – Exercise Distributive Property

>> We can also think of this as distributing the predicate and the operator outside the parentheses onto both of the terms inside. We take the A and, and apply it to B, giving A and B. We take the A and, and apply it to C, giving A and C. And we preserve the operator that was in between B and C, in between the two new parentheses. So if this had been A or, in parentheses, B and C, it would become, in parentheses, A or B, and, in parentheses, A or C: the or is distributed onto B and C, and the and between them is preserved.

21 – Exercise Associative Property

Let us do one more exercise with truth tables to illustrate another property of logical predicates. Again, here are three predicates, and here are two formulas. It should be a simple exercise. Please write down the truth values of the two formulas in these boxes.

22 – Exercise Associative Property

>> The difference between these formulas and the ones we were doing before is in the operators. The associative property works when both operators are ors or both are ands; the distributive property applied when there was a mixture of operators.

23 – Exercise de Morgans Law

One other property of logical predicates that we will see very soon in action is called de Morgan’s law. So this time there are two predicates A and B. Here are their truth values. And here are two formulas. Remember this is a negation. Please write down the truth values of these two formulas in these boxes.
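A quick mechanical check of De Morgan's law, as a small sketch; it simply enumerates every assignment of the two predicates.

    from itertools import product

    for A, B in product([True, False], repeat=2):
        assert (not (A and B)) == ((not A) or (not B))   # not (A and B)  ==  not A or not B
        assert (not (A or B))  == ((not A) and (not B))  # not (A or B)   ==  not A and not B
    print("De Morgan's law holds on every row of the truth table")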

25 – Truth of Implications

>> So it can be a little bit weird to talk about the truth value of an implication sentence. What we're really saying here is whether or not the implication actually holds. So let's take three different implications to see this. First let's think of the implication, feathers implies bird. All birds have feathers and only birds have feathers. So we know that if an animal has feathers, then it is a bird; that implication is true. On the other hand, let's take the implication, scales implies bird. Lots of animals with scales aren't birds, and in fact no animals with scales are birds. So the implication scales implies bird would be false. For our third example, let's take the implication, flight implies bird. If we have a penguin, flight is false, but the penguin is still a bird. So flight can be false and bird can still be true, meaning the implication can still be true here. On the other hand, if we have a cat, flight is false and bird is false, and the implication is still true. So in this case, if flight is false, we can't actually make a determination on whether or not the animal is a bird.

26 – Implication Elimination

As we go ahead and start applying rules of inference to sentences in a knowledge base, we'll find it convenient to rewrite the sentences in the knowledge base, and sometimes it will be very useful to rewrite them in a manner that eliminates the implications in a sentence. This is how we can eliminate the implication: if a implies b, then we can rewrite it as not a or b. We know this because the truth value of a implies b is exactly the same as the truth value of not a or b. Let us take an example. Suppose we are given feathers implies bird. Then we can rewrite this as not feathers or bird. Intuitively, you can see the truth value of this: either the animal does not have feathers, or it is a bird. In a little bit, we will see that this is an important rewrite rule for doing certain kinds of logical proofs.
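A quick check of this rewrite rule, as a sketch: an implication fails only when its antecedent is true and its consequent is false, which is exactly when not a or b is false.

    from itertools import product

    def implies(a, b):
        # An implication is false only when the antecedent holds and the consequent does not.
        return not (a and not b)

    for a, b in product([True, False], repeat=2):
        assert implies(a, b) == ((not a) or b)   # implication elimination: a -> b  ==  not a or b
    print("a -> b and (not a or b) agree on every assignment")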

27 – Rules of Inference

>> You may already be familiar with this line of reasoning, because this is another way of phrasing the contrapositive that we see in other areas of logic.

28 – Prove Harry is a bird

Now you can see how we apply these rules of inference to sentences in the knowledge base of a logical agent to prove all kinds of sentences. So imagine that an AI agent begins with the knowledge that if an animal has feathers, it implies that the animal is a bird. Now it comes across Harry, who does have feathers. By modus ponens, therefore, the AI agent can conclude that Harry is a bird. This completes the proof for our original goal of proving that Harry is a bird. Now let us suppose that the goal is to prove that Buzz does not have feathers. Once again, imagine an AI agent which begins with the knowledge that if an animal has feathers, it implies that the animal is a bird. The agent comes across an animal which is not a bird. Then by modus tollens it can infer that Buzz must not have feathers. This completes the proof of our original goal of proving that Buzz does not have feathers. Okay. So now we have looked at two ways of proving the truth value of various sentences. The first way was through truth tables: I could have sentences in logic, then write another sentence and ask myself, what is the truth value of this sentence? I could construct a truth table for that sentence, composed of the truth values of all the predicates, some of which might be coming from earlier sentences. The second way in which we have seen how we can prove the truth values of sentences in logic is by applying rules of inference like modus ponens and modus tollens. This is very powerful, and in fact the power of this logic has been known since before the birth of AI. As computer scientists, however, we'll analyze this power in a slightly different way. Yes, we can use the method of truth tables to construct a truth table for any arbitrary sentence. However, if the sentence is complicated, then the truth table very soon becomes very complex; computationally, that is infeasible for very long, large sentences. Similarly, yes, we can simply apply modus ponens and modus tollens to find the truth value of many sentences. But if the knowledge base consists of a very large number of sentences, instead of just one or two, then the number of inferences I can draw from those sentences simply by applying modus ponens and modus tollens will be very large. Or if I had to find the truth value of a single sentence, then the different pathways I could take in order to get to the truth value of that sentence can make for a very large problem space. So while these methods of proving the truth value of sentences in logic have been around for a long time, these methods are not computationally feasible, at least not for complex tasks, at least not for agents that have only limited computational resources and from whom we want near real-time performance.
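A tiny sketch of repeatedly applying modus ponens over a knowledge base of rules and facts; the proposition strings are made up for the Harry example, and modus tollens would reason in the opposite direction.

    # Each rule is (set of antecedent propositions, consequent proposition).
    rules = [({"feathers(Harry)"}, "bird(Harry)")]
    facts = {"feathers(Harry)"}

    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)   # modus ponens: the antecedents hold, so infer the consequent
                changed = True
    print(facts)                        # {'feathers(Harry)', 'bird(Harry)'}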

30 – A Simple Proof

Okay, let us set aside predicate calculus and return to propositional logic. Recall that we had found ways of writing sentences in propositional logic. We had found rules of inference; we could prove theorems; we could find the truth value of new sentences. However, we found that those methods were computationally not very efficient. So AI has developed more efficient methods. One of those methods is called resolution theorem proving. Let us take an example to illustrate how resolution theorem proving works. So imagine there is a robot, and this robot is working on an assembly line: it's a factory robot, and on the assembly line come various kinds of widgets. The robot's task is to pick up each widget as it comes on the assembly line and put it in a truck. However, there are some humans in this factory who play a joke on the robot once in a while: they glue the widget to the assembly line belt, so that when the robot tries to move it, it cannot move it. But the robot is a smart robot, a logical agent, so when it cannot move the widget, it uses its logical reasoning to figure out that the box isn't liftable. And the moment it knows that the box isn't liftable, it lets go of the box and moves on to the next one. Everyone got the story? All right. So let us suppose that the robot begins with some knowledge in its knowledge base, and this knowledge says that if it cannot move something, then that thing is not liftable. Now it tries to move the next box on the assembly line. Its percepts tell it that it cannot move the box. It needs to prove that the box is not liftable. Of course, this is a simple example, and you can see that one could essentially apply one instance of modus ponens to prove that it's not liftable: if p then q, and p, therefore you can infer q. But we'll use this example to show how resolution theorem proving works. The first step in resolution theorem proving is to convert every sentence into conjunctive normal form. A sentence in conjunctive normal form can have one of three things: a literal, which can be either a positive atom or a negative atom; a disjunction of literals, like can-move or not liftable here; or a conjunction of disjunctions of literals. In this example the third case doesn't occur. So the first thing we must do is take the first sentence, not can-move implies not liftable, and remove the implication, because an implication cannot occur in conjunctive normal form. We use the implication-elimination rewrite rule to rewrite this in the form can-move or not liftable. Remember, alpha implies beta becomes not alpha or beta; the negation of not can-move becomes can-move, so we get can-move or not liftable. So we have done it for the first sentence; this is now in conjunctive normal form. We can do the same thing for the second sentence, but wait, the second sentence is already in conjunctive normal form, so we don't have to do anything. Now, the robot wants to prove that the box is not liftable. Resolution theorem proving is proof by refutation: to do proof by refutation, we take the negation of what we want to prove. We wanted to prove not liftable, so we take its negation, which makes it liftable. Okay, so now we have three sentences.
This first sentence is the one the robot was bootstrapped with, which we've just converted to conjunctive normal form. The second is the sentence that came from its percepts: it saw that the box cannot move. And the third is the sentence from the negation of what the robot wants to prove. So we have three sentences now. The first sentence came from the bootstrapping of the robot's knowledge base; this is the axiom that the robot assumes to be true. The second sentence came from its percepts: the robot tried to move the box and could not move it. The third sentence comes from taking the negation of what the robot wants to prove. It wants to prove not liftable, so it takes the negation of that, and will show that this leads to a null condition, which we view as a contradiction. Resolution theorem proving always begins with the literal in the sentence that we want to prove. Here that literal is liftable, and we look for a sentence that contains the negation of liftable. Sentence S1 contains not liftable, which is the negation of that, so we pick S1 and not S2. Note how efficient it was to decide which sentence in the knowledge base to go to: the one containing the negation of liftable. Now, liftable and not liftable cannot both be true; we know that, and therefore we can eliminate them. This is called resolution: we resolve on liftable and remove both literals from the sentences. In sentence S1, that leaves us with can-move. So now we pick a sentence that has the negation of the literal can-move. Sentence S2 has the negation of that, and we can resolve on can-move, because they cannot both be true. When we resolve on them, they get eliminated as well. And now we see we've reached the null condition. This null condition represents a contradiction, and now we can infer that liftable cannot be true; therefore not liftable is true. The robot has proved not liftable. In this case it may appear that resolution theorem proving is more complex than modus ponens. In general it is not; it just appears that way here because this example happened to fit the form of modus ponens directly. In general, deciding which sentences to apply modus ponens to, and how to combine those chains of inferences, quickly becomes much harder than deciding how to apply resolution theorem proving.

31 – A More Complex Proof

Let us make this example a little more complicated, complicated enough that it cannot be proven simply by applying one instance of modus ponens. Imagine that the robot has proved to itself that this box is not liftable, and the humans in the factory, who are trying to make fun of the robot, say to it: well, maybe the reason you cannot move it is not that it's not liftable, but that your batteries aren't working. So now the situation is more complicated; the robot must also check its battery. So now the robot begins with a slightly different knowledge base. Suppose the knowledge in this knowledge base is: if it cannot move the box and the battery is full, then the box is not liftable. It finds from its percepts, again, that it cannot move the box. So it checks its battery and finds that the battery is full. So there are two new sentences that get written into the knowledge base, and now the knowledge base contains three sentences. As earlier, in resolution theorem proving the agent must convert all the sentences in its knowledge base into conjunctive normal form. That means that these sentences can contain a literal, a disjunction of literals, or a conjunction of disjunctions of literals. So we begin by removing the implication from sentence one, because an implication cannot occur in conjunctive normal form. When we remove the implication from the first sentence, we get this sentence. This sentence is not yet satisfactory; it is not in conjunctive normal form, because it contains the negation of a conjunction inside a disjunction, and what we want is a disjunction of literals. So we apply De Morgan's law, and now we get the following sentence. We are simply taking the negation inside, which flips the conjunction into a disjunction, and now we have three literals connected by disjunctions; this disjunction of literals is in conjunctive normal form. So now we have in the knowledge base three sentences, all three of them in conjunctive normal form, either literals or disjunctions of literals. Recall that the robot wanted to prove not liftable. So it takes the negation of that; this is, again, proof by refutation, so it considers liftable. Now the knowledge base has four sentences, the fourth coming from the negation of what the robot wants to prove. Once again, the reasoning begins with the literal that it wants to prove, in this case liftable, and it finds a sentence which contains the negation of that literal. So once again we begin with sentence S4, because that is what we want to prove, and we find a sentence in the knowledge base which contains not liftable, the negation of the liftable in sentence S4. We resolve on this, because they cannot both be true; resolution here means that we drop them.
Now, in the sentence S1 that is currently under consideration, we have two literals left. We can begin with either one of them; let us begin with not battery-full. We try to find a sentence which contains the negation of this particular literal. That is sentence S3, which contains battery-full, so we resolve on battery-full and not battery-full, because they cannot both be true, and we drop them. Now in sentence S1 we're left with just one literal, can-move. We try to find a sentence in the knowledge base which contains its negation: here it is, sentence S2. So we resolve on them and drop them, and once we drop them, we have the null condition, which stands for a contradiction. So we have reached a contradiction; therefore the assumption that this box was liftable cannot be true, and not liftable is true. We have just shown that resolution theorem proving in this case proves what we wanted. One important point to notice here is the focus of attention that comes from the conjunctive normal form. Often, when the problem space is very complex, for example when the number of sentences is very large or the sentences are very complex, it can become really hard for the logical agent to decide what to focus on. But because we have converted everything into conjunctive normal form, and because resolution theorem proving makes use of resolution, at any particular time the logical agent knows exactly what to focus on: you always begin with a literal, you always try to find a sentence which contains its negation, you always resolve on that, you take the remaining literals in the sentence, and you proceed forward. This focus of attention, this computational efficiency of resolution theorem proving, arises because of what are called Horn clauses. A Horn clause is a disjunction that contains at most one positive literal. That is the case in S1: it is a disjunction that contains at most one positive literal; this is a negative literal, and this is a negative literal. And the fact that it contains just one positive literal is a very powerful idea, because that's where the focus of attention comes from.
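Both robot proofs can be followed mechanically with a small propositional resolution sketch in Python; the clause representation (frozensets of literal strings, with '~' for negation) is an assumption made for illustration, not the lesson's required notation.

    def negate(literal):
        return literal[1:] if literal.startswith("~") else "~" + literal

    def resolve(clause_a, clause_b):
        # Return every resolvent of the two clauses (drop a complementary pair of literals).
        resolvents = []
        for literal in clause_a:
            if negate(literal) in clause_b:
                resolvents.append((clause_a - {literal}) | (clause_b - {negate(literal)}))
        return resolvents

    def refutation_succeeds(clauses):
        # Keep resolving until the empty clause (a contradiction) appears or nothing new can be added.
        clauses = set(clauses)
        while True:
            new = set()
            for a in clauses:
                for b in clauses:
                    if a is b:
                        continue
                    for resolvent in resolve(a, b):
                        if not resolvent:
                            return True
                        new.add(frozenset(resolvent))
            if new <= clauses:
                return False
            clauses |= new

    # The simple proof: S1 = can-move or not liftable, S2 = not can-move, plus the negated goal.
    simple_kb = [frozenset({"can-move", "~liftable"}), frozenset({"~can-move"}),
                 frozenset({"liftable"})]
    # The more complex proof: S1 gains not battery-full after implication elimination and De Morgan's law.
    complex_kb = [frozenset({"can-move", "~battery-full", "~liftable"}), frozenset({"~can-move"}),
                  frozenset({"battery-full"}), frozenset({"liftable"})]
    print(refutation_succeeds(simple_kb), refutation_succeeds(complex_kb))   # True True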

13 – Planning

01 – Preview

Today we’ll talk about planning. Recall that we had said an intelligent agent maps perceptual histories into actions. Planning is a powerful method for action selection. We’ll use the syntax we learned from logic last week to set up the specification of goals and states, and operators, for moving between states and achieving goals. We’ll see that when there are multiple goals, there can be conflicts between them. We’ll describe a specific technique called partial-order planning that avoids conflict between multiple goals. Finally, we’ll talk about a representation called hierarchical task networks that allows us to make complex hierarchical plans.

02 – Block Problem Revisited

In order to look at planning in detail, let us consider this problem that we have encountered earlier. This is a blocks world, in which there is a robot which has to move the blocks from the current state to the goal state. The robot has only one arm, so it can move only one block at a time. It can move only a block which does not have some other block on top of it. Earlier, when we considered this problem, we looked at weak AI methods like means-ends analysis. Now we're going to look at more systematic, knowledge-based methods. One question left unanswered in our previous discussion was, how can the agent decide which goal to select among the various goals that are here? We simply said that the agent might select a goal. Now we will look at how the agent can in fact reason about its goals and select the right goal to pursue.

03 – Painting a Ceiling

Let us consider a slightly more realistic problem. Imagine that you were to hire a robot, and the task of the robot was to paint the ceiling in your room and also to paint the ladder. So there are two goals here, paint the ladder and paint the ceiling, and note that the two goals are in conflict: if the robot paints the ladder first, the ladder will become wet, and the robot cannot climb on it in order to paint the ceiling. So the robot must first paint the ceiling, then climb down, then paint the ladder, and everything is okay. You would expect a human to get this almost immediately; you probably got it almost immediately. You have to paint the ceiling first, before you paint the ladder. Of course, every time I've hired construction workers, they always paint the ladder first. Then they go and take a break, and I have to pay them anyway. We'll accept that kind of behavior from human construction workers; we would not accept it from robot construction workers. The robots must be intelligent; they must know how to prioritize the goals. Well, in order to reason about these goals, the robot first must be able to represent them. So how can we represent the goal of painted ceiling? Now that we have learned about propositional logic, here is a proposition that can represent painted ceiling. This is the object, and this is the predicate on it.

04 – Exercise Goals

So in this box, please write down the second goal of painting the ladder in propositional form. And having done that, in this box, write down how would we represent the two goals as a conjunction.

05 – Exercise Goals

>> Let’s move on then.

06 – States

So we just talked about goals and the goal state. In order to specify the problem fully, we need to specify not only the goal state but also the initial state. So let's do that. Let us suppose that the initial state in this world is that the robot is on the floor. Note how I'm writing this. On is a predicate that takes two arguments, and I'm reading this as Robot On Floor. So the Robot is on the Floor, the Ladder is Dry, and the Ceiling is Dry. This is the initial state, and this is the goal state. Now we have fully specified the problem. Let's now ask, how can the robot come up with a plan?
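As a rough sketch of how such states might be encoded, the snippet below represents a state as a set of ground propositions and checks whether a goal is satisfied. The Python encoding is an illustrative assumption; only the predicate names follow the lecture's notation.

```python
# A small sketch: states as sets of ground propositions.
# The propositions mirror the lecture's notation; the encoding is illustrative.

initial_state = {
    ("On", "Robot", "Floor"),
    ("Dry", "Ladder"),
    ("Dry", "Ceiling"),
}

goal_state = {
    ("Painted", "Ceiling"),
    ("Painted", "Ladder"),
}

def satisfies(state, goal):
    """A goal is satisfied when every goal proposition holds in the state."""
    return goal <= state

print(satisfies(initial_state, goal_state))   # False: nothing has been painted yet
```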

07 – Exercise States

Now that we have learned to specify the initial state of the world and the goals of the world, let us do an exercise in specifying other states of the world. So please write down in this box the state of the world that would occur after the robot is on the ladder and the ceiling has been painted.

10 – Exercise Operators

Now that you have learned how to specify an operator, such as climb-ladder, let us do some exercises about how to specify other operators, like descend-ladder, paint-ceiling, and paint-ladder. In these boxes, please write down the precondition and the postcondition in the same notation.

12 – Planning and State Spaces

>> That's good, David. And to build on David's example, what do we actually do when we have to plan a navigation route to go from one location to another location in an urban area? We use knowledge of the goal. The goal tells us what turn to take at every intersection: we want to take a turn that helps us get closer to the goal. So one thing we are learning here is that there are different kinds of knowledge. There is knowledge about the world: the intersections and the turns, the states and the operators more generally. There is also tacit knowledge about how to do the operator selection, how to select between the different turns at any intersection. This knowledge is tacit, and is sometimes called control knowledge. Goals provide us with the control knowledge for deciding how to select between different operators. Let us recall how means-ends analysis works, and how goals provide control knowledge there. In means-ends analysis, a heuristic method, we would compare the current state and the goal state and enumerate the differences between them. Then we would select the operator that would help us reduce the largest difference between the current state and the goal state. That is one way of using goals as control knowledge to select between operators. Planning provides more systematic methods for selecting between different operators. So the real problem now becomes how to do operator selection, which is the same problem as how to do action selection. Recall that we have been talking about intelligent agents. We defined intelligent agents as agents that map perceptual histories into actions. Action selection was a key problem, a central problem. This is where planning is central, because it deals directly with action selection, or with operator selection. Operators are simply mental representations of the actions that we carry out in the world. So let us look at what a plan might look like in the language we have been developing for planning. A plan might look like this: here is the initial state, and a set of successor states, a series of states punctuated by operators that transform one state into another. Here we have expanded this operator, paint-ceiling, to specify its preconditions and postconditions, and several things are noteworthy here. Note that the preconditions of this operator exactly match the predecessor state. So we have On Robot Ladder here, and we have On Robot Ladder here. Some assertions about the world are true here, and those assertions match the preconditions, which is why this operator is applicable. Similarly, the postconditions of this operator directly match the assertions about the world in this successor state. So we have Painted Ceiling here, and Painted Ceiling there. There is Not Dry Ceiling here, and Not Dry Ceiling here. So this provides a very precise way of specifying the states, and the operators, and the exact connections between them.
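Here is a small, hypothetical Python sketch of that idea: an operator carries preconditions and postconditions, it is applicable only when its preconditions hold in the current state, and applying it asserts its postconditions while propagating everything it is silent about. The ("not", ...) encoding of negated facts is purely an illustrative choice.

```python
# A sketch of operators with preconditions and postconditions, and of how an
# operator transforms one state into the next. Encoding is illustrative.

PAINT_CEILING = {
    "name": "paint-ceiling",
    "preconditions": {("On", "Robot", "Ladder"), ("Dry", "Ceiling")},
    "postconditions": {("Painted", "Ceiling"), ("not", "Dry", "Ceiling")},
}

def applicable(operator, state):
    """An operator applies only when all of its preconditions hold in the state."""
    return operator["preconditions"] <= state

def apply_operator(operator, state):
    """Assert the postconditions; everything the operator is silent about persists."""
    successor = set(state)
    for effect in operator["postconditions"]:
        if effect[0] == "not":
            successor.discard(effect[1:])          # the negated fact no longer holds
            successor.add(effect)
        else:
            successor.discard(("not",) + effect)   # drop any earlier negation
            successor.add(effect)
    return successor

state = {("On", "Robot", "Ladder"), ("Dry", "Ceiling"), ("Dry", "Ladder")}
print(applicable(PAINT_CEILING, state))        # True: preconditions match the state
print(apply_operator(PAINT_CEILING, state))    # Painted(Ceiling) added, Dry(Ceiling) removed
```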

13 – Planning

Let us return to means-ends analysis for just another minute, just to see how means-ends analysis might try to work on this problem and get into difficulties. So this is the goal state, painted ladder and painted ceiling, and this is the initial state. Now means-ends analysis may enumerate the operators that have to do with the painted ladder and the painted ceiling. Here the operator might be paint-ladder. Here the operator might be paint-ceiling, but that requires a precondition, climbing up the ladder, which is not satisfied in the initial state. So means-ends analysis picks the goal painted ladder and selects the operator paint-ladder, which gets it to this state. This is a heuristic method. Here is paint-ladder specified at the right, and you can see the preconditions of paint-ladder match the initial state, and the postconditions match the successor state. Now that means-ends analysis has achieved the first goal of painted ladder, it may turn to the second goal of painted ceiling. Recall that this is the current state. So means-ends analysis may pick the operator climb-ladder, because climbing the ladder is a precondition for the operator paint-ceiling. But note what happens: a precondition of climb-ladder conflicts with a postcondition of paint-ladder. This is not dry ladder, and this requires dry ladder. There is a conflict here. In a situation like this, the robot would need to just wait for the ladder to become dry again before climb-ladder is applicable. So it seems as if the people who are sometimes hired for working on a home are using means-ends analysis. They first paint the ladder, then they go wait until the ladder dries up, and then they of course expect me to pay them for their time. To summarize, we have a plan for achieving one of the goals, Painted(Ladder). But this particular plan clobbers achieving the other goal, Painted(Ceiling), because it creates a condition that makes it impossible to achieve the other goal. The question now becomes, how can we reason about the conflict between these goals? How can planning systematically find out how to order these various operators so that these conflicts do not occur? What we have described here, this goal clobbering, is true for all simple planners, sometimes called linear planners. A linear planner does not try to reason about the conflict between these goals. It does not try to reason about how the plan for one goal may clobber another goal. Instead it just goes about making plans as if the goals could be achieved in any order.

14 – Partial Order Planning

>> Good example. And next we will discuss how partial order planning can help us detect conflicts like this and avoid them.

15 – Partial Planning

Now let us see how partial order planning, sometimes also called nonlinear planning, may work for our ladder and ceiling problem. So here is a goal state, painted ladder, and there is the initial state. We can now use the goal knowledge as control knowledge to select between the different operators available in this world. The only operator whose postconditions match the goal condition of painted ladder, and whose preconditions are compatible with the initial state, is paint-ladder. So we'll select that operator. When we think of applying the operator paint-ladder to the initial state, we get this as a successor state. Painted ladder and not dry ladder are coming from the postconditions of paint-ladder. On robot floor and dry ceiling have been propagated from the initial state. We changed dry ladder to not dry ladder because that was the postcondition of paint-ladder. We did not change on robot floor and dry ceiling because paint-ladder was silent about them.

16 – Exercise Partial Planning

Now that we have seen how a simple planner may work for this goal, let us see how the simple planner, the linear planner, may work with the goal of painted ceiling. Please write down the operators in these boxes, and the states that will be achieved after the application of these operators in these bigger boxes.

17 – Exercise Partial Planning

>> So note that we just made a connection back to problem reduction, which we talked about right after means-ends analysis. Ashok in his description talked about the sub-goal of getting up the ladder. When we talked about problem reduction earlier, we talked about the need to break big problems down into smaller problems, but we didn't talk exactly about how an agent would go about doing that. Here we see one way in which an agent would go about actually identifying those sub-goals to accomplish in order to accomplish the final goal.

18 – Detecting Conflicts

So what the partial order planner has done so far is to view the two goals as if they were independent of each other, and to come up with a partial plan for each of the two goals. It has not yet detected any conflicts between them, nor resolved any conflicts. The next thing is to examine the relationship between these two plans and see if there are any conflicts between them. But how might a partial order planner go about detecting conflicts between two plans? So imagine here is plan one, and here is plan two. The partial order planner may go about detecting conflicts by looking at each precondition of the current plan. If the precondition of an operator in the current plan is clobbered by some state in the other plan, the second plan, then the partial order planner knows that there is a conflict between them. It goes about resolving these conflicts by promoting or demoting one plan's goal over another plan's goal. That is, if some state in plan B clobbers the application of some operator in plan A, then we want to order the goals of these plans in such a way that this operator is done before that state is achieved. Now, let us see how the partial order planner may go about detecting conflicts within these two plans. The partial order planner may begin with this plan for painting the ladder, and see whether the preconditions of this operator, paint-ladder, are clobbered by any state in the second plan. As it turns out, that doesn't happen in this example. Now the partial order planner will look at the operators in the second plan, and see whether the preconditions of any of those operators are clobbered by some state in the first plan. So let's look at climb-ladder here. The precondition of climb-ladder is on robot floor, and dry ladder. As this precondition is compared with the states in the first plan, we eventually see the conflict. Here is dry ladder, and here is not dry ladder. In this way the partial order planner has been able to find that this state in the first plan clobbers the precondition of one operator in the second plan. To resolve this conflict, the partial order planner may promote this goal over the goal of painting the ladder. Some of you may also have noticed that after the robot has painted the ceiling, the robot is on the ladder, but in order to apply the paint-ladder operator, the robot must be on the floor. So here there is an open precondition problem: this particular precondition of this operator is not satisfied when the robot is on the ladder. We'll come to this in a minute.
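The detection step can be sketched roughly as follows; the plan structure, operator definitions, and fact encoding here are illustrative assumptions rather than the lecture's exact formulation. We simply walk over the operators of one plan and ask whether any precondition is clobbered, that is, negated, by a state asserted in the other plan.

```python
# Sketch of conflict detection between two independently built partial plans.
# A plan here is a list of (state, operator) steps; names are illustrative.

def clobbers(fact, other_plan_states):
    """A precondition `fact` is clobbered if some state asserts its negation."""
    negation = fact[1:] if fact[0] == "not" else ("not",) + fact
    return any(negation in state for state in other_plan_states)

def detect_conflicts(plan_a, plan_b):
    """Return (operator, precondition) pairs in plan_a clobbered by plan_b."""
    states_b = [state for state, _ in plan_b]
    conflicts = []
    for _, operator in plan_a:
        if operator is None:          # the final state has no outgoing operator
            continue
        for pre in operator["preconditions"]:
            if clobbers(pre, states_b):
                conflicts.append((operator["name"], pre))
    return conflicts

CLIMB_LADDER = {"name": "climb-ladder",
                "preconditions": {("On", "Robot", "Floor"), ("Dry", "Ladder")}}
PAINT_LADDER = {"name": "paint-ladder",
                "preconditions": {("On", "Robot", "Floor"), ("Dry", "Ladder")}}

# Plan for Painted(Ladder): its final state asserts that the ladder is wet.
plan_ladder = [({("On", "Robot", "Floor"), ("Dry", "Ladder")}, PAINT_LADDER),
               ({("Painted", "Ladder"), ("not", "Dry", "Ladder")}, None)]
# Plan for Painted(Ceiling): it needs to climb a dry ladder first.
plan_ceiling = [({("On", "Robot", "Floor"), ("Dry", "Ladder")}, CLIMB_LADDER),
                ({("On", "Robot", "Ladder"), ("Dry", "Ladder")}, None)]

print(detect_conflicts(plan_ceiling, plan_ladder))
# [('climb-ladder', ('Dry', 'Ladder'))]: the ladder plan clobbers climbing
```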

19 – Open Preconditions

So recall that in order to resolve the conflict, the partial order planner has decided to promote this goal over that one. As it tries to connect these two plans, it finds that there is the open precondition problem that we just talked about: On Robot Ladder does not match On Robot Floor. So now it needs to select an operator whose postconditions will match this state, Robot On Floor, and whose preconditions will match this state, Robot On Ladder. There is just one operator that matches those conditions, and that operator is descend-ladder. So the partial order planner uses this information to select the operator descend-ladder, and now we have a complete plan. So now you know about the algorithm for partial order planning and how it works in practice. But what does this tell us about intelligence? Let's consider several postulates. First, knowledge is not just about the world. Knowledge is also control knowledge. It is often tacit, but this control knowledge helps us select between operators. Second, goals provide control knowledge. Goals can be used to decide between different operators, and we select an operator that helps us move closer to the goal. Third, we can view partial order planning as an interaction between several different kinds of agents or abilities. Each agent here represents a small micro-ability. There is an agent which was responsible for generating plans for each of the goals independently, then there was an agent responsible for detecting conflicts between them, and then there was a third agent responsible for resolving the conflicts. So we can think of partial order planning as emerging out of the interaction between three different agents, where each agent is capable of only one small task. Minsky has proposed the notion of a society of mind: a society of agents inside an intelligent agent's mind that work together to produce complex behavior, where each agent itself is very simple. As in this case, a simple agent for detecting conflicts, a simple agent for resolving conflicts, and of course an agent for making simple plans for simple goals. There is one other lesson to take away from here. When you and I solve problems like the ladder and the ceiling problem, we seem to address them almost effortlessly and almost instantaneously, so it looks really simple. What AI does, however, is to make the process explicit. To write a computer program that can solve the same problem is very hard. It is hard because the computer program must specify each operator, each precondition, each state, each goal, every step very, very clearly and very, very precisely. By writing such a computer program, an AI agent that can solve this problem, we make the process that humans might be using more explicit. We generate hypotheses about how humans might be doing it, which is a very powerful idea.
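The open-precondition step, selecting an operator whose postconditions supply the needed assertion and whose preconditions hold in the state we are coming from, might be sketched like this; the operator definitions below are illustrative assumptions.

```python
# Sketch of resolving an open precondition: look for an operator whose
# postconditions supply the needed fact and whose preconditions hold in the
# state we are coming from. Operator names and facts are illustrative.

OPERATORS = [
    {"name": "climb-ladder",
     "preconditions": {("On", "Robot", "Floor"), ("Dry", "Ladder")},
     "postconditions": {("On", "Robot", "Ladder")}},
    {"name": "descend-ladder",
     "preconditions": {("On", "Robot", "Ladder")},
     "postconditions": {("On", "Robot", "Floor")}},
]

def bridge(current_state, needed_fact):
    """Find an operator that achieves `needed_fact` and applies in `current_state`."""
    for op in OPERATORS:
        if needed_fact in op["postconditions"] and op["preconditions"] <= current_state:
            return op["name"]
    return None

# After painting the ceiling the robot is on the ladder, but paint-ladder
# needs the robot on the floor: the open precondition.
after_ceiling = {("On", "Robot", "Ladder"), ("Painted", "Ceiling")}
print(bridge(after_ceiling, ("On", "Robot", "Floor")))   # descend-ladder
```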

20 – Exercise Partial Order Planning I

Now that we have seen partial order planning in action, let us try to do a series of exercises to make sure that we understand it clearly. We have come across this problem earlier. This is the microworld of blocks. Here is the initial state, and here is the goal state. We need to transform this initial state into the goal state, moving only one block at a time. Please write down the initial state and the goal state in propositional logic.

21 – Exercise Partial Order Planning I

David? >> So our initial state is that the blocks are all stacked up: D is on B, B is on A, A is on C, and C is on the table. And our goal state is that the blocks are stacked up in alphabetical order, so A is on B, B is on C, C is on D, and D is on the table.

22 – Exercise Partial Order Planning II

Now, we humans find addressing problems like this almost trivial. We know what to do here: put D on the table, put B on the table, and so on, and then put C on top of D, and so on. The question is, how can we write an AI program that can do it? And by writing an AI program, how can we make things so precise that it will provide insight into human intelligence? To do this, let us start writing the operators that are available in this particular world. There are only two operators. I can either move block x to block y, which is the first operator here, or I can move block x to the table. Note two things. First, instead of saying block A and block B, we have made them variables: move block x to block y, where x could be A, B, C, or D, and similarly for block y. This is just a more concise notation. Second, in order to move block x to block y, both x and y must be clear. That is, neither x nor y should have any other block on top of it. Given this setup, please write down the specification of the first operator as well as the second operator.
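A rough sketch of these two variabilized operators in Python is shown below; the state encoding with On facts and a derived clear test is an illustrative assumption, not the notation used in the exercise boxes.

```python
# A sketch of the two variabilized Blocks World operators. The state encoding
# (On facts plus a derived "clear" test) is an illustrative choice.

def clear(block, state):
    """A block is clear when no On fact says something sits on top of it."""
    return not any(fact[0] == "On" and fact[2] == block for fact in state)

def move_to_block(x, y, state):
    """Move block x onto block y, provided both are clear."""
    if x == y or not (clear(x, state) and clear(y, state)):
        return None
    new_state = {fact for fact in state if not (fact[0] == "On" and fact[1] == x)}
    new_state.add(("On", x, y))          # postcondition: On(x, y)
    return new_state

def move_to_table(x, state):
    """Move block x to the table; the table always has room."""
    if not clear(x, state):
        return None
    new_state = {fact for fact in state if not (fact[0] == "On" and fact[1] == x)}
    new_state.add(("On", x, "Table"))    # postcondition: On(x, Table)
    return new_state

initial = {("On", "D", "B"), ("On", "B", "A"), ("On", "A", "C"), ("On", "C", "Table")}
print(move_to_block("A", "B", initial))     # None: neither A nor B is clear yet
print(sorted(move_to_table("D", initial)))  # D moves to the table
```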

23 – Exercise Partial Order Planning II

>> So like you said, our precondition for the first one is that both x and y are clear. We can't move x if there's anything on top of x, and we can't put it on y if something is already on top of y. Our postcondition then is that x is on y. For the table it's a little bit easier; the table has unlimited room. So for the table, as long as x is clear, we can move x to the table. And the postcondition is that x is now on the table.

24 – Exercise Partial Order Planning III

So given the various goals here, A on B, B on C, and so forth, write down the plan for accomplishing each goal, as if these goals were independent of each other. We have shown here only three goals, not the fourth goal of D on table, because of lack of space. But D on table is trivial anyway.

25 – Exercise Partial Order Planning III

>> So like you said, Ashok, the plan for putting D on the table is kind of trivial, and we actually see that it's the first step of any other plan, so we don't really need to articulate it explicitly. For putting A on B, we can't just go straight to putting A on B; we would put D on the table, then put B on the table, then put A on top of B. For putting B on C, we need to put D on the table, B on the table, A on the table, and then move B onto the top of C. And then, for putting C on D, we would need to move D to the table, B to the table, A to the table, and then put C on top of D.

26 – Exercise Partial Order Planning IV

Now that we have these three plans for accomplishing the three goals, can you detect the conflicts between these plans? Use a pencil and a piece of paper to detect the conflicts and resolve them, and then write down the ordering of the goals in these boxes.

27 – Exercise Partial Order Planning IV

>> But David

28 – Exercise Partial Order Planning V

Now that we know about the conflict between these plans, please write down the final plan for achieving the goal state. To save space, just write down the operators. You don’t have to specify all the states in this plan.

30 – Hierarchical Task Network Planning

Our next topic in planning is called hierarchical planning. We'll introduce the idea to you. We'll also introduce a representation called the hierarchical task network, or HTN for short. To illustrate hierarchical planning, imagine that you are still in the blocks microworld. Here is the initial state, and here is the goal state. These states are more complicated than any initial state and goal state that we have encountered so far. As previously, we can use partial order planning to come up with a plan to go from this initial state to the goal state. Here is the final plan, and as you can see, it's pretty long and complicated, with a large number of operations in it. So the question then becomes, can we abstract some of these operations to a higher level? So that instead of thinking in terms of these low-level move operations, we can think in terms of high-level macro operations. Those macro operations will then make the problem space much smaller and much simpler, so that we can navigate it, and then we can expand those macro operators into the move operations.

31 – Hierarchical Decomposition

So, to look at the macro operators at a higher level of abstraction, and to illustrate this idea of macro operators and hierarchical planning at multiple levels of abstraction, let us revisit the problem that we encountered earlier. This was the initial state, this was the goal state, and we came up with this as the final plan. Now, we can think of these three operations as being abstracted out into a macro operator that we can call unstack, and these three operations as being abstracted out into a macro operator that we can call stack-ascending, simply meaning stacking the blocks in a particular ascending order. Here is the specification of the two macro operators, unstack and stack-ascending, with their preconditions and postconditions. The specification also tells you how the unstack macro operator can be expanded into the lower-level move operations, and similarly for the stack-ascending macro operator.
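A minimal sketch of this kind of expansion is shown below; the method bodies assume the specific blocks problem from the exercise (the tower D-B-A-C rebuilt as A-B-C-D) and are purely illustrative.

```python
# A sketch of hierarchical task network expansion: macro operators expand into
# sequences of low-level move operations. Method bodies here are illustrative.

MACRO_OPERATORS = {
    # unstack the initial tower D-B-A-C onto the table
    "unstack": [("move-to-table", "D"),
                ("move-to-table", "B"),
                ("move-to-table", "A")],
    # rebuild the tower in ascending order A-B-C-D
    "stack-ascending": [("move-to-block", "C", "D"),
                        ("move-to-block", "B", "C"),
                        ("move-to-block", "A", "B")],
}

def expand(plan):
    """Replace each macro operator with its low-level move operations."""
    low_level = []
    for step in plan:
        low_level.extend(MACRO_OPERATORS.get(step, [step]))
    return low_level

abstract_plan = ["unstack", "stack-ascending"]   # small and simple at this level
for move in expand(abstract_plan):
    print(move)                                  # six concrete move operations
```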

32 – Hierarchical Planning

Now that we have illustrated hierarchical planning, what does it tell us about intelligence? Intelligent agents, both cognitive and artificial, are constantly faced with large, complex problems. The problem spaces corresponding to these problems often have an explosion of states in them. Intelligent agents address these complex problems by thinking at multiple levels of abstraction, so that at any one level of abstraction the problem appears small and simple. In order to be able to reason at these multiple levels of abstraction, we need knowledge at multiple levels of abstraction. In this case, there was knowledge not only at the level of move operations, but also at the level of macro operations, like unstack and stack-ascending, and perhaps even higher-level macro operations, like sort. This goes back to the fundamental notion of knowledge-based AI: intelligent agents use knowledge in order to be able to tackle hard, complex problems.

33 – Assignment Planning

How would you use planning to develop an agent that can answer Raven's Progressive Matrices? So the first question you want to ask here is, what are our states? What's our initial state, and what's our final state? Given that, what are the operators that allow the transitions between them, and how would we select those operators? We talked about partial order planning in this lesson; what conflicts are possible when we are trying to solve Raven's problems? How would we detect those conflicts beforehand and avoid them? Note that again we can consider this at two different levels. First, we can think of the agent as having a plan for how to address any new problem that comes in. Or second, we can consider the agent as discerning the underlying plan behind a new problem.

34 – Wrap Up

So today we've discussed how to plan out actions using formal logic. We started off by talking about states, operators, and goals in formal logic. We then used those to contextualize our discussion on detecting conflicts that might arise. This introduced the need for partial-order planning, which helps us avoid those conflicts beforehand. Finally we talked about hierarchical task networks, which can be used for hierarchical planning. Now, we're going to move on to understanding, which builds on our notion of frames from a few lessons ago. But if you're interested in this topic, you can jump forward to our lessons on design; configuration and diagnosis leverage some of the concepts of planning very heavily.

35 – The Cognitive Connection

Planning is another process central to cognition. It is central because action selection is central to cognition. You and I are constantly faced with the problem of selecting actions. Where should I go for dinner today? What should I cook for dinner today? How do I cook what I want to cook? I got a bonus; what should I do with the bonus? Shall I go on a vacation? How should I go on the vacation? Where should I go on the vacation? These are all problems of action selection, and I need planning to select the appropriate actions. Cognitive agents also have multiple goals. As a professor, one of my goals right now is to talk with you. Another goal that I have is to become rich, although I know that being a professor is not going to make me rich. The point is that cognitive agents have multiple goals that can have interactions between them. Sometimes the interaction is positive: achieving one goal provides an opportunity for achieving the second goal. Sometimes the interaction is negative: there are conflicts. Cognitive agents detect those conflicts, and they avoid those conflicts. Planning, then, is a central process for achieving multiple goals at the same time.

37 – Final Quiz

Great. Thank you so much for your feedback.

14 – Understanding

03 – Thematic Role Systems

Let us first consider a simpler sentence. Consider the sentence: Ashok made pancakes for David with a griddle. I'm sure you understood the meaning of this sentence almost immediately. But what did you understand? What is the meaning of meaning? We can do several different kinds of analysis on this sentence. We can do lexical analysis, which will categorize each of these words into different lexical categories. For example, Ashok is a noun, made is a verb, pancakes is a noun, and so on. We can do syntactic analysis, in terms of the structure of this particular sentence. So you might say Ashok is a noun phrase, made pancakes for David with a griddle is a verb phrase, and this particular verb phrase itself has sub-phrases in it. Or we can do semantic analysis on this, and say that Ashok was the agent, made was the action, pancakes were the object that got made, David was the beneficiary, and the griddle was the instrument. In knowledge-based AI, where we are interested in understanding stories like this, semantic analysis is at the forefront; syntactic analysis and lexical analysis serve semantic analysis. So here are some of the semantic categories in terms of which we can classify the different words in the sentence. Ashok is an agent, made is an action (a verb in the lexical sense), pancakes are the thematic object, the thing that is getting made, and so on. The frame for representing the understanding of this sentence has the verb of the action, make, the agent, Ashok, and so on, just like we discussed. This, then, is the meaning of meaning. This is what the agent understands when it understands the meaning of this sentence. This is perhaps also what you understand when you understand the meaning of this sentence. How do we know that you understood the meaning of this sentence? Well, we know that because I can ask you some questions and you can draw the right kind of inferences from it. So for example, given this sentence, I can ask you, who ate the pancakes? And you might be able to say, well, David ate the pancakes, because Ashok made the pancakes for David. Notice that this information about who ate the pancakes was not present in the sentence itself. This is an inference you're drawing. This is very similar to what we encountered earlier when we had a sentence like, Ashok ate a frog. At that time too, we asked questions like, was Ashok happy at the end? And the frame had some default values which said Ashok was probably happy. Or, was the frog dead at the end? And the frame for eating had some default value which said that the frog was dead at the end. So according to this theory, the meaning lies in the inferences we can draw from it. You understand the meaning of the sentence if you can draw the right inferences. You do not understand the meaning if you cannot draw the right inferences, or if you draw only the wrong inferences. This frame representation of the meaning of this particular sentence allows you to draw the right inferences. Given the action make here, the thematic roles pertain to the relationship of the various words in the sentence to this particular action of making. Ashok is the agent, David is the beneficiary, and so on. So far we have been describing the meaning of this sentence, and how we can capture that meaning in this frame. We have not yet described the process by which this knowledge is extracted out of the sentence. The extraction of the meaning of this sentence is exactly the topic that we will discuss next.
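As a rough illustration, a thematic role frame for this sentence might be encoded as a small dictionary with one default inference rule attached; the encoding and the rule are assumptions for illustration, not the lecture's implementation.

```python
# A sketch of a thematic role frame for "Ashok made pancakes for David with a
# griddle", plus one example inference drawn from it. Encoding is illustrative.

frame = {
    "verb": "make",
    "agent": "Ashok",
    "thematic-object": "pancakes",
    "beneficiary": "David",
    "instrument": "griddle",
}

def who_ate(frame):
    """Default inference: the beneficiary of making food is the likely eater."""
    if frame["verb"] == "make" and frame.get("beneficiary"):
        return frame["beneficiary"]
    return None

print(who_ate(frame))   # David -- an inference not stated in the sentence itself
```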

04 – Thematic Role Systems

>> Note that even though David isn't talking about a specific instance of throwing, he's still able to generate an expectation about the general action of throwing. This is what a thematic role frame does for you: it is able to generate expectations. Let us look at how it would actually work in action.

05 – Exercise Thematic Role Systems

Now that we understand how to represent the meaning of stories, let us consider a different story. David went to the meeting with Ashok by car. Please write down the meaning of this story, in terms of the slots of this particular thematic role frame.

06 – Exercise Thematic Role Systems

>> That’s right, David. But how did we know that David was the agent? How did we know the destination was the meeting? How did we know that car was a conveyance? That’s what we’ll look at next.

07 – Constraints

Let us use the assignment of car as the conveyance to illustrate how these different words get assigned to different categories, different slots in this frame. Now, we know that car was a conveyance because of the role that the preposition by plays here. That is, an intelligent agent might make use of the structure of the sentence to make sense of the story. We have designed human languages in such a way that there is a particular structure to them. Prepositions, for instance, play a very important role. Here are some of the common prepositions: by, for, from, to, with. Each preposition can play certain thematic roles. By can be followed by an agent, or a conveyance, or a location, and similarly for the other prepositions. So the moment we see by here, we know that whatever is going to come after by can be an agent, or a conveyance, or a location. Note again that the categories we are using here are semantic categories. We are not saying noun or verb or anything like that; what we are saying here is beneficiary, and duration, and source, which are semantic categories that allow us to draw inferences. But how did we know that car was a conveyance, and not an agent or a location? In general, while the structure of language does provide constraints, these constraints do not always definitively determine the meaning of a particular word. We'll use additional knowledge to find the exact meaning of the word car.
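One might encode this preposition table roughly as follows; the exact role lists beyond the examples named in the lecture are illustrative guesses.

```python
# A sketch of the constraints that prepositions place on thematic roles.
# The table mirrors the lecture's examples; exact role lists are illustrative.

PREPOSITION_ROLES = {
    "by":   {"agent", "conveyance", "location"},
    "for":  {"beneficiary", "duration"},
    "from": {"source"},
    "to":   {"destination"},
    "with": {"coagent", "instrument"},
}

def candidate_roles(preposition):
    """Roles that the word following this preposition could possibly play."""
    return PREPOSITION_ROLES.get(preposition, set())

print(candidate_roles("by"))   # 'car' in "by car" must play one of these roles
```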

09 – Ambiguity in Verbs

>> Did you know that knowledge-based AI was going to be this much fun? Our goal going forward is to look at how agents may resolve such ambiguities when one correct meaning, and only one correct meaning, is possible. Note, however, that we might already have here the beginnings of a theory of humor. Perhaps one form of humor is when a single sentence can be made to fill multiple frames simultaneously.

10 – Resolving Ambiguity in Verbs

We saw in the previous example how sentences in a story can be ambiguous. For example, by could have referred to an agent, a conveyance, or a location. This is true not just for prepositions, but also for other lexical categories. In fact, words often have several interpretations. Let us consider the word take as an example. Take is a very common word. It has at least these twelve different meanings. Consider for instance to medicate: Ashok took an aspirin. Here, the meaning is that Ashok took aspirin as a medication. Each of these is a common meaning of take, as we will see in just a minute. But given a sentence in which take occurs, how do we know which of these meanings is intended by the word take? Suppose the input sentence was, my doctor took my blood pressure. The take in this sentence refers to, to measure, and not to any of the others. Let us examine this issue further. For each of these twelve interpretations of take, we have a frame-like representation, take1 to take12. Each of these frame-like representations specifies the thematic roles that go with that particular meaning of take. So in this particular meaning of take, take1, we have an agent and an object. In this meaning of take, take12, we have an agent, an article, and a particle. A particle is another word that typically occurs with take and signifies this meaning, as in to take clothes off from a body. Let us consider another example of a particle. Consider take11. The meaning of this take is to assume control, as in to assume control of a company, or to assume control of a country. When the meaning is intended to be to assume control, then take typically occurs with the word over: take over a company, take over a country. So over, then, is a particle that signifies this eleventh meaning of take. To go deeper into story understanding, consider the simple story, I took the candy from the baby. What is the meaning of the word take here? You and I get it immediately, but how can an agent get it? To keep it simple, we have shown here just nine meanings of take; you could add the other three as well. Although we started with bottom-up processing, we're now going to shift to top-down processing, because there is something about the background knowledge we have about candy that is going to eliminate lots of choices. In particular, we know that candy is not a medicine, so this particular choice goes away. We know that candy is not a quantity, so this choice goes away. Several of these choices disappear because of our background knowledge of candy. Just as some of the constraints came from our background knowledge of the semantic category of candy, other constraints come from our background knowledge of the preposition from. In the table showing prepositions earlier, from referred to a source. These three frames do not have anything to do with a source, and therefore we eliminate them. We're left only with this particular frame, which has source in it as required by the preposition from. And thus we infer that the correct interpretation of took in this particular sentence is to steal, as in to steal the candy from the baby. This is the only frame that is still active.
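A toy sketch of this elimination process is shown below. The sense names, the background facts about candy, and the constraint encoding are simplified stand-ins for the lecture's twelve take frames, not a faithful reproduction of them.

```python
# A sketch of eliminating candidate senses of "take" using background
# knowledge about the object and the preposition. Senses are illustrative.

TAKE_SENSES = {
    "to-medicate": {"object-must-be": "medicine"},
    "to-measure":  {"object-must-be": "quantity"},
    "to-steal":    {"requires-role": "source"},        # from <source>
    "to-swindle":  {"requires-role": "beneficiary"},   # hypothetical extra sense
}

BACKGROUND = {"candy": {"is-medicine": False, "is-quantity": False}}
PREPOSITION_ROLES = {"from": {"source"}}

def disambiguate(obj, preposition):
    """Keep only the senses compatible with the object and the preposition."""
    survivors = set(TAKE_SENSES)
    facts = BACKGROUND.get(obj, {})
    if not facts.get("is-medicine", True):
        survivors.discard("to-medicate")       # candy is not a medicine
    if not facts.get("is-quantity", True):
        survivors.discard("to-measure")        # candy is not a quantity
    roles = PREPOSITION_ROLES.get(preposition, set())
    survivors = {s for s in survivors
                 if "requires-role" not in TAKE_SENSES[s]
                 or TAKE_SENSES[s]["requires-role"] in roles}
    return survivors

print(disambiguate("candy", "from"))   # {'to-steal'}: the only frame left active
```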

11 – Exercise Resolving Ambiguity in Verbs

Now that we have examined how thematic role frames help us discriminate between different meanings of take, let us do some exercises together. In fact, let's do three exercises together. Here are three sentences. Please pick the box which best captures the meaning of take in each of these three sentences.

13 – The Earthquake Sentences

So let us now return to our original example of these two stories, and see how the semantic analysis we have done can help disambiguate between the two meanings of kill here. First, my background knowledge tells me that kill can have several meanings, just like take had several meanings earlier. Kill can have the meaning of causing the death of someone, or kill can have the meaning of putting an end to something. There could be other meanings of kill as well. Second, my background knowledge tells me that when kill has the meaning of causing the death of someone, there typically is a victim as well as an agent. In this particular case, the victim is 25 people, and the agent is a serious earthquake. My background knowledge also tells me that when the meaning of kill is to put an end to something, then typically there is both an agent that puts an end to something and an object that gets put an end to. In this particular case, 25 proposals is the object, and the agent is the President of Lower Slabovia. It is this combination of background knowledge that allows me to infer the meaning of kill in the first sentence as, to cause the death of someone, and the meaning of kill in the second sentence as, to put an end to something. I hope you can appreciate the power and beauty of this theory. But it is also important to point out that this theory has many limitations. To understand some of the limitations, let's go back to the sentence, I took the candy from the baby. In this sentence, we inferred that took signifies stealing the candy from the baby. And in fact, we had a number of rules that told us how to make sense of the word take by making sense of the word candy and making sense of the word from. But as we look at an increasingly large number of forms of the sentence, the number of rules that we need starts exploding. Consider small variations of the sentence: I took the candy for the baby. I took the toy from the baby. I took the medicine from the baby. I took the smile from the baby. I took a smile for the baby. They're all valid English sentences, and each one of them tells a story. As I look at more and more variations of this sentence, I'll need to find more and more rules that can disambiguate between different interpretations of take in those variations. In practice it turns out that it's very hard to enumerate all the rules for all the variations of sentences like this one. Nevertheless, the theory appears to cover a good percentage of the stories that we routinely deal with.

14 – Assignment Understanding

So how would you use the notions of understanding, thematic role frames, constraints, and ambiguity to address Raven's Progressive Matrices? One example of an application here would be the idea of multiple possible transformations that we saw in some of our earlier problems. We saw certain problems that could be solved with either a rotation or a reflection, but the two would give different answers. You might imagine a frame that dictates certain expected values for different transformations, and the degree of fit to those expectations can dictate the accuracy of that particular answer. Try to think of the different phases of the understanding process. How do you first understand what's happening in the problem? How do you fit that into a thematic role frame representation? And how would that representation then help you transfer and solve the problem?

15 – Wrap Up

So today we've been talking about understanding, which is how agents make sense of stories, events, and other things in the world around them. We started off by creating a more formal type of frame representation called thematic role systems, which captures verbs and tells us what to expect in certain events. We then talked about how single verbs can actually have ambiguous meanings, but thematic role frames can help us differentiate which meaning a verb has in a particular sentence. Finally, we talked about constraints, and how certain words or frames can constrain the possible meanings of a sentence and help us figure out those ambiguous meanings. Today we've largely been talking about how single words or phrases can have multiple possible meanings, but next time we'll do this in reverse. We'll talk about how multiple words, phrases, or sentences can actually have the same meaning. We'll talk about how we can discern that sameness, and then react accordingly.

15 – Commonsense Reasoning

01 – Preview

Today we'll talk about common sense reasoning. You and I, as cognitive individuals, are very good at common sense reasoning. We can make natural, obvious inferences about the world. How can we help AI agents make similar common sensical inferences about the world? Suppose I had a robot and I asked the robot to find out the weather outside. I don't want the robot to jump out of the window to see whether it's raining outside. But why should the robot not jump out of the window? What tells it that that is not a very common sensical thing to do? We'll talk about common sense reasoning using a frame representation. We'll start by talking about a certain small set of primitive actions. There are only fourteen of them, but they bring a lot of power because they organize a lot of knowledge. These primitive actions can be organized into hierarchies of sub-actions, and these actions can result in changes in state. This is important, because that is what allows the robot to infer that if I were to take this action, that result might occur, that state might be achieved. So then it decides not to jump out of the window, because it might get hurt.

02 – Example Ashok Ate a Frog

Okay. Have you ever wondered how you could write the equivalent of a Watson or Siri program for your own computer? Just imagine if you could talk to your machine in terms of stories. Simple stories, perhaps just one-sentence stories like, Ashok ate a frog. Now, we've already seen how a machine can make sense of this particular sentence, Ashok ate a frog. We did that last time. But of course, eat, or ate, can occur in many different ways in sentences. Here are some of the other ways in which I can use the verb eat. Ashok ate out at a restaurant. I could tell something was really eating him; the sense of eating here is very different, and there is no physical object that is being eaten. The manager would rather eat the losses; now this is a very different notion of eat. The river gradually ate away at the riverbank; yet another notion of eat. So just like take in the previous lesson, which had so many different meanings, eat has many different meanings. When we were discussing the word take, we discussed how we can use both the structure of sentences and background knowledge to disambiguate between the different interpretations of take. We could do something similar with eat. We could enumerate all the different meanings of eat, and then for each meaning of eat, we could ask ourselves, what is it about the structure of the sentence, and what is it about the background knowledge I have about things like rivers and riverbanks, that tells me what the meaning of ate is here. To summarize this part: if you were to start talking to your machine in stories, in terms of simple stories designated by a single sentence, then one problem that will occur is that words like ate or take will have a large number of meanings, and we have seen how your machine can be programmed so as to disambiguate between the different meanings of the same word. Now here is a different problem that occurs. Consider a number of stories again, each story designated by a single sentence: Ashok ate a frog, Ashok devoured a frog, Ashok consumed a frog, Ashok ingested a frog, and so on. Several sentences here. If we think about it a little bit, each of these sentences is using a different verb, but the meaning of each of these verbs in these sentences is pretty much the same. So whether Ashok ingested a frog, or Ashok devoured a frog, or Ashok dined on a frog, exactly the same thing happened in each case. There was a frog that was initially outside of Ashok's body, then Ashok ate it up, and at the end the frog was inside Ashok's body. The frog was dead and Ashok was happy. So the next challenge becomes, how can a machine understand that the meaning of each of these verbs is exactly the same? In a way this problem is the inverse of the problem that we had before. There, the problem was that we had a single word like eat, which had a lot of different meanings in different contexts in different sentences. Here, the issue is that we have a lot of different words, but they have the same meaning given the context of the sentences. So the question then becomes, how can we make machines understand that the meaning of these words in these sentences is exactly the same, that each of these sentences is telling us exactly the same story? There is one other thing that is important here, and that is the notion of context. One of the hardest things in AI is, how do we take context into account?
In both of these cases, context plays an important role. On the left side, context plays an important role because we understand that the meaning of eat is different in these different sentences, because the context of eat is different. Here, context plays a different role: we understand that the meaning of each of these words is practically the same because the context is practically the same. A couple of quick qualifications. First, right now we're dealing with stories that are just one sentence long. Very soon, in the next lesson, we'll deal with stories which are much more complicated, which have a series of sentences. For now, just simple stories. Second, here the structure of all of these sentences is the same, so the structure practically tells us something about the context. But the situation is in general a lot more complicated, because often two sentences which have very different structures can still have the same meaning. Consider these two stories: Bob shot Bill, and Bob killed Bill with a gun. Here the sentence structures are very different, but their interpretations, their meanings, are the same. Bill was the one who got killed, Bob was the one who did the killing, and the killing was done by putting a bullet into Bill. So the question now becomes, how can we build AI agents that can meaningfully understand stories like these, stories of the kind, Bob shot Bill, and Bob killed Bill with a gun? One thing we could do is that for each of the verbs here we could have a frame, like we had a frame for take. So we could have a frame for ate, a frame for devoured, a frame for consumed, a frame for ingested, and so on. But then we would have a lot of frames, because there are a large number of verbs in the English language, or in any other natural language. Perhaps we could do something different. Perhaps we could recognize that there is a similarity between the interpretations of these sentences, and perhaps we could use the same primitive action for each one of them. So when we talk in the English language, we might use all of these verbs for communication purposes, but perhaps we could build AI agents that have a single internal representation for all of them. What might that internal representation look like? We will call that representation a primitive action. Each one of these is an action, but many of these actions are equivalent in terms of our interpretation of those actions. Let's see what these primitive actions might look like.
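A minimal sketch of that idea: map every surface verb onto one internal primitive action, so that all of these one-sentence stories produce the same frame. The verb table and the primitive names here are illustrative assumptions.

```python
# A sketch of collapsing many surface verbs into one internal primitive action.
# Verb list and primitive names are illustrative.

PRIMITIVE_ACTION = {
    "ate": "ingest", "devoured": "ingest", "consumed": "ingest",
    "ingested": "ingest", "dined on": "ingest",
    "pushed": "propel",      # another illustrative verb-to-primitive mapping
}

def interpret(agent, verb, obj):
    """Build one internal frame regardless of which surface verb was used."""
    return {"primitive": PRIMITIVE_ACTION.get(verb, "do"),  # generic fallback
            "agent": agent, "object": obj}

for verb in ["ate", "devoured", "consumed", "ingested"]:
    print(interpret("Ashok", verb, "frog"))   # the same frame for all four verbs
```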

04 – Exercise Primitive Actions

Okay, I hope you're enjoying this particular lesson, because I certainly am. Let's see whether you're also understanding this lesson. So here are four sentences: John pushed the cart, John took the book from Mary, John ate ice cream with a spoon, John decided to go to the store. Some of the words are in blue boxes. For each of the sentences, find the primitive action that would best capture the meaning of the verb inside the blue box.

05 – Exercise Primitive Actions

>> That sounds right, David. There are several other things to note here. First, note that the last sentence has two verbs in it: decided and go. You'll see in a minute how we can deal with sentences or stories that have multiple verbs in them. Second, you are right that it is not always very easy to decide which is the best primitive action for capturing the meaning of a particular verb. The verb here was pushed, and the reason for choosing propel rather than move-object is that in propulsion the body is in some sense in contact with the particular object that is getting moved. Now, we have not given here the detailed specification of each one of these primitive actions, but I can tell you that the readings at the end do give them in detail.

08 – Exercise Roles Primitive Actions

>> That's good, David. Notice how lexical, syntactic, and semantic analysis are all coming together here. This frame captures the semantics of this particular sentence, and it is what allows us to draw inferences about it; that is why we use the term semantics here. So the semantics has been captured by this frame in terms of all of these slots. But how do we decide on the fillers? That requires lexical analysis and syntactic analysis. So when we have the word John here, John is a concept, and information about that concept, such as whether it is animate, is coming from a lexicon. That is the lexical analysis. And when we have a sentence like, John took the book from Mary, from is a preposition and Mary follows immediately after it; this captures part of the syntax of this particular sentence, and that is how we derive that the source must be Mary. So semantic and syntactic analysis are working together here. Notice also how frames and rules are coming together here. You've seen how frames help us understand the meanings of stories, and that is being done in part by the rules that are embedded inside the slots. So there is a rule here which tells us how to extract the agent from the sentence and put it inside this slot. Similarly, there's a particular rule here and a rule here, and each one of these slots may have its own rule. Of course, as the sentences become complex and these frames become complex, these rules will become much more complex; sometimes there will be multiple rules here and multiple rules here, and this can become very complicated very quickly. Another point to notice is that this frame is capturing the semantics, like I said earlier. How do we know that? Because I can ask questions: Who has the book? Who had the book? What did John take from Mary? And you can answer any of these questions using this frame. Once you have this frame, question answering becomes possible, and common sense reasoning becomes possible.

09 – Implied Actions

The relationship between the structure of these sentences and background knowledge is intricate, complex, and very interesting as well. Sometimes sentences tell stories that have only implied actions in them. Consider the sentence, John fertilized the field. Now it's hard to see fertilized mapping into any of the primitive actions that we have here. There is an implied action here that is not specified in the structure of the sentence, but that is much more meaningful in terms of the background knowledge that we have of those primitive actions. What John fertilized the field really means is that John put the fertilizer on the field. Now, put is something that maps into the primitive actions more easily. So the processing happens in two steps. Initially the AI agent must work with the verb given in the sentence and try to map it into the primitive actions. If that fails, then the agent must start thinking in terms of how to transform the sentence to bring out implied actions that can more easily map into the primitive actions. This, again, is common sense reasoning. Through common sense reasoning, I'm interpreting that there must be an implied action here that captures the meaning of the sentence. Once I have made the implied action explicit, the rest of the processing becomes easier. The put action maps into the primitive action of move-object, this frame is pulled out, and the rest of the slots can be filled in, like earlier.

10 – Exercise Implied Actions

Now let's do an exercise together. Consider the sentence, Bill shot Bob. Once again, shot is a verb that does not map into the primitive actions clearly and cleanly, so perhaps there is an implied action here. First write down the sentence in terms of a primitive action, then write down the frame for this primitive action and fill in the slots.

11 – Exercise Implied Actions

>> That's really interesting, David. Because notice, if you say Bill took a picture of Bob, it's not clear what primitive action this took would map into. Perhaps we can discuss this more on the forum. There's one more thing to notice here. Like David says, I can have multiple interpretations of Bill shot Bob. The sentence itself doesn't help me resolve between those interpretations. Perhaps it is something coming before the sentence in the story, or something coming after it, that will help me disambiguate. From the sentence itself, we can simply construct the two particular interpretations.

12 – Actions and Subactions

>> That's a valid question, David. Move-object is a primitive action that abstracts over the various superficial forms in which verbs can occur in a sentence, for example transported, carried, or moved. However, this primitive action move-object can have its own story in terms of further decomposition. Notice that this particular decomposition, this particular story, is specific to this particular sentence, where Ashok puts the wedge on the block. This raises lots of hard issues. How many such stories are there? Must an AI agent have such a story for every single situation that it encounters? These are hard questions, and it's not clear that AI currently has an answer to all of them. So this theory by no means covers all stories. Many stories are beyond the scope of this theory. And in fact, even for the stories that are within its scope, there is a hard question about computational tractability, because the number of such stories can explode very quickly.

14 – Implied Actions and State Changes

Sometimes it might not be clear what exactly a particular verb corresponds to. Consider, Susan comforted Jing. Well, what exactly did Susan do to comfort Jing? It's not at all clear; it's not clear what the primitive action should be. Although we may not know what exactly the primitive action is here, we want agents nevertheless to do common sense reasoning. Their interpretation might be incomplete; the interpretation might be partial. Nevertheless, you and I, as humans, understand something about Susan and Jing here: that Susan, for example, did something that made Jing happier. We want the agent to do the same kind of reasoning without knowing what exactly the comforting action was. So we may have a generic primitive action of do. We will use this generic primitive action whenever the agent is unable to decide what exactly is the primitive action that should be pulled out. So the agent will simply say, well, Susan did something that made Jing's mood happy, and that is as much interpretation as the agent can give to the sentence, which is a pretty good interpretation.

15 – Actions and Resultant Actions

Earlier we had problems dealing with sentences that have two verbs in them. So here are two verbs, told and throw: Maria told Ben to throw the ball. How may an AI agent make sense of this particular sentence? Once again, the processing starts on the left. Maria is not a verb, so let's put it on the concept list and for the time being ignore it. The processing goes to the second word, which is told, which is a verb. And so the primitive action corresponding to told is pulled out. The primitive action is speak, and so now we can start putting in the fillers for the various slots. The agent is Maria, and the result is the second frame, so now we go to the second verb. Here the primitive action is propel, because we have a throw here. The propulsion is being done by Ben, the object is the ball, and now we relate these two: the second frame is the result of the first frame's action. So, if we go back to the previous sentence, Ashok enjoyed eating a frog, we can see how we can represent both verbs there in terms of action frames. Ashok enjoyed: that might be one frame here; the primitive action is feel, and the agent is Ashok. Ashok ate a frog: that is the primitive action of ingest, the agent is Ashok, and the object that got eaten was a frog. And the result of that is the frame where Ashok had a feeling of enjoyment. Note that some problems still remain. It is still difficult to figure out exactly how to represent a sentence like Ashok enjoyed eating a frog. There can be two representations of that particular sentence: one with two action frames, another with one action frame and one state-change frame. Some of these questions will get answered when we stop thinking in terms of stories that are only one sentence long, and instead think of stories that have a number of sentences, stories based on a discourse. Some of these ambiguities will get resolved when we know more about what happened when Ashok enjoyed eating the frog, what came before it, or what came after it.
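
Here is a small sketch of the two linked frames for Maria told Ben to throw the ball, again using illustrative dictionary frames:

```python
# Two frames linked through the "result" slot: the told frame's result is the
# throw frame. Slot names are illustrative assumptions.
throw_frame = {"primitive": "propel", "agent": "Ben", "object": "ball"}
tell_frame  = {"primitive": "speak", "agent": "Maria", "result": throw_frame}
```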

16 – Exercise State Changes

So let’s do a couple of exercises together. Here are two sentences at the top. Anika decided to have a glass of water and Marc loved watching TED talks. Please write down the action and the state change frames that will capture the meaning of the first sentence and the meaning of the second sentence.

17 – Exercise State Changes

>> That's good, David. Note, though, that this sentence is a little bit like Ashok enjoyed eating the frog. This is one representation for this sentence, and under this representation we have two action frames, one corresponding to the word loved, another corresponding to the word watching, and then we connect them through the result slot. I hope you can see how agents might be able to understand simple stories. In fact, this is quite similar to the way Watson and Siri go about understanding the stories that we tell them. Almost surely the human interaction with the machines of tomorrow will not be based on the keyboards and mice that we have today. We'll talk to the machines, the machines will talk back to us, and when we talk to the machines we'll be telling the machines stories. Stories like Anika decided to have a glass of water, or Ashok enjoyed eating the frog. And when we tell the stories, the stories will have context. They will have ambiguities. And we will expect the machines to do common sense reasoning. The power of this particular lesson lies in the notion of a representation that enables a particular kind of common sense reasoning. We'll continue this discussion about common sense reasoning over more complex stories in the next lesson.

18 – Assignment Common Sense Reasoning

So how would you use common sense reasoning to design an agent that could answer Raven's progressive matrices? Here you might make two connections. First, you could connect primitive actions of agents to primitive transformations in these matrices. Different problems could be composed out of a finite set of possible transformations. What would those primitive transformations be? And what would the action frames involved in each transformation look like? Second, you might connect those individual primitive actions to higher-level transformations. What would your primitive transformations be? What common higher-level transformations are possible? What primitive actions would result in those higher-level transformations? And how would composing transformations like this actually help you solve Raven's progressive matrices in a way that you couldn't do otherwise?

20 – The Cognitive Connection

The connection between common-sense reasoning and human cognition is both very strong and not yet fully understood. Let us suppose that I were to ask you to go find out the weather outside. You would not jump out of the window. Why not? You would use common sense reasoning to know that jumping out of the window to find the weather is not a good idea. But what is it that tells you not to jump out of the window? You use the notion of goals, the notion of context, to decide what to do and what not to do. We use a similar notion of context in order to do natural language understanding. We could use context to disambiguate between various meanings of take. We can use context also to decide what would be a good course of action or plan. So far we have been talking about common sense inferences about physical actions; what about the social world? You and I also make common sense inferences about the social world around us. One possibility is that you and I have a theory of mind; this is actually called the theory of mind. You and I, as cognitive agents, ascribe goals, beliefs, and desires to each other. And it's this theory of mind that allows us to make inferences about each other, including common-sensical inferences.

22 – Final Quiz

Great. Thank you so much for your feedback.

16 – Scripts

02 – Exercise A Simple Conversation

To motivate our discussion of scripts, let us continue with our metaphor of machines with whom we can talk in stories. Now the Watson program and the Siri program normally understand stories that are limited to one sentence. You can ask Watson a question, you can ask Siri a question, and those questions are one-sentence questions. Similarly, when Siri and Watson reply, they typically give their answers in one word or, at most, one sentence. Stories play a very important role in human cognition, and we would expect AI agents that live among humans also to be able to understand stories. And one of the common roles that stories play is that they enable common sense reasoning. To see how stories enable common sense reasoning, consider two simple sentences. Imagine the first sentence was, Ali asked, do you think we'll have many customers in the next half hour? And the second sentence is, Sarah replied, go ahead and grab your lunch. So these are two sentences of a story. Does this story make sense to you?

04 – Story Understanding for AI Agents

>> That's a good story, David. Now let's consider a different set of issues. Imagine that I told you a story. Bob went to a restaurant and sat down, but nobody came to serve him for quite a while. Eventually, when someone did appear, he ordered a hamburger. The hamburger took a long time before it came, and when it did come, it was burned. Bob was not very happy. He didn't even finish the hamburger. Do you think Bob left a large tip? Well, I expect most of you would say no, he did not leave a large tip. If you had to wait for a long time, and if the food that eventually came was not of very high quality, you'd probably not leave a large tip. But how did you come to that answer? Why did you expect that in this particular case Bob would not leave a large tip? Again, this connects to your notion of a story, your notion of what happens in a restaurant. Why do people leave tips? When do they leave them? To put it another way, stories help you make sense of the world. They help you generate expectations, even before events occur. They allow us to make connections between events that otherwise might appear disparate. So in this lesson, we'll look at another structured knowledge representation called scripts, for representing stories of the kind that we are talking about. And we will see how this knowledge representation allows us to make sense of the world and answer the kinds of questions that we are talking about.

05 – Visiting a Coffeehouse

>> Notice I didn't need to tell David what coffee house, what time of day, what he was ordering, who the cashier was, etc. He has a script for that scene, and he invokes it when needed. It helps with different expectations like, the cashier's going to give me my total, my drink should be coming up soon. If those expectations are not met, he knows something has gone wrong and he needs to react. This is the power of a script. It helps to generate expectations about scenes in the world.

06 – Definition of Scripts

So what is a script? A script is a knowledge representation for capturing a causally coherent set of events. Causally means that one event sets off another. So when David goes to the coffee house, as soon as he approaches the counter, the barista comes to him and says, what do you want? One event has set off the next one. Coherent means the links between these events make sense in the context of the world around us. So in David's script again, ordering coffee doesn't cause the barista to slap him in the face, because that would not be causally coherent in the context of the world. These events refer to events in the world. Some events, like deciding or concluding, might be in the actor's mind, but for the most part these events are observable events.

07 – Parts of a Script

So the structured knowledge representation called a script has six parts to it. The first part is called entry conditions. These are the conditions necessary to execute the script. So, for a restaurant script, as an example, the entry condition might be that there is a customer who is hungry and the customer has some money. The result refers to the conditions that will become true after the script has occurred, after it has taken place. So, for the restaurant script, the result might be that the owner of the restaurant has more money, the customer has less money, and the customer is now pleased and is no longer hungry. The third part of a script is the props. Props are the kinds of objects that are involved in the execution of the script, so in the case of the restaurant script, the props might include tables and menus and food items and so on. The fourth part of a script is the roles. These are the agents involved in the execution of the script. As an example, in the restaurant script these might be the customer who goes to the restaurant, the owner of the restaurant, the waiters or waitresses in the restaurant, and so on. The fifth element of a script is the track. Tracks are variations or subclasses of a particular script. So for example, in the case of the restaurant script we may have tracks for going to a coffee house, or going to a fast food restaurant, or going to a fine dining house. And finally, the sixth element of a script refers to the scenes. Scenes are specific sequences of events that occur during the execution of the script. So in the case of the restaurant script there might be a scene for entering the restaurant, a scene for ordering food, a third scene for accepting the food, and so on. When you put all of these six elements together, then you get a complete script. In the previous lesson we had used the metaphor of a molecule as a knowledge representation; that is, knowledge representations that are not small or atomic, but molecular in nature. A script is a big, large molecule.
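
A minimal sketch of this six-part structure as a Python data class; the field names follow the lecture's terminology, but the class itself is an illustrative assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Script:
    name: str
    track: str = ""                              # variation, e.g. formal dining
    props: list = field(default_factory=list)    # objects involved in the script
    roles: list = field(default_factory=list)    # agents involved in the script
    entry: list = field(default_factory=list)    # conditions needed to start
    result: list = field(default_factory=list)   # conditions true afterwards
    scenes: list = field(default_factory=list)   # ordered sequences of events
```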

08 – Constructing a Script

So here is a representation of the restaurant script. Here is the name of the script, restaurant, and the six elements that we talked about earlier: track, props, roles, entry, result, and scenes. This particular track refers to formal dining. Here are the props: tables, menu, check, money, food, and place. For food and place we can use symbols; these symbols can be used as variables, so we may have different kinds of foods and different kinds of places in a restaurant. Here are the roles, so S is the customer, W is the waiter, C is the cook, M is the cashier, O is the owner, and so on. The entry conditions are that S is hungry and S has money. The result conditions are that S has less money, S is not hungry, S is pleased, and O has more money. And scenes, well, let's discuss them in the next slide. Here is a representation of scene one. We'll call it the entering scene. This particular scene consists of several events. In the first frame, the customer S moves himself or herself to some restaurant, some place P. In the second frame, the customer S sees some table. In the next frame, the customer S decides to take an action: S moves himself or herself to the table. So this is a walking action going on. Let us continue just a little bit longer. Now S is moving his or her own body into a sitting position. Here, the waiter sees the customer, and now the waiter moves himself to the customer. And now the waiter moves the menu to the customer. And this completes the representation of the first scene of entering a restaurant. One can imagine many more scenes. The next scene might be where the customer orders food. The third scene might be where the waiter brings the food, and so on and so forth. And the last scene is where the customer pays the bill and then walks out. This is a stereotypical notion of a script. Your notion of a script might be slightly different depending on what kinds of restaurants you go to. In different cultures, the script for going to a restaurant might be quite different. The point here is that the script is capturing, in a knowledge representation, what is known about the stereotypical situation of going to a restaurant of a particular kind.

09 – Form vs Content

So far we have talked about a general script for going to a restaurant, an abstract script for going to a restaurant. This is like a class. We can instantiate it. So here is the same script instantiated, with these values bound to the variables. So, Salwa is now the customer, Lucas is the waiter, and so on. This instantiation is an important aspect of intelligence. Let's go back to our intelligent agent. It might be that Salwa is really a robot. Now, how would a robot know what to do in a restaurant? How do we program a robot in such a way that it would know what actions to take in a particular situation? Well, suppose that Salwa the robot had, in its memory, a number of scripts like this one. When it entered a restaurant, it invoked the restaurant script, which told it exactly what kind of actions to take. We can also see how this script allows Salwa, the robot, to generate expectations, to guess what will happen even before it happens. There is one more thing worth noting here. Notice how we are composing scripts out of these primitive actions, the same primitive actions that occurred in the last lesson. These primitive actions are now providing the fundamental units, the frames, that composed together in some causally coherent sequence make a script. This brings up another point. Notice how some knowledge structures are composed out of other knowledge structures. Earlier, we had frames for these primitive actions, which are themselves knowledge structures. Now we have scripts, which are composed out of these frame-like knowledge structures.
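
Here is a small sketch of this form-versus-content distinction: the generic script supplies the form, and a set of bindings supplies the content. The role bindings follow the lecture's example (Salwa, Lucas); the dictionary layout and the helper function are illustrative assumptions.

```python
# The generic restaurant script (form) with role variables S, W, O ...
restaurant = {
    "track": "formal dining",
    "roles": {"S": "customer", "W": "waiter", "O": "owner"},
    "entry": ["S is hungry", "S has money"],
    "result": ["S has less money", "O has more money", "S is not hungry", "S is pleased"],
}

bindings = {"S": "Salwa", "W": "Lucas"}   # content: a specific instantiation

def instantiate(conditions, bindings):
    """Substitute role variables with specific individuals."""
    out = []
    for condition in conditions:
        for var, value in bindings.items():
            condition = condition.replace(var + " ", value + " ")
        out.append(condition)
    return out

print(instantiate(restaurant["entry"], bindings))
# ['Salwa is hungry', 'Salwa has money']
```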

10 – Using a Script to Generate Expectations

>> David, that's a good point. That happens in many different kinds of movies. So when I go and see a science fiction movie or even a romance movie, I'm expecting certain things to happen. And sometimes I think a movie is really good if it is new and novel and different and offers some surprising things. Notice that this could also be the beginning of a theory of creativity. In the last lesson, we were talking about puns and humor. Now, we're talking about surprises. Some current theories of creativity say that a situation is creative if it is, A, novel; B, useful or valuable in some way; and C, unexpected or surprising. This begins to capture at least one of those three dimensions, unexpectedness or surprise, and that's an important part. We'll return to computational creativity at the end of this course, when we'll talk a lot more about these issues.

11 – Tracks

Now, another part of the script was the track, and we really haven't talked a lot about tracks so far. So let's talk a little bit more about them. Here are four tracks within the restaurant script, four variations on going to a restaurant: a coffeehouse, fast food, casual dining, and formal dining. You could add more if you wanted to. Now, in restaurants of all kinds, some events are common. You have to go to a restaurant, you have to order some food, you eat that food, you pay the bill, and then you leave. That is common to all of them, which is why all of them are part of the restaurant script. On the other hand, what happens in a coffeehouse is quite different from what happens in formal dining, which is quite different from what happens in a fast food restaurant. So you may have specific tracks that correspond to coffeehouses and fast food and so on. In effect, we are building a semantic hierarchy of scripts. Here is a script for going to a restaurant; here is a script for going to a coffeehouse, or going to fast food, and these can be tracks in the overall script. Of course, we can extend the semantic hierarchy to something higher than this. We could think about going to social events in general, and then going to a restaurant becomes part of going to a social event of various kinds. Okay, now that we know something about the knowledge representation called a script, the next question becomes, how may an AI agent actually use these scripts? So imagine an AI agent that is hungry, has some money, and decides to do something about it. It may go into its long-term memory and find the script that will be most useful for the current situation. This really becomes a classification problem. Long-term memory has a large number of scripts, and the agent is trying to classify the current situation into one of those scripts. Let us suppose the agent picks the restaurant script and decides to execute it. As it enters the restaurant, the scene it observes matches the conditions of a fast food script, so it decides to invoke the fast food script. In this way the robot may walk down the semantic hierarchy, first invoking the restaurant script, then invoking the fast food script, and so on. Now, a robot could have taken a different stance. A robot could have decided to do planning. Given some initial conditions and goal conditions, the robot may have used the operators available to it to generate a plan at run time. What the script is doing is giving it a plan in a compiled form. The robot doesn't have to generate this plan at run time; it is already available in memory in a pre-stored form. This is very useful, because one of the central conundrums that we have been talking about is, how is it possible for AI agents to address computationally complex problems with limited resources in near real time? In a complex, dynamic world, planning can take a lot of time. But if I already have stored plans, then invoking a script and executing it is much faster.
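
A tiny sketch of script selection as classification: the agent picks the most specific script whose entry conditions all match the observed situation. The hierarchy, the condition strings, and the selection rule are illustrative assumptions.

```python
# A hypothetical restaurant hierarchy; entry conditions are sets of observations.
scripts = {
    "restaurant":  {"entry": {"hungry", "has money"}, "parent": None},
    "fast food":   {"entry": {"hungry", "has money", "counter", "menu board"},
                    "parent": "restaurant"},
    "coffeehouse": {"entry": {"hungry", "has money", "counter", "espresso machine"},
                    "parent": "restaurant"},
}

def select_script(observations):
    """Return the most specific script whose entry conditions are all observed."""
    matches = [name for name, s in scripts.items() if s["entry"] <= observations]
    # Prefer the most specific match (the one with the most entry conditions).
    return max(matches, key=lambda n: len(scripts[n]["entry"]), default=None)

print(select_script({"hungry", "has money", "counter", "menu board"}))  # fast food
```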

12 – Exercise Learning a Script

So, we have talked a lot about how the notion of scripts is connected to many of the topics that we have discussed earlier in this course. So, let’s do an exercise together. Which of the following topics might help an agent learn a script? Please check all that apply here.

14 – Exercise Using a Script

Okay, let us do one more exercise together. This particular exercise has to do with using a script rather than learning a script. Which of these topics that we have discussed earlier apply to an agent using a script? Check all that may apply.

15 – Exercise Using a Script

>> This is good, David. Thank you for sharing this. Note that generate and test and case-based reasoning might be applicable to the use of scripts after all. One can imagine a situation where there are a large number of scripts available. The robot has to decide which of these scripts it should use for a particular situation, and it may not be able to classify the situation directly into one of the scripts. In that case the robot might pick a script, try it out, see if it works, and if it does not, pick another one. Also, case-based reasoning is connected to the application of scripts in the sense that both case-based reasoning and script-based reasoning are extremely memory intensive. What both of them are saying is that memory often supplies most of the answer. Like we said earlier when we were discussing case-based reasoning, we don't think as much as we think we do. Most of the time, memory gives us the answer. The difference, of course, as David pointed out, is that cases are instances, whereas scripts are abstractions over instances.

16 – Assignment Scripts

So how would scripts help inform the design of an agent to solve Raven’s progressive matrices? Remember, scripts are ways of making sense of complex events in the world and we can certainly consider individual Raven’s matrices to be complex situations. You thus might have a script for different broad categories of Raven’s problems. If this was your approach, what would your entry conditions be for each script? What would the tracks be? What would the scenes be? Where are these scripts going to come from? Are you going to tell the agent what script it should use or will it learn a script from prior problems? If the agent succeeds using the script that you give it, who is intelligent? You or your agent?

17 – Wrap Up

So today we've talked about scripts, a complex way of understanding stories in the natural world. Stories aren't just narratives, though. Paintings, songs, and buildings are all stories of different kinds. Stories are around us every single day. We started off by defining scripts. Scripts are causally coherent sequences of events. They give a prototype for what to expect in certain situations. The form of the general script gives us the overall prototype for the situation. A specific instantiation of the script then specifies the content. Scripts can have different tracks as well. At a high level, any kind of restaurant involves entering, ordering, paying, and leaving. More narrowly, though, fast food and drive-through restaurants involve different scripts from casual or formal dining. This concludes our unit on common sense reasoning, but note that some of what we cover in the future will be applicable to learning, differentiating, and refining scripts.

18 – The Cognitive Connection

Scripts are strongly connected to current theories of human cognition. In fact, one recent theory says that the brain is a predictive machine. We do very quick bottom-up processing followed by mostly top-down processing that generates expectations about the world, and then we act on those expectations. This idea in fact is so strong that when it fails it leads to amusement, or surprise, or anger. If I violate the expectations of your script, you might find it funny or surprising, or you might be upset about it. An interesting and open question is whether we carry these scripts around in our heads or whether we generate them at run time. Scripts are also connected with the notion of mental models. You and I have mental models, or scripts, not just about social situations like going to a restaurant or going to a movie, but also about how a computer program works, how the economy works, how a car engine works, how our physical, social, and economic worlds work. Note that scripts can be culture specific. In the U.S., for example, going to a restaurant typically involves leaving a tip. But in many countries, this is not the case. In fact, in some countries tipping is considered insulting. So scripts presumably evolved through cultural interaction over long periods of time. But once there, they're a very powerful source of knowledge.

20 – Final Quiz

Great. Thank you so much for your feedback.

17 – Explanation-Based Learning

01 – Preview

Today, we'll talk about explanation-based learning. In explanation-based learning, the agent doesn't learn new concepts. Instead, it learns connections among existing concepts. We'll use explanation-based learning to introduce the notion of transfer of knowledge from an old situation to a new situation. This will help us set up the infrastructure needed for talking about analogical reasoning next time. Today, we'll start by talking about a concept space, where we will map out all the concepts and the relationships between them. Then we'll introduce the notion of abstraction that helps us do transfer. Finally, we'll use transfer to build complex explanations that will lead to explanation-based learning.

02 – Exercise Transporting Soup

To illustrate explanation-based learning, let us begin with an exercise. Imagine that you want to transport soup from the kitchen to the dining table. Unfortunately, all of the usual utensils you use to transport soup are unavailable. They’re dirty, or just not there. So you look around, and you see some objects in the vicinity. Here is your backpack, here is a pitcher, here is a box, here is your car. And you wonder which one of these four objects could you use to transport soup from the kitchen to the dining table. Well, which one would you use?

03 – Exercise Transporting Soup

>> I think most of us would give the same answer: the pitcher, not the backpack or the car or the box. Now for humans, for you and me, this is a fairly easy problem. All of us get it right pretty much all the time. But what about a machine? What about a robot? How would a robot decide that a pitcher is a good utensil to use for transporting soup from the kitchen to the dining table, but not a backpack and not a box? For a robot, this is a surprisingly hard problem. So the question then becomes, what is it that makes it easy for humans and hard for robots? How can we program AI agents so that it would be easy for them as well? One important thing to note here: this is another example of incremental learning. We have come across incremental learning earlier, when we were talking about incremental concept learning. There we were given one example at a time, that was the example of an arch, and we were learning one concept, the concept of an arch. This is in contrast to other methods of machine learning, where one is given a large amount of data and one has to detect patterns of regularity in that data. Here, learning occurs one step at a time, from a small number of examples, and one single concept is learned. We also came across the notion of incremental learning when we were talking about chunking. There too, there was one particular problem, and from a small number of previous episodes, we chunked a particular rule. This notion of incremental learning informs much of knowledge-based AI. Another thing to note here: this notion of explanation-based learning is related to creativity. We talked earlier about the relationship between creativity and novelty. Here is an example in which an AI agent is dealing with a novel situation. The usual utensils for taking soup from the kitchen to the dining table are not available. What should the robot do? The robot comes up with a creative solution of using the pitcher as the utensil.

04 – Example Retrieving a Cup

So imagine that you have gone to the hardware store and bought a robot. This is a household robot. Usually the robot goes into the kitchen, makes coffee and brings it to you in a cup. However, last night you had a big party, and there’s no clean cup available in the kitchen. The robot is a creative robot and looks around, and it finds an object. And this object is light and made of porcelain. It has a decoration. It is concave. It has a handle. The bottom is flat. The robot wonders, could I use this particular object like a cup? It would want to prove to itself, that this object, in fact, is an instance of this concept of cup. How might the robot do it?

05 – Concept Space

So let us now see how the AI agent may go about building an explanation of why this particular object is an instance of a cup. This is what the robot wants to prove: the object is a cup. And this is what the robot knows about the object: the object has a bottom, the bottom is flat, the object is made of porcelain, and so on and so forth. So the question then becomes, how might the AI agent embodied in the robot go about using its prior knowledge in order to build this explanation? Put another way, what prior knowledge should the AI agent have so that it can, in fact, build this explanation?

06 – Prior Knowledge

Let us suppose that the AI agent has prior knowledge of these four concepts: a brick, a glass, a bowl, and a briefcase. Let us look at one of these concepts in detail. So the brick is stable because its bottom is flat, and a brick is heavy. First, this particular conceptual characterization has several parts, about stability and about heaviness. Second, there is also something about causality here: the brick is stable because its bottom is flat. This is an important element of the brick, and similar things occur in the other concepts. Let us now consider how an AI agent might represent all of this knowledge about a brick. Here is a visual rendering of this representation. First, the AI agent knows a lot of facts about the brick. A brick is heavy. The brick has a bottom and the bottom is flat; that comes from the second part of the first sentence. And also the brick is stable. So here are some observable facts, and this is a property of the brick. This is part of the structure of the brick, and this is part of its function. In addition, the AI agent knows that the brick is stable because the bottom is flat. So we need to capture this notion of causality, and these yellow arrows here are intended to capture it. The AI agent knows that the brick is stable because the brick has a bottom and the bottom is flat. In this way, it connects these structural features to these functional features through these causal connections. To take another example, here is a conceptual characterization of a briefcase and here is its knowledge representation. I'll not go through this in great detail, but briefly, a briefcase is liftable because it has a handle and it is light, and it is useful because it contains papers. Notice the notion of causality here again. Once again there are these facts about the briefcase, for example the briefcase is portable, the briefcase has a handle, these structural, observable features. Then there are these functional features, like the briefcase is useful and the briefcase is liftable. And then we have these yellow arrows denoting the causal connections between them. Similarly for the bowl, here is its conceptual characterization and its knowledge representation. The bowl contains cherry soup; that is one of the facts here. The bowl is concave; that's another fact. And there is a causal relationship here: it carries liquids because it is concave. Finally, the fourth concept the AI agent knows about is a glass. A glass enables drinking because it carries liquids and it is liftable, and it is pretty. So the glass is pretty, the glass carries liquids, it is liftable, and the fact that it enables drinking is because of these two other facts. Note also that not all of the structural features participate in this causal explanation.
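
One way to sketch this prior knowledge in code is as a set of causal rules, each saying that the concept has some function because of some structural features. The triple format and the exact strings below are illustrative assumptions.

```python
# (concept, effect, causes): the concept has the effect because of the causes.
causal_rules = [
    ("brick",     "is stable",        ["has flat bottom"]),
    ("briefcase", "is liftable",      ["has handle", "is light"]),
    ("bowl",      "carries liquids",  ["is concave"]),
    ("glass",     "enables drinking", ["carries liquids", "is liftable"]),
]
```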

07 – Abstraction

Now that we have the characterizations and the knowledge representations of the four concepts worked out, let us see how the AI agent might actually use them. Let's look at the bowl. Here was the knowledge representation of the characterization of the bowl. The AI agent will abstract some knowledge from this particular example; here is its abstraction. Two things have happened here. First, it abstracts only those things that are in fact causally related. Features that have no causal relationship with other things are not important, and they can be dropped. So we can add one more element to the notion of an explanation: the explanation is a causal explanation. The AI agent is trying to build a causal explanation that will connect the instance, the object, to the cup. Second, the AI agent creates an abstraction of this characterization of the bowl, replacing the bowl with a generic object. So here, the bowl carries liquids because it is concave, and this is abstracted to: the object carries liquids because it is concave. This abstraction is going to play an important role in constructing the causal explanation.
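
Continuing the earlier sketch, the abstraction step might look like this: keep only the causally relevant features and replace the specific concept with a generic object. Again, the representation itself is an illustrative assumption.

```python
def abstract(rule):
    """Replace the specific concept with a generic 'object' in a causal rule."""
    concept, effect, causes = rule
    return ("object", effect, causes)

bowl_rule = ("bowl", "carries liquids", ["is concave"])
print(abstract(bowl_rule))   # ('object', 'carries liquids', ['is concave'])
```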

08 – Transfer

>> This kind of explanation-based learning actually occurs in our everyday life. You and I are constantly improvising. Papers are blowing off my desk. How can I stop them from blowing off? I need something to stop them from blowing away. What is available? What can act as a paperweight? A cup. Here is a cup; let me put it on the papers. This is an example of improvisation, where we use explanation-based learning to realize that anything that's heavy and has a flat bottom can act as a paperweight. Here is another example. I need to prop open a door. A doorstopper is not available. What can I use? Perhaps an eraser or a chair. You and I do this kind of improvisation all the time, and often we are building these explanations that tell us that an eraser can be used as a doorstopper.

09 – Exercise Explanation-Based Learning I

Okay, let us do an exercise together. This time, instead of showing that an object is an instance of a cup, we are going to try to show that an object is an instance of a mug. Here is the definition of a mug: a mug is an object that is stable, enables drinking, and protects against heat. Notice that we have added one more element here: not only stable like a cup, not only enables drinking like a cup, but also protects against heat. Here is the object given in the question. The object is light and is made of clay. It has a concavity and has a handle. The bottom is flat and the sides are thick. You can assume that the agent knows about all four examples as earlier: the glass, the bowl, the brick, and the briefcase. In this particular case the agent also knows about yet another example, a pot. The pot carries liquids because it is concave, and it limits heat transfer because it has thick sides and is made of clay. Your task is to build an explanation that shows that this object is an instance of a mug. Can we prove this?

10 – Exercise Explanation-Based Learning I

>> That is good, David. Let us make sure that we understand the processing that David did. He wanted to show that the object is a mug. So he looked at the conditions for proving that the object is a mug, and there were three of them. For each of them, he tried to build a proof. He could do so for the first two, but this one remained open. So he came up with the closest example, which was the pot, and he did the abstraction here, but he was unable to link these two because there is no knowledge which links them at the present time.

11 – Exercise Explanation-Based Learning II

So let us do another exercise that builds on the previous one. Which of these four concepts will enable the AI agent to complete the proof from the previous exercise?

12 – Exercise Explanation-Based Learning II

>> There are a couple of other points to note about this exercise. An important question in knowledge-based AI is, what knowledge does one need? It's not a question of putting a lot of knowledge into a program. Instead, the real question is, in order to accomplish a goal, what is the minimal amount of knowledge that the AI agent actually needs? Let's see how this applies here. The goal was to show that the object is a mug. Instead of putting in a lot of knowledge, the agent starts asking, what do we need in order to show that the object can protect against heat? What do we need to know to show that the object is stable? And then it goes about searching for that knowledge. The second point to note here is that, depending on the background knowledge available, the agent will opportunistically build the right kind of causal proof. So if the agent knows about the wooden spoon, it will produce this proof. If, on the other hand, the AI agent knew not about the wooden spoon but about the oven mitt, then it could use this other proof. Which proof the AI agent will build will depend upon the precise background knowledge available to it.
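
Here is a small sketch of this goal-directed proof construction as backward chaining over abstracted causal rules. The rule strings paraphrase the mug example informally, and the rule linking limits heat transfer to protects against heat is assumed to come from an additional example such as the oven mitt or the wooden spoon.

```python
# Abstracted causal rules: (effect, causes). Proving a goal means either
# observing it as a feature of the object or proving all causes of some rule.
rules = [
    ("is stable",             ["has flat bottom"]),
    ("carries liquids",       ["is concave"]),
    ("is liftable",           ["has handle", "is light"]),
    ("enables drinking",      ["carries liquids", "is liftable"]),
    ("limits heat transfer",  ["has thick sides", "is made of clay"]),
    ("protects against heat", ["limits heat transfer"]),   # assumed extra knowledge
]

object_features = {"is light", "is made of clay", "is concave",
                   "has handle", "has flat bottom", "has thick sides"}

def prove(goal):
    """Backward chaining: a goal holds if it is observed or some rule derives it."""
    if goal in object_features:
        return True
    return any(all(prove(c) for c in causes)
               for effect, causes in rules if effect == goal)

print(all(prove(g) for g in ["is stable", "enables drinking", "protects against heat"]))
# True: the agent can explain why the object is a mug
```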

13 – Explanation-Based Learning in the World

Explanation-based learning is very common in the real world. You and I do it all the time. You need to prop open a door, you bring a chair, and you use it to prop open the door, because you have just built an explanation for why the chair, in fact, can prop open a door. There is a sheaf of papers on a desk that keeps shuffling around. You take a coffee mug and put it on the sheaf of papers, and it acts as a paperweight. That is another example of explanation-based learning. You and I are constantly dealing with novel situations, and we are constantly coming up with creative solutions to them. How do we do it? One way is that we use existing concepts, but use them in new ways. We find new connections between them by building explanations. This is sometimes called speed-up learning, because we're not learning new concepts, we're simply connecting existing concepts. But it's a very powerful way of dealing with a large number of situations. And today in class, we'll learn how we can build AI agents that can do the same thing that you and I do so well.

14 – Assignment Explanation-Based Learning

So how would you use explanation-based learning to implement an agent that can solve Raven's progressive matrices? The first question to ask here is, what exactly are you explaining? Are you explaining the answer to the problem, or are you explaining the transformations between figures in the earlier stages of the problem? Given that, what new connections are you learning? Is the learning performed within a problem, where new connections justify the figure that fills in the blank, or across problems, where new transformations and types of problems can be learned and connected together? For example, you might imagine that you've encountered two problems before, one involving rotation and one involving reflection. A new problem might involve both. How do you use those earlier problems to explain the answer to this new problem?

15 – Wrap Up

So today, we’ve talked about explanation-based learning, a type of learning where we learn new connections between existing concepts. We first talked about our concept space. A space of information that enables us to draw inferences and connections about existing concepts. We then talked about how prior knowledge is mapped onto this concept space for new reasoning. Then we talked about how we may abstract over prior knowledge to discern transferable nuggets. And how we might then transfer those nuggets onto the new problem we encounter. Next time, we’ll expand on this idea of transfer to talk about analogical reasoning, which is inherently transfer-based. Explanation-based learning will also come up significantly in learning about correcting mistakes and in diagnosis. So feel free to jump ahead into those lessons, if you’re interested in continuing with this example.

18 – Final Quiz

Great. Thank you so much for your feedback.

18 – Analogical Reasoning

01 – Preview

Today we'll talk about analogical reasoning. Analogical reasoning involves understanding new problems in terms of familiar problems. It also involves addressing new problems by transferring knowledge of relationships from known problems across domains. We introduced the notion of transfer previously in explanation-based learning. We have also talked about case-based reasoning. Today we'll talk about transfer in a much more general manner. We'll start by talking about similarity, then revisit case-based reasoning. Then we'll talk through the overall process of analogical reasoning, including retrieval, mapping, and transfer. Then we'll close by talking about a specific application of analogy, called design by analogy.

02 – Exercise Similarity Ratings

To illustrate the notion of similarity, let us consider an example. Consider that a woman is climbing up a ladder. Here are seven situations. Can you please rank these seven situations by their order of similarity to the given situation?

03 – Exercise Similarity Ratings

>> Interesting answer, David. Note that there are several factors in David's answers. In the two situations that he thought were most similar to a woman climbing up a ladder, there is similarity in the relationship, climbing up, as well as similarity between the objects, woman and ladder. In contrast, in the one that he did not think was really similar to a woman climbing up a ladder, a woman painting a ladder, although there is some similarity between the objects, woman and ladder, the relationship is very different: here it is climbing up a ladder, there it is painting a ladder, which are two very different activities. Between one and two we notice that both of them have the same relationship, climbing up, but one object is different: in one case it is a step ladder and in the other case it is a set of stairs. So one can have similarities in relationships, and one can have similarities in objects. Of course, some of you may have given different rankings than the ones David gave, because your background knowledge might be different or your priorities might be different. The point here is that similarity can be measured along several dimensions: along the dimension of relationships, along the dimension of objects, along the dimension of features of objects, and along the dimension of values of the features of objects that are participating in the relationships. We'll talk more about this in just a few minutes.

04 – Cases Revisited

We have come across the notion of similarity earlier in this course. When we were discussing learning by recording cases, we came across the method of finding the nearest neighbor. At that point we found the nearest neighbor simply by looking at the Euclidean distance between the new situation and the familiar situations. We came across the notion of similarity when we were discussing case-based reasoning as well. At that point, we came across at least two different methods of organizing the case library. In one method, we could simply organize all the cases in an array. Here is an array of several cases in the domain of navigation in an urban area; each case here is represented by the x and y location of the destination. A different and smarter method organizes cases in a discrimination tree. The leaf nodes of this discrimination tree represent the cases. The root node and the interior nodes represent discriminations, or decisions about the values of specific features, for example, east of 5th Street or not east of 5th Street. Both of these indexing schemes are based on measures of similarity. In the first scheme, the similarity is based on the similarity between the tags: if a new problem were to come along, it would be more or less similar to one of these cases depending on whether or not its tags match the tags of a particular case. In the second scheme, that of the discrimination tree, similarity is based on traversing the tree: if a new problem came along, we would use the features of that new problem to traverse this tree and find the case whose features best match the new problem. Note that the new problem and the source cases in all of these examples so far have been in the same domain. Here, for example, both the new problem and the source case are in the same domain of navigating in an urban area; in the previous example, the new problem and the source case were in the domain of colored blocks in the blocks world. What happens if the new problem and the source case are not in the same domain? Consider the example of a woman climbing up a ladder and an ant climbing up a wall. The two domains are not the same: we're talking about a woman in one case and an ant in the other, a ladder in one case, a wall in the other. Yet there is some similarity. Situations like this, where the new problem and the source case are from different domains, lead to cross-domain analogies. So the question now becomes, how can we find similarity between the new problem, the target problem, and the source case, if they happen to be in different domains?
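
For the first, simplest scheme, here is a minimal sketch of nearest-neighbor retrieval over (x, y) destinations using Euclidean distance; the case coordinates are made up for illustration.

```python
import math

# Cases indexed by the (x, y) location of their destination (made-up data).
cases = {"case-A": (2, 9), "case-B": (7, 3), "case-C": (4, 8)}

def nearest_case(problem_xy):
    """Return the case whose destination is closest to the new problem's."""
    return min(cases, key=lambda name: math.dist(cases[name], problem_xy))

print(nearest_case((6, 4)))   # case-B
```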

05 – Need for Cross-Domain Analogy

To dig into this issue of similarity between a target problem and a source case in different domains, let us look at another example. Let us suppose there is a patient who has a tumor in his stomach. There is a physician who has a laser gun. She knows that if the laser light were to shine on this tumor, the tumor would be killed and the patient would be cured. But the physician has a problem. The laser light is so strong that it will also kill all the healthy tissue on its way to the tumor, thereby killing the patient. What should the physician do? This is actually a very famous problem in cognitive science. It was first used by a psychologist called Karl Duncker around 1926. What do you think the physician should do in this situation? Take a moment and think about it. We'll return to the physician and patient example in just a minute. First let me tell you another story. Once there was a kingdom ruled by a ruthless king, and there was a rebel army approaching the fortress in which the king lived. But there was a problem. The king's men had mined all the roads approaching the fort. As a result, if an army were to walk over the roads, the mines would go off, and the soldiers would be killed. So what did the army decide to do? The army decided to decompose itself into smaller groups, so that each group could come from a different road and reach the fort at the same time. Because each group was small enough, the mines on the roads did not go off. The soldiers were able to attack the fort at the same time and overthrow the bad king. Now let's go back to the problem of the physician and the patient. What do you think now? Has the answer to the problem changed? Some of you indeed may have changed your answer because of this story I told you about the king and the rebel army. One solution to this problem is that the physician could divide the very intense laser beam into several smaller, less intense beams. As these beams come from different directions, they do not harm the healthy tissue. However, they reach the tumor at the same time and manage to kill the tumor. You will note that this is an example of cross-domain analogy. Here the target problem had to do with the physician and the patient. The source case had to do with the king and the rebel army. The objects in these two situations were clearly very different. In one case we had a physician and the patient, the laser beam and the tumor. In the other case we had the king and the rebel army, the fort and the mines. But some of the relationships were very similar. In the case of capturing the fort, we had a resource, the army, which was decomposed into several smaller armies that were sent to the goal location at the same time. We took this strategy, abstracted it out, and then applied it to the patient and physician example. The physician used the same strategy: decompose the resource into several smaller resources and send them to the goal at the same time. Now you can also see why the ant climbing a wall is similar to a woman climbing a ladder. The objects are different, ant and wall, woman and ladder, but the relationship is similar: climbing up. In cross-domain analogy, then, the objects and the features and the values of the objects can be different. The similarity is based on the relationships. It is the relationship that is important, and it is the relationship that gets transferred from the source case to the target problem.

06 – Spectrum of Similarity

We can think of a spectrum of similarity. At one end of the spectrum, the target problem and the source case are identical. At the other extreme end of the spectrum, the target problem and the source case have nothing in common. We can evaluate the similarity between the target problem and source case along several dimensions: in terms of the relationships occurring in the source case and the target problem, in terms of the objects occurring in the two, in terms of the features of the objects, and in terms of the values that the features of the objects take. At the end of the spectrum where the target problem and the source case are very similar, the relationships, objects, features, and values are all similar. At the other end, the values, features, and objects may be different, but the relationships are similar. If the relationships too are different, then there is nothing in common between the target problem and the source case. When the relationships, objects, features, and values are all similar, that is an example of recording cases, and we have come across it; an example of that was from the colored blocks in the blocks world. When the similarity between the target problem and the source case is along the dimensions of relationships and objects, but not along the dimensions of values and features, then that's an example of case-based reasoning. We discussed this method in the domain of navigation in urban areas. The objects being the same between the target problem and the source case means that the domains are the same, so case-based reasoning is within-domain analogy. In analogical reasoning in general, the objects in the target problem and the source case, too, might be different. We saw an example of analogical reasoning in the Duncker radiation problem, when we were talking about cross-domain analogical transfer. Actually, recording cases and case-based reasoning are also examples of analogical reasoning, except that they occur in the same domain; the target problem and the source case are in the same domain, which is why we considered them earlier. By analogical reasoning here, we mean cross-domain analogical transfer, as in the Duncker radiation problem.

09 – Three Types of Similarity

Semantic similarity is concerned with conceptual similarity between the target problem and the source case. If we recall the original exercise that David had answered, in that exercise a woman climbing up a ladder is conceptually similar, semantically similar, to a woman climbing up a step ladder. The same kinds of concepts occur in both situations: woman, and step ladder or ladder. Pragmatic similarity is concerned with external factors, factors external to the representation, such as goals. As an example, in the Duncker radiation problem, the physician had the goal of killing the tumor, which was similar to the goal of capturing the fort in the case of the rebel army and the king. The third measure of similarity is structural similarity. Structure here refers to the structure of representations, not to physical structure. Structural similarity refers to the similarity between the representational structures of the target problem and the source case, and we'll look at an example of this in just a few minutes. Note that one can assign different kinds of weights to these three measures of similarity. Some theories of analogy focus on structural similarity; other theories of analogy focus on semantic and pragmatic similarity. That is also why you may have given slightly different answers to the questions in the first exercise than David did.
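
A trivial sketch of what such a weighting might look like; the scores and weight values are made-up numbers, meant only to show that different theories of analogy correspond to different weight vectors.

```python
def overall_similarity(scores, weights):
    """Weighted combination of semantic, pragmatic, and structural similarity."""
    return sum(weights[k] * scores[k] for k in scores)

scores = {"semantic": 0.2, "pragmatic": 0.9, "structural": 0.8}

# A structure-focused weighting vs. a more pragmatic, goal-driven weighting.
print(overall_similarity(scores, {"semantic": 0.1, "pragmatic": 0.2, "structural": 0.7}))
print(overall_similarity(scores, {"semantic": 0.2, "pragmatic": 0.6, "structural": 0.2}))
```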

10 – Exercise Analogical Retrieval I

Let us do another exercise together now that we know about deep similarity and superficial similarity. Consider the situation again: a woman is climbing up a ladder. Given this set of situations, mark whether each of the situations is deeply similar or superficially similar to the given situation. Note that some might be both and others might be neither.

11 – Exercise Analogical Retrieval I

>> This is good, David. Once again, different people may give different answers to this exercise. Why do we do so? Well, let’s examine it next.

12 – Exercise Analogical Retrieval II

Many science textbooks in middle school or high school explain atomic structure in terms of the solar system. Here's a representation for the solar system, and here's a representation for the atomic structure. Let us see how this model of the solar system helps us make sense of the atomic structure. We'll use this example often going forward. In this representation of the solar system, the arrows denote causality. So the sun's mass is greater than the planet's mass, which causes the planet to revolve around the sun. Similarly, for the atomic structure, there is a force between the nucleus and the electron, and that causes the nucleus to attract the electron and the electron to attract the nucleus. Given these two models, what are the deep similarities between them?

13 – Exercise Analogical Retrieval II

>> Now we can see why these textbooks write about the solar system and the atomic structure in such a way that these relationships become salient. They help us make sense of the atomic structure by pointing to the deep similarities between the relationships that occur in the atomic structure and the relationships that occur in the solar system.

14 – Analogical Mapping

Now let us consider analogical mapping. The problem here is called the correspondence problem. There are a number of objects and relationships in this target problem, and a number of objects and relationships in this source case. What in the target problem corresponds to what in the source case? If we can address the correspondence problem, if we can say, for example, that the laser beam corresponds to the rebel army, then we can start aligning the target problem and the source case so as to make the deep similarities between relationships salient. Note there are several parts of the target problem and several parts of the source case. In principle, any of the objects of the target problem could correspond to any of the objects in the source case, in which case we would have an m-to-n mapping, and that becomes computationally inefficient. Yet you and I often do not have much of a problem deciding that the laser beam must correspond to the rebel army. How do we do it? And how can we help AI agents make similar kinds of correspondences? One answer is that we make use of relationships. In fact, we make use of higher-order relationships whenever possible; we give precedence to higher-order relationships over other relationships. As a unary relationship, we might say that the patient is a person here, and the king is a person there. As a binary relationship, we might say that the physician has a resource, the laser beam, and that on the other side there is a resource, the army itself. A higher-order relationship, a ternary relationship, might say that between the goal and the resource there is an obstacle, the healthy tissue in this case, and similarly that between the goal and the resource there is an obstacle, the mines, in the other case. We focus on the higher-order relationships; that's where the deepest similarity between the two situations lies. This is how we know to map between the king and the tumor, and not between the king and the patient. Although the king and the patient are superficially similar, a deeper similarity lies in viewing the king and the tumor in terms of goals which need to be captured or cured using a resource when there is an obstacle in between.
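
Here is a small sketch of that correspondence heuristic: align the arguments of relations that share a name, preferring higher-order (longer) relations, instead of trying every m-to-n pairing. The relation tuples paraphrase the radiation and fortress stories; the relation names and the rebel commander agent are illustrative assumptions.

```python
# Relations are tuples: (relation-name, argument, argument, ...).
source_relations = [
    ("obstacle-between", "mines",           "fort",  "rebel army"),   # higher-order
    ("has-resource",     "rebel commander", "rebel army"),
]
target_relations = [
    ("obstacle-between", "healthy tissue",  "tumor", "laser beam"),
    ("has-resource",     "physician",       "laser beam"),
]

def map_by_relations(source, target):
    """Pair up arguments of same-named relations, longest relations first."""
    mapping = {}
    for s in sorted(source, key=len, reverse=True):   # prefer higher-order relations
        for t in target:
            if t[0] == s[0] and len(t) == len(s):
                mapping.update(dict(zip(s[1:], t[1:])))
    return mapping

print(map_by_relations(source_relations, target_relations))
# {'mines': 'healthy tissue', 'fort': 'tumor',
#  'rebel army': 'laser beam', 'rebel commander': 'physician'}
```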

15 – Exercise Analogical Mapping

Let us do an exercise on deep relationships. Let's go back to the solar system and the atomic structure. Suppose you're given this representation of the solar system and this representation of the atomic structure. How would you map the solar system to the atomic structure?

16 – Exercise Analogical Mapping

>> This is right, David. Another thing to take away from here is the depth of understanding required to make the right kind of correspondences. If one didn't have the right kind of models of the solar system and the atomic structure, models that capture the deep relationships, then the mapping could not be done. The alignment wouldn't work, and we would not be able to understand the atomic structure in terms of the solar system. Thus deep, rich models of the two systems, the target problem and the source case, are essential to deciding how to align them, how to map them, and, as we will see in a moment, what to transfer and how to transfer it.

20 – Evaluation and Storage in Analogical Reasoning

Let us briefly talk about evaluation and storage. The evaluation and storage steps in analogical reasoning are very similar to the evaluation and storage steps in case-based reasoning. Analogical reasoning by itself does not provide guarantees of correctness, so the solution that it proposes must be evaluated in some manner. For the Duncker radiation problem, for example, we may evaluate the proposed solution by doing a simulation. Once the evaluation has been done, the new problem and its solution can be encapsulated as a new case and stored back in memory for later potential reuse. To return to the radiation problem as an example: once we have the solution of decomposing the laser beam into several smaller beams and sending them to the tumor at the same time from different directions, we can do a simulation of this solution and see whether it is successful. If it is, then we can encapsulate the target problem and the proposed solution as a case and store it in memory. It might be useful later; it could potentially become a source case for a new target problem that comes along later. Once again, in this way, the AI agent learns incrementally. Each time it solves a problem, the new problem and its solution become a case for later reuse.
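A minimal sketch of this evaluate-then-store loop, assuming a case is simply a (problem, solution) pair and that some domain-specific simulation routine is available; the simulate function below is just a stand-in that always succeeds.

```python
# A minimal sketch of evaluation and storage: evaluate a proposed solution by
# simulation, and if it succeeds, store the (problem, solution) pair as a new
# case for later reuse. simulate() is a stand-in for a real domain simulator.

case_library = []

def simulate(problem, solution):
    return True   # stand-in: a real system would actually test the solution

def evaluate_and_store(problem, solution):
    if simulate(problem, solution):                 # evaluation step
        case_library.append((problem, solution))    # storage step: a new source case
        return True
    return False

evaluate_and_store("destroy the tumor without harming healthy tissue",
                   "split the laser into weak beams converging on the tumor")
```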

23 – Design by Analogy Mapping Transfer

Recall that we started with the problem of designing a robot that can walk on water. Let us suppose that this target problem results in the retrieval of a source case of a robot design that we have already encountered, one that can walk on the ground. Now the question becomes, how can we adapt this design of a robot that can walk on the ground into a design for a robot that can walk on water? Let us now suppose we use this problem of designing a robot to walk on water as a probe into the case memory, and the case memory returns to us the design of the basilisk lizard. That might happen because the design of the basilisk lizard is indexed by its functional model, walking on water, so there is a pragmatic similarity between the two. We now have the design of a robot that can walk on the ground, and we have the design of a biological organism, the basilisk lizard, that can walk on water. For the basilisk lizard, we also have a complete model, a complete explanation of how its structure achieves its function. Now that we have a partial design for the robot, the design of a robot that can walk on the ground, and the design of an organism that can walk on water, we can try to do an alignment between the two. This alignment will be based on the similarity between relationships; clearly, the objects here and the objects there are very different. Once we have aligned these structural models of the robot that can walk on the ground and the basilisk lizard that can walk on water, we can start doing transfer. We can transfer specific features of the structure of the basilisk lizard, for example the shape of its feet, to the model of the robot that can walk on the ground, in order to convert it into a robot that can walk on water. Having constructed a structural model for this robot that can walk on water, we can then try to transfer the behavioral model, and then the functional model. In this way we have a complete model of a robot that can walk on water, along with an explanation of how it will achieve its function. This is sometimes called compositional analogy. We first do mapping at the level of structure, and that mapping at the level of structure helps us transfer some information. That in turn allows us to transfer information at the behavioral level. Once we have transferred information at the behavioral level, we can climb up this abstraction hierarchy and transfer information at the functional level. We can now revisit our computational process for analogical reasoning. Initially we presented this process as a linear chain: retrieval, mapping, transfer, evaluation, and storage. In general, however, there can be many loops here. We may do some initial mapping, for example, that may result in some transfer of information, but that transfer may then lead to additional mapping, and then to additional transfer, and so on. Here is another brief example from biologically inspired design. In this case we want to design a robot that can swim underwater in a very stealthy manner. This function of swimming underwater in a stealthy manner reminds a design team of the copepod. A copepod is a biological organism that has a large number of appendages. It moves underwater in such a way that it generates minimum wake, especially when it moves very slowly. On the other hand, when it moves rapidly, the wake becomes large; when the wake is small its motion is stealthy, and when the wake is large its motion is no longer stealthy.
An analogical transfer of knowledge about the copepod gives a design for the microbot at slow velocities. This analogy decomposes our original design problem. The original design problem was moving underwater in a stealthy manner. Now that we have the design of an organism for moving underwater at low velocities, we are still left with the subgoal of moving underwater at high velocities. The goal of designing a microbot that can move underwater in a stealthy manner at fast velocities may remind the design team of the squid. The squid uses a special mechanism, a jet-propulsion mechanism, to move underwater in a stealthy manner at fairly high velocities. Now we have created a design for the microbot where part of the design comes from the design of the copepod and the other part comes from the design of the squid. Instead of borrowing the design from one source case, we are borrowing parts of the design from multiple source cases. This is a compound analogy. Notice that there is also a problem evolution going on. We started with one problem and arrived at a partial solution to it, which then led to a problem evolution and a problem transformation; we then have a new understanding of the problem. So in this example we saw how we first did analogical retrieval of the copepod design, then mapping, then transfer. That then led to additional retrieval, in this case of the squid. Once again, this process is not linear. Just as we can iterate between mapping and transfer, we can iterate between transfer and retrieval.

25 – Advanced Open Issues in Analogy

There are a number of advanced and open issues in analogical reasoning that are the subject of current research. First, because analogical reasoning entails cross-domain transfer, does it mean that we necessarily need a common vocabulary across all the domains? Consider the example of the atomic structure and the solar system once again. Suppose I were to use the term revolve to say the electron revolves around the nucleus, but use the term rotate to say the planet rotates in an orbit around the sun. I have used two different terms. How then can I do alignment between these two situations? Should I use the same vocabulary? If I don't use the same vocabulary, what alternative is there? Second, analogical reasoning entails problem abstraction and transformation. So far we have talked as if the problem remains fixed while a source case is retrieved and knowledge is transferred across. But often the agent needs to abstract and transform the problem in order to be able to retrieve the source case. A third issue in analogical reasoning concerns compound and compositional analogies. So far we have assumed that, given a problem, we retrieve one case and transfer some knowledge from that case to the problem. But often we retrieve and transfer knowledge from not one case but several cases. If you're designing a car, you might design the engine by analogy to one vehicle and the chassis by analogy to some other vehicle. This is an example of compound analogy. But how can we make compound analogy work? In compositional analogy, analogy works at several levels of abstraction. Suppose we were to make an analogy between your business organization and some other business organization. We might make this compositional analogy first at the level of people, next at the level of processes, and third at the level of the organization as a whole. This is another example of compositional analogy, where mapping at one level supports transfer at the next level. How do we do compositional analogy in AI agents? Fourth, visuospatial analogies. So far we have talked about analogies in which the transfer necessarily engages causal knowledge, but there are a large number of analogies in which causality is at most implicit. We'll consider these visuospatial analogies later in the class. Fifth, conceptual combination. A powerful learning mechanism is learning a new concept by combining parts of familiar concepts. Analogical reasoning is one mechanism for conceptual combination. I have one concept, the target concept, that of the atomic structure, and another concept, the source concept, that of the solar system. I take some part of my knowledge of the solar system and combine it with my concept of the atom to get a new concept of the atom. If you're interested in any of these issues, I invite you to join the PhD program in Computer Science.

26 – Assignment Analogical Reasoning

So how would you use analogical reasoning to design an agent to answer Raven's progressive matrices? This might be a tough question at first, because the agents we're designing only operate in one domain, taking the Raven's test. They don't look at other areas. So where are we going to get the knowledge necessary to do cross-domain analogical transfer? In this instance, instead of the agent doing the analogical reasoning, maybe it's you doing the analogical reasoning. Can you take inspiration from other activities to inspire how you design your agent? Or can you take knowledge from other activities and put it into your agent, so that it can do the analogical reasoning?

27 – Wrap Up

So today, we've been talking about analogical reasoning. We started by talking about similarity. As we saw in our opening exercise, similarity is something that we evaluate very easily without even really thinking about it. How can we design agents that can do the same kind of similarity evaluation? We then talked about analogical retrieval, which can be difficult because we're trying to retrieve examples from other domains. How can we structure our knowledge to facilitate this kind of retrieval? How can a system know that, given a model of the atom, it should retrieve a model of the solar system? Then we talked about mapping, which is figuring out which parts of different systems correspond. For example, how can we figure out that the troops in the fortress example correspond to the lasers in the tumor example? We then talked about transfer, which is moving knowledge from the concept we know to the concept we don't. For example, we used what we knew about the solar system to fill in our knowledge of the atom. Next, we talked about evaluation and storage. How do we evaluate our analogies? In the tumor example, we might actually try that medical procedure, but for other analogies, how do we evaluate them? And then how do we store them for future use? Last, we talked about a special kind of analogy, design by analogy, where we use something that we know a lot about to inform our design of something new. We'll talk a lot more about this, especially design by analogy, when we come to the design unit later in our course.

28 – The Cognitive Connection

Analogy is often considered to be a core process of cognition. A common example of analogy we encounter every day is that of metaphors. For example, you can imagine someone saying, I had to break up with her; we had grown very far apart. Far apart here is a spatial metaphor. One of the most famous examples of metaphor comes from Shakespeare: all the world's a stage, and all the men and women merely players. The theater here is a metaphor for the world. A third connection is the Raven's test of intelligence. The Raven's test is considered to be one of the most common and reliable tests of intelligence, and as you well know by now, it is based entirely on analogies. Analogy is that central to cognition.

29 – Final Quiz

Please summarize what you learned in this lesson.

30 – Final Quiz

Great, thank you for your answer.

19 – Version Spaces

03 – Abstract Version Spaces

So in the version spaces technique of learning concepts incrementally, we always have a specific model and a general model. As a new example comes along, we ask ourselves, is this a positive example of the concept being learned, or a negative example? If it's a positive example, then we generalize the specific model. If it's a negative example, we specialize the general model. Here is another set of visualizations to understand the specific and general models. This is a specific model; this is a general model. The most specific model matches exactly one thing: the four-legged, furry, black animal called Buddy. The most general model matches everything, all animals. Here is the current specific model, and as more positive examples come, we're going to generalize this specific model. Here are some of the generalizations. Similarly, here is the current general model, and as negative examples come, we're going to specialize the general model; here are some of the specializations. As we'll illustrate in a minute, we'll generalize from the most specific model and specialize from the most general model. Some of the generalizations and specializations that we create will no longer match the current data. When that happens, we'll prune them out. As we prune on this side as well as on this side, the two pathways may merge, depending on the ordering of the examples. When they do merge, we have a solution: the right generalization of the concept for the given examples. So far we have been talking about this in very abstract terms. Let's make it concrete with an illustration.

04 – Visualizing Version Spaces

So Ashok, I tend to think of the difference between the incremental concept learning we've talked about in the past and version spaces in terms of a bit of a visualization. We can imagine a spectrum that runs from a very specific model of a concept to a very general model of a concept, and we can imagine that this circle represents where our model currently is. If we receive a new positive example that's not currently subsumed by our concept, we then generalize it a bit and move it to the right. If we receive a negative example that is currently included in our concept, we're going to move it to the left and specialize. As more and more examples come in, we start to see that our concept keeps moving around. >> Notice that David used the word model to refer to a concept; in fact, he used concepts and models almost interchangeably. This is actually quite common for certain kinds of concepts. We discussed prototypical concepts earlier, when we were discussing classification. Prototypical concepts are like models. What is a model? A model is a representation of the world such that there is a one-to-one correspondence between the representation and what is being represented in the world. As an example, in the world of those blocks that made arches, I can actually build an arch in the world, and then I can build a representation of that particular arch. That is a model of the world, so the concept of an arch and the model of an arch in this particular case can be used interchangeably.

05 – Example Food Allergies I

So let us suppose that I go to a number of restaurants and have various kinds of meals, and sometimes get an allergic reaction. I do not understand why I'm getting this allergic reaction, or under what conditions I get it. So I go to an AI agent and say, dear AI agent, tell me under what conditions I get allergic reactions, and I give all the data shown in this table to the AI agent. Note that there are only five examples here; as we mentioned, in knowledge-based AI we want to do learning based on a small number of examples, because that's how humans learn. Note also that the first example is positive, and that there are both positive and negative examples. That is important so we can construct both specializations and generalizations. How then may an AI agent decide the conditions under which I get allergic reactions? These examples come one at a time, so let us see what happens when the first example comes. Here is the first example. The restaurant was Sam's, the meal was breakfast, the day was Friday, and the cost was cheap. From this one example, the AI agent can construct a very specific model, which is exactly this example: Sam's, breakfast, Friday, cheap. You can't have anything more specific than this. The AI agent can also construct a most general model, which of course says that it can be any restaurant, any meal, any day, and so on. You can't construct a more general model than this. So the most specialized model based on this one example says that I'm allergic when I go to Sam's and have breakfast on a Friday and the cost is cheap. And the most general model says I'm allergic to everything: no matter where I go, what meal I have, on what day, and at what cost, I get an allergic reaction.

06 – Example Food Allergies II

Let us consider the processing as a second example comes along. The red outline for this example means it is a negative example. So now the agent will try to find a way of specializing the most general model, while staying consistent with the most specific model, in order to account for this negative example. Given this negative example, we want to specialize the most general model so that the negative example is excluded, and yet each of the specializations remains a generalization of the most specific model, because that model came from a positive example that we do want to include. Let's first specialize in a way such that each specialization is a generalization of that model. There are four ways of doing it, because there are four slots here. The first slot deals with the name of the restaurant, like Sam's or Kim's. One specialization of the most general concept is to put the name of an actual restaurant there. This is a generalization of the specific concept, because the specific concept referred to one specific meal at Sam's, while this refers to any meal at Sam's. In a similar way, I can specialize the filler of the second slot: instead of any meal, I can make it a breakfast meal. This is a specialization of the most general concept that is also a generalization of the specific concept, because the specific concept refers to breakfast at Sam's on a Friday, while this refers to breakfast at any place. Similarly for the third slot and the fourth slot of the most general concept. Now I must look at these specializations of the most general concept and ask which of them I should prune so as to exclude the negative example. I notice that Sam's doesn't match Kim's, so the negative example is already excluded as far as this concept is concerned. Breakfast doesn't match lunch, so the example is already excluded as far as that concept is concerned. How about this concept characterization? It admits the negative example, and therefore I must prune it. So we prune away that particular concept characterization, and we are left with three specializations of the most general model.

07 – Example Food Allergies III

Let us consider what happens when a third example comes along. The green outline of this example shows that it is a positive example of the concept. Because it is a positive example, we must try to generalize the most specific model. So we form a generalization of the specific concept that includes this positive example, as shown here. Here the meal was breakfast, there the meal was lunch, so we generalize over any meal. Here the day was Friday, there it was Saturday, so we generalize over any day. Of course, we could also have generalized to just Friday-or-Saturday, but for simplicity we'll generalize over any day, and similarly we generalize breakfast-or-lunch to any meal. At this stage there is another element to the processing. We must examine all the specializations of the most general concept and see whether any of them needs to be pruned out. The pruning may need to be done in order to make sure that each specialization here is consistent with the positive examples that are coming in. So in this case, if we look at the first specialization, which says I'm allergic to breakfast at any place on any day: this cannot be a generalization of the current specific concept. Put another way, there is no way that breakfast here can cover this positive example, which deals with lunch. Put yet another way, the only way I could move from breakfast to any would be to generalize, but in this direction I can only specialize. Therefore, this must be pruned out. As we prune this first concept out, we're left with only two.

08 – Example Food Allergies IV

Now let us consider the processing when the fourth example comes along. Again, the red outline shows that this is a negative example of the concept. Because this is a negative example, we must specialize the most general concept characterizations available at the moment. We can begin by checking whether we need to specialize this particular general concept. But wait: this general concept characterization already excludes the negative example. It says the allergy happens when I go to Sam's, and this example has Bob's in it, so the example is already excluded; I don't have to specialize any further. Now let's look at the other general model. Does it need to be specialized in order to exclude the negative example? Yes, because at the current stage it includes this particular example: it requires cheap and the example is cheap, and its other slots are any. This means that this concept characterization must be specialized in a way that excludes the negative example, and yet the new specialization must remain consistent with the most specialized characterization at present. It is tempting to see the two pathways as converging here, because this node is identical to that one, but we also have this branch hanging, and this branch says that I'm allergic to any meal at Sam's, not just a cheap meal. So we're not done yet. At this stage there is one other element to consider. If there is a node that lies on a pathway starting from the most general concept characterization, and it is subsumed by a node that comes from another pathway starting from the same general concept characterization, then we want to prune that node. The reason I want to prune this node is that it is subsumed by this other node: if the other node is true, I don't have to carry this one around. If I'm allergic to any meal at Sam's, I don't have to specify separately that I'm allergic to cheap meals at Sam's, so I can prune this particular pathway, and I'm left with only the other pathway. At this point in the processing, these are the examples that have been encountered so far, and there are only two possibilities: I'm either allergic to everything at Sam's, or I'm allergic to every cheap meal at Sam's.

09 – Example Food Allergies V

I know you are wondering when this is going to end. We're almost done. Let's consider what happens when the fifth example comes. This is a negative example, as indicated by the red outline. Because it is a negative example, we must specialize the most general characterization in such a way that this negative example is ruled out and the specialization is consistent with the pathway starting from the most specialized concept characterization. The only specialization of the general concept that both excludes this negative example and is consistent with that node is Sam's and cheap. It excludes the negative example because it requires cheap, while this example is expensive. Now the agent notices that these two particular concept characterizations are the same, and a convergence has occurred. Now we have the answer we wanted: I get allergies whenever I go to Sam's and have a cheap meal.

10 – Version Spaces Algorithm

What we have just done here is a very powerful idea in learning. Convergence is important, because without convergence a learning agent could zigzag forever in a large learning space. We want to ensure that the learning agent converges to some concept characterization, and that it remains stable. This method guarantees convergence as long as there is a sufficiently large number of examples. We needed five examples in this particular illustration for the convergence to occur, and this convergence would have occurred irrespective of the order of the examples, as long as all five examples were there. Note that we did not use background knowledge, as we did in incremental concept learning. Note also that we did not assume that the teacher was presenting the examples in the right order. This is the benefit of version space learning. There is another feature to note. In incremental concept learning, we wanted each example to differ from the current concept characterization in exactly one feature, so that the learning agent could focus its attention. In version spaces, however, each successive example may differ from the previous one in many features; just look at the first two examples. They differ in many features: in the name of the restaurant, in the meal, in the cost. Here is the algorithm for the version space technique. We'll go through it very quickly, because we've already illustrated it in detail. If the new example is positive, generalize all the specific models to include it, and prune away the general models that cannot include the positive example. If the example is negative, specialize all the general models to exclude it, and prune away the specific models that cannot exclude the negative example. Finally, prune away any models subsumed by other models. Note that in the specific implementation of the version space technique that we just illustrated, there is a single pathway coming from the most specialized concept model, and therefore there was no need to prune away specific models. In general, there could be multiple generalizations coming from the most specialized model, and that pruning might be needed.
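Here is a minimal Python sketch of this algorithm, following the single-specific-model simplification used in the lesson. The visit data is hypothetical, in the spirit of the lesson's restaurant table rather than a copy of it, and the feature order (restaurant, meal, day, cost) is an assumption.

```python
# A minimal sketch of the version spaces algorithm, with a single specific model
# as in the lesson's walkthrough. Examples are (feature tuple, positive?) pairs;
# "?" plays the role of "any value". The visits below are hypothetical data.

ANY = "?"

def matches(model, example):
    return all(m == ANY or m == e for m, e in zip(model, example))

def generalize(specific, example):
    # keep values that agree; relax mismatches to "any"
    return tuple(s if s == e else ANY for s, e in zip(specific, example))

def specializations(general, specific, negative):
    # fill one "any" slot at a time with the specific model's value,
    # keeping only fillers that exclude the negative example
    for i, g in enumerate(general):
        if g == ANY and specific[i] != ANY and specific[i] != negative[i]:
            yield general[:i] + (specific[i],) + general[i + 1:]

def learn(examples):
    specific = next(ex for ex, positive in examples if positive)
    general = [tuple(ANY for _ in specific)]
    for example, positive in examples:
        if positive:
            specific = generalize(specific, example)
            general = [g for g in general if matches(g, example)]   # prune
        else:
            new_general = []
            for g in general:
                if matches(g, example):
                    new_general.extend(specializations(g, specific, example))
                else:
                    new_general.append(g)
            # prune general models subsumed by another general model
            general = [g for g in new_general
                       if not any(h != g and matches(h, g) for h in new_general)]
    return specific, general

visits = [  # (restaurant, meal, day, cost), allergic reaction?
    (("Sam's", "breakfast", "Friday", "cheap"), True),
    (("Kim's", "lunch", "Saturday", "cheap"), False),
    (("Sam's", "lunch", "Saturday", "cheap"), True),
    (("Bob's", "breakfast", "Sunday", "cheap"), False),
    (("Sam's", "breakfast", "Friday", "expensive"), False),
]

print(learn(visits))  # both boundaries converge on ("Sam's", "?", "?", "cheap")
```

Run on this toy table, both the specific and the general boundary converge on cheap meals at Sam's, mirroring the convergence in the walkthrough above.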

12 – Exercise Version Spaces I

>> So, this example is pretty similar to the case we had in the previous example. The most specific model is that I'm simply allergic to any breakfast that comes on a Friday, that's cheap, and isn't vegan, so a very specific model. And the most general model is that I'm just allergic to everything, no matter what meal it is, what day it is, how much it costs, whether it's vegan, or what restaurant I got it at.

13 – Exercise Version Spaces II

Now suppose a second example comes along, and this example is also positive as indicated by the green outline. Based on the second example, would you specialize or would you generalize?

14 – Exercise Version Spaces II

>> That’s right David.

15 – Exercise Version Spaces III

So write down the generalization of this most specific model that is consistent with this positive example.

17 – Exercise Version Spaces IV

Let’s go a little bit further, suppose a third example comes along, and this is the negative example indicated by the red outline here. What would you do this time? Generalize or specialize?

18 – Exercise Version Spaces IV

>> So, this time we’re going to specialize our most general model. It’s obvious that I’m not allergic to absolutely everything everywhere, because here’s a particular instance where I wasn’t allergic to what I ate. So we’re going to specialize our most general model.

19 – Exercise Version Spaces V

So like David said, given this negative example, we’ll specialize this most general model. And we’ll prune out those specializations that no longer match the data. Given this, how many specializations are left after the pruning?

20 – Exercise Version Spaces V

>> So I said that there’ll be three potential general models left after specializing and pruning. Those three models are going to be that I could just be allergic to everything at Kim’s, I could just always be allergic to breakfast, or I could just be allergic to eating on Friday. I would prune the ones based on cost and whether or not the meal is vegan, because although I’ve had bad reactions to cheap, non-vegan meals in the past, here I didn’t have a reaction to a cheap, non-vegan meal. So it’s not sufficient to say I’m allergic to everything non-vegan or I’m allergic to all cheap food.

22 – Exercise Version Spaces VI

>> Note that in this exercise there were only seven examples and only five features, so we could do it by hand. What would happen if the number of examples were much larger and the number of features were much larger? This algorithm would still work, but we would need a lot more computing power. It is also possible that the algorithm may not be able to find the right concept to converge to, because I might be allergic to multiple meals at multiple restaurants, such as breakfast at Kim's and lunch at Sam's. But even in that case, the benefit of this algorithm is that it will show that convergence is not possible, even after many, many examples.

23 – Identification Trees

This is another method we can use to process the kind of data that we just saw. It is sometimes called decision tree learning. Recall that when we were discussing case-based reasoning, we talked about discrimination tree learning. There, we learned the discrimination tree incrementally. A case would come along one at a time, and we would ask the question, what feature discriminates between the existing cases and the new case, and we would pick a feature. Discrimination tree learning provides no guarantee of the optimality of the tree. That is to say, at retrieval time, when a new problem comes along, traversing the tree might take a long time, because it is not the most optimal tree for storing these cases. We'll now discuss an alternative method, decision tree learning, which will give us more optimal trees, but at a cost: all the examples will need to be given right at the beginning. Let us return to our restaurant example. We want to learn a decision tree that classifies these five examples, so that when a new problem comes along, we can quickly find the example closest to the new problem. To do this, we need to pick one of the four features, restaurant, meal, day, or cost, that will separate these allergic reactions, so that a category contains either only false instances or only true instances. As an example, suppose we pick restaurant as the decisive feature. There are three kinds of restaurants: Kim's, Bob's, and Sam's. Whenever it's Kim's restaurant or Bob's restaurant, there is no allergic reaction. Whenever it's Sam's restaurant, there can be an allergic reaction, shown in green here, or no allergic reaction, shown in red. So the good thing about this particular feature, restaurant, is that it separates all five examples into two classes: the class Sam's and the class not-Sam's. The not-Sam's class consists of only negative instances, which is good, because we have now been able to classify the five examples into two sets, one of which contains only negative examples. Now, for the remaining three examples, we must pick another feature that will separate them into positive and negative instances. In this case, we might consider cost to be that feature. When the cost is cheap, we get positive examples; when the cost is expensive, we get negative examples. This is a classification tree, and in fact it is a very efficient classification tree. When a new problem comes along, for example visit 6: Sam's, lunch, Friday, cost expensive, and we want to decide what the allergic reaction might be, we simply have to traverse this tree to find the closest neighbor of the new example. This is called a decision tree, and the technique we just discussed is called decision tree learning. This method of inductive decision tree learning works much more efficiently, and apparently more easily, than the earlier method we discussed. But the trade-off is that we needed to know all five examples right at the beginning. Of course, this technique only appears to be efficient and easy because we had only five examples, and only four features describing them. If the number of examples were very large, or the number of features describing the examples were very large, then it would be much harder to decide exactly which feature to discriminate on.
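Here is a minimal sketch of this greedy tree-building process in Python. The data is the same hypothetical table used in the version spaces sketch earlier, here written as feature dictionaries, and the splitting heuristic (prefer the feature that puts the most examples into pure branches) is an illustrative stand-in for a full impurity measure.

```python
# A minimal sketch of greedy decision-tree learning over the restaurant visits.
# The visits are hypothetical data; the splitting heuristic (prefer the feature
# whose values put the most examples into "pure" branches) is an illustrative
# stand-in for a full impurity measure.

FEATURES = ["restaurant", "meal", "day", "cost"]

visits = [
    ({"restaurant": "Sam's", "meal": "breakfast", "day": "Friday", "cost": "cheap"}, True),
    ({"restaurant": "Kim's", "meal": "lunch", "day": "Saturday", "cost": "cheap"}, False),
    ({"restaurant": "Sam's", "meal": "lunch", "day": "Saturday", "cost": "cheap"}, True),
    ({"restaurant": "Bob's", "meal": "breakfast", "day": "Sunday", "cost": "cheap"}, False),
    ({"restaurant": "Sam's", "meal": "breakfast", "day": "Friday", "cost": "expensive"}, False),
]

def purity_score(feature, examples):
    """Count the examples that land in branches containing only one label."""
    branches = {}
    for attrs, label in examples:
        branches.setdefault(attrs[feature], []).append(label)
    return sum(len(labels) for labels in branches.values() if len(set(labels)) == 1)

def build_tree(examples, features):
    labels = [label for _, label in examples]
    if len(set(labels)) == 1 or not features:
        return max(set(labels), key=labels.count)          # leaf: (majority) label
    best = max(features, key=lambda f: purity_score(f, examples))
    branches = {}
    for value in {attrs[best] for attrs, _ in examples}:
        subset = [(a, l) for a, l in examples if a[best] == value]
        branches[value] = build_tree(subset, [f for f in features if f != best])
    return {"feature": best, "branches": branches}

print(build_tree(visits, FEATURES))
# -> splits on restaurant first, then on cost within the Sam's branch
```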

24 – Optimal Identification Trees

Let us look at another example of decision tree learning. Here is a data set of people who go to the beach; some of them get sunburned and others don't. In this data set there are nine examples, and each example is characterized by four features: hair, height, age, and lotion. Once again, how can we construct an optimal decision tree that classifies all of these examples? One possible idea is to discriminate first on hair color. Hair color classifies all of the examples into three categories: brown, red, and blonde. The interesting thing about the choice of hair color is that in the case of brown hair, all of the sunburn cases are negative: people with brown hair apparently don't get sunburned. In the case of all the red-haired people, there is sunburn. So hair color is a good feature to discriminate on, because it classifies things in such a way that some of the categories have only negative instances and no positive instances, and some have only positive instances and no negative instances. Of course, that still leaves the blonde-haired people. In this case, there are both positive and negative instances, and therefore we'll need another feature to discriminate between them. Here, lotion might be the second feature that we pick. Lotion classifies the remaining examples into two categories: some people used lotion, others did not. Those who used lotion did not get sunburned; those who did not use lotion did get sunburned. Once again, one branch consists of only negative instances and the other of only positive instances. Thus, in this decision tree, simply by using two features, we were able to classify all nine examples. Here is a different decision tree for the same data set. Because we used a different order of features, we now have to do more work; this decision tree is less optimal than the previous one. We could also have chosen a different set of features in a different order, perhaps discriminating first on height, then on hair color and age. In that case, we get a much bushier tree; clearly, this tree is less optimal than the first one. Note the trade-off between decision tree learning and the discrimination tree learning that we covered in case-based reasoning. Decision tree learning leads to more optimal classification trees, but there is a requirement: you need all the examples right up front. Discrimination tree learning may lead to suboptimal trees, but you can learn incrementally.
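The lesson does not say how to pick the discriminating feature once the data grows large. A standard criterion, not covered in this lesson, is information gain: pick the feature whose split most reduces the entropy of the labels. A minimal sketch:

```python
# A minimal sketch of information gain, the usual ID3-style criterion for
# choosing which feature to split on. This criterion is a standard alternative
# to eyeballing the split; it is not part of the lesson itself.
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((labels.count(v) / total) * log2(labels.count(v) / total)
                for v in set(labels))

def information_gain(examples, feature):
    """examples: list of (attribute_dict, label) pairs."""
    labels = [label for _, label in examples]
    remainder = 0.0
    for value in {attrs[feature] for attrs, _ in examples}:
        subset = [label for attrs, label in examples if attrs[feature] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

# With the hypothetical visits table from the previous sketch,
# information_gain(visits, "restaurant") is the largest of the four features,
# so restaurant would again be chosen as the first split.
```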

25 – Assignment Version Spaces

So how would version spaces be useful for answering Raven's progressive matrices? As with incremental concept learning, think first about what concept you're trying to learn. Are you learning transformations? Are you learning types of problems? What are the increments? Are they individual problems? Are they individual figures? Are they individual transformations in a problem? Second, what are you converging onto? For example, you could use version spaces within one problem and converge onto the correct answer, or you could use them for learning how to solve problems in general and converge onto an adaptable algorithm, or you could use them for learning an ontology of problems and converge onto a single type of problem you expect to see in the future. So what are you converging onto if you use version spaces for Raven's progressive matrices?

26 – Wrap Up

So today we've talked about version spaces. Version spaces are an algorithm for converging onto an understanding of a concept, even in the absence of prior background knowledge or an intelligent teacher. We covered the algorithm for version spaces, where we iteratively refine a general and a specific model of a concept until they converge onto one another. We then talked about using version spaces to address more complex problems. We've also connected version spaces to earlier ideas, like incremental concept learning. Finally, we talked about the limitations of version spaces, such as what to do if there is no single correct concept, or what to do in the absence of either positive or negative examples. To address these, we also covered identification trees, which are a different way of approaching the same kind of data that version spaces operate on. We'll touch on version spaces and incremental concept learning again when we talk about learning by correcting mistakes.

27 – The Cognitive Connection

Cognitive agents too face the issue of how far to generalize. We can undergeneralize, in which case what we learn is not very useful. We can overgeneralize, in which case what we learn may not be correct. For example, imagine that I am a Martian who has come to Earth. I see the first human being, and I may undergeneralize and say this specific person has two arms. That is not very useful, because it is not applicable to any other human being. Or I may overgeneralize and say everyone on this Earth has two arms. That may not be correct. Version spaces is a technique that allows convergence to the right level of abstraction. This is also connected to the notion of cognitive flexibility. Cognitive flexibility occurs when the agent has multiple characterizations, or multiple perspectives, on the same thing. As we saw in version spaces, the agent has several possible definitions for a concept that converge over time. An alternative view is to come up with one generalization, try it out in the world, and see how well it works. If it leads to a mistake or a failure, then one can learn by correcting that mistake. We'll return to this topic a little later in the class.

29 – Final Quiz

Great. Thank you so much for your feedback.

20 – Constraint Propagation

01 – Preview

Today, we'll talk about constraint propagation, another very general-purpose method. Constraint propagation is a method of inference in which the agent assigns values to variables to satisfy certain conditions called constraints. It is a very common method in knowledge-based AI, and it arises in a number of different topics, such as planning, understanding, natural language processing, and visuospatial reasoning. Today, we'll start by defining constraint propagation. Then we'll see how it helps an agent make sense of the world around it. Our examples will come mostly from understanding natural language sentences, as well as from interpreting visual scenes. Finally, we'll talk about some advanced issues in constraint propagation.

03 – Exercise Recognizing 3D Figures

>> And this brings us to the point of this exercise. The point of the exercise is that clearly some kind of processing is occurring in our visual system that allows us to group these lines and surfaces in ways that let us identify which of them is a 3D object and which is not. Clearly this processing is not completely deterministic, in that sometimes there is ambiguity. You might come up with one answer, and someone else might come up with a slightly different answer, because the processing leaves room for ambiguity.

04 – Exercise Gibberish Sentences

To look more deeply into the processing that might be occurring in our visual system, the processing that allows us to identify which objects are 3D objects and which are just 2D, let us consider a different example. Shown here are six sentences. None of the sentences makes much sense semantically. Nevertheless, some of the sentences are grammatically correct, and you and I can quickly detect which ones they are. Can you identify which of the sentences are grammatically correct?

05 – Exercise Gibberish Sentences

>> Note the vocabulary that David used in trying to figure out which of these sentences were grammatically correct. It seemed to me that he was examining whether the structure of these sentences fulfilled certain conditions, certain constraints, that he expects from his knowledge of English grammar. One could even say that he was doing constraint processing.

06 – Constraint Propagation Defined

>> So we've actually come across this idea of constraints in English language grammar before. During our lesson on understanding, we talked about how a preposition, for example, can constrain the meaning of the word that follows it. If we see the word from, for example, we expect what comes after it to be some kind of source for the sentence. There, we used grammatical constraints in service of some kind of semantic analysis. Here, we're just using grammatical constraints to figure out whether a sentence is grammatically correct or not. There's another connection here to understanding as well. We talked earlier about how we can interpret this shape as either popping out towards us or going down into the screen. We also talked about two simultaneously accurate interpretations of the same thing in understanding, with regard to sentences that can be read as puns. So, for example, when I said, it's hard to explain puns to kleptomaniacs because they always take things literally, the word take can simultaneously be interpreted as interpret and as physically remove, while satisfying all the constraints of the sentence.

07 – From Pixels to 3D

Let us look at the details of constraint propagation. To do so, we'll take a specific example from computer vision. Here's an example of a 2D image composed of a large number of pixels. The grayness at any one pixel is a depiction of the intensity of light at that pixel. Now of course, you and I can immediately recognize that this is a cube. But how do we do it, and how can we make a machine do it? Marr decomposed the task of 3D object recognition into several sub-tasks. Marr said that in the first sub-task, the visual system detects edges, or lines, as shown here. At this point, no surfaces have been detected and no 3D object has been recognized; the pixels have simply been grouped into lines based on the intensities of light at the different pixels. According to Marr, the second sub-task of object recognition consists of grouping these lines into surfaces with orientations, as indicated here. So now these four lines have been grouped into a surface, with an orientation defined by the perpendicular to the surface, and similarly these four lines, and these four lines. In the third and final phase of the object recognition task, according to Marr, surfaces are grouped into a complete 3D object. At this point, your visual system recognizes that this is a cube. Marr's theory has been very influential in computer vision, and it has also been influential in AI as a whole. One of the lessons we can take away from Marr's theory of object recognition is that before we get into algorithms for addressing a task, we want to understand how the task gets decomposed into sub-tasks. Throughout this course, we have emphasized task decomposition repeatedly. As an example, when we were talking about understanding, the big task of understanding was decomposed into a series of smaller tasks: surface-level cues acted as probes into memory and a frame was retrieved; the slots of the frame generated expectations; lexical and grammatical analysis led to the identification of objects and predicates that would satisfy those expectations; and the fillers were put in. Problem reduction, more generally, is a general-purpose method for decomposing complex tasks into smaller tasks. This notion of task decomposition is a powerful idea, irrespective of what algorithm we use for any of the specific sub-tasks.

09 – Constraints Intersections and Edges

So let's take up the notion of constraints. Consider this cube again. You'll notice this cube has junctions, and these junctions have different kinds of shapes. For example, this looks like a Y junction; this looks like an L junction; this also looks like an L junction, it's just that this arm of the L is going in the other direction; and this also looks like an L junction. This junction, on the other hand, looks a little bit like a W junction. So there are junctions of various kinds. Here are the kinds of junctions that can occur in the world of trihedral objects like cubes: the Y junction, the W junction, the T junction, and the L junction. We can say a little more about each of these junctions. Let us look at the Y junction first. If we examine the various kinds of Y junctions that get formed in the world of trihedral objects, we find that whenever a Y junction is formed, each of these lines represents a fold, where a fold is a line at which two surfaces meet. The important thing about this is that if we can infer that this is a Y junction and that this line represents a fold, then it follows that this line must also represent a fold, and this line must also represent a fold. Actually, I should tell you quickly that in the world of trihedral objects, Y junctions can satisfy multiple kinds of constraints, but right now let's just look at this one single constraint. In the case of an L junction, which has a shape like this, in the world of trihedral objects an L junction is characterized by this being a blade and this being a blade, where a blade is a line at which we cannot infer that two surfaces meet. Again, the L constraint can actually have many more formulations, but right now we're keeping it simple and looking at one single constraint for the L junction. Similarly, in the world of trihedral objects, one of the ways in which a W junction gets characterized is as blade, fold, blade. In effect, we're defining a spatial grammar here for the world of trihedral objects. The equivalent of this in the grammar of natural language sentences might be that a sentence can have a noun phrase, followed by a verb phrase, followed by a prepositional phrase, and so on. Given this set of very simple constraints for the world of trihedral objects, let us see how these constraints can be propagated to group edges into surfaces.

10 – Exercise Assembling the Cube I

Let us do an exercise together. Here is a cube with its seven junctions. For each of these junctions, identify the kind of junction that it is.

11 – Exercise Assembling the Cube I

>> This sounds good. Now let us look at how we will apply these constraints to identify the surfaces.

12 – Exercise Assembling the Cube II

Let us do another exercise together. We have identified the type of each of these junctions. Let us now use the constraints for each type of junction to identify the type of each of these edges. In each of these boxes, write either fold or blade for the type of the edge.

13 – Exercise Assembling the Cube II

>> That's good, David. Note that David started at the top left corner; this was a random selection. He could have started at any other corner, for example this one or that one, and found the same answer, and that is because we have simplified these constraints. Now that we know that this line is a fold, by the definition of a fold we know that two surfaces must be meeting at this line. It follows, then, that this must be a surface and this must be a surface. Similarly, because we know this is a fold, and by the definition of a fold two surfaces must be meeting here, it follows that this is a surface, this is a surface, and so on. And now we have identified that this is one surface, this is another surface, and this is a third surface. In this way, the visual system uses its knowledge of the different kinds of junctions in the world of trihedral objects, and the constraints at each of these junctions, to figure out which of these lines bound surfaces. Instead of thinking of this as one single surface, the visual system identifies it as being composed of three different surfaces. And now we can recognize that this might be a cube.
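Here is a minimal Python sketch of this propagation over the cube's seven junctions, using the simplified single-labeling ontology from above. The junction and edge names, and the order in which each junction lists its edges, are illustrative assumptions.

```python
# A minimal sketch of propagating the simplified junction constraints over the
# seven visible junctions of a cube. Junction and edge names are illustrative;
# in this simplified ontology each junction type permits exactly one labeling.

JUNCTION_LABELS = {
    "Y": ("fold", "fold", "fold"),
    "W": ("blade", "fold", "blade"),
    "L": ("blade", "blade"),
}

# junction -> (type, incident edges, listed in the order the labels apply)
cube = {
    "center":    ("Y", ["e1", "e2", "e3"]),   # e1-e3 are the interior edges
    "top":       ("W", ["e4", "e1", "e5"]),
    "left":      ("W", ["e6", "e2", "e7"]),
    "right":     ("W", ["e8", "e3", "e9"]),
    "top_left":  ("L", ["e4", "e6"]),
    "top_right": ("L", ["e5", "e8"]),
    "bottom":    ("L", ["e7", "e9"]),
}

def propagate(junctions):
    edge_labels = {}
    for jtype, edges in junctions.values():
        for edge, label in zip(edges, JUNCTION_LABELS[jtype]):
            if edge_labels.get(edge, label) != label:
                raise ValueError("conflicting labels at edge " + edge)
            edge_labels[edge] = label
    return edge_labels

print(propagate(cube))
# -> the three interior edges (e1, e2, e3) come out as folds and the six
#    silhouette edges as blades, so three surfaces meet at the Y junction.
```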

14 – More Complex Images

>> Now of course, some of us do see this as a 3D shape. You can think of it as a sheet of paper folded here: one plane of the paper and another plane of the paper, looking a little like an open book. This particular line can then be read as just the line where these two planes meet, not signifying a surface by itself. If you view this only as a line, not signifying a surface, then that addresses David's first problem. But how do we address David's second problem, of this edge being a fold or a blade depending on where we started the constraint propagation? The answer lies in the fact that we have so far used a very simple ontology, just to demonstrate the constraint propagation process. In reality, the ontology of constraints is more complicated. Thus the Y constraint may be not just fold, fold, fold; it might also be blade, blade, blade. And the L constraint is not always blade and blade; it could also be blade and fold, or fold and blade. Now we can see David's second problem disappearing, because the Y junction may have a blade and the L junction may also suggest a blade, and there is then no conflict. Let me note that what we have shown here is still not a full ontology of the Y, W, L, and T constraints; T constraints in particular may have additional complexity. The advantage of having a more complete ontology is that we can use it to interpret more complex scenes like this one, where there are two rectangular objects, one partially occluded by the other. Of course, the more complicated ontology is not without its own problems. It now introduces ambiguities of a different kind. This particular junction: is it now blade, blade, blade, or is it fold, fold, fold? Both are permissible in the more complete ontology. In order to resolve some of these ambiguities, we can adopt additional conventions. One convention is that all of the edges next to the background are considered to be blades. So we make these edges blade, blade, blade, blade. Once we make all of these blades, then it's easy to propagate the constraints. Notice this W junction could have been fold, blade, fold, or blade, fold, blade. But if we adopt the convention of labeling all of these boundary lines as blades, then this W junction can only be blade, fold, blade. And if this is a fold, this Y junction can only be fold, fold, fold, and so on. That helps us resolve the ambiguity about what this junction could be. This task of image interpretation is an instance of the abduction task. In abduction, we try to come up with the best explanation for the data: this is the data, and we're trying to interpret it in terms of an explanation. We'll discuss abduction in more detail when we come to diagnosis. For now, notice that we start with what we know, blade, blade, blade, and we propagate the constraints so that we can disambiguate the other junctions.
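Continuing the sketch above, here is a toy version of resolving this ambiguity by searching over an expanded, still partial, set of junction labelings, together with the convention that silhouette edges are blades. The ALLOWED table below contains only the labelings mentioned in this section, and the cube graph is the illustrative one defined in the previous sketch.

```python
# A toy sketch of resolving ambiguity with an expanded (still partial) set of
# junction labelings, plus the convention that silhouette edges are blades.
# Reuses the illustrative `cube` graph from the previous sketch.
from itertools import product

ALLOWED = {
    "Y": [("fold", "fold", "fold"), ("blade", "blade", "blade")],
    "W": [("blade", "fold", "blade"), ("fold", "blade", "fold")],
    "L": [("blade", "blade"), ("blade", "fold"), ("fold", "blade")],
}

def consistent(assignment, junctions):
    return all(tuple(assignment[e] for e in edges) in ALLOWED[jtype]
               for jtype, edges in junctions.values())

fixed = {e: "blade" for e in ["e4", "e5", "e6", "e7", "e8", "e9"]}  # silhouette edges
free = ["e1", "e2", "e3"]                                           # interior edges

solutions = []
for combo in product(["fold", "blade"], repeat=len(free)):
    assignment = dict(fixed, **dict(zip(free, combo)))
    if consistent(assignment, cube):
        solutions.append(assignment)

print(solutions)  # exactly one labeling survives: all three interior edges are folds
```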

16 – Assignment Constraint Propagation

So how would constraint propagation be useful for Raven’s progressive matrices? This concept has a strong correspondence to the final project where you’ll be asked to reason over the images of the problem directly instead of over the propositional representations that we’ve given you in the past. So first, if constraint propagation leverages a library of primitive constraints, what will your constraints be for the final project? How will you propagate those constraints into the image to understand it? Then once you’ve actually propagated those constraints, how will you actually use those inferences? Will you abstract out propositional representations? Or will you stick to the visual reasoning and transfer the results directly?

17 – Wrap Up

So today we've talked about constraint propagation, which is a method of inference where we assign values to variables to satisfy certain constraints. By doing so, we arrive at strong inferences about the problem, like which shapes represent objects, or how the words in a sentence interact. After defining constraint propagation, we talked about how it can be useful in interpreting images, by using prior knowledge of constraints to anticipate predictable shapes out in the world. Then we talked about natural language understanding, where prior knowledge of the rules of grammar and parts of speech allows us to make sense of new sentences. Now, constraint propagation is actually an incredibly complex process. There are numerous constraints for visual scenes and verbal sentences that we haven't discussed here. We also see constraint propagation in other areas, such as in making sense of auditory and tactile information; reading braille, for instance, can be seen as an instance of constraint propagation. We'll pick up on this discussion later when we talk about visual and spatial reasoning. But it will also come up in our discussion of configuration, which can be seen as a specific instance of constraint propagation in the context of design.

18 – The Cognitive Connection

Constraint propagation also connects to human cognition. First, constraint propagation is a very general-purpose method, like means-ends analysis. In both knowledge-based AI and human cognition, constraint propagation allows us to use our knowledge of the world in order to make sense of it. Constraints can be of any kind, symbolic as well as numeric. We discussed symbolic constraints in today's lesson. A good example of numeric constraints comes from Excel spreadsheets, with which most of you are familiar. If the columns in a particular spreadsheet are connected by some formula, and you make a change in one column, then the change is propagated to all the other columns of the spreadsheet. That is an example of numerical constraint propagation. We have seen constraint propagation under other topics as well, for example planning, understanding, and scripts. The next topic, configuration, will build on this notion of constraint propagation.
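A tiny sketch of that spreadsheet-style numerical propagation; the cell names and formulas are hypothetical.

```python
# A tiny sketch of numeric constraint propagation in the spreadsheet spirit:
# change one cell, and every cell whose formula depends on it is recomputed.
# The cell names and formulas are hypothetical.

cells = {"price": 10.0, "quantity": 3.0}
formulas = {
    "subtotal": lambda c: c["price"] * c["quantity"],
    "tax":      lambda c: 0.08 * c["subtotal"],
    "total":    lambda c: c["subtotal"] + c["tax"],
}

def propagate(cells):
    for name, formula in formulas.items():   # evaluated in dependency order
        cells[name] = formula(cells)
    return cells

propagate(cells)
cells["quantity"] = 5.0     # change one column...
print(propagate(cells))     # ...and subtotal, tax, and total all update
```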

19 – Final Quiz

All right. Please write down what you understood from this lesson, in the box right here.

20 – Final Quiz

And thank you for doing it.

21 – Configuration

01 – Preview

Today, we'll talk about configuration. Configuration is a routine kind of design task in which all the components of the design are already known. The task is to assign values to the variables of those components so that they can be arranged according to some constraints. Configuration is our first topic under the unit on design and creativity. We'll start by talking about design, then we'll define configuration. Then we'll trace through the process of configuration with a specific method called plan refinement. Finally, we'll connect configuration to several earlier topics we have discussed, such as classification, case-based reasoning, and planning.

02 – Define Design

Let us talk about what design is. Design in general takes as input some sort of needs, goals, or functions. It gives as output the specification of the structure of some artifact that satisfies those needs, goals, and functions. Note that the artifact need not be a physical product. It can be a process, a program, a policy. Some examples of design: design a robot that can walk on water. Design a search engine that can return the most relevant answer to a query. The Federal Reserve Bank designs and monitors policy to optimize the economy. Note that design is very wide-ranging, open-ended, and ill-defined. In problem solving, typically the problem remains fixed even as the solution evolves. In design, both the problem and the solution co-evolve; the problem evolves as the solution evolves. We are studying design in AI because we want AI agents that can do design. At least potentially, we want AI agents that can design other AI agents.

03 – Exercise Designing a Basement

>> Thanks, Ashok. So right now, my wife and I are actually building a house, and as part of that, we need to configure the basement for the house. I’ve taken a list of some of the requirements for this basement and listed them over here on the left. And on the right, I have the variables that we need to assign values to. We have things like the width of the utility closet and the length of the stairwell; we also have two additional rooms, each of which must have its own length and width. So try to configure our basement such that we meet all the different requirements listed over here on the left; write a number in each of these blanks.

05 – Defining Configuration

>> Design in general is a very common information-processing activity, and configuration is the most common kind of design. Now that we have looked at the definition of the configuration task, we are going to look at methods for addressing the task. Once again, recall that the components in the case of configuration are already known. We are deciding on the arrangement of those components. We are assigning values to specific variables of those components, for example, sizes.

06 – The Configuration Process

>> So for an example that might hit a little bit closer to home for many of you: as you’ve been designing your agents that can solve the Raven’s test, you’ve done a process somewhat like this. You start with some specifications, general specifications that your agent must be able to solve as many problems on the Raven’s test as possible. You can start with an abstract solution of just the general problem-solving process, which you may have then refined to be more specific about the particular transformations to look for or the particular problem-solving methods to use. That got you to your final result. But when you ran your final result, you may have found something like: it would work, but it would take a very, very long time to run, weeks or months. So that finding causes you to revise your specifications. You not only need an agent that can solve as many problems as possible, but you also need one that can solve them in minutes or seconds instead of weeks or months.

08 – Example Ranges of Values

Now in configuration design, we not only know all the components, like legs, and seat, and arms, and so on. We not only know the variables for each of the components, like size and material and cost. But we also know the ranges of values that any of these variables can take. Thus the seat of a chair may have a certain weight, or length, or depth. Here we capture the size of the seat in a very simple manner, in terms of the mass of the seat as measured in grams, so 10 to 100 grams. You’ll see in a minute why we’re using this simple measure. The brackets for this material slot suggest that there is a range here; the range is shown on the left. The cost then will be determined by the size and the material. Let us suppose that this table captures the cost per gram for certain kinds of materials. Now you can see why we’re using grams as a measure for the size of the seat: we wanted to very easily relate the size to the cost. The material slot can take one of these three values; this is the range of values that can go into the material slot. Given a particular size and a particular material, we can calculate the cost. Note that this representation allows us to calculate the total mass of the chair and the total cost of the chair, given at least the mass and the cost of the components.
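As a quick illustration of how a range of values plus a cost-per-gram table lets us compute a component’s cost from its size and material, here is a small Python sketch. The specific prices and the mass range are assumptions made for the example, not values from the lesson.

```python
# Sketch: compute a component's cost from its mass and material.
# The cost table and the legal mass range are illustrative assumptions.

COST_PER_GRAM = {"plastic": 0.01, "wood": 0.05, "metal": 0.10}  # dollars per gram
MASS_RANGE = (10, 100)  # legal range for a component's mass, in grams

def component_cost(mass_grams, material):
    low, high = MASS_RANGE
    assert low <= mass_grams <= high, "mass outside the legal range"
    assert material in COST_PER_GRAM, "material outside the legal range"
    return mass_grams * COST_PER_GRAM[material]

# Given a particular size and material, we can calculate the cost, and the
# chair's totals follow from the totals of its components.
seat = component_cost(100, "metal")     # $10.00
legs = 4 * component_cost(20, "wood")   # $4.00
print(seat + legs)                      # 14.0, total cost of these components
```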

10 – Exercise Applying a Constraint

Let us do an exercise together. This exercise again deals with the configuration of a chair. The input specification is a chair that costs at most $16 to make and has a 100-gram metal seat. Please fill out the values of all of these boxes. Try to use the configuration process that we just described, and make a note of the process that you actually did use.

11 – Exercise Applying a Constraint

>> That’s good, David. It’s important to note that David used several different kinds of knowledge. First, he had knowledge of the generic chair. He knew about the components. He knew about the slots, but not necessarily all the values for these slots. Second, he had heuristic knowledge. He used the term heuristic; recall that heuristic stands for rule of thumb. So he had heuristic knowledge about how to go about filling in the values of some of these slots. Third, there is not just the knowledge about legs and seats and arms and so on, but also knowledge of how the chair as a whole is decomposed into these components. That is one of the fundamental roles of knowledge in knowledge-based AI: it allows us to structure the problem so that the problem can be addressed efficiently. Note that this process of configuration design is closely related to the method of constraint propagation that we discussed in our previous lesson. Here are some constraints, and these constraints have been propagated downwards in the plan abstraction hierarchy.
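Here is a toy Python sketch of the kind of process just described: honor the hard requirements first (a 100-gram metal seat), then propagate the remaining cost budget down to the other components using a simple heuristic. The materials, prices, mass range, and the budget-splitting heuristic are all illustrative assumptions, not the lesson’s actual procedure.

```python
# Toy sketch of configuration by propagating a cost constraint downward:
# fix the required seat, then distribute the remaining budget across the legs.

COST_PER_GRAM = {"plastic": 0.01, "wood": 0.05, "metal": 0.10}  # assumed prices

def configure_chair(max_cost, seat_mass, seat_material, num_legs=4):
    spec = {"seat": {"mass": seat_mass, "material": seat_material}}
    spent = seat_mass * COST_PER_GRAM[seat_material]
    remaining = max_cost - spent
    if remaining < 0:
        return None  # the seat alone violates the cost constraint
    # Heuristic: split what is left evenly among the legs, using the cheapest
    # material that keeps each leg within a plausible 10-100 gram range.
    per_leg_budget = remaining / num_legs
    for material in ("plastic", "wood", "metal"):
        mass = per_leg_budget / COST_PER_GRAM[material]
        if 10 <= mass <= 100:
            spec["legs"] = {"count": num_legs,
                            "mass_each": round(mass),
                            "material": material}
            return spec
    return None  # no assignment satisfies the constraints

print(configure_chair(max_cost=16, seat_mass=100, seat_material="metal"))
# {'seat': {'mass': 100, 'material': 'metal'},
#  'legs': {'count': 4, 'mass_each': 30, 'material': 'wood'}}
```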

12 – Connection to Classification

>> So it sounds to me like, while classification is a way of making sense of the world, configuration is a way of creating the world. With classification, we perceive certain details in the world and decide what they are. With configuration, we’re given something to create and we decide on those individual variables.

13 – Contrast with Case-Based Reasoning

We can also contrast configuration with case-based reasoning. Both configuration and case-based reasoning are typically applied to routine design problems, problems of the kind that we’ve often encountered in the past. In the case of configuration, we start with a prototypical concept, then assign values to all the variables, as we saw in the chair example. In the case of case-based reasoning, we start with the design of a specific chair that we had designed earlier, look at its variables, and tweak it as needed to satisfy the constraints of the current problem. Case-based reasoning assumes that we have already designed other chairs and have stored examples of those designs in memory. Configuration assumes that we have already designed enough chairs that we can in fact extract a plan. When a specific problem is presented to an AI agent, the agent, if it is going to use the method of configuration, will call upon the plan abstraction hierarchy and then start refining plans. If the AI agent uses the method of case-based reasoning, then it will go into the case memory, retrieve the closest matching case, and then start tweaking the case. A little bit later we will see how an AI agent selects between different methods that are able to address the task. As we have mentioned earlier in the course, the chemical periodic table was one of the really important scientific discoveries. Similar to the chemical periodic table, we are trying to build a periodic table of intelligence. Unlike the chemical periodic table, which deals with valence electrons, our periodic table of intelligence deals with tasks and methods. In this particular course, we have considered both a large number of tasks, configuration being one of them, as well as a large number of methods, plan refinement and case-based reasoning being two of them.

14 – Connection to Planning

The process of configuration is also related to planning. You can consider a planner that actually generates the plans in this plan abstraction hierarchy. For any plan in the hierarchy, the planner can convert the plan into a skeletal plan: it drops the values of the variables in the plan and constructs a plan that simply specifies the variables without specifying their values. The process of configuration then takes these skeletal plans, organizes them into an abstraction hierarchy, and goes about instantiating, refining, and expanding them. We already discussed how configuration is connected to a number of other lessons, like case-based reasoning, planning, and classification. You may also consider these plans to be a kind of script for a physical object. In addition, this plan hierarchy might be learned through learning methods similar to the method of incremental concept learning. One of the things that we are doing in knowledge-based AI is describing the kinds of knowledge that we need to learn. Before we decide on what is a good learning method, we need to decide on what it is we need to learn. The configuration process tells us of the different kinds of knowledge that then become targets of learning. To connect this lesson back to our cognitive architecture, consider this figure once again. Knowledge of the prototypical chair, as well as knowledge about the various plans and the abstraction hierarchy, is stored in memory. As the input gives the specification of the design problem, the reasoning component instantiates those plans, refines them, and expands them. The knowledge itself is learned through examples of configurations of chairs that, presumably, the agent has already encountered previously.

15 – Assignment Configuration

So how might you use the idea of configuration to design an agent that can answer Raven’s progressive matrices? We’ve talked in the past about how constraint propagation can help us solve these problems. If configuration is a type of constraint propagation, how can you leverage the idea of variables and values in designing your agent? What are the variables, and what values can they take? We’ve also discussed how planning can be applied to Raven’s progressive matrices. If configuration leverages old plans, how will you build your agent to remember those old plans and reconfigure them for new problems? Will it develop the old plans based on existing problems, or will you hand it the plans in advance?

16 – Wrap Up

So today we’ve talked about configuration, a kind of routine design task. We do configuration when we’re dealing with a plan that we’ve used a lot in the past but need to modify to deal with some specific new constraints. So for example, we’ve built thousands of buildings, and thousands of cars, and thousands of computers, and each of them is largely the same. But there are certain parameters, like the number of floors in a building, or the portability of a computer, that differ from design to design. So we need to tweak individual variables to meet those new constraints. We started this off by defining design in general, and then we used that to define configuration as a certain type of routine design task. We then discussed the process of configuration and how it’s actually very similar to the constraint propagation that we talked about earlier. Then we connected this to earlier topics like classification, planning, and case-based reasoning, and saw how in many ways configuration is a task, while other things we’ve talked about provide us methods for accomplishing that task. So now we’ll move on to diagnosis, which is another topic related to design, where we try to uncover the cause of a malfunction in something that we may have designed. In some ways, we’ll see that diagnosis is a lot like configuration in reverse.

18 – Final Quiz

Please summarize what you learned in this lesson in this blue box.

19 – Final Quiz

Great, thank you very much.

22 – Diagnosis

01 – Preview

Today we will talk about diagnosis. Diagnosis is the identification of the fault or faults responsible for a malfunctioning system. The system could be a car, a computer program, an organism, or the economy. Diagnosis builds on our discussion of classification and configuration. We’ll start by defining diagnosis. Then we’ll set up two spaces: a data space and a hypothesis space. Data are about the malfunctioning system; hypotheses are about the faults that can explain the malfunction. Then we’ll construct mappings from the data space to the hypothesis space, which amount to diagnosis. We’ll discuss two views of diagnosis, diagnosis as classification and diagnosis as abduction. Abduction in this context may be a new term to you; we’ll discuss it in more detail today.

02 – Exercise Diagnosing Illness

To illustrate the task of diagnosis, let us begin with an exercise. When we think of diagnosis, most of us think in terms of medical diagnosis, the kind of diagnosis a doctor does. So this particular exercise comes from medical diagnosis; actually, it’s a made-up exercise from medical diagnosis. Here is a set of fictional diseases that the doctor knows about, along with the symptoms that each disease causes. So Alphaitis, for example, causes elevated A, reduced C, and elevated F, and so on. Given this set of data and this set of diseases, what disease or set of diseases do you think the patient suffers from?

05 – Data Space and Hypothesis Space

We can think of diagnosis as a mapping from a data space to a hypothesis space. In the case of medical diagnosis, the data may be the various kinds of signs and symptoms that a patient goes to a doctor with. Some of the data may be very specific, some of it may be very abstract. An example of very specific data is that Ashok’s temperature is 104 degrees Fahrenheit. An example of an abstraction of that data is that Ashok is running a fever. The hypothesis space consists of all the hypotheses that can explain parts of the observed data. A hypothesis in the hypothesis space can explain some part of the data. In the case of medicine, these hypotheses may refer to diseases; a doctor may say, my hypothesis is that Ashok is suffering from flu, and that explains his high fever. In the domain of car repair, the hypotheses may refer to specific faults with the car, for example, the carburetor is not working properly. In the domain of computer software, the hypotheses may refer to specific methods not working properly. This mapping from the data space to the hypothesis space can be very complex. The complexity arises partly because of the size of the data space, partly because of the size of the hypothesis space, partly because the mapping can be m-to-n, and also because hypotheses can interact with each other: if H3 is present, H4 may be excluded; if H5 is present, H6 is sure to be present; and so on. It helps, then, not to deal with all the raw data, but with abstractions of the data. The initial data that a patient goes to a doctor with may be very, very specific, the signs and symptoms of that particular patient, but the diagnostic process might abstract them from "Ashok has a fever of 104 degrees Fahrenheit" to "Ashok has high fever." This abstract data can then be mapped into an abstract hypothesis: "Ashok has high fever" can get mapped into "Ashok has a viral infection," for example. The abstract hypothesis can now be refined into Ashok suffering from flu, or flu of a particular strain. At the end, we want a hypothesis that is as refined as possible and that explains all the available data. When we were talking about classification, we talked about two processes of classification, a bottom-up process and a top-down process. In the bottom-up process of classification, we started with raw data and then grouped and abstracted it; in top-down classification, we started with some high-level class and then established and refined it. You can see that in diagnosis, both the bottom-up process of classification and the top-down process of classification are co-occurring. This method of bottom-up classification in the data space, mapping into the hypothesis space, and then top-down classification in the hypothesis space is called heuristic classification. This is yet another method, like rule-based reasoning, case-based reasoning, and model-based reasoning, for the diagnostic task.
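Here is a minimal Python sketch of heuristic classification in this spirit: abstract the raw data bottom-up, map abstract findings to an abstract hypothesis, then refine that hypothesis top-down. The thresholds, rules, and disease names are illustrative assumptions, not the lecture’s actual knowledge base.

```python
# Sketch of heuristic classification for diagnosis: bottom-up abstraction of
# data, mapping into the hypothesis space, then top-down refinement.

def abstract_data(raw):
    """Bottom-up: raw findings -> abstract findings."""
    abstracted = set()
    if raw.get("temperature_f", 98.6) >= 103:
        abstracted.add("high fever")
    if raw.get("cough"):
        abstracted.add("cough")
    return abstracted

# Mapping: abstract findings -> abstract hypotheses (illustrative rules).
ABSTRACT_RULES = {
    frozenset({"high fever", "cough"}): "respiratory infection",
    frozenset({"high fever"}): "infection",
}

# Top-down refinement: abstract hypothesis -> more specific hypotheses.
REFINEMENTS = {
    "respiratory infection": ["flu", "bronchitis"],
    "infection": ["viral infection", "bacterial infection"],
}

def heuristic_classification(raw):
    findings = abstract_data(raw)
    for pattern, hypothesis in ABSTRACT_RULES.items():
        if pattern <= findings:          # every finding in the pattern is present
            return hypothesis, REFINEMENTS.get(hypothesis, [])
    return None, []

print(heuristic_classification({"temperature_f": 104, "cough": True}))
# ('respiratory infection', ['flu', 'bronchitis'])
```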

06 – Problems with Diagnosis as Classification

>> In general, cancellation interactions are very hard to account for. In order to address these factors that make diagnosis so complex, it is useful to shift from the perspective of diagnosis solely as classification to a perspective of diagnosis as abduction.

07 – Deduction, Induction, Abduction

>> Or, given the rule "if flu then fever" and the fact that Ashok has fever, we might be able to abduce that Ashok has flu. First of all, notice that we are back to diagnosis: diagnosis is an instance of abduction. But notice several other properties. First, deduction is truth-preserving. If the rule is true and the cause is true, we can always guarantee that the effect is true as well. Induction and abduction are not truth-preserving. We may know something about the relationship between cause and effect for some sample, but that does not mean that the same relationship holds for the entire population; induction does not always guarantee correctness. Same for abduction. We may know the rule and the effect, and we may suppose that the cause is true, but that may not necessarily be the case. It may be the case that if flu then fever, and Ashok may have fever, but that does not necessarily mean that Ashok has flu. Fever can be caused by many, many things. The reason that fever does not necessarily mean that Ashok has flu is that there can be multiple causes for the same effect, multiple hypotheses for the same data. This is exactly the problem that we encountered earlier when we were talking about what makes diagnosis hard. We said that deduction, induction, and abduction are three of the fundamental forms of inference. We can of course also combine these inferences. Science is a good example. You and I as scientists observe some data about the world. Then we abduce some explanation for it. Having abduced that explanation, we induce a rule. Having induced a rule, we can now use deduction to predict new data elements. We go and observe some more. Again we abduce, induce, deduce, and we continue the cycle. Might this cycle also explain a significant part of cognition? Is this what you and I do on a daily basis: abduce, induce, deduce?
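Here is a small Python sketch contrasting the three inferences over the lecture’s "if flu then fever" rule. The function bodies are illustrative simplifications; the point is only that deduction is sound while induction and abduction yield plausible but unguaranteed conclusions.

```python
# Contrast deduction, induction, and abduction over a single cause->effect rule.

RULES = {"flu": "fever"}          # rule: cause -> effect

def deduce(cause):
    """Given the rule and the cause, conclude the effect (truth-preserving)."""
    return RULES.get(cause)

def abduce(effect):
    """Given the rule and the effect, hypothesize possible causes (not
    truth-preserving: fever can be caused by many things)."""
    return [cause for cause, eff in RULES.items() if eff == effect]

def induce(observations):
    """Given cause/effect samples, propose a rule (not truth-preserving:
    the sample may not represent the whole population)."""
    causes = {c for c, _ in observations}
    effects = {e for _, e in observations}
    if len(causes) == 1 and len(effects) == 1:
        return (causes.pop(), effects.pop())
    return None

print(deduce("flu"))                                  # 'fever'
print(abduce("fever"))                                # ['flu'], a hypothesis only
print(induce([("flu", "fever"), ("flu", "fever")]))   # ('flu', 'fever')
```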

08 – Criteria for Choosing a Hypothesis

Now that we understand abduction, and now that we know that diagnosis is an instance of abduction, let us ask ourselves: how does this understanding help us in choosing hypotheses? The first principle for choosing a hypothesis is explanatory coverage: a hypothesis must cover as much of the data as possible. Here’s an example. Hypothesis H3 explains data items D1 through D8. Hypothesis H7 explains data items D5 to D9. Assuming that all of these data elements are equally important or equally salient, we may prefer H3 over H7 because it explains more of the data than does H7. The second principle for choosing between competing hypotheses is the principle of parsimony: all things being equal, we want to pick the simplest explanation for the data. So consider the following scenario. H2 explains data elements D1 to D3. H4 explains data elements D1 through D8. H6 explains data elements D4 to D6, and H8 explains data elements D7 to D9. Now, if we went by the criterion of explanatory coverage, then we might pick H2 plus H6 plus H8, because the three of them combined explain more than H4 alone. However, the criterion of parsimony would suggest we pick H4, because H4 alone explains almost all the data and we don’t need the other three hypotheses. In general this is a balancing act between these two principles: we want to both maximize the coverage and maximize the parsimony. Based on this particular example, we may go with H4 and H8. The two together explain all the data, and in addition, the set of these two hypotheses is smaller than the set of hypotheses H2, H6, and H8. The third criterion for choosing between competing hypotheses is that we want to pick those hypotheses in which we have more confidence. Some hypotheses are more likely than others; we may have more confidence in some hypotheses than in others. As an example, in this particular scenario, H3 may explain data items D1 to D8, and H5 may explain data elements D1 to D9. So H5 also explains D9, which H3 doesn’t. However, we may have more confidence in H3, and so we may pick H3 instead of H5. Once again, this is a balancing act between these three criteria for choosing between competing diagnostic hypotheses. A quick point to note here: these three criteria are useful for choosing between competing hypotheses even if the task is not diagnosis. The same problem occurs, for example, in intelligence analysis. Imagine that you have some data that needs to be explained and competing hypotheses for explaining that data; you may pick between the competing hypotheses based on these criteria even though the task is not a diagnostic task. These three criteria are useful for explanation in general; diagnosis simply happens to be an example of such a task.
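One simple way to operationalize the balancing act is a greedy scheme: repeatedly pick the hypothesis that explains the most still-unexplained data, breaking ties by confidence. This particular greedy procedure is an illustrative assumption, not an algorithm the lecture prescribes, but it reproduces the H4-plus-H8 choice from the scenario above.

```python
# Greedy sketch of hypothesis selection balancing coverage, parsimony, and
# confidence: cover the data with as few, well-trusted hypotheses as possible.

def choose_hypotheses(hypotheses, data):
    """hypotheses: dict name -> (set of data items explained, confidence)."""
    chosen, uncovered = [], set(data)
    while uncovered:
        best = max(
            hypotheses,
            key=lambda h: (len(hypotheses[h][0] & uncovered), hypotheses[h][1]),
        )
        gain = hypotheses[best][0] & uncovered
        if not gain:
            break                      # remaining data cannot be explained
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered

# The lecture's second scenario: H4 plus H8 together cover D1..D9 with a
# smaller set of hypotheses than H2 + H6 + H8.
hyps = {
    "H2": ({"D1", "D2", "D3"}, 0.5),
    "H4": ({"D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8"}, 0.5),
    "H6": ({"D4", "D5", "D6"}, 0.5),
    "H8": ({"D7", "D8", "D9"}, 0.5),
}
data = {f"D{i}" for i in range(1, 10)}
print(choose_hypotheses(hyps, data))   # (['H4', 'H8'], set())
```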

09 – Exercise Diagnosis as Abduction

Let us do an exercise together. The data in this particular exercise is a little more complicated than in the previous one. On the right-hand side, I’ve shown a set of diseases. What disease or subset of these diseases best explains the available data?

10 – Exercise Diagnosis as Abduction

>> Note that one can use alternative methods for the same problem. For example, one could use case-based reasoning. Suppose that we came across a problem very similar to this one previously, and that the solution of that particular problem was already stored as a case. In that particular case, B was high, C was low, and H was low, and the solution was Thetadesis. In the current problem, the additional symptom is that F is low. So case retrieval would first lead you to the conclusion of Thetadesis, and you would then tweak this particular solution to also account for the additional symptom of F being low. We could do that by adding Kappacide and Mutension to Thetadesis. A case-based system thus would tend to arrive at this alternate set of hypotheses. One more point to note here, then: different methods can lead to different solutions. Given different methods, how might an AI agent decide which method to select? We’ll return to this particular question when we discuss meta-reasoning.

11 – Completing the Process

>> We can also think of this last phase as a type of configuration, which we talked about last time. Given a set of hypotheses about illnesses or faults with a car, we can then configure a set of treatments or repairs that best address the faults we discovered before.

12 – Assignment Diagnosis

So how would the idea of diagnosis help us design an agent that can answer Raven’s progressive matrices? Perhaps the best way to think about this is to consider how your agent might respond when it answers a question wrong. First, what data will it use to investigate its incorrect answer? Second, what hypotheses might it have for incorrect answers? Third, how will it select the hypothesis that best explains the data? And last, once it has selected a hypothesis that explains the data, how will it use that to repair its reasoning so it doesn’t make the same mistake again?

15 – Final Quiz

Please write down what you learned in this lesson.

16 – Final Quiz

Thank you very much.

23 – Learning by Correcting Mistakes

01 – Preview

Today we’ll talk about another method of learning, called learning by correcting mistakes. An agent reaches a decision. The decision turns out to be incorrect or suboptimal. Why did the agent make that mistake? Can the agent correct its own knowledge and reasoning so that it never makes the same mistake again? As an example, I’m driving and I decide to change lanes. As I change lanes, I hear cars honking at me. Clearly I made a mistake. But what knowledge, what reasoning, led to that mistake? Can I correct it so that I don’t make the mistake again? Learning by correcting mistakes is our first lesson in meta-reasoning. We’ll start today by revisiting explanation-based learning. Then we’ll use explanations for isolating mistakes; this will be very similar to diagnosis, except that here we’ll be using explanation for isolating mistakes. This will make it clear why explanation is so central to knowledge-based AI. Then we’ll talk about how we can use explanations for correcting mistakes, which will set up the foundation for a subsequent discussion on meta-reasoning.

02 – Exercise Identifying a Cup

To illustrate learning by correcting mistakes, let’s go back to an earlier example. We encountered this example when we were discussing explanation-based learning. So imagine again that you have bought a robot from the Acme hardware store, and in the morning you told your robot, go get me a cup of coffee. Now the robot is already bootstrapped with knowledge about the definition of a cup: a cup is an object that is stable and enables drinking. The robot goes into your kitchen and can’t find a single clean cup. So it looks around. This is a creative robot, and it finds in the kitchen a number of other objects. One particular object has this description: the object is light and it’s made of porcelain. It has decorations, and it has a concavity and a handle. And the bottom of this object is flat. Now the robot decides to use this object as a cup, because it can prove to itself that this object is an instance of a cup. It does so by constructing an explanation. The explanation is based on the fact that the bottom is flat, that it has a handle, that the object is concave, and that it is light. Let us do an exercise together that will illustrate the need for learning by correcting mistakes. Shown here are six objects, and there are two questions here. The first question is, which of these objects do you think is a cup? Mark the button on the top left if you think that a particular object is a cup. The second question deals with the definition of a cup that we had in the previous screen. So mark the button on the right as solid if you think that that particular object meets the definition of the cup in the previous screen.

04 – Questions for Correcting Mistakes

>> This problem of identifying the error in one’s knowledge that led to a failure is called credit assignment. Blame assignment might be a better term: a failure has occurred; what fault or gap in one’s knowledge was responsible for the failure? This is blame assignment. In this lesson, we’ll be focusing on gaps or errors in one’s knowledge. In general, the error could also be in one’s reasoning or in one’s architecture; credit assignment applies to all of those different kinds of errors. Several theorists, Marvin Minsky for example, consider credit assignment to be the central problem in learning. This is because AI agents live in dynamic worlds, and therefore we’ll never be able to create an AI agent which is perfect. Even if we were to create an AI agent which had complete knowledge and perfect reasoning relative to some world, the world around it would change over time. As it changes, the agent will start failing. Once it starts failing, it must have the ability to correct itself: to correct its own knowledge, its own reasoning, its own architecture. You can see again how this relates to metacognition. The agent is not diagnosing some electrical circuit or a car or a software program outside itself; instead, it is self-diagnosing and self-repairing.

05 – Visualizing Error Detection

As we mentioned previously, in general, errors may lie in the knowledge, in the reasoning, or in the architecture of an agent, and therefore learning by correcting errors might be applicable to any one of those. However, in this lesson, we will be focusing only on errors in knowledge. In particular, we’ll be focusing on errors in classification knowledge. Classification, of course, is a topic that we have considered in this class repeatedly. Let us consider an AI agent that has two experiences of executing an action in the world. In the first experience, the agent used this object as a cup and got the feedback that this indeed was a cup; so this was a positive experience. This, on the other hand, is a negative example: here, the agent viewed this as a cup and got the feedback that this should not have been viewed as a cup. We can visualize the problem of identifying what knowledge element led the agent to incorrectly classify this as a cup as follows. This left circle here consists of all features that describe the positive example. The circle on the right consists of all features that describe the negative example. So features in this left circle might be things like: it has a handle, there is a question mark on it, it has a blue interior, and so on. The circle on the right consists of features that characterize the negative example: it has a movable handle, it has a red interior, it has red and white markings on the outside. There are some features that characterize only the positive experience and not the negative experience. There are those that characterize only the negative experience and not the positive experience. There are also many features that characterize both the positive and the negative example; for example, they are both concave, they both have handles, and so on. In this example, it is these features that are especially important. We’ll call them false-suspicious features. We call them false-suspicious features because, first, they characterize only the negative experience, and second, one or more of these features may be responsible for the fact that the agent classified this as a positive example when in fact it was a negative example. As an example, suppose that this feature corresponds to a movable handle. This is a false-suspicious feature. It is false because this experience was false; it is suspicious because it does not characterize the positive experience, and thus it may be one of the features responsible for the fact that this was a negative example. But now there is an additional problem: there are a number of false-suspicious features here. So how will the agent decide which false-suspicious feature to focus on? We encountered this problem earlier, when we were talking about incremental concept learning. At that point we had said that we wanted to give examples in an order such that each succeeding example differed from the current concept definition in exactly one feature, so that the agent knows exactly what feature to focus on. The same kind of problem occurs again: how might the agent know which feature to focus on? One possible idea is that it could try one feature at a time and see if it works. That is, it could select this feature, repeat the process, get more feedback, and the feature is either accepted or eliminated. An alternative is that the agent receives not just two experiences, but many such experiences. So there may be other positive experiences that cover this part of the circle.
That would leave only this as a false-suspicious feature, and then the agent can focus attention on this feature. As an example, just as this dot may correspond to a movable handle, this one may correspond to a red interior, because red interior is one of the features that characterizes the negative example and not the positive example. But later on, there might come another positive example of a cup which has a red interior, in which case the agent can exclude this particular feature. The reverse of this situation is also possible. Let us suppose that the agent decides that this is not a cup, perhaps because its definition says that something with a blue interior is not a cup, and therefore it doesn’t bring water to you inside this cup and tells you there is no cup available in the kitchen. You go to the kitchen, you see it, and you say, well, this is a cup. Now the agent must learn why it decided that this was not a cup. In this case, the relevant features are these three features: the three features that characterize this cup but do not characterize the other experiences. So this dot may correspond to a blue interior, and this dot may correspond to a question mark on the exterior. We’ll call these features true-suspicious, just as we called the others false-suspicious. These are the features that prevented the agent from deciding that this was a positive example of a cup. One or more of these features may be responsible for the agent’s failure to recognize that this was a cup.

06 – Error Detection Algorithm

>> As you can see, we’re taking unions and intersections of features characterizing different examples. The number of examples, both positive and negative, needed for this algorithm to work well depends upon the complexity of the concept. In general, the more features there are in the description of the object, the more examples you will need to identify the features that were responsible for a failure.
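Here is a minimal Python sketch of that idea: intersect the features of the negative experiences and subtract everything seen in any positive experience to get the false-suspicious features, and vice versa for the true-suspicious ones. The specific feature names are illustrative, chosen to echo the cup example.

```python
# Error detection by set operations over feature descriptions of examples.

def false_suspicious(positives, negatives):
    """Features present in every negative example but in no positive one."""
    common_neg = set.intersection(*negatives)
    all_pos = set.union(*positives)
    return common_neg - all_pos

def true_suspicious(positives, negatives):
    """Features present in every positive example but in no negative one."""
    common_pos = set.intersection(*positives)
    all_neg = set.union(*negatives)
    return common_pos - all_neg

positives = [{"concave", "handle", "handle fixed", "bottom flat"}]
negatives = [{"concave", "handle", "handle movable", "bottom flat"}]
print(false_suspicious(positives, negatives))  # {'handle movable'}
print(true_suspicious(positives, negatives))   # {'handle fixed'}
```

With more positive and negative examples in the lists, the suspicious sets shrink, which mirrors the point above that more examples help isolate the responsible feature.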

08 – Explaining the Mistake

You may recall this explanation from our lesson on explanation-based learning. There, the agent constructed an explanation like this to show that a specific object was an example of a cup. For the example of the pail, the agent may have constructed a similar explanation, with the object being replaced by the pail everywhere. Now, however, the agent knows that the pail is not an example of a cup; something is not quite right with this explanation. We have also just seen how the agent can identify the false-suspicious feature relevant to this explanation: it has identified that the handle must be fixed, because that is the feature that separates the positive experiences from the negative experiences. The question then becomes, how can this explanation be repaired by incorporating "handle is fixed"? Where should "handle is fixed" go?

09 – Discussion Correcting the Mistake

>> What do you think? Is this a good way to fix the agent’s error?

11 – Correcting the Mistake

So the agent figured out that "handle is fixed" should go beneath "object enables drinking," but not beneath "object is liftable." So the agent will put "handle is fixed" here, in this particular explanation. This is correct. If the agent has background knowledge that tells it that the reason "handle is fixed" is important is because it makes the object manipulable, which in turn enables drinking, then the agent can insert this additional assertion here in the explanation. Even more, if the agent has additional background knowledge that tells it that the fact that the object has a handle, and that the handle is fixed, together make the object orientable, which is what makes the object manipulable, then the agent may come up with a richer explanation. The important point here is that, as powerful and important as classification is, it alone is not sufficient. There are many situations in which explanation too is very important. Explanation leads to richer learning, deeper learning.

13 – Assignment Correcting Mistakes

So how would you use learning by correcting mistakes to design an agent that can answer Raven’s progressive matrices? On one level, this might seem easy: your agent is able to check to see if its answers are correct, so it’s aware of when it makes a mistake. But the knowledge of when it’s made a mistake merely triggers the process of correcting the mistake; it doesn’t correct it itself. So how will your agent isolate its mistake? What exactly is it isolating here? Once it’s isolated the mistake, how will it explain the mistake? And how will that explanation then be used to correct its mistake, so it doesn’t make the same mistake in the future? Now, in this process we can ask ourselves: will your agent correct the mistake itself, or will you use the output to correct the mistake in your agent’s reasoning? Will you look at what it did and say, here’s the mistake it made, so next time it shouldn’t make that mistake? If you’re the one using your agent’s reasoning to correct your agent, then as we’ve asked before, who’s the intelligent one? You, or your agent?

15 – The Cognitive Connection

Learning by correcting mistakes is a fundamental process of human learning. In fact, it may closely resemble the way you and I learn in practice. In our lives, we rarely are passive learners; most of the time we’re active participants in the learning process. Even in a didactic setting like this, you’re not just listening to what I’m saying. Instead, you’re using your knowledge and reasoning to make sense of what I’m saying. You generate expectations. Sometimes those expectations may be violated. When they’re violated, we generate explanations for them. We try to figure out what was in error in our knowledge and reasoning. This is learning by correcting mistakes. Notice that this involves thinking about your own thinking, a step towards meta-reasoning, which is our next lesson.

17 – Final Quiz

Great. Thank you so much for your feedback.

24 – Meta-Reasoning

03 – Beyond Mistakes Knowledge Gaps

So far, we have talked about the case where there was an error in the knowledge, or an error in the reasoning or the learning. The knowledge, for example, was incorrect in some way. But the knowledge can also be incomplete: there can be a gap in knowledge, or in reasoning, or in learning. We saw a gap in knowledge when we were doing the exercise on explanation-based learning. In that particular case, the agent had built this part of the explanation, and it also had this part of the explanation, but it could not connect the two, because there was no knowledge connecting the fact that an object that has thick sides can limit heat transfer with the conclusion that it would protect against heat. Once the agent detects this as a knowledge gap, it can set up a learning goal. The learning goal now is to acquire some knowledge that will connect these two pieces of knowledge. Notice that we are seeing how agents can spawn goals; in this particular case, the agent is spawning a learning goal. You might recall that when we did this exercise on explanation-based learning, the agent went back to its memory and found a precedent, a piece of knowledge, that enabled it to connect these two parts of the explanation. So this link was formed, and the agent was then able to complete its explanation. This is an example of how the learning goal was satisfied using some piece of knowledge. In this case the knowledge came from memory, but the agent could potentially also have acquired the knowledge from the external world. For example, it may have gone to a teacher and said, I have a learning goal; help me with the knowledge that will satisfy that learning goal. The ability to spawn learning goals and then find ways of satisfying or achieving those learning goals, or any goals in general, is another aspect of metacognition. So this was an example of how metacognition helps resolve a gap in knowledge. Now let us see how it can help resolve gaps in reasoning or learning. To see how metacognition can help resolve reasoning gaps, let us return to the example of using means-ends analysis in the blocks microworld. Once the agent reaches a cul-de-sac in its reasoning, the agent could explicitly formulate its goal and ask itself, how can I resolve this cul-de-sac? It may then be reminded of the strategy of problem reduction, which decomposes its goal into several independent subgoals, and then the agent can go about achieving each goal one at a time. Thus, in this example, the agent set up a new reasoning goal and used that reasoning goal to pick a different strategy, thereby resolving the cul-de-sac. Note also that this is one way in which we can integrate multiple strategies: we first use means-ends analysis, run into the cul-de-sac, form a new reasoning goal, use that goal to bring in a different strategy, problem reduction, and then go back to the original strategy, means-ends analysis, achieving each goal independently.
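To make the idea of gap detection and goal spawning concrete, here is a toy Python sketch: an explanation is treated as a chain of assertions, and any missing link between adjacent assertions becomes a learning goal to acquire knowledge connecting them. The assertion strings echo the thick-sides example above; the representation itself is an assumption made for illustration.

```python
# Toy sketch of knowledge-gap detection: a missing link in an explanation
# chain spawns a learning goal to acquire the connecting knowledge.

KNOWN_LINKS = {
    ("object has thick sides", "object limits heat transfer"),
    # The link from "limits heat transfer" to "protects against heat"
    # is deliberately missing, producing a knowledge gap.
}

def find_learning_goals(chain):
    goals = []
    for a, b in zip(chain, chain[1:]):           # walk adjacent assertions
        if (a, b) not in KNOWN_LINKS:
            goals.append(f"acquire knowledge connecting '{a}' to '{b}'")
    return goals

explanation = ["object has thick sides",
               "object limits heat transfer",
               "object protects against heat"]
print(find_learning_goals(explanation))
# ["acquire knowledge connecting 'object limits heat transfer'
#   to 'object protects against heat'"]
```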

05 – Strategy Selection

In this course, we have learned about a large number of reasoning methods. Here are some of them; we could have added a lot more, for example, plan refinement or logic or scripts. Typically, when you and I program an AI agent, we pick a method and we program that method into the agent. One unanswered question is, how might an agent know about all of these methods and then autonomously select the right method for a given problem? This is the problem of strategy selection, and metacognition helps with strategy selection. Given a problem, and given that all of these methods are available to the agent to potentially address the problem, metacognition can select between these methods using several criteria. First, each of these methods requires some knowledge of the world. For example, case-based reasoning requires knowledge of cases; constraint propagation requires knowledge of constraints; and so on. Metacognition can select one particular method depending on exactly what knowledge is available for addressing that specific input problem. If, for that specific input problem, no cases are available, then clearly the method of case-based reasoning cannot be used. If, on the other hand, constraints are available, then constraint propagation might be a useful method. Second, if the knowledge required by multiple methods is available, then metacognition must select between the competing methods. One of the criteria for selecting between these methods might be computational efficiency. For a given class of problems, some of these methods might be computationally more efficient than other methods. As an example, if the problem is very close to a previously encountered case, then case-based reasoning might be computationally a very good method to use. On the other hand, if the new problem is very different from previously encountered cases, then case-based reasoning may not be a computationally efficient method. We’ve come across this issue of computational efficiency earlier in this class, for example when we were discussing generate and test. If the problem is simple, then it is potentially possible to write a generator that will produce good solutions to it. On the other hand, for a very complex problem, the process of generating good solutions may be computationally inefficient. Similarly, if there is a single goal, then the method of means-ends analysis may be a good choice. On the other hand, if there are multiple goals that interact with each other, then means-ends analysis can run into all kinds of cul-de-sacs and have poor computational efficiency. A third criterion that metacognition can use to select between these various methods is the quality of solutions. Some methods come with guarantees of the quality of solutions. For example, logic is a method that provides some guarantees of the correctness of solutions. Thus, if this is a problem for which computational efficiency is not important but the quality of solutions is critical, you might want to use the method of logic, because it provides some guarantees of quality even though it might be computationally inefficient. The same kind of analysis holds for selecting between different learning methods. Once again, given a problem, the agent may have multiple learning methods for addressing that particular problem. Which method should the learning agent choose? That depends partly on the nature of the problem: some methods are applicable to the problem, and some methods may not be.
For example, in a learning task, if the examples come in one at a time, we might use incremental concept learning. On the other hand, if all the examples are given together, then we might use decision-tree learning or identification-tree learning. Another criterion for deciding between these methods could be computational efficiency, and yet another criterion could have to do with the quality of solutions.
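Here is a small Python sketch of metacognitive strategy selection in this spirit: filter the methods whose knowledge requirements are met, then prefer the one judged most efficient for this problem. The method names are from the course, but the requirement sets and the efficiency scores are illustrative assumptions, not values the lecture gives.

```python
# Sketch of strategy selection: pick an applicable method (its knowledge
# requirements are satisfied), preferring higher estimated efficiency.

METHODS = {
    "case-based reasoning":   {"requires": {"cases"},       "efficiency": 3},
    "constraint propagation": {"requires": {"constraints"}, "efficiency": 2},
    "means-ends analysis":    {"requires": {"operators"},   "efficiency": 1},
}

def select_method(available_knowledge, quality_critical=False):
    applicable = [m for m, spec in METHODS.items()
                  if spec["requires"] <= available_knowledge]
    if not applicable:
        return None
    if quality_critical:
        # A fuller version might fall back to a method with correctness
        # guarantees (e.g. logic) even if it is less efficient.
        pass
    return max(applicable, key=lambda m: METHODS[m]["efficiency"])

print(select_method({"cases", "operators"}))   # 'case-based reasoning'
print(select_method({"constraints"}))          # 'constraint propagation'
```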

07 – Process of Meta-Reasoning

>> To summarize this part, then: metacognition can use the same reasoning strategies that we have been studying at the deliberative level.

08 – Discussion Meta-Meta-Reasoning

So if metacognition reasons over deliberation, could we also have an additional layer, where meta-metacognition reasons over metacognition? And to take that even further, could we have meta-meta-metacognition that reasons over meta-metacognition, all the way up, infinitely, in a hierarchy? Is this a good way to think about the levels of metacognition?

11 – Connections

So, like we said earlier in this lesson, we’ve actually been talking about kinds of metacognition throughout this course, even if we didn’t call it that at the time. We were talking about agents reflecting on their own knowledge and correcting it when it led to a mistake. Earlier in this lesson, we also talked about the possibility that an agent would reflect on the learning process that led it to the incorrect knowledge, and correct that learning process as well. Back during partial-order planning, we talked about agents that could balance multiple plans and resolve conflicts between those plans. This can be seen as a form of metacognition as well. The agent forms a plan for achieving one goal, a plan for achieving the other goal, and then thinks about its own plans for those two goals. Then it detects the conflict between those two plans and creates a new plan to avoid that conflict. Here the agent is reasoning over its own planning process. We saw this in production systems as well. We had an agent that reached an impasse: it had two different pitches that were suggested, and it couldn’t decide between the two. So it formed a new learning goal: find a rule to choose between those pitches. It then selected a learning strategy, chunking, went into its memory, found a case, and chunked a rule that would resolve that impasse. In this case, the agent used the impasse to set up a new learning goal, and then used strategy selection to achieve that learning goal. We can also see metacognition in version spaces. Our agent has the notion of specific and general models, and it also has the notion of convergence. The agent is consistently thinking about its own specific and general models and looking for opportunities to converge them into one model of the concept. And finally, we can very clearly see metacognition in our lesson on diagnosis. We talked about how the results of a treatment become new data for our iterative process of diagnosis. If our treatment didn’t produce the desired results, it also spawns data for the meta layer: not only do we still want to diagnose the current malfunction, but we also want to diagnose why we weren’t able to diagnose it correctly in the first place. So now we’re diagnosing the problem with our diagnostic process. So as we can see, metacognition has actually been implicit in several of the topics we’ve talked about in this course.

14 – Wrap Up

So today we’ve talked about meta-reasoning. This lesson very strongly leveraged and built on nearly everything we’ve talked about so far in this course; meta-reasoning is, in many ways, reasoning about everything we’ve covered so far. We started off by recapping learning by correcting mistakes and the related notion of gaps. Then we covered two broad metacognitive techniques called strategy selection and strategy integration. We then discussed whether or not meta-meta-reasoning might exist, and we decided, ultimately, that such a distinction isn’t even necessary. After all, the structures involved in meta-reasoning, like cases, and rules, and models, are the same as those involved in reasoning itself; so meta-reasoning is already equipped to reason about itself. Finally, we discussed a particular example of meta-reasoning called goal-based autonomy. Meta-reasoning is in many ways the capstone of our course. It covers reasoning over all the topics we’ve covered so far, and it provides a way that they can be used in conjunction with one another. We do have a few more things to talk about, though, and we’ll cover those in our Advanced Topics lesson.

15 – The Cognitive Connection

Meta-reasoning is arguably one of the most critical processes of human cognition. In fact, some researchers suggest that developing metacognitive skills at an early age may be the best predictor of a student’s success later in life. Actually, this makes sense. Meta-reasoning is not about simply learning new information; it is about learning how to learn, about learning new reasoning strategies, about integrating new information into memory structures. Meta-reasoning is also connected to creativity. In meta-reasoning, the agent is monitoring its own reasoning. It is spawning goals and trying to achieve them. Sometimes it suspends a goal; sometimes it abandons a goal. These are all part of the creative process. Creativity is not just about creating new products; it is also about creating the processes that lead to interesting products.

17 – Final Quiz

Thank you, for answering this quiz, on metacognition.

25 – Advanced Topics

01 – Preview

To close this class, we are talking through a handful of advanced topics related to the course material. In this course we have already discussed a variety of tasks, methods, and paradigms of knowledge-based AI. Now let’s close by talking through some of the advanced applications of this content. We’ll also talk quite a bit about some of the connections with both AI and human cognition. Many of the topics we’ll discuss today are very broad and discussion-oriented, so we encourage you to carry on the conversation on the forums and discuss all the issues that this content raises.

02 – Visuospatial Reasoning Introduction

Visuospatial reasoning is reasoning with visuospatial knowledge. This has two parts to it, visual and spatial. Visual deals with the "what" part; spatial deals with the "where" part. So imagine a picture in which there is a sun at the top right of the picture. There are two parts to it: the sun, the "what," the object; and the "where," the top right of the picture. We came across visuospatial reasoning a little bit when we used constraint propagation to do line labeling in 2D images. One way of defining visuospatial knowledge is to say that in visuospatial knowledge, causality is, at most, implicit. Imagine a picture in which there is a cup with a pool of water around it. You don’t know where the pool of water came from, but you and I can quickly infer that the cup must have contained the water, and the water must have spilled out as the cup fell. So in visuospatial knowledge, causality is at most implicit, and yet such knowledge enables inferences about causality.

03 – Two Views of Reasoning

There are several ways in which we can deal with visuospatial knowledge; in fact, in your projects you’ve already come across some of them. So imagine there is a figure here: a triangle with the apex facing to the right, and another triangle with the apex facing to the left. In one view, the AI agent can extract propositional representations out of figures like this, and similarly propositional representations out of figures like that. So this is a propositional representation, and this is a propositional representation. Then the AI agent can work on these propositional representations to produce new propositional representations. So an AI agent can use a logic engine or a production rule to say that this particular triangle, which was rotated 90 degrees, has now been rotated to 270 degrees. So although the input was in the form of figures, the action here was at the level of propositional representations of these figures. The agent may extract propositional representations like this through image processing and image segmentation, perhaps using techniques like constraint propagation as well. Alternatively, the agent may have analogical representations. In these analogical representations, there is a structural correspondence between the representation and the external figure. So if the external world has a triangle like this, the analogical representation will also have a triangle like this. Notice that I’m using the term analogical representation as something separate from analogical reasoning; we are not talking about analogical reasoning right now. An analogical representation is one which has some structural correspondence with the external world that is being represented. Given one analogical representation, I might then use affine transformations or set transformations to get the other. So I may say that I got this triangle out of that one simply by the operation of reflection or rotation. The propositional representations in the previous view are amodal: they are separated from, divorced from, the perceptual modality. These analogical representations, on the other hand, are modal representations; they’re very close to the perceptual modality. Human cognition, in mental imagery, appears to use analogical representations. What would be an equivalent notion of computational imagery? Human cognition is very good at using both propositional representations and analogical representations. Computers, however, are not yet good at using analogical representations; most computers, most of the time, use only propositional representations. The same kind of analysis may apply to other perceptual modalities, not just to visual images. So here are two melodies, and we can either extract propositional representations out of them and then analyze those propositional representations, or we could reason directly with the relationship between these two particular melodies. There is a question here for building theories of human cognition. When you’re driving a car, and you listen to a melody on your radio, and you’re reminded of something, reminded of a similar melody that you had heard earlier, what exactly is happening? Are you extracting a propositional representation out of the melody that you just heard, and then that propositional representation reminds you of the propositional representation of a previously heard melody?
Or does the new melody somehow directly remind you of a previously heard melody, without any intermediate propositional representation? These are open issues in cognitive science, as well as in knowledge-based AI. In cognitive science, there is by now significant agreement that human cognition does use mental imagery, at least with visual images. But we don’t yet know how to do mental imagery in computers.
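Here is a small Python sketch contrasting the two views for the rotated-triangle example: a propositional representation as amodal symbols, and an analogical representation as vertex coordinates to which an affine transformation is applied directly. The particular symbols, coordinates, and rotation operation are illustrative choices, not the lecture’s representations.

```python
# Propositional vs analogical representations of the rotated-triangle example.

import math

# Propositional view: amodal symbols describing each figure, plus a symbolic
# rule such as "apex-right rotated 180 degrees gives apex-left".
propositional_A = ("triangle", "apex-right")
propositional_B = ("triangle", "apex-left")

# Analogical view: vertex coordinates that structurally correspond to the
# figure; transformations are applied to the figure itself.
triangle_A = [(0.0, 0.0), (0.0, 2.0), (2.0, 1.0)]   # apex pointing right

def rotate(points, degrees, about=(1.0, 1.0)):
    """Apply an affine transformation (rotation) directly to the figure."""
    rad = math.radians(degrees)
    cx, cy = about
    return [(round(cx + (x - cx) * math.cos(rad) - (y - cy) * math.sin(rad), 6),
             round(cy + (x - cx) * math.sin(rad) + (y - cy) * math.cos(rad), 6))
            for x, y in points]

triangle_B = rotate(triangle_A, 180)   # apex now points left
print(triangle_B)                      # [(2.0, 2.0), (2.0, 0.0), (0.0, 1.0)]
```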

04 – Symbol Grounding Problem

This chart summarizes some of the discussion so far. Content deals with the content of knowledge; encoding deals with the representation of knowledge: content and form. The content of knowledge can be visuospatial, which deals with what and where: where is spatial, what is visual. The encoding of visuospatial knowledge can be either analogical or propositional. In an analogical encoding of visuospatial knowledge, there is a structural correspondence between the encoding and the external world that is being represented; in a propositional representation of visuospatial knowledge, there is no such correspondence. Examples of verbal knowledge include things like scripts for going to a restaurant. The script for going to a restaurant, again, can be represented either propositionally or, potentially, analogically. In a propositional representation of the kind we saw, we may have tracks and props and actors; in an analogical representation of the script for going to a restaurant, we may have a short movie. In much of the course, we have dealt with the right-hand side of this chart, with verbal knowledge and propositional representations. Part of the point of this lesson on visuospatial knowledge and reasoning is that knowledge can be visuospatial and representations can be analogical. But we have yet to fully understand the role of mental imagery in human cognition and to build AI agents that can deal with visuospatial knowledge and analogical representations.

06 – Visuospatial Reasoning Another Example

We just saw an example where visuospatial knowledge by itself suffices to do analogical reasoning under certain conditions. Now let us look at a different problem. There certainly are situations where we might want AI agents to be able to extract propositional representations. Your projects one, two, and three did exactly that. One task where an AI agent might want to build propositional representations out of visuospatial knowledge is when the AI is given a design drawing. So here is a vector graphics drawing of a simple engineering system. Perhaps some of you can recognize what is happening here. This is a cylinder and this is a piston. This is the rod of the piston. The piston moves left and right. The other end of the rod is connected to a crankshaft. As the piston moves left and right, the crankshaft starts moving anticlockwise. This device translates linear motion into rotational motion. I just gave you a causal account. Although causality is only implicit in this visuospatial knowledge, you and I were able to extract a causal account out of it. How did we do it? How can we help AI agents do it? At present, if you were to make a CAD drawing using any CAD tool you want, the machine does not understand the drawing. But can machines of tomorrow understand drawings by automatically building causal models out of them? To put it another way, there is a story that has been captured in this particular diagram. Can a machine automatically extract the story from this diagram? In 2007, Patrick Yaner built an AI program called Archytas. Archytas was able to extract causal models out of vector graphics drawings of the kind that I just showed you. This figure comes from a paper on Archytas, hence the form of the figure. We will have a pointer to the paper in the notes. This is how Archytas works. It began with a library of source drawings. These were drawings that it already knew about. For each drawing that it knew about, it had already done the segmentation: the basic shapes, for example things like circles, and the composite shapes, which were then labeled, like piston and cylinder. Then a behavioral model, or a causal model, which said what happens when the piston moves in and out, namely that the crankshaft turns. And then a functional specification, which said that this particular system can convert linear motion into rotational motion. So there was a lot of knowledge associated with each previous drawing that Archytas had already seen. All of this knowledge was put into a library. When a new drawing was input into Archytas, it generated line segments and arcs and intersections from it. Then it started mapping them to the lines and segments and arcs of previously known drawings, retrieved the drawing that was the closest match to the new drawing, and then started transferring basic shapes, and then composite shapes, transferring each element through this abstraction hierarchy all the way up to the functional level. As an example, if Archytas's library contains piston and crankshaft drawings like this, along with causal and functional models for them, then, given a new drawing of a piston and crankshaft device, Archytas will be able to assemble a causal and functional model for the new drawing. Thus Archytas extracted causal information from visuospatial representations through analogical reasoning.
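Here is a toy sketch of the general retrieve-and-transfer idea described above. To be clear, this is not Archytas's actual algorithm: the library entries, feature counts, and similarity measure are all made-up assumptions meant only to show the flow from low-level drawing features to a retrieved case whose higher-level behavioral and functional models are then transferred.

```python
# Each known drawing carries its segmentation plus the higher-level models
# built for it (basic shapes -> composite shapes -> behavior -> function).
library = [
    {
        "name": "piston-and-crankshaft",
        "features": {"circles": 2, "lines": 6, "arcs": 1},
        "composites": ["piston", "cylinder", "crankshaft"],
        "behavior": "piston slides left-right; crankshaft rotates",
        "function": "convert linear motion into rotational motion",
    },
    {
        "name": "gear-pair",
        "features": {"circles": 2, "lines": 0, "arcs": 24},
        "composites": ["driver gear", "driven gear"],
        "behavior": "driver gear rotates; driven gear counter-rotates",
        "function": "transmit rotational motion",
    },
]

def similarity(a, b):
    """Crude overlap score over low-level drawing features (an assumption)."""
    keys = set(a) | set(b)
    return -sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys)

def interpret(new_drawing_features):
    """Retrieve the closest known drawing and transfer its higher-level models."""
    best = max(library, key=lambda entry: similarity(entry["features"],
                                                     new_drawing_features))
    return {"matched": best["name"],
            "transferred_behavior": best["behavior"],
            "transferred_function": best["function"]}

if __name__ == "__main__":
    print(interpret({"circles": 2, "lines": 5, "arcs": 1}))
```

The point of the sketch is only the shape of the pipeline: segment the new drawing, match it against known drawings, and carry the matched drawing's causal and functional knowledge over to the new one.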

07 – Ravens Progressive Matrices

>> Keith wrote another computer program that used a different kind of analogical representation, called a fractal representation. He was able to show that the fractal representation also enables addressing problems from the Raven's test with a good degree of accuracy. We provide references to both Maithilee's work and Keith's work in the notes.

09 – Systems Thinking Connections

>> In any complex system, there will be many levels of abstraction, some invisible, some visible. The human eye, or human senses more generally, can see only some of these levels of abstraction, the visible levels of abstraction. Systems thinking helps us understand the invisible levels.

11 – Design Introduction

When we talked about configuration, we alluded to design. Design is a very wide-ranging, open-ended activity. But then we settled on configuration, a very routine kind of design, where all the parts of the design are already known and we simply have to figure out the configuration of the parts. It is time now to return to design thinking. What is design thinking? Design thinking is about thinking about ill-defined, underconstrained, open-ended problems. "Design a house that is sustainable" is an example of design thinking. Sustainability here is ill-defined; the problem is open-ended. In design thinking, it is not just the solution that evolves, it is the problem as well. We have problem-solution coevolution.

12 – Agents Doing Design

As we have mentioned earlier, configuration is a kind of design, a kind of routine design. And one method for configuration is plan refinement. In configuration, all the components of the design are already known, but we have to find some arrangement of the components, and we assign values to some of the variables of those components to arrive at the arrangement. Here is a design specification being refined step by step. Here might be a plan for designing a chair as a whole. Once we assign values to some of the variables at the level of the chair, we can refine the plan for the chair into a plan for the chair legs, the chair seat, and so on. All of this might be subject to some constraints. There are, in fact, a number of AI systems that do configuration design. Many of them are being used in industry. Some of these AI systems use methods like plan refinement in the way we are showing it here. Others use case-based reasoning. Various systems use a variety of methods for doing configuration design, including model-based reasoning and rule-based reasoning. What about more creative kinds of design, design in which not all the parts are known in advance? Since we just discussed the flashlight example in the context of systems thinking, let us revisit that example in the context of creative design. So this is a schematic of the flashlight circuit. Here is the switch, the battery, the bulb, as earlier. In the lesson on systems thinking, we discussed how structure-behavior-function models capture the knowledge that when the switch is closed, electricity flows from the battery to the bulb, and the bulb converts the electrical energy into light energy. Let us suppose that this particular electrical circuit uses a 1.5 volt battery and creates 10 lumens of light. Tomorrow someone comes to you and says, I want 20 lumens of light; design a flashlight electrical circuit for me. How will you do that? You might go to the structure-behavior-function model for this particular circuit and do some thinking. You may recognize that the amount of light created in the bulb is directly proportional to the voltage of the battery. Instead of creating 10 lumens of light you need 20 lumens of light, so you might say, I am going to use a 3 volt battery. So far, so good. You have done systems thinking in the context of design thinking. But now let us add a wrinkle. Suppose that a 3.0 volt battery is not available. At this point, a teacher tells you it is okay if a 3.0 volt battery is not available: you can connect two 1.5 volt batteries in series. Two 1.5 volt batteries connected in series will give you a voltage of 3 volts. Accepting the teacher's advice, you can now create an electrical circuit that uses two 1.5 volt batteries in series and creates light of 20 lumens. But you have not just created this particular design, you have also learned something from it. Every design, every experience, is an opportunity for learning. In the 1990s, Sam, here at Georgia Tech, created a program called IDOL. IDOL did creative design. In particular, IDOL would learn design patterns from simple design cases, of the kind we just talked about. I am sure most of you are familiar with the notion of a design pattern; design patterns are a major construct in software engineering. But design patterns are not just in software engineering but in all kinds of design, for example architecture and engineering and so on. There is some way of capturing the design pattern that can be learned from the previous case.
Suppose you have the design of a device that changes the value of a variable from one value to another value, and you want another design that changes the value of the same variable to some other value, not the same as in the previous design. One way in which you can create the new design is by replicating the behavior of the previous design: not just having behavior B1 once, as in the first design, but having that behavior repeated as many times as needed. Let us connect this to the example we just saw. You have a design of an electrical circuit that can create 10 lumens of light, and you know how to do it through some behavior B1. You need to design an electrical circuit that can create 20 lumens of light, but you do not know the behavior B2. Then this behavior B2 is a replication of behavior B1, obtained by connecting components in series. Once Sam's program IDOL had learned this design pattern of cascading, of replication, then, when it was given the problem of designing a water pump of higher capacity than the one available, it could create a new water pump by connecting several water pumps in series. Thus IDOL created new designs in one domain, the domain of water pumps, through analogical transfer of design patterns learned in another domain, the domain of electrical circuits. Viewed from the perspective of the new domain of water pumps, IDOL initially did not know about all the components, all the water pumps, that would be needed. Yet Sam's program IDOL was creative enough to recognize that the pattern of the problem here in the water pump domain is exactly the same pattern that also occurred in the domain of electrical circuits. Sam's theory provides a computational account not only of how design patterns can be used, but also of how these design patterns can be learned and transferred to new domains. There is, of course, a lot more to design. We said earlier that design thinking engages problem-solution coevolution. It is not that the solution evolves while the problem remains fixed; the problem evolves even as the solution evolves. It is not quite clear how humans do this kind of creative design, with this problem-solution coevolution. There are very few AI systems capable of problem-solution coevolution at present.
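As a minimal sketch of the cascading, or replication, pattern just described: the function below simply asks how many copies of a component, composed in series, are needed to meet a target value. The battery numbers come from the flashlight example; the water-pump lift figures and the function itself are illustrative assumptions, not IDOL's actual mechanism.

```python
import math

def cascade(unit_value, target_value):
    """How many copies of a component, composed in series, meet the target?
    A toy version of the replication pattern: behavior B2 = B1 repeated n times."""
    return math.ceil(target_value / unit_value)

# Learned in the electrical-circuit domain: a 1.5 V battery gives 10 lumens,
# so reaching 20 lumens means reaching 3 V, i.e. two batteries in series.
batteries_needed = cascade(unit_value=1.5, target_value=3.0)

# Transferred analogically to the water-pump domain (numbers are hypothetical):
# if one pump lifts water 5 meters and we need 15 meters, use three pumps in series.
pumps_needed = cascade(unit_value=5.0, target_value=15.0)

print(batteries_needed, pumps_needed)  # 2 3
```

The interesting part, of course, is not the arithmetic but the transfer: the same abstract pattern, learned in one domain, is recognized and reused in another.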

14 – Exercise Defining Creativity I

In order to build AI agents that are creative, it might be useful to think about what creativity is. Please write down your answer in this box and also post your answer on the class forum.

15 – Exercise Defining Creativity I

>> So after much deliberation, I decided I would define creativity simply as anything that produces a non-obvious, desirable product. I think that we have to have some sort of output for creativity in order for it to actually be identifiable as creativity. I think the output has to actually be wanted in some way; doing something that no one wants is not necessarily creative. I think the output has to actually be desirable in some way, and it also has to be something non-obvious. Giving the obvious answer is not a very creative solution. If I am propping open a door and I use a chair, that is a slightly more creative solution to that problem. >> Thank you, David. Of course, everyone's answer to this question may differ. For example, some people may not put the word product here; it is not clear that the result of creativity is necessarily a product. Some people may not put the word desirable there, because sometimes creativity may not result from some initial desire. Let us carry on this discussion of what creativity is on the forum. Feel free to add your own notions.

16 – Defining Creativity II

>> Good question, David. Novelty has to do with newness; unexpectedness has to do with something non-obvious or surprising. Perhaps this will become clearer if I take an example. Suppose we decide to entertain a group of 20 friends. We already know how to make soufflés according to a particular recipe, and we will make soufflés for the 20 friends this time. We have never made soufflé for 20 people, so something is novel, something new, something we have not done earlier. On the other hand, we have known this recipe for ages. Something unexpected would be if we come up with a new recipe for this soufflé which tastes dramatically different, surprisingly different: not just something new, but something unexpected. So far we have been talking about the product of creativity, the result of creativity, the outcome of creativity. What about the process of creativity? Notice the terms "some" and "other" here; both of these terms are important. Let us first look at the term "other". In this course we have already talked about several processes of creativity. Analogical reasoning is a fundamental process of creativity. We already explored analogical reasoning in the context of design; we just did that when we were talking about design thinking. One might be able to design a new kind of water pump by composing several water pumps in series, if one can analogically transfer a design pattern from the domain of electrical circuits. That was a very good example. Similarly, under analogical reasoning, we talked about the processes that might be used to come up with a model of atomic structure by analogy to a model of the solar system, which clearly is a creative process that cuts across a large number of dimensions of space and time. Another place where we talked about creative processes was when we were talking about explanation-based learning. It seems creative if the robot can go to the kitchen and use a flower pot as a cup to bring you coffee. Here are three other processes of creativity: emergence, re-representation, and serendipity. A simple example of emergence: if I draw three lines, one, two, three, then a triangle emerges out of them. The triangleness does not belong to any single line. I was not even trying to draw a triangle; I just drew three lines, and a triangle emerged out of them. The emergence of the triangle through the drawing of the three lines is a kind of creativity. Re-representation occurs when the original representation of a problem is not conducive to problem solving, so we re-represent the problem and then commence problem solving. To see an example of this, let us go back to atomic structure and the solar system. Suppose that we have a model of the solar system which uses the word revolve: the planets revolve around the sun. We also have a model of the atom, and this uses the term rotate: the electron rotates around the nucleus. The model of the solar system has the word revolve; the model of the atom has the word rotate. The two vocabularies are different. If we were to stay with this pair of representations, mapping between rotate and revolve would be very hard. On the other hand, suppose we were to re-represent this problem: re-represent the atomic structure by drawing the nucleus in the middle and the electron around it, and re-represent the solar system by drawing the sun in the middle and the earth around it. Then, in this new representation, we can see the similarity; we can do the mapping. This re-representation is another fundamental process of creativity.
Serendipity can be of many types and can occur in many different situations. One kind of serendipity occurs when I am trying to address a problem but am unable to address it, so I suspend the goal and start doing something different. Later, at some other time, I come across a solution, and I connect it with the previously suspended goal. The story has it that in 1941 in France, de Mestral's wife asked him to help her open a dress by pulling on a zipper that was stuck. De Mestral struggled with the zipper but could not pull it down. Later, one day, de Mestral was walking his dog when he found that some burrs were stuck to the dog's legs. Curious about this, de Mestral looked at the burrs closely under a microscope. He then connected this solution, the burr solution, to the problem of opening the zipper, and out of that was born the notion of Velcro, which you and I now use on a regular basis. Just as the word other was important here, indicating that these are three processes in addition to the processes we already discussed in this class, the word some is also important here. This is not an exhaustive list; there are in fact additional processes we can add. For example, another potential process here is conceptual combination.
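To make the re-representation example concrete, here is a minimal sketch in Python. The predicate names and the synonym table are illustrative assumptions: a direct mapping between the two models fails because their vocabularies differ, but after re-representing both models with a shared relation, the mapping becomes trivial.

```python
# Two models described with different vocabularies.
solar_system = [("planet", "revolves_around", "sun")]
atom_model   = [("electron", "rotates_around", "nucleus")]

def direct_match(facts_a, facts_b):
    """Pair up facts only when their relation names are identical."""
    return [(fa, fb) for fa in facts_a for fb in facts_b if fa[1] == fb[1]]

# Re-representation: rewrite both models into a shared, more abstract relation.
SYNONYMS = {"revolves_around": "orbits", "rotates_around": "orbits"}

def re_represent(facts):
    return [(s, SYNONYMS.get(rel, rel), o) for s, rel, o in facts]

if __name__ == "__main__":
    print(direct_match(solar_system, atom_model))   # [] -> mapping fails
    print(direct_match(re_represent(solar_system),
                       re_represent(atom_model)))   # planet/sun maps to electron/nucleus
```

The creative step is not in the matching itself but in choosing the new representation that makes the matching possible.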

17 – Exercise Defining Creativity III

Let us do an exercise together. Here are a number of tasks that we have come across in this class. For each of these tasks, mark the box if you think that an agent that performs that task well is a creative agent.

18 – Exercise Defining Creativity III

>> So actually, I marked none of them. It seems to me that for all of these tasks, if an artificial agent that we design accomplishes the task, we are able to go back and look at its reasoning, look at its processing, and figure out exactly how it did it. So it is never going to be unexpected; it is always the output of a predictable algorithm. >> Interesting, David. I am not sure I agree with it. Let us discuss it further.

19 – Exercise Defining Creativity IV

Do you agree with David's assessment that none of these agents is creative, because we can trace the process that the agent used?

20 – Exercise Defining Creativity IV

Let us look at each of these choices, one at a time. The first one says yes, because in order for a result to be creative, it must be novel, and the output of an algorithm cannot be novel. Well, there are a few problems with this particular answer. While the output of an algorithm for a small, closed-world problem may not be novel, the output of combinations of algorithms for open-ended problems can be, and indeed sometimes is, novel. There are algorithms, for example, that do design or that do scientific discovery whose results are novel. Let us look at the second answer: yes, because given a set of inputs, the output will always be the same, and therefore the product can never be unexpected. But the output depends not just on the input, and not only on the methods of the system, but also on the context in which the AI agent is situated. For example, given the same input but a different context for that input, the agent may come up with very different outputs, very different understandings of that same input, as we saw in the lesson on understanding. The third answer: no, because it defines creativity in terms of the output rather than the process. This answer too has problems, because sometimes creativity can be defined simply in terms of the output without knowing anything about the process. We can think of a black box that creates very interesting, creative music. We would not know anything about the process it is using, but if the output is interesting and creative music, we would consider it to be creative. Personally, my sympathies lie with the fourth answer. But of course you are welcome to disagree with me. Why don't we continue this discussion on the forum?

23 – Final Quiz

Please write down what all you learned in this lesson, in this box.

24 – Final Quiz

Great, thank you very much.

26 – Wrap-Up

01 – Preview

Today we'll wrap up this course. It's been a fun journey, but like all journeys, this one too must come to an end. The goal for today's lesson is to tie together some of the big things we have been discussing, and to point to some of the other big ideas out there in the community. We'll start by revisiting the high-level structure of the course. Then we'll go through some of the recurring patterns and principles we've encountered throughout the course. Finally, we'll talk about the broader impacts and applications of knowledge-based AI.

02 – Cognitive Systems Revisited

Let us revisit the architecture for a cognitive system. We have come across this several times earlier in this course. We can think in terms of three different spaces: a metacognitive space, a deliberative space, and a reactive space. The reactive space directly maps percepts from the environment into actions in the environment. The deliberative space maps percepts into actions, with the mapping mediated by reasoning and learning and memory. So while this is a see-act cycle, this is a see-think-act cycle. The metacognitive space monitors the deliberative reasoning: it sees the deliberative reasoning and acts on the deliberative reasoning. As we discussed earlier, it is better to think of these in terms of overlapping spaces rather than disjoint layers. When we were discussing metacognition, we also saw that we could draw arrows back from here like this, because metacognition could also act on its own. This cognitive system is situated in the world. It is getting many different kinds of input from the world: percepts, signals, goals, communication with other agents. Depending on the input, and depending upon the resources available to the agent to address that input, the agent may simply react to the input and give an output in the form of an action to a percept, for example. Or the agent may reason about the input and then decide on an action after consulting memory and perhaps invoking learning and reasoning. In some cases, the input that the agent receives might be a result of the output it had given to a prior input. For example, it may have received a goal and then come up with a plan. Then it receives the input that the plan failed upon execution. In that case, deliberation can give an alternative plan. Metacognition may wonder why the planner failed in the first place, and repair the planner, not just the plan. Just as reaction and deliberation are situated in the world outside, metacognition, in a way, is situated in the world inside: it is acting on the mental world. While deliberation and reaction act on objects in the external environment, for metacognition it is the thoughts and the learning and the reasoning that are the objects, internal to the agent. Of course, this input is coming in constantly. Even now, as you and I are communicating, both you and I are receiving input, and we are giving output as well. In fact, the cognitive system is constantly situated in the world. It is never separate from the world; there is no way to separate it. It has input coming in constantly, and there is output going out constantly. The world is constantly changing, and the cognitive system is constantly receiving information about the world. It uses the knowledge in its knowledge structures to make sense of the data about the world, and it decides on actions to take on the world. Recall that we had said that intelligence is a function that maps a perceptual history into action. To some degree, intelligence is about action selection, about selecting the right output to act on the world. Depending on the nature of the input, the output may vary. If the input is a goal, the output might be a plan or an action. If the input is a piece of knowledge, the output might be internal to the cognitive system, and the cognitive system may assimilate that knowledge into its internal structures. If the input is a percept, the output might be an action based on that percept.
If the input is new information, the cognitive system will learn from the new information and then store the result of the learning in its memory. The world here consists not just of the physical world, but of the social world as well. Thus this cognitive system is situated not just in the physical world but also in the social world, consisting of other cognitive systems, and it constantly interacts with them. Once again, the interactions between these cognitive systems can be of many different kinds: percepts, goals, actions, plans, information, data. A cognitive system is not just monitoring and observing the physical world, it is also monitoring and observing the social world, the actions of other cognitive systems in the world, for example. A cognitive system learns not just from its own actions but also from the actions of other cognitive systems around it. Thus a cognitive system needs to make sense not just of the physical world but also of the social world. We saw some of this when we were talking about scripts. There we saw how a cognitive system used a script to make sense of the actions of other cognitive systems. This architecture is not merely an architecture for building AI agents; it is also an architecture for reflecting on human cognition. Does this architecture explain large portions of human behavior?
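Here is a highly simplified sketch of the three spaces just described. The class name, the toy rules, and the repair note are all hypothetical; the sketch is only meant to show the division of labor: reaction maps percepts straight to actions, deliberation consults memory to produce plans, and metacognition watches the deliberation and repairs the reasoner itself.

```python
class CognitiveSystem:
    """A schematic see-act / see-think-act agent with a metacognitive monitor."""

    def __init__(self):
        self.memory = {}          # knowledge structures used by deliberation
        self.failed_plans = []    # traces for metacognition to inspect

    def react(self, percept):
        # Reactive space: map a percept directly to an action.
        return {"obstacle": "stop"}.get(percept, "continue")

    def deliberate(self, goal):
        # Deliberative space: consult memory, reason, and produce a plan.
        return self.memory.get(goal, ["explore"])

    def metacognize(self):
        # Metacognitive space: monitor deliberation and repair the planner,
        # not just the plan, when plans keep failing.
        if self.failed_plans:
            self.memory["repair-note"] = f"revise planner after {len(self.failed_plans)} failures"

    def step(self, percept=None, goal=None):
        if goal is not None:
            return self.deliberate(goal)
        return self.react(percept)

if __name__ == "__main__":
    agent = CognitiveSystem()
    print(agent.step(percept="obstacle"))   # reactive: 'stop'
    print(agent.step(goal="fetch-coffee"))  # deliberative: ['explore']
    agent.failed_plans.append(["explore"])  # the plan failed upon execution
    agent.metacognize()
    print(agent.memory)                     # metacognition leaves a repair note
```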

03 – Course Structure Revisited

>> This summarizes the 30 topics that we covered in this class, which is quite a lot. Of course, there is a lot more to say about each of these 30 topics than we have covered so far. Therefore, we have provided readings for each of the topics, and you are welcome to pursue the readings for whichever topics interest you the most. We'd also love to hear your views about this on the forum.

04 – The First Principle

At the beginning of this course we enumerated seven major principles of knowledge-based AI agents that we would cover in CS 7637. Now let's wrap up this course by revisiting each of the seven principles. Here is the first one: knowledge-based AI agents represent and organize knowledge into knowledge structures to guide and support reasoning. So the basic paradigm here is represent and reason. If you want to reason about the world, you have to represent knowledge about the world in some way. You not only want to represent knowledge to support reasoning, you also want to organize this knowledge into knowledge structures to guide, to focus, the reasoning. Let us look at a few examples of this principle that we covered in this course. Semantic networks not only allow us to represent knowledge about the world, they also allow us to organize that knowledge in the form of a network. We used semantic networks to address the guards and prisoners problem. The advantage of the semantic networks was that they exposed the constraints of the problem so clearly that we could in fact reason about it. And notice that the organization helps us focus the reasoning: because of the organization, there are many other choices we do not even have to reason about. Frames were another knowledge structure that organized knowledge and guided and supported reasoning. Given frames for things like earthquakes, we could reason about sentences like, a serious earthquake killed 25 people in a particular country. We also used frames to support commonsense reasoning. Here, Ashok is moving his body part to a sitting position. Here, Ashok is moving himself into a sitting position. Here, Andrew sees Ashok. Now Andrew moves to the same place as Ashok, and Andrew then moves a menu to Ashok. This is a story about visiting a restaurant. Once again, there are knowledge structures here. These knowledge structures are not only representing knowledge, they are organizing knowledge into a sequence of actions. These knowledge structures help generate expectations, so we know what Ashok expects to happen next in any of these situations. We also know how Ashok can detect surprises: when the unexpected thing happens, Ashok knows that it has violated the expectations of the script, and he can do something about it. This is how scripts support and guide reasoning. We also saw this principle in action when we were talking about explanation-based learning. In order to show that an instance was an example of a particular concept, cup, we constructed complex explanations. In this case, we were constructing a complex knowledge structure on the fly out of smaller knowledge structures. The smaller knowledge structures came out of precedents, examples we already knew. Then we composed these various knowledge structures into a complex explanation to guide and support the reasoning. You have seen this principle in action in several other places in this course. This is one of the fundamental principles: represent, organize, reason.
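As a minimal illustration of how an organized knowledge structure generates expectations, here is a sketch of a restaurant-visit script as a simple Python dictionary. The particular slots, scene names, and helper functions are illustrative assumptions, not the course's exact script representation.

```python
# A frame-like script for visiting a restaurant, with slots and default fillers.
restaurant_script = {
    "tracks": ["sit-down restaurant"],
    "props": ["menu", "table", "food"],
    "actors": ["customer", "waiter"],
    "scenes": ["enter", "be-seated", "order", "eat", "pay", "leave"],
}

def expect_next(script, observed_scene):
    """Knowledge structures generate expectations: given the current scene,
    what does the agent expect to happen next?"""
    scenes = script["scenes"]
    i = scenes.index(observed_scene)
    return scenes[i + 1] if i + 1 < len(scenes) else None

def is_surprise(script, observed_scene, actual_next):
    """A violated expectation is a surprise the agent can then reason about."""
    return actual_next != expect_next(script, observed_scene)

if __name__ == "__main__":
    print(expect_next(restaurant_script, "order"))           # 'eat'
    print(is_surprise(restaurant_script, "order", "leave"))  # True: expectation violated
```

The organization of the knowledge (scenes in order, with roles attached) is what lets the agent predict what comes next and notice when the world departs from the script.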

05 – The Second Principle

Our second main principle for CS 7637 was that learning in KBAI agents is often incremental. This means that information or data or experiences arrive one at a time. This is one of the key differences between knowledge-based AI and other forms of AI, like machine learning; in those forms, a large amount of information is often given right at the beginning. Here, our agents learn step by step, incrementally. We first encountered this with learning by recording cases. Our agents learned each individual case one by one; the experiences themselves were the increments in this learning strategy. Case-based reasoning also operated on individual cases, but it organized them into more complex knowledge structures, like tagging them in an array or organizing them in a discrimination tree. But the fundamental objects of knowledge were still individual cases that arrived one by one. We did a complex exercise where, one by one, we added new cases into our discrimination tree. Incremental concept learning was, as the title suggests, incremental. Here we received positive and negative examples one at a time. Based on the difference between our current concept and our new example, and on whether the new example was a positive or negative example, we would change our concept. This was always done example by example, incrementally. Version spaces involved a very similar kind of knowledge: experiences came one at a time, and we generalized a specific model and specialized a general model to converge down to an understanding of the concept. Finally, learning by correcting mistakes is also deeply incremental. Here the individual mistakes arrived incrementally, and based on each mistake, our agent modified its knowledge base to repair the cause of the previous mistake. The takeaway here is that many of our methods in learning, reasoning, and memory all involve dealing with information that comes incrementally, bit by bit, instead of processing a large amount of data all at the same time. One can also see how this connects closely to human experience, where we are constantly experiencing the world experience by experience instead of being given a lifetime of experiences all at once.
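Here is a very rough sketch of one step of incremental concept learning on simple attribute-value concepts. The attribute names, the "?" convention for "any value", and the crude specialization step are all illustrative assumptions and only loosely echo the generalization and specialization heuristics discussed in the course; they are not the course's actual algorithm.

```python
def generalize(concept, positive_example):
    """Positive example: relax to '?' any requirement the example violates
    (loosely, a drop-link-style generalization)."""
    return {attr: (value if value == "?" or positive_example.get(attr) == value else "?")
            for attr, value in concept.items()}

def specialize(concept, near_miss):
    """Negative example (a near miss): if the current concept no longer rules it
    out, record the offending values as forbidden (a stand-in for require-link
    and forbid-link style specialization)."""
    covered = all(value in ("?", near_miss.get(attr)) for attr, value in concept.items())
    if covered:
        return {**concept, "forbidden": dict(near_miss)}
    return concept

concept = {"material": "ceramic", "has_handle": True}
concept = generalize(concept, {"material": "metal", "has_handle": True})
print(concept)   # {'material': '?', 'has_handle': True}
concept = specialize(concept, {"material": "glass", "has_handle": False})
print(concept)   # near miss lacks a handle, so the concept already excludes it
```

The point is only the control structure: each example arrives alone, and the concept is revised a little at a time rather than being induced from a whole dataset at once.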

07 – The Fourth Principle

Our fourth principle was that knowledge-based AI agents match methods to tasks. At the beginning of this course we covered several very powerful problem-solving methods, like generate & test and means-ends analysis. But because they were very powerful and very general, they also were not necessarily the best for solving any one problem. We also covered some more specific problem-solving methods, like planning, that addressed a narrower set of problems but addressed those problems very, very well. We also covered several tasks in this class, like configuration and diagnosis and design. These tasks could all be carried out by a variety of methods. For example, we can imagine doing configuration with generate & test, where we generate every possible configuration of a certain plan and then test to see which one is best. We could also do configuration by problem reduction, where we reduce the problem down into subproblems, solve them individually, and then compose them into an overall solution. In this way, knowledge-based AI agents match methods to tasks. In some cases we do the matching: we decide that generate & test is the best way to address this diagnosis problem. In other cases we might design AI agents with their own meta-reasoning, such that they themselves can decide which method is best for the task they are facing right now. Note that this distinction between methods and tasks is not always absolute. Methods can spawn different subtasks. For example, if we are doing design by case-based reasoning, that spawns new problems to address, and we might address those new problems, those new tasks, with analogical reasoning or with problem reduction. This gets back to our meta-reasoning notion of strategy integration. In this way, knowledge-based AI agents match methods to tasks not only at the top level, but also at every level of the task-subtask hierarchy.
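One way to picture the matching of methods to tasks is a small dispatcher: several methods can address the same task, and a bit of meta-reasoning chooses among them based on what knowledge is available. The method stubs, the selection policy, and the chair specification below are all hypothetical; this is a sketch of the idea, not any system from the course.

```python
# Hypothetical method implementations for one task (configuration).
def configure_by_generate_and_test(spec):
    return f"enumerate candidate configurations for '{spec}', test each against constraints"

def configure_by_problem_reduction(spec):
    return f"split '{spec}' into sub-specs, configure each part, compose the results"

def configure_by_case_based_reasoning(spec):
    return f"retrieve a prior configuration similar to '{spec}' and adapt it"

METHODS = {
    "generate-and-test": configure_by_generate_and_test,
    "problem-reduction": configure_by_problem_reduction,
    "case-based-reasoning": configure_by_case_based_reasoning,
}

def select_method(have_similar_cases, decomposable):
    """A toy meta-reasoning policy for matching a method to the task at hand."""
    if have_similar_cases:
        return "case-based-reasoning"
    if decomposable:
        return "problem-reduction"
    return "generate-and-test"

spec = "chair with four legs under $50"
method = select_method(have_similar_cases=True, decomposable=True)
print(method, "->", METHODS[method](spec))
```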

08 – The Fifth Principle

The fifth principle of knowledge-based AI, as we have discussed it in CS 7637, is that AI agents use heuristics to find solutions that are good enough, but not necessarily optimal. Some schools of AI put a lot of emphasis on finding the optimal solution to every problem. In knowledge-based AI, we consider agents that find solutions that are good enough. Herbert Simon called this satisficing. The reason for finding solutions that are only good enough is the trade-off between computational efficiency on one hand and optimality of solutions on the other. We can find optimal solutions, but that comes at the cost of computational efficiency. Recall one of the conundrums of AI: AI agents have limited resources, bounded rationality, limited processing power, limited memory size, yet most interesting problems are computationally intractable. How can we get AI agents to solve intractable problems with bounded rationality and yet give near real-time performance? We can get AI agents to do that if we focus on finding solutions that are good enough, but not necessarily optimal. Most of the time, you and I as human agents do not find optimal solutions. The plan you may have for making dinner for yourself tonight is not necessarily optimal, it is just good enough. The plan that you have to go from your house to your office is not necessarily optimal, it is just good enough. The plan that you have to walk from your car to your office is not necessarily optimal, it is just good enough. Further, AI agents use heuristics to find solutions that are good enough. They do not do an exhaustive search, even though exhaustive search might yield more optimal solutions, because exhaustive search is computationally costly. We came across this notion of heuristic search several times in this course. One place where we discussed it in some detail was in incremental concept learning. Given a current concept definition and a negative example, we arrive at a new concept definition by using heuristics like the require-link heuristic. The require-link heuristic adds the must clause to the support link between these two bricks. Means-ends analysis was a heuristic method. It said that, given the current state and the goal state, find the differences and then select an operator that will reduce the difference. Because means-ends analysis was a heuristic method, sometimes it ran into problems and did not offer guarantees of optimality. But when it worked, it was very efficient. Another case where we explicitly made use of a heuristic was in the generate & test method. There we had a heuristic which said: do not generate a state that duplicates a previously generated state. This made the method more efficient. Thus the focus of knowledge-based AI agents is on near real-time performance: addressing computationally intractable problems with bounded resources, and yet being able to solve a very large class of problems with robust and flexible intelligence. That happens not by finding optimal solutions to a narrow class of problems, but by using heuristics to find solutions that are good enough for very large classes of problems. This principle comes very much from theories of human cognition. As I mentioned earlier, humans do not normally find optimal solutions for every problem they face. However, we do manage to find solutions that are good enough, and we do so in near real time, and that is where the power lies.
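Here is a small sketch of generate & test with the duplicate-avoidance heuristic mentioned above: never generate a state that duplicates a previously generated state. The toy state space (numbers reachable by adding one or doubling) and the breadth-first control structure are illustrative assumptions; only the heuristic itself is taken from the lesson.

```python
from collections import deque

def generate_and_test(start, goal, successors):
    """Breadth-first generate & test with one heuristic from the lesson:
    never generate a state that duplicates a previously generated state."""
    visited = {start}
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:                      # the 'test' step
            return path
        for nxt in successors(state):          # the 'generate' step
            if nxt not in visited:             # heuristic: skip duplicate states
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

# Tiny illustrative state space: from a number, we may add 1 or double it.
path = generate_and_test(1, 10, successors=lambda s: [s + 1, s * 2])
print(path)  # e.g. [1, 2, 4, 5, 10]
```

The heuristic does not guarantee the cheapest possible search, but it prunes a large amount of wasted generation, which is exactly the trade of optimality for efficiency that satisficing describes.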

09 – The Sixth Principle

Our sixth principle was that knowledge-based AI agents make use of recurring patterns in the problems that they solve. These agents are likely to see similar problems over and over again, and they make use of the underlying patterns behind these similar problems to solve them more easily. We talked about this first with learning by recording cases. Here we assumed that we had a library of cases, and that the solution to a former case would be the exact solution to a new problem. Ashok's example of tying shoelaces was similar to this. When we tie our shoelaces, we are not re-solving the problem of tying our shoelaces from scratch; instead, we are just taking the solution from an earlier time when we tied our shoelaces and doing it again. We assume that the solution to the old problem will solve this new, similar problem. In case-based reasoning, however, we talked about how the exact solution to an old problem will not always solve new problems; instead, sometimes we have to adapt an old solution. Here we assumed that there were recurring patterns in the world that would help us solve new and novel problems based on previous experiences. Even though the new experience is novel, the pattern is similar to a prior experience. Analogical reasoning is very deeply rooted in this principle. Here we explicitly talked about the idea of taking patterns from one problem, abstracting them, and transferring them to a problem in a different domain. Whereas in case-based reasoning the pattern was within a domain, here the pattern can span different domains. In configuration, we assumed that the underlying design, the underlying plan for a certain device or product, was pretty similar each time, but there were certain variables that had to be defined for an individual instance of that object. In the chair example, the overall design of a chair is a recurring problem: they all have legs, they all have seats, they all have backs. But the individual details of a specific chair might differ. Now, it might be tempting to think that this is actually at odds with the earlier point that knowledge-based AI agents solve novel problems. Here we are saying that knowledge-based AI agents solve recurring problems based on recurring patterns, but in fact these are not mutually exclusive. Knowledge-based AI agents leverage recurring patterns in the world, but they do so in conjunction with other reasoning methods that allow them to also address novel problems.
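A tiny retrieve-and-adapt sketch makes the contrast between recording cases and case-based reasoning concrete. The route cases, the overlap-counting retrieval, and the single adaptation rule below are all made-up assumptions for illustration; real case-based reasoners use far richer retrieval and adaptation knowledge.

```python
# A tiny case library of prior problems and their solutions (illustrative values).
case_library = [
    {"problem": {"from": "home", "to": "office", "traffic": "light"},
     "solution": "take I-85"},
    {"problem": {"from": "home", "to": "office", "traffic": "heavy"},
     "solution": "take surface streets"},
]

def retrieve(new_problem):
    """Pick the stored case whose problem description overlaps most with the new one."""
    def overlap(case):
        return sum(case["problem"].get(k) == v for k, v in new_problem.items())
    return max(case_library, key=overlap)

def adapt(case, new_problem):
    """If the retrieved case does not fit exactly, tweak its solution
    (a crude stand-in for real adaptation knowledge)."""
    solution = case["solution"]
    if new_problem.get("traffic") == "heavy" and "I-85" in solution:
        solution = "take surface streets"
    return solution

new_problem = {"from": "home", "to": "office", "traffic": "heavy"}
print(adapt(retrieve(new_problem), new_problem))
```

Recording cases corresponds to retrieval alone; case-based reasoning adds the adaptation step, which is what lets a recurring pattern cover problems that are similar but not identical to anything seen before.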

10 – The Seventh Principle

The seventh and last principle of knowledge-based AI agents in CS 7637 is that the architecture of knowledge-based AI agents enables reasoning, learning, and memory to support and constrain each other. Instead of building a theory of reasoning or problem solving by itself, or a theory of learning by itself, or a theory of memory by itself, we are trying to build unified theories, theories in which reasoning, learning, and memory coexist. Memory stores and organizes knowledge. Learning acquires knowledge. Reasoning uses knowledge. Knowledge is the glue between these three. One place where we saw reasoning, learning, and memory coming together very well was in production systems. When reasoning failed, an impasse was reached. Then memory provided some episodic knowledge, and the learning mechanism of chunking extracted a rule from that episodic knowledge. That rule broke the impasse, and reasoning could proceed apace. This is a clear example of reasoning, learning, and memory coming together in a unified architecture. In logic, memory, or the knowledge base, may begin with a set of axioms, and that set of axioms decides what we can prove using that particular logic. To look at the problem conversely, depending on the reasoning we want to do, we need to decide what to put into the knowledge base so that the reasoning can be supported. Explanation-based learning was another place where reasoning, learning, and memory came together well. Memory supplied us with the earlier precedents. Reasoning led to the composition of the explanation, which explained why this instance was an example of a cup. This led to learning about the connections between these various precedents through the explanation. Learning by correcting mistakes was yet another example of learning, reasoning, and memory coming together. When a failure occurred, the agent used its previous knowledge (memory) to reason and identify the fault responsible for the failure (reasoning), and then corrected that particular fault (learning) in order to get the correct model. The knowledge-based paradigm says that we want to build unified theories that connect reasoning, learning, and memory. This also connects very well with human cognition: human cognition, of course, has reasoning and learning and memory intertwined. It is not as if memory in human cognition works by itself, or learning works by itself, or reasoning works by itself. You cannot divorce them from each other.
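To see the reasoning-memory-learning loop in miniature, here is a toy sketch of chunking on an impasse. This is not how an actual production-system architecture such as Soar implements chunking; the class, the situations, and the episodic memory contents are hypothetical, and only the overall loop (reasoning hits an impasse, memory supplies an episode, learning chunks a new rule) mirrors the description above.

```python
class ProductionSystem:
    """A toy production system that chunks a new rule when it hits an impasse."""

    def __init__(self):
        self.rules = {}           # procedural memory: condition -> action
        self.episodes = {         # episodic memory consulted on impasse
            ("intersection", "pedestrian-waiting"): "yield",
        }

    def decide(self, situation):
        if situation in self.rules:               # reasoning proceeds normally
            return self.rules[situation]
        # Impasse: no rule fires. Consult episodic memory (memory)...
        action = self.episodes.get(situation)
        if action is not None:
            # ...and chunk a new rule from that episode (learning),
            # so this impasse never recurs for this situation.
            self.rules[situation] = action
        return action

ps = ProductionSystem()
print(ps.decide(("intersection", "pedestrian-waiting")))  # impasse -> chunked -> 'yield'
print(ps.rules)                                           # the new rule now sits in procedural memory
```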

11 – Current Research

Knowledge-based AI is a dynamic field with a very active research program. There are a number of exciting projects going on right now; here is a small list of them. CALO is a project in which a cognitive assistant learns and organizes knowledge. CALO, in fact, was a precursor to Apple's Siri. Cyc and OMCS: OMCS stands for Open Mind Common Sense. Cyc and OMCS are two large knowledge bases built to support everyday commonsense reasoning. Wolfram Alpha is a new kind of search engine that uses some of the same kinds of knowledge structures we have considered in this particular class, different from many other search engines. The three projects in the right column are projects here at Georgia Tech. VITA is a computational model of visual thinking in autism. In particular, it solves problems from the Raven's Progressive Matrices test using only visuospatial representations. Dramatis is a computational model of suspense in drama and stories. Recall that in this class we talked about a theory of humor and surprise; Dramatis tries to do the same thing for suspense. DANE is a system for supporting design based on analogies to natural systems. We came across this idea of biologically inspired design earlier in the class, and DANE supports that kind of biologically inspired design. We have provided references for these and many other knowledge-based AI projects in the class notes. You're welcome to explore them depending on your interests.

12 – Our Reflections

>> We’d also like to thank our colleagues here at Georgia Tech, including David White at the College of Computing. And Mark Weston, and the staff of the Georgia Tech Professional Education Department. It’s been a real fun journey for us. We expect to hear from you. We hope it’s the beginning of a beautiful friendship.