The field of artificial intelligence (AI) is very large, nourished by concepts and tools from several other research areas. One example of such an external contribution is the nearest neighbor algorithm (NNA). In case-based reasoning (CBR), NNA is the backbone of the artificial reasoning behind CBR applications. CBR is a type of expert system, and expert systems are a branch of AI. The purpose of explaining CBR, its applications, and its achievements is to give us insight into how we can use pattern recognition techniques for AI.
Expert systems
The main objective of an expert system is to emulate the behavior and reasoning of human experts in a particular topic or domain. The word 'expert' is the important qualifier here. When we design an expert system, we are not developing an intelligent agent that will work over a very large and diverse repository of knowledge. Rather, we want to work over a specialized and well-defined body of human knowledge.
Rule-Based Reasoning
You may be more familiar with rule-based reasoning (RBR) systems, another class of expert system that contrasts with CBR. Simplifying, the reasoning strategy behind RBR is very similar to the 'if-else' statements in a C or C++ program. In RBR we try to match a given pattern of features to the condition of a rule; the 'if' condition. Each rule is linked to a set of actions; the statements inside the 'if'. If we find a match, the system performs the actions associated with that rule. In RBR, then, we emulate the reasoning and behavior of an expert through those rules and actions.
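To make the 'if-else' analogy concrete, here is a minimal sketch of a rule-based check in Python. The feature names, thresholds, and the function fire_first_matching_rule are assumptions made up for illustration; they do not come from any real expert system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # plays the role of the 'if' condition
    action: str                        # plays the role of the statements inside the 'if'


# Hypothetical rules; the thresholds are invented for this sketch.
RULES = [
    Rule("low credit score", lambda f: f["credit_score"] < 600, "reject"),
    Rule("loan too large for salary", lambda f: f["loan_amount"] > 5 * f["salary"], "reject"),
    Rule("fallback", lambda f: True, "approve"),
]


def fire_first_matching_rule(features: dict) -> str:
    """Return the action of the first rule whose condition matches the features."""
    for rule in RULES:
        if rule.condition(features):
            return rule.action
    raise ValueError("no rule matched")


decision = fire_first_matching_rule(
    {"credit_score": 550, "salary": 40000, "loan_amount": 10000}
)
```

Note that all the expert knowledge lives in the rules themselves. Changing the behavior of the system means writing or editing rules by hand.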
Case-Based Reasoning
In CBR the knowledge representation strategy is based on a set of features that describes an event or an object. This set of features is called a case, and each case has a solution, an action, or information that we would like to extract or use. In contrast to RBR, where we use a set of rules to represent knowledge, in CBR we use a set of cases which represent different instances of the domain we are working on. This set of cases is called a knowledge repository or case-base. We use NNA to make use of that repository.
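One possible way to represent a case and a case-base in code is sketched below. The class name Case, the feature names, and the stored values are all hypothetical; a real application would choose its own features during the design phase.

```python
from dataclasses import dataclass


@dataclass
class Case:
    features: dict  # the set of features that describes the event or object
    solution: str   # the solution, action, or information attached to the case


# A tiny, made-up case-base: each entry is one past instance of the domain.
case_base = [
    Case({"credit_score": 720, "loan_amount": 10000}, "repaid"),
    Case({"credit_score": 580, "loan_amount": 30000}, "defaulted"),
    Case({"credit_score": 650, "loan_amount": 15000}, "repaid"),
]
```

Unlike a rule set, the case-base grows simply by appending new cases; no condition has to be written by hand.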
In the design phase of the application, we decide which features are required to characterize the events in our domain. As Ian Watson explains, those features should be predictive, address the purpose the case will be used for, be abstract enough to allow for widening the future use of the case-base, and be concrete enough to be recognized in the future. The strategy is that when we encounter a new event with an unknown solution, we extract the predefined features and create a case for that event; the current case. Then, using NNA, the system tries to match the current case with an old case in our repository. If we find a match, we assume that the solution of that old case is a feasible solution for the current case. In short, the reasoning strategy is what makes CBR an AI application, but the way we match the cases is where we use pattern recognition.
An example
Let's look at an example so we can have a clear idea of what a case is and how we can use a set of cases to solve a problem. I am going to use an example from Ian Watson's book that I found really simple and useful.
Suppose you want to develop a CBR system for a bank. Your task is to help the banker make good decisions about who is going to get a loan and who isn't. We want to minimize the risk of giving money to the wrong person. The banker already has a database with information about his loan clients, e.g. salary, loan amount, credit score, interest rate of the loan, whether the client is paying the loan, and so on. Therefore, we can use this information for our cases. An intuitive and predictive choice is to take the credit score of a new loan client and compare it with the scores of previous loan clients. We may compute the probability of defaulting on a loan given the client's credit score. The problem is that the credit score alone does not provide enough information to help the banker. Let's add another feature to the case; the loan amount. We can check which previous loan clients defaulted on a loan given a loan amount and a credit score. If we find that the profile of the new client fits the profile of customers who defaulted on their loans, we are not going to approve the loan.
We can see that there are no rules involved in the reasoning. We just make a match with the closest case in the case-base. If the matched client defaulted, we can expect the new client to default on the loan. We do this match using NNA. You can make each feature (e.g. salary, loan amount) an axis in Euclidean space. Then you compute the distance between each feature in the current case (new loan client) and the corresponding feature in each case stored in the case-base (loan client 1, loan client 2, etc.). You have to define a method to combine these per-feature distances into an overall distance from the current case to a case in the case-base. The case in the case-base with the smallest distance to the current case is a potential match. If this distance is under a given threshold, we consider the case a match and we apply the solution encapsulated in the stored case. Later I am going to explain methods to compute the distance, ways to generalize cases, and ways to manipulate them to improve the performance and lifespan of the system.
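The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature ranges used to scale each axis, the sample clients, the function names, and the maximum distance of 0.1 are all invented here, and plain Euclidean distance is just one of several possible overall distance measures. Since a smaller distance means a closer match, the sketch accepts the nearest case only when its distance does not exceed the maximum.

```python
import math

# Assumed (min, max) range of each feature, used to scale every axis to 0..1
# so that loan amounts do not dominate credit scores. Values are made up.
FEATURE_RANGES = {
    "credit_score": (300.0, 850.0),
    "loan_amount": (0.0, 50000.0),
}

# Hypothetical case-base: (features, solution) pairs for previous loan clients.
CASE_BASE = [
    ({"credit_score": 720, "loan_amount": 10000}, "repaid"),
    ({"credit_score": 580, "loan_amount": 30000}, "defaulted"),
    ({"credit_score": 650, "loan_amount": 15000}, "repaid"),
]


def normalize(name, value):
    """Scale a raw feature value to the 0..1 interval of its assumed range."""
    low, high = FEATURE_RANGES[name]
    return (value - low) / (high - low)


def distance(a, b):
    """Euclidean distance between two cases over the normalized features."""
    return math.sqrt(
        sum((normalize(n, a[n]) - normalize(n, b[n])) ** 2 for n in FEATURE_RANGES)
    )


def retrieve(current, case_base, max_distance):
    """Return (solution, distance) of the nearest stored case.

    The solution is None when even the nearest case is farther than max_distance.
    """
    features, solution = min(case_base, key=lambda case: distance(current, case[0]))
    d = distance(current, features)
    return (solution if d <= max_distance else None, d)


new_client = {"credit_score": 590, "loan_amount": 28000}
solution, d = retrieve(new_client, CASE_BASE, max_distance=0.1)
```

For this new client the nearest stored case is the client who defaulted, so the banker would be advised not to approve the loan. A client far from every stored case gets no solution at all, which is safer than reusing an unrelated one.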
CBR Links
- Topics on CBR from the American Association for Artificial Intelligence
- AI-CBR.org
- Wikipedia article on CBR
Closing thoughts
I will continue to post new things on CBR each week. Come back every week for updates on this topic.
References
- Jackson, Peter, 'Introduction to expert systems', Addison-Wesley, 1998
- Watson, Ian, 'Applying case-based reasoning: techniques for enterprise systems', AI-CBR, University of Salford, U.K., Morgan Kaufmann Publishers, 1997