Jul 20, 2016
Over the last couple of years there have been a number of breakthroughs in the pursuit of artificial intelligence as computing power and “big data” have become more ubiquitous. There has probably never been a better time to start digging into the industry, as building systems that benefit from machine learning is finally a reality.
However, the artificial intelligence industry is awash with complex algebra and computer science jargon, which makes it intimidating to approach for anyone without many years of focused experience.
Artificial Intelligence - A Guide to Intelligent Systems aims to show how the theory behind artificial intelligence is actually fairly simple and straightforward and how people from all kinds of different industries can benefit from this new wave of technological advancement. The book covers the principles behind intelligent systems, how they are built, what they are used for, and how to choose the right tool for the job.
The book is written by Michael Negnevitsky, a professor at the University of Tasmania, and is based on the lectures he gave there over 15 years. It serves as an introduction to the field of computer intelligence, covering rule-based expert systems, fuzzy expert systems, frame-based expert systems, artificial neural networks, evolutionary computation, hybrid intelligent systems, knowledge engineering, and data mining. The lectures were designed for students without a background in calculus, and the book does not require learning a new programming language.
The lectures were an introductory course for undergraduate students in computer science, computer information systems, and engineering. The book, however, is also aimed at non-computer science students as a guide to the state of the art in knowledge-based systems and computational intelligence, and at anyone who aspires to solve their problems using computer intelligence, in almost any industry or background.
The book begins with a definition of intelligence and a brief history of AI. Intelligence is the ability to learn and understand, to solve problems and to make decisions.
Artificial Intelligence is a science that has defined its goal as making machines do things that would require intelligence if done by humans.
A machine is thought to be intelligent if it can achieve human-level performance in some cognitive task. To build an intelligent machine, we have to capture, organise and use human expert knowledge in some problem area.
A computer program capable of performing at a human-expert level in a narrow problem domain is called an expert system.
Knowledge is a theoretical or practical understanding of a subject. Knowledge is the sum of what is currently known. An expert is a person who has deep knowledge in the form of facts and rules and strong practical experience in a particular domain.
Experts can usually express their knowledge in the form of production rules.
Production rules are represented as IF-THEN statements and are the most popular type of knowledge representation; rule-based expert systems are likewise the most popular kind of expert system. Rules can express relations, recommendations, directives, strategies, and heuristics.
Expert systems provide a limited explanation capability by tracing the rules fired during a problem-solving session.
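As a rough illustration (the rules and facts below are invented), a rule-based system can be sketched as a loop that fires any rule whose conditions are satisfied, with the trace of fired rules serving as the explanation:

```python
# A minimal forward-chaining sketch: rules are (name, conditions, conclusion)
# tuples, and the trace of fired rules doubles as a simple explanation.
rules = [
    ("R1", {"engine_wont_start", "battery_flat"}, "recharge_battery"),
    ("R2", {"engine_wont_start", "fuel_tank_empty"}, "refuel"),
    ("R3", {"recharge_battery"}, "check_alternator"),
]

def forward_chain(facts):
    facts = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            # Fire the rule if all its conditions hold and it adds a new fact.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(name)
                changed = True
    return facts, trace

facts, trace = forward_chain({"engine_wont_start", "battery_flat"})
print(facts)  # includes 'recharge_battery' and 'check_alternator'
print(trace)  # ['R1', 'R3'] -- the explanation: which rules fired, in order
```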
Rule-based expert systems have the advantages of natural knowledge representation, uniform structure, separation of knowledge from its processing, and coping with incomplete and uncertain knowledge.
Rule-based systems also have disadvantages, especially opaque relationships between rules, ineffective search strategy, and inability to learn.
Uncertainty is the lack of exact knowledge that would allow us to reach a perfectly reliable conclusion. The main sources of uncertain knowledge in expert systems are: weak implications, imprecise language, missing data, and combining the views of different experts.
Probability theory provides an exact, mathematically correct approach to uncertainty management in expert systems. The Bayesian rule permits us to determine the probability of a hypothesis given that some evidence has been observed.
In the Bayesian approach, an expert is required to provide the prior probability of the hypothesis, a likelihood of sufficiency to measure belief in the hypothesis if the evidence is present, and a likelihood of necessity to measure disbelief in the hypothesis if the same evidence is missing.
To employ the Bayesian approach, we must satisfy the conditional independence of evidence. We should also have reliable statistical data and define the prior probabilities for each hypothesis.
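As a rough sketch of how this updating works (all the numbers below are hypothetical), the prior probability is converted to odds, multiplied by the likelihood of sufficiency (LS) when the evidence is observed, or by the likelihood of necessity (LN) when it is known to be absent, and then converted back to a probability:

```python
# Odds-form Bayesian updating with likelihood of sufficiency (LS) and
# likelihood of necessity (LN). Values below are made up for illustration.
def to_odds(p):
    return p / (1.0 - p)

def to_prob(odds):
    return odds / (1.0 + odds)

prior = 0.25       # expert's prior probability of the hypothesis
LS, LN = 4.0, 0.3  # belief boost if evidence present / penalty if absent

# Evidence observed: multiply the prior odds by LS.
posterior_if_present = to_prob(LS * to_odds(prior))
# Evidence known to be absent: multiply the prior odds by LN.
posterior_if_absent = to_prob(LN * to_odds(prior))

print(round(posterior_if_present, 3))  # 0.571 -- belief strengthened
print(round(posterior_if_absent, 3))   # 0.091 -- belief weakened
```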
Certainty Factors theory is a popular alternative to Bayesian reasoning. Certainty Factors theory provides a judgemental approach to uncertainty management in expert systems. An expert is required to provide a certainty factor to represent the level of belief in the hypothesis given that the evidence has been observed.
Certainty factors are used if the probabilities are not known or cannot be easily obtained. Certainty theory can manage incrementally acquired evidence, the conjunction and disjunction of hypotheses, as well as evidence with different degrees of belief.
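A minimal sketch of the standard combination rules for certainty factors (the numbers here are made up for illustration):

```python
# Combining certainty factors, following the standard (MYCIN-style) rules.
def combine(cf1, cf2):
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)  # two confirming pieces of evidence
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)  # two disconfirming pieces
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # conflicting evidence

# Conjunction and disjunction of evidence use min and max respectively.
cf_and = min(0.8, 0.6)  # 0.6
cf_or = max(0.8, 0.6)   # 0.8

print(combine(0.8, 0.6))   # 0.92 -- incremental belief grows with new evidence
print(combine(0.8, -0.6))  # 0.5  -- conflicting evidence pulls belief down
```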
Both Bayesian reasoning and certainty theory share a common problem: finding an expert able to quantify subjective and qualitative information.
Fuzzy logic is logic that describes fuzziness. As fuzzy logic attempts to model humans’ sense of words, decision making, and common sense, it is leading to machines with more human-like intelligence.
Fuzzy logic is a set of mathematical principles for knowledge representation based on degrees of membership rather than on the crisp membership of classical binary logic. Unlike two-valued Boolean logic, fuzzy logic is multi-valued.
A fuzzy set is a set with fuzzy boundaries, such as short, average, or tall for men’s height. To represent a fuzzy set in a computer, we express it as a function and then map the elements of the set to their degree of membership. Typical membership functions used in fuzzy expert systems are triangles and trapezoids.
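For example, a triangular membership function can be written in a few lines; the height boundaries below are made-up values:

```python
# A triangular membership function: the degree of membership rises linearly
# from a to b and falls from b to c. Parameters here are illustrative only.
def triangle(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Fuzzy set "average" men's height, in cm (invented values).
print(triangle(165, 150, 170, 190))  # 0.75 -- mostly "average"
print(triangle(185, 150, 170, 190))  # 0.25 -- only slightly "average"
```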
A linguistic variable is used to describe a term or concept with vague or fuzzy values. These values are represented in fuzzy sets.
Hedges are fuzzy set qualifiers used to modify the shape of fuzzy sets. They include adverbs such as very, somewhat, quite, more or less, and slightly. Hedges perform the mathematical operations of concentration, reducing the degree of membership of fuzzy elements (e.g. very tall men); dilation, increasing the degree of membership (e.g. more or less tall men); and intensification, increasing degrees of membership above 0.5 and decreasing those below 0.5 (e.g. indeed tall men).
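These hedge operations are simple to express in code; the sketch below uses the usual mathematical definitions (squaring for very, square root for more or less):

```python
# Standard mathematical interpretations of hedges, applied to a
# membership degree mu.
def very(mu):            # concentration: squares the degree
    return mu ** 2

def more_or_less(mu):    # dilation: square root of the degree
    return mu ** 0.5

def indeed(mu):          # intensification: sharpens contrast around 0.5
    return 2 * mu ** 2 if mu <= 0.5 else 1 - 2 * (1 - mu) ** 2

mu_tall = 0.8
print(very(mu_tall))          # 0.64  -- "very tall" is a stricter set
print(more_or_less(mu_tall))  # ~0.89 -- "more or less tall" is looser
print(indeed(mu_tall))        # 0.92  -- pushed further above 0.5
```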
Building a fuzzy expert system is an iterative process that involves defining fuzzy sets and fuzzy rules, evaluating and then tuning the system to meet the specified requirements.
Tuning is the most laborious and tedious part in building a fuzzy system. It often involves adjusting existing fuzzy sets and fuzzy rules.
A frame is a data structure with typical knowledge about a particular object or concept.
Frames are used to represent knowledge in a frame-based expert system. A frame contains knowledge of a given object, including its name and a set of attributes also called slots.
Frame-based systems support class inheritance, that is the process by which all characteristics of a class-frame are assumed by the instance-frame.
The fundamental idea of inheritance is that attributes of the class-frame represent things that are typically true for all objects in the class, but are filled with data that is unique for that instance.
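Frames map quite naturally onto classes in an object-oriented language. Here is a minimal sketch in Python (the slots and values are invented for illustration):

```python
# Frames as classes: the class-frame holds typical attribute values (slots),
# and an instance-frame fills them with data unique to that instance.
class Computer:               # class-frame
    cpu = "unknown"           # slots with typical (default) values
    memory_gb = 8

class Laptop(Computer):       # subclass inherits all slots of Computer
    battery_hours = 10

# Instance-frame: inherits the defaults, overrides what is unique to it.
my_laptop = Laptop()
my_laptop.cpu = "i5"

print(my_laptop.cpu)            # 'i5' -- instance-specific value
print(my_laptop.memory_gb)      # 8    -- inherited from the class-frame
print(my_laptop.battery_hours)  # 10   -- inherited from Laptop
```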
Although frames provide a powerful tool for combining declarative and procedural knowledge, they leave the knowledge engineer with difficult decisions about the hierarchical structure of the system and its inheritance paths.
Machine learning involves adaptive mechanisms that enable computers to learn from experience, learn from example, and learn by analogy. Learning capabilities can improve the performance of an intelligent system over time. One of the most popular approaches to machine learning is artificial neural networks.
An artificial neural network consists of a number of very simple and highly interconnected processors, called neurons, which are analogous to the biological neurons in the brain. The neurons are connected by weighted links that pass signals from one neuron to another. Each link has a numerical weight associated with it. Weights are the basic means of long-term memory in artificial neural networks. They express the strength, or importance, of each neuron input. A neural network “learns” through repeated adjustments of these weights.
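A single neuron and its weight-adjustment loop can be sketched in a few lines. Below is a simple perceptron-style example (the starting weights, threshold, and learning rate are arbitrary choices):

```python
# A single neuron: weighted sum of inputs passed through a step function,
# with weights nudged toward the target on each error (perceptron rule).
def activate(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def train_step(inputs, target, weights, threshold, lr=0.1):
    error = target - activate(inputs, weights, threshold)
    # Learning = repeated small adjustments of the weights.
    return [w + lr * error * x for w, x in zip(weights, inputs)]

# Learn the logical AND function from examples.
weights, threshold = [0.3, -0.1], 0.2
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
for _ in range(20):
    for inputs, target in data:
        weights = train_step(inputs, target, weights, threshold)
print([activate(x, weights, threshold) for x, _ in data])  # [0, 0, 0, 1]
```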
The evolutionary approach to artificial intelligence is based on computational models of natural selection and genetics, known as evolutionary computation. Evolutionary computation combines genetic algorithms, evolution strategies, and genetic programming.
All methods of evolutionary computation work as follows: create a population of individuals, evaluate their fitness, generate a new population by applying genetic operators, and repeat this process a number of times.
Genetic algorithms use fitness values of individual chromosomes to carry out reproduction. As reproduction takes place, the crossover operator exchanges parts of two single chromosomes, and the mutation operator changes the gene value in some randomly chosen location of the chromosome. After a number of successive reproductions, the less fit chromosomes become extinct, while those best fit gradually come to dominate the population.
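A toy genetic algorithm makes the loop concrete; the chromosome length, population size, rates, and fitness function below are arbitrary illustrative choices:

```python
import random

# A tiny genetic algorithm: evolve a 16-bit chromosome toward all ones.
GENES, POP_SIZE, GENERATIONS = 16, 20, 50

def fitness(chromosome):
    return sum(chromosome)  # number of 1-bits

def crossover(parent_a, parent_b):
    point = random.randint(1, GENES - 1)  # exchange parts of two chromosomes
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rate=0.02):
    # Flip each gene with a small probability.
    return [1 - g if random.random() < rate else g for g in chromosome]

population = [[random.randint(0, 1) for _ in range(GENES)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    weights = [fitness(c) for c in population]  # fitter parents reproduce more
    population = [
        mutate(crossover(*random.choices(population, weights=weights, k=2)))
        for _ in range(POP_SIZE)
    ]
print(max(fitness(c) for c in population))  # climbs toward 16 over generations
```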
Knowledge engineering is the process of building intelligent knowledge-based systems. There are six main steps: assess the problem; acquire data and knowledge; develop a prototype system; develop a complete system; evaluate and revise the system; and integrate and maintain the system.
Intelligent systems are typically used for diagnosis, selection, prediction, classification, clustering, optimisation, and control. The choice of a tool for building an intelligent system is influenced by the problem type, availability of data and expertise, and the form and content of the required solution.
Understanding the problem’s domain is critical for building an intelligent system. Developing a prototype system helps us to test how well we understand the problem and to make sure that the problem-solving strategy, the tool selected for building a system, and the techniques for representing acquired data and knowledge are adequate for the task.
Modern societies are based on information. Most information appears in its raw form as facts, observations, and measurements. These constitute data, which is what we collect and store. As the cost of computing power continues to fall, the amount of accumulated data is increasing exponentially. Yet traditional databases are not designed for carrying out meaningful analysis of the data - this is where data mining comes into play.
Data mining is the extraction of knowledge from raw data. It can also be defined as the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules. The ultimate goal of data mining is to discover knowledge.
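As a toy illustration of discovering a rule from raw data (the transactions below are invented), here is how the support and confidence of a simple association rule could be computed:

```python
# Estimate the confidence of the rule "buys bread -> buys butter"
# from made-up transaction records.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

with_bread = [t for t in transactions if "bread" in t]
support = len(with_bread) / len(transactions)
confidence = sum("butter" in t for t in with_bread) / len(with_bread)

print(f"support(bread) = {support:.2f}")                  # 0.80
print(f"confidence(bread -> butter) = {confidence:.2f}")  # 0.75
```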
Although data mining is still largely a new and evolving field, it has already found numerous applications. In direct marketing, data mining is used for targeting people who are most likely to buy certain products and services. In trend analysis, it is used to identify trends in the marketplace, for example by modeling the stock market. In fraud detection, data mining is used to identify insurance claims, cellular phone calls, and credit card purchases that are most likely to be fraudulent.
Overall, this is an excellent guide to the machine learning and artificial intelligence industry as a whole, and it gives a good explanation of many of the different types of expert systems that are widely used and studied.
However, I initially started reading this book because I was particularly interested in artificial neural networks. Whilst I’m sure I benefited from reading about the other techniques, I often felt I was wasting time on topics I wasn’t focused on.
The book also claims not to require a background in calculus and to be free from jargon; however, I didn’t find that to be the case at all. There are much better resources online that explain topics such as artificial neural networks from first principles without jumping straight into the underlying complexity.
I think this is a good book if you want to understand the history of the industry, what techniques have been tried, and what they are good for. If you are particularly interested in one area, such as artificial neural networks, there are many better resources out there that will give you a stronger intuition of how it actually works without drowning you in complexity.