Theoretical and Practical Introduction to Artificial Intelligence
1. Introduction
Artificial Intelligence (AI) is a field of study that has experienced incredible breakthroughs over the last decades, especially in the last few years, and is expected to transform many areas of society in the coming years.
Given the importance of AI, there is an increasing demand for training resources that enable people to learn about its fundamentals without mathematical or computer technicalities and without ignoring the disadvantages of AI or exaggerating its capabilities.
Not considering when AI should be applied, when it should not be, or when AI, humans, and algorithms should be combined can lead to situations where we not only fail to move forward but actually go backwards, in some cases dangerously. This has been demonstrated by examples such as the fatal accidents of some autonomous cars, which to this day have a higher accident rate than human drivers [1], or AI-based personnel selection processes in which biases that discriminated by gender were detected [2].
Therefore, the objectives of this theoretical and practical introduction to AI are to provide the reader with an understanding of:
- What AI is and what its origins are.
- In which cases it makes sense to apply AI.
To achieve the above objectives, in this document, we will approach AI from a theoretical and practical perspective, so that the reader will have a complete and first-hand introduction.
1.1. Glossary
This section defines some terms needed to understand the rest of the document.
- Neuron: the most basic processing unit of a neural network. A simple model of an artificial neuron is also called a perceptron.
- Input layer: composed of the input neurons, it introduces data into the neural network.
- Output layer: it is the last layer of neurons that produces the output of the neural network.
- Hidden Layers: these are the layers of neurons between the input and output layers.
- Forward propagation: the algorithm the neural network uses to compute (predict) its outputs from its inputs.
- Backpropagation: the algorithm the neural network uses to learn: it adjusts the weights based on the error between the predicted outputs and the expected outputs.
- Epoch: it is the training of the neural network with all the training data for one cycle of forward and backward propagation. For example, if we train the neural network with 30 epochs, it means that we have trained it with all the data 30 times.
- Learning rate: it is a hyperparameter used to govern the pace at which the backpropagation algorithm updates or learns the values of a parameter estimate, often in the range between 0 and 1. The learning rate controls how quickly the model is adapted to the problem. Smaller learning rates require more training epochs given the smaller changes made to the weights in each update, whereas larger learning rates result in rapid changes and require fewer training epochs.
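The relationship between the learning rate and the number of epochs can be illustrated with plain gradient descent on a toy function; the function and the values below are illustrative choices, not part of any particular SDK:

```python
# Minimize f(w) = (w - 3)^2 by gradient descent; its gradient is 2 * (w - 3).
def train(learning_rate, epochs, w=0.0):
    for _ in range(epochs):
        gradient = 2 * (w - 3)
        w = w - learning_rate * gradient  # the weight update rule
    return w

# A small learning rate makes small updates and needs many epochs to get
# close to the minimum at w = 3; a larger one converges in far fewer epochs.
slow = train(learning_rate=0.01, epochs=50)
fast = train(learning_rate=0.3, epochs=50)
print(slow, fast)
```

Note that a learning rate that is too large can also overshoot the minimum and fail to converge, which is why this hyperparameter is tuned per problem.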
2. Theory
2.1. What is Artificial Intelligence (AI)?
AI is the ability, exhibited by non-biological systems, to learn and apply knowledge. To solve a challenge or problem, we can use:
- Algorithm: know in depth the challenge and the steps to follow to solve it.
- AI: apply AI to learn the challenge and the solution.
- Mixed: know part of the challenge and the steps to follow to solve it and apply AI to the part we don’t know.
The advantages of using AI versus algorithms are:
- It learns: the AI learns so we do not need to know in depth the challenge or the steps or procedure (algorithm) to follow for its solving.
- Adaptive: it adapts to changes in the environment.
- Ignores outliers: AI is usually more capable than classical statistical techniques of ignoring outliers.
However, AI has the following disadvantages:
- Black box: since its operation is based on billions of calculations, it is not possible to really know how the AI arrives at its conclusions.
- Imperfect: the success of the AI in solving a challenge is measured in percentages and is almost never 100%.
- Energy consumption: AI typically consumes several orders of magnitude more energy than an equivalent algorithm.
Other authors also present as disadvantages the following points that we will explain:
- Data quality: the dependence on the quality of the data fed to the AI should not be considered a disadvantage relative to algorithms, because algorithms depend equally on the quality of the information used when they are defined.
- Bias: this should not be considered a disadvantage compared to algorithms either, because algorithms can also be designed with a bias towards one result or another.
Due to the above disadvantages, it is said that AI is always the second-best option: the best option is to know the challenge in depth, know the detailed procedure (algorithm) to solve it, and implement that procedure automatically.
However, sometimes it is either not possible or too costly to have a thorough understanding of the challenge and the detail of the procedure to solve it, or the advantages of AI are essential for a particular challenge; and it is in those cases that using AI makes sense.
2.2. Brief history of AI
Origins and Early Development (1940s – 1950s)
- 1943: Warren McCulloch and Walter Pitts publish a paper describing a simple neural network based on mathematics and algorithms, laying a theoretical foundation for AI (see 1 in “Sources and references” below).
- 1950: Alan Turing publishes his seminal paper “Computing Machinery and Intelligence”, proposing what is now known as the “Turing Test” to assess a machine’s intelligence (2).
- 1956: The term “artificial intelligence” is coined by John McCarthy at the Dartmouth Conference, an event considered the birth of AI as an independent field of study (3).
Growth and Challenges (1960s – 1970s)
- During the 1960s and 1970s, AI experienced a period of optimism and funding, followed by the “AI winters” caused by technical and theoretical limitations (4).
Resurgence and Advances (1980s – 2000s)
- 1980s: Renewed interest in AI, partly driven by the success of expert systems and the development of machine learning algorithms (4).
- 1997: IBM’s supercomputer Deep Blue defeats world chess champion Garry Kasparov, a significant milestone in AI development (5).
Modern Era of AI (2010s – Present)
- 2012: Significant advances in deep neural networks, particularly the work on convolutional neural networks by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, revolutionize AI’s perception capabilities (6).
- 2016: DeepMind’s AlphaGo defeats world Go champion Lee Sedol in a game known for its complexity (7).
- 2020s: Advances in AI continue, with notable developments in natural language processing, robotics, and the ethical and social applications of AI (8).
Sources and references
- McCulloch, W., & Pitts, W. (1943). “A logical calculus of the ideas immanent in nervous activity”. Bulletin of Mathematical Biophysics.
- Turing, A. (1950). “Computing Machinery and Intelligence”. Mind.
- McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (1955). “A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence”.
- Russell, S., & Norvig, P. (2009). “Artificial Intelligence: A Modern Approach”. Prentice Hall.
- IBM (1997). “Deep Blue”.
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). “ImageNet classification with deep convolutional neural networks”. NIPS.
- Silver, D. et al. (2016). “Mastering the game of Go with deep neural networks and tree search”. Nature.
- OpenAI (2020s). “GPT series: Advancements in Natural Language Processing”.
2.3. How does AI work?
AI works in a way similar to what biological intelligences seem to do, basing its functioning on units called neurons, which are organized into groups called neural networks.
Neural networks, as we will see below, compute mathematical functions. In other words, the entire neural network works like a mathematical function. To understand this, let’s look at a very simple neural network capable of classifying points on a plane according to whether the point is red or blue.
The neural network, shown on the right of the image above, is composed of 3 neurons that, together, compute the function:
2x + 7y = 4
The neurons x and y are the inputs, i.e., they will take the values of the red and blue (x, y) points of the image plane. The neuron marked with the number 4 is the output. The values 2 and 7 are what are known as weights.
This function makes it possible to determine, with a high success rate, whether a point on the plane shown in the image is red or blue.
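The toy network above can be written directly as a function: the neuron computes the weighted sum 2x + 7y and compares it against the threshold 4. Which color lies on which side of the line is an assumption here, since it depends on the image:

```python
def classify(x, y):
    """Toy neuron: weighted sum of the inputs compared against the threshold 4."""
    weighted_sum = 2 * x + 7 * y  # weights 2 and 7 from the example network
    return "blue" if weighted_sum >= 4 else "red"

print(classify(0, 1))  # 2*0 + 7*1 = 7 >= 4, so this point falls on the "blue" side
print(classify(1, 0))  # 2*1 + 7*0 = 2 <  4, so this point falls on the "red" side
```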
By combining multiple groups of neurons like the one in the image, it is possible to define much more complex functions that allow finding the solution to far deeper challenges. These challenges need not be limited to a two-dimensional plane, as in the image above, but can live in conceptual spaces of thousands of dimensions.
Thus, the example seen above is capable of transforming the data of the points (x,y) belonging to a plane to the concept of whether the point will be red or blue. It is this capability that allows neural networks to move from the data plane to any other plane, such as, for example, the abstract concept plane, as ChatGPT does.
As we have seen above, AI is usually the second-best way to solve a challenge, so both biological systems and AI researchers have relied on the best way to make intelligence learn: learning algorithms.
There are different learning algorithms used to implement AI. One that is widely used is called Backpropagation. It is the core of a neural network and what makes it “learn”. This algorithm is usually implemented using matrix computations since it is the fastest way to execute it using current computers. See the bibliography of this document if you are interested in going deeper into this area.
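As a hedged sketch of these ideas, the loop below trains a tiny 2-2-1 network on the XOR data used later in the practical exercises, with forward propagation and backpropagation written as NumPy matrix operations. This is a generic textbook formulation, not the Anaimo implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: the XOR function used later in the practical exercises.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 2 hidden neurons -> 1 output, with random initial weights.
W1, b1 = rng.normal(size=(2, 2)), np.zeros((1, 2))
W2, b2 = rng.normal(size=(2, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    H = sigmoid(X @ W1 + b1)        # hidden layer activations
    return H, sigmoid(H @ W2 + b2)  # network outputs

learning_rate = 0.5
initial_error = np.mean((forward(X)[1] - T) ** 2)

for epoch in range(10_000):
    H, Y = forward(X)
    # Backpropagation: push the output error back through the layers.
    dY = (Y - T) * Y * (1 - Y)      # error signal at the output layer
    dH = (dY @ W2.T) * H * (1 - H)  # error signal at the hidden layer
    W2 -= learning_rate * H.T @ dY
    b2 -= learning_rate * dY.sum(axis=0)
    W1 -= learning_rate * X.T @ dH
    b1 -= learning_rate * dH.sum(axis=0)

final_error = np.mean((forward(X)[1] - T) ** 2)
print(initial_error, final_error)
```

Every step is a matrix product, which is why this algorithm maps so well onto current hardware.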
3. Practice
In this section we will perform three exercises using the Anaimo AI SDK library and the NNViewer application. At the end of each exercise, there are some questions to see if the concepts have been understood. The answers to these questions are in the next section but we highly encourage you to answer them by yourself first.
3.1. Libraries
So that we can practice with AI without having to develop the neural networks and all the necessary functions and utilities ourselves, there are several software libraries on the market, such as Google’s TensorFlow or Keras for Python.
In our case we will use Anaimo AI SDK, a software library that runs the neural networks that we will build and includes other utilities that we will also need. This library is free for a limited number of neurons, so we can download it from the Anaimo website (https://anaimo.ai).
You can consult the Anaimo AI SDK library web page (https://anaimo.com/sdk/) for its advantages over other alternatives. In this document we will use it because it can be used to solve challenges ranging from the simplest to the most complex.
3.2. Anaimo AI SDK installation
It is important to note that technical documentation of Anaimo is available in English.
We will use version 2024-01 (build 10010) of the Anaimo AI SDK. To download and install it, follow these steps:
- Download it from the official Anaimo website at the following address: https://anaimo.com/shop/
- Select the free version (“Add to cart” option) and proceed to the cart (“View cart” option).
- Click on “Proceed to checkout”, fill in the requested data, and download the software by clicking the “Free SDK AI Toolkit” button.
Once downloaded, we will unzip the ZIP file, and we will find this structure of folders and files:
- Examples: contains use cases.
- Libraries: contains the libraries and files necessary for its use.
- AnaimoAI_nn_Users_Guide.pdf: this file is the User’s Manual in PDF format. The manual can also be consulted online here: https://anaimo.com/academy/advanced-reference/neural-networks-users-guide/version-202401010010/anaimo-ai-sdk-users-guide/
For the exercises we will solve here, we will use the NNViewer application, which can be found, including its source code, in the directory \Examples\.NET\NNViewer.
The NNViewer application runs on Windows and is developed with Microsoft VB.NET (.NET 5.0). Please refer to the Anaimo AI SDK user manual for the correct version of Microsoft Visual Studio (remember that you can use the free Community version) or other software components that must be installed before you can run the application.
Execute the NNViewer application and answer the questions with the default values. The application will display, on the left, a table showing the values of the inputs, and on the right, a table showing the values of the outputs. Among many other things, we can use the mouse to paint on the inputs and outputs.
Remember that in the NNViewer application you can press the M key at any time to consult the help of the available keys. With NNViewer you can do, for example:
- Import your own images from folders, either in black and white, color, or RGB separated; activating for them the output you want and create your own projects.
- Make variations of the images (augmentation) so that the neural network learns better.
- Create networks automatically or manually (where you would have to establish the connections between neurons manually).
- Create different neural network architectures, from fully connected to convolutional.
- Automatically reorder the records (pages) so that you have a training data set first and a test data set later.
- Obtain the confusion matrix, i.e., the matrix that reports what the neural network has predicted for each case, both when it has been correct and when it has not.
- View the datasets that the Anaimo AI SDK library has in memory, since NNViewer can open the NNV files that the library generates.
There are many more features already included and, since the source code is available, you can modify them, extend them, or even create new ones.
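The confusion matrix mentioned in the list above can be computed with a few lines of code; this is a generic sketch, not the NNViewer implementation:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Cell (i, j) counts how often true class i was predicted as class j."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        m[a, p] += 1
    return m

# Made-up labels for six records and three classes.
actual    = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(actual, predicted, 3)
print(m)  # the diagonal holds the correct predictions; everything else is an error
```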
3.3. Exercise: XOR function
In this exercise we will create an XOR (exclusive or) function using a neural network. The XOR function is defined as follows:
| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 1      |
| 1       | 0       | 1      |
| 1       | 1       | 0      |
You can use the NNViewer application to make a neural network learn the XOR function. You have the file \Examples\.NET\NNViewer\bin\Release\net6.0-windows\xor.nnv, which you can load with the O key (open) so that you do not have to create the records (called pages in NNViewer) from the table above.
When you are going to create the neural network, we recommend that you indicate that you want only 1 hidden layer and manually enter that this intermediate layer should have 2 neurons. Your neural network will look like the one shown below (note that the neurons are numbered starting with zero):
Once you have gotten your neural network to be able to reproduce the outputs of the XOR function according to the table above, please answer the following questions.
Question 1. How many epochs did your neural network need to learn the XOR function with 100% success?
Question 2. If you wanted to solve the same challenge using an algorithm, what would it be?
Question 3. Which would consume more energy, solving the challenge by your neural network or by the algorithm?
3.4. Exercise: Pattern and anomaly detection
In this exercise we will make the neural network recognize patterns, learn them, and thus detect anomalies, i.e., records that do not fit the patterns.
For this exercise, we will use the file \Examples\.NET\NNViewer\bin\Release\net6.0-windows\ranges.nnv, which you can load with the O (open) key.
We will accept the default options to all the questions that are asked. Once loaded, we will see this:
This project (ranges.nnv) has loaded 100 pages. You can use the arrow keys of the keyboard to navigate between pages.
This project has been assembled as follows:
- In 95% of the pages, 2 contiguous rows of inputs have been activated, leaving some squares blank, and on the right the output at the same visual height has been activated. In the image above this is shown in a yellow box.
- However, in 5% of the pages, exceptions or anomalies to the above rule have been introduced. You can find an anomaly to the pattern on page 3, as shown in the image below (note that the input and the output are at different visual heights):
Next, we are going to make the neural network learn, by itself, the defined pattern. To do this, we will follow the steps below:
- We type A to learn and enter 30 epochs:
- We answer “Ok” in the next questions thus accepting the default value, except in the question regarding number of batches, which we answer 0:
- We answer “Ok” in the next questions thus accepting the default value, except in the question regarding last record for training, where we answer 100 (to use them all):
Normally NNViewer does not use all the records for training, which is why it proposes a lower last training record: it leaves a percentage of the records for testing. In AI it is essential to test with records the AI has never seen before, but that does not apply here because we want the AI to learn all the records.
- In a few epochs, the AI will have learned 95% of the records and will indicate this on the screen:
Now that the AI has learned the pattern, we can see what it “thinks” of the records by using the T key to put it into “Thinking” mode:
To display the outputs more clearly, we can switch them to black and white by pressing the B key, saying “No” to invert all images and “Yes” to switch to black and white mode. Once this is done, it will show it like this:
At this point, we can run a test to see how many pages the AI has learned correctly. To do this, we use the V key and answer all the questions with the default values, except the following:
By answering “1” for the starting record, the test runs over all the pages. We should get 95% correct.
Please, answer the following questions.
Question 1. Is there a common trait for the pages that the AI has not learned? In that case, why hasn’t it learned them?
Question 2. What practical use could there be for the fact that the AI did not learn those pages?
Question 3. Do you think that with more epochs it would eventually learn 100% of the pages?
This exercise allows us to explain two very important concepts: overfitting and underfitting.
3.4.1 Overfitting and Underfitting
Overfitting is when the neural network learns the exact outputs it should predict for each input, instead of the patterns that define them. In the previous exercise, this occurs if we ask it to learn up to a success rate greater than 95%, because then the network will be memorizing the anomalies instead of learning the pattern that makes those cases anomalous.
Underfitting, on the other hand, is the opposite. It happens when the neural network has not learned the pattern.
3.5. Exercise: Character recognition of handwritten numbers
In this exercise we will make the neural network learn to recognize numeric characters written by people, i.e., handwritten.
For this we will use a dataset of handwritten numeric characters called MNIST which is very popular in the field of AI challenges.
Handwritten numeric characters are already available within an NNViewer project, in the file \Examples\.NET\NNViewer\bin\Release\net6.0-windows\handwritten_28x28_50000_10000
This project consists of 50,000 grayscale 28×28 pixel images of numeric characters actually written by people, ordered consecutively from 0 to 9. The next 10,000 records are, again, handwritten numeric characters also sorted consecutively from 0 to 9. This is because in AI, learning must always be performed on one set of records and the results tested on a set of records the AI has never seen. Therefore, in this case, we will use the first 50,000 records for learning and the last 10,000 for testing the learning outcomes.
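The split described above can be expressed in a few lines; the arrays here are random placeholders with the same shape as the real dataset, since loading the actual project is done from NNViewer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholders for the 60,000 grayscale 28x28 images and their labels 0-9.
images = rng.integers(0, 256, size=(60_000, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=60_000)

# First 50,000 records for learning, last 10,000 for testing:
# the test records must never be seen during learning.
x_train, y_train = images[:50_000], labels[:50_000]
x_test, y_test = images[50_000:], labels[50_000:]
print(x_train.shape, x_test.shape)
```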
We will follow the steps below:
- From the NNViewer, we open the file \Examples\.NET\NNViewer\bin\Release\net6.0-windows\handwritten_28x28_50000_10000, with the key O (open). We will answer that we want only 1 intermediate layer, since for this project it will be enough this way:
- We will answer “Ok” with the default options to the rest of the questions. Finally, it will create a neural network with these characteristics:
We will wait for the entire project to load, which may take a few minutes.
- Once the project is loaded, we can use the arrow keys to scroll through the different pages. We can also use the “G” key to go to a specific page. Remember that with the “M” key you can consult all the available key commands.
- Now we will make the neural network learn the handwritten numeric characters. We will change the learning rate to 3, with the L key:
We can leave the decay rate at its default value of 1. This rate controls how the learning rate decreases with each epoch. You can refer to the Anaimo AI SDK user manual for more information.
- We will switch to validation by maximum. This means that the neural network will consider the active output to be the one with the highest probability, and not the one that exceeds a certain threshold. To do this, we will press the “V” key:
We will cancel the next step, since we do not want to perform the test itself, but simply tell the library how it should measure its results (by maximum):
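The difference between the two validation modes can be sketched with made-up output values; the numbers below are illustrative, not produced by the SDK:

```python
import numpy as np

# Hypothetical outputs of the network for one image, one value per digit 0-9.
outputs = np.array([0.05, 0.10, 0.48, 0.08, 0.02, 0.03, 0.45, 0.04, 0.01, 0.02])

# Validation by threshold: every output above the threshold counts as active,
# so zero, one, or several classes may be reported at once.
threshold = 0.4
active_by_threshold = np.flatnonzero(outputs > threshold)

# Validation by maximum: only the single highest output is considered active.
active_by_maximum = int(np.argmax(outputs))
print(active_by_threshold, active_by_maximum)
```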
- To speed up the learning process, we can tell the NNViewer to use all the CPU cores of our computer. If we do not tell it to do so, it will learn using only one core. To find out how many cores we have available, we can use the Windows Task Manager. To tell it the number of cores we want to use, use the “X” key:
- To start learning we use the “A” key, and choose, for example, 30 epochs of learning:
- We answer with the default value to the target percentage of success and respond that it uses 5,000 batches of records. Since we will use 50,000 records for learning, that will make the batch size 10 records, which means it will consolidate learning every 10 records:
- We will answer that the first learning record is 1 and the last one is 50,000, thus:
- Finally, if we choose not to refresh the screen during learning, the application will appear blank, but learning will be a little faster:
- Learning will begin. Typical results are that it learns over 85% of the characters in the first epoch, as can be seen in the following image:
- We can wait for the 30 epochs to finish, or we can stop the learning process at any time with the Escape key. Note that the process can be accelerated by using the AVX2 and AVX512 instruction sets, available in the SDK libraries.
- Once finished, you will get a result similar to this one:
As can be seen in the image above, the neural network has learned 94.96% of the handwritten characters, after 30 epochs and using a network with 1 intermediate layer of 30 fully connected neurons. Similar results can be obtained in fewer epochs using convolutional networks. If you are interested in going deeper into convolutional networks, you can generate them, and other network architectures, with the “I” key and using architecture mnemonics. That more advanced part is outside the scope of this document for now but remember that the Anaimo AI SDK user manual has more information on architecture mnemonics.
Now that your neural network has learned, please answer the following questions.
Question 1. Would it make sense to define in detail the procedure or algorithm to recognize handwritten numeric characters, as the AI has achieved? Justify your answer.
Question 2. Can you think of other reasons to question the use of algorithms?
4. Solution
Now that you have tried it on your own, let’s look at the answers to the exercises.
4.1. Exercise: XOR function
We follow these steps:
- We ensure that we have installed Microsoft Visual Studio (VS) 2022 (or higher) Community, with the VB.NET development environment.
- Go to the folder: \Examples\.NET\NNViewer
- Double click on the NNViewer.sln file. It should open the NNViewer project from VS.
- We run the project. The first time it will create a temporary neural network, so we answer all the questions it asks us by taking the default option.
- When it is ready to receive commands, type “O” to open a project (remember that you have the “M” key to consult the available commands).
- A dialog box will appear to open the project, we will find the xor.nnv project here: \Examples\.NET\NNViewer\bin\Release\net6.0-windows\xor.nnv
- We will answer 1 intermediate layer as follows:
- We will say “Yes” to “Full Connected”, which means to automatically connect all neurons with all neurons in the next layer.
- We will answer “No” to the application assigning the number of neurons automatically:
- We will tell it to put 2 neurons in the hidden (intermediate) layer:
- We will accept the following default parameters and this is the network that will be created:
- Now with the arrow keys we can review the 4 pages of data (records) of the XOR function.
- Next, we will make the neural network learn. We press “A” and it will ask us the number of epochs to learn, 10000 is a good value for this project:
- We accept the rest of the default values, except for the last record for training, which should be 4.
- At a given number of epochs (2179 in the image below), the neural network will learn the XOR function and show that it has a 100% success rate:
- Once the XOR function is learned we can switch to “Thinking” mode with the “T” key and see what prediction (output) the neural network makes for each record. If we want the outputs to be displayed in black and white, we can press the “B” key, answer “No” to invert all images and “Yes” to switch to black and white.
Bear in mind that neural networks are initialized with random numbers, so the results may differ from one run to another. It could also happen that the network does not reach 100% within the indicated number of epochs and needs more. It may be surprising that a function as simple as XOR needs so many epochs to be learned, which is an interesting point for reflection.
Question 1. How many epochs did your neural network need to learn the XOR function with 100% success?
Answer: in our run, 2179 epochs (the exact number varies between runs because the network is initialized with random numbers).
Question 2. If you wanted to solve the same challenge using an algorithm, what would it be?
Answer: the XOR function algorithm is:
- Output = 1, if the inputs are different.
- Output = 0, if the inputs are equal.
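That two-line rule translates directly into code:

```python
def xor(a, b):
    # Output 1 when the inputs differ, 0 when they are equal.
    return 1 if a != b else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))
```

The contrast between these few lines and the thousands of epochs needed by the neural network makes the energy question below very tangible.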
Question 3. Which would consume more energy, solving the challenge by your neural network or by the algorithm?
Answer: learning by AI, as we have done, consumes several orders of magnitude more energy than it would have taken to fully understand the XOR challenge and to create and implement the algorithm.
4.2. Exercise: Pattern and anomaly detection
Let’s see the answers hereunder.
Question 1. Is there a common trait for the pages that the AI has not learned? In that case, why hasn’t it learned them?
Answer: it has not learned that 5% of the pages because they do not fit the common pattern defined by all the other pages.
Question 2. What practical use could there be for the fact that the AI did not learn those pages?
Answer: The practical utility is that the AI learns the patterns, so it is able to detect anomalies. If instead of those test inputs and outputs, we had fed it real data from any system, the AI would be able to tell us those situations that do not fit the usual pattern.
Question 3. Do you think that with more epochs it would eventually learn 100% of the pages?
Answer: the intuitive answer is no, because as the AI learns patterns, it would never learn the exceptions if they were below a certain percentage of the total data. However, the correct answer in this case is yes, because although we have defined a pattern which is to set the inputs and outputs at the same height, the inputs are marked randomly and that is why they sometimes have white cells in the strip that should be totally black. That allows the AI, if given more epochs to learn, to distinguish even those special cases and learn 100% of the pages.
4.3. Exercise: Character recognition of handwritten numbers
Let’s see the answers hereunder.
Question 1. Would it make sense to define in detail the procedure or algorithm to recognize handwritten numeric characters, as the AI has achieved? Justify your answer.
Answer: This is one of the cases where it is better to use AI than to define the algorithm, because AI learns to recognize handwritten numeric characters in a few minutes, as we have seen in this exercise, while defining the algorithm to recognize them could take weeks of work to finally get a hit rate equal to or worse than AI.
Question 2. Can you think of other reasons to question the use of algorithms?
Answer: As seen in this document, in cases where the challenge changes over time (i.e., different inputs), an algorithm would normally not adapt to such changes, whereas AI would adapt, if it did not use static knowledge but learned in real time.
5. Feedback, help and support
If you want to leave a comment, need help or support on a more ongoing basis, please refer to the following web pages and use the mechanisms provided:
6. Bibliography
- Basic and advanced AI concepts at the “Serrano Academy”: https://www.youtube.com/@SerranoAcademy
- Anaimo Academy (currently the technical documentation): https://anaimo.com/academy/
- YouTube video from Anaimo solving some of the exercises in this document: https://www.youtube.com/watch?v=eYXvQjl5ALw
- Documentary on the risks and ethical implications of AI: https://www.codedbias.com/
[1] For more information see: https://www.wired.com/story/self-driving-car-crashes-rear-endings-why-charts-statistics/
[2] For more information see: https://www.codedbias.com/