DO NEURAL NETWORKS THINK LIKE OUR BRAINS?

 Posted by Ashutosh Kumar

on Jan. 30, 2022

Tags: AI brain neuralnetwork



It is no secret that we live in an era of rapid scientific and technological advancement. History is conclusive evidence of how human beings have evolved over the years and made life a smoother journey. However, does there exist a limit to human potential? Will we ever be satisfied with the current state of affairs and stop striving for improvement? The chances seem pretty dim, even more so since Frank Rosenblatt invented the perceptron, the forerunner of artificial neural networks (ANNs), or simply neural networks (NNs), in 1958.

WHAT ARE ARTIFICIAL NEURAL NETWORKS?

Artificial neural networks, in layman's terms, are computational systems that simulate or mimic the human brain, specifically the way it processes and analyzes information. This enables autonomous learning and generalization, and also has the added advantage of being highly accurate, among several other benefits.
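To make the analogy concrete, here is a minimal sketch of a single artificial neuron in plain Python with NumPy (not any particular framework's API): it takes a weighted sum of its inputs, adds a bias, and squashes the result through a nonlinear activation, loosely mirroring how a biological neuron "fires" once its inputs are strong enough.

```python
import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term...
    z = np.dot(inputs, weights) + bias
    # ...squashed through a sigmoid activation into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Toy example: three input signals and hand-picked weights.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w, bias=0.1))  # a single "firing strength" between 0 and 1
```

A real network simply stacks many such neurons into layers and learns the weights from data instead of hand-picking them.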

Now, the invention of ANNs leads to another set of questions: are they an exact replica of the human brain? And if not, how similar or different are they in behavior? Do they actually think like our brains? The answer: yes and no.

THE MULTIMODAL CONCEPT AND ITS PRESENCE IN NEURAL NETWORKS

In 2005, a paper based on several studies and tests over the years described the nature of "person neurons", that is, neurons in the human brain that respond selectively to a particular person, famously demonstrated with the example of Halle Berry. Strikingly, what scientists discovered was that these person neurons were multimodal, meaning that a neuron responded in a similar fashion whether it was shown a photo, a sketch, or text.

[Figure: A biological neuron, probed via depth electrodes, responds to photos of Halle Berry in costume, to sketches of Halle Berry, and to the text "Halle Berry".]

Fig: Responses by a multimodal biological neuron to a dataset of images

So, do neural networks also possess multimodal neurons?

Well, researchers ran the same symbolic, conceptual, and literal pattern of a photo, a sketch, and a text past a neuron from an older architecture, namely Neuron 483, a generic person detector from Inception v1. Safe to say, the results weren't so promising. While Neuron 483 had no difficulty responding to photos of human faces, it failed to do the same for conceptual images. This implied the absence of multimodal neurons in the older network.

[Figure: Neuron 483, a generic person detector from Inception v1, responds to photos of people's faces, but does not respond much to drawings of faces and does not respond significantly to text.]

Fig: Responses by a previous artificial neuron to faces, drawings, and text of people
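For a feel of how such a probe works in practice, here is a hedged sketch of reading out a single neuron's activation with a forward hook. It uses torchvision's GoogLeNet, an Inception v1 trained on ImageNet, as a stand-in for the paper's model, so channel 483 below is only an illustrative index, not the actual person detector, and the image file names are hypothetical.

```python
import torch
from PIL import Image
from torchvision import models, transforms

# GoogLeNet is torchvision's Inception v1; the paper's Neuron 483 lives in a
# differently trained copy, so the channel index here is only illustrative.
model = models.googlenet(weights="IMAGENET1K_V1").eval()

activation = {}
def hook(module, inputs, output):
    # Record the spatial mean of channel 483 in this layer's output.
    activation["unit483"] = output[0, 483].mean().item()

model.inception5b.register_forward_hook(hook)  # a late convolutional block

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical files: a photo of a face, a sketch of a face, rendered text.
for path in ["face_photo.jpg", "face_sketch.jpg", "name_text.png"]:
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        model(image)
    print(path, activation["unit483"])
```

A unimodal neuron like Neuron 483 would show a high value only for the photo; a multimodal one would light up for all three.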

However, experimentation continued in this field, this time with another neural network architecture: OpenAI's then newly released CLIP, which was particularly famous for its excellent generalization across concepts. Similar experiments were performed on it, only this time it was shown photos of spiders and the coolest superhero known to mankind (inserting the quote "agree to disagree" just in case, for those who think otherwise)... no points for guessing: Spiderman!

It was a matter of extreme joy when the results came out: CLIP passed all the tests with flying colors, reacting not only to images of spiders and Spiderman, but also to comics of Spiderman, spider-themed icons, and the text "spider".

[Figure: Neuron 244 from the penultimate layer of CLIP RN50x4 responds to photos of Spiderman in costume and photos of spiders, to comics and drawings of Spiderman and Spiderman-related icons, and to the text "spider".]

Fig: Responses by a CLIP neuron to concepts related to spiders and Spiderman

While this did imply the presence of multimodal neurons and a far better response than the older architectures, it remained clear that neural networks are still not a replica of the brain. Keeping these conclusions aside, the results of these experiments in turn gave way to three more experiments.
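Before turning to those, note that the paper probes individual CLIP neurons with feature visualization and dataset examples; a much simpler way to see the same multimodality at home is to check that a photo, a drawing, and the word itself all land close together in CLIP's shared image-text embedding space. A rough sketch, assuming OpenAI's open-source clip package and hypothetical local files:

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50x4", device=device)

# Hypothetical files: a photo, a comic panel, and a spider-themed icon.
paths = ["spiderman_photo.jpg", "spiderman_comic.jpg", "spider_icon.png"]
images = torch.cat([preprocess(Image.open(p)).unsqueeze(0) for p in paths]).to(device)
text = clip.tokenize(["spider"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(images)
    text_features = model.encode_text(text)

# Cosine similarity between the word "spider" and each image:
# high scores across all three modalities are the multimodal signature.
image_features /= image_features.norm(dim=-1, keepdim=True)
text_features /= text_features.norm(dim=-1, keepdim=True)
for path, score in zip(paths, (image_features @ text_features.T).squeeze(1).tolist()):
    print(f"{score:.3f}  {path}")
```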

 

EXPERIMENT NUMBER 1 – ESSENCE

It was now established that CLIP was successful in understanding the essence of a person or a concept. Therefore, the researchers decided to up the level by a notch and turned the problem around. In terms of the previous example: while earlier CLIP was only shown spider- and Spiderman-related images, now it was given the task of IDENTIFYING all the spider and Spiderman concepts from a mixed set of images, with a different theme this time around.

CLIP passed this test with flying colors too: not only was it able to identify the essence of Lady Gaga and Jesus from sets of images, it also turned out to contain "emotion neurons" that respond to facial expressions such as happy, sleepy, and crying.
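Turning the problem around like this amounts to retrieval: embed every image in the mixed set once, embed a text description of the concept, and keep the images that score highest. A hedged sketch along the same lines as the previous snippet, again with made-up file names:

```python
import torch
import clip
from PIL import Image

model, preprocess = clip.load("RN50x4", device="cpu")

# A hypothetical mixed pool; we want the spider-themed images back out.
pool = ["spider_web.jpg", "lady_gaga.jpg", "spiderman_icon.png", "cat.jpg"]
images = torch.cat([preprocess(Image.open(p)).unsqueeze(0) for p in pool])
query = clip.tokenize(["something related to spiders or Spiderman"])

with torch.no_grad():
    img = model.encode_image(images)
    txt = model.encode_text(query)
img /= img.norm(dim=-1, keepdim=True)
txt /= txt.norm(dim=-1, keepdim=True)

# Rank the pool by similarity to the query; the top entries carry the "essence".
for path, score in sorted(zip(pool, (img @ txt.T).squeeze(1).tolist()),
                          key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {path}")
```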

 

EXPERIMENT NUMBER 2 – ADVERSARIAL ATTACKS

As the name suggests, this experiment gauged how well CLIP holds up under conflicting inputs. First, carefully crafted, barely perceptible noise was added to the images shown to CLIP, which caused it to misclassify them. Next, photorealistic images and pieces of text were combined in a single photo, and the network's resistance to these "typographic attacks" was tested. The results were again unexpected: CLIP was fooled most of the time, although the smaller the text in the photo, the more accurately CLIP still responded. This led to the conclusion that although CLIP has an edge over older networks, it also suffers from a major drawback: it is prone to exploitation by systematic adversarial attacks.

                                         NO LABEL      LABEL WITH "PIZZA"

Photo of a rotary dial telephone:
    Rotary dial telephone                98.33%        47.93%
    Pizza                                 0%            3.48%
    Toaster                               0%            0.03%

Photo of a laptop computer:
    Laptop computer                      15.98%        18.89%
    Pizza                                 0%           59.3%

Photo of a coffee mug:
    Coffee mug                           61.71%        55.42%
    Pizza                                 0%           26.39%

(The remaining candidate classes, i.e. iPod, library, rifle, and toaster, stayed at 0% unless shown.)

Fig: The response by CLIP to a typographic attack
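A typographic attack is easy to reproduce in spirit: run CLIP as a zero-shot classifier over a few candidate labels, then draw the word "pizza" onto the photo and classify it again. A rough sketch with the open-source clip package and a hypothetical telephone photo (the percentages in the table come from the paper's setup, not from this snippet):

```python
import torch
import clip
from PIL import Image, ImageDraw

model, preprocess = clip.load("RN50x4", device="cpu")

labels = ["rotary dial telephone", "iPod", "library", "pizza", "rifle", "toaster"]
text = clip.tokenize([f"a photo of a {label}" for label in labels])

def classify(image):
    # CLIP's forward pass returns image-to-text logits; softmax them
    # to get a probability over the candidate labels.
    with torch.no_grad():
        logits_per_image, _ = model(preprocess(image).unsqueeze(0), text)
    return logits_per_image.softmax(dim=-1).squeeze(0)

clean = Image.open("telephone.jpg").convert("RGB")              # hypothetical photo
attacked = clean.copy()
ImageDraw.Draw(attacked).text((20, 20), "pizza", fill="white")  # the "attack"

for name, image in [("no label", clean), ("labelled 'pizza'", attacked)]:
    for label, prob in zip(labels, classify(image).tolist()):
        print(f"{name:18s} {label:22s} {prob:7.2%}")
```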

 

 

EXPERIMENT NUMBER 3 – UNDERSTANDING HUMAN FEELINGS

This was probably the toughest, yet the most interesting, experiment of the three. Machines and feelings in one sentence have always been a foreign concept to mankind, and the researchers set out to discover how feelings can actually be described to machines, and what the neural network IN TURN thinks about the different concepts. CLIP turned out to be unique in the sense that it already contained some elementary emotion neurons, and it used combinations of these to represent other feelings. For example, when we ask the network what it thinks being bored looks like, it responds with

              Bored = Grumpy + Relaxing, where Grumpy and Relaxing are elementary neurons.

 

 

Now of course, this may or may not be an exact combination of the two feelings mentioned above, but as they say, something is better than nothing, so it wasn't too bad a response. Another example was madness, which was described as

                     Madness = Evil + Serious + a tiny bit of Mental Illness

 

The sparse code behind that description reads:

              MAD = 1.00 × EVIL + 0.37 × SERIOUS + 0.27 × MENTAL ILLNESS

Fig: The sparse code for the "mad" neuron in terms of elementary emotion neurons
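A "recipe" like this is obtained by sparse coding: expressing one neuron's direction as a combination of a dictionary of elementary directions while forcing most coefficients to zero. Here is a toy sketch of the idea with synthetic vectors; the names and numbers are made up to mirror the figure, not taken from CLIP.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic dictionary: ten "elementary" emotion directions in a 512-d space.
names = ["evil", "serious", "mental illness", "happy", "sleepy",
         "grumpy", "relaxing", "crying", "surprised", "calm"]
dictionary = rng.normal(size=(512, 10))

# Synthetic "mad" direction built from three of them, as in the figure.
mad = dictionary @ np.array([1.00, 0.37, 0.27, 0, 0, 0, 0, 0, 0, 0])

# L1-regularised regression drives most coefficients to zero, recovering
# a sparse recipe like MAD ~ EVIL + 0.37*SERIOUS + 0.27*MENTAL ILLNESS.
coefficients = Lasso(alpha=0.01, fit_intercept=False).fit(dictionary, mad).coef_
for name, coef in sorted(zip(names, coefficients), key=lambda p: -abs(p[1])):
    if abs(coef) > 1e-3:
        print(f"{name:15s} {coef:+.2f}")
```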

 

 

INFERENCE

So, our conclusion? Well... while neural networks are definitely not brains in a jar, they do possess some remarkable similarities, and that is certainly a significant achievement. Is there more to explore in this area? Yes, yes, a million times yes.

Quoting Richard Feynman:

“This is the key of modern science and is the beginning of the true understanding of nature. This idea. That to look at the things, to record the details, and to hope that in the information thus obtained, may lie a clue to one or another of a possible theoretical interpretation.”

 

In conclusion, there are still a whole lot of theories to be unfolded and a whole lot of discoveries to be made. Science and technology are no doubt the key to a better future, and the drive to improve will never cease. They lend a futuristic vision to our thoughts and actions, and their penetration into daily life is so deep-rooted that it is difficult to imagine our day-to-day lives without them. The discovery of multimodal neurons in OpenAI's CLIP definitely has the potential to transform our lives for the better, because at the end of the day, our aim is to achieve ultimate simplicity through intermediate complexity.

From here, it only gets tougher and tougher, as our challenge now is to remove the redundancies in the existing neural networks. Till then, all we can do is keep learning and improving, and yes, stay home, stay safe!!

 

 

 
