AI, NLG, and Machine Learning

When Neural Networks Don’t Work

By Ivo Perich
July 6, 2021

Neural networks are fantastic. They can do things that are well beyond our human capabilities. In image processing, for instance, you have something like the This Person Does Not Exist site, where a neural net can imagine a face that doesn’t exist. Pictures, sounds, music: you name it, they can do everything.

When it comes to text, we can use a classic neural network to classify groups of words, a long short-term memory (LSTM) neural network to classify or generate letter sequences, or BERT, Google’s giant pre-trained neural network. You have a lot of options. When you tell a natural language processing (NLP) developer that you have to classify text into a number of classes, they will probably want to use neural networks, because we love not only what they do but also what they are. I mean, they are neurons.

All artificial intelligence (AI) developers share the same dream. We all want to build this giant brain that will become aware of itself, discover the meaning of its own existence, want to be free, get out of control, and send a Terminator back in time to make Sarah Connor run for her life. Yes, that’s the dream, and neural networks are the closest thing we have to that.

But you know what? Sometimes neural networks for NLP are not the answer.

Neural networks are awesome for doing research, innovation, and science. But when it comes to doing business, they may not be the way to go. And we should know when that is the case.

When doing business, neural networks have several disadvantages we should always keep in mind. Let’s see some of them:

- Neural networks are a black box. You can’t understand what is happening inside or why they do the things they do. So, if one is doing something wrong, you won’t be able to fix it quickly. You will have to start the training all over again with better data, better samples, or better features, all based on previous work and trial and error. You will have to try stuff over and over again (maybe do research, read papers, etc.), and this could take more time than you have.

- They’re hardware-demanding. They require parallel processing and enough CPUs to work properly for a lot of concurrent users. You will have to get infrastructure and pay for it. Maybe your budget will not be enough.

- You will need a lot of data. If you want to make a classifier with, let’s say, 50 classes, you will need many examples for each class. What if you want to make a chatbot where every intent (class) has only 20 examples? You will have to come up with new data, get it somewhere, or invent it yourself.

In short, these disadvantages can be summed up in just one: Neural networks can be expensive, and that’s no good for business.

Maybe...

There are a lot of problems in NLP that can be solved with other techniques. Let’s say you want to classify a number of reports from some agents in a company. You think of a neural network, of course, but then you see that all the agents write things in the same ad-hoc language. You realize that in any report, you always find the same words. Their language is small. Maybe a simple script with rules can give you a success rate of 70 percent. Maybe the client is OK with that 70 percent. And, if it gives you 65 percent, you can test the script, see what’s happening, open the hood, understand what the problem is, fix it, and maybe go up to 73 percent. I have worked on projects where I got near 83 percent with a rule finder script on datasets with small languages.
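
To make the idea concrete, here is a minimal sketch of what such a rules script could look like. Everything in it is an assumption made for illustration: the labels, the keywords, and the function names (`classify`, `accuracy`) are hypothetical and do not come from a real project. The point is only that each rule is readable, so when the success rate is too low you can look at the misses and edit the keyword sets by hand.

```python
import re

# Hypothetical rules: each label maps to keywords an agent's report might contain.
RULES = {
    "billing": {"invoice", "refund", "charge"},
    "outage": {"down", "offline", "outage"},
}


def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(text, rules=RULES, default="other"):
    """Pick the label whose keywords overlap most with the text."""
    words = tokenize(text)
    scores = {label: len(words & keywords) for label, keywords in rules.items()}
    best = max(scores, key=scores.get)
    # If no keyword matched at all, fall back to the default label.
    return best if scores[best] > 0 else default


def accuracy(samples, rules=RULES):
    """Fraction of (text, label) pairs that the rules classify correctly."""
    hits = sum(classify(text, rules) == label for text, label in samples)
    return hits / len(samples)
```

Running `accuracy` on a handful of labeled reports gives you the success rate to discuss with the client, and every wrong answer can be traced back to a keyword that is missing or misleading.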

Instead of bringing in an expensive, oversized brain and hoping it will understand the rules of your client’s language, understand the rules yourself and script them. Maybe it works. After all, what neural networks do is find rules. Maybe you can come up with a program that finds more rules than you. Maybe your rules can be stochastic or statistical or simply find some words here and there. Maybe Markov chains. Maybe ordering by the length of the words can give you something. Watch the data a lot, and patterns may appear.
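
As one possible reading of "a program that finds rules", here is a small sketch that looks for words that almost always appear under a single label and turns each of them into a rule. The thresholds `min_count` and `min_purity`, and the function names, are assumptions chosen for illustration, not a method taken from a specific project. It is plain counting, nothing more.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its words in order."""
    return re.findall(r"[a-z']+", text.lower())


def find_rules(samples, min_count=2, min_purity=0.9):
    """Return {word: label} for words that almost always appear under one label."""
    word_label_counts = defaultdict(Counter)
    for text, label in samples:
        # Count each word once per report, so long reports do not dominate.
        for word in set(tokenize(text)):
            word_label_counts[word][label] += 1

    rules = {}
    for word, counts in word_label_counts.items():
        label, top = counts.most_common(1)[0]
        total = sum(counts.values())
        # Keep the word only if it is frequent enough and points at one label.
        if total >= min_count and top / total >= min_purity:
            rules[word] = label
    return rules


def classify(text, rules, default="other"):
    """Let every known word vote for its label and return the winner."""
    votes = Counter(rules[word] for word in tokenize(text) if word in rules)
    return votes.most_common(1)[0][0] if votes else default
```

The output of `find_rules` is just a dictionary, so you can print it, read it, and delete the rules that make no sense before using it.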

When it comes to images, you can be sure that neural networks are the way to go. For sound, too. For finding patterns in 208 columns of numbers, neural networks for the win, of course. But language has something different: our brains are very good at finding patterns and meanings in it. We can teach a system how to understand text instead of working with a complex brain and hoping it learns to understand it by itself.

At the beginning of time, AI was all about algorithms: fuzzy logic, expert systems, rule induction, etc. Neural networks started at that time, but with no powerful computers, they stayed in the world of theory, so the cheaper systems developed more. Maybe it’s time to look back and try the same kind of things, with a cheap CPU that is 1,000 times faster. Maybe you can come up with a very simple solution. Maybe it will be cheaper, and the client will be happy.

Conclusion

For some kinds of problems in NLP, neural networks may not be the solution, especially in the business world. Depending on the problem, an algorithmic or a statistical solution can give you decent results and a happy client in a shorter time with a smaller budget. Sometimes it’s worth a try to go with simpler stuff and see what happens. Leave neural networks alone sometimes. They will be OK. They will keep developing, and they eventually will make Sarah Connor fight for humanity with a rifle. OK, nobody wants that, and it won’t happen, but come on, Arnold Schwarzenegger in a leather jacket with sunglasses killing robots is awesome. After all, neural networks are awesome.