Artificial Intelligence is probably the hottest topic in the world of technology right now.
You have AI-driven products becoming unicorns or securing hundreds of millions in venture funding. There are also celebrity product leaders who lead these AI-driven products that we follow and learn from.
Many product managers out there want to benefit from this incredible technology by either leading an AI product or adding AI capabilities to their existing products. If you are one of them, this one goes out to you.
I have built several AI products, and I am here to help you understand this technology and make the most out of it.
What is Machine Learning?
Machine learning is a discipline that focuses on creating algorithms and software capable of “learning” from the information we provide and effectively performing specific types of tasks as a result of this training.
The term and the field itself are surprisingly old—dating back to 1959, when Arthur Samuel of IBM (one of the prominent computer scientists of the time) coined it in his work "Some Studies in Machine Learning Using the Game of Checkers."
Despite its 60-year history, however, machine learning has become popular only during the last decade thanks to the massive increase in computing power, especially in GPUs.
But what does machine learning have to do with artificial intelligence? Let’s move on by getting that out of the way.
Machine Learning (ML) vs Artificial Intelligence (AI)
Many people use the terms “machine learning” and “artificial intelligence” interchangeably to refer to smart software that is capable of recognizing items in images, writing poems, and generating impressive responses to detailed prompts.
However, artificial intelligence is the broader concept of machines with the abilities of perception and reasoning, while machine learning is a subset of AI that focuses on software that can learn.
With the term definitions behind us, let’s move on to understanding how these machines learn and the essentials of how they work.
How Machine Learning Works
In order to understand machine learning, we should first review its most common form—deep learning neural networks.
A neural network is a structure written in regular code that tries to replicate the operation of a biological brain. For humans and other living creatures, the brain is a complex network of neurons (nerve cells) connected to each other where an impulse from one cell activates another one, and the information travels along the network of neurons.
I don’t think I am the right person to explain biology (as I’d fail miserably at it), so I'll stick to the technology.
The neural networks in software look something like this:
Each circle here is a single neuron. In code, a neuron is simply a function that can consume some sort of information, process it, and pass a result to the next neuron.
Note: this is a simplification of the real process. For more details, you can watch the exceptional video series on neural networks on the YouTube channel 3Blue1Brown, which also explains the mathematical concepts behind this technology.
As we can see in the image above, our neurons are organized into layers. Neural networks will usually have three types of layers:
- The input layer, which is responsible for consuming information from the outside world.
- The hidden layers, which process this information, with each one handling a single aspect of it.
- The output layer will give the processed result back to the outside world.
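To make the "a neuron is simply a function" idea concrete, here is a minimal Python sketch. The weights, the sigmoid activation, and the names `neuron`, `layer`, and `tiny_network` are my own illustrative assumptions, not part of any particular library; a real network would learn its weights rather than have them typed in by hand.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, squashed into (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation (an assumption)

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def tiny_network(inputs):
    # Hypothetical hand-picked weights; training would normally set these.
    hidden = layer(inputs, [[0.5, -0.2], [0.3, 0.8]], [0.0, -0.1])  # hidden layer
    output = layer(hidden, [[1.0, -1.0]], [0.0])                    # output layer
    return output[0]

print(tiny_network([1.0, 2.0]))  # a single number between 0 and 1
```

The input layer here is just the list you pass in, the hidden layer transforms it, and the output layer reduces it to one answer.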
Now let’s understand the meaning of these layers with a simple example. Imagine that you want the neural network to recognize an animal in a photo. For this, you have devised a neural network that consists of six layers that will operate the following way:
- The input layer will take in small segments of the photo of the animal and pass them to the first hidden layer.
- The first hidden layer will try to find basic shapes (e.g. straight lines) in each segment and pass that information to the second one.
- The second layer will then combine some of those lines into basic geometric forms (e.g. circles and semi-circles) and pass them to the third one.
- The third layer will then create more complex shapes out of these geometric forms.
- The fourth layer will combine all of the shapes created before and get an outline of an animal.
- The output layer, finally, will give the type of animal that it has recognized from the shape.
Here is what it will look like visually.
Looking at the process above, you might wonder how AI engineers get these layers to recognize shapes and guess the animal in the image correctly. They do it by “training” the neural network to perform its task.
To train the network, AI engineers will give it a set of inputs along with their corresponding correct outputs. In our case, the inputs would be several images of animals, and the outputs would be the correct animal type for each of those images.
Because your network has the correct answers, it will continuously tweak itself (adjusting the weights that control how strongly each neuron's output influences the next ones) until its answers are very close to the correct answers we have provided.
In real life, your network will never be 100% accurate, and there will always be some degree of error that you need to account for. The good news is that you don’t really need it to be 100% correct. All you care about is the model being faster and more accurate than a human.
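If you are curious what "tweaking itself" might look like in code, here is a rough sketch of training a single neuron with gradient descent. The data is made up, and the function names, learning rate, and number of passes are assumptions chosen for illustration; real image models are far larger, but the loop has the same shape: predict, compare with the correct answer, nudge the parameters.

```python
import math

def predict(weights, bias, features):
    """Return a confidence between 0 and 1 that the answer is 'yes'."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def train(examples, epochs=2000, learning_rate=0.5):
    """examples: list of (features, correct_answer) pairs with answers 0 or 1."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, correct in examples:
            error = predict(weights, bias, features) - correct
            # Nudge every parameter in the direction that shrinks the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Made-up training set: (ear_length, tail_length) -> 1 for "rabbit", 0 for "cat".
examples = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.2, 0.9], 0), ([0.3, 0.8], 0)]
weights, bias = train(examples)

print(predict(weights, bias, [0.9, 0.1]))  # close to 1: "rabbit"
print(predict(weights, bias, [0.2, 0.9]))  # close to 0: "cat"
```

Notice that the outputs are confidences rather than certainties, which is one reason a model's answers always carry some degree of error.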
Now that we have an understanding of how it works, let’s move on to understanding how you can use machine learning in your products.
Machine Learning: What It Can Do, and What It Can’t
The hype around AI is absolutely massive and many startup founders or product managers dream about adding AI to their products. But here's the million-dollar question: do you really need AI?
I understand that the term “AI-driven” looks good in the investment pitch deck and marketing materials of any product, but don’t forget that building an AI model is expensive and time-consuming.
Thus, if your AI model intends to solve a basic problem that you could have easily handled with ordinary code, you will end up wasting your funds on a flashy term on your website.
To work out whether you need an AI model, let us discuss the tasks and problems that you can handle with one and compare them with those that you should solve with regular coding.
Common Tasks that Machine Learning Models Can Handle for You
ML models are not magic—although, with all the ways you can use ChatGPT and other new large language models, it certainly seems that way. While we're seeing a surge in advanced AI tools, many products can benefit from simpler ML models.
The functions these models perform usually fall into four categories.
Classification
As the name suggests, these models are able to recognize the information you provide to them and classify it or assign a label to that information. The example of recognizing animals from the image that we discussed previously is a typical case of AI classification.
The area of AI classification, in turn, divides into four types:
Binary Classification: This model will give you a Yes or a No. The common use cases for this are email spam detectors (isSpam = true or false), breast cancer diagnosis (risky or not risky), etc.
Imbalanced Classification: These models will divide data into two groups: a normal majority and an abnormal minority. An example of this is fraud detection in customs declarations on state borders.
The AI model will examine the documents and give green (good to go), yellow (needs document inspection), or red (needs goods inspection) flags to each customs declaration, depending on how “normal” the documents look compared to others.
Multi-Class Classification: In this case, the model will select one of the many predefined values. Optical character recognition algorithms (the ones that extract text out of an image) belong to this type.
Multi-Label Classification: Unlike the previous one, these models will assign multiple labels to the same piece of data. Many computer vision algorithms fall under this category as they look at the image or video and identify multiple objects in it. Here’s what Multi-label classification looks like for a self-driving car.
Other applications of this technology include extracting useful information from the text (e.g. opinions about your brand from Tweets) or pinpointing the topics of discussion in a podcast.
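The difference between these classification types is easiest to see in how a model's raw scores get turned into an answer. The sketch below assumes the model has already produced scores between 0 and 1; the function names and thresholds are hypothetical values picked for illustration.

```python
def binary(score, threshold=0.5):
    """Binary: one score, one yes/no answer."""
    return score >= threshold

def customs_flag(anomaly_score, yellow_at=0.6, red_at=0.9):
    """Imbalanced: most items look normal, so only the rare outliers get flagged."""
    if anomaly_score >= red_at:
        return "red"
    if anomaly_score >= yellow_at:
        return "yellow"
    return "green"

def multi_class(scores):
    """Multi-class: exactly one label wins (the one with the highest score)."""
    return max(scores, key=scores.get)

def multi_label(scores, threshold=0.5):
    """Multi-label: every label above the threshold is kept."""
    return sorted(label for label, s in scores.items() if s >= threshold)

print(binary(0.92))                                  # True (e.g. isSpam)
print(customs_flag(0.72))                            # yellow
print(multi_class({"A": 0.1, "B": 0.7, "8": 0.2}))   # B
print(multi_label({"car": 0.9, "pedestrian": 0.8, "traffic light": 0.3}))
# ['car', 'pedestrian']
```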
Regression
This class of AI models is able to make predictions based on existing information. With regression models, you can understand the relationship between several factors as well as predict future values based on those in the past (a.k.a. forecasting algorithms).
Some of the uses of regression algorithms include:
- Forecasting stock price based on past price fluctuation.
- Predicting the price of a car based on mileage, production year, etc.
- Suggesting the best ad texts for better conversion rates.
- Finding the best weekday and time for sending emails to get the most opens and clicks.
Some of the larger models are also able to predict things as complex and chaotic as the weather.
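As a small illustration of the car price use case, here is a sketch of the simplest possible regression: fitting a straight line through past data with least squares. The mileage and price numbers are invented, and `fit_line` is a helper I wrote for this example; real pricing models would use many more factors.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up history of sold cars.
mileage_km = [20_000, 60_000, 100_000, 140_000]
price_usd = [18_000, 15_000, 12_000, 9_000]

slope, intercept = fit_line(mileage_km, price_usd)
predicted = slope * 80_000 + intercept  # price estimate for an unseen mileage
print(round(predicted))  # 13500
```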
Clustering
Clustering is a process based on unsupervised learning in which the model detects similarities in a set of items and groups them by that similarity. Unlike classification, a clustering algorithm has no idea what it is seeing. All it knows is that these items are similar to each other.
Some of the use cases for clustering algorithms are:
- Grouping your email marketing list into segments that behave similarly.
- Organizing a large set of books into topics.
- Grouping photos by the items found in them.
In general, if you have a large set of items that you want to divide into different groups based on a specific characteristic, the clustering algorithms are what you will need.
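For a feel of how grouping without labels can work, here is a stripped-down sketch of k-means, one common clustering technique (my choice for this example, not the only option). It groups subscribers by a single made-up number, their email open rate. The starting centers are picked naively and the function assumes `k` is at least 2.

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters by repeatedly moving each cluster's center."""
    ordered = sorted(values)
    # Naive deterministic start: spread the initial centers across the range.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

open_rates = [0.05, 0.08, 0.10, 0.55, 0.60, 0.62]  # made-up data
print(k_means_1d(open_rates))
# [[0.05, 0.08, 0.1], [0.55, 0.6, 0.62]]
```

The algorithm never learns that one group is "inactive" and the other "engaged"; naming the segments is still your job.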
Generation
Finally, we have the algorithms that are capable of creating content. We are talking about AI artists, writers, and musicians.
The use cases for this group of models are probably endless, and the models get better every day (the text-generating ones are known as large language models, or "LLMs"). Here you have:
- Website copywriters that create structure and content optimized for SEO.
- Removing the watermark from an image and drawing the part of the image that was under the watermark.
- Creating new frames in a game, instead of letting your graphics card render them, to save on resources.
- Generating photos for your ads that are license-free.
Here's an example of a poem I asked GPT-4 to write. (Note: I asked GPT-3 the same question in January and I assure you, it is a lot more talented today.)
You can use natural language to formulate a prompt, and the model will understand and fulfill your request based on the knowledge it has gathered from the internet.
To sum up, these are the most common types of applications of AI models.
It looks like there are a lot of things that you can do with AI, but sometimes it is faster and cheaper to ditch the idea of building a model and resort to simple software product development.
Tasks Where You Should Use Regular Coding Instead
Not every problem needs an AI model to solve.
To determine if you can solve the problem at hand without AI, ask yourself if there is a deterministic and clear set of rules or actions that you can take to solve it.
- If the answer is yes, then you can use regular code (after all, code is simply a set of clear rules and commands to your computer). For instance, you do not need AI to determine if the file that your users have uploaded is an Excel spreadsheet, as XLSX files have a strict structure and you can check for that.
- When you cannot point out these clear rules, then you will need to rely on AI models to handle that task for you. The obvious example here is image recognition. There is no reliable set of rules that can let you understand if the red thing on the image is a car of a specific brand.
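The spreadsheet check above can be sketched in a few lines of ordinary Python. An XLSX file is a ZIP archive with a known layout, so the sketch only looks for two entries that Excel-generated files typically contain. Treat the choice of these two entries as a simplifying assumption; a production check would validate more.

```python
import io
import zipfile

def looks_like_xlsx(data: bytes) -> bool:
    """Rule-based check: is this a ZIP archive with the usual XLSX entries?"""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    try:
        with zipfile.ZipFile(buffer) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    # Assumed markers of a typical XLSX package.
    return "[Content_Types].xml" in names and "xl/workbook.xml" in names

# Build a tiny stand-in archive to try the function out.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")

print(looks_like_xlsx(sample.getvalue()))       # True
print(looks_like_xlsx(b"just some plain text")) # False
```

No training data, no model, and the answer is the same every time, which is exactly what you want from a deterministic rule.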
To further solidify this concept, let’s take a look at several problems that you can solve without AI.
Finding the shortest route from point A to point B on the map. This is a well-known problem in mathematics and there are a lot of deterministic approaches (e.g. Dijkstra’s algorithm) that can solve it with ordinary code.
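Here is a compact sketch of Dijkstra’s algorithm in Python to show how little "intelligence" is involved. The road map and its distances are invented, and `shortest_distance` is just a name I picked; the point is that the whole thing is a fixed set of rules.

```python
import heapq

def shortest_distance(graph, start, goal):
    """Dijkstra's algorithm: graph maps each node to {neighbor: distance}."""
    distances = {start: 0}
    queue = [(0, start)]  # (distance so far, node), smallest distance first
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_dist = dist + weight
            if new_dist < distances.get(neighbor, float("inf")):
                distances[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor))
    return None  # the goal cannot be reached

# Made-up road map with distances in kilometers.
roads = {
    "A": {"C": 2, "D": 9},
    "C": {"D": 3, "B": 10},
    "D": {"B": 4},
    "B": {},
}
print(shortest_distance(roads, "A", "B"))  # 9 (A -> C -> D -> B)
```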
Checking for traffic jams on the map. This one has a clever solution. Google, for instance, will take real-time GPS location data from all smartphones (where it has this feature enabled) and check if there are many phones standing still on the street. If so, then there must be a traffic jam there.
Note: That said, you can use AI here to predict traffic jams on streets where there are very few smartphones or where their locations are scattered.
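A rule like "many phones standing still on the same street" translates almost directly into code. The sketch below is my own simplification, not how Google actually implements it: the input format, the speed cutoff, and the minimum phone count are all assumptions.

```python
from collections import Counter

def congested_streets(pings, max_speed_kmh=5.0, min_phones=20):
    """pings: iterable of (street_name, speed_kmh), one entry per phone."""
    slow = Counter(street for street, speed in pings if speed <= max_speed_kmh)
    return sorted(street for street, count in slow.items() if count >= min_phones)

# Made-up snapshot of phone locations.
pings = (
    [("Main St", 2.0)] * 25     # many phones barely moving
    + [("Oak Ave", 40.0)] * 30  # many phones, but traffic is flowing
    + [("Elm St", 1.0)] * 3     # slow, but too few phones to be sure
)
print(congested_streets(pings))  # ['Main St']
```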
Calculating brand mentions on social media. If you want to get the number of mentions of your brand on social media platforms, then all you need is a scraping tool that will read social media search results and count them.
Note: An AI model would be useful here for estimating the total number of mentions from a small scraped portion of the results (instead of the entire search result list), saving you computational power and resources.
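Once the scraper has collected the posts, the counting itself is plain text matching. This sketch assumes the posts are already available as a list of strings and uses a hypothetical brand name, "Acme"; it matches whole words only and ignores letter case.

```python
import re

def count_mentions(posts, brand):
    """Count whole-word, case-insensitive occurrences of the brand name."""
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    return sum(len(pattern.findall(post)) for post in posts)

# Made-up scraped posts.
posts = [
    "I love Acme!",
    "acme support is slow, but ACME ships fast",
    "Nothing relevant here",
    "acmeology is not a mention",
]
print(count_mentions(posts, "Acme"))  # 3
```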
To sum up, AI models are quite capable in terms of the tasks that they can perform, but sometimes it is easier to code instead.
Now that you have an understanding of how to use Artificial Intelligence in your products, let me move on to sharing a couple of tips on working with AI models.
6 Tips For Product Managers Working With Machine Learning
Throughout my career working with machine learning products, I have made lots of mistakes that have resulted in a loss of time and money. To make sure that you, the ML product manager, do not repeat these mistakes, here are some handy tips to consider.
The Quality of Data Will Make Or Break Your Models
ML Engineering teams and Data Scientists love to use the term “garbage in, garbage out.” It means that the quality of the model will directly depend on the quality of the datasets you use to train it.
To understand what I mean by bad data, let me go over some of the aspects that determine its quality:
Bad annotations: Annotation is the process of manually labeling new data to use as “correct results” for your model’s training. Annotators for a model that recognizes animals, for instance, will highlight the creature on the image and tag it with its name. If you tag elephants as “penguins”, your model will do the same after training too.
Lack of diversity: Your models will work well only if the data in the real world looks similar to the data they have trained on. If you have a car plate detection system that has learned to read U.S. license plates only, it will fail miserably when it sees a car with EU plates.
Bias in data: Have you ever heard about racist AI? In 2015, Google apologized after the image-recognition algorithm in its Photos app labeled photos of Black people as “gorillas.”
There are also numerous cases of police AI models targeting minorities.
The reason behind this is biased data. These police AIs had used the database of a police station that was notorious for being racially biased. Hence, there was a lot of data on minority arrests that led to the AI model targeting them.
Therefore, make sure that the data you use does not contain any biases that will potentially lead to disrespect, offence, or worse.
Build Features That Help You Retrain Your Models
Building a model is only half the job; you also need to keep improving it by adding new capabilities or simply giving it fresh data to retrain on.
Finding data for retraining that is relevant to your ML product might be challenging. So consider adding features that will enable you to get that data from the product itself.
Here are two feedback loop approaches that I have tried (and which worked well):
Asking for feedback on the AI results: If you are detecting spam in email, then add a “not spam” button for your users to remove the email from spam. As a result of this action, they will also “annotate” that email as spam-free and you can use it to retrain your model.
Note: Yes, that’s exactly what the “Report not spam” button is doing in Gmail.
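In code, such a feedback loop can be as simple as saving every correction as a labeled example. The class and method names below are hypothetical, and a real product would write to a database rather than a list in memory, but the idea is the same: each click becomes an annotation you did not have to pay for.

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackLog:
    """Collects user corrections as (email_text, label) pairs for retraining."""
    examples: list = field(default_factory=list)

    def record_not_spam(self, email_text):
        # The user's click on "not spam" doubles as a manual annotation.
        self.examples.append((email_text, "not_spam"))

    def record_spam(self, email_text):
        self.examples.append((email_text, "spam"))

log = FeedbackLog()
log.record_not_spam("Your invoice for March is attached.")
log.record_spam("You won a free cruise, click here!")
print(len(log.examples))  # 2
```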
Offering users the option to fine-tune your model on their data: This is a win-win situation. Your users will get a better-performing model because you have retrained it on their specific data, while you will get extra data to increase the diversity of your model.
If You Can, Use Ready Solutions Instead of Building Yours
You don’t have to reinvent the AI wheel. There are lots of ready models out there that you can buy or rent. This is especially relevant if you are trying to solve an issue that is relatively common (e.g. identifying nude images in your social media app).
A simple search for a logo detection model turned up nine ready-made solutions that we could integrate through an API.
Use (and Contribute to) Open Source AI Models
Artificial Intelligence has a significant open source community. You can lean on it to find either ready models or useful tools to help you build your models.
You can check out this list on GitHub for some of the best open-source machine learning projects and initiatives out there. This list is quite diverse and includes:
- Models with different levels of explainability (how well you can explain the way your model works) written in Python and C from giants like IBM and Amazon.
- Training data for your models.
- Tools for data analysis, key metrics tracking, anomaly detection, and more.
Note: If you have made any improvements to these models, please open a pull request and share your work with the open-source community and its stakeholders. Open source is about awesome people doing awesome things and sharing them with others.
AI Models Take Time to Build
The methodologies of agile and MVP don't always work with ML systems. You cannot always build a small model and then add new features on top of it with multiple iterations. Sometimes changing a model slightly will mean retraining it from scratch.
You might even face a case when optimizing your model by 10% would mean discarding the old one and creating a new machine learning algorithm.
Another thing to consider is the nature of the work done by data science professionals and the cross-functional teams that have them on board.
Sometimes they will not be able to give you estimates on when they will complete their work, as they spend the majority of their time inventing something completely new and have no idea how long the development process will take.
Avoid Drastic Changes In Your Database Structure
This is a painful reality that I have faced a couple of times.
If you are using your own data to retrain your models, please refrain from changing the structure of your database.
Your models will be built to consume data of a specific structure and type, and, if your data engineering team changes your DBs, your ML team will have to change your models and even retrain them from scratch.
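One cheap safeguard is a check that compares incoming data against the structure your model was trained on, so that a database change is caught before it silently breaks predictions. The column names and types below are hypothetical; this is a sketch of the idea rather than a full validation framework.

```python
# Hypothetical columns the model was trained on.
EXPECTED_COLUMNS = {"mileage_km": float, "production_year": int}

def schema_problems(row, expected=EXPECTED_COLUMNS):
    """Return a list of mismatches between a data row and the expected schema."""
    problems = []
    for column, expected_type in expected.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(
                f"{column}: expected {expected_type.__name__}, "
                f"got {type(row[column]).__name__}"
            )
    return problems

print(schema_problems({"mileage_km": 42000.0, "production_year": 2018}))
# []
print(schema_problems({"mileage": 42000.0, "production_year": "2018"}))
# ['missing column: mileage_km', 'production_year: expected int, got str']
```

Running a check like this in your data pipeline gives both teams an early warning whenever a rename or type change would affect the model.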
Learning is an ongoing process.
Artificial Intelligence has made the world a better place for all of us as it has let many visionary founders and ML PMs solve problems that seemed impossible in the past.
I hope that this guide has helped you understand the concept of AI product management, the product manager’s role in this process, as well as the real-world applications of this fascinating technology.
But building great new products is not only about the proper use of AI. There are many important product aspects that you should master too, including:
- The art of building Minimum Viable Products.
- Properly planning your team’s work with Product Roadmaps.
- Finding your PMF in a cost-effective manner, etc.
Finally, as the poem in the GPT-4 demo above suggested, you can subscribe to our newsletter to get the best of product management and stay ahead of the curve.