To use AI to help folks keep it real, we have to train it to tell the difference between real and fake. In AI this is known as a classification problem, and because our case has only two categories, it is called “binary classification”. In this example the two categories are real food and fake food.
Decide on a problem.
Because we are looking at real vs. fake, you want an example where that difference actually matters. We will train the AI to decide whether a given image belongs in the “hand made and healthy” food category or the “factory fakes” food category.
Find examples of images in both categories.
The more examples you provide, the better the AI will be able to generalize and find the features common to all examples in one category. But set aside some images to use in validation, so you can see whether it correctly classifies an image it has never seen before.
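Setting images aside can be sketched in a few lines of Python. This is only an illustration, not the method used by the app in the next example: the file names, the `split_train_validation` helper, and the 20% validation share are all assumptions made up for the sketch.

```python
import random

def split_train_validation(image_paths, validation_fraction=0.2, seed=0):
    """Shuffle the images, then hold back a share of them for validation."""
    shuffled = list(image_paths)
    # A fixed seed makes the shuffle repeatable from run to run.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation

# Hypothetical file names, for illustration only.
real_food = [f"real_{i}.jpg" for i in range(10)]
fake_food = [f"fake_{i}.jpg" for i in range(10)]

# Split each category separately so both appear in training and validation.
real_train, real_val = split_train_validation(real_food)
fake_train, fake_val = split_train_validation(fake_food)

print(len(real_train), "training and", len(real_val), "validation images of real food")
print(len(fake_train), "training and", len(fake_val), "validation images of fake food")
```

Splitting each category on its own is a design choice: it guarantees that the validation images include both real and fake food, which matters when you only have a handful of pictures.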
Validate your model.
Provide an image it has never seen before, and test whether it classifies it correctly. Do that several times with different target images to gather more evidence that your AI app works.
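Repeating the test over several held-back images amounts to measuring accuracy: the share of images the AI labels correctly. The sketch below shows the arithmetic. The `toy_classify` function is a hypothetical stand-in that guesses from the file name; a trained model would take its place. The file names and labels are also invented for illustration.

```python
def accuracy(classify, labeled_images):
    """Return the fraction of images whose predicted label matches the true label."""
    correct = sum(1 for image, label in labeled_images if classify(image) == label)
    return correct / len(labeled_images)

def toy_classify(image_name):
    # Hypothetical stand-in for a trained model: it only looks at the file name.
    if "gummy" in image_name:
        return "factory fakes"
    return "hand made and healthy"

# Images held back from training, each paired with its true category.
validation_set = [
    ("gummy_bears.jpg", "factory fakes"),
    ("apple.jpg", "hand made and healthy"),
    ("gummy_worms.jpg", "factory fakes"),
    ("fruit_snack.jpg", "factory fakes"),
]

score = accuracy(toy_classify, validation_set)
print(f"Correct on {score:.0%} of validation images")
```

Here the stand-in gets three of the four images right and misses `fruit_snack.jpg`, so the score is 75%. With so few images the number is only rough evidence, which is why testing with more images helps.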
In the next example, we will start you with images we have collected, but you can then use it with your own images to create your own AI project.
Even with only the images we have collected, you can run your own experiments. For example, it is crucial for AI developers to test for bias. Suppose many of the “gummy” fruit pictures include the packaging, as you can see above. Perhaps you are not really training AI to see “gummy” but rather to see packaging! How would you select the training data to prevent that problem?
This problem is called “spurious correlation”. Generally, when collecting data to use in AI, you want to observe the principle of ceteris paribus (Latin for “all other things being equal”), even for something like lighting or camera angle, so that the only consistent difference between the categories is the one you want the AI to learn.
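One way to look for a possible spurious correlation before training is to tag each image by hand for the feature you are worried about, then compare how often it shows up in each category. The sketch below assumes such hand-made tags exist; the records, the `shows_packaging` field, and the `attribute_rate_by_category` helper are all hypothetical.

```python
from collections import Counter

def attribute_rate_by_category(examples, attribute):
    """For each category, return the fraction of examples where the attribute is true."""
    totals = Counter()
    with_attribute = Counter()
    for example in examples:
        totals[example["category"]] += 1
        if example[attribute]:
            with_attribute[example["category"]] += 1
    return {category: with_attribute[category] / totals[category] for category in totals}

# Hypothetical hand-tagged records, for illustration only.
examples = [
    {"category": "gummy", "shows_packaging": True},
    {"category": "gummy", "shows_packaging": True},
    {"category": "gummy", "shows_packaging": True},
    {"category": "gummy", "shows_packaging": False},
    {"category": "real fruit", "shows_packaging": False},
    {"category": "real fruit", "shows_packaging": False},
    {"category": "real fruit", "shows_packaging": False},
    {"category": "real fruit", "shows_packaging": False},
]

rates = attribute_rate_by_category(examples, "shows_packaging")
for category, rate in rates.items():
    print(f"{category}: {rate:.0%} of images show packaging")
```

A large gap between the categories, like the one in this made-up data, suggests the AI could learn "packaging" instead of "gummy". Adding or removing images until the rates are similar is one way to bring the data closer to "all other things being equal".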
And now you can see where social bias comes into AI. For example, if you selected images by googling “beautiful” or “smart”, what kinds of spurious correlations might the image set contain?