The idea was to demonstrate that deep learning can “understand” human behavior and the habits of a specific person, so that the AI system could offer suggestions to the user based on that understanding.
The problem with that method is the huge amount of time required to manually label the images (40,000 in this case). So AI researchers have turned to synthetic images (such as from a video) that are pre-labeled (in captions, for example).
Creating superrealistic image recognition
But that, in turn, also has limitations. “Synthetic data is often not realistic enough…