Over the past few decades, obesity has become a serious problem in modern life. Obesity is associated with many chronic diseases that are among the leading causes of death, including diabetes, heart disease, stroke, and cancer. The most effective way to prevent obesity is dietary control, i.e., knowing what one eats, including its nutrient and calorie content. To help users understand the food intake of each meal, this thesis develops a food recognition system that analyzes the composition of a meal from a provided image. This thesis also introduces a newly collected dataset for food recognition, the Ville Cafe Dataset.
The system is built on a Mask R-CNN network with a postprocessing mechanism. Mask R-CNN is composed of a backbone, a RoIAlign layer, and a head. The backbone first applies a ResNet101-FPN structure to extract features at different levels. These features are then fed to a Region Proposal Network (RPN) to locate food regions, or Regions of Interest (RoIs), in the image. The RoIAlign layer resizes the RoIs using bilinear interpolation and feeds them to the Mask R-CNN head. The head then classifies the food category and predicts the bounding box and mask of each food item. After obtaining the region and category of each food item, the system estimates the weight of the food using a linear regression model. This thesis also proposes a postprocessing mechanism that refines the extracted bounding boxes and masks to improve both the analysis and the visualization.
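The weight estimation step can be illustrated with a minimal sketch. It assumes that the regression input is the pixel area of the predicted mask, which the abstract does not state, and the training pairs below are made-up numbers for illustration only. The helper names `fit_line`, `mask_area`, and `estimate_weight` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mask_area(mask):
    """Count foreground pixels in a binary mask given as rows of 0/1."""
    return sum(sum(row) for row in mask)


# Illustrative (made-up) training pairs for one food category:
# mask area in pixels and measured weight.
areas = [1000, 2000, 3000, 4000]
weights = [25.0, 45.0, 65.0, 85.0]
slope, intercept = fit_line(areas, weights)


def estimate_weight(mask):
    """Predict weight from mask area (the area feature is an assumption)."""
    return slope * mask_area(mask) + intercept
```

Fitting one line per food category, as sketched here, matches the per-category sample counts reported for the weight experiment, but the actual features and model layout of the thesis may differ.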
To estimate calories and nutrients accurately, the system draws on data provided by the Ministry of Health and Welfare and the United States Department of Agriculture (USDA). Using this information, the system reports the estimated calories and nutrients based on the computed food weight and the recognition results.
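A minimal sketch of this lookup step, assuming the reference data is expressed per 100 g of food (a common convention, but not confirmed by the abstract). The table entry `example_food` and its values are placeholders, not figures from either database, and the names `Nutrients`, `PER_100G`, and `nutrients_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Nutrients:
    kcal: float
    protein_g: float
    fat_g: float
    carbohydrate_g: float


# Placeholder per-100 g values for illustration only (not real database data).
PER_100G = {
    "example_food": Nutrients(kcal=200.0, protein_g=10.0, fat_g=5.0, carbohydrate_g=30.0),
}


def nutrients_for(category, weight_g):
    """Scale the per-100 g reference values by the estimated weight in grams."""
    ref = PER_100G[category]
    factor = weight_g / 100.0
    return Nutrients(
        kcal=ref.kcal * factor,
        protein_g=ref.protein_g * factor,
        fat_g=ref.fat_g * factor,
        carbohydrate_g=ref.carbohydrate_g * factor,
    )
```

For example, `nutrients_for("example_food", 50.0)` yields half of each per-100 g value.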
To evaluate the system, the experiments use two datasets: the Food-256 dataset and the Ville Cafe Dataset. The Ville Cafe Dataset contains 16 categories with 35842 images for each category. The model is first trained on a training set that mixes Food-256 and Ville Cafe to recognize 16 categories of food, including salad, fruit, toast, egg, sausage, chicken cutlet, bacon, french toast, omelette, hash browns, pancake, ham, hamburger, sandwich, and french fries. The training set contains 1278 food images with 6096 food items. For testing, 686 food images with 3680 food items are used for evaluation. On the mixture of the Ville Cafe Dataset and the Food-256 dataset, the food recognition accuracy is 99.86% and the IoU is 97.17%. The food weight estimation experiment covers eight categories: salad, fruit, toast, sausage, bacon, ham, hamburger, and french fries. The categories use 40, 40, 44, 40, 41, 49, 40, and 40 samples respectively, a total of 320 samples, to fit the linear regression model. In the experimental results, the average absolute error is 8.22 and the average relative error is 0.13.
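The reported metrics can be sketched under their common definitions. The abstract does not give the exact formulas used in the thesis, so the following is an assumption: IoU is computed on binary masks, and the two weight errors are the mean of |predicted - actual| and the mean of |predicted - actual| / actual. The function names are hypothetical.

```python
def mask_iou(pred, truth):
    """Intersection over union of two binary masks given as rows of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all samples."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mean_relative_error(predicted, actual):
    """Average of |predicted - actual| / actual; assumes actual values are nonzero."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)
```

With predictions `[110.0, 90.0]` against actual weights `[100.0, 100.0]`, these definitions give an absolute error of 10.0 and a relative error of 0.1.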
Title
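Application of Neural Network Models to the Analysis of Food Calories and Nutritional Composition (類神經網路模型應用於食品熱量與營養成份分析)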
ProQuest Dissertations & Theses
Source type
Dissertation or Thesis
Language of publication
Chinese
ProQuest document ID
2450188083
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.