AI on the Edge
It is widely known today that AI applications require substantial computing power and energy. The AlphaGo system, for example, used almost 2000 CPUs and 300 GPUs, resulting in a cost of roughly $3000 per game. Moreover, in the pursuit of better accuracy, capability and performance, deep learning models keep growing larger and more demanding. For example, the winner of the 2015 ImageNet image recognition challenge was 16 times larger than the winner from 2012, and the 2015 winner in speech recognition required ten times more training operations than the one from 2014. These facts encourage investment in developing efficient methods for reducing the tremendous memory demands of such models, improving the efficiency and lowering the computational cost of the inference process, and bringing machine learning (ML) even to the smallest and most power-efficient hardware devices.

Some market research reports project that the global edge AI software market will reach hundreds of millions of USD by the end of the first quarter of this century. This rapid growth is attributed mostly to the emergence of the 5G network. Many reports also predict that the video surveillance segment will have the largest market size, along with the autonomous vehicles, access management and predictive maintenance segments.

This thesis presents the concepts and a broad overview of edge AI and ML. Some useful and popular methods, as well as the available hardware and software infrastructure for enabling these technologies in the constrained environment of embedded systems, are also presented. The specific development of a human presence detection application is then described and tested.

In the ML industry, there are four basic categories of demand. The first does not need to be low cost and requires high-power computing; it covers the training of large ML systems needed for research and for exploring what can be done with ML.
The second category is the training of already-designed and deployed systems and models with new data, or the addition of new features, labels and objects that, for example, need to be recognized by an existing image recognition system. This consumes less power, can use lower-precision arithmetic, and is cheaper. The third category is running ML on powerful servers in data centres, used, for example, for newsfeeds on news and social media sites, for filtering search results, and so on. In this case, power consumption and latency are the main concerns, because the trend shows a fast-growing number of services using this technology and a growing number of users relying on them. The last category is embedded devices: cars, phones, smart cameras, etc. All these edge devices have less power available, a smaller memory footprint, and usually some form of hardware-accelerated arithmetic.
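The lower-precision arithmetic mentioned above is one of the main levers for shrinking a model's memory footprint on edge devices. As a minimal sketch (not taken from the thesis, and using hypothetical weight values), the following shows simple symmetric linear quantization of 32-bit floating-point weights to 8-bit integers, yielding a 4x reduction in storage at the cost of a small rounding error:

```python
import numpy as np

np.random.seed(0)

# Hypothetical weights of a small layer, stored as 32-bit floats.
weights_fp32 = np.random.uniform(-1.0, 1.0, size=1000).astype(np.float32)

# Symmetric linear quantization: map the largest magnitude to the int8 range.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.clip(np.round(weights_fp32 / scale), -128, 127).astype(np.int8)

# Dequantize to approximate the original values at inference time.
weights_restored = weights_int8.astype(np.float32) * scale

print(weights_fp32.nbytes)  # 4000 bytes
print(weights_int8.nbytes)  # 1000 bytes: a 4x reduction
print(np.max(np.abs(weights_fp32 - weights_restored)))  # small rounding error
```

Real deployment toolchains apply more elaborate schemes (per-channel scales, calibration data, quantization-aware training), but the underlying trade-off is the same: fewer bits per weight in exchange for a bounded loss of precision.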