Computer Vision and Abnormal Patient Gait: A Comparison of Methods

Abnormal gait, falls, and their associated complications carry high morbidity and mortality. Computer vision can detect and predict gait abnormalities, assess fall risk, and serve as a clinical decision support tool for physicians. This paper presents a systematic review of computer vision and machine learning techniques used to analyse abnormal gait. The literature describes a range of machine learning and pose estimation algorithms used in gait analysis, including part affinity fields, the pictorial structures model, hierarchical models, sequential-prediction-framework-based approaches, convolutional pose machines, the Gait Energy Image (GEI), 2-directional 2-dimensional principal component analysis ((2D)2PCA and 2G(2D)2PCA), the Enhanced Gait Energy Image (EGEI), SVM, ANN, K-Star, Random Forest, and KNN, which are used to classify the features extracted from images of patient gait abnormalities.


INTRODUCTION
Abnormal patient gait, falls, and their associated complications carry high morbidity and mortality [1]. Falls, and their high costs to the healthcare system, are largely preventable. Preventable complications include hip fractures, medical deconditioning, myocardial infarction, and pulmonary emboli. These complications are devastating in the elderly population [2]. Advances in algorithms and low-cost sensors in the healthcare market can help prevent falls and their complications [3].
Computer vision can assess fall risk and gives physicians an opportunity to outline an early treatment plan, thereby limiting morbidity and mortality. Computer vision is also used to assess gait in disorders such as dementia, depression, intellectual disability, musculoskeletal disorders, and stroke [4-7]. These conditions are managed in the fields of neurology, physical medicine and rehabilitation, rheumatology, and orthopaedics [8]. Computer vision also assesses postural abnormalities; the parameters it measures inform strength and endurance plans for patients during their treatment course [9].
Clinicians provide a subjective assessment of gait. This subjectivity affects diagnosis and treatment decisions and, in turn, patient outcomes [10]. Computer vision strengthens physicians' decisions and provides an in-depth quantitative analysis.
Sorting: It recognizes and identifies parts.
Inspection: It detects, identifies, and classifies parts.
Within the past decade, extensive research has been conducted on video-based human motion capture. Various machine learning and computer vision techniques have been proposed for pose estimation and 3D human motion tracking [14]. Corazza et al. [15] developed a video-based technique that measures joint kinematics during gait.

Machine Learning
With the emergence of deep learning, DNN-based techniques have become the standard in vision tasks such as human motion tracking, pose estimation [16], human activity recognition [17], and face recognition [18]. A deep CNN architecture consists of multiple layers between the input and the output, which lets it model complex non-linear relationships in data. DNN models for 3D human pose estimation focus on a single view with a complex background setting [14,19]. Machine learning models using Logistic Regression [20], Artificial Neural Networks (ANN) [21], K-Star [22], Random Forest [23], K-nearest neighbours (KNN) [24], and Support Vector Machines (SVM) [25] can identify and classify patterns of gait and thus provide valuable insight into medical conditions [16].
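As an illustration of how such classifiers label gait patterns, the following minimal Python sketch implements a k-nearest-neighbours vote over two hypothetical gait features. The feature choice (stride length and cadence), the sample values, and the function name knn_predict are assumptions made for illustration only; they are not taken from the reviewed studies.

```python
from collections import Counter
from math import dist


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Rank training samples by Euclidean distance to the query vector.
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    # Count the labels of the k closest samples and take the most frequent one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical samples: (stride length in metres, cadence in steps per minute).
# The numbers are invented for illustration and are not clinical data.
train_features = [
    (1.40, 112.0), (1.35, 108.0), (1.42, 115.0),
    (0.80, 85.0), (0.75, 90.0), (0.85, 80.0),
]
train_labels = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]

prediction = knn_predict(train_features, train_labels, (0.78, 88.0))
print(prediction)
```

In this sketch the features are left unscaled, so cadence dominates the distance; a practical pipeline would normally standardise each feature before computing distances.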

Pose Estimation
The human pose estimation method detects an individual's trunk, joints, and other body parts [26]. The technique detects body parts in images from a video or an image detector and describes the anatomic details [27]. These images are processed by an algorithm, and the pose estimation method generates key skeletal points that serve as coordinates. Human pose estimation is important because it predicts human posture and supports gait recognition, character tracking, and action and behaviour recognition [28]. Similarly, the part affinity field method is another groundbreaking computer vision technique, able to detect the 2D poses of multiple people in the wild with high accuracy [29].
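Because key skeletal points are expressed as coordinates, simple gait measures can be derived from them geometrically. The following minimal sketch computes the angle at a joint from three 2D keypoints. The function name joint_angle and the hip, knee, and ankle coordinates are hypothetical and do not come from any of the cited pose estimators.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must not coincide with the joint")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical 2D keypoints (x, y) in arbitrary image units.
hip, knee, ankle = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)
knee_angle = joint_angle(hip, knee, ankle)
print(round(knee_angle, 1))
```

Tracking such an angle frame by frame would give a joint trajectory over the gait cycle, which is one way keypoint coordinates can feed a downstream classifier.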
Pictorial structures models [30-32] express spatial relationships between body parts as tree-structured graphical models with kinematic priors. These models couple connected limbs and make up the classic articulated pose estimation technique. They make characteristic mistakes, such as counting image data twice; this happens because the tree-structured model does not capture some of the connections between variables. These errors occur even on the high-quality limb images used with the pictorial structures model [33].
Hierarchical models [34,35] represent how parts relate at various sizes and scales in a hierarchical tree structure. These models assume that discriminative image structures, which are easier to detect, help locate small, hard-to-locate parts within the entire limb system.
Non-tree models [36,37] incorporate interactions that introduce loops, augmenting the tree structure with additional edges that capture long-range, occlusion, and symmetry relationships. These methods require inference at both learning time and test time. As a result, they trade off accurate modelling of spatial relationships against fast, efficient inference, often with a simple parametric form.
Sequential-prediction-framework-based approaches [38] learn an implicit spatial model with complicated relationships between variables by training an inference procedure directly to achieve the desired output [39,40].
Recently, articulated pose estimation [41,42] combined with a convolutional architecture has gained popularity in the computer vision community. One method [43] uses a convolutional architecture to regress the Cartesian coordinates directly [44]. Other work regresses an image to a confidence map and then resorts to graphical models, which rely on hand-designed energy functions or heuristic initialization of spatial probability priors, to remove outliers from the regressed confidence maps; some of these methods also use a dedicated network module for precision refinement [45,46]. Feeding the regressed confidence maps into further convolutional networks with large receptive fields removes the need for hand-designed priors and attains a high level of performance across the entire precision region. Furthermore, this approach needs neither careful initialization nor dedicated precision refinement. A network module with a large receptive field has also been used to capture implicit spatial models [47]. Because convolutions are differentiable, the model we reviewed can be trained globally, which brings the advantages of joint training [42].
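To make the difference between coordinate regression and confidence maps concrete: a confidence map assigns a score to every image location, and a keypoint can be read out as the location with the highest score. The sketch below shows only this read-out step on a toy map. The function name keypoint_from_confidence_map and the map values are assumptions for illustration, not part of any cited network.

```python
def keypoint_from_confidence_map(conf_map):
    """Return (row, col, score) of the highest-scoring cell in a 2D map."""
    best = None
    for r, row in enumerate(conf_map):
        for c, score in enumerate(row):
            # Keep the first cell seen with the strictly highest score.
            if best is None or score > best[2]:
                best = (r, c, score)
    if best is None:
        raise ValueError("confidence map is empty")
    return best


# Toy 3x3 confidence map for a single body part (values are invented).
toy_map = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.3],
    [0.0, 0.1, 0.0],
]
row, col, score = keypoint_from_confidence_map(toy_map)
print(row, col, score)
```

Unlike a single regressed coordinate pair, the map retains the scores of competing locations, which is the information later network stages can use to resolve ambiguous detections.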
The work of [48] trains a deep network that uses error feedback. It also uses a Cartesian representation, which cannot preserve spatial uncertainty and therefore reduces accuracy in the high-precision regime [49].
Convolutional Pose Machines (CPMs) have been applied to the task of articulated pose estimation. CPMs inherit the benefits of the pose machine architecture [38]: tight integration of learning and inference, implicit learning of long-range dependencies between image and multi-part cues, and a modular sequential design. They combine these with the benefits of convolutional architectures: the ability to learn feature representations for image and spatial context directly from data, a differentiable architecture that makes joint training with backpropagation possible, and the capability of handling large training datasets efficiently. A CPM consists of a series of convolutional networks that produce 2D maps for the location of each part. The enhanced CPM model [57] uses more complex network structures and deeper network layers to improve low-level feature extraction, and then applies fine-tuning. The enhanced CPM has been shown to deliver excellent image detection and high image classification accuracy, which makes it a good human pose estimation model for designing a new network and applying fine-tuning to increase the efficiency of human pose estimation.

Gender Recognition
The Gait Energy Image (GEI) is a spatiotemporal gait representation that characterises human walking behaviour for individual recognition [58]. The findings show that the GEI approach is effective for individual recognition and that its performance is competitive [58,59]. The GEI approach has also been surveyed for individual recognition, with researchers presenting it through various techniques and methods. Their findings indicate that the system's real-time performance improved; hence, real-world application is possible [60]. The work further combined automated approaches with psychological methods to improve the accuracy of gait-based gender classification. According to this research, the major body parts for gender recognition, compared with other parts of the body, are the chest, back, hair, and head. Although application is impeded by differences in how people appear, such as changes of shoes and clothes or carrying objects, gait classification is possible in a controlled environment.
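The GEI is commonly computed as the pixel-wise average of aligned, size-normalised binary silhouettes over a gait cycle. The following is a minimal sketch under that assumption, with silhouettes given as nested lists of 0 and 1. The function name gait_energy_image and the tiny 2x2 frames are hypothetical; real silhouettes would first need segmentation and alignment, which are not shown.

```python
def gait_energy_image(silhouettes):
    """Pixel-wise mean of equally sized binary silhouette frames."""
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    height, width = len(silhouettes[0]), len(silhouettes[0][0])
    for frame in silhouettes:
        # All frames must share the same shape after normalisation.
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same shape")
    n = len(silhouettes)
    return [
        [sum(frame[y][x] for frame in silhouettes) / n for x in range(width)]
        for y in range(height)
    ]


# Two invented 2x2 frames: 1 marks a body pixel, 0 marks background.
frames = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
]
gei = gait_energy_image(frames)
print(gei)
```

Pixels that are always part of the body average to 1, pixels that move in and out of the silhouette take intermediate values, and background stays at 0, so a single image summarises both shape and motion.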
The classification of human behaviour using 2-directional 2-dimensional principal component analysis ((2D)2PCA and 2G(2D)2PCA) and the Enhanced Gait Energy Image (EGEI) is proposed in [61]. The experimental results showed that the algorithm is simple and achieves higher classification accuracy within a short time. Another system uses silhouette-based gait classification to recommend books to visitors according to their age or gender in real time [62]. A Support Vector Machine (SVM) achieved 77.5% classification accuracy in [63]; that work combines the Denoised Energy Image (DEI) and GEI approaches in preprocessing and presents the initial design for gender recognition, its outcomes, and the training and feature extraction from the walking-movement experiment. This method may provide high real-time accuracy.
A method that integrates information from multi-view gait at the feature level is proposed in [64]; it improves the performance of gender classification based on multi-view gait. Gait was also used for human recognition in [65]. That research team describes gait information features as gait image features based on information set theory. The information set concept was applied to the frames in a gait cycle, and two features, the Gait Information Image with Energy Feature (GII-EF) and the Gait Information Image with Sigmoid Feature (GII-SF), were extracted to derive the proposed Gait Information Image (GII). Gait was identified using a Nearest Neighbour (NN) classifier. Another research team [66] studied gender recognition using a robust feature-level fusion of directional vectors: forward-diagonal, backward-diagonal, vertical, and horizontal. For each image sequence, they first construct 1) the Gait Energy Image (GEI) and then 2) the Gradient Gait Energy Image (GGEI), which is obtained by neighbourhood gradient computation [67]. The differences in all four directions were then used as discriminative gait features. An SVM was used for classification, and the large multi-view CASIA-B (Chinese Academy of Sciences) dataset was used to test the model. The investigators report that their study outcomes were beneficial.
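The exact gradient formulation used in [66,67] is not given in this review, so the following is only a rough sketch of the general idea of four-direction differences on a grey-level image such as a GEI. The function name directional_differences, the neighbour offsets, the naming of the two diagonals, and the use of absolute differences are all assumptions made for illustration.

```python
def directional_differences(image):
    """Absolute difference between each pixel and one neighbour per direction.

    Returns a dict of four 2D lists. Pixels whose neighbour falls outside
    the image keep a difference of 0.0.
    """
    if not image or not image[0]:
        raise ValueError("image must be a non-empty 2D list")
    # (row offset, column offset) of the neighbour used for each direction.
    # Which diagonal is called "forward" or "backward" is an assumption here.
    offsets = {
        "horizontal": (0, 1),
        "vertical": (1, 0),
        "backward_diagonal": (1, 1),
        "forward_diagonal": (1, -1),
    }
    height, width = len(image), len(image[0])
    features = {}
    for name, (dy, dx) in offsets.items():
        plane = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    plane[y][x] = abs(image[y][x] - image[ny][nx])
        features[name] = plane
    return features


# Invented 2x2 grey-level image with values in [0, 1].
toy_image = [[0.0, 1.0], [0.5, 0.0]]
diffs = directional_differences(toy_image)
print(diffs["horizontal"])
```

Under these assumptions, the four planes could be flattened and concatenated into one feature vector before being passed to a classifier such as an SVM, which is one simple form of feature-level fusion.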
According to the literature review, the most widely used gait-based approaches to gender classification are currently the GEI and GII approaches. As a result, this research focuses on contrasting GII approaches with GEI approaches to present real-time gait-based gender classification [68]. The approach with the highest accuracy will benefit future research.

Search Criteria
The systematic review aimed to review published papers and academic journals in a step-by-step manner, including a systematic review of peer-reviewed academic journals. Online search engines and databases, namely IEEE Xplore, PubMed, Google Scholar, Cochrane, CINAHL, Medline, Web of Science, DBLP, and Embase, were used to search for literature. The primary keywords used for the search were computer vision, artificial intelligence, machine learning, deep learning, CNN, abnormal gait analysis, gait analysis, stroke, Parkinson's disease, and movement disorders.

Justification of the Selection
The preliminary search produced one hundred articles. We considered only 10 of them in this literature review. Of those 10 articles, only 5 were selected as related to this report's topic. The review covered the period from 2009 to 2019 to ensure that up-to-date literature was used. However, some earlier journal articles were occasionally selected.

Findings and Analysis
The key findings from the journals are provided in Table 1 [5]. Several measures were identified in the gait analyses to study patient abnormalities, some of which are listed in Table 2.

CONCLUSION
According to this brief literature review, several machine learning algorithms are used for classification, including SVM, K-Star, Random Forest, KNN, and DNN. Images and videos are widely used in the literature to capture human walking for gait analysis. Therefore, the use of computer vision technologies such as smartphone cameras and surveillance cameras is rapidly emerging. Limitations of this brief review include the lack of in-depth research on gait analysis and its functions and the lack of a detailed comparison of studies.
Future research needs databases with real-time data, as opposed to single gait data, and with fewer geographic and demographic restrictions [6]. Improving the accuracy of gait pattern recognition under variations in clothing needs further research [6]. Multi-view covariate data sequences are needed, since multiple view angles result in a lower error rate [6]. Segmentation of gait under unconstrained background conditions, which calls for adaptive background modelling, also needs further research [6]. Lastly, optimization of feature selection and reduction of the feature space are also needed in future research [6].

CONSENT FOR PUBLICATION
Not applicable.

FUNDING
None.

CONFLICT OF INTEREST
The author declares no conflict of interest, financial or otherwise.