Under illumination variations, exploiting 3D image for ‘Face Authentication’ in Biometrics…
Automatic recognition of human faces is extremely useful in a wide area of applications, such as face identification for security and access control, surveillance of public places, mug shot matching and other commercial and law enforcement applications.
The majority of face recognition techniques employ 2D grayscale or color images. Only a few techniques have been proposed that are based on range or depth images. This is mainly due to the high cost of available 3D digitizers that makes their use prohibitive in real-world applications. Furthermore, these devices often do not operate in real time or produce inaccurate depth information.
A common approach to 3D face recognition is based on the extraction of 3D facial features by means of differential geometry techniques. Facial features invariant to rigid transformations of the face may be detected using surface curvature measures. The combination of 3D and gray-scale images has also been addressed, but there the 3D information is only used to aid feature detection and to compensate for the pose of the face.
The most important argument against techniques using a feature-based approach is that they rely on accurate 3D maps of faces, usually extracted by expensive off-line 3D scanners. Low-cost scanners, however, produce very noisy 3D data. The applicability of feature-based approaches to such data is questionable, especially if computation of curvature information is involved. Also, the computational cost of extracting the features (e.g. curvatures) is high. This hinders the application of such techniques in real-world security systems. The recognition rates claimed by the above 3D techniques were estimated using databases of limited size and without significant variations of the faces. Only recently was an experiment conducted with a database of significant size containing both grayscale and range images, producing comparative results of face identification using eigenfaces (a set of eigenvectors used in the computer vision problem of human face recognition) for 2D, 3D and their combination, and for varying image quality. This test, however, considered only frontal images with neutral expression, captured under constant illumination conditions.
Beyond combining 2D and 3D information, one could exploit depth information and prior knowledge of face geometry and symmetry to cope with background clutter, occlusion, face pose variation and harsh illumination conditions. Furthermore, unlike techniques that rely on an extensive training set to achieve high recognition rates, such an approach requires only a few images per person.
Acquisition of 3D and Color Images
A 3D and color camera capable of real-time acquisition of 3D images and associated color 2D images is employed. The 3D-data acquisition system, which uses an off-the-shelf CCTV color camera and a standard slide projector, is based on an improved and extended version of the well-known Coded Light Approach (CLA) for 3D-data acquisition. The basic principle behind this device is the projection of a color-encoded light pattern on the scene and the measurement of its deformation on the object surfaces. By rapidly alternating the color-coded light pattern with a white-light pattern, both color and depth images are acquired. The average depth accuracy achieved, for objects located about 1 meter from the camera, is less than 1 mm for an effective working space of 60 cm × 50 cm × 50 cm, while the resolution of the depth images is close to the resolution of the color camera.
The acquired range images contain artifacts and missing points, mainly over areas that cannot be reached by the projected light and/or over highly refractive (e.g. eye-glasses) or low reflective surfaces (e.g. hair, beard). Some examples of images acquired using the 3D camera can be seen in Figs. 1, 2, 3 and 4. Darker pixels in the depth map correspond to points closer to the camera and black pixels correspond to undetermined depth values.
Face Localization
A highly robust face localization procedure is proposed based on depth and color information. By exploiting depth information the human body may be easily separated from the background, while by using a priori knowledge of its geometric structure, efficient segmentation of the head from the body (neck and shoulders) is achieved. The position of the face is further refined using brightness information and exploiting face symmetry.
Separation of the body from the background is achieved by computing the histogram of depth values and estimating the threshold separating the two distinct modes. Segmentation of the head from the body relies on statistical modelling of the head-torso points in 3D space.
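The background separation step can be illustrated with a small sketch. The text does not say how the threshold between the two modes is estimated, so Otsu's criterion is used here as a stand-in. The function names, the bin count and the convention that a depth of 0 marks an undetermined pixel (the black pixels of the depth map) are assumptions made for illustration only.

```python
def depth_threshold(depths, bins=64):
    """Return a depth that separates the two modes of a depth histogram.

    Uses Otsu's criterion (an assumption; the text names no specific method).
    `depths` must contain only valid measurements.
    """
    lo, hi = min(depths), max(depths)
    if hi == lo:
        return lo  # a single depth value: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for d in depths:
        hist[min(int((d - lo) / width), bins - 1)] += 1

    total = len(depths)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_score, best_bin = -1.0, 0
    count_near, sum_near = 0, 0
    for i in range(bins - 1):
        count_near += hist[i]
        sum_near += i * hist[i]
        count_far = total - count_near
        if count_near == 0 or count_far == 0:
            continue
        mean_near = sum_near / count_near
        mean_far = (sum_all - sum_near) / count_far
        # Between-class variance, up to a constant factor.
        score = count_near * count_far * (mean_near - mean_far) ** 2
        if score > best_score:
            best_score, best_bin = score, i
    return lo + (best_bin + 1) * width


def body_mask(depth_map, undetermined=0):
    """Mark pixels nearer than the threshold as body (True).

    Assumes smaller values are closer to the camera and that
    `undetermined` (hypothetical default 0) marks missing depth.
    """
    valid = [d for row in depth_map for d in row if d != undetermined]
    t = depth_threshold(valid)
    return [[d != undetermined and d < t for d in row] for row in depth_map]
```

Pixels with undetermined depth are excluded from the histogram and from the body mask, so that missing data does not form a spurious third mode.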
The probability distribution of a 3D point x is modelled as a mixture of two Gaussians:
P(x) = P(head)P(x|head) + P(torso)P(x|torso)
In this case, the parameters of the two components may be obtained by exploiting prior knowledge of the body geometry.
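A minimal sketch of fitting this two-component mixture is given below. The fitting procedure is not specified in the text, so expectation-maximization (EM) with axis-aligned (diagonal) covariances is assumed here. The initialisation stands in for the "prior knowledge of the body geometry": the highest quarter of the points is initially labelled as head. The function names, the y-up coordinate convention and the `head_fraction` parameter are hypothetical.

```python
import math


def gaussian_density(x, mean, var):
    """Density of an axis-aligned (diagonal covariance) Gaussian at point x."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return p


def fit_head_torso(points, head_fraction=0.25, iterations=30, min_var=1e-6):
    """Fit P(x) = P(head)P(x|head) + P(torso)P(x|torso) with EM.

    points: list of (x, y, z) tuples with y pointing up (assumption).
    Returns (prior, mean, var) for the head, (prior, mean, var) for the
    torso, and the head responsibility of each input point.
    """
    n = len(points)
    # Initialisation from body geometry (assumption): the highest points are head.
    order = sorted(range(n), key=lambda i: points[i][1], reverse=True)
    resp = [0.0] * n
    for i in order[:max(1, int(n * head_fraction))]:
        resp[i] = 1.0

    components = []
    for _ in range(iterations):
        # M-step: weighted prior, mean and per-axis variance of each component.
        components = []
        for weights in (resp, [1.0 - r for r in resp]):
            total = max(sum(weights), 1e-12)
            mean = [sum(w * p[d] for w, p in zip(weights, points)) / total
                    for d in range(3)]
            var = [max(min_var,
                       sum(w * (p[d] - mean[d]) ** 2
                           for w, p in zip(weights, points)) / total)
                   for d in range(3)]
            components.append((total / n, mean, var))
        # E-step: posterior probability that each point belongs to the head.
        (p_head, m_head, v_head), (p_torso, m_torso, v_torso) = components
        new_resp = []
        for p in points:
            a = p_head * gaussian_density(p, m_head, v_head)
            b = p_torso * gaussian_density(p, m_torso, v_torso)
            new_resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        resp = new_resp
    return components[0], components[1], resp
```

A point can then be assigned to the head when its responsibility exceeds 0.5. Full covariance matrices would capture the orientation of the head and torso better; the diagonal form is used only to keep the sketch short.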
The above clustering procedure yields inaccurate results when biased by erroneous depth estimates, i.e. occluded parts of the face. Therefore, a second step is required that refines the localization using brightness information. The aim of this step is the localization of the point that lies in the middle of the line segment defined by the centers of the eyes. Then, an image window containing the face is centered around this point, thus achieving approximate alignment of facial features in all images, which is very important for face classification. The technique proposed exploits the highly symmetric structure of the face. The estimation of the horizontally oriented axis of bilateral symmetry between the eyes is sought first. Then, the vertically oriented axis of bilateral symmetry of the face is estimated. The intersection of these two axes defines the point of interest.
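The symmetry search can be illustrated as follows. The text does not describe how the axes are estimated, so this sketch uses a simple stand-in: for each candidate column, brightness values mirrored about that column are compared, and the column with the smallest total difference is taken as the vertically oriented axis. The function name and the `half_width` parameter are hypothetical.

```python
def vertical_symmetry_axis(gray, half_width):
    """Return the column about which the image is most mirror-symmetric.

    gray: list of rows of brightness values.
    half_width: number of columns compared on each side of a candidate axis.
    The mirror-difference cost is an illustrative assumption, not the
    estimator used in the text.
    """
    width = len(gray[0])
    best_col, best_cost = None, float("inf")
    for c in range(half_width, width - half_width):
        cost = 0.0
        for row in gray:
            for d in range(1, half_width + 1):
                cost += abs(row[c - d] - row[c + d])
        if cost < best_cost:
            best_col, best_cost = c, cost
    return best_col
```

The same function applied to the transposed image gives a row-wise axis, and the intersection of the two axes can then be used as the centre of the face window.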
Simulating Illumination
Another source of variation in facial appearance is the illumination of the face. The majority of the techniques proposed to cope with this problem exploit the low dimensionality of the face space under varying illumination conditions. They either use several images of the same person recorded under varying illumination conditions or rely on the availability of 3D face models and different maps to generate novel views. The main shortcoming of this approach is the requirement in practice of large example sets to achieve good reconstructions.
Our approach, on the other hand, builds an illumination-varying subspace by constructing artificially illuminated color images from an original image. This normally requires availability of surface gradient information, which in our case may be easily computed from depth data. Since it is impossible to simulate all types of illumination conditions, we try to simulate those conditions that have the greatest effect on face recognition performance. Heterogeneous shading of the face caused by a directional light coming from one side of the face was experimentally shown to be most commonly responsible for misclassification. Given the surface normal vector N computed over each point of the surface, the RGB color vector Ia of a pixel in the artificial view is given by:
Ia = Ic (ka + kd L · N)    (5)
where Ic is the corresponding color value in the original view, ka and kd weight the effect of ambient light and diffuse reflectance respectively, and L represents the direction of the artificial light source. Fig. 4 shows an example of artificially illuminated views.
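The relighting formula can be sketched in a few lines. Several details below are assumptions, not taken from the text: surface normals are estimated from central differences of the depth map, depth is expressed in the same units as the pixel spacing, depth grows away from the camera, L points from the surface toward the light, and the result is clamped to the 0..255 range. The function names and the default values of ka and kd are hypothetical.

```python
import math


def surface_normals(depth):
    """Unit normals facing the camera, from central differences of the depth map."""
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            row.append((dzdx / norm, dzdy / norm, -1.0 / norm))
        normals.append(row)
    return normals


def relight(color, depth, light_dir, ka=0.3, kd=0.7):
    """Apply Ia = Ic (ka + kd L . N) to every pixel.

    color: rows of (R, G, B) tuples; depth: rows of depth values.
    light_dir: vector from the surface toward the light (need not be unit).
    """
    length = math.sqrt(sum(c * c for c in light_dir))
    L = tuple(c / length for c in light_dir)
    normals = surface_normals(depth)
    out = []
    for y, row in enumerate(color):
        out_row = []
        for x, rgb in enumerate(row):
            n = normals[y][x]
            shade = ka + kd * sum(l * c for l, c in zip(L, n))
            # Clamping to the displayable range is an addition of this sketch.
            out_row.append(tuple(min(255, max(0, round(ch * shade))) for ch in rgb))
        out.append(out_row)
    return out
```

Calling `relight` with several light directions from the left and right of the face gives a set of artificial views of the kind used to build the illumination-varying subspace.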
Thus, an attacker can edit and adjust the lighting and angle of a ‘phony’ photo to ensure the system will accept it. Because the attacker does not know exactly what the face learnt by the system looks like, he has to create a large number of images; let us call this method of attack ‘Fake Face Brute Force’. This is easy to do today with a wide range of image editing programs. And let's face it, these systems are not ‘so’ secure.
So the biometric systems at hand combine ‘not so good’ image capturing devices (like those in Lenovo, Toshiba and Asus systems) with a local face localization method and a basic illumination technique, a combination that can simply be brute-forced through embedded programming techniques, essential path-breaking hardware and a wide range of image editing programs.
//Abhiraj