Integrating SharePoint and Microsoft’s New Face Identification APIs

Microsoft has a new set of APIs in beta for recognizing faces. You can sign up to leverage these APIs through the Azure Marketplace.

This provides your account with a subscription key you can use to send data to the API. The API is REST-based, but there is also a .NET SDK that wraps the REST layer in .NET objects.
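If you're curious what the SDK is wrapping, a raw REST call looks roughly like the sketch below. This is illustrative only: the detection endpoint URL is a placeholder for whatever URL your subscription documents, though the Ocp-Apim-Subscription-Key header is the standard Azure Marketplace convention.

using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class FaceRestSketch
{
    // POSTs raw image bytes to the face detection endpoint and returns the JSON response.
    static async Task<string> DetectFacesAsync(string subscriptionKey, byte[] imageBytes)
    {
        using (var client = new HttpClient())
        {
            // the subscription key from the Azure Marketplace goes in this header
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

            using (var content = new ByteArrayContent(imageBytes))
            {
                content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

                // placeholder URL - substitute the detection endpoint from your subscription
                var response = await client.PostAsync("https://api.projectoxford.ai/face/v0/detections", content);
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync(); // JSON array of detected faces
            }
        }
    }
}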

I wanted to develop a demo that leveraged the document management power of SharePoint and integrated it with the face identification capabilities of this new API. The demo code I’m going to describe is posted to GitHub here.

The Scenario

Many organizations have pictures of people in SharePoint: company photos, pictures of customers, and so on. These pictures are typically uploaded into image libraries and then manually tagged. The tags are what make the pictures searchable – without a tag, SharePoint cannot filter or refine a search to find pictures of a specific person.

Could we write a piece of software that would grab the pictures from a document library, send the images to the Microsoft Face API, identify the faces found in the pictures, and then update the tags automatically? With a little bit of coding, I built such a demo solution. At a high level, the flow looks like the sketch below, using the methods described in the steps that follow.
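This is a sketch, not the exact wiring in the repository: SharePointService is my shorthand for the SharePoint service described below, FaceTagger is the service that talks to the Face API, and the person group ID is assumed to be configured elsewhere.

// Rough end-to-end flow of the demo, using the methods walked through in Steps 2-4.
// SharePointService and personGroupID are illustrative names.
var sharePoint = new SharePointService();
var faceTagger = new FaceTagger();
string personGroupID = "myfamily"; // hypothetical person group ID

// Step 2: load the known, tagged photos and train the Face API.
Dictionary<string, PhotoPerson> trainingPhotos = sharePoint.getTrainingPhotos();
await faceTagger.addPhotosToTrainingGroup(trainingPhotos, personGroupID);

// Step 3: fetch the untagged photos and identify the faces in them.
List<Photo> photosToTag = sharePoint.getPhotosToTag();
await faceTagger.identifyPhotosInGroup(personGroupID, photosToTag);

// Step 4: write the matches back to SharePoint.
sharePoint.updateTaggedPhotosWithMatchedPeople(photosToTag);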

Step #1: Establishing a Domain Model

Using a Domain-Driven Design approach, I wanted a set of classes that could act as the domain model. For this domain, we need two basic objects: 1) a person and 2) a photo. A person and a photo have a many-to-many relationship, i.e. a person can have a list of photos, and a photo can contain multiple faces and therefore link to multiple people.

/// <summary>
/// Value object representing a Photo.
/// </summary>
public class Photo
{
    public byte[] Image { get; set; }
    public string ID { get; set; }

    public List<string> TextInPhoto { get; set; }
    public string LanguageDetectedInPhoto { get; set; }

    public int NumberOfMatchedFaces { get; set; }

    public int NumberOfUnmatchedFaces { get; set; }

    public List<PhotoPerson> PeopleInPhoto { get; set; }

    public Photo()
    {
        PeopleInPhoto = new List<PhotoPerson>();
        TextInPhoto = new List<string>();
    }
}

public class PhotoPerson
{
    public string Name { get; set; }
    public int ID { get; set; }
    public List<Photo> Photos { get; set; }

    public PhotoPerson(int ID)
    {
        this.ID = ID;
        Photos = new List<Photo>();
    }

    public PhotoPerson()
    {
        Photos = new List<Photo>();
    }

    public PhotoPerson(int ID, string Name)
    {
        this.ID = ID;
        this.Name = Name;
        Photos = new List<Photo>();
    }
}

The benefit of this approach is a neutral, SharePoint-agnostic domain model that we can use to pass data between the SharePoint service, responsible for pulling and pushing data in and out of SharePoint, and the FaceTagger service, responsible for pulling and pushing data in and out of the Microsoft Face API. Neither class depends on the other, which makes unit testing simple.
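For example, a test can exercise the many-to-many relationship without touching SharePoint or the Face API at all. A quick sketch using MSTest (the names are hypothetical test data):

using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class DomainModelTests
{
    [TestMethod]
    public void PhotoCanLinkToMultiplePeople()
    {
        // build the domain objects directly - no SharePoint or Face API required
        var photo = new Photo { ID = "1" };
        var alice = new PhotoPerson(1, "Alice");
        var bob = new PhotoPerson(2, "Bob");

        // link both people to the photo and the photo back to each person
        photo.PeopleInPhoto.Add(alice);
        photo.PeopleInPhoto.Add(bob);
        alice.Photos.Add(photo);
        bob.Photos.Add(photo);

        Assert.AreEqual(2, photo.PeopleInPhoto.Count);
        Assert.AreSame(photo, alice.Photos.Single());
    }
}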

Step #2: Creating a Training Set

The first step in creating a matching algorithm is to create a training set – the master list of photos whose subjects we already know. In this case, I used a picture of myself and my two children as a training set, stored in a SharePoint image library.

You can have multiple images of the same person if you wish; this makes the training set more effective at matching because the API can combine and average the characteristics found across multiple photos.

To pull the data from SharePoint, I used the client-side SharePoint object model (CSOM) to fetch the images from the library.

public Dictionary<string, PhotoPerson> getTrainingPhotos()
{
    Dictionary<string, PhotoPerson> trainingPhotos = new Dictionary<string, PhotoPerson>();

    using (ClientContext context = Login(SharePointURL))
    {
        var list = context.Web.GetList(TrainingListURL);
        var query = CamlQuery.CreateAllItemsQuery();

        ListItemCollection items = list.GetItems(query);
        context.Load(items, includes => includes.Include(
            i => i[TrainingPersonIdColumn],
            i => i[TrainingFileColumn],
            i => i[TrainingIdColumn]));

        // execute the query to fetch the list items
        context.ExecuteQuery();

        // at this point we have the list items but not their content;
        // each file has to be downloaded separately:
        foreach (ListItem item in items)
        {
            PhotoPerson person = null;
            if (item[TrainingPersonIdColumn] != null)
            {
                string fullName = (string)item[TrainingPersonIdColumn];

                // look for an existing person, or create one for this name
                if (trainingPhotos.ContainsKey(fullName))
                {
                    person = trainingPhotos[fullName];
                }
                else
                {
                    person = new PhotoPerson();
                    person.Name = fullName;
                    person.ID = item.Id;
                    trainingPhotos.Add(fullName, person);
                }

                // get the URL of the file:
                var fileRef = item[TrainingFileColumn];

                // get the file contents:
                FileInformation fileInfo = Microsoft.SharePoint.Client.File.OpenBinaryDirect(context, fileRef.ToString());

                using (var memory = new MemoryStream())
                {
                    byte[] buffer = new byte[1024 * 64];
                    int nread = 0;
                    while ((nread = fileInfo.Stream.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        memory.Write(buffer, 0, nread);
                    }
                    memory.Seek(0, SeekOrigin.Begin);

                    Photo photo = new Photo();
                    photo.ID = item.Id.ToString();
                    photo.Image = memory.ToArray();
                    person.Photos.Add(photo);
                }
            }
        }
    }
    return trainingPhotos;
}

The important part is that the photos are organized by person; here, the full name field serves as the key that links photos of the same person together. The result is a Dictionary of people and their related photos.

Once we have our training photos loaded, we can push them to Microsoft’s Face API. This is done through the Face Client SDK.

public async Task addPhotosToTrainingGroup(Dictionary<string, PhotoPerson> Photos, string PersonGroupID)
{
    IFaceServiceClient faceClient = new FaceServiceClient(SubscriptionKey);

    // Get the group and add photos to the group.
    // The input dictionary is organized by person name. The output dictionary is
    // organized by the person GUID returned by the API.
    await faceClient.GetPersonGroupAsync(PersonGroupID);

    // a training set can have multiple pictures per person (more pictures make the training more effective).
    // each photo is added as a Face object within the Face API and attached to a person.
    foreach (PhotoPerson person in Photos.Values)
    {
        Person p = new Person();
        p.Name = person.Name;

        List<Guid> faceIDs = new List<Guid>();

        foreach (Photo photo in person.Photos)
        {
            Stream stream = new MemoryStream(photo.Image);
            Face[] face = await faceClient.DetectAsync(stream);

            // check for multiple faces - a training photo should contain exactly one.
            if (face.Length != 1)
                throw new FaceDetectionException("Expected to detect 1 face but found " + face.Length + " faces for person " + p.Name);
            else
                faceIDs.Add(face[0].FaceId);
        }

        Guid[] faceIDarray = faceIDs.ToArray();

        // create the person in the training group with the array of face IDs.
        CreatePersonResult result = await faceClient.CreatePersonAsync(PersonGroupID, faceIDarray, p.Name, null);
        p.PersonId = result.PersonId;
        TrainingPhotos.Add(p.PersonId, person);
    }

    await faceClient.TrainPersonGroupAsync(PersonGroupID);

    // poll until training completes
    while (true)
    {
        await Task.Delay(1000);
        var status = await faceClient.GetPersonGroupTrainingStatusAsync(PersonGroupID);
        if (status.Status != "running")
        {
            break;
        }
    }
}

This method creates a set of Person and Face objects, links them together, and pushes them up to the Face API. Once the whole set is uploaded, we tell the Face client to run the training algorithm, which analyzes these photos to create a master reference set of faces for identification.

Step #3: Identifying and Matching Faces

Now that we have a training set, we can identify faces and match them against it. I uploaded some unidentified pictures to see whether the Face API could recognize the people in them.

In this SharePoint list, I created the following columns to store the output of the matching algorithm:

  • Number of Matched Faces
  • Number of Unmatched Faces
  • Matched People

In a similar way to Step #2, we need a method for fetching the untagged images from SharePoint.

public List<Photo> getPhotosToTag()
{
    List<Photo> photos = new List<Photo>();

    using (ClientContext context = Login(SharePointURL))
    {
        var list = context.Web.GetList(PhotosToTagURL);
        var query = CamlQuery.CreateAllItemsQuery();

        ListItemCollection items = list.GetItems(query);
        context.Load(items, includes => includes.Include(
            i => i[PhotoFileColumn],
            i => i[PhotoIdColumn]));

        // execute the query to fetch the list items
        context.ExecuteQuery();

        // as before, the list items do not include their file content;
        // download each file separately:
        foreach (ListItem item in items)
        {
            Photo photo = new Photo();

            // get the URL of the file:
            var fileRef = item[PhotoFileColumn];

            // get the file contents:
            FileInformation fileInfo = Microsoft.SharePoint.Client.File.OpenBinaryDirect(context, fileRef.ToString());

            using (var memory = new MemoryStream())
            {
                byte[] buffer = new byte[1024 * 64];
                int nread = 0;
                while ((nread = fileInfo.Stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    memory.Write(buffer, 0, nread);
                }
                memory.Seek(0, SeekOrigin.Begin);

                photo.ID = item.Id.ToString();
                photo.Image = memory.ToArray();
                photos.Add(photo);
            }
        }
    }

    return photos;
}

The important information we need to track is the ID of the photo so that we can update it back into SharePoint once we’re finished processing.

Once we have our list of photos, we can now send these to the Face API for identification.

public async Task identifyPhotosInGroup(string PersonGroupID, List<Photo> Photos)
{
    IFaceServiceClient faceClient = new FaceServiceClient(SubscriptionKey);

    foreach (Photo photo in Photos)
    {
        photo.NumberOfMatchedFaces = 0;
        photo.NumberOfUnmatchedFaces = 0;
        photo.PeopleInPhoto.Clear();

        // convert image bytes into a stream
        Stream stream = new MemoryStream(photo.Image);

        // detect faces in the image (an image could have multiple faces in it)
        var faces = await faceClient.DetectAsync(stream);

        if (faces.Length > 0)
        {
            // match each face against the training group photos.
            var identifyResult = await faceClient.IdentifyAsync(PersonGroupID, faces.Select(ff => ff.FaceId).ToArray());
            for (int idx = 0; idx < faces.Length; idx++)
            {
                var res = identifyResult[idx];
                if (res.Candidates.Length > 0 && TrainingPhotos.Keys.Contains(res.Candidates[0].PersonId))
                {
                    // found a match, so link the photo to the original training person
                    photo.PeopleInPhoto.Add(TrainingPhotos[res.Candidates[0].PersonId]);
                    photo.NumberOfMatchedFaces += 1;
                }
                else
                {
                    // no match, so count the face as unmatched
                    photo.NumberOfUnmatchedFaces += 1;
                }
            }
        }
    }
}

The Face API takes each photo, detects the faces in it, and then proposes candidates from the training set that could be a match. Once a match is found, we link the photo to the original person from the training set.
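One refinement worth considering: each candidate returned by IdentifyAsync carries a confidence score, so rather than accepting the top candidate outright you could filter out weak matches. A sketch of what the loop body in identifyPhotosInGroup might look like with a threshold (the 0.7 cutoff is an arbitrary choice, and this assumes the SDK’s Candidate object exposes a Confidence property):

// pick the strongest candidate above a minimum confidence, if any
var res = identifyResult[idx];
var best = res.Candidates
    .OrderByDescending(c => c.Confidence)
    .FirstOrDefault(c => c.Confidence >= 0.7);

if (best != null && TrainingPhotos.Keys.Contains(best.PersonId))
{
    photo.PeopleInPhoto.Add(TrainingPhotos[best.PersonId]);
    photo.NumberOfMatchedFaces += 1;
}
else
{
    // no sufficiently confident match - count the face as unmatched
    photo.NumberOfUnmatchedFaces += 1;
}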

Step #4: Tagging Photos in SharePoint

Now that we have our matches identified, we can update our photos in SharePoint with the new information.

public void updateTaggedPhotosWithMatchedPeople(List<Photo> Photos)
{
    using (ClientContext context = Login(SharePointURL))
    {
        SP.List list = context.Web.GetList(PhotosToTagURL);

        foreach (Photo photo in Photos)
        {
            ListItem item = list.GetItemById(photo.ID);
            item[PhotoNumberOfFacesColumn] = photo.NumberOfMatchedFaces;
            item[PhotoNumberOfUnMachedFacesColumn] = photo.NumberOfUnmatchedFaces;

            // build a multi-valued lookup from the matched people
            FieldLookupValue[] matchedPeople = new FieldLookupValue[photo.PeopleInPhoto.Count];
            for (int i = 0; i < photo.PeopleInPhoto.Count; i++)
            {
                FieldLookupValue value = new FieldLookupValue();
                value.LookupId = photo.PeopleInPhoto[i].ID;
                matchedPeople[i] = value;
            }
            item[PhotoMatchedPeopleColumn] = matchedPeople;
            item.Update();
            context.ExecuteQuery();
        }
    }
}

The trickiest part is updating the lookup column with the matched people. Remember that each photo could contain multiple people, so the column needs to support multiple values. The SharePoint CSOM API supports such an update through a special object called a FieldLookupValue. In addition, we update the list with the number of matched faces and the number of unmatched faces.

Here is the result of the whole process!

Conclusion

The Face API, while in beta, is already pretty good at matching faces. All of these pictures are of the same family members, and the Face API was still able to distinguish between me, my daughter Katie and my son Geoffrey. I also tried uploading a picture of Katie from about 10 years earlier, and the Face API was still able to identify her successfully. The third picture has all three of us huddled together, and the Face API identified all three of us distinctly.

However, there were a few photos where the Face API struggled to identify faces.

This one was identified as not a match even though it is one.

In this picture, the Face API failed to detect any face at all.

So clearly there is still some research work to do to achieve near-perfect results. However, for corporate contexts, where pictures tend to be more formal, the algorithm seems to work really well.

Please note also that the current Face API only supports 20 transactions per minute and 5,000 transactions per month, so you won’t be able to run this program across hundreds of images – if you try, you’ll get an exception. The API is still in the research stage, so it isn’t available as a production service yet.
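If you do need to process a larger batch, one crude workaround is to pace your calls on the client side so you stay under the per-minute limit (the monthly cap still applies regardless). A sketch – each photo costs at least two transactions, one Detect and one Identify, so the pacing below is deliberately conservative:

// space out the per-photo API calls to stay under 20 transactions per minute
const int maxTransactionsPerMinute = 20;
TimeSpan minInterval = TimeSpan.FromSeconds(60.0 * 2 / maxTransactionsPerMinute); // ~6s per photo (2 calls each)

foreach (Photo photo in photosToTag)
{
    // ... DetectAsync and IdentifyAsync calls as shown in Step #3 ...
    await Task.Delay(minInterval); // crude pacing; retry-on-throttle would be more robust
}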
