Photo Object Detection with AWS Rekognition, the Fulcrum API, and .Net Core

August 8, 2019

One of the most important features of Fulcrum is the ability to capture photos and attach them to a record in the field. This ability adds a further level of ground truth to the rich information being captured in a Fulcrum mobile form.

Although photos are valuable on their own, it is possible to exploit them and derive further insights using image processing and computer vision tools. There are many hosted options, including the Google Vision API, the Microsoft Computer Vision API, AWS Rekognition, and Mapillary. Additionally, tools such as OpenCV and TensorFlow make it possible to integrate computer vision and image processing without external, hosted dependencies, although this can require a large amount of training data.

This post will describe how to build a simple console application, using C# and .Net Core, to extract a photo from Fulcrum and perform a simple object detection using the Amazon Web Services (AWS) Rekognition service. Rekognition is an online image processing and computer vision service hosted by Amazon. This application will make use of both the Fulcrum API and the Rekognition API (via the AWS .Net Core SDK). You will need the following to replicate its behavior.

  1. A Fulcrum Developer Pack subscription
  2. A Fulcrum API key
  3. An AWS account
  4. An AWS key/secret

This application will be a simple console application that processes a single image, from a Fulcrum photo URL, and performs a label detection operation using Rekognition. The output will take the form of a JSON string that is written to the console. This post will discuss key steps, but the full source code will be made available. The sample photo from Fulcrum is here:

[Image: Object detection bikes]

After creating a .Net Core 2.2 C# console application in Visual Studio (this application was written with Visual Studio 2019 Community Edition for Mac), add the packages for AWSSDK.Core, AWSSDK.Rekognition, and Newtonsoft.Json. Using the NuGet package manager is the most straightforward way to get the latest versions of these libraries that are compatible with your version of the .Net Core SDK.

The primary functions that do the work in the Rekognition SDK are async functions, so the first thing we will do is modify the application’s “Main” function to be able to support async calls. That requires the following:

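// Standard entry point: blocks until the async worker finishes.
static void Main(string[] args)
{
    // GetAwaiter().GetResult() rethrows the original exception
    // instead of wrapping it in an AggregateException as Wait() does.
    MainAsync(args).GetAwaiter().GetResult();
}

// Async worker function (requires using System.Threading.Tasks;)
static async Task MainAsync(string[] args)
{
    // The work gets done here
}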
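
As an aside, C# 7.1 and later allow Main itself to be declared async, which removes the need for a separate worker. The following is a minimal sketch of that alternative, not the approach used in this post. Depending on your toolchain, you may need to set LangVersion to 7.1 or later in the project file. The class name AsyncEntryPoint and the DoWorkAsync stand-in are hypothetical and exist only for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class AsyncEntryPoint
{
    // With C# 7.1 or later, Main can return Task and use await directly.
    static async Task Main(string[] args)
    {
        string result = await DoWorkAsync(args);
        Console.WriteLine(result);
    }

    // Hypothetical stand-in for the real work; it only simulates an awaited call.
    public static async Task<string> DoWorkAsync(string[] args)
    {
        await Task.Delay(10);
        return $"Processed {args.Length} argument(s)";
    }
}
```

Either form works for this application. The two-function version shown above compiles with older language versions as well.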

This application will accept four arguments: Fulcrum photo ID, Fulcrum API key, AWS access key, and AWS secret. In production, you should not pass the Fulcrum or AWS keys this way, because command-line arguments can end up in shell history and process listings. This method is used here only to simplify the presentation.
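
One common alternative to command-line arguments is to read secrets from environment variables. The sketch below shows that idea under stated assumptions: the variable name FULCRUM_API_KEY and the Credentials helper are hypothetical and are not defined by Fulcrum or by this post's source code.

```csharp
using System;

static class Credentials
{
    // Hypothetical variable name chosen for this sketch.
    public const string FulcrumKeyVariable = "FULCRUM_API_KEY";

    // Returns the value of the named environment variable,
    // or throws if it is missing or blank.
    public static string Require(string variableName)
    {
        string value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable {variableName} is not set.");
        }
        return value;
    }
}
```

A call such as Credentials.Require(Credentials.FulcrumKeyVariable) would then replace the corresponding argument. For the AWS keys, the AWS SDK for .Net can also resolve credentials on its own (for example from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables) when the client is constructed without explicit keys.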

Assuming you have parsed the arguments and stored them, the first step is to retrieve the specified photo from the Fulcrum API. The URL template for this API call is:

private static string fulcrumPhotoTemplate = "https://api.fulcrumapp.com/api/v2/photos/{0}.jpg?token={1}";

The first placeholder ({0}) will be the photo ID and the second ({1}) will be the Fulcrum API key. In production, it is recommended to pass the API key via an HTTP header. The Rekognition SDK needs photos to be passed as byte arrays, so the following function encapsulates calling the Fulcrum API, receiving the photo, and returning the byte array. This function was adapted, with minor modifications, from this post.

// Downloads the resource at the given URL and returns its bytes,
// or null if the server does not answer with HTTP 200.
// Requires: using System.IO;
static byte[] getImageFromUrl(string url)
{
    var request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);

    // GetResponse throws a WebException for most non-success status codes.
    using (var response = (System.Net.HttpWebResponse)request.GetResponse())
    {
        if (response.StatusCode != System.Net.HttpStatusCode.OK)
        {
            return null;
        }

        // Copy the stream instead of relying on ContentLength, which is -1
        // when the server does not send a Content-Length header.
        using (Stream receiveStream = response.GetResponseStream())
        using (var buffer = new MemoryStream())
        {
            receiveStream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
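
The text above recommends passing the API key in an HTTP header in production. A minimal sketch of that variant follows, using HttpClient instead of HttpWebRequest. It assumes that Fulcrum accepts the key in a header named X-ApiToken; verify the header name against the Fulcrum API documentation before relying on it. The FulcrumPhotoRequest class and its method names are hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class FulcrumPhotoRequest
{
    // Same endpoint as the template above, without the token query parameter.
    const string PhotoUrlTemplate = "https://api.fulcrumapp.com/api/v2/photos/{0}.jpg";

    // Assumption: header name used by Fulcrum for API keys.
    const string ApiTokenHeader = "X-ApiToken";

    // Builds a GET request that carries the key in a header rather than the URL.
    public static HttpRequestMessage Build(string photoId, string apiKey)
    {
        string url = string.Format(PhotoUrlTemplate, Uri.EscapeDataString(photoId));
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Add(ApiTokenHeader, apiKey);
        return request;
    }

    // Sends the request and returns the photo bytes; throws on a non-success status.
    public static async Task<byte[]> DownloadAsync(HttpClient client, string photoId, string apiKey)
    {
        using (HttpRequestMessage request = Build(photoId, apiKey))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync();
        }
    }
}
```

Keeping the key out of the URL means it is less likely to appear in server logs or proxy histories.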

The next step is to pass the bytes into the Rekognition SDK for processing. As mentioned previously, the application will perform label detection, which is the most generic form of object detection in the Rekognition service. More specific operations include face detection, celebrity recognition, text detection, and content moderation (to flag unsafe or inappropriate images). You can choose the recognition type based on your understanding of your photo content. When using any kind of facial detection, care should be taken to understand any applicable privacy regulations or legal restrictions.

From here, it’s a short trip to having text that describes a photo. The following lines prep the photo, call the Rekognition SDK, and process the response. Note the use of async/await, which necessitated the use of MainAsync above.

// url, awsKey, and awsSecret come from the parsed command-line arguments.
// Download the photo and wrap its bytes in a Rekognition Image.
byte[] photoBytes = getImageFromUrl(url);
var img = new Amazon.Rekognition.Model.Image
{
    Bytes = new MemoryStream(photoBytes)
};
var detectLabels = new DetectLabelsRequest { Image = img };

// Create the client with explicit credentials in the us-east-1 region.
using (var rekClient = new AmazonRekognitionClient(awsKey, awsSecret, Amazon.RegionEndpoint.USEast1))
{
    DetectLabelsResponse response = await rekClient.DetectLabelsAsync(detectLabels);
    if (response.Labels.Count > 0)
    {
        // Serialize the detected labels to JSON and write them to the console.
        Console.WriteLine(JsonConvert.SerializeObject(response.Labels));
    }
}
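
The request can also be tuned so that Rekognition returns fewer, higher-confidence labels. The sketch below is one possible way to do that with the MaxLabels and MinConfidence properties of DetectLabelsRequest, plus a small formatter for the returned labels. The LabelHelpers class, its method names, and the threshold values in the usage note are hypothetical and are not part of this post's source code.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using Amazon.Rekognition.Model;

static class LabelHelpers
{
    // Builds a label-detection request that asks Rekognition to limit
    // the number of labels and drop those below a confidence threshold.
    public static DetectLabelsRequest BuildRequest(byte[] photoBytes, int maxLabels, float minConfidence)
    {
        return new DetectLabelsRequest
        {
            Image = new Image { Bytes = new MemoryStream(photoBytes) },
            MaxLabels = maxLabels,
            MinConfidence = minConfidence
        };
    }

    // Formats labels as "Name (confidence%)" strings, highest confidence first.
    public static List<string> Describe(IEnumerable<Label> labels)
    {
        return labels
            .OrderByDescending(l => l.Confidence)
            .Select(l => string.Format(CultureInfo.InvariantCulture,
                "{0} ({1:F1}%)", l.Name, l.Confidence))
            .ToList();
    }
}
```

For example, LabelHelpers.BuildRequest(photoBytes, 10, 80f) would ask for at most ten labels with at least 80 percent confidence, and the result could be passed to DetectLabelsAsync in place of the request built above.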

As written, the project creates a file called fulcrumRek.dll. To run it from the command line, simply type the following (substituting valid values into the arguments below):

dotnet fulcrumRek.dll {fulcrum photo ID} {fulcrum API key} {AWS key} {AWS secret}

This application converts the response to JSON and writes it to the console. In practice, you would probably want to do something more elegant, such as writing it to a database or even using the Fulcrum API to push the text back into a form field for later use. You would probably also want more elegant logic, using the Fulcrum API, to process all the photos associated with a record. With the tools used here, there are many ways to enrich your Fulcrum photo content with hosted image processing services, including adding highlights to selected objects. This is the photo from above with people and bicycles highlighted:

[Image: Object detection bikes, with people and bicycles highlighted]
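
Highlighting of this kind depends on bounding boxes. Rekognition reports each box as Left, Top, Width, and Height values expressed as ratios of the overall image dimensions, so they must be scaled to pixels before drawing. The following is a minimal sketch of that conversion only; it is not the code used to produce the image above, and the PixelBox type and BoundingBoxMath class are hypothetical helpers.

```csharp
using System;

// Pixel-space rectangle; a hypothetical helper type for this sketch.
struct PixelBox
{
    public int X, Y, Width, Height;
}

static class BoundingBoxMath
{
    // Converts ratio-based box values to pixel coordinates,
    // clamping the result so it stays inside the image.
    public static PixelBox ToPixels(float left, float top, float width, float height,
                                    int imageWidth, int imageHeight)
    {
        int x = Clamp((int)Math.Round((double)left * imageWidth), 0, imageWidth);
        int y = Clamp((int)Math.Round((double)top * imageHeight), 0, imageHeight);
        int w = Clamp((int)Math.Round((double)width * imageWidth), 0, imageWidth - x);
        int h = Clamp((int)Math.Round((double)height * imageHeight), 0, imageHeight - y);
        return new PixelBox { X = x, Y = y, Width = w, Height = h };
    }

    static int Clamp(int value, int min, int max)
    {
        return Math.Max(min, Math.Min(max, value));
    }
}
```

The resulting rectangle can then be drawn with whichever imaging library you prefer. The clamping is a defensive choice for boxes that extend past the edge of the frame.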

We’ve been doing more work with image processing and computer vision at Spatial Networks recently. Stay tuned (here and to the SNI blog) for future posts from some of my co-workers about how we’re using it to enrich our Foresight product line.

The full code for this post can be found here.