Automatically describing and tagging pictures on SharePoint using Azure Cognitive Services

May 23, 2018 (September 11, 2020), Gunnar Peipman

Azure Cognitive Services has a wider audience than cool young guys developing very cool mobile apps. These services can be used in very different use cases. This blog post shows how to use Azure Cognitive Services to automatically describe and tag photos added to an Office 365 SharePoint picture library, using Microsoft Flow as the workflow engine.

Automatic image describing and tagging

SharePoint on Office 365 doesn’t allow us to deploy custom code to the server, so we don’t have event receivers and custom workflows available there. But we can use Microsoft Flow to listen to a SharePoint library and run a sequence of steps that do something useful, gather some data and update the list item that triggered the flow. This is what we will build in this blog post.

In the end, when we add a new picture to the picture library in SharePoint, it gets described and tagged automatically without any user involvement.

Building Azure Function to analyze photos

We start by building an HTTP-triggered Azure Function that analyzes the images we give it and returns some information back. To analyze images we will use Azure Cognitive Services. Cognitive Services also has a free offering for the computer vision service, and it has enough capacity for trying out the work described here.

New to Azure Functions and Cognitive Services? To find out more about Azure Functions I suggest reading my posts Short introduction to serverless architecture and Getting started with Azure Functions. A good start to Azure Cognitive Services is my demo project on GitHub: gpeipman/CognitiveServicesDemo.

Before we start with the function we need a data structure to return data back to Microsoft Flow.

public class ImageDescriptor
{
    public string Description;
    public string Tags;
}

This model class is good enough to return information from the function to Flow.

In the function we expect that the file is sent as the request body. At the beginning of the function we save the execution context and trace writer to class scope so we can use them from other methods if needed. The DescribeImage() method used in the function is described below.

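using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Configuration;
using Microsoft.ProjectOxford.Vision;

public static class ImageProcessing
{
    // Kept at class scope so helper methods can reach them.
    private static ExecutionContext _context;
    private static TraceWriter _log;

    [FunctionName("AnalyzeImage")]
    public static async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function)]HttpRequest req,
                                                 TraceWriter log, ExecutionContext context)
    {
        _context = context;
        _log = log;

        // The request body is the raw image file.
        var result = new ImageDescriptor();
        await DescribeImage(req.Body, result);

        return new JsonResult(result);
    }
}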

We keep the Cognitive Services API key and endpoint URL in function settings. To get these values later we need a method to load configuration. Let’s add a GetConfiguration() method to our function class.

private static IConfigurationRoot GetConfiguration()
{
    return new ConfigurationBuilder()
        .SetBasePath(_context.FunctionAppDirectory)
        .AddJsonFile("settings.json", optional: false, reloadOnChange: true)
        .AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)
        .AddEnvironmentVariables()
        .Build();
}
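A note on key names: a value read straight from a JSON file with a Values section is addressed as "Values:Name", while the same setting supplied as a plain environment variable appears under "Name" only. The following is a minimal sketch of a lookup that tolerates both, assuming the Microsoft.Extensions.Configuration package already used above. SettingsLookup and GetSetting are hypothetical names introduced for illustration, and the in-memory values are made up.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

public static class SettingsLookup
{
    // Hypothetical helper: try the "Values:" section first (the layout of
    // local.settings.json), then fall back to the plain top-level key,
    // which is how an environment variable would show up.
    public static string GetSetting(IConfiguration config, string name)
    {
        return config["Values:" + name] ?? config[name];
    }
}

public static class SettingsLookupDemo
{
    public static void Main()
    {
        // In-memory stand-in for the JSON file and environment variables.
        var config = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string>
            {
                ["Values:ComputerVisionApiKey"] = "key-from-json-file",
                ["ComputerVisionApiUrl"] = "https://example.invalid/vision"
            })
            .Build();

        Console.WriteLine(SettingsLookup.GetSetting(config, "ComputerVisionApiKey"));
        Console.WriteLine(SettingsLookup.GetSetting(config, "ComputerVisionApiUrl"));
    }
}
```

Whether such a fallback is needed depends on how the settings are actually provided in the deployed function app.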

Now we need a method that calls the Cognitive Services computer vision API to analyze an image. We read the API key and endpoint URL from function settings, initialize the client and send the image to the computer vision service. When the results come back we carefully build up the model that we return to Flow.

private static async Task DescribeImage(Stream stream, ImageDescriptor descriptor)
{
    // Read the API key and endpoint URL from function settings.
    var config = GetConfiguration();
    var apiKey = config["Values:ComputerVisionApiKey"];
    var apiRoot = config["Values:ComputerVisionApiUrl"];

    var client = new VisionServiceClient(apiKey, apiRoot);
    var description = await client.DescribeAsync(stream);

    descriptor = descriptor ?? new ImageDescriptor();

    // The service may return no captions, so check before taking the first one.
    var captions = description.Description.Captions;
    if (captions != null && captions.Length > 0)
    {
        descriptor.Description = captions[0].Text;
    }

    if (description.Description.Tags != null)
    {
        descriptor.Tags = string.Join(", ", description.Description.Tags);
    }
}

Complex functionality but simple code: this is what makes Cognitive Services cool. Now we are done with our function and we can deploy it to the cloud.

Of course, if it needs debugging, there is the Azure Functions console available in Visual Studio, so we can run our functions on the local box to debug them. In the current case we can use the Postman extension for Google Chrome to make requests to our function.

The screenshot above shows a successful request to the function running on my dev box.
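The same test request can also be made from code instead of Postman. Below is a minimal sketch that posts a local image file as the raw request body, which is what the function reads from req.Body. The URL assumes the default local Functions host address, and sample.jpg, FunctionTester and BuildRequest are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class FunctionTester
{
    // Builds a POST request whose body is the raw image bytes.
    public static HttpRequestMessage BuildRequest(string functionUrl, byte[] imageBytes)
    {
        var content = new ByteArrayContent(imageBytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        return new HttpRequestMessage(HttpMethod.Post, functionUrl) { Content = content };
    }

    public static async Task Main(string[] args)
    {
        // Assumed values: default local host address and a sample file path.
        var url = "http://localhost:7071/api/AnalyzeImage";
        var path = args.Length > 0 ? args[0] : "sample.jpg";

        using (var client = new HttpClient())
        using (var request = BuildRequest(url, File.ReadAllBytes(path)))
        using (var response = await client.SendAsync(request))
        {
            // Print the status code and the JSON returned by the function.
            Console.WriteLine((int)response.StatusCode);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}
```

A deployed function that uses AuthorizationLevel.Function also expects its function key with the request, so the URL would need adjusting there.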

Adding SharePoint picture library

We need a picture library on Office 365 SharePoint. For testing, let’s create it on some test site where users don’t hang around. Two fields in the picture library are especially interesting for us.

Description and Keywords are the fields we want to fill automatically based on image analysis by Cognitive Services. Fortunately both of these are text fields, so we don’t need any special formatting to make the values understandable for SharePoint.

Connecting SharePoint library with Flow

The next step is to create a new flow on the Microsoft Flow site and connect it to our SharePoint picture library. In short, this is our flow: it is connected to the picture library in SharePoint and it is triggered when a new file is added.

When triggered, the flow does the following:

read the properties of the file that was just created
load the file contents
run the Azure Function with the file contents
load the returned JSON into a variable
assign the description and tags to the image
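The flow does the last two steps with its own actions and expressions, not with C#. Purely to illustrate the data involved, here is a sketch of pulling the two values out of the JSON the function returns. It assumes System.Text.Json is available. The sample payload is made up, and because the casing of the property names depends on serializer settings, the lookup ignores case. FlowResponseParser and its methods are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class FlowResponseParser
{
    // Reads a string property regardless of the casing of its name.
    public static string GetString(JsonElement root, string name)
    {
        foreach (var property in root.EnumerateObject())
        {
            if (string.Equals(property.Name, name, StringComparison.OrdinalIgnoreCase)
                && property.Value.ValueKind == JsonValueKind.String)
            {
                return property.Value.GetString();
            }
        }
        return null;
    }

    // Returns the description and tags found in the JSON object.
    public static (string Description, string Tags) Parse(string json)
    {
        using (var document = JsonDocument.Parse(json))
        {
            var root = document.RootElement;
            return (GetString(root, "Description"), GetString(root, "Tags"));
        }
    }

    public static void Main()
    {
        // Made-up sample payload, not real service output.
        var sample = "{\"description\":\"a dog sitting on grass\",\"tags\":\"dog, grass, outdoor\"}";
        var (description, tags) = Parse(sample);
        Console.WriteLine(description);
        Console.WriteLine(tags);
    }
}
```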

Here is the full version of the flow diagram. Things that are not clearly visible are written in red by me.

The information given on the diagram above should be enough to repeat what I did.

Trying it out

To try our solution out we only need to upload a photo to the SharePoint picture library we just created. To see if the flow works we can check the flow run history in the Microsoft Flow environment.

If everything went well, as in the image above, then in SharePoint we see the Description and Keywords fields filled automatically, as shown in the following screenshot.

The Description and Keywords fields were automatically filled by Microsoft Flow using Azure Cognitive Services.

Experiences gathered

Let me stop for a moment on the experiences I gathered while building this solution.

Connecting SharePoint to Microsoft Flow is very easy: just define the connection.
Dealing with Microsoft Flow expressions is not so easy when starting out. They are not always intuitive, and one must know how data has to be transformed to make expressions work.
Building POST requests is tricky. In the end I went with a request to the function that just inserts the image as the POST request body.
Debugging functions on the local box is easy.
It’s possible to write trace messages in the function also when it runs in the live environment.
Flow error messages have some useful information when debugging, but there are still cases when expressions are saved and then fail at runtime. Debugging communication with the function was actually easy.

I hope this helps those of my readers who try to use it in real scenarios.

Wrapping up

To wrap up this long post we can say: we did it! We built an Azure Function that analyzes images using Azure Cognitive Services. Then we built a flow on Microsoft Flow that is invoked when a new picture is added to the SharePoint picture library. Using the flow we called the Azure Function to analyze the picture and fill in the description and tag fields. Building something like this for the first time can be challenging and take time. After getting used to Azure Functions and Microsoft Flow, things are actually easy and go smoothly. In the end we have a slightly intelligent solution (try it out with different images) that automatically tags and describes pictures in a SharePoint picture library.