How MATLAB Distributed Computing Server and Machine Vision Tools Are Transforming Shell
James Martin, Shell International
Amjad Chaudry, Shell International
Machine learning and deep learning can be used to automate a range of tasks. Shell and the Advanced Analytics Center of Excellence (AACoE) are using these techniques to speed up processes while increasing their reliability. In geomatics, terrain classification can be improved using a rich training dataset of labelled satellite images. Automatic tag detection in large (panoramic) plant images also leads to more efficient maintenance.
James and Amjad will show how MATLAB® makes using these techniques easy. With minimal setup, MATLAB Parallel Server™ allows the team to train networks on multiple remote GPUs in the cloud. MATLAB Production Server™ lets the team create thin web clients that operators in the field can use with minimal physical hardware, such as a smartphone.
Shell leverages all those techniques and tools so that its engineers can easily and painlessly use the latest findings.
Recorded: 3 Oct 2018
Over the last four years or so within Shell, advanced analytics has been playing an increasingly important role in how we're doing things. Today, though, I'd like to particularly talk to you about deep learning, and how, specifically within MATLAB, we're leveraging some of the deep learning tools to improve our innovation pipeline. Interestingly enough, Rick's keynote talk mentioned transfer learning and semantic segmentation. And that's precisely some of the examples that I'll be talking to you about today.
Of course, as being Shell, we always have to put up a cautionary note. So I'll leave this up for five seconds or so for those who want to read. OK.
So today, I'm going to structure my talk as follows. I'm going to just briefly introduce you to Shell and the range of services and products that we encompass. I'll also talk about our innovation and delivery pipeline, how we try and bring innovative ideas, especially in advanced analytics, through to final products which are properly maintained by IT. And then where MATLAB fits into that.
I'll then talk about two use cases. As I mentioned, the first one on tag recognition in industrial imagery, and then also terrain recognition in hyper-spectral satellite imagery. It sounds very cool, so I put it in there. And then finally next steps, where do we go from there with the results that we've got.
OK. So this is the latest incarnation of our summary slide of the business. So we're a very wide ranging company. We range all the way from my initial joining of the company, which was in upstream exploration, trying to identify hydrocarbon deposits. And then through to development where we try and drill wells to extract those, through then to more downstream activities where we try and process and refine the products, through to transport and trading where we then deliver those products to the various end users, which could include the retail forecourt, aviation, and also lubricants.
If we repurpose that information, we can highlight where analytics is bringing value within the organization at the moment. And-- oh, that's it-- what I'd really like to draw attention to are all the circles in various colors. These are active areas where analytics is playing a leading role within our organization, and where we'll probably end up with quite large change, making quite a big impact on the current workflows and ways of working. The two circles in blue are the ones I'll be exploring a bit further.
So this is our innovation funnel in yellow. And we have a series of decision gates running across the top, D0 through to D4. And basically we try and take ideas and concepts from the left through to the right.
At the bottom you can see the two overlapping triangles, where we move from the digitalization team, which is where I currently sit, through to IT proper. So what we try and do is get involved during the scoping and innovation phase. We produce proofs of concept and minimum viable products and try to prove the value. Then gradually IT gets brought in, and we scope out full deployment solutions and maintenance strategies so we can fully deliver the value to the business.
The other thing I'd like to draw attention to are all the dots. Think of them as almost a normalized indication of the number of ideas in the organization on the left. And what I want to emphasize is that we're perfectly OK with a large churn at each decision gate; it's about making sure you scope fully within the organization, and then, as you get through to the end, focusing your resources on the highest-value solutions.
Where does MATLAB add value? It's very fast for prototyping. We have an active agreement with MathWorks Consulting, which we leverage to improve our productivity.
There's a huge set of examples and documentation maintained within MATLAB. And because of the huge focus that MathWorks has put on integrating deep learning techniques, say, in the last year, we're able to leverage some of the latest developments in that space whilst also being able to access this advantageous backlog of modules. We really like the web app delivery, because it bypasses a lot of the issues around installing versions of MATLAB to get some of our software running.
So here we've got two examples of web apps that we've produced. Just in the top right is a web app of a bitumen test. And in the bottom left you can also see a sneak preview of what I'll talk about later, which is the terrain classification as a web app.
We've also been experimenting a bit with the MDCS, so the MATLAB Distributed Computing Server. So that allows us to leverage quite powerful GPUs on the cloud. And we mainly use it for training up some of our deep learning models.
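As a rough sketch, pointing training at a remote cluster through MDCS looks something like the following; the cluster profile name here is an assumption for illustration, not our actual configuration.

```matlab
% Minimal sketch: use an MDCS cloud cluster profile for deep learning
% training ('CloudGPUCluster' is an illustrative profile name).
c = parcluster('CloudGPUCluster');
parpool(c);

opts = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'parallel', ...  % spread training over the pool's GPUs
    'MiniBatchSize', 64);
```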
So in terms of this year, we've had quite a few milestones between Shell and MathWorks. Because Shell sometimes has a bit of admin overhead, it's been quite difficult to get licenses for disparate parts of the business. We now finally have an enterprise-wide deal, so that means any smart people, wherever they come from when they join the organization, can finally get productive with MATLAB quickly, in theory.
We've got a second MPS license. And as I said, the MDCS, I think, is going to be an increasingly important feature. And we're looking at bringing that more into line with our strategy.
MathWorks Consulting, as I said, has been a very productive use of our time. And we're also now hoping to leverage some of our resources in Bangalore to allow us to progress projects around the clock.
OK. So this is the first example. This is tag recognition. So what you can see in the background is a piece of industrial equipment. I think it's a pump.
But underneath, what I want to draw your attention to is that tag, that label. And on that label there's an SAP code. And we have these images-- they're all geotagged-- dotted around an industrial setting. And what we want to do is extract that tag, do OCR on it, and then link it through to our SAP systems, because we can pull lots of metadata from the SAP systems.
So the initial approach that we've taken is using an R-CNN, a Region-based Convolutional Neural Network. Because the image is very large, we first need to extract a series of region proposals from it, which are then fed into the CNN proper.
In our case-- so I think Rick talked about the AlexNet example-- we used a VGG-16 network instead, and then we did transfer learning on the final three layers for our purposes. And initially here we've just got a two-class problem: tag or no tag.
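As a hedged illustration, a transfer-learning setup along these lines might look roughly as follows in MATLAB; the table name `tagTrainingData`, the file name, and the hyperparameters are assumptions for the sketch, not the exact values we used.

```matlab
% Sketch: fine-tune a pretrained VGG-16 as an R-CNN tag detector.
% 'tagTrainingData' is an assumed table: image file names in column one,
% [x y w h] tag bounding boxes in column two.
net = vgg16;   % pretrained network from the Deep Learning Toolbox support package

options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...
    'MiniBatchSize', 32, ...
    'MaxEpochs', 10);

% trainRCNNObjectDetector swaps the final layers of the pretrained network
% for the new class set (here effectively tag vs. no tag) and fine-tunes it.
detector = trainRCNNObjectDetector(tagTrainingData, net, options, ...
    'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange', [0.5 1]);

[bboxes, scores] = detect(detector, imread('plant_scene.jpg'));
```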
This is what some of the images look like. Think almost Google Street View. So on the left there you can see that it's almost like it's been taken with a fisheye lens. So first we need to apply a distortion correction to the images, which is done within MATLAB. And then think of-- the output of that is almost like you're standing inside a box, and then you have the six faces of the box looking out.
We dump the top and the bottom projections, and we just keep the four horizontal projections. And then we feed those through to the region extraction part of the algorithm. In this case, we slightly modified it and used Piotr Dollár's EdgeBoxes approach. But the important thing is you can see that the regions are nicely picking out areas where there's probably a tag.
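For context, here is a sketch of the undistortion step and the proposal hook; `fisheyeParams` and the file name are illustrative, and EdgeBoxes itself is a third-party implementation rather than a function shipped with MATLAB.

```matlab
% Sketch: correct fisheye distortion before the cube-face projection
% ('fisheyeParams' would come from MATLAB's fisheye camera calibrator).
I = imread('plant_panorama.jpg');                      % illustrative file name
undistorted = undistortFisheyeImage(I, fisheyeParams.Intrinsics);
imshowpair(I, undistorted, 'montage');

% A custom proposal routine such as EdgeBoxes can then be supplied to
% trainRCNNObjectDetector via its 'RegionProposalFcn' name-value argument.
```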
OK. That then gets fed through to the CNN. So this now is just discussing the training of that.
So although with transfer learning you don't need too much training data, we still had some issues getting a large enough training set for this to perform in a stable way. So we expanded the definition of a label to the more general notion of a sign-- we included signs as well-- and then did data augmentation to increase the data set further and provide enough data to give a stable result.
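A sketch of what that augmentation step could look like with the toolbox augmenter; the datastore name and the ranges below are assumptions.

```matlab
% Sketch: random reflections, rotations, and shifts to enlarge the
% sign/tag training set ('imds' is an assumed labelled imageDatastore).
augmenter = imageDataAugmenter( ...
    'RandXReflection',  true, ...
    'RandRotation',     [-10 10], ...
    'RandXTranslation', [-5 5], ...
    'RandYTranslation', [-5 5]);

% VGG-16 expects 224x224x3 inputs.
augimds = augmentedImageDatastore([224 224 3], imds, ...
    'DataAugmentation', augmenter);
```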
On the right you can then see the activations after training. That's a good indication of where the network is initially paying attention prior to the classification. So that quite weird-looking image is telling you it's essentially focusing on the purple patches. And then this is the output of the algorithm.
So you can see an interior scene and you can see an exterior scene, different lighting conditions. And what you get is a bounding box around what it thinks are signs-- sorry-- what are signs and tags with an associated probability.
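For the curious, inspecting that kind of attention map is straightforward with the documented activations function; the layer choice and image name here are illustrative.

```matlab
% Sketch: visualize an intermediate activation map of the fine-tuned
% network over a test image ('relu5_3' is an illustrative VGG-16 layer).
I = imresize(imread('plant_scene.jpg'), [224 224]);
act = activations(detector.Network, I, 'relu5_3');
map = mat2gray(max(act, [], 3));                 % collapse channels to one map
map = imresize(map, [size(I,1) size(I,2)]);
imshowpair(I, map, 'blend');
```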
For the keen-eyed among you, you may notice that there are lots of false positives in there. And what we want to do is actually bring out all the possible options, and then rely on OCR on top of this to filter out a lot of those false positives.
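As a sketch of that filtering pass, the toolbox ocr function can be run over the detected boxes and the text matched against a plausible code pattern; the regular expression is an assumption, not Shell's actual SAP tag format.

```matlab
% Sketch: run OCR inside each detected box and keep only detections
% whose text resembles a tag code.
results = ocr(I, bboxes);               % one ocrText result per box
keep = false(numel(results), 1);
for k = 1:numel(results)
    txt = strtrim(results(k).Text);
    keep(k) = ~isempty(regexp(txt, '[A-Z0-9][A-Z0-9-]{3,}', 'once'));
end
tagBoxes = bboxes(keep, :);             % detections that survive the OCR filter
```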
OK. So I've just shown you transfer learning being used to identify tags in industrial images, on which we then run OCR to extract the SAP codes. In terms of runtime, just to give you an idea, it's around three to four minutes per image. In this particular use case we can manage with that, but obviously if you wanted real-time feedback, that's not going to happen.
However, there are techniques to dramatically increase the speed if you wanted to go down the real-time route-- for example, Fast R-CNNs, which should give you approximately a 100-times improvement in speed.
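In MATLAB terms, that route is available through trainFastRCNNObjectDetector; the call below is a sketch reusing the assumed names from the earlier example.

```matlab
% Sketch: Fast R-CNN computes convolutional features once per image rather
% than once per region proposal, hence the large speed-up at inference time.
fastDetector = trainFastRCNNObjectDetector(tagTrainingData, net, options);
```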
We're also looking at maybe getting larger GPUs on the MDCS to allow us to increase the image resolution. And the next, I guess, cool thing is, once we've connected this through to the SAP system, how do we then bring that information back, say, for someone walking around the site with augmented reality goggles? How can we co-visualize this information? That's probably quite an exciting area that some of our customers are interested in.
So the data that we used is from a European industrial site, and we've now got quite a lot of interest from, in particular, an Asian business unit. So we're going to proceed along those lines of activity.
OK. So the next example is terrain recognition in hyper-spectral satellite data. So just quickly a description of why this problem's worth solving and why we bother.
So in upstream, in exploration, seismic is one of the most important technologies that we have in order to look underneath the ground, into the subsurface. And, for example, in this unspecified Middle Eastern location at the bottom, you can see the expanse of it, right? And the cost of acquiring data-- putting energy into the ground and receiving it back-- is very high. We're talking tens of millions per year, per survey.
And the terrain type-- smooth versus rough, for example-- can affect that cost by up to 50%. So because of that, they have, in our language, a really ideal situation for labeling data, but in their language a really inefficient system. They pay for a highly specialized, well-paid individual to look at satellite images and manually draw polygons around what they think is rough terrain.
And then they have to corroborate that with site visits. So someone has to fly over to this particular area of the desert, drive around in a truck, and put flags down to confirm that this is, indeed, rough terrain. This is all prior to the survey.
So in our case, because we now have lots of training data, we thought, right, perhaps we can replace the whole workflow with something a bit more compute-intensive. So we decided to try this semantic segmentation approach.
So this is the data that we have. We've got three types of images: aerial photography, radar, and a digital surface model (DSM) image. Because of the limitations of R2017b we needed to reduce it to three channels, but that's OK in this case.
That's been improved now with R2018a and R2018b. But we decided here to pack it into three channels to colorize the images, and we did so as follows: we gray-scaled the aerial photography and put that in the red channel, radar in the green, et cetera. And then you end up with these colorized images that you can see on the right. That was used for the algorithm.
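Concretely, the packing step amounts to stacking the three sources into one RGB array; the variable names below are illustrative.

```matlab
% Sketch: grayscale aerial photo into red, radar into green, DSM into blue.
aerialGray = im2single(rgb2gray(aerialRGB));
radarNorm  = rescale(im2single(radar));
dsmNorm    = rescale(im2single(dsm));

colorized = cat(3, aerialGray, radarNorm, dsmNorm);  % H-by-W-by-3 false-colour image
imshow(colorized)
```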
So SegNet, what is it? It's commonly used in self-driving cars. So imagine a road scene in the top left, and what the network does is you feed it through and then it will basically map each pixel to a class.
So in the example at the top, you have, say, a pavement class, a road class, trees class, et cetera. So in our case we wanted to repurpose this and use it for rough terrain or smooth terrain. And that's what we did.
We actually have a 30,000-example data set at the moment, but just for this work we used 1,000 examples. So there's a lot of room for improvement. And compared to the picture at the top, we also used a slightly simpler network structure.
So we decided to use three encoder and decoder sections. And in terms of training on the 1,000 examples, on a 4-gigabyte GPU, which is quite small, it's around eight hours of training time.
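A minimal training sketch along those lines, assuming the colorized tiles and label masks are already on disk; the folder paths, class names, tile size, and hyperparameters are all assumptions.

```matlab
% Sketch: SegNet with three encoder/decoder sections for two terrain classes.
imds    = imageDatastore('colorizedTiles');
classes = ["rough" "smooth"];
pxds    = pixelLabelDatastore('terrainLabels', classes, [1 2]);

lgraph = segnetLayers([360 480 3], numel(classes), 3);  % encoderDepth = 3

opts = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-3, ...
    'MiniBatchSize', 4, ...
    'MaxEpochs', 100);

% pixelLabelImageDatastore pairs each tile with its label mask (R2018a+).
net = trainNetwork(pixelLabelImageDatastore(imds, pxds), lgraph, opts);

% Per-pixel inference on a new tile:
C = semanticseg(imread('newTile.png'), net);
```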
So these are the results. I've moved away from the colorized version and decomposed it back into the original images. So on the top you can see, on the left, the aerial photography, and then the radar and the DSM. And then on the bottom on the left, you can see the human interpretation, or the ground truth in our case, and then what the algorithm predicted.
And in both cases you can see, OK, for this selected snapshot of the data I've chosen, the performance is pretty good. At the moment the results are mostly qualitative rather than quantitative, although we're going to be working on producing confusion matrices and all those kinds of things. But the performance is very good. And we showed this to the end customers, and already they essentially think that the performance is superior to the existing workflow.
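When we get to the quantitative side, the toolbox already provides the machinery; here is a sketch, with assumed test datastore names.

```matlab
% Sketch: batch inference over a held-out set, then standard metrics
% ('testImds' and 'pxdsTruth' are assumed test datastores).
pxdsResults = semanticseg(testImds, net, 'WriteLocation', tempdir);
metrics = evaluateSemanticSegmentation(pxdsResults, pxdsTruth);

metrics.ConfusionMatrix   % per-class confusion matrix
metrics.DataSetMetrics    % global accuracy, mean IoU, and so on
```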
We've allowed the customer to interact with the data via a web app, which is what you can see here. With the picture on the left, the customer can very easily just go to a URL and upload the various images plus the area of interest that they want to look at. And then on the right, after the inference step, they can flick through the different input and output images and overlay the ground truth, just so they can get a sense of what the results mean, and what they're happy with and what they're not happy with.
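Under the hood, the thin-client pattern is just an HTTP call into a function hosted on MATLAB Production Server; here is a sketch of the client side, with illustrative host, archive, and function names.

```matlab
% Sketch: call a deployed inference function over Production Server's
% RESTful JSON API (9910 is the default MPS port; names are illustrative).
body = struct('nargout', 1, 'rhs', {{tileAsArray}});
opts = weboptions('MediaType', 'application/json', 'Timeout', 120);
resp = webwrite('http://mps-host:9910/terrainApp/classifyTerrain', body, opts);
labels = resp.lhs{1};   % per-pixel class predictions returned to the client
```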
OK. So in terms of next steps, this is very much initial work. So there's lots of future work to be done, assuming that we can get good funding for it internally. Some of the first steps we're going to take are parameter tuning.
We're going to start increasing the amount of training data from where we are at the moment, which is 1,000 examples. We're going to add more classes as well: we have a facility class and an urban class that we want to add into the data. And you can see an example of the facility class in the top right there.
The app as well-- it only took two days to make that web app. That's the real power of collaborating properly with MathWorks Consulting. We want to add further functionality into that web app and deliver exactly what the customer wants.
And with this particular example, because of the performance already, and because people are quite excited by it, there's a bit of worry about how this is going to impact the existing workflow-- and that includes the people doing the work. So this time around we're trying to have a dual-integration strategy, where we provide the technology whilst also upskilling the staff, so they can understand the workflow more, understand the technology more, and then also maybe come up with new ideas and better ways of working than we could come up with. Some of our Middle Eastern units, obviously, are very interested in this technology. But we've also received interest now from some Southeast Asian business units as well.
OK, so what does this mean in terms of the future? Within Shell, it's all about knowing the grand master plan and then how you can fit into the grand master plan. So in our case we have these digital themes.
So we're now going to ensure that the way we promote this internally aligns with those digital themes, and we've identified three of them: leveraging the cloud for everything, high-performance compute with the MDCS, and advanced analytics-- so, for example, with smart app-based technologies.
In terms of immediate priorities for 2018, we want to continue to deploy the MPS and the MDCS. We've now proven the technology side of some of these solutions, but we need to look at proving the business-value side. So, as I said, we're going to look at furthering progress on the terrain recognition and the tag recognition.
But something I couldn't talk about today, unfortunately, is also in the seismic domain. We're currently looking at deep learning techniques to try and map seismic data-- so, just images of the subsurface-- through to pay distribution, hydrocarbon distribution, and hydrocarbon attribute distribution via simple convolutions. So that's a very exciting area that a fair few people in our company are looking at, too.
OK. So that's all I had to say. I hope it was an interesting talk. Thank you.