Vessel Tracking: Making Sense of Large Amounts of Data
By Rory Meyer, CSIR
Here in the Integrated Vessel Tracking research group we have a problem. We have too much data. Our goal is to display real-time data for ocean vessels around South Africa, but we receive data for about a third of the world’s oceans. What we would like to do is use this glut of data to derive meaningful information that would allow operators to quickly determine whether a ship is acting strangely or not.
One simple method would be to add another layer of information to our vessel tracking map: a historical heatmap. Sounds pretty simple and useful, doesn’t it? Just take all the messages and run a heatmap function on them. Well, that’s a little tough for us; we receive about 8 million vessel positions a day. And that doesn’t even consider the problems of message collisions (not receiving data because there are too many ships), interpolation (ships obviously exist in the places between consecutive messages) and separating by class of vessel.
Fishing, cargo and tanker vessels form the bulk of vessel traffic in the deep ocean, but they behave differently. Cargo and tanker vessels generally travel in straight lines along shipping lanes, between ports that are equipped to handle their cargo. Fishing vessels travel to fishing zones and then move around in different ways depending on the fishing gear used. These behaviours can be characterised using the locations, speeds and bearings described by the messages received from these vessels.
Building heatmaps of vessel behaviour involves several complex queries to the geospatial database. These first order the messages by vessel and calculate the time between consecutive messages for every message in the database, then work out which spatial bin each message falls into, and finally calculate the total number of seconds each vessel class spent in each spatial bin. It takes around an hour for a month’s worth of data (thanks, PostgreSQL!).
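The steps above can be sketched in a few lines of Python. This is only an illustration of the binning logic, not our actual database queries: the tuple layout, the function name `dwell_time_heatmap`, the one-degree bin size and the choice to credit each time gap to the bin of the earlier message are all assumptions made for the example.

```python
from collections import defaultdict
from math import floor


def dwell_time_heatmap(messages, bin_deg=1.0):
    """Total seconds spent per (vessel class, spatial bin).

    messages: iterable of (vessel_id, vessel_class, timestamp_s, lon, lat).
    This layout is hypothetical, chosen only for the sketch.
    """
    # Order by vessel, then by time, so consecutive rows belong together.
    ordered = sorted(messages, key=lambda m: (m[0], m[2]))

    heat = defaultdict(float)
    for prev, curr in zip(ordered, ordered[1:]):
        if prev[0] != curr[0]:
            continue  # the pair spans two different vessels; skip it

        # Time between consecutive messages from the same vessel.
        gap_s = curr[2] - prev[2]

        # Assumption: the whole gap is credited to the earlier message's bin.
        cell = (floor(prev[3] / bin_deg), floor(prev[4] / bin_deg))
        heat[(prev[1], cell)] += gap_s

    return dict(heat)


sample = [
    ("A", "fishing", 0, 18.4, -33.9),
    ("A", "fishing", 60, 18.6, -33.8),
    ("A", "fishing", 180, 19.2, -33.7),
    ("B", "cargo", 0, 18.5, -33.5),
    ("B", "cargo", 100, 18.9, -33.4),
]
heatmap = dwell_time_heatmap(sample)
```

A real version would also need to decide what to do with very long gaps (lost reception rather than a ship sitting still) and with ships that cross several bins between two messages, which is the interpolation problem mentioned earlier.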
Heatmaps can also be generated to show the density of specific classes (see Figure 1), the average speed of vessels in an area, the dominant country for vessels in an area or the average amount of time between messages for each vessel (giving an indication of reception strength).
Having these heatmaps stored in the database makes it very easy to compare historical data with new messages. Want to see if the speed, position and bearing of that fishing vessel are normal for this time of the year? Give me a few milliseconds, because all that info is already in the database. Figure 2 shows fishing activity during August 2018.
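As a rough sketch of what such a comparison could look like, the snippet below checks a new message’s speed against stored per-bin statistics. Everything here is assumed for illustration: the `history` dictionary stands in for the database table, the key layout (class, month, bin), the mean and standard deviation values, and the three-sigma threshold are not taken from our system.

```python
from math import floor


def is_speed_unusual(history, vessel_class, month, lon, lat, speed_kn,
                     bin_deg=1.0, k=3.0):
    """Compare a new message's speed with the historical bin statistics.

    history maps (vessel_class, month, (x_bin, y_bin)) to (mean_kn, std_kn).
    Returns None when there is no baseline for that class, month and bin.
    """
    cell = (floor(lon / bin_deg), floor(lat / bin_deg))
    stats = history.get((vessel_class, month, cell))
    if stats is None:
        return None  # nothing to compare against

    mean_kn, std_kn = stats
    # Flag speeds more than k standard deviations from the historical mean.
    return abs(speed_kn - mean_kn) > k * std_kn


# Hypothetical baseline: fishing vessels in August, in one 1-degree bin.
history = {("fishing", 8, (18, -34)): (4.0, 1.5)}

normal = is_speed_unusual(history, "fishing", 8, 18.4, -33.9, 5.0)
fast = is_speed_unusual(history, "fishing", 8, 18.4, -33.9, 12.0)
unknown = is_speed_unusual(history, "cargo", 8, 18.4, -33.9, 12.0)
```

Because the expensive aggregation has already been done, the check itself is a single keyed lookup. The same pattern would extend to bearing or to the time a vessel has spent in a bin.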