We live in a 3D world. When we look at things we expect to see them in a scene where we understand how they relate to their surroundings… Is the tree in front of that car?… Is the cat behind the chair?… etc, etc.
In visible light images that’s a relatively easy problem to resolve, as we have many clues not just to an object’s position but also to its depth relative to the other objects in the scene.
Science tells us there are two types of ‘depth cues’ we use to derive 3D information – those cues which you can see with one eye (monocular cues) and those that require two eyes (binocular cues). Understanding how these cues provide depth information is the basis of understanding how we ‘perceive’, and why we need details of depth in a scene.
If we consider an x-ray image we first realise that it’s generated not by reflected light, as you see in a visible light image, but rather by the shadow cast as x-rays are transmitted through, and partly absorbed by, the object. Such an image is known as a shadowgraph.
What about the depth cues?
The consequence of the shadowgraph image is that it contains none of the depth cues described above: it’s a truly two-dimensional image containing no depth information whatsoever.
In fact, the situation is worse than that. As the brain is used to seeing depth cues, in the absence of any cues it will make assumptions that could be disastrously wrong.
Take the very simple example of the x-ray image of a case shown at the foot of this article. In this image, it’s quite reasonable to suppose that the spanner is in front of the bottle since it clearly obscures the bottle ‘behind’ it. However, as this is an x-ray, the dark image of the spanner tells us only that the spanner is denser than the bottle. It tells us nothing about its depth in the bag: for all we know it’s at the top, the bottom or somewhere in the middle.
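The spanner and bottle example can be illustrated with a minimal sketch. It assumes the standard exponential attenuation law (Beer-Lambert) for a single-energy beam, and the attenuation coefficients and thicknesses are hypothetical values chosen purely for illustration, not measurements of real objects. What it shows is that the brightness of a pixel depends only on the total absorption along the ray, not on the order in which the objects are stacked.

```python
import math

def transmitted_fraction(layers):
    """Fraction of x-ray intensity that survives a stack of layers.

    Each layer is (attenuation_coefficient_per_cm, thickness_cm).
    Simplified single-energy Beer-Lambert model: I / I0 = exp(-sum(mu * t)).
    """
    return math.exp(-sum(mu * thickness for mu, thickness in layers))

# Hypothetical, illustrative values only.
spanner = (2.0, 0.5)  # dense but thin
bottle = (0.2, 6.0)   # less dense but thicker

# Same ray, same two objects, opposite stacking order.
spanner_in_front = transmitted_fraction([spanner, bottle])
spanner_behind = transmitted_fraction([bottle, spanner])

print(spanner_in_front, spanner_behind)
```

Both orderings give the same value, which is why a single shadowgraph cannot tell the screener which object is nearer the top of the bag.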
In many cases this lack of absolute depth knowledge is not a problem. For example, with medical x-rays looking to see if a bone is broken we already know where that bone is, so depth information is not generally of any additional use. All we want to know is: “Is it broken?”
However, there are occasions when knowing not only that an object exists in the image but also how that object relates to its surroundings is vital. Security screening is perhaps the most important example of a situation where depth information can make the difference between detecting and missing a threat.
3D x-ray image acquisition
Essentially, to generate any 3D x-ray image it’s necessary to acquire images from two or more different views. In understanding how we generate 3D x-ray images for security applications, it’s also essential that we first understand the operational requirements.
If we look at checkpoint, cargo, freight or even EOD applications, they all rely on a given screener interpreting the image in a relatively short time frame – typically 5-to-7 seconds for checkpoint systems – so it’s necessary that there is no delay between screening the object and displaying the image.
That being the case, the approach taken in the medical sector (whereby the x-ray camera is rotated the full 360 degrees around the patient) is clearly not appropriate in the time-limited security scenario, since such a detailed scan can take several minutes.
A far simpler approach is to mimic how the human eye views the real world, ie acquiring images of the object from two slightly different angles, one from the right and the other from the left, in turn recreating binocular disparity. If these images are then ‘played back’ to the screener such that the screener’s right eye only sees the right perspective and the left eye only the left perspective, the brain will automatically fuse these images into a 3D representation of the scene.
Although it’s not a full 3D model of an object, the representation provides sufficient information for security screening.
This effect is achieved by using two line scan x-ray cameras each positioned to produce the required stereoscopic disparity (again, see the image at the foot of this article). The image is then presented to the screener on a special polarising monitor.
The advantage of this technique is that a real-time image can be produced, thus providing the screener with an instant stereoscopic 3D representation of the scene.
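To give a feel for how two views encode depth, here is a minimal sketch under a deliberately simplified geometry. It assumes two parallel projections tilted symmetrically either side of vertical, with the object's height measured from the conveyor belt. The function names, the 10-degree separation and the heights are hypothetical; the article does not specify the actual camera geometry, and a real line scan system would need its own calibration.

```python
import math

def disparity_mm(height_mm, view_separation_deg):
    """Horizontal shift of a point between the left and right views.

    Simplified model: each view is tilted by half the separation angle,
    so a point at height h shifts by h * tan(half_angle) in each view,
    in opposite directions.
    """
    half_angle = math.radians(view_separation_deg) / 2.0
    return 2.0 * height_mm * math.tan(half_angle)

def height_from_disparity_mm(disparity, view_separation_deg):
    """Invert the model above to recover height from a measured shift."""
    half_angle = math.radians(view_separation_deg) / 2.0
    return disparity / (2.0 * math.tan(half_angle))

# Hypothetical example: two items, one low in the bag and one near the top.
for height in (20.0, 300.0):
    shift = disparity_mm(height, view_separation_deg=10.0)
    print(f"height {height:.0f} mm -> disparity {shift:.1f} mm")
```

In this model an item near the top of the bag shifts further between the two views than one lying on the belt. That difference is the binocular disparity the screener's brain fuses into depth, with no explicit calculation needed at the monitor.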
3D and security: what are the benefits?
So how does this technique benefit security operatives? The answer to that question comes in three parts.
- Improved detection/reduced false alarms
Security screeners use x-ray images to detect potential threats within baggage, freight and packages. As mentioned, these screeners generally only have a limited time to make their decision – typically 5-to-10 seconds in the case of baggage inspection – so anything that can help make that decision more accurately and faster is of significant benefit to both the screener and the passenger.
The challenge to the security screener is to determine from the x-ray image if an item inside the object being screened is a threat or benign. Due to the nature of x-rays this can be quite a daunting task since even a set of innocent objects positioned carefully in a bag may look threatening.
For example, consider a suspect improvised explosive device (IED). Such devices usually consist of a number of key components: an explosive material containing a detonator connected to some form of timing/triggering device powered by some form of power supply (view the image at the foot of this article).
Seeing the combination of components shown, and having no further information, the screener would have to assume it was a threat and take immediate action. However, if we could also provide depth information we may discover that what we actually have is the scenario in the fourth image below.
In this example we have assumed that the only difference between the two x-ray images is the 3D displacement of the items within the baggage, but the reaction of the screener would differ significantly depending on whether the 2D or the 3D image was presented. The 2D image would generate an IED alarm, whereas the 3D image would be correctly interpreted as a set of benign items.
In the reverse scenario where an object has been placed specifically to confuse a screener viewing the object in 2D, such as placing a gun vertically in line with the detectors, the 3D image would clearly indicate the depth and shape of the object.
In this example, the chance of detection of the threat would be significantly improved in 3D compared to 2D.
- Reduced training times/faster time to competency
3D is the way we normally view the real world and, consequently, a conventional shadowgraph 2D x-ray image is inherently difficult to interpret (no matter how well you are trained).
Providing the screener with a 3D x-ray image both makes the scene more natural and aids interpretation. As a consequence, trainers see a significant reduction in training times of new screeners simply because the screener ‘understands’ the x-ray scene more readily in the natural 3D view compared to the confusing 2D presentation.
Similarly, it’s recognised that a screener who has been away from the inspection machine for any significant period of time needs a period of readjustment on returning to the job, as they re-familiarise themselves with the x-ray image and its interpretation.
Again, with 3D images this time to competency has been seen to be significantly reduced when compared to conventional 2D systems.
- Reduced search times
A final, but no less important benefit of 3D imaging is related to possible hand searches of the object under inspection.
It must be acknowledged that there will always be occasions when, even with 3D images, the screener is uncertain about a specific item. In such circumstances it will be necessary to open the object and visually inspect the item. This process is intrusive to the customer and can take a significant period of time (potentially delaying other customers or the throughput of freight).
However, a major benefit of 3D imaging is that it enables the screener to go immediately to the item in question since they know exactly where the item is (ie: “It’s in the left hand corner close to the top of the bag”). In a conventional 2D system all they would know is that the item (or items, since it may just be an amalgamation of several images superimposed on each other) is in the left hand corner. They would have to empty the whole bag or container in that region, top to bottom, to ensure they have found what they were looking for.
Again, 3D imaging reduces the need to search and, where a search is still necessary, reduces the time it takes.
Making the unnatural more natural
Ultimately these benefits are achieved through the simple fact that 3D imaging makes the wholly unnatural x-ray image appear more natural. Although the principle is incredibly simple, it delivers some powerful benefits.
Screening and inspection remain key components of security operations in a huge variety of sectors and applications – for those looking to enhance their operations, 3D imaging should be a serious consideration.
Nick Fox is CTO of 3DX-Ray