Written by Sgt. Christopher Andreacola
WE HAVE ALL HEARD THAT SAYING: “A
picture is worth a thousand words.” In the field of forensic video and image
analysis, that adage is certainly applicable. Increases in storage space and
processor speed, along with improved technology, have put higher-quality video,
including 4K-resolution footage, into many of our law enforcement investigations.
The quality of images produced from this evidence can often solve
cases that would have gone unsolved in the past. If a picture is worth a
thousand words, then a video must be worth tens of thousands… unless it is the video
of a shooting.
As a sergeant heading the video
analysis unit with the Tucson Police Department, I respond to all our officer-involved
shootings (OIS). In addition, we are often asked to analyze surveillance and
cell phone video in homicide investigations involving firearms. Over the past eight
years, I have come to appreciate the audio track as evidence just as much as, if
not more than, the images. Let me start by looking at the basics of conducting
a forensic analysis of a shooting captured on video.
After completing the standard
verification and authentication review of the evidence, I start by watching the
video. I try to watch it with as few assumptions as possible and make some
notes about my basic observations as a viewer. I look and listen for any
indications of firearm use. If I am lucky, I will have at least two major parts
of the video file: the video track and the audio track.
The analysis must start somewhere.
Whether you start with the video track or the audio track, there is no “right” place
to begin. I guess you could say I am biased, as I started my career as a video
analyst, but I recommend starting with the video. To avoid any other bias
caused by the sounds on the audio track, I start with my first file and complete
a full analysis of the video track alone. I make notes on my observations to
include frame numbers, timing, and the visual evidence that I see. I continue
and perform this analysis for any of my other video tracks. After I have
reviewed all the video tracks independently, I compare the results in any files
that were recorded at the same moment in time to look for images that capture
the event from different cameras.
Next, I conduct a full analysis
of the audio track (or tracks), making notes regarding timing, amplitude
(loudness), and any other audible evidence I hear, independently. Then I
compare any other audio tracks that were recorded at the same moment in time.
My last step involves comparing my
notes of the video and audio tracks to see where they match. This process is
performed individually on each file, and then collectively on any files that
were recorded at the same moment in time.
When it comes to video evidence
of a shooting, there are several things to look for: weapon mechanics, recoil,
muzzle flash, projectiles, casings, impacts and/or ricochets. Many of these
observations can be seen in the video below.
Weapon mechanics include things
like the pulling of a trigger, the rotation of a cylinder, or movement of a
slide on a semi-automatic. Depending on the action observed, they can indicate
if the weapon is about to fire (by the movement of the trigger toward the rear
of the weapon and the rotation of the cylinder); or if the weapon is in the
process of firing or has already fired (by the backwards movement of the slide).
Recoil is the result of physics.
For every action, there is an equal and opposite reaction. Keep in mind that
recoil is not only about the weapon barrel. You may see recoil in the hand,
arm, shoulder, or even the whole body of the shooter. Because recoil is a reaction,
it always occurs after the shot is fired, and it is more likely to be observed
than weapon mechanics because it lasts longer.
At its most basic, the act of
shooting a firearm is just a controlled explosion. As a result, we can expect
to see fire and smoke. For firearms, we refer to that as muzzle flash—demonstrated
in the video below.
Most of the research indicates
muzzle flash lasts 1 to 3 milliseconds, usually closer to 1 millisecond.
Sometimes the evidence can help prove that all by itself without any of the
In the image on the left, a
subject wearing a white shirt, red jacket, and blue pants is standing still and
firing a weapon at the approaching officer. You can see the muzzle flash at about
chest height. The image to the right was captured at the same moment in time by
another officer’s body worn camera (BWC). This second officer was running
toward the subject and, as a result, there is significant motion blur.
Motion blur occurs when the
camera is trying to take a picture while it is moving. It is more often seen
during low-light conditions. The time the camera needs to capture a picture—the
sampling period—is increased due to the lack of available light. If there is
movement of the lens during that sampling period, the camera tries to record all
that movement on the one image, and you end up with motion blur.
However, notice there is no
motion blur of the muzzle flash in either image. The flash occurred so fast (1
millisecond) during the extended sampling period that it remains as a small
The last things I look for are
projectiles or casings leaving the weapon and impacts or ricochets caused by
the projectile. The following image shows these observations in two consecutive
frames of video.
There is no doubt these images
provide valuable evidence when found. But that is just the problem: they must
be found. As it turns out, these observations are rarely captured on video.
Most of today’s advancements in technology are geared toward creating higher-resolution
images with more pixels. More pixels are not going to solve the main problem
here. This problem stems primarily from the frame (or sample) rate and sampling
period.
Modern video in the U.S. is typically
recorded at 30 frames per second. In essence, 30 samples of short periods of
time are captured every second. Generally, the sampling period is about half the
frame interval. This means that, with a frame rate of 30 samples taken every second
(1 frame every 1/30th of a second, or about 33 milliseconds), the period
the camera is recording that individual frame is about 1/60th of a
second, or about 17 milliseconds.
Although this is often referred
to as the shutter speed, in modern digital video recorders the shutter has been
replaced with an image sensor; however, the principle is the same. Considering
the speed at which a weapon can be fired, even under ideal conditions with a
standard frame rate, at most we would only observe half of the visual
indicators of a shooting.
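The timing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a forensic tool: the function name `frame_timing` is hypothetical, and the assumption that the sensor exposes for exactly half of each frame interval is the rule of thumb described above, not a property of any particular camera.

```python
def frame_timing(fps, exposure_fraction=0.5):
    """Return (frame interval, exposure time) in milliseconds.

    exposure_fraction is an assumed value: the share of each frame
    interval during which the sensor is actually collecting light.
    """
    frame_interval_ms = 1000.0 / fps
    exposure_ms = frame_interval_ms * exposure_fraction
    return frame_interval_ms, exposure_ms


interval_ms, exposure_ms = frame_timing(30)
print(f"Frame interval: {interval_ms:.1f} ms")  # about 33.3 ms
print(f"Exposure time:  {exposure_ms:.1f} ms")  # about 16.7 ms
# Share of real time the camera is recording under this assumption
print(f"Coverage: {exposure_ms / interval_ms:.0%}")  # 50%
```

Under these assumptions the camera is recording only about half of the time, which is why a 1 to 3 millisecond event can fall entirely between samples.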
Most of the visual indicators I
have been discussing occur in approximately 1-3 milliseconds. As a result, generally
they will only be observed in one single frame of video, if at all. During a
study of ten OIS cases in 2018, I found visual indicators of a shot fired only
25% of the time. When you consider things like surveillance cameras recording
at much lower frame rates, low-resolution images, poor lighting, movements, and
camera positions, it is common to not find any visual indications of the
discharge of a firearm anywhere on the video track.
As it turns out, in a shooting
incident a picture is not always worth a thousand words. But that audio
track can be priceless.
Just like the video, there are several
things I listen for when evaluating the audio tracks. The first three—shell
casings hitting the ground, the impact or ricochet of the projectile, and the
mechanical actions of the weapon (slide cycling in a semi-automatic)—are
self-explanatory. Unfortunately, because these sounds are often so faint and are
usually occurring during the act of firing the weapon, they are often covered
up by the two more pronounced audible indicators of a gunshot: muzzle blast and
the shock wave.
Muzzle blast is the audible
equivalent of the muzzle flash. It is the result of that controlled explosion
and is the most obvious indicator of gunshots on an audio track, especially if
the gunfire occurs close to the microphone. However, there are two important issues
that I have learned to take into consideration.
The first is the inability to
distinguish gunshots from other loud bangs when captured on an audio device. Now,
to be clear, I am not referring to scientific experiments with multiple extreme
frequency-range microphones strategically placed to determine shots fired from
a rifle versus a handgun. I am also not referring to research comparing the calibers
of weapons fired in a controlled environment. I am referring to surveillance
footage captured with one outdoor microphone on the other side of the building,
or cell phone video captured a block away, or a BWC video right in the middle
of the action.
Muzzle blast is considered a
percussion event. Why? Because the science at this point does not provide the
data or tools to differentiate between a car exhaust, a boxer hitting a bag, bubble
wrap popping, the beating of a drum set, or a gunshot. When evaluating sound,
the two basic tools we use to visually represent the sound are a waveform, which
shows the amplitude or loudness of the sound over time, and a frequency spectrum,
which shows the range of frequencies (pitch) that make up the sound.
All five of these sounds have
the same basic features. They are sudden, loud, and fill the frequency
spectrum. I would be very careful about calling a sound a gunshot rather than
bubble wrap popping until I have listened to it in context with the video images
and considered all the other evidence I am evaluating. Which letter do
you think represents the gunshot? (See the end of this article for the answer.)
Now, the loudness or amplitude
of the muzzle blast, when considered in context with the visual images, can
provide evidence of the distance to the microphone, difference in caliber of
multiple weapons fired, and even the movements of the shooter toward or away
from the recording.
The other pronounced audible
indicator when dealing with supersonic rounds is the shock wave, sometimes
referred to as the crack. There is enough data about shock waves and their use
in determining distance and direction of weapons fire to write another article.
If you want more information on this topic, I suggest starting with the video
The second important issue I
take into consideration is the speed of sound. This differs dramatically from
the speed of light. If a single event is captured visually on two or more
cameras, regardless of the distance of the event from the cameras, you can
confidently state it is happening at about the same moment in time,
allowing you to sync the video images. This is due to the speed of light being
constant at about 186,000 miles per second. If the single event was captured on two
cameras half a mile apart, the difference in time it takes the image (light) to
reach and be “seen” and recorded by the two different cameras is relatively
insignificant.
In this same scenario, the speed
of sound is very significant and, as a result, what you hear from two separate
recordings of the same incident will often sound significantly different. The
speed of sound varies with conditions such as air temperature, but a commonly
used figure is approximately 1,125 feet per second. This fact, when taken into
consideration, can provide significant
additional evidence, including things like the timing of the shots fired, the
position of the weapons in relation to the recordings, and possibly movements
of the shooters. However, if the speed of sound is ignored and the sound heard
is assumed to occur at the exact same moment in time as the frame of video it is
synced to, improper conclusions will often be reached.
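To put rough numbers on the difference between light and sound, here is a minimal Python sketch. The half-mile distance is a hypothetical value chosen to echo the scenario above, the function names are illustrative, and the speeds are the approximate figures quoted in this article rather than measured values for any real scene.

```python
SPEED_OF_SOUND_FT_S = 1125.0      # approximate; varies with conditions
SPEED_OF_LIGHT_MI_S = 186_000.0   # approximate
FEET_PER_MILE = 5280.0


def sound_delay_s(distance_ft):
    """Seconds for sound to travel distance_ft."""
    return distance_ft / SPEED_OF_SOUND_FT_S


def light_delay_s(distance_ft):
    """Seconds for light to travel distance_ft."""
    return (distance_ft / FEET_PER_MILE) / SPEED_OF_LIGHT_MI_S


def delay_in_frames(delay_s, fps=30):
    """Express a delay as a number of video frames."""
    return delay_s * fps


# Hypothetical: the recorder is half a mile from the gunshot
distance_ft = 0.5 * FEET_PER_MILE
snd = sound_delay_s(distance_ft)
lgt = light_delay_s(distance_ft)
print(f"Sound: {snd:.2f} s ({delay_in_frames(snd):.0f} frames at 30 fps)")
print(f"Light: {lgt * 1e6:.1f} microseconds")
```

With these assumed values, the sound arrives roughly 2.35 seconds (about 70 frames) after the event, while the light arrives in under 3 microseconds, far less than a single frame.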
I was asked to assist in an OIS
where an assumption was made based on the single BWC of the involved officer. The
officer stated the suspect fired three shots before the officer returned fire.
However, the audio on his BWC recorded two distant shots followed by what was
clearly the officer’s shot. I located additional footage from officers in the
surrounding area and, after syncing the recordings, discovered three distinct
shots followed by a fourth, as described by the involved officer.
In addition, the amplitude of
the sound was consistent with the positions of all involved. The involved
officer’s shot is obviously loudest on his microphone, whereas his shot is
quieter on the surrounding officers’ microphones because they were closer to
If you are like me, you are
probably asking yourself, Who cares if the officer fired after the second
round was fired by the suspect versus the third round? It was still justified.
I agree, but this is about the collection of evidence and what an audio track
can provide to enhance your analysis. What if the involved officer fired after
only the first shot by the suspect? What if (as in this incident) the involved
officer’s audio covered up the suspect’s shot? What if the investigators could
not find the projectile fired by the suspect? What if the suspect denied ever
firing a shot? When we perform these investigations the same way every time and
consider all the available evidence, even in the cases where it “probably
doesn’t matter,” we are preparing ourselves for the time it does.
So why are we getting so much evidence
out of an audio track? It is all about data over time. It is not uncommon for a
single frame of video recorded in high definition to contain more data than the
entire audio track in a short video clip. But, over time, the audio provides thousands
more samples of what is occurring.
As I discussed earlier, a video
recorded at 30 frames per second is only providing 30 samples of moments in time
for that single second. If any part of your event—or, more specifically, the
evidence you are looking for—occurs when the camera is not sampling, you miss
it. The average audio track records at 44,100 samples per second, almost 1,500
times that rate. As a result, in the average shooting, which takes place in a matter
of seconds, the audio is truly worth tens of thousands of words.
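The comparison between audio and video sampling can be worked through with a short Python sketch. The function name `samples_during` is hypothetical, and the 1 and 3 millisecond durations are simply the range quoted earlier for visual indicators; actual recorders may use other frame rates and audio sample rates.

```python
VIDEO_FPS = 30
AUDIO_SAMPLE_RATE_HZ = 44_100


def samples_during(event_ms, rate_per_s):
    """Number of samples taken at rate_per_s during an event of event_ms."""
    return rate_per_s * event_ms / 1000.0


ratio = AUDIO_SAMPLE_RATE_HZ / VIDEO_FPS
print(f"Audio samples per video frame: {ratio:.0f}")  # 1470

for event_ms in (1, 3):
    audio = samples_during(event_ms, AUDIO_SAMPLE_RATE_HZ)
    video = samples_during(event_ms, VIDEO_FPS)
    print(f"{event_ms} ms event: {audio:.0f} audio samples, "
          f"{video:.2f} video frames")
```

Even a 1 millisecond event spans about 44 audio samples at this rate, but only about 3% of one video frame interval.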
About the Author
Sergeant Christopher Andreacola
is a 34-year veteran of the Tucson Police Department and currently heads their
Video Analysis and Management Unit. He is a Certified Forensic Video Technician
with LEVA and teaches nationally on the use of in-car and body-worn cameras and
analyzing video from officer-involved shootings.