Written by Johnmichael
TRADITIONALLY, detectives seeking to solve crimes gather physical evidence such as fingerprints, drugs, and firearms.
Cases can take weeks, months, even years to solve—or go unsolved altogether—not
because detectives lack the skills, but simply because they lack the resources
to gather the necessary information to solve crimes.
New developments in technology have
delivered a radical upgrade for criminal investigation, giving investigators
tools to rapidly search online for information needed for an investigation—information
that would be time-consuming or even impossible to find by traditional methods.
At the same time, advances in
technology have opened up new territory for the criminal element to exploit,
and ways to hide illegal activity out of sight of the law. More than ever
before, it’s a cat-and-mouse game.
Police investigating criminal
activity online face myriad challenges. When criminals commit offenses,
communicate with each other, or move information or potential evidence online,
they do not operate in a user-friendly place like a website with a username and
password. Rather, they move deep into another world—a world that is accessible
to investigators with the right tools, but which proves a nightmare for the
uninitiated trying to navigate and locate the evidence needed to solve crimes.
Three sectors contribute to the
complexity of criminal investigations online:
The surface web is the part
most of us visit every day. We use a search engine like Google, in a browser
such as Firefox, to find data, read news and mail, shop, and research. All
information stored and found on the surface web is indexed and accessible.
The deep web lies beneath
the surface web. Here information is not indexed and cannot be found by
search engines. Most of it is perfectly above board, but it can contain illegal
content. Deep web pages generally require usernames and passwords for access.
Typical examples include
subscription information, bank accounts, corporate intranet content, and personal
Note that this data is not completely
inaccessible: Google can pull results from the deep web just like a headline in
a search query, and users can see snippets of the site in the search result.
Interestingly, a large amount of data on the deep web is of great interest to
Much of the material on the dark
web is intentionally hidden and potentially illegal. However, not all of it is nefarious: many individuals
and organizations use the dark web for beneficial purposes, such as journalists
soliciting information from confidential sources, or communication in politically
oppressed regions. Stored data is encrypted, and most dark web pages are hosted
The distinguishing feature of the dark
web is that it cannot be accessed with the same browsers used for the surface
or deep web: it requires the Tor Browser or similar anonymizing software that is
freely available. Data posted here is not indexed, so searching is complicated
Criminals frequent all parts of the cyber world. Some use social
media accounts (which are deep web, because they are password protected) to
communicate with each other, issue threats, distribute propaganda to followers,
or research potential threat actors.
Some move to the dark web to congregate,
communicate, and collaborate anonymously and to trade and share stolen assets
and information. When an investigation begins online, it invariably leads to the
dark and deep webs.
The surface web as we know it
holds only around 4% of all online information. The other 96% is found on the deep and
dark webs, and this is where investigators conduct the bulk of their detective
work. To grasp the astronomical amount of data investigators contend with,
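consider that the World Economic
Forum estimates that by 2025, 463 exabytes of data will be
created each day globally. At 4.7 GB per single-layer disc, that is roughly
98.5 billion DVDs per day; the often-quoted figure of 212,765,957 DVDs is the
number of discs needed for a single exabyte, not for 463.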
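The DVD comparison is easy to check with a little arithmetic. The sketch below assumes decimal units (1 exabyte = 10^18 bytes) and a 4.7 GB single-layer DVD; the disc capacity is an assumption, since the estimate itself does not state one. Under those assumptions, 212,765,957 discs hold about one exabyte, and 463 exabytes need roughly 98.5 billion.

```python
# Assumptions (not stated in the estimate itself): decimal units and a
# 4.7 GB single-layer DVD.
EXABYTE_BYTES = 10**18
DVD_BYTES = 4_700_000_000


def dvds_needed(total_bytes: int, dvd_bytes: int = DVD_BYTES) -> int:
    """Return how many whole DVDs of the given capacity fit in total_bytes."""
    return total_bytes // dvd_bytes


dvds_per_exabyte = dvds_needed(EXABYTE_BYTES)
dvds_per_day = dvds_needed(463 * EXABYTE_BYTES)

print(f"{dvds_per_exabyte:,}")  # 212,765,957
print(f"{dvds_per_day:,}")      # 98,510,638,297
```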
Investigating a case requires evidence in many forms, including
digital formats. Given the sheer volume of unstructured online data,
investigators have their work cut out for them.
Law enforcement agencies face a multitude of challenges online. They
have to figure out how to find evidence on the deep, dark, and surface webs,
and how to navigate between structured and unstructured data. They need access
to the correct browsers, plus they must understand how the deep and dark webs
function. Moreover, they need to know which search terms to utilize: names, IP addresses,
keywords, hashtags, locations, websites, social media platforms, batches of
numbers including telephone numbers, and cryptocurrency wallets, among others.
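As an illustration of how varied these search terms are, the sketch below sorts a raw term into one of the categories listed above before it is handed to a search. Everything here is hypothetical: the function name `classify_term`, the category labels, and the patterns are assumptions made for this example, not the behavior of any particular product. The patterns check only the shape of a term; they do not validate that a phone number is in service or that a wallet address has a valid checksum.

```python
import ipaddress
import re

# Illustrative shape-only patterns (assumptions for this sketch).
HASHTAG = re.compile(r"#\w+")
# Legacy Bitcoin-style address shape: Base58 alphabet, starts with 1 or 3.
WALLET = re.compile(r"[13][a-km-zA-HJ-NP-Z1-9]{25,34}")
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")
WEBSITE = re.compile(r"(https?://)?([a-z0-9-]+\.)+[a-z]{2,}(/\S*)?", re.IGNORECASE)


def classify_term(term: str) -> str:
    """Return a coarse category label for a single search term."""
    term = term.strip()
    try:
        # The standard library parser handles both IPv4 and IPv6.
        ipaddress.ip_address(term)
        return "ip_address"
    except ValueError:
        pass
    if HASHTAG.fullmatch(term):
        return "hashtag"
    if WALLET.fullmatch(term):
        return "crypto_wallet"
    if PHONE.fullmatch(term):
        return "phone_number"
    if WEBSITE.fullmatch(term):
        return "website"
    # Names, locations, and free text all fall through to a plain keyword.
    return "keyword"


for raw in ["192.0.2.10", "#protest", "+1 (860) 555-0100", "example.org", "John Doe"]:
    print(raw, "->", classify_term(raw))
```

Checking the stricter patterns first keeps an IP address or wallet string from being mistaken for a phone number or a website.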
Information found online should not be confused with data that could
serve as evidence or even be considered admissible as evidence. Role players
must be identified together with their associates (Social Network Analysis),
their connections verified, and their part in the crime firmly established.
Threats and risks also need to be worked out. The investigators must know how
to take the information found and use it to support a subpoena, if relevant to
Once information is found, investigators then have to move data
through the intelligence cycle and on to the due diligence and evidence vetting
processes. Evidence needs to be confirmed and validated, and irrelevant
information discarded. With online evidence in hand, detectives then start to
build a case for prosecution.
Clearly, with so much complexity,
so much data, and so much darkness on the web, there is no room for error.
Missing a detail could mean a case goes unsolved, and a felon walks free.
So, what is the solution?
The leading solutions for law enforcement investigators working
online utilize machine learning (ML) and artificial intelligence (AI). ML and
AI working in tandem provide law enforcement with an automated methodology that can search the open, deep, and dark webs to pinpoint illegal
activities and bring malicious actors to justice. Together, ML and AI rapidly make
sense of the vast amounts of unstructured and uncategorized data on the web.
ML, a branch of AI, lets software learn patterns from data rather than follow
hand-written rules. The broader AI component applies what has been learned to
make or recommend decisions.
It is essential for law enforcement purposes that ML algorithms be
fine-tuned to analyze, label, and sort Big Data. The AI component must be able
to identify and extract the relevant intelligence for investigators using a
number of different procedures and capabilities.
Without these capabilities, investigators working on a crime
committed from the deep or dark web—or those working on a real-world crime with
components of the crime posted online—would have to manually gather data from
individual online sources, analyze it, and then put all the puzzle pieces
together to see the crime in its entirety. The risk with this time-consuming
and cumbersome method is that key parts of the crime may be missed or
overlooked, which could negatively impact the overall investigation.
With an open-source intelligence solution (one that draws on data from
publicly available sources) built on ML and AI capabilities, the investigative
team can look for any evidence they may need in the online world,
starting on the surface web, drilling down to the deep web, and going all the
way into the dark web, often looking for, and finding, a needle in a haystack.
An open-source intelligence solution delivers value to police and
law enforcement investigators by mitigating their investigative challenges and
expanding their scope. It can search for a wide array of terms from publicly
available information, and it can initiate a deep search through a variety of
open-source information on all layers of the web and across all social media
platforms and blogs.
When the solution finds the data it is searching for, it collects,
analyzes, and presents it to investigators in an easy-to-read, easy-to-interpret
format. Investigators can then analyze the data presented and identify
potential threat actors or new
threats. They can prevent harmful incidents from occurring in the first place, support
the evidence confirmation and validation procedure, and rule out irrelevant
Open-source intelligence solutions are not complicated to use; they
require only minimal training. Open-source intelligence solutions increase the investigative
team’s speed, accuracy, and capabilities, and reduce the actual cost of an
Used effectively, an open-source intelligence solution powered by ML
and AI can both predict a potential crime and direct investigators to the
person or persons involved. The enormous value delivered by the ML and AI
components can help overcome human limitations of research and analysis and connect
small pieces of data that may seem irrelevant but are in fact pertinent to the
Cobwebs Technologies provides an effective ML and AI web intelligence
solution, trains investigators to use it, and provides after-sales
support to all clients. In addition, under certain circumstances, Cobwebs
provides expert analyst support to help investigators find the relevant data
for scrutiny before it can be considered as evidence. The following two cases detail
actual events. Identifying details have been omitted so as not to compromise
A state law enforcement
agency asked Cobwebs to help with a case involving an individual who used an
open social media platform to issue a threat about a mass shooting at a particular
event. The online threat was accompanied by a picture of the person in makeup
holding what appeared to be a real firearm. This individual went on to make
several additional threatening comments in other social posts.
This particular case was
extremely urgent because the law enforcement agency received the tip on the very
same day that the threatened event was to be held. The law enforcement agency
could not determine any additional information from the postings, so they
approached a third party for help. When that party could offer no additional
insights, the law enforcement agency was referred to Cobwebs.
Using its open-source
intelligence capabilities, Cobwebs identified the threat actor’s social network and found multiple accounts for
the same individual. They also uncovered additional photos the threat actor
posted of firearms, including one in which the serial number was identifiable. This
information was collected and turned over to the investigating agency for
In addition, Cobwebs discovered
a post where the threat actor made a public comment that he was registering his
firearm. This information was immediately traceable. Following a detailed
cross-analysis of his many social media accounts, one particular post led to a photo
which clearly showed his face, as well as some flyers for discussions he hosted
listing his full name.
identified the threat actor and made contact. The outcome of this case has not yet
A local law enforcement
agency, as part of a larger task force, was investigating a human trafficking
case. The investigating agency had already identified one potential victim as
well as two alleged traffickers. Searching for more leads, the investigative
agency contacted a specialized non-profit organization that referred the local
law enforcement agency to Cobwebs.
Cobwebs assisted in
identifying the human trafficking network, all connections between the role
players, and other possible threat actors and victims. With the known profile
information provided, Cobwebs used ML and AI to initiate a full open-source
intelligence sweep of the entire online environment using specific search
terms. In a very short time, a much larger human trafficking network was
discovered and identified. Using this network as a starting point, Cobwebs
found a number of common connections and then identified an established core
network among all parties involved.
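The idea of finding common connections among known profiles can be pictured with a very small graph example. This is only a sketch of the general social network analysis concept using plain set operations; it is not Cobwebs' method, and the account names, the helper functions, and the `min_links` threshold are all hypothetical.

```python
from collections import Counter


def build_graph(edges):
    """Build an undirected adjacency map from (account, account) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def common_connections(graph, seeds):
    """Accounts directly linked to every one of the seed accounts."""
    neighbor_sets = [graph.get(seed, set()) for seed in seeds]
    if not neighbor_sets:
        return set()
    return set.intersection(*neighbor_sets) - set(seeds)


def core_candidates(graph, seeds, min_links=2):
    """Accounts linked to at least min_links of the seed accounts."""
    counts = Counter(
        neighbor
        for seed in seeds
        for neighbor in graph.get(seed, set())
        if neighbor not in seeds
    )
    return {account for account, links in counts.items() if links >= min_links}


# Hypothetical, made-up connections between known profiles and other accounts.
edges = [
    ("trafficker_1", "contact_x"),
    ("trafficker_2", "contact_x"),
    ("trafficker_1", "contact_y"),
    ("victim_1", "contact_x"),
    ("victim_1", "contact_y"),
    ("trafficker_2", "contact_z"),
]
seeds = ["trafficker_1", "trafficker_2", "victim_1"]
graph = build_graph(edges)

print(common_connections(graph, seeds))       # {'contact_x'}
print(sorted(core_candidates(graph, seeds)))  # ['contact_x', 'contact_y']
```

An account that appears in these results is only a lead for further checking; as with any online information, a link between profiles still has to be verified before it can support a case.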
investigative process, Cobwebs was able to identify five additional
profiles belonging to the original victim. This in turn led to the identification
of several other potential victims, as well as another solid threat actor. The
information was then passed on to the investigating agency for evidence
With the information
provided, the law enforcement agency concerned was able to significantly expand
their investigation past the original threat actor. The outcome of this case
has not yet been disclosed.
ML and AI are core
components that assist investigators in finding evidence in a criminal case and
help investigative teams achieve successful arrests and prosecutions.
Without access to ML and AI within a larger web intelligence solution, criminal
investigators would find it nearly impossible to fully investigate a case
involving online aspects, case-resolution rates would decrease, public
confidence in law enforcement would suffer, and crime rates could increase. ML
and AI significantly advance the identification, confirmation, and utilization of
online evidence and the prosecution of threat actors for law enforcement
Johnmichael O’Hare is the business development and sales
director at Cobwebs Technologies. He is the
former Commander of the Vice, Intelligence, and Narcotics Division for the
Hartford (Connecticut) Police Department. Prior to that he was the Project
Developer for the City of Hartford's Capital City Command Center (C4), a Real
Time Crime Center (RTCC) that reaches throughout Hartford County and beyond. C4
provided real-time and investigative back support for local, state, and federal
law enforcement partners utilizing multiple layers of forensic tools, coupled
with data resources, and real-time intelligence.