It’s a Marketing Mess! Artificial Intelligence vs Machine Learning

By Sean Martin, CISSP
With contributions by
Igor Baikalov, Scott Scheferman, and Carson Sweet
and special comments and support from Alan Zeichick


PART 1 of 3

Artificial intelligence is a thing. No matter where you turn, technology companies are selling AI as the secret sauce in their cybersecurity platforms, their decision support systems, their network analytics tools, even their email marketing software. You name it, it’s got “AI Inside.” You’ll see that acronym AI often, as companies refer to artificial intelligence that way – which in itself is pretty vague, as you’d expect for a term that’s been bandied about for many decades and has a great number of representative branches. In our current context, AI generally refers to hardware or software that thinks, learns, and cognitively processes data the same way a human would, although presumably faster and more accurately: Think about Commander Data from Star Trek as a human-shaped role model for what AI could become someday.

The latest marketing discovery of AI as a cybersecurity product term only exacerbates an already complex landscape of buzzwords that are just as poorly understood. A raft of associated terms comes along with it: big data, smart data, heuristics (which can be a branch of AI), behavioral analytics, statistics, data science, machine learning and deep learning. Few experts agree on exactly what those terms mean, so how can consumers of the solutions that sport these fancy features properly understand what those things are?

The overuse and misuse of AI-related terms makes it difficult for information security professionals to make heads or tails of the solutions available to them. For example, while not in the context of cybersecurity, the terms artificial intelligence, machine learning, and deep learning are sometimes used interchangeably throughout this entire article.

If images are a better way to consume this message, then the image below pretty much sums up the complexity of this term and all of the techno-spaces it lives in.

The issue for consumers is that they are being told that they should embrace artificial intelligence – and machine learning – as part of the solutions they buy, but vendors are too often communicating those two concepts as equivalent terms, and sometimes those terms are misrepresented. The complexity resides in the fact that machine learning is, by its methods, a form of artificial intelligence, but artificial intelligence does not always use machine learning.

With this in mind, let’s take some time to dig through the messaging and terms to uncover the truth, at least as it relates to the challenges the consumers of these technologies are trying to overcome.

By far the greatest danger of Artificial Intelligence is that people conclude too early that they understand it.
— Eliezer Yudkowsky, Research Fellow, Machine Intelligence Research Institute


Descrambling the Marketing Terms

To start, let’s examine some of the AI-oriented terms we encounter in the cybersecurity world. Certainly, there may be more, but we’ll focus on these first. The sample use cases presented at the beginning of each section were provided by Scott Scheferman, Director of Consulting for Cylance; they are designed to paint a general, non-InfoSec view for a common use of the term being discussed; the goal is to make the term relatable.


Big Data

Non-security example use case: The Apple Watch’s ultimate contribution to humanity will likely come in the form of massive big data health studies across all ages, sexes, geographies and demographics, covering everything from diabetes, to chemo, to diet and exercise.

From an InfoSec perspective, Igor Baikalov, Chief Scientist of Securonix, describes Big Data as a marketing term that combines existing technologies and architecture to achieve some specific goal, and evolves as the market gets saturated. Vendors can sell Big Data as 3 V’s (Volume, Velocity, Variety), cross-sell it as 4 V’s (+ Veracity), up-sell it as 5 V’s (+ Value), and – if they are really late to the market and desperate – down-sell it as 7 V’s (+ Variability and Visualization).

Not only is the definition of Big Data fuzzy, “but the problem with Big Data is that it is only that, data. The true value of the data comes with some analysis or other learning techniques applied to it,” says Baikalov.

“Big Data is tough because one never knows when that extra one piece of data is the missing link to something very very interesting hiding in the rest of the data,” says Scheferman. “So you might have 40 data types, but it’s the 41st data type you aren’t yet collecting, let alone learning from, that could have unlocked the value of the other 40 data types tremendously. And then there is always the 42nd data type that you don’t even know about that you’ve never even thought about collecting or integrating into the analysis.”


Analytics

Non-security example use case: Hybrid analytics will progressively show us what we’ve been missing all along on our own. By letting “the data find the data,” hybrid analytics are already being used to hunt criminals using a combination of statistical NLP (Natural Language Processing), time series analysis, graph analysis, heuristics and anomaly detection. (reference)

There are many types of analytics that are used in the security world; some are defined by vendors, others by analysts. Let’s begin by using the Gartner analytics maturity curve as a model for the list, with the insertion of one additional term slotted in the middle of the curve: Behavioral Analytics.

Descriptive Analytics (Gartner): Descriptive Analytics is the examination of data or content, usually manually performed, to answer the question “What happened?” (or What is happening?), characterized by traditional business intelligence (BI) and visualizations such as pie charts, bar charts, line graphs, tables, or generated narratives.

Baikalov explains that Descriptive Analytics is the realm of a SIEM (Security Information and Event Management system) like ArcSight: “these systems gather and correlate all log data and report on known bad activities.”

Diagnostic Analytics (Gartner): Diagnostic Analytics is a form of advanced analytics which examines data or content to answer the question “Why did it happen?”, and is characterized by techniques such as drill-down, data discovery, data mining and correlations.

Here, Baikalov says that “diagnostic Analytics is where link analysis tools like Palantir thrive: given a suspect, or security incident, they can figure out potential impact or root cause based on known relationships; it's a forensic activity heavily dependent on human analysts. A next-gen SIEM like Splunk combines both sets of capabilities in one tool – Descriptive + Diagnostic.”

Behavioral Analytics (sometimes called Behavior Analytics or Behavioral Analysis): Behavioral Analytics analyzes massive volumes of raw user event data to predict future actions and trends and to detect anomalies.

Baikalov explains that, “While not on the Gartner maturity curve, I would categorize Behavioral Analytics as the next evolutionary step up from Diagnostic Analytics. In addition to what bad we know about, has anything out of the ordinary happened and should we worry about it? Behavioral Analytics is looking for deviations from normal, be it temporal (has it happened before?) or environmental (has it happened to suspect's peers?).”
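To make the two kinds of deviation concrete, here is a minimal sketch in Python. It is not how any particular product works: the function names, the sample numbers, and the choice of a z-score with a cutoff of three standard deviations are all assumptions made for illustration. The point is only that the baseline is computed from observed behavior (the asset's own past, and its peers) rather than from a hand-written rule about what is bad.

```python
from statistics import mean, pstdev

def zscore(value, baseline):
    """Number of standard deviations `value` sits from the mean of `baseline`."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return 0.0 if value == mu else float("inf")
    return (value - mu) / sigma

def behavior_flags(today, own_history, peers_today, cutoff=3.0):
    """Flag a value that deviates from the asset's own past or from its peers.

    `cutoff` (in standard deviations) is an assumed tuning knob, not a standard.
    """
    return {
        # Temporal: has this asset behaved like this before?
        "temporal": abs(zscore(today, own_history)) > cutoff,
        # Environmental: are the asset's peers behaving like this?
        "environmental": abs(zscore(today, peers_today)) > cutoff,
    }

# Hypothetical daily counts of files downloaded by one user and by peers.
own_history = [12, 15, 11, 14, 13, 12, 16]
peers_today = [10, 14, 13, 15, 12, 11]

print(behavior_flags(240, own_history, peers_today))  # a sudden bulk download
print(behavior_flags(14, own_history, peers_today))   # an ordinary day
```

A real system would look at many signals at once and would use far more robust statistics, but the shape is the same: learn what normal looks like, then score the distance from it.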

"Anomaly in the behavior of any asset, be it user, computer system, application, or network device, is a good indicator of malicious activity,” says Baikalov. “The indicator does not rely on a priori knowledge of what exactly is wrong or on established thresholds, and is capable of detecting zero-day, low-and-slow, and APT (Advanced Persistent Threat) attacks." (reference)

Advanced Analytics (Gartner): Advanced Analytics is the autonomous or semi-autonomous examination of data or content using sophisticated techniques and tools, typically beyond those of traditional business intelligence (BI), to discover deeper insights, make predictions, or generate recommendations. Advanced analytic techniques include those such as data/text mining, machine learning, pattern matching, forecasting, visualization, semantic analysis, sentiment analysis, network and cluster analysis, multivariate statistics, graph analysis, simulation, complex event processing, neural networks.

Prescriptive & Predictive Analytics (Gartner): Prescriptive Analytics is a form of advanced analytics which examines data or content to answer the question “What should be done?” or “What can we do to make _______ happen?”, and is characterized by techniques such as graph analysis, simulation, complex event processing, neural networks, recommendation engines, heuristics, and machine learning.

“Predictive capabilities are a must-have feature in active development,” says Baikalov. As the predictive capabilities improve and false positives decrease, Behavior Analytics will gain enough credibility to work in Prescriptive mode, driving automated response based on the analytics' results. See the UK's new "active cyber-defense" initiative.


Traditional/Legacy AI

Non-security example use case: A traditional AI called an expert system is often used in the context of medical diagnosis. By ingesting reams of medical knowledge, the system asks a series of questions that allow the system to diagnose a disease by narrowing down the possible outcomes. Expert systems are narrowly focused on a particular problem.

Scheferman explains that this earliest form of AI is designed to do basic things that humans can do with relative ease. The general premise is that the AI system must possess a large amount of raw knowledge so that, when a question is asked of the expert system, it is able to work through a series of rules until a satisfactory answer is provided. In cybersecurity, the most evolved example of such an expert system would likely be IBM’s Watson for Cyber Security, which is ingesting over 75,000 documented software vulnerabilities, 10,000 security research papers published each year and 60,000 security blogs per month. (reference)

Like its predecessors, however, Watson for Cyber Security requires a significant number of domain experts to provide its data and to measure how good a job it is doing. Watson is unable to learn on its own, and it can only answer questions derived from the knowledge it has absorbed. Expert systems can nonetheless power very effective AI: Watson is often able to use pattern recognition, human interaction, NLP and data mining (of both structured and unstructured data) to predict an attacker’s next move. It’s impressive by any measure.
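The rule-driven narrowing that a classic expert system performs can be sketched in a few lines of Python. This is a toy, not a description of Watson or of any real product: the knowledge base, the question names, and the `diagnose` function are invented for this example. Each answer discards the candidates that contradict it, and the system has nothing to say once the answers fall outside what its experts wrote down.

```python
# Hypothetical knowledge base written by human experts: each incident type
# maps to the yes/no observations it is expected to show.
KNOWLEDGE_BASE = {
    "phishing": {
        "arrived_by_email": True, "many_failed_logins": False, "files_encrypted": False,
    },
    "brute force": {
        "arrived_by_email": False, "many_failed_logins": True, "files_encrypted": False,
    },
    "ransomware": {
        "arrived_by_email": True, "many_failed_logins": False, "files_encrypted": True,
    },
}

def diagnose(answers, knowledge_base=KNOWLEDGE_BASE):
    """Apply one answer at a time, keeping only the candidates consistent with it."""
    candidates = set(knowledge_base)
    for question, answer in answers.items():
        candidates = {
            name for name in candidates
            if knowledge_base[name].get(question) == answer
        }
        if not candidates:
            break  # nothing in the knowledge base matches these answers
    return sorted(candidates)

# Two answers are enough to narrow three candidates down to one.
print(diagnose({"arrived_by_email": True, "files_encrypted": True}))

# A combination the experts never described yields no answer at all.
print(diagnose({"arrived_by_email": False, "files_encrypted": True}))
```

The second call shows the limitation discussed above: the system does not learn a new category from the unfamiliar case, it simply runs out of rules.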

Now that we have examined some types of artificial intelligence, such as expert systems and analytics, you'll likely want to read the next article in this series: “Machine Learning: The More Intelligent Artificial Intelligence.” This is where software can grow beyond the constraints of human knowledge and actions – and it’s an area of great investment, and tremendous excitement.

Once you read parts 1 and 2, you'll certainly want to read the third article in the series: “The Actual Benefits of Artificial Intelligence & Machine Learning.” Here, we will explore how to move beyond the hype and confusion in order to see the real benefits of artificial intelligence and machine learning.

Part 2 was published on Tuesday, November 22nd.
Part 3 was published on Tuesday, November 29th.

Thank you contributors!