
Applied Data Science for Security Professionals

GTK Cyber, gtkcyber.com
July 22-23 & July 24-25


This interactive course will teach security professionals how to use data science techniques to quickly write scripts that manipulate and analyze network and security data and ultimately uncover valuable insights from it. The course covers the entire data science process, from data preparation, exploratory data analysis, and data visualization through machine learning and model evaluation to implementation at scale, all with a focus on security-related problems.

Participants will learn how to read data in a variety of common formats and then write scripts to analyze and visualize it. A non-exhaustive list of the topics covered includes:

  • How to write scripts to efficiently read and manipulate CSV, XML, and JSON files
  • How to quickly and efficiently parse executables, log files, and pcap files and extract artifacts from them
  • How to make API calls to merge datasets
  • How to use the Pandas library to quickly manipulate tabular data
  • How to effectively visualize data using Python
  • How to preprocess raw security data for machine learning and feature engineering
  • How to build, apply and evaluate machine learning algorithms to identify potential threats
  • How to use machine learning to identify anomalous network behavior and recognize potential network threats
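
To give a flavor of the kind of script the topics above point toward, here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the sample data, the field names (`src_ip`, `bytes_out`), the helper functions, and the z-score threshold are all assumptions invented for illustration. It reads connection records from CSV, enriches them with host roles from JSON, and flags hosts whose outbound traffic is unusually high.

```python
import csv
import io
import json
import statistics

# Hypothetical sample data, invented for illustration only.
CONN_CSV = """src_ip,bytes_out
10.0.0.1,1200
10.0.0.2,1500
10.0.0.3,1100
10.0.0.4,1300
10.0.0.5,98000
"""

HOSTS_JSON = '{"10.0.0.1": "workstation", "10.0.0.5": "printer"}'


def load_connections(text):
    """Parse CSV text into a list of dicts with bytes_out as an int."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["bytes_out"] = int(row["bytes_out"])
    return rows


def flag_outliers(rows, threshold=1.5):
    """Return rows whose bytes_out z-score exceeds the threshold."""
    values = [r["bytes_out"] for r in rows]
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [r for r in rows if (r["bytes_out"] - mean) / stdev > threshold]


rows = load_connections(CONN_CSV)

# Enrich each record with a host role looked up from the JSON document.
roles = json.loads(HOSTS_JSON)
for row in rows:
    row["role"] = roles.get(row["src_ip"], "unknown")

suspicious = flag_outliers(rows)
for row in suspicious:
    print(row["src_ip"], row["role"], row["bytes_out"])
```

A simple z-score is used here only because it is easy to follow; a real analysis would more likely rely on Pandas and a proper anomaly detection model.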

Finally, we will introduce students to cutting-edge Big Data tools, including Apache Spark (PySpark), Apache Drill, and GPU-accelerated parallel computing frameworks, and demonstrate how to apply these techniques to extremely large datasets.

Who Should Take This Course

Anyone who wishes to incorporate automated data analysis into their work.

Student Requirements

Students will need to have a basic understanding of Python.

What Students Should Bring

Students should bring a laptop with either:
  • VirtualBox (or VMware) installed, 6GB of RAM and 10GB of storage.
  • Anaconda and IPython installed.

We strongly recommend using the virtual machine we will provide, as it will give the best student experience.

What Students Will Be Provided With

A preconfigured virtual machine (VM) containing all the software needed for the class. The VM will also contain:
  • All course slides, notebooks, reference sheets, handouts and documentation
  • Skeleton code examples for in-class exercises

Students will also be provided with access to our website, which will have additional exercises.


Charles Givre is an unapologetic data geek who is passionate about helping others learn about data science and become passionate about it themselves. He has worked at Booz Allen Hamilton for the last five years as a data scientist for various government clients and has done some really neat data science work along the way, which hopefully saves U.S. taxpayers some money. Most of his work has been in developing meaningful metrics to assess how well the workforce is performing. For the last two years, Charles has been part of the management team for one of the company's largest analytic contracts. His responsibility has been to increase the amount of data science on the contract, both in terms of tasks and people. Even more than the data science work, he loves learning about new technologies and techniques, and then teaching them. Charles has been instrumental in bringing Python scripting to his government clients, as well as to the analytic workforce. He has developed a 40-hour Introduction to Analytic Scripting class for that purpose. Additionally, he's developed and taught a 60-hour Fundamentals of Data Science class, which helps to put analysts on the data on-ramp. He's taught the class to Booz Allen staff, government civilians, and U.S. military personnel around the world. Charles has a Master's degree from Brandeis University, two Bachelor's degrees from the University of Arizona, and various IT security certifications. In his nonexistent spare time, he plays trombone, spends time with his family, and works on restoring British sports cars.

Austin Taylor (www.austintaylor.io) has an extensive background in Defensive and Offensive Cyber Operations and has performed incident response for some of the world's top Fortune companies. His expertise includes penetration testing, data science, threat hunting, User and Entity Behavioral Analytics (UEBA) and incident response. Austin has won numerous Capture the Flag (CTF) competitions, including SANS Netwars. In his off time, he teaches programming and conducts training at conferences. He currently serves as a Cyber Warfare Operator for the United States Air Force and works at IronNet Cybersecurity as a Senior Security Researcher. Austin holds multiple industry certifications, including CISSP, GMON, GCCC, GCIA, GCIH, GCPM, GSEC, GPEN, CEH, VCP and CCNA:Security.

Dr. Melissa Kilby (www.melissackilby.com) is passionate about high-performance computing and mathematical modeling. As a Cyber Data Scientist and experienced trainer, she enjoys teaching complex Machine Learning concepts in a comprehensive manner. She encourages her students to quickly solve seemingly difficult problems. Melissa has a multi-disciplinary background in academic research, statistics, computer science, cyber security and neuroscience, which empowers her to solve tough unknowns in yet unknown ways. At Booz Allen Hamilton she applies her expertise to several cutting-edge cyber challenges, such as hunting Advanced Persistent Threats (APTs) based on a Digital Forensics + Machine Learning blend and prototyping novel cyber defense mechanisms to protect Industrial Control Systems (SCADA). Melissa holds a PhD in Biomechanics from the University of Georgia. Some fun facts about Melissa are that she had the privilege to perform research on real NASA space suits and knows that the Oktoberfest starts in September.
