
Research Paper: Identification of Cybersecurity Specific Content Using Different Language Models

Enod Bataa

Published 21/04/2022

Given the sheer volume of digital text publicly available on the Internet, it is becoming increasingly challenging for security analysts to identify cyber threat-related content. In this research, we propose an autonomous system that identifies cyber threat information from publicly available sources. We examined different language models for use as a cybersecurity-specific filter in the proposed system. Using domain-specific training data, we trained Doc2Vec and BERT models and compared their performance. According to our evaluation, the BERT-based Natural Language Filter identifies and classifies cybersecurity-specific natural language text with 90% accuracy.

Read more at https://www.jstage.jst.go.jp/article/ipsjjip/28/0/28_623/_article/-char/en

