Table of Contents
- Research Questions
- RQ1: What are the various categories of OSINT and how can we utilise them?
- RQ2: What are the main strengths and weaknesses of different OSINT tools and techniques?
- RQ3: What are the benchmark research works available that integrate OSINT with different fields to maximise its utilisation?
- RQ4: Is there any mechanism or workflow to utilise OSINT to exploit publicly available data?
- RQ5: How can we utilise OSINT tools and techniques in social network analysis, opinion extraction, cybersecurity, counter-terrorism, cyber defence, cybercrime investigation, criminal profiling, surveillance, etc.?
- Research Process
- Extracting Social Opinion and Emotions
- Cyber Crime and Organised Crime
- Cybersecurity and Cyber Defence
This article consolidates findings from previous research on open-source intelligence (OSINT) tools and techniques. The objective is to enhance our understanding of OSINT and its utilisation by addressing five research questions. These questions cover the categories of OSINT, the strengths and weaknesses of different OSINT tools, benchmark research works that integrate OSINT with other fields, mechanisms or workflows for exploiting publicly available data, and the use of OSINT tools and techniques in domains such as social network analysis, opinion extraction, cybersecurity, counter-terrorism, cyber defence, cybercrime investigation, criminal profiling, and surveillance.
Research Questions
RQ1: What are the various categories of OSINT and how can we utilise them?
Previous studies have primarily used OSINT for translation, exploitation, analysis, and dissemination. With the advancement of technology, OSINT can now be applied much more widely. OSINT is categorised by application area, such as geospatial intelligence, signals intelligence, imagery intelligence, human intelligence, and social media intelligence.
RQ2: What are the main strengths and weaknesses of different OSINT tools and techniques?
This question aims to identify the strengths and weaknesses of various OSINT tools and techniques from different perspectives, such as availability, application, processing time, input type, and reliability score.
RQ3: What are the benchmark research works available that integrate OSINT with different fields to maximise its utilisation?
This question focuses on published research that integrates OSINT with other fields and examines its relevance to OSINT. The aim is to identify the benchmark works that are regularly cited and preferred in previous studies.
RQ4: Is there any mechanism or workflow to utilise OSINT to exploit publicly available data?
This question reviews existing research, government guidelines, and reports that describe workflows for utilising OSINT to exploit publicly available data. The aim is to identify the relevant workflows and distinguish them by their characteristics and applications.
RQ5: How can we utilise OSINT tools and techniques in social network analysis, opinion extraction, cybersecurity, counter-terrorism, cyber defence, cybercrime investigation, criminal profiling, surveillance, etc.?
This question focuses on the applications of OSINT tools and techniques in various domains, such as social network analysis, opinion extraction, cybersecurity, counter-terrorism, cyber defence, cybercrime investigation, criminal profiling, and surveillance. The aim is to identify frameworks that integrate OSINT with machine learning (ML), deep learning (DL), or artificial intelligence (AI) to achieve better results. This question also considers how model results compare with and without OSINT.
Research Process
The literature search used keywords related to OSINT, OSINT tools and techniques, cybercrime and organised crime detection using OSINT, countering cybercriminals using OSINT, threat intelligence using OSINT, the application of OSINT in cyber defence, security intelligence, disaster management, social opinion extraction, sentiment analysis, malware analysis, vulnerability assessment, national security surveillance, and countering misinformation. These keywords were searched in electronic databases such as Google Scholar, ACM Digital Library, Web of Science, ScienceDirect, and IEEE Xplore. OSINT tools and techniques, magazines, blogs, and newsletters related to OSINT were also searched. Works were selected based on their relevance to OSINT, applications of OSINT, possible domains for integration with OSINT, citation score, the reputation of the journal or publisher, and grounding in real-life scenarios. The resources accessed were then analysed to study OSINT trends in cybersecurity that use various AI/ML/DL techniques.
Extracting Social Opinion and Emotions
Aye and Aung (2020) proposed a model for determining user opinions in the Myanmar language using text analysis techniques. Kandias et al. (2017) conducted an experiment on Facebook users to determine their level of stress. Yadu and Shukla (2020) analysed emotions contained in tweets from Indian Air Asia Service using five different classification techniques. Prabhakar et al. (2019) used AdaBoost (an ensemble method) and other classifiers to build a robust sentiment analysis model with a precision of 84.5%. Wadawadagi and Pagi (2020) explored the use of deep neural networks in sentiment analysis.
Naseem et al. (2019) proposed a model based on bidirectional long short-term memory (BiLSTM) and hybrid word representation for sentiment analysis of airline tweets. Soomro et al. (2020) analysed more than 18 million tweets related to the novel coronavirus to study the relationship between the number of infections and public mood. Garcia and Berton (2021) conducted sentiment analysis in Portuguese using Twitter data to analyse the consequences of the pandemic in the United States and Brazil.
Mishra et al. (2019) analysed the sentiment of various reviews and created a hotel recommendation system. Jain and Dandannavar (2016) studied sentiment analysis using Twitter data and trained classifiers such as decision tree (DT), support vector machine (SVM), and naive Bayes (NB). Shuai et al. (2018) used Chinese hotel reviews and Doc2Vec to analyse sentiment and found that SVM gave the best result. Hashida et al. (2018) proposed a model based on distributed multichannel representation for interpreting text data.
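To make the classifier-based approach above more concrete, the following is a minimal sketch of a multinomial naive Bayes sentiment classifier with add-one smoothing, written with the Python standard library only. It is not the implementation used in any of the cited studies. The class name `NaiveBayesSentiment`, the `tokenize` helper, and the toy training sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        scores = {}
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                count = self.word_counts[label][token]
                # Smoothed log likelihood of the token given the label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up training data (not taken from any cited study).
train_texts = [
    "great flight and friendly crew",
    "comfortable seats and great service",
    "terrible delay and rude staff",
    "lost luggage and terrible service",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("friendly crew and great seats"))
print(model.predict("rude staff lost my luggage"))
```

A real study would train on thousands of labelled tweets or reviews and would normally compare several classifiers (for example DT, SVM, and NB) on held-out data rather than rely on a single model.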
Cyber Crime and Organised Crime
Digital forensics and open-source intelligence are the two broad types of cybercrime investigation. Cybercrime is divided into categories such as crimes against confidentiality, integrity, and availability (CIA), content crime, computer crime, and other types of cybercrime. Nazah et al. (2020) highlighted eight primary cybercrimes: human trafficking, pornography, child pornography, assassination, drug sales, terrorist activity, cybercrime markets, and cryptocurrency exchange. The choice of tools and techniques in a cyber investigation has a significant impact on its outcome. Quick and Choo (2018) applied OSINT in digital forensics to improve the analysis of criminal intelligence. Delavallade et al. (2017) proposed a model based on social network data to extract crime indicators and predict future crimes.
Shestak and Koscheeva (2021) discussed the increase in cybercrime rates during the coronavirus pandemic and mitigation procedures. Valluripally et al. (2019) developed a framework for analysing crime incidents using related tweets, hashtags, and URL connections. Kadoguchi et al. (2020) developed a device for real-time tracking of crime hub locations across Indian states. Edwards et al. (2015) described data mining methods such as natural language processing, information extraction, social network analysis, computer vision, and machine learning for gathering intelligence on criminal or terrorist groups. Liao et al. (2016) proposed iACE, a technology for automatically gathering intelligence from multiple sources and analysing data relationships.
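One simple way to picture the tweet, hashtag, and URL connections mentioned above is a co-occurrence count: two entities are linked whenever they appear in the same post. The sketch below is only an illustration under that assumption and does not reproduce any cited framework. The function names `extract_entities` and `cooccurrence_counts` and the sample posts are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def extract_entities(post):
    """Return the sorted hashtags (lowercased) and URLs found in a post."""
    urls = set(URL_RE.findall(post))
    # Remove URLs first so that URL fragments are not mistaken for hashtags.
    text_without_urls = URL_RE.sub(" ", post)
    hashtags = {"#" + tag.lower() for tag in HASHTAG_RE.findall(text_without_urls)}
    return sorted(hashtags | urls)


def cooccurrence_counts(posts):
    """Count how often each pair of entities appears in the same post."""
    pairs = Counter()
    for post in posts:
        pairs.update(combinations(extract_entities(post), 2))
    return pairs


# Made-up posts for illustration only.
posts = [
    "Break-in reported downtown #burglary #downtown https://example.com/report1",
    "Another #Burglary near the station #downtown",
    "Scam alert #fraud https://example.com/report1",
]

counts = cooccurrence_counts(posts)
for pair, n in counts.most_common():
    print(pair, n)
```

The resulting pair counts can be read as weighted edges of a graph, which is the usual starting point for the social network analysis methods described in this section.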
Phishing is a social engineering attack, so detecting and mitigating it is crucial. Alabdan (2020), Rastenis et al. (2020), Wang et al. (2020), and Churi et al. (2017) proposed various methods and techniques for detecting and preventing phishing attacks. Stafford (2020) explored the effects and causes of phishing attacks on users. Various ML techniques have been used for phishing detection, including naive Bayes, k-nearest neighbours (KNN), SVM, and clustering algorithms (Cui et al. 2017, Wang et al. 2020, and Alabdan 2020).
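As a rough illustration of ML-based phishing detection, the sketch below extracts a few hand-picked features from a URL and classifies it with a k-nearest-neighbours vote. The feature set, the function names `url_features` and `knn_predict`, and the labelled URLs are all assumptions chosen for the example and are not taken from the cited works. The URLs use reserved documentation domains and IP ranges.

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlparse


def url_features(url):
    """Map a URL to a small numeric feature vector (illustrative features only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)
        host_is_ip = 1.0
    except ValueError:
        host_is_ip = 0.0
    return [
        float(host.count(".")),           # number of dots in the host name
        float("@" in url),                # '@' can hide the real host
        float("-" in host),               # hyphenated look-alike domains
        host_is_ip,                       # raw IP address instead of a domain
        float(parsed.scheme != "https"),  # connection is not HTTPS
    ]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up labelled URLs for illustration only.
labelled_urls = [
    ("https://example.com/login", "legitimate"),
    ("https://www.example.org/account", "legitimate"),
    ("https://mail.example.net/inbox", "legitimate"),
    ("http://192.0.2.10/secure/login", "phishing"),
    ("http://secure-login.example.com.account-verify.test/update", "phishing"),
    ("http://example.com@login-update.test/verify", "phishing"),
]
train = [(url_features(url), label) for url, label in labelled_urls]

suspicious = "http://account-verify.example.com.login-check.test/"
benign = "https://docs.example.com/guide"
print(knn_predict(train, url_features(suspicious)))
print(knn_predict(train, url_features(benign)))
```

In practice the features would be scaled and far more numerous, and the classifier would be evaluated on a large labelled corpus. The point here is only to show the shape of the pipeline: feature extraction followed by a standard classifier.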
Cybersecurity and Cyber Defence
OSINT techniques have been applied in the field of cybersecurity and cyber defence. Senekal and Kotzé (2019) used NLP to analyse WhatsApp chats for investigation purposes. AlKilani and Qusef (2021) proposed the integration of OSINT techniques with the ISO 27001 standard for enhanced security. Raj and Meel (2022) developed a model for fake news detection using real-world datasets. Islam et al. (2022) proposed a validation tool for automating the validation of security alerts and incidents. Ch et al. (2020) analysed the rate of cyber crimes in different Indian states using ML techniques. Nouh et al. (2016) proposed a multi-functional cybercrime intelligent system framework to reduce cognitive biases in investigations. Aslan et al. (2018) used OSINT techniques to identify social network accounts related to cybersecurity.
In conclusion, this article provides an overview of research conducted on OSINT tools and techniques, their applications in various domains, and their integration with different fields. The research questions addressed in this article aim to enhance our understanding of OSINT and maximise its utilisation. The findings from previous research contribute to the development of frameworks, models, and methodologies for improving OSINT research and its practical applications. By leveraging OSINT tools and techniques, researchers and practitioners can enhance social opinion and emotion analysis, investigate cyber and organised crimes, strengthen cybersecurity and defence, and mitigate threats and risks in various domains.