SHOBEVODSDT: SHODAN AND BINARY EDGE BASED
VULNERABLE OPEN DATA SOURCES DETECTION TOOL
OR
WHAT INTERNET OF THINGS SEARCH ENGINES KNOW
ABOUT YOU
The International Conference on Intelligent Data Science Technologies and Applications (IDSTA2021)
November 15-16, 2021. Tartu, Estonia (web-based)
Artjoms Daskevics, Anastasija Nikiforova
“Innovative Information Technologies” Laboratory, Programming Department
Faculty of Computing, University of Latvia
AIM
To propose an Open Source Intelligence (OSINT) based tool for non-intrusive testing of open data sources that inspects their
vulnerabilities and their extent:
is the data source visible outside the organization?
what data can be gathered from open data sources (if any), and what is their “value” to attackers and fraudsters?
can these data pose risks to the organization if they are used to deploy an attack?
The tool supports both a comprehensive analysis of unprotected data sources that fall into a list of predefined data sources and
the inspection of a specific IP address or IP range, to examine what can be seen about the data source in use from outside the organization.
The use of OSINT tools, more precisely Internet of Things Search Engines (IoTSE), should
allow the tool to inspect a list of predefined data sources for vulnerabilities and their extent.
ShoBeVODSDT
Shodan- and Binary Edge- based vulnerable open data sources detection tool
ShoBeVODSDT mainly uses passive assessment (non-intrusive testing), which is characterized by:
its low level of intrusiveness;
the data sources concerned are not thoroughly and actively tested;
the tool points to the most likely bottlenecks or weaknesses that could be revealed and exposed if the fourth stage
of penetration testing, namely the attack, took place.
ShoBeVODSDT SCOPE
What will be inspected?
8 types of data sources: MySQL, PostgreSQL, MongoDB, Redis, Elasticsearch,
CouchDB, Cassandra and Memcached.
Three types of sources:
relational databases,
NoSQL databases (document-oriented, column-oriented and key-value),
data stores.
How will it be inspected?
With OSINT tools or, more precisely, the Internet of Things (IoT) search engines (IoTSE)
Shodan and BinaryEdge, which search for and index publicly available and accessible open data sources.
Database | Primary database model | Connection data | Default port
MySQL | Relational DBMS | IP address, port, username, password | 3306
PostgreSQL | Relational DBMS | IP address, port, authentication data (if it supports connection with a password) | 5432
MongoDB | Document store | IP address, port, username, password | 27017
Redis | Key-value store | IP address, port, authentication data (if access control is enabled) | 6379
Elasticsearch | Search engine | IP address, port | 9200
CouchDB | Document store | IP address, port, authentication data (if anonymous access is not enabled) | 5984
Cassandra | Wide-column store | IP address, port, authentication data | 9160
Memcached | Key-value store | IP address, port | 11211
DATA SOURCES, THEIR MODELS AND CONNECTION DATA
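A lookup structure of this kind could be kept in code so that every later step reads the same values. The sketch below is only an illustration: the `DataSource` class, the `SOURCES` mapping and the `default_port` helper are hypothetical names, not taken from the tool, and the ports are the commonly documented defaults of each product (9160 is Cassandra's legacy Thrift port).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One row of the data source table (hypothetical structure)."""
    name: str
    model: str
    default_port: int


# Keys are lower-cased service names so lookups are case-insensitive.
SOURCES = {
    source.name.lower(): source
    for source in (
        DataSource("MySQL", "Relational DBMS", 3306),
        DataSource("PostgreSQL", "Relational DBMS", 5432),
        DataSource("MongoDB", "Document store", 27017),
        DataSource("Redis", "Key-value store", 6379),
        DataSource("Elasticsearch", "Search engine", 9200),
        DataSource("CouchDB", "Document store", 5984),
        DataSource("Cassandra", "Wide-column store", 9160),
        DataSource("Memcached", "Key-value store", 11211),
    )
}


def default_port(service: str) -> int:
    """Return the default port for a service name, ignoring case."""
    return SOURCES[service.lower()].default_port
```

For example, `default_port("mongodb")` looks up the MongoDB entry regardless of how the name is capitalized.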
ShoBeVODSDT ACTION
Step I
IP address search (gather)
uses BinaryEdge and Shodan libraries to find service IP addresses that belong to a user-defined country;
combines results from BinaryEdge and Shodan by eliminating duplicates;
saves results in “parsed/<service_name>_<country>.txt”.
Step II
IP address check
searches for files in a “checked” folder that correspond to the service and country being checked;
opens the file and checks each IP address using the “check” class method associated with the service;
if the connection is successful, the IP address is stored in “good/<service_name>_<country>.txt”; if it fails, the IP address and error information are stored in “bad/<service_name>_<country>.txt”.
Step III
Retrieving information from an IP address (parse)
searches for files in a “parsed/good” folder that correspond to the service and country to be checked;
opens the file and tries to reconnect. If the connection is successful, it tries to download the information from the database. For each type of database, the procedure is different;
saves the information in the “parsed” folder as “<IP_ADDRESS>.txt”.
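The merge-and-save part of Step I can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: `merge_results` and `save_results` are hypothetical helper names, the search engine calls themselves are left out, and the inputs are assumed to be plain lists of IP address strings already returned by each engine.

```python
from pathlib import Path
from typing import Iterable, List


def merge_results(*result_sets: Iterable[str]) -> List[str]:
    """Combine IP lists from several engines, dropping duplicates.

    The first occurrence of each address is kept, so the output order
    follows the order in which the engines were passed in.
    """
    seen = set()
    merged = []
    for results in result_sets:
        for ip in results:
            ip = ip.strip()
            if ip and ip not in seen:
                seen.add(ip)
                merged.append(ip)
    return merged


def save_results(ips: Iterable[str], service: str, country: str,
                 folder: str = "parsed") -> Path:
    """Write one IP address per line to <folder>/<service>_<country>.txt."""
    path = Path(folder) / f"{service}_{country}.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(f"{ip}\n" for ip in ips), encoding="utf-8")
    return path
```

A call such as `save_results(merge_results(shodan_ips, binaryedge_ips), "redis", "LV")` would then produce the file that Step II reads, where `shodan_ips` and `binaryedge_ips` stand for the two engines' result lists.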
TOOL ARCHITECTURE
The search class includes a class constructor, where a Shodan or
Binary Edge client is initialized using a valid API key, and a
search method that obtains data from Shodan or Binary Edge*.
*In the case of Binary Edge, a page number to search for IP addresses should
also be provided.
The service class includes a class constructor, where a separate
service client tries to establish a new connection, and two
functions:
(1) “check”, which returns an error if the connection was
unsuccessful or “true” if it was successful;
(2) “parse”, which attempts to download all information
from the database.
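The check/parse contract of the service class can be illustrated with a small abstract base class. This is a sketch under assumptions rather than the published code: the names `Service`, `connect`, `dump` and `FakeService` are hypothetical, the connection is opened lazily in `connect` instead of in the constructor to keep the example testable, and `FakeService` stands in for a real database client so that nothing is contacted over the network.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Union


class Service(ABC):
    """Hypothetical base class: one subclass per data source type."""

    def __init__(self, ip: str, port: int) -> None:
        self.ip = ip
        self.port = port

    @abstractmethod
    def connect(self) -> Any:
        """Open a client connection; raise an exception on failure."""

    @abstractmethod
    def dump(self, client: Any) -> Dict[str, Any]:
        """Read whatever the connected client is allowed to see."""

    def check(self) -> Union[bool, str]:
        """Return True on a successful connection, else the error text."""
        try:
            self.connect()
            return True
        except Exception as exc:
            return f"{type(exc).__name__}: {exc}"

    def parse(self) -> Dict[str, Any]:
        """Reconnect and download the visible information."""
        return self.dump(self.connect())


class FakeService(Service):
    """In-memory stand-in used only to demonstrate the contract."""

    def __init__(self, ip: str, port: int, reachable: bool,
                 records: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(ip, port)
        self.reachable = reachable
        self.records = records or {}

    def connect(self) -> Dict[str, Any]:
        if not self.reachable:
            raise ConnectionRefusedError(f"{self.ip}:{self.port} refused the connection")
        return self.records

    def dump(self, client: Dict[str, Any]) -> Dict[str, Any]:
        return dict(client)
```

Keeping `check` and `parse` on the base class means a new data source only has to supply its own `connect` and `dump`.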
ShoBeVODSDT IN ACTION
Use-case - data on Latvia, Estonia and Lithuania (Baltic States)
15180 IP addresses were processed:
Lithuania (7453)
Estonia (5352)
Latvia (2375)
98.43% of the addresses failed to connect
Category | Description
0 | failed to connect
1 | connected, but failed to gather data or information
2 | connected, but the database is empty
3 | connected and gathered system data or non-sensitive information
4 | connected and gathered sensitive data
5 | compromised database
✔ the further actions took place with 1.57% or 93 IP addresses only
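The six categories map naturally onto an enumeration, and the share of each category can be tallied from a list of per-address results. The sketch below is illustrative only: the `Category` member names, `share_by_category` and `COUNTRY_COUNTS` are hypothetical names, and only the per-country counts come from the slide.

```python
from collections import Counter
from enum import IntEnum
from typing import Dict, Iterable


class Category(IntEnum):
    """Result categories 0-5 as listed above (member names are assumed)."""
    FAILED_TO_CONNECT = 0
    CONNECTED_NO_DATA = 1
    CONNECTED_EMPTY = 2
    NON_SENSITIVE_DATA = 3
    SENSITIVE_DATA = 4
    COMPROMISED = 5


# Per-country counts of processed IP addresses, taken from the slide.
COUNTRY_COUNTS = {"Lithuania": 7453, "Estonia": 5352, "Latvia": 2375}


def share_by_category(results: Iterable[Category]) -> Dict[Category, float]:
    """Return the percentage of results falling into each category."""
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * counts[cat] / total, 2) for cat in Category}


# The three country counts add up to the 15180 addresses processed.
assert sum(COUNTRY_COUNTS.values()) == 15180
```

Feeding the function one `Category` value per checked IP address gives a percentage table of the kind reported in the results.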
Categories “2” and “3” are the most common, which is a good sign: while these
data sources are open, their data are not of very high value to
attackers and fraudsters, although they can facilitate attacks;
8% of data sources contain data that could be used by attackers;
12% of them have already been compromised;
most empty and compromised databases belong to Elasticsearch;
most databases that store sensitive data belong to Memcached, which also
leads in databases where sensitive data are not stored (category “3”);
Memcached and Elasticsearch have the highest number of open data sources
with higher “value” of gathered data in almost all categories; the exceptions are the
relatively poor results of MongoDB for the number of
compromised databases and of Redis for data sources storing sensitive data.
FUTURE WORKS
The list of IoTSE used may be extended to other well-known search engines such as Censys, ZoomEye etc. to allow a more extensive
investigation and to determine whether the number of IoTSE has an impact on the results.
Similarly, the set of data sources can be supplemented by other data sources identified as the most popular, especially given that
Oracle and MS SQL are sometimes found to have the highest number of vulnerabilities.
Although our aim was to propose a tool for investigating databases only, further studies may also cover other “types of devices”,
such as Network Equipment, Terminal, Server, Office Equipment, Industrial Control Equipment, Smart Home, Power Supply
Equipment, Web Camera, Remote Management Equipment, Blockchain and industrial connected devices in the cloud.
At the moment, the future study aims to apply the tool to the specific countries of Latvia, Lithuania and Estonia and to carry out
an extensive investigation of the current state of data sources and their security. This will allow conclusions to be drawn on differences
in country patterns, i.e. whether the technological development of Estonia will also be seen in this matter. It will also allow more objective
conclusions to be drawn on the data sources that are less protected by design.
RESULTS AND CONCLUSIONS I
The paper proposes a tool called ShoBeVODSDT (Shodan- and Binary Edge-based vulnerable open data sources
detection tool) for non-intrusive testing of open data sources to detect their vulnerabilities. ShoBeVODSDT:
supports the identification of vulnerabilities at early security assessment stages and does not require the
implementation of active and possibly disruptive techniques;
uses two IoTSE (Shodan and Binary Edge), extending their features with its own built-in
capabilities;
allows inspecting 8 predefined data sources (MySQL, PostgreSQL, MongoDB, Redis, Elasticsearch,
CouchDB, Cassandra and Memcached) for vulnerabilities and their extent.
While the tool covers 8 data sources representing relational databases, NoSQL databases and data stores, it is
designed to be easily scalable by extending the publicly available code: https://github.com/zhmyh/ShoBEVODST
https://www.eosc-hub.eu/open-science-info
RESULTS AND CONCLUSIONS II
The total number of open data sources available to everyone (who wants to access them) is not very high, i.e. less than 2% of
the data sources scanned.
BUT there are data sources that may pose risks to organizations, since external users can access information that can be
used for further attacks. For 12% of inspected data sources this has already taken place.
Security features built into the database make it possible to protect against unauthorized access, but there are databases with weak
security features, where we were able to connect to nearly all IP addresses and retrieve information from them. Moreover, in
some cases the databases that do not use security mechanisms have already been compromised.
THANK YOU FOR
ATTENTION!
QUESTIONS?
For more information, see ResearchGate
See also anastasijanikiforova.com
For questions or any other queries, contact
me via email - Anastasija.Nikiforova@lu.lv

Weitere ähnliche Inhalte

Was ist angesagt?

Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)paperpublications3
 
Final review m score
Final review m scoreFinal review m score
Final review m scoreazhar4010
 
Semantics and linked data at astra zeneca
Semantics and linked data at astra zenecaSemantics and linked data at astra zeneca
Semantics and linked data at astra zenecaKerstin Forsberg
 
Keystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joinedKeystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joinedJoel Azzopardi
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET
 
Implementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageImplementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageIOSR Journals
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationRinke Hoekstra
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked dataLaura Po
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the WebRinke Hoekstra
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflowsSSSW
 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...Robert Grossman
 
Exploration, visualization and querying of linked open data sources
Exploration, visualization and querying of linked open data sourcesExploration, visualization and querying of linked open data sources
Exploration, visualization and querying of linked open data sourcesLaura Po
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Michel Dumontier
 

Was ist angesagt? (20)

Konrad cedem praesi
Konrad cedem praesiKonrad cedem praesi
Konrad cedem praesi
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
DLD_SYNOPSIS
DLD_SYNOPSISDLD_SYNOPSIS
DLD_SYNOPSIS
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 
Sanderson Shout It Out: LOUD
Sanderson Shout It Out: LOUDSanderson Shout It Out: LOUD
Sanderson Shout It Out: LOUD
 
Final review m score
Final review m scoreFinal review m score
Final review m score
 
Sub1555
Sub1555Sub1555
Sub1555
 
Semantics and linked data at astra zeneca
Semantics and linked data at astra zenecaSemantics and linked data at astra zeneca
Semantics and linked data at astra zeneca
 
Keystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joinedKeystone summer school_2015_miguel_antonio_ldcompression_4-joined
Keystone summer school_2015_miguel_antonio_ldcompression_4-joined
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016dkNET ESP Meeting - February 2016
dkNET ESP Meeting - February 2016
 
Implementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageImplementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record Linkage
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the Web
 
Tutorial Data Management and workflows
Tutorial Data Management and workflowsTutorial Data Management and workflows
Tutorial Data Management and workflows
 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
 
Exploration, visualization and querying of linked open data sources
Exploration, visualization and querying of linked open data sourcesExploration, visualization and querying of linked open data sources
Exploration, visualization and querying of linked open data sources
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
 

Ähnlich wie ShoBeVODSDT: Shodan and Binary Edge based vulnerable open data sources detection tool or what Internet of Things Search Engines know about you

Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Web Investigation
Web InvestigationWeb Investigation
Web InvestigationData Source
 
Database Management in Different Applications of IOT
Database Management in Different Applications of IOTDatabase Management in Different Applications of IOT
Database Management in Different Applications of IOTijceronline
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesASIS&T
 
Ultralight Data Movement for IoT with SDC Edge
Ultralight Data Movement for IoT with SDC EdgeUltralight Data Movement for IoT with SDC Edge
Ultralight Data Movement for IoT with SDC EdgeDataWorks Summit
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET Journal
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET Journal
 
The LOD Gateway: Open Source Infrastructure for Linked Data
The LOD Gateway: Open Source Infrastructure for Linked DataThe LOD Gateway: Open Source Infrastructure for Linked Data
The LOD Gateway: Open Source Infrastructure for Linked DataDavid Newbury
 
Building big data solutions on azure
Building big data solutions on azureBuilding big data solutions on azure
Building big data solutions on azureEyal Ben Ivri
 
Cloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdfCloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdfkalai75
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsTiziano Fagni
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptxmantatheralyasriy
 
EUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan BroederEUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan BroederOpenAIRE
 
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...IEEEGLOBALSOFTTECHNOLOGIES
 
Utility privacy tradeoff in databases an information-theoretic approach
Utility privacy tradeoff in databases an information-theoretic approachUtility privacy tradeoff in databases an information-theoretic approach
Utility privacy tradeoff in databases an information-theoretic approachIEEEFINALYEARPROJECTS
 

Ähnlich wie ShoBeVODSDT: Shodan and Binary Edge based vulnerable open data sources detection tool or what Internet of Things Search Engines know about you (20)

BrightTALK - Semantic AI
BrightTALK - Semantic AI BrightTALK - Semantic AI
BrightTALK - Semantic AI
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Web Investigation
Web InvestigationWeb Investigation
Web Investigation
 
Top 10 data science technologies
Top 10 data science technologiesTop 10 data science technologies
Top 10 data science technologies
 
Database Management in Different Applications of IOT
Database Management in Different Applications of IOTDatabase Management in Different Applications of IOT
Database Management in Different Applications of IOT
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Ultralight Data Movement for IoT with SDC Edge
Ultralight Data Movement for IoT with SDC EdgeUltralight Data Movement for IoT with SDC Edge
Ultralight Data Movement for IoT with SDC Edge
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
 
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
IRJET- Deduplication Detection for Similarity in Document Analysis Via Vector...
 
The LOD Gateway: Open Source Infrastructure for Linked Data
The LOD Gateway: Open Source Infrastructure for Linked DataThe LOD Gateway: Open Source Infrastructure for Linked Data
The LOD Gateway: Open Source Infrastructure for Linked Data
 
Building big data solutions on azure
Building big data solutions on azureBuilding big data solutions on azure
Building big data solutions on azure
 
Cloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdfCloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdf
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analytics
 
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
Chachra, "Improving Discovery Systems Through Post Processing of Harvested Data"
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
Dataset Sources Repositories.pptx
Dataset Sources Repositories.pptxDataset Sources Repositories.pptx
Dataset Sources Repositories.pptx
 
EUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan BroederEUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan Broeder
 
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...
JAVA 2013 IEEE NETWORKSECURITY PROJECT Utility privacy tradeoff in databases ...
 
Utility privacy tradeoff in databases an information-theoretic approach
Utility privacy tradeoff in databases an information-theoretic approachUtility privacy tradeoff in databases an information-theoretic approach
Utility privacy tradeoff in databases an information-theoretic approach
 

Mehr von Anastasija Nikiforova

Data Quality for AI or AI for Data quality: advances in Data Quality Manageme...
Data Quality for AI or AI for Data quality: advances in Data Quality Manageme...Data Quality for AI or AI for Data quality: advances in Data Quality Manageme...
Data Quality for AI or AI for Data quality: advances in Data Quality Manageme...Anastasija Nikiforova
 
Towards High-Value Datasets determination for data-driven development: a syst...
Towards High-Value Datasets determination for data-driven development: a syst...Towards High-Value Datasets determination for data-driven development: a syst...
Towards High-Value Datasets determination for data-driven development: a syst...Anastasija Nikiforova
 
Public data ecosystems in and for smart cities: how to make open / Big / smar...
Public data ecosystems in and for smart cities: how to make open / Big / smar...Public data ecosystems in and for smart cities: how to make open / Big / smar...
Public data ecosystems in and for smart cities: how to make open / Big / smar...Anastasija Nikiforova
 
Artificial Intelligence for open data or open data for artificial intelligence?
Artificial Intelligence for open data or open data for artificial intelligence?Artificial Intelligence for open data or open data for artificial intelligence?
Artificial Intelligence for open data or open data for artificial intelligence?Anastasija Nikiforova
 
Overlooked aspects of data governance: workflow framework for enterprise data...
Overlooked aspects of data governance: workflow framework for enterprise data...Overlooked aspects of data governance: workflow framework for enterprise data...
Overlooked aspects of data governance: workflow framework for enterprise data...Anastasija Nikiforova
 
Data Quality as a prerequisite for you business success: when should I start ...
Data Quality as a prerequisite for you business success: when should I start ...Data Quality as a prerequisite for you business success: when should I start ...
Data Quality as a prerequisite for you business success: when should I start ...Anastasija Nikiforova
 
Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...Framework for understanding quantum computing use cases from a multidisciplin...
Framework for understanding quantum computing use cases from a multidisciplin...Anastasija Nikiforova
 
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...
Data Lake or Data Warehouse? Data Cleaning or Data Wrangling? How to Ensure t...Anastasija Nikiforova
 
Putting FAIR Principles in the Context of Research Information: FAIRness for ...
Putting FAIR Principles in the Context of Research Information: FAIRness for ...Putting FAIR Principles in the Context of Research Information: FAIRness for ...
Putting FAIR Principles in the Context of Research Information: FAIRness for ...Anastasija Nikiforova
 
Open data hackathon as a tool for increased engagement of Generation Z: to h...
Open data hackathon as a tool for increased engagement of Generation Z:  to h...Open data hackathon as a tool for increased engagement of Generation Z:  to h...
Open data hackathon as a tool for increased engagement of Generation Z: to h...Anastasija Nikiforova
 
Barriers to Openly Sharing Government Data: Towards an Open Data-adapted Inno...
Barriers to Openly Sharing Government Data: Towards an Open Data-adapted Inno...Barriers to Openly Sharing Government Data: Towards an Open Data-adapted Inno...
Barriers to Openly Sharing Government Data: Towards an Open Data-adapted Inno...Anastasija Nikiforova
 
Combining Data Lake and Data Wrangling for Ensuring Data Quality in CRIS
Combining Data Lake and Data Wrangling for Ensuring Data Quality in CRISCombining Data Lake and Data Wrangling for Ensuring Data Quality in CRIS
Combining Data Lake and Data Wrangling for Ensuring Data Quality in CRISAnastasija Nikiforova
 
The role of open data in the development of sustainable smart cities and smar...
The role of open data in the development of sustainable smart cities and smar...The role of open data in the development of sustainable smart cities and smar...
The role of open data in the development of sustainable smart cities and smar...Anastasija Nikiforova
 
Data security as a top priority in the digital world: preserve data value by ...
Data security as a top priority in the digital world: preserve data value by ...Data security as a top priority in the digital world: preserve data value by ...
Data security as a top priority in the digital world: preserve data value by ...Anastasija Nikiforova
 
OPEN DATA: ECOSYSTEM, CURRENT AND FUTURE TRENDS, SUCCESS STORIES AND BARRIERS
OPEN DATA: ECOSYSTEM, CURRENT AND FUTURE TRENDS, SUCCESS STORIES AND BARRIERSOPEN DATA: ECOSYSTEM, CURRENT AND FUTURE TRENDS, SUCCESS STORIES AND BARRIERS
OPEN DATA: ECOSYSTEM, CURRENT AND FUTURE TRENDS, SUCCESS STORIES AND BARRIERSAnastasija Nikiforova
 
Invited talk "Open Data as a driver of Society 5.0: how you and your scientif...
Invited talk "Open Data as a driver of Society 5.0: how you and your scientif...Invited talk "Open Data as a driver of Society 5.0: how you and your scientif...
Invited talk "Open Data as a driver of Society 5.0: how you and your scientif...Anastasija Nikiforova
 
TIMELINESS OF OPEN DATA IN OPEN GOVERNMENT DATA PORTALS THROUGH PANDEMIC-RELA...
TIMELINESS OF OPEN DATA IN OPEN GOVERNMENT DATA PORTALS THROUGH PANDEMIC-RELA...TIMELINESS OF OPEN DATA IN OPEN GOVERNMENT DATA PORTALS THROUGH PANDEMIC-RELA...
TIMELINESS OF OPEN DATA IN OPEN GOVERNMENT DATA PORTALS THROUGH PANDEMIC-RELA...Anastasija Nikiforova
 
ATVĒRTO DATU SAVLAICĪGUMS NACIONĀLAJOS ATVĒRTO DATU PORTĀLOS AR PANDĒMIJU SAI...
ATVĒRTO DATU SAVLAICĪGUMS NACIONĀLAJOS ATVĒRTO DATU PORTĀLOS AR PANDĒMIJU SAI...ATVĒRTO DATU SAVLAICĪGUMS NACIONĀLAJOS ATVĒRTO DATU PORTĀLOS AR PANDĒMIJU SAI...
ATVĒRTO DATU SAVLAICĪGUMS NACIONĀLAJOS ATVĒRTO DATU PORTĀLOS AR PANDĒMIJU SAI...Anastasija Nikiforova
 
Towards a Concurrence Analysis in Business Processes
Towards a Concurrence Analysis in Business ProcessesTowards a Concurrence Analysis in Business Processes
Towards a Concurrence Analysis in Business ProcessesAnastasija Nikiforova
 

ShoBeVODSDT: Shodan and Binary Edge based vulnerable open data sources detection tool or what Internet of Things Search Engines know about you

  • 1. ShoBeVODSDT: Shodan and Binary Edge based vulnerable open data sources detection tool, or what Internet of Things Search Engines know about you
    The International Conference on Intelligent Data Science Technologies and Applications (IDSTA2021), November 15-16, 2021, Tartu, Estonia (web-based)
    Artjoms Daskevics, Anastasija Nikiforova
    “Innovative Information Technologies” Laboratory, Programming Department, Faculty of Computing, University of Latvia
  • 2. AIM
    To propose an OSINT-based (Open Source Intelligence) tool for non-intrusive testing of open data sources that inspects their vulnerabilities and the extent of those vulnerabilities. The tool addresses three questions:
      • Is the data source visible outside the organization?
      • What data can be gathered from open data sources (if any), and what is their “value” to attackers and fraudsters?
      • Can these data pose a risk to the organization by being used to deploy an attack?
    This allows a comprehensive analysis either of unprotected data sources that fall into a list of predefined data sources, or of a specific IP address or IP range, to examine what can be seen about the data source in use from outside the organization.
    The use of OSINT tools, more precisely Internet of Things Search Engines (IoTSE), should allow the tool to inspect a list of predefined data sources for vulnerabilities and their extent.
  • 3. ShoBeVODSDT
    ShoBeVODSDT mainly uses passive assessment (non-intrusive testing), which is characterized by the following:
      • the level of intrusiveness is low;
      • the data sources concerned are not thoroughly and actively tested;
      • the tool points to the most likely and potentially existing bottlenecks or weaknesses, which could be revealed and exposed if the fourth stage of penetration testing, namely the attack, took place.
  • 4. ShoBeVODSDT SCOPE
    What will be inspected?
      • 8 types of data sources: MySQL, PostgreSQL, MongoDB, Redis, Elasticsearch, CouchDB, Cassandra and Memcached.
      • Three types of sources: relational databases, NoSQL databases (document-oriented, column-oriented and key-value) and data stores.
    How will it be inspected?
      • With OSINT tools or, more precisely, the Internet of Things search engines (IoTSE) Shodan and BinaryEdge, which search for and index publicly available and accessible open data sources.
  • 5. DATA SOURCES, THEIR MODELS AND CONNECTION DATA
    Database | Primary database model | Connection data | Default port
    MySQL | Relational DBMS | IP address, port, username, password | 3306
    PostgreSQL | Relational DBMS | IP address, port, authentication data (if it supports connection with a password) | 5432
    MongoDB | Document store | IP address, port, username, password | 27017
    Redis | Key-value store | IP address, port, authentication data (if access control is enabled) | 6379
    Elasticsearch | Search engine | IP address, port | 9200
    CouchDB | Document store | IP address, port, authentication data (if anonymous access is not enabled) | 5984
    Cassandra | Wide-column store | IP address, port, authentication data | 9160
    Memcached | Key-value store | IP address, port | 11211
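The same information can be kept as a small lookup structure. The following is a minimal Python sketch and is not taken from the tool's code: `DataSource`, `DATA_SOURCES` and `port_for` are hypothetical names, and the ports are the services' standard defaults (9160 is Cassandra's legacy Thrift port; CQL clients normally use 9042).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One inspected service (hypothetical structure, for illustration only)."""
    name: str
    model: str
    default_port: int


# Standard default ports of the eight services covered by the slides.
DATA_SOURCES = {
    source.name.lower(): source
    for source in (
        DataSource("MySQL", "Relational DBMS", 3306),
        DataSource("PostgreSQL", "Relational DBMS", 5432),
        DataSource("MongoDB", "Document store", 27017),
        DataSource("Redis", "Key-value store", 6379),
        DataSource("Elasticsearch", "Search engine", 9200),
        DataSource("CouchDB", "Document store", 5984),
        DataSource("Cassandra", "Wide-column store", 9160),
        DataSource("Memcached", "Key-value store", 11211),
    )
}


def port_for(service: str) -> int:
    """Return the default port for a service name (case-insensitive)."""
    return DATA_SOURCES[service.lower()].default_port
```

A lookup by lower-cased name keeps the mapping tolerant of spellings such as “MySql” and “MySQL”.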
  • 6. ShoBeVODSDT ACTION
    Step I: IP address search (gather)
      • uses the BinaryEdge and Shodan libraries to find service IP addresses that belong to a user-defined country;
      • combines the results from BinaryEdge and Shodan, eliminating duplicates;
      • saves the results in “parsed/<service_name>_<country>.txt”.
    Step II: IP address check
      • searches for files in a “checked” folder that correspond to the service and country being checked;
      • opens the file and checks each IP address using the “check” class method associated with the service;
      • if the connection succeeds, the IP address is stored in “good/<service_name>_<country>.txt”; if it fails, the IP address and error information are stored in “bad/<service_name>_<country>.txt”.
    Step III: Retrieving information from an IP address (parse)
      • searches for files in a “parsed/good” folder that correspond to the service and country to be checked;
      • opens the file and tries to reconnect; if the connection succeeds, tries to download the information from the database (this procedure is different for each type of database);
      • saves the information in the “parsed” folder, in “<IP_ADDRESS>.txt”.
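Steps I and II can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual code: the function names (`merge_results`, `list_path`, `gather`, `check_all`) are hypothetical, the search-engine results are passed in as plain lists of IP strings, the connection test is an injected callable that raises on failure, and the folder layout (`parsed/`, `good/`, `bad/` under one base directory) is a simplification of the paths named on the slide.

```python
from pathlib import Path
from typing import Callable, Iterable


def merge_results(*sources: Iterable[str]) -> list[str]:
    """Combine IP lists from several search engines, dropping duplicates."""
    seen: dict[str, None] = {}  # dict keeps first-seen order
    for source in sources:
        for ip in source:
            seen.setdefault(ip.strip(), None)
    seen.pop("", None)  # ignore blank entries
    return list(seen)


def list_path(base: Path, folder: str, service: str, country: str) -> Path:
    """Build '<base>/<folder>/<service>_<country>.txt'."""
    return base / folder / f"{service}_{country}.txt"


def gather(base: Path, service: str, country: str,
           *sources: Iterable[str]) -> Path:
    """Step I: merge the engines' results and save them under 'parsed/'."""
    path = list_path(base, "parsed", service, country)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(merge_results(*sources)) + "\n")
    return path


def check_all(base: Path, service: str, country: str,
              check: Callable[[str], None]) -> tuple[int, int]:
    """Step II: try every gathered IP; record successes and failures."""
    good = list_path(base, "good", service, country)
    bad = list_path(base, "bad", service, country)
    good.parent.mkdir(parents=True, exist_ok=True)
    bad.parent.mkdir(parents=True, exist_ok=True)
    n_good = n_bad = 0
    ips = list_path(base, "parsed", service, country).read_text().split()
    with good.open("w") as good_file, bad.open("w") as bad_file:
        for ip in ips:
            try:
                check(ip)  # expected to raise if the connection fails
            except Exception as exc:
                bad_file.write(f"{ip}\t{exc}\n")
                n_bad += 1
            else:
                good_file.write(ip + "\n")
                n_good += 1
    return n_good, n_bad
```

Passing the connection test in as a callable keeps the file handling independent of any particular database client, so each service can supply its own check.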
  • 7. TOOL ARCHITECTURE
    The search class includes a class constructor, in which a Shodan or Binary Edge client is initialized using a valid API key, and a search method that obtains data from Shodan or Binary Edge. In the case of Binary Edge, a page number to search for IP addresses should also be provided.
    The service class includes a class constructor, in which a separate service client tries to establish a new connection, and two functions:
      • “check”, which returns an error if the connection was unsuccessful or “true” if it was successful;
      • “parse”, which attempts to download all information from the database.
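The shape of a service class can be illustrated with a short Python sketch. It is an assumption-laden stand-in rather than the tool's implementation: `TcpService` is a hypothetical name, and a plain TCP socket replaces the database-specific client that a real service class would create, so `parse` only reports connection details instead of downloading database contents.

```python
import socket
from typing import Optional, Union


class TcpService:
    """Hypothetical service class: connects in the constructor, then
    exposes check() and parse() as described on the slide."""

    def __init__(self, host: str, port: int, timeout: float = 3.0) -> None:
        self.host = host
        self.port = port
        self.sock: Optional[socket.socket] = None
        self.error: Optional[Exception] = None
        try:
            # A real service class would create its database client here.
            self.sock = socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            self.error = exc

    def check(self) -> Union[bool, Exception]:
        """Return True if the connection succeeded, else the error."""
        return True if self.error is None else self.error

    def parse(self) -> dict:
        """Return what can be learned from the connection.

        Stand-in only: a database-specific class would query the
        server for its databases, keys or documents at this point.
        """
        if self.sock is None:
            raise ConnectionError(f"not connected to {self.host}:{self.port}")
        return {"host": self.host, "port": self.port,
                "peer": self.sock.getpeername()}

    def close(self) -> None:
        """Release the connection if one is open."""
        if self.sock is not None:
            self.sock.close()
            self.sock = None
```

Storing the connection error instead of raising it in the constructor mirrors the slide's description of “check” returning either an error or “true”.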
  • 8. ShoBeVODSDT IN ACTION Use-case - data on Latvia, Estonia and Lithuania (Baltic States) 15180 IP addresses were processed, Lithuania (7453) Estonia (5352) Latvia (2375) 98.43% of the addresses have failed to connect Category Description 0 failed to connect 1 has managed to connect but failed to gather data or information 2 has managed to connect, but the database is empty 3 has managed to connect by gathering system data or non-sensitive information 4 has managed to connect and gather sensitive data 5 compromised database ✔ the further actions took place with 1.57% or 93 IP addresses only
  • 9. ShoBeVODSDT IN ACTION
      • “2” and “3” are the most common categories, which is a good sign: although these data sources are open, their data are not of very high importance to attackers and fraudsters, even if they can facilitate attacks.
      • 8% of data sources contain data that could be used by attackers, and 12% have already been compromised.
      • Most empty and compromised databases belong to Elasticsearch.
      • Most databases that store sensitive data belong to Memcached, but it also leads in databases where sensitive data are not stored (category “3”).
      • Memcached and Elasticsearch have the highest number of open data sources with higher “value” of data gathered from them in almost all categories, except for the relatively poor results demonstrated by MongoDB for the number of compromised databases and by Redis for data sources storing sensitive data.
  • 10. FUTURE WORK
      • The list of IoTSE used may be extended to other well-known search engines such as Censys, ZoomEye etc., to allow a more extensive investigation and to determine whether the number of IoTSE has an impact on the results.
      • Similarly, the set of data sources can be supplemented with other data sources identified as the most popular, especially given that Oracle and MS SQL are sometimes found to have the highest number of vulnerabilities.
      • Although our aim was to propose a tool for investigating databases only, further studies may also cover other “types of devices”, such as Network Equipment, Terminal, Server, Office Equipment, Industrial Control Equipment, Smart Home, Power Supply Equipment, Web Camera, Remote Management Equipment, Blockchain and industrial-based connected devices in the cloud.
      • At the moment, the future study aims to apply the tool to the specific countries of Latvia, Lithuania and Estonia and to carry out an extensive investigation of the current state of data sources and their security. This will allow conclusions to be drawn on differences in country patterns, i.e. whether the technological development of Estonia will also be seen in this matter. It will also support more objective conclusions on the less protected-by-design data sources.
  • 11. RESULTS AND CONCLUSIONS I
    The paper proposes a tool called ShoBeVODSDT (Shodan- and Binary Edge-based vulnerable open data sources detection tool) for non-intrusive testing of open data sources to detect their vulnerabilities. ShoBeVODSDT:
      • supports the identification of vulnerabilities at early security assessment stages and does not require active and possibly disruptive techniques;
      • uses two IoTSE (Shodan and Binary Edge), extending their features with its own built-in advanced capabilities;
      • allows 8 predefined data sources (MySQL, PostgreSQL, MongoDB, Redis, Elasticsearch, CouchDB, Cassandra and Memcached) to be inspected for vulnerabilities and their extent.
    While the tool covers 8 data sources representing relational databases, NoSQL databases and data stores, it is designed to be easily scalable by extending the publicly available code: https://github.com/zhmyh/ShoBEVODST
    https://www.eosc-hub.eu/open-science-info
  • 12. RESULTS AND CONCLUSIONS II
      • The total number of open data sources available to anyone who wants to access them is not very high, i.e. less than 2% of the data sources scanned.
      • BUT there are data sources that may pose risks to organizations, since external users can access information that can be used for further attacks. For 12% of inspected data sources this has already taken place.
      • Security features built into the database protect against unauthorized access, but there are databases with weak security features, where we were able to connect to nearly all IP addresses and retrieve information from them.
      • Moreover, in some cases databases that do not use security mechanisms have already been compromised.
  • 13. THANK YOU FOR YOUR ATTENTION! QUESTIONS?
    For more information, see ResearchGate. See also anastasijanikiforova.com
    For questions or any other queries, contact me via email: Anastasija.Nikiforova@lu.lv