Basavaraj Soppannavar
Sr. Strategist, IoT
Toshiba America Research Inc.
Purpose-built In-Memory NoSQL Database
For
Internet of Things
5th Aug 2017
Los Angeles
Agenda
• Internet of Things
• IoT Data & its properties
• GridDB
• Real Use Cases
GridDB by Toshiba 2
Internet of Things
Internet of Things Predictions
Number of Connected Devices
By 2020 the number of connected devices will be
• 50 Billion – Cisco
• 28.1 Billion* – IDC
• 20.8 Billion* – Gartner
*not including smartphones & computers
Most IoT smart devices aren’t in your home or phone—they are in factories,
businesses, and healthcare – Intel Infographics
• 40.2% in Business and Manufacturing
• 30.3% in Healthcare
IoT Revenue projections
• $300 Billion – Gartner
• $470 Billion – Bain
IoT Economics
Technology Stack of IoT
• Applications: Mobile, Web, Business Apps
• Analytics & AI: BI, Visualization, Data Mining, DPP* Analytics, Machine Learning
• Data Storage and Retrieval: GridDB, HBase, Cassandra, MongoDB, MS-SQL, Hadoop
• Data Aggregation / Processing: Storm, Kafka, Fluentd, RabbitMQ
• Session / Communication: CoAP, MQTT, DDS, XMPP, AMQP, HTTP
• Transport: IPv4, IPv6
• Link: Ethernet, WiFi, Bluetooth, BLE, Zigbee, Z-Wave, RFID, 2G, 3G, LTE
• Connectivity: Wireless, USB, RJ45 (Ethernet), DSL
• Device: Sensors, Embedded chips, Cameras, Wearables
• Spanning the layers: Device and Data Management; Security and Privacy
*Descriptive, Predictive, Prescriptive
Toshiba’s Full Stack Solution for IoT & Big Data
GridDB NoSQL Database
IoT Data & Databases
Properties of IoT Data
• Periodic
• Large volume but small record size
• Structured
• Time stamped
Timestamp            Voltage  Current  Temperature
2017/05/03 10:45:00  100      0.64     20.5
2017/05/03 10:45:30  101      0.63     20.4
2017/05/03 10:46:00  99       0.65     20.5
...
Single record (size less than 100 bytes)
Millions of records
Database Requirements of IoT
• Highly available & fault tolerant
• Great read and write performance for millions of records
• Time series data & operations support
• Fast search and range queries
• Spatial and geo-location support
• Real-time streaming support
• Support for ever-increasing data (Scale Out)
Evolution of Database Management Systems
Figure: timeline of database systems across the 90s, the 2000s and today
• RDBMS (operational / transactional database): MySQL, Postgres
• OLAP / DW (data warehouse for BI and analytics): Teradata, Vertica, GreenPlum
• Hadoop: Cloudera, Hortonworks
• NoSQL DBs
  • Key Value Store: Riak, Aerospike
  • Wide Column Store: Cassandra, HBase
  • Document Store: MongoDB, Couchbase
  • Graph Store: Neo4j
OLAP – Online Analytical Processing
DW – Data Warehouse
Inspired by: https://practicalanalytics.co/2015/06/02/the-maturing-nosql-ecoystem-a-c-level-guide/
GridDB
A Purpose-built In-Memory NoSQL Database for IoT
What is GridDB?
Highly Scalable
In Memory
Distributed
Key-Value
IoT Database
GridDB – Highly Scalable Database for IoT
Highly Scalable Distributed Key-Container Database
NoSQL Data Models
• GridDB has a unique Key-Container data model
• Container can be visualized as a table of a Relational Database
• Fixed schema
Key Container Data Model
• A container is a set of records that share a schema
• GridDB supports 2 types of containers
  • Collection container – for generic record management
  • Time-series container – for time-series record management
• The Key-Container model provides
  • Data consistency within the container (ACID is guaranteed within the container)
  • Faster data retrieval and search because of the schema
  • TQL, an SQL-like query language for reading data from the containers
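The key-container idea can be pictured with a small stand-in written in plain Java. This is only a sketch under stated assumptions: the class KeyContainerSketch, its Reading row type and its put/range methods are hypothetical names made up for illustration, and they are not the GridDB API. Each container name (the key) maps to rows with one fixed schema, kept sorted by timestamp so that a time range can be read without scanning the whole container.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of the key-container idea; NOT the GridDB API.
final class KeyContainerSketch {
    // One row with a fixed schema, mirroring the smart meter columns.
    static final class Reading {
        final int voltage;
        final double current;
        final double temperature;

        Reading(int voltage, double current, double temperature) {
            this.voltage = voltage;
            this.current = current;
            this.temperature = temperature;
        }
    }

    // Key = container name; value = rows sorted by timestamp (milliseconds).
    private final Map<String, NavigableMap<Long, Reading>> containers = new HashMap<>();

    // Stores a row in the named container, creating the container on first use.
    void put(String containerName, long timestampMillis, Reading row) {
        containers.computeIfAbsent(containerName, name -> new TreeMap<>())
                  .put(timestampMillis, row);
    }

    // Returns the rows of one container with fromMillis <= timestamp < toMillis.
    NavigableMap<Long, Reading> range(String containerName, long fromMillis, long toMillis) {
        NavigableMap<Long, Reading> rows = containers.get(containerName);
        if (rows == null) {
            return new TreeMap<>();
        }
        return rows.subMap(fromMillis, true, toMillis, false);
    }

    public static void main(String[] args) {
        KeyContainerSketch store = new KeyContainerSketch();
        store.put("SM101", 0L, new Reading(100, 0.64, 20.5));
        store.put("SM101", 30_000L, new Reading(101, 0.63, 20.4));
        store.put("SM101", 60_000L, new Reading(99, 0.65, 20.5));
        // Rows in the first minute, excluding the reading at exactly 60 s.
        System.out.println(store.range("SM101", 0L, 60_000L).size()); // prints 2
    }
}
```

A sorted map is used here only because it makes the range read obvious; how GridDB actually lays out rows is not described on the slide.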
Key Container Data Model - Example
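Schema definition

import java.util.Date;
import com.toshiba.mwcloud.gs.RowKey;
import com.toshiba.mwcloud.gs.TimeSeries;

// Row schema for one smart meter reading; the timestamp is the row key
static class SMData {
    @RowKey Date timestamp;
    int voltage;
    double current;
    int temp;
}

Creating a TS Container

// "SM101" is the container name (the "Key"); SMData.class supplies the schema.
// store is assumed to be an already-connected GridStore instance.
TimeSeries<SMData> ts = store.putTimeSeries("SM101", SMData.class);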
High Performance
GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
Figure: GridDB 4-node cluster (In-Memory + Disk hybrid): memory is pooled from multiple nodes/servers, each with its own SSD/HDD, and new nodes can be added
Excess data from memory is saved to SSD/disk
YCSB Performance Results
• Tests were performed on identical hardware (MS Azure Standard_D2 dual-core CPUs, 7GB RAM per node)
• 1 client per core; 128 threads per client
*Tests performed by Fixstars
Figure: Throughput, 16 nodes: average throughput ('000 ops/sec) for YCSB workloads A, B, C, D and F, GridDB vs. Cassandra
Figure: Throughput, 32 nodes: average throughput ('000 ops/sec) for YCSB workloads A, B, C, D and F, GridDB vs. Cassandra
Figure: Read latency, 16 nodes: latency in microseconds for YCSB workloads A, B, C, D and F, GridDB vs. Cassandra
The Yahoo! Cloud Serving Benchmark (YCSB) comparison of GridDB and Cassandra shows that*
• Average throughput of GridDB is 4x-5x higher than that of Cassandra
• Average latency of GridDB is 3x-4x lower than that of Cassandra
Superior Stability
Figure: YCSB Workload A 24-hour stability test: throughput (ops) over elapsed time (seconds), GridDB vs. Cassandra
Tests performed by Fixstars
High Availability
Advanced Master-Slave Model - Hybrid Cluster Management
• No Single Point of Failure (SPOF) – Master node is selected automatically
• No Split Brain – Quorum Policy is applied
Autonomous Data Distribution
• Data distribution and failover are taken care of automatically
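The quorum idea behind "No Split Brain" can be shown with a generic majority check. This is a minimal sketch, assuming a plain strict-majority rule; the slide does not spell out the exact rule GridDB applies, and the class and method names below are hypothetical.

```java
// Generic strict-majority quorum check; an illustration, NOT GridDB's implementation.
final class QuorumSketch {
    // A group of nodes may keep serving only if it can reach a strict majority
    // of the configured cluster size. Two disjoint groups can never both pass.
    static boolean hasQuorum(int reachableNodes, int clusterSize) {
        if (clusterSize <= 0 || reachableNodes < 0 || reachableNodes > clusterSize) {
            throw new IllegalArgumentException("invalid node counts");
        }
        return reachableNodes > clusterSize / 2;
    }

    public static void main(String[] args) {
        // A 5-node cluster split into groups of 3 and 2 by a network partition.
        System.out.println(hasQuorum(3, 5)); // prints true
        System.out.println(hasQuorum(2, 5)); // prints false
    }
}
```

With an even cluster size, a clean half-and-half split leaves neither side with a majority, which is the conservative outcome for this kind of rule.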
Figure: Hybrid cluster management and failover. Labels: clients with a cached data distribution table; a master; original and replica data replicated across Node 1 to Node 5; add new nodes
Time Series Features
• TDPA
  • GridDB implements a Time Series Data Placement Algorithm (TDPA) for high-frequency data to maximize memory utilization
• Expiry Release Function
  • A data retention period can be set so that old data is released and storage is freed
• Aggregate Functions
  • MIN, MAX, AVG, VARIANCE, STDDEV
• Sampling and Interpolation Functions
  • TIME_INTERPOLATED, TIME_SAMPLING, TIME_NEXT, TIME_PREVIOUS
• Trigger functions
  • JMS and REST notifications
GridDB is optimized for Time-Series operations
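To make the interpolation and aggregate functions concrete, here is a minimal plain-Java sketch of what such operations compute. It assumes linear interpolation between the two neighboring rows and uses the voltage samples from the smart meter table; the class and method names are hypothetical and this is not GridDB's implementation or API.

```java
// Illustration of time-series interpolation and averaging; NOT GridDB code.
final class TimeSeriesSketch {
    // Linearly interpolates a value at time t between samples (t0, v0) and (t1, v1).
    static double interpolate(long t0, double v0, long t1, double v1, long t) {
        if (t1 <= t0 || t < t0 || t > t1) {
            throw new IllegalArgumentException("need t0 < t1 and t0 <= t <= t1");
        }
        double fraction = (double) (t - t0) / (double) (t1 - t0);
        return v0 + fraction * (v1 - v0);
    }

    // Arithmetic mean of the given values, the idea behind an AVG aggregate.
    static double average(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Voltage was 100 at 10:45:00 and 101 at 10:45:30; estimate it at 10:45:15.
        // Times are given in seconds relative to 10:45:00.
        System.out.println(interpolate(0L, 100.0, 30L, 101.0, 15L)); // prints 100.5
        System.out.println(average(new double[] {100.0, 101.0, 99.0})); // prints 100.0
    }
}
```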
Real Use Cases
1. Building Energy Management Systems
2. Smart Meters – Electric Power Company
3. Smart City – Ishinomaki City
1. Building Energy Management Systems
• 100+ buildings are managed by the BEMS company in Kawasaki, Japan
• The BEMS company manages over 1 petabyte (one million gigabytes) of sensor data each year
• On average 5MB of data per sensor per day, or approximately 2GB from each sensor per year
• 100-1000 sensors per building, depending on floor area, bringing the collected sensor data to 1TB per building per year
GridDB was used for its easy scalability, simple data model and Time Series querying & functions
2. Smart Meters – Electric Power Company
• One of Japan’s top electric power companies switched from a relational database to GridDB
• The company saw a 2,250-fold increase in throughput over the old system
• Overall processing time went down considerably
• Data center costs were reduced significantly
GridDB was used for its high performance, large data handling and reduced cost
2. Smart Meters – Electric Power Company
• Has been running in production since April 2016
• Data from 3 million smart meters is collected every 30 minutes and is stored for 3 months
• Data size is approximately 2.6 TB
• 13 billion records
• Record size of 200 bytes
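The figures above can be cross-checked with a short calculation. The sketch below assumes that "3 months" means 90 days, that one reading every 30 minutes means 48 readings per meter per day, and that 1 TB is 10^12 bytes; these assumptions and the class name are not from the slide.

```java
// Back-of-the-envelope check of the smart meter data volume.
final class SmartMeterVolume {
    // Total number of records: meters x readings per day x days retained.
    static long records(long meters, int readingsPerDay, int days) {
        return meters * readingsPerDay * days;
    }

    // Storage in decimal terabytes (1 TB = 1e12 bytes) for the given record size.
    static double terabytes(long records, int bytesPerRecord) {
        return records * (double) bytesPerRecord / 1e12;
    }

    public static void main(String[] args) {
        long total = records(3_000_000L, 48, 90);
        System.out.println(total);                 // prints 12960000000, about 13 billion
        System.out.println(terabytes(total, 200)); // about 2.59, in line with 2.6 TB
    }
}
```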
Figure: MDMS system diagram. Labels: 3 million smart meters (SM); Data Input app server; GridDB; RDB; MapReduce (30 Min. Balancing, Charge Cal., Imbalance Cal.); Read Value App; preliminary results and usage for power retailers; 3-node, 4-node and 5-node clusters; Active-Standby cluster
SM – Smart Meter
MDMS – Meter Data Management System
RDB – Relational Database
3. Smart City – Disaster-tolerant Ishinomaki City
GridDB was used for its high-speed processing of large data, long-term data retention, and ability to maintain consistency
Post 2011 disaster recovery plan of Ishinomaki city
PoC of Consignment Charge Calculation System
• Data from 30 million smart meters is collected every 30 minutes and is stored for 1 month
• Data size is approximately 8.6TB
• 43 billion records
• Record size of 200 bytes
• The 1-month charge calculation for the data of 30 million meters was executed in 96 minutes
Figure: PoC system diagram. Labels: 30 million smart meters (SM); Data Input (30M data); GridDB 6 node cluster (8.6TB); MDMS; MapReduce 5 node cluster; Associating Contract Info. (30M data); Charge Calculation (43G records); Imbalance (43G records)
Execution times shown in the figure: 1 min 47 secs, 9 mins, 30 mins, 55 mins
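The four execution times shown on this slide add up to the quoted total of 96 minutes, as the following short check shows. The class and method names are hypothetical, and the sketch does not assume which time belongs to which stage.

```java
// Sums the four stage execution times listed on the slide.
final class PocTiming {
    // Adds up stage durations given in seconds.
    static long totalSeconds(long[] stageSeconds) {
        long total = 0L;
        for (long s : stageSeconds) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        // 1 min 47 secs, 9 mins, 30 mins and 55 mins, all converted to seconds.
        long[] stages = {107L, 9L * 60L, 30L * 60L, 55L * 60L};
        long total = totalSeconds(stages);
        // 5747 seconds, which rounds up to the 96 minutes quoted on the slide.
        System.out.println(total / 60 + " min " + total % 60 + " s"); // prints 95 min 47 s
    }
}
```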
GridDB
Editions, Languages, Connectors
GridDB Editions
GridDB on Amazon AWS Marketplace
Languages and Connectors
• GridDB Community Edition is open source and is available on GitHub
  • https://github.com/griddb
• Currently supports Java, C/C++, REST, Python & Ruby interfaces
• Go, PHP, Perl and JavaScript drivers will be added in the coming months
• A MapReduce connector is available on GitHub
  • https://github.com/griddb/griddb_hadoop_mapreduce
• A KairosDB connector is available on GitHub
  • https://github.com/griddb/griddb_kairosdb
• A Spark connector was recently released on GitHub
  • https://github.com/griddb/griddb_spark
• A Kafka-GridDB integration blog post is available on www.griddb.net
GridDB feature set
Horizontal scaling is near-linear and works great on commodity hardware
• Tested on 100 nodes per cluster, can scale up to 1000 nodes
GridDB's advanced master-slave model eliminates SPOF and split brain
Autonomous data distribution prevents data loss
ACID transactions are guaranteed at the container level
TQL, an SQL-like language for fast querying and analytics
GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
GridDB is custom designed for IoT and other use cases that involve Time Series operations
• TS data types, temporal based querying, geometry type and BLOB types are supported
• Vector sets data type support is in development
Useful Links
• Developers’ website - www.griddb.net
• Toshiba GridDB website - http://solutions.toshiba.com/overview.html
• GitHub repository - https://github.com/griddb
• Quick Start Guide - http://www.griddb.net/en/docs/GridDB_QuickStartGuide.html
• Technical Reference - http://www.griddb.net/en/docs/GridDB_TechnicalReference.pdf
• API Reference - http://www.griddb.net/en/docs/GridDB_API_Reference.html
Contact
Basavaraj Soppannavar
Sr. Strategist, IoT
Basavaraj.Soppannavar@toshiba.com
@griddbcommunity
Follow GridDB
THANK YOU
ADDITIONAL INFO
Yahoo! Cloud Serving Benchmark (YCSB)
YCSB
The Yahoo! Cloud Serving Benchmark (YCSB) is an open source benchmarking suite designed by Yahoo Labs for comparative performance evaluation of NoSQL Database Management Systems
• YCSB is used by DBMS vendors for ‘Benchmark Comparison’
• Traditional benchmarking tools such as TPC (Transaction Processing Performance Council) are used to compare RDBMS
• YCSB measures/compares various attributes of the DBMS such as Latency, Throughput, Durability, Scalability, Availability, Read/Write optimization, Sync/Async replication etc.
YCSB has 2 main parts
• YCSB Client – an extensible workload generator
  • The standard workloads generated by the client can also be extended into user-defined workloads that are run against the system (the DBMS)
• YCSB Core Workloads – a set of scenarios generated by the client to run on the system under test
  • Core workloads give a well-rounded picture of the performance of the system under test
YCSB Workloads
YCSB has 6 core workloads
• Workload A - Update heavy: This workload has a 50/50 mix of reads and writes. Application example: a session store recording recent actions
• Workload B - Read mostly: This workload has a 95/5 read/write mix. Application example: photo tagging; adding a tag is an update, but most operations are to read tags
• Workload C - Read only: This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop)
• Workload D - Read latest: In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest
• Workload E - Short ranges: In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id)
• Workload F - Read-modify-write: In this workload, the client will read a record, modify it, and write back the changes. Application example: user database, where user records are read and modified by the user or to record user activity
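The read/write proportions quoted for workloads A, B and C can be pictured with a tiny operation-mix generator. This is a sketch under stated assumptions, not the YCSB client or its API: the class and method names are hypothetical, and each operation is simply drawn at random according to a read proportion.

```java
import java.util.Random;

// Toy operation-mix generator illustrating read/update proportions; NOT the YCSB client.
final class WorkloadMixSketch {
    // Picks "READ" with probability readProportion, otherwise "UPDATE".
    static String nextOperation(Random rng, double readProportion) {
        return rng.nextDouble() < readProportion ? "READ" : "UPDATE";
    }

    // Generates the given number of operations and counts how many are reads.
    static int countReads(long seed, double readProportion, int operations) {
        Random rng = new Random(seed);
        int reads = 0;
        for (int i = 0; i < operations; i++) {
            if ("READ".equals(nextOperation(rng, readProportion))) {
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        int operations = 100_000;
        // Workload A: roughly half of the operations are reads.
        System.out.println("A reads: " + countReads(42L, 0.50, operations));
        // Workload B: roughly 95% of the operations are reads.
        System.out.println("B reads: " + countReads(42L, 0.95, operations));
        // Workload C: every operation is a read.
        System.out.println("C reads: " + countReads(42L, 1.00, operations));
    }
}
```

Workloads D, E and F add inserts, range scans and read-modify-write steps, which this toy generator does not model.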

Weitere ähnliche Inhalte

Was ist angesagt?

Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013
IntelAPAC
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
inside-BigData.com
 

Was ist angesagt? (20)

Bio bigdata
Bio bigdata Bio bigdata
Bio bigdata
 
Getting Started with Real-Time Analytics
Getting Started with Real-Time AnalyticsGetting Started with Real-Time Analytics
Getting Started with Real-Time Analytics
 
MCT Virtual Summit 2021
MCT Virtual Summit 2021MCT Virtual Summit 2021
MCT Virtual Summit 2021
 
Overview of big data in cloud computing
Overview of big data in cloud computingOverview of big data in cloud computing
Overview of big data in cloud computing
 
Vortrag ralph behrens_ibm-data
Vortrag ralph behrens_ibm-dataVortrag ralph behrens_ibm-data
Vortrag ralph behrens_ibm-data
 
Interactive query in hadoop
Interactive query in hadoopInteractive query in hadoop
Interactive query in hadoop
 
RAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature EngineeringRAPIDS: GPU-Accelerated ETL and Feature Engineering
RAPIDS: GPU-Accelerated ETL and Feature Engineering
 
Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013
 
Google and big query
Google and big queryGoogle and big query
Google and big query
 
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
 
Interactive query using hadoop
Interactive query using hadoopInteractive query using hadoop
Interactive query using hadoop
 
Accelerating analytics in a new era of data
Accelerating analytics in a new era of dataAccelerating analytics in a new era of data
Accelerating analytics in a new era of data
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
AI meets Big Data
AI meets Big DataAI meets Big Data
AI meets Big Data
 
Overview of stinger interactive query for hive
Overview of stinger   interactive query for hiveOverview of stinger   interactive query for hive
Overview of stinger interactive query for hive
 
Advanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time SpeedAdvanced Analytics for Any Data at Real-Time Speed
Advanced Analytics for Any Data at Real-Time Speed
 
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャリアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
 
How To Achieve Real-Time Analytics On A Data Lake Using GPUs
How To Achieve Real-Time Analytics On A Data Lake Using GPUsHow To Achieve Real-Time Analytics On A Data Lake Using GPUs
How To Achieve Real-Time Analytics On A Data Lake Using GPUs
 
Scaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBaseScaling Out With Hadoop And HBase
Scaling Out With Hadoop And HBase
 

Ähnlich wie Purpose-built NoSQL Database for IoT by Basavaraj Soppannavar

In memory grids IMDG
In memory grids IMDGIn memory grids IMDG
In memory grids IMDG
Prateek Jain
 
3. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 20133. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 2013
Taldor Group
 

Ähnlich wie Purpose-built NoSQL Database for IoT by Basavaraj Soppannavar (20)

Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
 
In memory grids IMDG
In memory grids IMDGIn memory grids IMDG
In memory grids IMDG
 
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
Choosing the Right Database: Exploring MySQL Alternatives for Modern Applicat...
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGs
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
Analyzing petabytes of smartmeter data using Cloud Bigtable, Cloud Dataflow, ...
Analyzing petabytes of smartmeter data using Cloud Bigtable, Cloud Dataflow, ...Analyzing petabytes of smartmeter data using Cloud Bigtable, Cloud Dataflow, ...
Analyzing petabytes of smartmeter data using Cloud Bigtable, Cloud Dataflow, ...
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
 
Big Data - Umesh Bellur
Big Data - Umesh BellurBig Data - Umesh Bellur
Big Data - Umesh Bellur
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?
 
3. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 20133. ami big data hadoop on ucs seminar may 2013
3. ami big data hadoop on ucs seminar may 2013
 
Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast DataBig Data LDN 2017: BI Converges with AI - GPUs for Fast Data
Big Data LDN 2017: BI Converges with AI - GPUs for Fast Data
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
Using a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsUsing a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming Aggregations
 
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big Data
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big DataVoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big Data
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big Data
 
MariaDB AX ユースケース / ColumnStore 1.2 新機能
MariaDB AX ユースケース / ColumnStore 1.2 新機能MariaDB AX ユースケース / ColumnStore 1.2 新機能
MariaDB AX ユースケース / ColumnStore 1.2 新機能
 
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
Supercharging Smart Meter BIG DATA Analytics with Microsoft Azure Cloud- SRP ...
 
Welcome to the Datasphere – the next level of storage
Welcome to the Datasphere – the next level of storageWelcome to the Datasphere – the next level of storage
Welcome to the Datasphere – the next level of storage
 
Seagate – Next Level Storage (Webinar mit Boston Server & Storage, 2018 09-28)
Seagate – Next Level Storage (Webinar mit Boston Server & Storage,  2018 09-28)Seagate – Next Level Storage (Webinar mit Boston Server & Storage,  2018 09-28)
Seagate – Next Level Storage (Webinar mit Boston Server & Storage, 2018 09-28)
 
Stsg17 speaker yousunjeong
Stsg17 speaker yousunjeongStsg17 speaker yousunjeong
Stsg17 speaker yousunjeong
 

Mehr von Data Con LA

Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA
 

Mehr von Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Corporate and higher education May webinar.pptx
Purpose-built NoSQL Database for IoT by Basavaraj Soppannavar

  • 1. Purpose-built In-Memory NoSQL Database for Internet of Things. Basavaraj Soppannavar, Sr. Strategist, IoT, Toshiba America Research Inc. 5th Aug 2017, Los Angeles
  • 2. Agenda
      • Internet of Things
      • IoT Data & its properties
      • GridDB
      • Real Use Cases
  • 4. Internet of Things Predictions
      • Number of connected devices by 2020: 50 billion (Cisco); 28.1 billion* (IDC); 20.8 billion* (Gartner)
      • *Not including smartphones & computers
      • Most IoT smart devices aren’t in your home or phone; they are in factories, businesses, and healthcare (Intel Infographics): 40.2% in business and manufacturing, 30.3% in healthcare
      • IoT economics, revenue projections: $300 billion (Gartner); $470 billion (Bain)
  • 5. Technology Stack of IoT
      • Applications: Mobile, Web, Business Apps
      • Analytics & AI: BI, Visualization, Data Mining, DPP* Analytics, Machine Learning
      • Data Storage and Retrieval: GridDB, HBase, Cassandra, MongoDB, MS-SQL, Hadoop
      • Data Aggregation / Processing: Storm, Kafka, Fluentd, RabbitMQ
      • Session / Communication: CoAP, MQTT, DDS, XMPP, AMQP, HTTP
      • Transport: IPv4, IPv6
      • Link: Ethernet, WiFi, Bluetooth, BLE, Zigbee, Zwave, RFID, 2G, 3G, LTE
      • Connectivity: Wireless, USB, RJ45 (Ethernet), DSL
      • Device: Sensors, Embedded chips, Cameras, Wearables
      • Side labels on the stack: Device and Data Management; Security and Privacy
      • *DPP: Descriptive, Predictive, Prescriptive
  • 6. Toshiba’s Full Stack Solution for IoT & Big Data: GridDB NoSQL Database
  • 7. IoT Data & Databases
  • 8. Properties of IoT Data
      • Periodic
      • Large volume but small record size
      • Structured
      • Time stamped
      • Sample data (a single record is less than 100 bytes; there are millions of records):
          Timestamp              Voltage   Current   Temperature
          2017/05/03 10:45:00    100       0.64      20.5
          2017/05/03 10:45:30    101       0.63      20.4
          2017/05/03 10:46:00    99        0.65      20.5
          ...
  • 9. Database Requirements of IoT
      • Highly available and fault tolerant
      • Great read and write performance for millions of records
      • Time series data and operations support
      • Fast search and range queries
      • Spatial and geo-location support
      • Real-time streaming support
      • Support for ever-increasing data (scale out)
  • 10. Evolution of Database Management Systems
      • 90s: RDBMS (operational / transactional database)
      • 2000s: RDBMS plus OLAP / DW (data warehouse for BI and analytics)
      • Today:
          • RDBMS: MySQL, Postgres
          • NoSQL DBs: Key Value Store (Riak, Aerospike); Wide Column Store (Cassandra, HBase); Document Store (MongoDB, Couchbase); Graph Store (Neo4j)
          • Hadoop: Cloudera, Hortonworks
          • OLAP / DW: Teradata, Vertica, GreenPlum
      • OLAP: Online Analytical Processing; DW: Data Warehouse
      • Inspired by source: https://practicalanalytics.co/2015/06/02/the-maturing-nosql-ecoystem-a-c-level-guide/
  • 11. GridDB: A Purpose-built In-Memory NoSQL Database for IoT
  • 12. What is GridDB? A Highly Scalable, In-Memory, Distributed Key-Value IoT Database
  • 13. GridDB – Highly Scalable Database for IoT
  • 14. Highly Scalable Distributed Key-Container Database
  • 15. NoSQL Data Models
      • GridDB has a unique Key-Container data model
      • A container can be visualized as a table of a relational database
      • Fixed schema
  • 16. Key Container Data Model
      • A container is a set of data with a schema
      • GridDB supports 2 types of containers
          • Collection container: for generic records management
          • Time-series container: for time series records management
      • The Key Container model provides
          • Data consistency within the container (ACID is guaranteed within the container)
          • Faster data retrieval and search because of the schema
          • TQL, an SQL-like query language for reading data from the containers
  • 17. Key Container Data Model - Example
      • Schema definition: static class SMData { @RowKey Date timestamp; int voltage; double current; int temp; }
      • Creating a TS container: TimeSeries<SMData> ts = store.putTimeSeries("SM101", SMData.class);
      • The container name ("SM101") is the key; the SMData class is the schema
  • 18. High Performance
      • GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
      • Excess data from memory is saved to SSD/disk
      • Diagram: GridDB 4-node cluster (In-Memory + Disk hybrid); memory from multiple nodes; each node/server has its own SSD/HDD; new nodes can be added
  • 19. YCSB Performance Results
      • Yahoo! Cloud Serving Benchmark (YCSB) tests comparing GridDB and Cassandra show that*
          • Average throughput of GridDB is 4x-5x higher than that of Cassandra
          • Average latency of GridDB is 3x-4x lower than that of Cassandra
      • Tests were performed on the same hardware (MS Azure Standard_D2 dual-core CPUs, 7GB RAM per node)
      • 1 client per core; 128 threads per client
      • Charts (GridDB vs Cassandra, YCSB workloads A, B, C, D, F): average throughput in '000 ops/sec on 16 nodes; average throughput in '000 ops/sec on 32 nodes; read latency in microseconds on 16 nodes
      • *Tests performed by Fixstars
  • 20. Superior Stability
      • Chart: YCSB Workload A 24-hour stability test, throughput (ops) against elapsed time (seconds), GridDB vs Cassandra
      • Tests performed by Fixstars
  • 21. High Availability
      • Advanced Master-Slave Model (Hybrid Cluster Management)
          • No Single Point of Failure (SPOF): the master node is selected automatically
          • No split brain: a quorum policy is applied
      • Autonomous Data Distribution
          • Data distribution and failover are taken care of automatically
      • Diagram labels: Master; Node 1 to Node 5, each holding Original and Replica data; Data Replication; Failover; Data Distribution Table (cached); Clients; Add new nodes
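The slide says only that "Quorum Policy is applied" to avoid split brain. As a rough illustration of the general idea, the sketch below uses a strict-majority rule: a group of nodes may keep operating only if it can reach more than half of the configured cluster. The class and method names are hypothetical, and the majority rule is an assumption for illustration, not a statement of how GridDB implements its quorum policy.

```java
/** Majority-quorum check: a sketch of the general idea, not GridDB's implementation. */
final class QuorumCheck {

    /** Returns true when the reachable nodes form a strict majority of the cluster. */
    static boolean hasQuorum(int reachableNodes, int clusterSize) {
        if (clusterSize <= 0 || reachableNodes < 0 || reachableNodes > clusterSize) {
            throw new IllegalArgumentException("invalid node counts");
        }
        // Integer division rounds down, so this is equivalent to reachable > n / 2.
        return reachableNodes > clusterSize / 2;
    }

    public static void main(String[] args) {
        int clusterSize = 5; // five nodes, as in the slide's diagram

        // A network partition splits the cluster 3 / 2:
        System.out.println(hasQuorum(3, clusterSize)); // true: this side may continue
        System.out.println(hasQuorum(2, clusterSize)); // false: this side must stop

        // An even 2 / 2 split of a 4-node cluster leaves neither side with quorum.
        System.out.println(hasQuorum(2, 4)); // false
    }
}
```

Because at most one side of any partition can hold a strict majority, two masters can never be active at the same time under this rule.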
  • 22. Time Series Features (GridDB is optimized for time-series operations)
      • TDPA
          • GridDB implements a Time Series Data Placement Algorithm for high-frequency data to maximize memory utilization
      • Expiry Release Function
          • A data retention period can be set so that old data is released and storage is freed
      • Aggregate Functions
          • MIN, MAX, AVG, VARIANCE, STDDEV
      • Sampling and Interpolation Functions
          • TIME_INTERPOLATED, TIME_SAMPLING, TIME_NEXT, TIME_PREVIOUS
      • Trigger Functions
          • JMS and REST notifications
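To make the interpolation functions more concrete, here is a minimal plain-Java sketch of linear interpolation between two timestamped readings, which is the kind of estimate an interpolation function produces for a timestamp that has no stored row. It does not call GridDB, and the class and method names are hypothetical. The exact semantics of TIME_INTERPOLATED and TIME_SAMPLING should be taken from the GridDB reference rather than from this sketch. The sample values come from the temperature column on slide 8.

```java
import java.time.Duration;
import java.time.LocalDateTime;

/** Linear interpolation between two time-stamped samples (illustrative only). */
final class TimeInterpolation {

    /** Estimates the value at time t, which must lie between t0 and t1 inclusive. */
    static double interpolate(LocalDateTime t0, double v0,
                              LocalDateTime t1, double v1,
                              LocalDateTime t) {
        long spanMillis = Duration.between(t0, t1).toMillis();
        if (spanMillis <= 0) {
            throw new IllegalArgumentException("t1 must be after t0");
        }
        long offsetMillis = Duration.between(t0, t).toMillis();
        if (offsetMillis < 0 || offsetMillis > spanMillis) {
            throw new IllegalArgumentException("t must lie between t0 and t1");
        }
        double fraction = (double) offsetMillis / spanMillis;
        return v0 + (v1 - v0) * fraction;
    }

    public static void main(String[] args) {
        // Two consecutive temperature readings, 30 seconds apart.
        LocalDateTime t0 = LocalDateTime.of(2017, 5, 3, 10, 45, 0);
        LocalDateTime t1 = LocalDateTime.of(2017, 5, 3, 10, 45, 30);

        // Estimate the temperature halfway between the two readings.
        double estimate = interpolate(t0, 20.5, t1, 20.4, t0.plusSeconds(15));
        System.out.printf("%.2f%n", estimate);
    }
}
```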
  • 23. Real Use Cases
      • 1. Building Energy Management Systems
      • 2. Smart Meters – Electric Power Company
      • 3. Smart City – Ishinomaki City
  • 24. 1. Building Energy Management Systems
      • 100+ buildings are managed by the BEMS company in Kawasaki, Japan
      • The BEMS company manages over 1 petabyte (1 million gigabytes) of sensor data each year
      • Each sensor produces 5MB of data per day on average, or approximately 2GB per year
      • Each building has 100-1000 sensors, depending on its floor area, which brings the collected sensor data to about 1TB per building per year
      • GridDB was used for its easy scalability, simple data model, and time series querying and functions
  • 25. 2. Smart Meters – Electric Power Company
      • One of Japan’s top electric power companies switched from a relational database to GridDB
      • The company saw throughput increase to 2,250 times that of the old system
      • Overall processing time went down considerably
      • Data center costs were reduced significantly
      • GridDB was used for its high performance, large data handling and reduced cost
  • 26. 2. Smart Meters – Electric Power Company
      • Has been running as a production system since April 2016
      • Data from 3 million smart meters is collected every 30 minutes and stored for 3 months
      • Data size is approximately 2.6 TB
      • 13 billion records
      • Record size of 200 bytes
      • Diagram labels: 3 million smart meters (SM); MDMS; Data Input; Read Value App; App Server; GridDB clusters (3-node, 4-node and 5-node, including an Active-Standby cluster); MapReduce jobs (Charge Cal., Imbalance Cal., 30 Min. Balancing); RDB; Preliminary Results; Usage; Power Retailers
      • SM: Smart Meter; MDMS: Meter Data Management System; RDB: Relational Database
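The record count and data size on this slide follow from the collection interval and retention period, which a short calculation can confirm. The sketch below is illustrative only: the class and method names are hypothetical, it assumes one reading per meter every 30 minutes (48 per day), treats "3 months" as 90 days, and uses decimal terabytes (10^12 bytes).

```java
/** Back-of-the-envelope sizing for the smart-meter figures (illustrative only). */
final class SmartMeterSizing {

    /** Total records retained: meters x readings per day x retention days. */
    static long recordCount(long meters, int readingsPerDay, int retentionDays) {
        long perDay = Math.multiplyExact(meters, (long) readingsPerDay);
        return Math.multiplyExact(perDay, (long) retentionDays);
    }

    /** Raw data size in decimal terabytes (10^12 bytes), ignoring storage overhead. */
    static double terabytes(long records, int bytesPerRecord) {
        return records * (double) bytesPerRecord / 1e12;
    }

    public static void main(String[] args) {
        long meters = 3_000_000L;   // from the slide
        int readingsPerDay = 48;    // one reading every 30 minutes
        int retentionDays = 90;     // assumption: "3 months" taken as 90 days
        int bytesPerRecord = 200;   // from the slide

        long records = recordCount(meters, readingsPerDay, retentionDays);
        System.out.println(records); // 12960000000, i.e. about 13 billion
        System.out.printf("%.2f TB%n", terabytes(records, bytesPerRecord));
    }
}
```

Under these assumptions the result is roughly 13 billion records and about 2.6 TB, matching the slide.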
  • 27. 3. Smart City – Disaster-tolerant Ishinomaki City
      • Post-2011 disaster recovery plan of Ishinomaki City
      • GridDB was used for its high-speed processing of large data, long-term data retention, and consistency
  • 28. PoC of Consignment Charge Calculation System
      • Data from 30 million smart meters is collected every 30 minutes and stored for 1 month
      • Data size is approximately 8.6TB
      • 43 billion records
      • Record size of 200 bytes
      • The 1-month charge calculation for 30 million meters was executed in 96 minutes
      • Diagram labels: 30 million smart meters (SM); MDMS; GridDB 6-node cluster (8.6TB); MapReduce 5-node cluster
      • Stages shown: Data Input (30M data); Associating Contract Info. (30M data); Charge Calculation (43G records); Imbalance (43G records)
      • Execution times shown: 1 min 47 secs; 9 mins; 30 mins; 55 mins
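The four execution times shown on this slide add up to the quoted 96 minutes, as the following sketch confirms. It is illustrative only: the class and method names are hypothetical, and because the transcript does not say which time belongs to which stage, the code simply sums the four durations in the order they appear.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

/** Sums the per-stage execution times reported for the PoC (illustrative only). */
final class PocStageTimes {

    /** Adds up a list of stage durations. */
    static Duration total(List<Duration> stages) {
        Duration sum = Duration.ZERO;
        for (Duration stage : stages) {
            sum = sum.plus(stage);
        }
        return sum;
    }

    /** The four execution times listed on the slide, in the order shown. */
    static List<Duration> slideStages() {
        return Arrays.asList(
                Duration.ofMinutes(1).plusSeconds(47),
                Duration.ofMinutes(9),
                Duration.ofMinutes(30),
                Duration.ofMinutes(55));
    }

    public static void main(String[] args) {
        long seconds = total(slideStages()).getSeconds();
        // 95 min 47 s, which rounds to the 96 minutes quoted on the slide
        System.out.println((seconds / 60) + " min " + (seconds % 60) + " s");
    }
}
```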
  • 31. GridDB on Amazon AWS Marketplace
  • 32. Languages and Connectors
      • GridDB Community Edition is open source and is available on GitHub
          • https://github.com/griddb
      • Currently supports Java, C/C++, REST, Python and Ruby interfaces
      • Go, PHP, Perl and JavaScript drivers will be added in the coming months
      • A MapReduce connector is available on GitHub
          • https://github.com/griddb/griddb_hadoop_mapreduce
      • A KairosDB connector is available on GitHub
          • https://github.com/griddb/griddb_kairosdb
      • A Spark connector was recently released on GitHub
          • https://github.com/griddb/griddb_spark
      • A Kafka-GridDB integration blog post is up on the www.griddb.net website
  • 33. GridDB feature set
      • Horizontal scaling is near-linear and works well on commodity hardware
          • Tested on 100 nodes per cluster; can scale up to 1000 nodes
      • GridDB's advanced master-slave model eliminates SPOF and split brain
      • Autonomous data distribution prevents data loss
      • ACID transactions are guaranteed at the container level
      • TQL, an SQL-like language for fast querying and analytics
      • GridDB’s hybrid composition of In-Memory and Disk architecture is optimized for maximum performance
      • GridDB is custom designed for IoT and other use cases that involve time series operations
          • TS data types, temporal-based querying, geometry type and BLOB types are supported
          • Vector sets data type support is in development
  • 34. Useful Links
      • Developers’ website: www.griddb.net
      • Toshiba GridDB website: http://solutions.toshiba.com/overview.html
      • GitHub repository: https://github.com/griddb
      • Quick Start Guide: http://www.griddb.net/en/docs/GridDB_QuickStartGuide.html
      • Technical Reference: http://www.griddb.net/en/docs/GridDB_TechnicalReference.pdf
      • API Reference: http://www.griddb.net/en/docs/GridDB_API_Reference.html
      • Contact: Basavaraj Soppannavar, Sr. Strategist, IoT, Basavaraj.Soppannavar@toshiba.com
      • Follow GridDB: @griddbcommunity
  • 35. Thank you
  • 37. Yahoo! Cloud Serving Benchmark (YCSB)
  • 38. YCSB
      • The Yahoo! Cloud Serving Benchmark is an open source benchmarking suite designed by Yahoo Labs for comparative performance evaluation of NoSQL database management systems
          • YCSB is used by DBMS vendors for benchmark comparison
          • Traditional benchmarking tools such as TPC (Transaction Processing Performance Council) are used to compare RDBMS
          • YCSB measures and compares various attributes of the DBMS such as latency, throughput, durability, scalability, availability, read/write optimization, sync/async replication, etc.
      • YCSB has 2 main parts
          • YCSB Client: an extensible workload generator. The standard workloads it generates can be extended with user-defined workloads to run against the DBMS
          • YCSB Core Workloads: a set of scenarios generated by the client to run on the system under test. The core workloads give a well-rounded picture of that system’s performance
  • 39. YCSB Workloads: YCSB has 6 core workloads
      • Workload A, Update heavy: This workload has a mix of 50/50 reads and writes. An application example is a session store recording recent actions
      • Workload B, Read mostly: This workload has a 95/5 reads/write mix. Application example: photo tagging; adding a tag is an update, but most operations are to read tags
      • Workload C, Read only: This workload is 100% read. Application example: user profile cache, where profiles are constructed elsewhere (e.g., Hadoop)
      • Workload D, Read latest: In this workload, new records are inserted, and the most recently inserted records are the most popular. Application example: user status updates; people want to read the latest
      • Workload E, Short ranges: In this workload, short ranges of records are queried, instead of individual records. Application example: threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id)
      • Workload F, Read-modify-write: In this workload, the client will read a record, modify it, and write back the changes. Application example: user database, where user records are read and modified by the user or to record user activity
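The read/write mixes quoted for workloads A, B and C can be pictured as a weighted coin flip per operation. The sketch below is a simplified illustration of that idea, not YCSB's own client code: the class, enum and method names are hypothetical, and only the proportions stated on the slide (50/50, 95/5 and 100% read) are used.

```java
import java.util.Random;

/** Picks read or update operations according to a read proportion (illustrative only). */
final class WorkloadMix {

    enum Op { READ, UPDATE }

    /** Maps a uniform draw u in [0, 1) to an operation for the given read proportion. */
    static Op choose(double u, double readProportion) {
        return u < readProportion ? Op.READ : Op.UPDATE;
    }

    /** Counts how many of the generated operations are reads, using a seeded generator. */
    static int countReads(double readProportion, int operations, long seed) {
        Random rng = new Random(seed);
        int reads = 0;
        for (int i = 0; i < operations; i++) {
            if (choose(rng.nextDouble(), readProportion) == Op.READ) {
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        int operations = 100_000;
        long seed = 42L; // arbitrary seed so that runs are repeatable

        System.out.println("A (50/50):     " + countReads(0.50, operations, seed) + " reads");
        System.out.println("B (95/5):      " + countReads(0.95, operations, seed) + " reads");
        System.out.println("C (100% read): " + countReads(1.00, operations, seed) + " reads");
    }
}
```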