RDB: Repairable Database Systems

Alexey Smirnov and Tzi-cker Chiueh
Experimental Computer Systems Lab, Department of Computer Science, SUNY Stony Brook

Motivation

Suppose you are a DBA and you have just noticed that your database was compromised 24 hours ago. How would you repair the database?

Currently, the only way to do this is to restore a database backup and recommit all benign transactions manually.

Challenges: (1) How to tell which transactions are benign and which are malicious? Identifying the initial set of malicious transactions is not enough, because the initial damage can be spread over the database by subsequent benign transactions. (2) The amount of data can be huge and the repair process is very error-prone. There is a need for a way to automate it.

Ideally, an intrusion-resilient DBMS should be able to:
• Track inter-transaction dependencies;
• Perform a selective transaction rollback.

We propose an implementation framework called RDB that can render an off-the-shelf DBMS intrusion resilient without modifying its internals. RDB has two major components: a tracking subsystem, which runs at run time, and a recovery subsystem, which runs offline.

Definition of Transaction Dependency

The read set of an SQL statement S is the set of rows fetched by this statement.

We say that statement S2 depends on statement S1 if at least one row from the read set of S2 was modified by S1. We say that transaction T2 depends on transaction T1 if at least one statement of T2 depends on a statement from T1. This definition is prone to both false positives and false negatives. Example of a false positive dependency:

  A1      A2      A3
  100     5       5
  200     5       6
  300     1       7

  T1: SET A2=5 WHERE A1<250
  T2: SELECT A3 WHERE A3>3

T2 reads rows that T1 modified, so it is recorded as dependent on T1, even though T2 reads only column A3 and T1 changed only column A2.

Also, in general it is impossible to determine all transaction dependencies by looking only at the traffic between a client and the DB server, because part of the logic may be inside the application itself.

Transaction Dependency Tracking

To track dependencies successfully, the tracking mechanism should be able to intercept both read and write actions performed by the database server. Possible ways of interception are:
• Database triggers (online): a trigger cannot be added for SELECT statements;
• Database log analysis (offline): read operations are not logged; its big advantage is that it adds no run-time overhead;
• Tracking proxy (online): a small program sitting between a client and a server that intercepts all SQL statements sent by the client and the results sent back by the server.

RDB uses both the second and the third approaches to implement dependency tracking.

RDB inserts a proxy JDBC driver between the DB server and a client that transparently intercepts all queries and results. The proxy can be either on the client side or on the server side.

The following changes are made to the database at the time of its creation:
• A new field tr_id is added to each table. It contains the ID of the last transaction that modified a particular row;
• Table trans_dep(tr_id:INTEGER, dep_ids:VARCHAR) stores the IDs of transactions that depend on transaction tr_id;
• Table annot(tr_id:INTEGER, descr:VARCHAR) stores annotations for transaction tr_id.

The proxy uses its own transaction IDs because there is no standard way to access the internal transaction ID of a database.

SQL Statements Rewriting

As a part of the dependency tracking mechanism, the SQL proxy rewrites certain classes of SQL statements coming from a client.

Original statement:    SELECT t1.a1,…,t1.an,…,tk.ank FROM t1,…,tk WHERE c
Modified statement(s): SELECT t1.a1,…,t1.an,…,tk.ank, t1.trid,…,tk.trid FROM t1,…,tk WHERE c

Original statement:    SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b
Modified statement(s): SELECT t.trid FROM t WHERE c
                       SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b

Original statement:    UPDATE t SET a1=v1,…,an=vn WHERE c
Modified statement(s): UPDATE t SET a1=v1,…,an=vn, trid=curTrID WHERE c

Original statement:    INSERT INTO t(a1,…,an) VALUES (v1,…,vn)
Modified statement(s): INSERT INTO t(a1,…,an,trid) VALUES (v1,…,vn,curTrID)

Original statement:    COMMIT
Modified statement(s): INSERT INTO trans_dep(curTrID,…)
                       COMMIT

Database Repair

The database is repaired by compensating malicious transactions. When using RDB, the repair process consists of the following steps:
• Database log analysis to reconstruct complete dependency information and generate compensating transactions;
• Dependency graph visualization;
• Repairing the database by committing compensating transactions.

Different DBMSs provide different facilities for log analysis. We have studied three database servers: Oracle 9.2.0, PostgreSQL 7.2.2, and Sybase ASE 12.5. All of them ultimately provide enough information to generate compensating transactions.

For visualization we used GraphViz, a free graph drawing software from AT&T. The application allows the user to select an initial set of malicious transactions and computes its transitive closure. The result can then be refined by the user to build the final set of transactions to be compensated.

Performance Results

We used the TPC-C benchmark to evaluate the run-time overhead of the JDBC proxy. The size of the test database was about 4 GB.

We varied the following parameters:
• Transaction mix (read intensive and read/write intensive);
• Connection type (local or over a network);
• Total footprint size W (effect of database cache).

Our results suggest that the overhead of the proxy is between 6% and 13% for a typical load.

Contact Information

ECSL Lab at SUNY Stony Brook: http://www.ecsl.cs.sunysb.edu
E-mail: {alexey,chiueh}@cs.sunysb.edu
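
The statement rewriting done by the tracking proxy, and the false positive from the dependency definition, can be illustrated with a small sketch. This is not the RDB implementation: RDB is a JDBC proxy, whereas the sketch below uses Python with an in-memory SQLite database. The `rewrite` function is a hypothetical helper that handles only simple single-table statements with a WHERE clause, and the pre-update values of column a2 are assumed for illustration. The column name `trid` follows the rewriting rules on the poster.

```python
import re
import sqlite3


def rewrite(sql, cur_tr_id):
    """Hypothetical rewriter: tags writes with cur_tr_id and makes reads return trid.

    Handles only simple single-table statements; anything else passes through.
    """
    s = sql.strip()
    m = re.match(r"(?is)^UPDATE\s+(\w+)\s+SET\s+(.*?)\s+WHERE\s+(.*)$", s)
    if m:
        table, assignments, cond = m.groups()
        return f"UPDATE {table} SET {assignments}, trid={cur_tr_id} WHERE {cond}"
    m = re.match(r"(?is)^INSERT\s+INTO\s+(\w+)\s*\((.*?)\)\s*VALUES\s*\((.*)\)$", s)
    if m:
        table, cols, vals = m.groups()
        return f"INSERT INTO {table}({cols}, trid) VALUES ({vals}, {cur_tr_id})"
    m = re.match(r"(?is)^SELECT\s+(.*?)\s+FROM\s+(\w+)\s+WHERE\s+(.*)$", s)
    if m:
        cols, table, cond = m.groups()
        return f"SELECT {cols}, {table}.trid FROM {table} WHERE {cond}"
    return s


def demo():
    """Replays the poster's false-positive example; returns the IDs T2 depends on."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t(a1 INTEGER, a2 INTEGER, a3 INTEGER, trid INTEGER)")
    # Transaction 0 loads the rows (the initial a2 values are assumed).
    for row in ("100, 1, 5", "200, 2, 6", "300, 1, 7"):
        db.execute(rewrite(f"INSERT INTO t(a1, a2, a3) VALUES ({row})", 0))
    # T1 (ID 1) writes column a2 only.
    db.execute(rewrite("UPDATE t SET a2=5 WHERE a1<250", 1))
    # T2 (ID 2) reads column a3 only, but the rewritten query also returns trid.
    rows = db.execute(rewrite("SELECT a3 FROM t WHERE a3>3", 2)).fetchall()
    db.close()
    # Every other transaction whose ID tags a fetched row counts as a dependency.
    return {trid for (_a3, trid) in rows if trid != 2}


if __name__ == "__main__":
    print(rewrite("UPDATE t SET a2=5 WHERE a1<250", 1))
    print(demo())
```

In this sketch T2 is reported as depending on T1 because tracking is per row: two of the rows T2 fetched carry T1's ID, although T1 never changed the column that T2 read.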

RDB: Automated Repair of Compromised Databases

  • 1. RDB: Repairable Database Systems
Alexey Smirnov and Tzi-cker Chiueh
Experimental Computer Systems Lab, Department of Computer Science, SUNY Stony Brook

Motivation

Suppose you are a DBA and you have just noticed that your database was compromised 24 hours ago. How would you repair the database?

Currently, the only way to do this is to restore a database backup and recommit all benign transactions manually.

Challenges: (1) How to tell which transactions are benign and which are malicious? Identifying the initial set of malicious transactions is not enough, because the initial damage can be spread over the database by subsequent benign transactions. (2) The amount of data can be huge and the repair process is very error-prone. There is a need for a way to automate it.

Ideally, an intrusion-resilient DBMS should be able to:
• Track inter-transaction dependencies;
• Perform a selective transaction rollback.

We propose an implementation framework called RDB that can render an off-the-shelf DBMS intrusion resilient without modifying its internals. RDB has two major components: a tracking subsystem, which runs at run time, and a recovery subsystem, which runs offline.

RDB inserts a proxy JDBC driver between the DB server and a client that transparently intercepts all queries and results. The proxy can be either on the client side or on the server side.

Definition of Transaction Dependency

A read set of an SQL statement S is the set of rows fetched by this statement.

We will say that statement S2 depends on statement S1 if at least one row from the read set of S2 was modified by S1.

We will say that transaction T2 depends on transaction T1 if at least one statement of T2 depends on a statement from T1. This definition is prone to both false positives and false negatives. Example of a false positive dependency:

A1    A2    A3
100   5     5
200   5     6
300   1     7

T1: SET A2=5 WHERE A1<250
T2: SELECT A3 WHERE A3>3

T2 reads rows that T1 modified, so a dependency is recorded, yet T2 reads only column A3, which T1 did not change.

Also, in general it is impossible to determine all transaction dependencies by looking only at the traffic between a client and the DB server, because part of the logic may be inside the application itself.

Transaction Dependency Tracking

To track dependencies successfully, the tracking mechanism should be able to intercept both read and write actions performed by the database server. Possible ways of interception are:
• Database triggers (online): cannot add a trigger for SELECT statements;
• Database log analysis (offline): read operations are not logged; no run-time overhead is its big advantage;
• Tracking proxy (online): a small program sitting between a client and a server that intercepts all SQL statements sent by the client and results sent back by the server.

RDB uses both the second and the third approaches to implement dependency tracking.

The following changes are made to the database at the time of its creation:
• A new field tr_id is added to each table. It contains the ID of the last transaction that modified a particular row;
• Table trans_dep(tr_id:INTEGER, dep_ids:VARCHAR): stores IDs of transactions that depend on transaction tr_id;
• Table annot(tr_id:INTEGER, descr:VARCHAR): stores annotations for transaction tr_id.

The proxy uses its own transaction IDs because there is no standard way to access the internal transaction ID of a database.

SQL Statements Rewriting

As a part of the dependency tracking mechanism, the SQL proxy rewrites certain classes of SQL statements coming from a client.

Original:  SELECT t1.a1,…,t1.an,…,tk.ank FROM t1,…,tk WHERE c
Modified:  SELECT t1.a1,…,t1.an,…,tk.ank, t1.trid,…,tk.trid FROM t1,…,tk WHERE c

Original:  SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b
Modified:  SELECT t.trid FROM t WHERE c
           SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b

Original:  UPDATE t SET a1=v1,…,an=vn WHERE c
Modified:  UPDATE t SET a1=v1,…,an=vn, trid=curTrID WHERE c

Original:  INSERT INTO t(a1,…,an) VALUES (v1,…,vn)
Modified:  INSERT INTO t(a1,…,an,trid) VALUES (v1,…,vn,curTrID)

Original:  COMMIT
Modified:  INSERT INTO trans_dep(curTrID,…)
           COMMIT

Database Repair

The database is repaired by compensating malicious transactions. When using RDB, the repair process consists of the following steps:
• Database log analysis to reconstruct complete dependency information and generate compensating transactions;
• Dependency graph visualization;
• Repairing the database by committing compensating transactions.

Different DBMSs provide different facilities for log analysis. We have studied three database servers: Oracle 9.2.0, PostgreSQL 7.2.2, and Sybase ASE 12.5. In the end, all of them provide enough information to generate compensating transactions.

We used GraphViz, a free graph drawing software package from AT&T. The application allows the user to select an initial set of malicious transactions and computes its transitive closure. Then the result can be refined by the user to build the final set of transactions to be compensated.

Performance Results

We used the TPC-C benchmark to evaluate the run-time overhead of the JDBC proxy. The size of the test database was about 4GB.

We varied the following parameters:
• Transaction mix (read intensive and read/write intensive);
• Connection type (local or over a network);
• Total footprint size W (effect of database cache).

Our results suggest that the overhead of the proxy is between 6% and 13% for a typical load.

Contact Information

ECSL Lab at SUNY Stony Brook: http://www.ecsl.cs.sunysb.edu
E-mail: {alexey,chiueh}@cs.sunysb.edu
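
The statement rewriting rules from the poster can be illustrated with a small sketch. This is not the authors' JDBC proxy: it is a Python approximation built on regular expressions, and the function name `rewrite` and the constant `TRID` are hypothetical. It assumes single-statement input, plain table names without aliases, and no SQL keywords inside string literals. The COMMIT rule is left out because the poster elides the values inserted into trans_dep.

```python
import re

TRID = "trid"  # tracking column name as it appears in the poster's rewriting rules


def rewrite(sql, cur_tr_id):
    """Return the list of statements to send instead of `sql` (sketch only)."""
    stmt = sql.strip().rstrip(";")

    # UPDATE t SET ... WHERE c  ->  UPDATE t SET ..., trid=curTrID WHERE c
    m = re.match(r"(?is)^(UPDATE\s+\w+\s+SET\s+)(.*?)(\s+WHERE\s+.*)?$", stmt)
    if m:
        where = m.group(3) or ""
        return [f"{m.group(1)}{m.group(2)}, {TRID}={cur_tr_id}{where}"]

    # INSERT INTO t(cols) VALUES (vals)  ->  add the trid column and its value
    m = re.match(
        r"(?is)^(INSERT\s+INTO\s+\w+\s*)\((.*?)\)(\s*VALUES\s*)\((.*)\)$", stmt
    )
    if m:
        return [
            f"{m.group(1)}({m.group(2)}, {TRID})"
            f"{m.group(3)}({m.group(4)}, {cur_tr_id})"
        ]

    # Aggregate SELECT: fetch the trid values with a separate query first,
    # then run the original statement unchanged.
    m = re.match(
        r"(?is)^SELECT\s+.*?\s+FROM\s+(\w+)(\s+WHERE\s+.*?)?\s+GROUP\s+BY\s+.*$",
        stmt,
    )
    if m:
        table, where = m.group(1), m.group(2) or ""
        return [f"SELECT {table}.{TRID} FROM {table}{where}", stmt]

    # Plain SELECT: append one trid column per table in the FROM list.
    m = re.match(r"(?is)^(SELECT\s+)(.*?)(\s+FROM\s+)(.*?)(\s+WHERE\s+.*)?$", stmt)
    if m:
        tables = [t.strip() for t in m.group(4).split(",")]
        extra = ", ".join(f"{t}.{TRID}" for t in tables)
        where = m.group(5) or ""
        return [f"{m.group(1)}{m.group(2)}, {extra}{m.group(3)}{m.group(4)}{where}"]

    return [stmt]  # anything else passes through untouched
```

A real proxy would need a proper SQL parser; the regular expressions here only cover the statement shapes listed in the poster.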
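
The dependency definition and its false positive can be reproduced with a toy in-memory model. The sketch below is an assumption-laden illustration, not RDB code: `Row`, `TrackingTable`, and `depends_on` are hypothetical names, transaction ID 0 stands for the initial load, and the starting value A2=1 in every row is assumed so that the table matches the poster after T1 runs.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    values: dict      # column name -> value
    tr_id: int = 0    # last transaction that modified this row (0 = initial load)


@dataclass
class TrackingTable:
    rows: list
    # depends_on[t] = set of transactions whose rows t has read
    depends_on: dict = field(default_factory=dict)

    def update(self, tr_id, assignments, predicate):
        """Apply an UPDATE and tag every touched row with the writer's ID."""
        for row in self.rows:
            if predicate(row.values):
                row.values.update(assignments)
                row.tr_id = tr_id

    def select(self, tr_id, column, predicate):
        """Run a SELECT and record a dependency on each row's last writer."""
        result = []
        for row in self.rows:
            if predicate(row.values):
                result.append(row.values[column])
                if row.tr_id not in (0, tr_id):
                    self.depends_on.setdefault(tr_id, set()).add(row.tr_id)
        return result


# Assumed state before T1: A2 is 1 everywhere.
table = TrackingTable(rows=[
    Row({"A1": 100, "A2": 1, "A3": 5}),
    Row({"A1": 200, "A2": 1, "A3": 6}),
    Row({"A1": 300, "A2": 1, "A3": 7}),
])

# T1: SET A2=5 WHERE A1<250
table.update(1, {"A2": 5}, lambda v: v["A1"] < 250)

# T2: SELECT A3 WHERE A3>3
t2_result = table.select(2, "A3", lambda v: v["A3"] > 3)
```

Because tracking is per row rather than per column, T2 is recorded as depending on T1 even though the values it reads in A3 were never written by T1.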
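
The transitive closure step of the repair process can be sketched as a breadth-first traversal of the dependency graph. The representation is an assumption: `depends_on` maps each transaction ID to the set of transaction IDs it read from, which is not necessarily how trans_dep stores the information, and `damage_closure` is a hypothetical name.

```python
from collections import deque


def damage_closure(initial_malicious, depends_on):
    """Return the initial set plus every transaction that depends on it,
    directly or through a chain of other transactions."""
    # Invert the map: dependents[t] = transactions that read from t.
    dependents = {}
    for tr, sources in depends_on.items():
        for src in sources:
            dependents.setdefault(src, set()).add(tr)

    affected = set(initial_malicious)
    queue = deque(initial_malicious)
    while queue:
        tr = queue.popleft()
        for nxt in dependents.get(tr, ()):
            if nxt not in affected:
                affected.add(nxt)
                queue.append(nxt)
    return affected


# Hypothetical dependency data: 2 read from 1, 3 read from 2, 5 read from 4.
example_deps = {2: {1}, 3: {2}, 5: {4}}
closure = damage_closure({1}, example_deps)
```

In this example the closure of {1} contains transactions 1, 2, and 3, while 4 and 5 are untouched. The resulting set is only a starting point, since the user can still refine it before compensating transactions are generated.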