These slides discuss the challenges of EMR patient data and its data modelling, the conventional RDBMS-based implementation, and the advantages of a NoSQL database such as MongoDB over that approach.
2. EMR Data Characteristics/Challenges
•EMRs basically fit the entity–attribute–value (EAV) model.
•They have a large number of attributes and can be modeled as a sparse
matrix.
•Many of the attributes are often null in real-life production data.
•They are a poor fit for the relational model because of wasted columns
that are often null.
•Also, the temporal nature of medical records (time-lapse scenarios)
means the same column can have multiple values over time, potentially
resulting in a large number of rows/columns and bloating the space needed.
•Schemas and sub-schemas for EMRs and for patient billing and demographics
are mostly proprietary, resulting in unification/integration challenges.
•The logical schema and the physical schema are often different, needing
a translation layer that consults metadata to address user and
system requirements concurrently.
•Enforcing integrity and other constraints in DBMS code can
overload the system, degrading performance and storage efficiency.
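The sparse-matrix and EAV points above can be illustrated with a small sketch. It is only an illustration in plain Python: the attribute names, the `to_eav` helper and the sample values are hypothetical and not taken from any real EMR schema.

```python
from datetime import date

# Hypothetical attribute catalogue; a real EMR has far more attributes.
COLUMNS = ["blood_pressure", "heart_rate", "hba1c", "ldl", "allergy", "xray_finding"]

# Wide (relational-style) row: one column per attribute, most of them null.
wide_row = {"patient_id": 1, **{c: None for c in COLUMNS}}
wide_row["heart_rate"] = 72
wide_row["allergy"] = "penicillin"


def to_eav(row, recorded_on):
    """Turn a wide row into (entity, attribute, value, date) tuples, skipping nulls."""
    return [
        (row["patient_id"], attr, value, recorded_on)
        for attr, value in row.items()
        if attr != "patient_id" and value is not None
    ]


eav_rows = to_eav(wide_row, date(2024, 1, 5))

# Temporal data: a later reading of the same attribute is one more EAV row,
# not an extra column.
eav_rows.append((1, "heart_rate", 80, date(2024, 2, 9)))

# Count the cells the wide layout spends on nulls.
null_cells = sum(1 for c in COLUMNS if wide_row[c] is None)
print(f"{null_cells} of {len(COLUMNS)} wide columns are null; EAV rows: {len(eav_rows)}")
```

The wide row carries a cell for every attribute whether or not it has a value, while the EAV form stores only the facts that exist, at the cost of a less natural query shape.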
3. Storing EMR Data within an RDBMS
•EMRs are often modeled in a proprietary setting with RDBMS-supported
data types and formats that can vary per RDBMS product.
•Attributes for entities are modeled as columns and are defined as
‘nullable’. Sparse data wastes valuable storage space.
•Indexes are built for data access patterns that are foreseen, and queries
are pre-compiled for static/canned data access.
•To scale to multi-million records, clustering of RDBMS instances is
employed, with nodes split on some key(s) so that SQL queries can be
processed in parallel on multiple nodes and the results aggregated
to support the ‘single view of the truth’. This is a complex
setup that needs considerable effort to set up and maintain.
•When attributes evolve, schemas need to evolve too; data may need
to be revalidated, updated, dropped and reconstructed, etc.
•When integrating with other DB systems, specific transformers,
adapters and drivers may need to be written, and these are often proprietary.
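The nullable-column and schema-evolution points above can be sketched with Python's built-in `sqlite3` module. The table name `patient_record` and its columns are hypothetical, and SQLite stands in for any RDBMS here; storage details differ per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every attribute is a nullable column (hypothetical schema).
conn.execute(
    """
    CREATE TABLE patient_record (
        patient_id INTEGER PRIMARY KEY,
        heart_rate INTEGER,
        hba1c      REAL,
        allergy    TEXT
    )
    """
)
conn.execute("INSERT INTO patient_record (patient_id, heart_rate) VALUES (1, 72)")
conn.execute("INSERT INTO patient_record (patient_id, allergy) VALUES (2, 'penicillin')")

# A new attribute appears: the schema itself has to be migrated first.
conn.execute("ALTER TABLE patient_record ADD COLUMN ldl REAL")
conn.execute("INSERT INTO patient_record (patient_id, ldl) VALUES (3, 110.0)")

rows = conn.execute(
    "SELECT heart_rate, hba1c, allergy, ldl FROM patient_record"
).fetchall()

# Most cells in the attribute columns are null.
null_cells = sum(value is None for row in rows for value in row)
total_cells = sum(len(row) for row in rows)
print(f"{null_cells} of {total_cells} attribute cells are null")
```

Each record fills only one of the four attribute columns, so most cells are null, and adding the `ldl` attribute required an `ALTER TABLE` before any such value could be stored.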
4. Storing EMR Data with MongoDB
• EMRs are modeled as MongoDB documents. The schema lives in the documents and is
fluid! In other words, it is schema-less!
• The need for rigid schema-level constraints is eliminated, resulting in flexibility
and evolvability of the data. This supports accommodating and integrating data from
Healthcare sub-domains like Clinical Care, Labs, Medication, Patient Medical History,
Reports and Demographics in one place.
• Healthcare systems often need configurable and flexible presentation, ingestion,
persistence and extraction of EMR data, which a typical RDBMS fails to address
efficiently. MongoDB can provide a good persistence layer above which a dynamic
abstraction layer can be built that addresses the above requirements efficiently
and elegantly. This can result in substantial savings in effort, cost and time for
Healthcare entities.
• MongoDB’s sharding and replication architecture supports easier horizontal scaling
across commodity machines, resulting in cost savings and simplicity as well as better
availability and resilience.
• Inclusion of newer ‘Map-Reduce’ technologies in the query processing and shell
interface results in a better local analytics layer for data analytics, which is
becoming increasingly useful.
• The supported BSON data format is efficient for machine parsing, and its JSON
counterpart on the programming side is good for human parsing, which can
eliminate the need for an additional DAO (data access object) pattern in the middleware.
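The document model and the map-reduce style of analytics described above can be sketched without a database at all. The following plain-Python example does not call MongoDB; the field names, the sample records and the helpers `map_heart_rates` and `reduce_mean` are hypothetical and only mimic the shape of such a workflow.

```python
import json
from collections import defaultdict

# Each EMR is one document; documents need not share the same fields.
records = [
    {"patient_id": 1, "demographics": {"age": 54},
     "vitals": [{"date": "2024-01-05", "heart_rate": 72},
                {"date": "2024-02-09", "heart_rate": 80}]},
    {"patient_id": 2, "demographics": {"age": 31},
     "allergies": ["penicillin"],
     "vitals": [{"date": "2024-01-20", "heart_rate": 64}]},
    {"patient_id": 3, "labs": [{"test": "ldl", "value": 110.0}]},
]


def map_heart_rates(doc):
    """Map step: emit (patient_id, heart_rate) for every reading present."""
    for reading in doc.get("vitals", []):
        if "heart_rate" in reading:
            yield doc["patient_id"], reading["heart_rate"]


def reduce_mean(values):
    """Reduce step: collapse all values emitted for one key into their mean."""
    return sum(values) / len(values)


# Group the emitted pairs by key, then reduce each group.
grouped = defaultdict(list)
for doc in records:
    for key, value in map_heart_rates(doc):
        grouped[key].append(value)

mean_heart_rate = {key: reduce_mean(values) for key, values in grouped.items()}

# The same structures serialize to JSON text and back without a mapping layer.
round_tripped = json.loads(json.dumps(records))
print(mean_heart_rate)
```

Patient 3 has no vitals and simply emits nothing in the map step; no null columns or schema change are needed to accommodate the differently shaped document.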
5. In the next part…
• We will look at typical architectural and design issues and
patterns for modelling EMRs using MongoDB.