Journal of the American Medical Informatics Association

A corpus-based approach for automated LOINC mapping

Journal: Journal of the American Medical Informatics Association
Manuscript ID: amiajnl-2012-001159.R1
Article Type: Research and Applications
Keywords: automated mapping, LOINC, local laboratory tests, health information exchange, supervised machine learning, information retrieval

Confidential: For Review Only

http://mc.manuscriptcentral.com/jamia

Figure 1: Growth in unique LOINCs mapped to local terms and unique words in local term descriptions as the number of local terms in the corpus has expanded over time.

Figure 2: Results of 20 iterations of repeated random sub-sampling validation showing the percentage of test terms with manually mapped LOINCs ranked first (top one) and among the top five by Maxent and Lucene.

Figure 3: Rank of correct LOINCs and their Maxent score for local laboratory terms from three test institutions.

Figure 4: Performance of Maxent and Lucene when applying the test set against a growing corpus of local terms.

Response to Reviewers’ Comments

The following are respectfully submitted in response to the reviewers’ comments. The comments were appreciated and the authors hope that all issues and suggestions are adequately addressed.

Associate Editor

Three expert reviewers have commented on your paper. We apologize for the delay in getting these reviews to you.

While there is support for the direction of your work and the methods applied, two of the three reviewers have offered comments and suggestions which add up to a need for significant revision of your paper. We ask that you address all of these comments. Reviewer 1, in particular, has asked that you make the paper easier to comprehend and seeks clarification of your "gold standard". Reviewer 2 is asking you to round out your paper into a more scientific work by grounding your methods in what other researchers have done and specific choices you made in the conceptualization of your study.

Reviewer: 1

However, I think the paper misses the opportunity to inform the readers more specifically how they can take advantage of this work. Is the repository of local terms available to others? Are the analyzing programs available to others to use, rather than reinvent? With the exception of the RELMA mapping assistant (Lab Auto Mapper), there was no discussion of sharing.

[Author Response]

We agree that it would be helpful to clarify and expand our discussion of the generalizability and practicality of our approach. We have significantly revised the Discussion and added a specific section about practical application.

What the paper told me primarily was the result of what you did with 3 institutions. The three institutions are referred to in the Abstract as “novel” institutions. I have no idea what that means or why you used it. Within the university, you refer to these as test institutions. As the results were different across the three test institutions, I would, at a minimum, want to know some of the characteristics of each institution. Were they large or small; were they academic medical centers or rural clinics? I am surprised that you did not identify the institutions – I see no invasion of privacy or conflict of interest. Then, the reader would at least have some comparison basis.

[Author Response]

We have made edits throughout the paper and now consistently refer to these 3 institutions as “test institutions” and their terms as test sets. We agree that the analysis based on the 3 institutions is the most applicable to the typical use case: mapping a new institution’s terms to LOINC. The 80/20 split analyses and characterization of model performance by simulating the corpus expansion over time were additional ways to investigate the robustness of this corpus-based approach.

We have added the details that the test institutions were community hospitals in central Indiana. The INPC participation agreement stipulates that the data may not be used for research that directly compares institutions. Our research project was approved by the IRB and the INPC management committee, but as a courtesy in the spirit of that clause we opted to mask the specific identity of the institutions. We have used this same approach in several previously published mapping studies.

I found the paper difficult to read. I really had to work hard to understand what you were doing; some of that may be me, but I think you could have made it easier. First, I would assume many of your readers would not know the details of Apache’s OpenNLP Maxent. A short description of what it does would significantly help an understanding of how it was used in your research. I would add even more detail about how you used Lucene, along with a brief description. Did you use Lucene to pull out individual words in a local term and then match and score those words against the text of the LOINC code? Did you make an initial pass to pull out terms that were a direct match with the LOINC terms? I think just a little more detail would make it much clearer what you did.

[Author Response]

We have made significant revisions to address your concerns for clarity and additional explanation. Specifically, we added descriptions of Lucene and Maxent and how the models were constructed. We also added an entire section illustrating the formation of these two models and their scores using a short example corpus. We appreciate this reviewer’s and the other reviewer’s call for examples and think that their inclusion significantly improves the readability of the paper.

Did you consider looking at the axes other than name in matching local to LOINC codes? I think if you included some examples of the kind of things in the corpus of terms, in the local terms, and showed specifically the result of some matches that worked and some that did not, it would be very useful.

[Author Response]

As noted above, we did include some examples showing the models in action. We also clarified in several places (most notably the Discussion) that our approach relies exclusively on the rich corpus of local term descriptions and does not directly reference the LOINC terminology. An advantage of this method is that it does not require domain-specific techniques or data. For example, we don’t have to detect that a particular token is the analyte being measured or the specimen. We have considered (and participated in) other research that does leverage additional semantics (such as the units of measure) and data (e.g., average result values, whether the test is performed more often on males or females, etc.). We think this is also a worthy line of research, but the focus of this analysis was on a simple, corpus-based approach.

I was really confused with the numbers and content of Table 1. How does the text above the table relate to the table? The percentages do not match.

[Author Response]

We have added a more descriptive title to this table (which is now Table 5) and the narrative that refers to it. The percentages listed in the text were the averages of the percentages across those three institutions (e.g., for Maxent’s ranking of the correct LOINC code #1 across these institutions, the average is (.786+.735+.846)/3 x 100% = 78.9%).

I also do not understand the difference in the upper half and the lower half of the Table. I also do not understand the numbers in the text below the line. I think this material needs to be better organized. You seem to be jumping around in your thoughts faster than I can track.

[Author Response]

We have added a more descriptive title to this table (which is now Table 5) and the narrative that refers to it. The first three rows pertain to the three methods’ performance in ranking the correct LOINC code first. The second three rows pertain to the three methods’ performance in ranking the correct LOINC code in the top 5.

We have added additional subheadings throughout the manuscript to help make the organization clearer.

I sort of understand the corpus is the gold standard. I assume it is because all of these terms were matched manually, at least those in question. How was the threshold set, and what is the significance on accuracy of slight movements?

[Author Response]

The reviewer is correct: these existing mappings in the corpus served as the Gold Standard for our analyses because they were already in production from the operational health information exchange (i.e., the INPC). The mapping team at Regenstrief has an extraordinary amount of experience, but we know from prior analyses that even experienced experts make mistakes in mapping (see Lin et al “Correctness of Voluntary LOINC Mapping for Laboratory Tests in Three Large Institutions”). For the purpose of this analysis, which focused on leveraging what already existed, we considered this level of known error acceptable. We are not entirely sure what the reviewer means in the question about a threshold. If it is in reference to the Gold Standard mapping, this is basically the threshold for clinical acceptability of test equivalence as determined by a mapping team with nearly 20 years of experience. We clarified this by adding an explicit reference in the Methods to a previous paper describing the general approach to mapping in the INPC. In terms of how the performance of the models is affected by different kinds or sets of terms, this was the reason for doing 2 types of cross validation: the random 80/20 subset method and the holdout of three test institutions.

I would like some discussion on your thoughts for the overall process. Is it your opinion that any institution could use your corpus and your processes (applications) to map their local terms to LOINC? What would be the expected accuracy, and what would be the expectation of what is matched? What would be done with the rest? Manual?

[Author Response]

We have significantly revised the Discussion to present a clearer rationale and set of considerations for applying the results of this study. In short, if a large mapping corpus were available, a Maxent threshold score could be used to identify a large share of terms that would need little (if any) human review and to produce a reasonably accurate ranked list of candidate LOINC terms for the remainder that would expedite the human review process.
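
The triage described in this response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `triage`, the labels `"auto-accept"` and `"human-review"`, the use of "greater than or equal to" at the cutoff, and the example scores are all hypothetical. The 0.46 cutoff and the five-item shortlist are the values mentioned in the manuscript.

```python
from typing import Dict, List, Tuple

# Cutoff score reported in the abstract; treating it as inclusive is an assumption.
MAXENT_CUTOFF = 0.46


def triage(scores: Dict[str, float],
           cutoff: float = MAXENT_CUTOFF,
           shortlist_size: int = 5) -> Tuple[str, List[str]]:
    """Route one local term based on its Maxent scores (LOINC -> probability).

    Returns a (hypothetical) routing label and the candidate LOINCs to show.
    """
    # Rank candidate LOINCs from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    if ranked and scores[ranked[0]] >= cutoff:
        # Confident top candidate: little (if any) human review expected.
        return "auto-accept", ranked[:1]
    # Otherwise hand a short ranked list to a domain expert.
    return "human-review", ranked[:shortlist_size]


# Made-up scores for two local terms.
print(triage({"1003-3": 0.90, "1006-6": 0.07, "1968-7": 0.03}))
print(triage({"1003-3": 0.40, "1006-6": 0.35, "1968-7": 0.25}))
```

Whether terms above the cutoff can skip review entirely is a policy decision for the mapping team; the sketch only separates the two queues.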

Reviewer: 2

Lack of justification for the methods. This work is almost more on the side of engineering than science. The choice of the methods is essentially justified more by convenience (availability of an open source implementation). It would be nice to show that the methods selected have been successfully used in similar contexts and perform as well as or better than other methods (to be named and contrasted against).

[Author Response]

We agree that a better justification is needed. We added a few additional details in the methods section about some of these choices, and then added a much richer discussion of these considerations in a separate subsection of the Discussion.

Lack of reflection on the use of the proposed method in the strategy for mapping local terms to LOINC (compared to existing tools). It would be nice to give some advice to LOINC users and the developers of systems integrating LOINC into a local system (i.e., vendors) as to which mapping approaches and tools are best adapted to which content, and how to best use them in combination.

[Author Response]

We agree that additional discussion of the practical application of these results was necessary. We addressed this by adding a more explicit Rationale subsection, a Maxent vs Lucene comparison subsection, and a Considerations for practical application subsection.

Along the same lines, justification should be provided for the choice of the top and top 5 terms in the evaluation. Does it correspond to a particular use case? (e.g., top mapping for completely automatic mapping)

[Author Response]

We agree that we should have explained our rationale here more clearly. Given the level of accuracy of the automated methods, our assumption is that most operational uses will still require human review (except perhaps for terms whose top-ranked LOINC scores above the Maxent threshold). A high-quality, short list of candidate terms is much easier for domain experts to review than a term-by-term search. We added an explanation of this choice in the methods section.

Lack of examples. Examples should be added throughout the manuscript to illustrate the methods.

[Author Response]

As mentioned in the comments to Reviewer 1, we have made significant revisions to address your concerns for clarity and additional explanation. Specifically, we added descriptions of Lucene and Maxent and how the models were constructed. We also added an entire section illustrating the formation of these two models and their scores using a short example corpus. We appreciate this reviewer’s and the other reviewer’s call for examples and think that their inclusion significantly improves the readability of the paper.

The process of selecting part of the corpus for training and the rest for testing repeatedly (n times) is called n-fold cross validation (n=20 here). Please explain why you are doing it.

[Author Response]

We agree that we should have explained our rationale more clearly. We actually used a repeated random subsampling method and not n-fold cross validation because we did not ensure that every term was included in the validation/test set. We have added further clarification on this in the methods and indicated that it is complementary to the holdout by institution approach we used in another analysis.
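
To make the distinction concrete, the following Python sketch shows repeated random sub-sampling with 20 iterations and 80/20 splits, as described in the manuscript. It is an illustration only: the function name, the fixed seed, and the integer stand-ins for local terms are assumptions, not the authors' code. Because each iteration reshuffles the whole corpus independently, nothing guarantees that every term lands in a test set at least once, which is what separates this scheme from n-fold cross validation.

```python
import random


def repeated_random_subsampling(terms, iterations=20, train_fraction=0.8, seed=0):
    """Yield (training, test) lists; each iteration is an independent shuffle."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    cut = round(len(terms) * train_fraction)
    for _ in range(iterations):
        shuffled = list(terms)
        rng.shuffle(shuffled)
        # First 80% for training, remaining 20% for testing.
        yield shuffled[:cut], shuffled[cut:]


# Hypothetical corpus of 100 local terms, represented by integers.
corpus = list(range(100))
splits = list(repeated_random_subsampling(corpus))

# Terms that never appeared in any test set across the 20 iterations.
never_tested = set(corpus) - {t for _, test in splits for t in test}
print(len(splits), "iterations;", len(never_tested), "terms never tested")
```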

Unclear if any normalization is applied to words. If not, is this a limitation?

[Author Response]

The normalization for the Maxent and Lucene approaches was the same, and was performed by the Apache Lucene v3.0.3 Standard Analyzer. We have clarified this in the Methods section by creating a subsection and have also added a description of the Standard Analyzer and an example of what it does. The process for the RELMA Auto Mapper approach did not use the Lucene Standard Analyzer, but rather followed the recommended procedures as described in the RELMA training materials.

The failure analysis should be part of the discussion, not results.

[Author Response]

We agree and have included it there.

Please organize the discussion in subsections.

[Author Response]

We agree and have done so.

This work could be contrasted against work about mapping local terms to standard terminologies beyond LOINC. See for example: Peters L, Kapusnik-Uner JE, Nguyen T, Bodenreider O. An approximate matching method for clinical drug names. AMIA Annu Symp Proc. 2011;2011:1117-26. Epub 2011 Oct 22. PubMed PMID: 22195172; PubMed Central PMCID: PMC3243188.

[Author Response]

We agree that this is a helpful paper to contrast against and have done so in the Discussion.

Reviewer: 3

For one thing, we all need to understand the real cost of using a method such as you propose. That is, what does it cost to find and deal with the false positives and false negatives? As you observe, the costs of manual and semi-manual methods are high and well documented. Methodologically, we also need to know if the mappings missed by the best method are also missed by the other two methods (likely) or, less likely, whether the other two methods correctly map terms incorrectly mapped by the best method.

[Author Response]

We agree that this is an important consideration and have added further detail about the performance of Lucene and Maxent at ranking the correct LOINC code missed by the other model. The discussion also addresses how the RELMA Lab Auto Mapper can find matches for some of the terms that are missed by the corpus-based models by querying the LOINC terminology directly. We have also sketched out a suggested mapping process using a corpus-based approach in the new subsection on considerations for practical application.

A corpus-based approach for automated LOINC mapping

Mustafa Fidahussein MD, MS, Daniel J. Vreeman PT, DPT, MSc

Regenstrief Institute, Inc, and Indiana University School of Medicine, Indianapolis, IN

Abstract

Objective: To determine whether the knowledge contained in a rich corpus of local terms mapped to LOINC could be leveraged to help map local terms from other institutions.

Methods: We developed two models to test our hypothesis. The first, based on supervised machine learning, was created using Apache’s OpenNLP Maxent, and the second, based on information retrieval, was created using Apache’s Lucene. The models were validated by a random sub-sampling method that was repeated 20 times and that used 80/20 splits for training and testing respectively. We also evaluated the performance of these models on all laboratory terms from three test institutions.

Results: Across the 20 iterations used for validation of our 80/20 splits, Maxent ranked the correct LOINC first for between 70.5% and 71.4% of test terms, and Lucene for between 63.7% and 65.0%. For all laboratory terms from the three test institutions, Maxent ranked the correct LOINC first between 73.5% and 84.6% (mean 78.9%), whereas Lucene’s performance was between 66.5% and 76.6% (mean 71.9%). Using a cutoff score of 0.46, Maxent always ranked the correct LOINC first for over 57% of local terms.

Conclusion: This study showed that a rich corpus of local terms mapped to LOINC contains collective knowledge that can help map terms from other institutions. Using freely available software tools, we developed a data-driven automated approach that operates on term descriptions from existing mappings in the corpus. Accurate and efficient automated mapping methods can help accelerate adoption of vocabulary standards and promote widespread health information exchange.

Keywords: automated mapping, LOINC, local laboratory tests, health information exchange, supervised machine learning, information retrieval, Maxent, Lucene.

Background and Significance

Health information technology has the potential to improve the quality and efficiency of care.[1] However, the clinical data needed to make care decisions are often unavailable to providers at the right time and place.[2] While our patients seek care across many settings and institutions[3], the purview of our clinical information systems is usually curbed at organizational boundaries. Even within a single institution, the laboratory, radiology, pharmacy and clinical note writing systems may function like data “islands”. Efficiently moving and aggregating patient data creates an important foundation for many tools and processes with the capability of improving healthcare delivery. The Health Information Technology for Economic & Clinical Health (HITECH) Act considerably increases the prospect of widespread electronic health record systems (EHRs) with health information exchange capabilities.[4] HITECH requires that providers and hospitals demonstrate EHR information exchange to be eligible for the Medicare and Medicaid incentive payments.

A central barrier to efficient health information exchange is the unique local names and codes for the same clinical test or measurement performed at different institutions. When integrating many data sources, the only practical way to overcome this barrier is by mapping local terms to a vocabulary standard. Logical Observation Identifiers Names and Codes (LOINC®) is a universal code system for identifying laboratory and clinical observations.[5] When LOINC is used together with messaging standards such as HL7, independent systems can create interfaces with semantic interoperability for electronically reporting test results. LOINC has been adopted both in the United States and internationally by many organizations, including large reference laboratories, healthcare organizations, insurance companies, regional health information networks and national standards.[6-8] Within the USA, one recent and notable adoption of LOINC is as the standard for laboratory orders and results in the Standards and Certification Criteria of the Centers for Medicare and Medicaid Services EHR “Meaningful Use” incentive program.[9]

Before care organizations can realize the benefit of using vocabulary standards like LOINC, they must first map their local test codes to terms in the standard. Unfortunately, this process is complex. It requires considerable domain expertise and is very resource-intensive.[8,10-12] Reducing the effort required to accurately map local terms to LOINC would accelerate interoperable health information exchange and would be especially helpful to resource-challenged institutions.

The Regenstrief LOINC mapping assistant (RELMA), a desktop program freely distributed with LOINC (http://loinc.org), is widely used by domain experts to map their local terms to LOINCs on a term-by-term basis.[12-15] It also contains a feature called the RELMA Auto Mapper that batch processes a set of local terms and identifies a ranked list of candidate LOINCs for each local test in the collection. While RELMA’s automated mapping feature has accurately mapped radiology report terms[16,17], laboratory terms present special challenges because of their characteristically short and ambiguous test names.[8,10,18]

Previous studies have described several methods and tools for mapping laboratory terms to LOINC. Lau et al used parsing and logic rules in conjunction with synonyms, attribute relationships and mapping frequency data to map local laboratory test names to LOINC.[19] This paper was a descriptive analysis that did not include an evaluation of its accuracy. Zollo et al used extensional definitions of laboratory concepts generated from actual test result data to map between two laboratories using a common dictionary that was also linked to LOINC.[20] The automated matching software that leveraged these extensional definitions correctly identified 75% of the possible matches. In addition to establishing new mappings, extensional definitions have also been used for auditing and characterizing the degree of interoperability of existing local laboratory terms to LOINC mappings.[11,21] Sun and Sun evaluated the performance of an automated lexical mapping program on terms from three institutions to LOINC.[22] The overall best lexical mapping algorithm identified the correct LOINC for between 63% and 75% of local terms. Kim et al described an approach for augmenting local test names that modestly improved mapping results using RELMA for term-by-term mapping.[18] Lastly, Khan et al developed an automated tool that used a master file of mapped local terms from several sites within the Indian Health Service.[15] The local terms at these sites shared a common heritage, but had diverged over time in their naming conventions. Compared with a gold standard mapping established by a term-by-term search with RELMA, the automated method correctly mapped 81% of the test terms.

Over the last 18 years, Regenstrief has mapped local terms from many institutions to a common dictionary as part of the process of creating and expanding the Indiana Network for Patient Care (INPC), a comprehensive regional health information exchange.[23] Thus, the INPC dictionary now represents a rich corpus of local terms mapped to LOINC. Like Lau et al and Khan et al, we hypothesized that the knowledge contained in this corpus of mappings could be leveraged to help map local terms from other institutions.

To test this corpus-based approach, we developed two models based on supervised machine learning and information retrieval using open source tools. Our data-driven approach relies exclusively on a rich corpus of local term descriptions and does not directly reference the LOINC terminology. In this study we present the process of creating and validating these models and testing their performance on a set of local laboratory terms from three institutions. We also compare the performance of these models to the recently improved Lab Auto Mapper feature within RELMA.

Methods

Establishing the gold standard and normalizing the corpus

We compiled a corpus of all local terms from 104 different institutional code-sets that were mapped to LOINC through the INPC common dictionary between 1997 and 2012. Each local term from these sets had been mapped by domain experts at Regenstrief through manual review, assisted by the use of RELMA and other locally developed tools. For all analyses, these existing LOINC mappings from the operational health information exchange served as our gold standard. A description of how Regenstrief performs and maintains the mappings in the INPC has been published previously.[12] We did not perform additional auditing of the mappings as part of this analysis.

For each local term in the corpus, the set of words constituting its description (e.g., the laboratory test name) was normalized using Apache Lucene’s v3.0.3 Standard Analyzer.[24,25] The Lucene Standard Analyzer uses lexical rules to recognize alphanumeric characters, convert strings to lowercase, and remove stop words. For example, the local term descriptions “CSF CELL COUNT/DIFF” and “GLU (TOL) UR-5 HR” would be normalized to “cell count csf diff” and “glu hr tol ur-5” respectively.
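
The effect of this normalization can be approximated with a short Python sketch. It is not the Lucene Standard Analyzer: the regular expression, the rule that keeps a hyphenated token whole only when it contains a digit, the small stop-word subset, and the alphabetical ordering of the output are all assumptions chosen so that the two examples above come out as stated.

```python
import re

# Illustrative subset of English stop words; an analyzer's real list is longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "or", "the", "to", "with"}

# A run of letters/digits, optionally joined by hyphens (e.g. "ur-5").
TOKEN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def normalize(description: str) -> str:
    """Lowercase, tokenize, drop stop words, and sort the remaining words."""
    tokens = []
    for token in TOKEN.findall(description.lower()):
        # Assumption: keep a hyphenated token whole only if it contains a digit;
        # otherwise split it into its parts.
        parts = [token] if any(c.isdigit() for c in token) else token.split("-")
        tokens.extend(p for p in parts if p not in STOP_WORDS)
    # Assumption: the published examples appear in alphabetical order.
    return " ".join(sorted(tokens))


print(normalize("CSF CELL COUNT/DIFF"))
print(normalize("GLU (TOL) UR-5 HR"))
```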

Creating a model based on supervised machine learning – Maxent

We used Apache’s OpenNLP Maxent v3.0.1[26] to create a maximum entropy based statistical algorithm for supervised machine learning. The principle of maximum entropy provides a probability distribution that is as uniform as possible by assuming nothing about what is unknown.[27] The probability distribution derived from human-specified constraints in training data is then used to predict the probability of a random set of constraints in test data.

To create a Maxent model, each local term in the training set was treated as a separate event, with the words of its normalized description used as predicates and the mapped LOINC used as the outcome. When local terms from the test set were applied against this model, Maxent calculated a probability score between zero and one for each LOINC (outcome) contained in the corpus. The LOINC with the highest score (Top 1) and those with the five highest scores (Top 5) were noted for each local term.
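
The event, predicate and outcome framing can be illustrated with a small maximum entropy (multinomial logistic) classifier in Python. This is a sketch under stated assumptions, not OpenNLP Maxent: it trains with plain stochastic gradient ascent, uses one weight per (word, LOINC) pair with no smoothing, and the function names, learning rate and iteration count are hypothetical. The five training events are the toy events used in the manuscript's worked example; the scores it produces are not expected to match those reported by the authors.

```python
import math
from collections import defaultdict

# Toy training events: (normalized description, mapped LOINC).
EVENTS = [
    ("agt indirect", "1003-3"),
    ("coombs indirect", "1003-3"),
    ("ab coombs direct igg", "1006-6"),
    ("coombs direct test", "1006-6"),
    ("bilirubin direct", "1968-7"),
]
OUTCOMES = sorted({outcome for _, outcome in EVENTS})


def predict(weights, words):
    """Return a probability for every outcome given the predicate words."""
    scores = {o: sum(weights[(w, o)] for w in words) for o in OUTCOMES}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {o: math.exp(s - top) for o, s in scores.items()}
    total = sum(exps.values())
    return {o: e / total for o, e in exps.items()}


def train(events, iterations=200, rate=0.5):
    """Fit (word, outcome) weights by stochastic gradient ascent."""
    weights = defaultdict(float)
    for _ in range(iterations):
        for text, outcome in events:
            words = text.split()
            probs = predict(weights, words)
            for o in OUTCOMES:
                # Gradient of the log-likelihood: observed minus expected.
                gradient = (1.0 if o == outcome else 0.0) - probs[o]
                for w in words:
                    weights[(w, o)] += rate * gradient
    return weights


model = train(EVENTS)
probabilities = predict(model, "direct test".split())
# Rank outcomes from most to least probable (Top 1 is the first entry).
print(sorted(probabilities, key=probabilities.get, reverse=True))
```

Words never seen in training contribute a weight of zero, so a test description made up entirely of unseen words receives a uniform distribution over the outcomes in this sketch.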

Creating a model based on information retrieval – Lucene

We used Apache’s Lucene v3.0.3[24] to create an information retrieval based model. Lucene is a popular information retrieval library that creates documents with indexed fields for fast searching. Its scoring formula measures the similarity between indexed fields and search terms for each document.[25] To create a Lucene model, we created a separate document for every unique LOINC in the training set. Each document then contained the normalized descriptions from all local terms mapped to that LOINC as its indexed field. When local terms from the test set were queried against this model, Lucene calculated a score for each LOINC (document) contained in the corpus. This score was based on the number of times queried words co-occurred with that document and the total number of documents associated with those words. The Lucene score started at zero and had no upper bound.
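
A simplified version of this kind of scoring can be written in Python. The sketch below is loosely modeled on classic TF-IDF ranking (square-root term frequency, logarithmic inverse document frequency, a length normalization, and a coordination factor for the share of query words matched). The exact factors are assumptions for illustration, query normalization is omitted, and none of this calls the Lucene API. The three documents are the concatenated descriptions from the manuscript's toy corpus.

```python
import math
from collections import Counter

# One document per LOINC: all normalized descriptions mapped to it, concatenated.
DOCUMENTS = {
    "1003-3": "agt indirect coombs indirect",
    "1006-6": "ab coombs direct igg coombs direct test",
    "1968-7": "bilirubin direct",
}


def score(query, documents=DOCUMENTS):
    """Return a simplified TF-IDF score for every document (LOINC)."""
    doc_terms = {doc_id: text.split() for doc_id, text in documents.items()}
    num_docs = len(doc_terms)
    query_terms = query.split()
    scores = {}
    for doc_id, terms in doc_terms.items():
        counts = Counter(terms)
        length_norm = 1.0 / math.sqrt(len(terms))  # shorter fields weigh more
        total, matched = 0.0, 0
        for term in query_terms:
            if counts[term] == 0:
                continue
            matched += 1
            # Number of documents that contain this word.
            doc_freq = sum(1 for t in doc_terms.values() if term in t)
            idf = 1.0 + math.log(num_docs / (doc_freq + 1))
            total += math.sqrt(counts[term]) * idf ** 2 * length_norm
        # Reward documents that match a larger share of the query words.
        coord = matched / len(query_terms) if query_terms else 0.0
        scores[doc_id] = coord * total
    return scores


ranking = score("direct test")
print(sorted(ranking, key=ranking.get, reverse=True))
```

As in the description above, the score grows with how often the queried words occur in a document and shrinks for words that occur in many documents; a document sharing no words with the query scores zero here.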

An example of the models created by Maxent and Lucene

To illustrate use of the Maxent and Lucene models, consider a corpus that contains only five terms from two different institutions with manually mapped LOINCs as shown in Table 1. The data from this corpus are used to create a Maxent model with five events and three outcomes as shown in Table 2. They are also used to create a Lucene model with three documents and corresponding indexed fields as shown in Table 3. Note that the Lucene model concatenates all the term descriptions from different institutions mapped to the same LOINC. Now suppose that a test institution contains five unmapped terms whose term descriptions consist only of the words “indirect”, “direct”, “coombs”, “bilirubin” and “direct test” respectively. When these term descriptions are applied against Maxent and Lucene, each model returns a set of three scores that represents the likelihood of that test term being mapped to each of the three LOINCs contained in the corpus.

Table 1: Hypothetical corpus containing five local terms from two different institutions.

Institution   Local Code   Term Description       Mapped LOINC
1             12802        Indirect AGT           1003-3
1             18231        Direct Coombs IgG Ab   1006-6
2             DCTG         Direct Coombs Test     1006-6
2             IAT          Indirect Coombs        1003-3
2             BILID        Bilirubin, Direct      1968-7

Table 2: Representation of the Maxent model based on the corpus shown in Table 1.

Event #   Predicates (normalized term descriptions)   Outcome
1         agt indirect                                1003-3
2         coombs indirect                             1003-3
3         ab coombs direct igg                        1006-6
4         coombs direct test                          1006-6
5         bilirubin direct                            1968-7

Table 3: Representation of the Lucene model based on the corpus shown in Table 1.

Document ID   Indexed Field
1003-3        agt indirect coombs indirect
1006-6        ab coombs direct igg coombs direct test
1968-7        bilirubin direct

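The step from Table 1 to the model representations in Tables 2 and 3 can be sketched in a few lines. The normalization rule used here (lowercase, strip punctuation, sort the words alphabetically) is an assumption inferred from the predicates shown in Table 2, not a restatement of the normalization procedure used in the study, and the function names are illustrative.

```python
import re
from collections import defaultdict

# Hypothetical corpus of Table 1: (institution, local code, description, LOINC).
CORPUS = [
    (1, "12802", "Indirect AGT", "1003-3"),
    (1, "18231", "Direct Coombs IgG Ab", "1006-6"),
    (2, "DCTG", "Direct Coombs Test", "1006-6"),
    (2, "IAT", "Indirect Coombs", "1003-3"),
    (2, "BILID", "Bilirubin, Direct", "1968-7"),
]


def normalize(description):
    """Assumed rule: lowercase, drop punctuation, sort words alphabetically."""
    return sorted(re.findall(r"[a-z0-9]+", description.lower()))


def build_events(corpus):
    """Maxent view: one (predicates, outcome) event per local term."""
    return [(normalize(desc), loinc) for _, _, desc, loinc in corpus]


def build_documents(corpus):
    """Lucene view: one document per LOINC, concatenating its terms' words."""
    documents = defaultdict(list)
    for _, _, desc, loinc in corpus:
        documents[loinc].extend(normalize(desc))
    return dict(documents)


events = build_events(CORPUS)
documents = build_documents(CORPUS)
for loinc in sorted(documents):
    print(loinc, " ".join(documents[loinc]))
```

The two structures hold the same words. The difference is only in grouping: Maxent keeps each local term as its own event, whereas the Lucene document merges every description mapped to a given LOINC.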
Table 4: Maxent and Lucene scores for each LOINC from the corpus in Table 1 when local term descriptions are queried against both models.

                     Maxent Model Scores           Lucene Model Scores
Term Description     1003-3   1006-6   1968-7      1003-3   1006-6   1968-7
“indirect”           0.9134   0.0432   0.0432      1.9876   0.0000   0.0000
“direct”             0.1352   0.4558   0.4090      0.0000   1.4142   1.0000
“coombs”             0.5019   0.3704   0.1277      1.0000   1.4142   0.0000
“bilirubin”          0.0241   0.0241   0.9517      0.0000   0.0000   1.4054
“direct test”        0.0136   0.9451   0.0412      0.0000   1.9651   0.2899

Evaluation approach

To characterize how well these models performed in mapping local terms to LOINC, we conducted three sets of analyses that are described in detail in the following sections. In each case, the top five scoring LOINCs were compared with the LOINC assigned by manual mapping (our gold standard). We chose to limit the list of LOINC codes returned by the analyses to the top five based on our practical experience with mapping and on preliminary analyses showing that the correct LOINC code rarely appeared in the next few rankings. Domain experts can quickly review a short list of ranked candidate LOINC terms to determine which, if any, is the correct match. A longer list is more cumbersome to review, and our experience has been that mappers prefer an interactive search interface like RELMA to reviewing a long list of candidate codes.

Validating the models using 80/20 splits

We validated the predictive performance of both models using repeated random sub-sampling (80% for training and 20% for testing) over 20 iterations. In each iteration, 80% of the local terms from our normalized corpus were randomly selected as the training set to create Maxent and Lucene models as described above. Normalized local term descriptions from the remaining 20%, which served as the test set, were then queried against both models. We chose this approach to cross-validation to help guard against overfitting the models. Splitting the corpus at the term level (rather than at the level of a whole set of terms from an institution) demonstrates the predictive performance of the models for a heterogeneous set of terms with varying naming conventions. The top five scoring LOINCs resulting from each model were compared with the LOINC assigned by manual mapping (our gold standard).

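The repeated random sub-sampling procedure can be sketched as below. The sketch assumes a generic `build_ranker` callable that takes a training set and returns a function producing a ranked list of candidate LOINCs. The word-overlap ranker is a hypothetical stand-in so the example runs on its own; it is neither the Maxent nor the Lucene model. The ten toy terms extend the hypothetical corpus of Table 1 with made-up variants.

```python
import random
from collections import defaultdict


def split_80_20(terms, rng):
    """Shuffle a copy of the terms and cut it into 80% training, 20% test."""
    shuffled = list(terms)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def overlap_ranker(training_terms):
    """Hypothetical stand-in model: rank LOINCs by number of shared words."""
    vocabulary = defaultdict(set)
    for words, loinc in training_terms:
        vocabulary[loinc].update(words)

    def rank(words):
        query = set(words)
        return sorted(vocabulary, key=lambda l: (-len(vocabulary[l] & query), l))

    return rank


def validate(terms, build_ranker, iterations=20, seed=0):
    """Return one (top-1 rate, top-5 rate) pair per iteration."""
    rng = random.Random(seed)
    rates = []
    for _ in range(iterations):
        training, test = split_80_20(terms, rng)
        rank = build_ranker(training)
        top1 = top5 = 0
        for words, gold in test:
            candidates = rank(words)[:5]
            top1 += candidates[:1] == [gold]   # gold standard ranked first
            top5 += gold in candidates         # gold standard among top five
        rates.append((top1 / len(test), top5 / len(test)))
    return rates


# Made-up terms: (normalized words, manually mapped LOINC).
TOY_TERMS = [
    (["agt", "indirect"], "1003-3"),
    (["coombs", "indirect"], "1003-3"),
    (["coombs", "indirect", "test"], "1003-3"),
    (["antiglobulin", "indirect"], "1003-3"),
    (["ab", "coombs", "direct", "igg"], "1006-6"),
    (["coombs", "direct", "test"], "1006-6"),
    (["coombs", "direct"], "1006-6"),
    (["bilirubin", "direct"], "1968-7"),
    (["bilirubin", "direct", "serum"], "1968-7"),
    (["bili", "direct"], "1968-7"),
]

for top1_rate, top5_rate in validate(TOY_TERMS, overlap_ranker)[:3]:
    print(f"top 1: {top1_rate:.0%}  top 5: {top5_rate:.0%}")
```

Reporting the minimum, maximum, and mean of the per-iteration rates gives the kind of range summary used for this analysis.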
Evaluating the models’ performance using test terms from three institutions and comparison to Lab Auto Mapper

We determined the performance of our models in mapping the entire set of local laboratory terms from each of three test institutions (community hospitals located in central Indiana). In this case, training sets comprising all terms in our corpus minus those belonging to the three test institutions were used to create Maxent and Lucene models as described above. Normalized local term descriptions from the corresponding test sets, which contained all laboratory terms from the three institutions, were then applied against both models. This approach simulates the typical mapping scenario of integrating all the terms from a new institution’s laboratory system. Each institution’s code set covers the tests performed by a typical community hospital laboratory and reflects the idiosyncratic naming conventions established by that institution. The top five scoring LOINCs resulting from each model were compared with the LOINC assigned by manual mapping (our gold standard).

We also compared the performance of the models with RELMA’s Lab Auto Mapper for the test set of terms from these three institutions. For this analysis we used the most recent publicly available version, RELMA v5.6.[28] The Lab Auto Mapper uses a series of algorithms optimized for laboratory terms to generate a list of candidate LOINCs. In addition to using words contained in a local term’s description, it can also leverage information from battery terms, units of measure, common tests and the synonymy contained in LOINC. Its score is based on the number and proportion of words that match between the local term and the fully specified LOINC name.[29] We followed the recommended procedures for loading local terms into RELMA and running the Lab Auto Mapper as described in the RELMA Users’ Manual and the LOINC and RELMA tutorial produced by the Regenstrief Institute.[29, 30]

Lastly, we investigated whether a threshold Maxent score could serve as a useful cutoff above which the correct LOINC was always ranked first. We first plotted the rank of the correct LOINC among the top five against its Maxent score for each term in the test set, and then determined the Maxent score above which the correct LOINC was always ranked first.

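The cutoff determination described above reduces to a simple rule, sketched here under the assumption that each test term is summarized as a pair (rank of the correct LOINC, Maxent score of the correct LOINC). The cutoff is then the highest score observed for any correct LOINC that was not ranked first. The function names and the six data points are made up for illustration and are not results from the study.

```python
def always_first_cutoff(points):
    """Highest score of a correct LOINC that was NOT ranked first.

    Above this value, every correct LOINC in `points` has rank 1.
    """
    not_first = [score for rank, score in points if rank > 1]
    return max(not_first, default=0.0)


def share_above(points, cutoff):
    """Fraction of terms whose correct LOINC scored above the cutoff."""
    return sum(1 for _, score in points if score > cutoff) / len(points)


# Made-up (rank, score) pairs for six test terms.
POINTS = [(1, 0.95), (1, 0.60), (2, 0.46), (1, 0.30), (3, 0.12), (1, 0.88)]

cutoff = always_first_cutoff(POINTS)
print(cutoff, share_above(POINTS, cutoff))
```

Terms scoring above the cutoff are candidates for mapping with little review, while terms at or below it still need a reviewer to inspect the ranked list.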
Evaluating the models’ performance on the corpus as it has grown over time

To determine our models’ performance against a growing corpus, we again used the test set of all local laboratory terms from the three institutions as above. This time, however, 12 training sets were used, each containing local terms from the corpus (minus those in the test set) in chronological order and in increments of 6,400. Thus, the first training set contained the first 6,400 local terms created in the corpus; the second training set contained the first 12,800 local terms; and the twelfth and last training set contained all the local terms. Normalized local term descriptions from the test set were applied against Maxent and Lucene models created from each of the 12 training sets. The top five scoring LOINCs resulting from each model were compared with the LOINC assigned by manual mapping (our gold standard).

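The construction of the cumulative training sets can be sketched as follows, assuming the training terms are already sorted by creation date. The function name and the toy sizes (a step of 3 and 4 sets over 11 items) are illustrative. In the study the step was 6,400 terms and there were 12 sets.

```python
def growing_training_sets(terms_in_creation_order, step, n_sets):
    """Return cumulative prefixes of the corpus; the last set holds every term."""
    sets = []
    for i in range(1, n_sets + 1):
        if i == n_sets:
            sets.append(list(terms_in_creation_order))
        else:
            sets.append(list(terms_in_creation_order[: i * step]))
    return sets


# Toy illustration: 11 terms, prefixes of 3, 6 and 9 terms, then all 11.
toy_terms = [f"term-{n:02d}" for n in range(11)]
for training_set in growing_training_sets(toy_terms, step=3, n_sets=4):
    print(len(training_set), training_set[-1])
```

Each set is a superset of the one before it, so differences in performance between consecutive sets reflect only the terms added in that increment.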
Results

Our corpus from 104 institutional code sets contained 81,691 local terms, each associated with a description and mapped to a LOINC. These local terms were mapped to 7,565 unique LOINCs and contained 11,620 unique words in their descriptions (test names). This corpus was built from 1997 to 2012 as a byproduct of the INPC expansion. New local terms were added to the INPC master dictionary and mapped to LOINC both because new institutions began to participate in the health information exchange and because participating institutions created new local terms. Figure 1 shows the growth in the number of unique LOINCs and the number of unique words associated with all local terms as the corpus has expanded with new local terms over time.

Results of validating the models using 80/20 splits

In each of the 20 iterations of random sub-sampling from our corpus into 80% for training and 20% for testing, there were 65,361 local terms in the training set and 16,330 local terms in the test set. The number of unique LOINCs to which these local terms were mapped varied between 7,115 and 7,190 for the training set and between 4,391 and 4,493 for the test set.

Maxent ranked the correct (manually mapped) LOINC first for 11,513 to 11,661 (70.5%-71.4%, mean 71.0%) of local terms in the test sets and ranked the correct LOINC among the top five for 13,871 to 14,073 (84.9%-86.2%, mean 85.5%). Lucene ranked the correct LOINC first for 10,407 to 10,610 (63.7%-65.0%, mean 64.3%) of local terms in the test sets and ranked the correct LOINC among the top five for 13,649 to 13,841 (83.6%-84.8%, mean 84.2%). These results for each of the 20 iterations are shown in Figure 2.

Results of Maxent, Lucene and Lab Auto Mapper using laboratory terms from three test institutions

The three institutions chosen as test sets contained 1,099, 1,705 and 838 local laboratory terms that were mapped to 573, 757 and 328 unique LOINCs, respectively. The results of applying these test sets against the Maxent model, the Lucene model, and the Lab Auto Mapper are shown in Table 5. Averaging the performance across these three institutions, the correct LOINC was ranked first for 78.9%, 71.9%, and 50.3% and ranked among the top five for 91.4%, 90.0%, and 68.6% of local terms when applied against Maxent, Lucene and Lab Auto Mapper, respectively.

Table 5: Percentage of local laboratory terms from each test institution that, when applied against Maxent, Lucene and Lab Auto Mapper, had the correct LOINC ranked highest (Top 1) and among the highest five (Top 5).

                         Institution 1    Institution 2    Institution 3
                         n=1,099          n=1,705          n=838
Maxent Top 1             78.6% (864)      73.5% (1,253)    84.6% (709)
Lucene Top 1             72.6% (798)      66.5% (1,133)    76.6% (642)
Lab Auto Mapper Top 1    49.6% (545)      46.8% (798)      54.5% (457)
Maxent Top 5             90.5% (995)      88.8% (1,514)    94.7% (794)
Lucene Top 5             89.8% (987)      86.0% (1,466)    94.3% (790)
Lab Auto Mapper Top 5    71.8% (789)      66.9% (1,140)    67.1% (562)

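The averages quoted in the text can be recomputed from the counts in Table 5. The short check below assumes that "averaging across institutions" means the unweighted mean of the three per-institution percentages, which is the reading that reproduces the reported figures. The variable and function names are illustrative.

```python
# Counts of correctly ranked terms per institution, taken from Table 5.
SIZES = (1099, 1705, 838)
COUNTS = {
    ("Maxent", "Top 1"): (864, 1253, 709),
    ("Lucene", "Top 1"): (798, 1133, 642),
    ("Lab Auto Mapper", "Top 1"): (545, 798, 457),
    ("Maxent", "Top 5"): (995, 1514, 794),
    ("Lucene", "Top 5"): (987, 1466, 790),
    ("Lab Auto Mapper", "Top 5"): (789, 1140, 562),
}


def mean_percentage(counts, sizes):
    """Unweighted mean of the per-institution percentages."""
    return 100.0 * sum(c / n for c, n in zip(counts, sizes)) / len(sizes)


for (model, level), counts in COUNTS.items():
    print(f"{model:16s} {level}: {mean_percentage(counts, SIZES):.1f}%")
```

A pooled percentage (total correct over all 3,642 terms) would weight the largest institution more heavily and give slightly different values.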
For the 3,642 local terms in the three test sets, ranks of the correct LOINCs among the top five were plotted against their Maxent scores. As illustrated in Figure 3, this plot shows that when the score was above 0.46 the correct LOINC was always ranked first by the model. Using this cutoff score to identify high-certainty top ranks, Maxent ranked the correct LOINC first with a score above the cutoff for 2,099 (57.6%) of the local terms.

Results of the models’ performance on the corpus as it has grown over time

Figure 4 illustrates the performance of both models on the test set containing all local terms from three institutions using a series of training sets that represent growth in the corpus over time. The training sets in this analysis organized the local terms in the corpus in chronological order by increments of 6,400 terms. The results show a gradual leveling off in Maxent’s performance and a slight decrease in Lucene’s performance as the number of terms in the corpus reached its maximum.

Discussion

Our study shows that a rich corpus of local terms mapped to LOINC can help map terms from other institutions. Overall, the supervised machine learning based Maxent model ranked the correct LOINC first for 79% and the information retrieval based Lucene model for 72% of local laboratory terms from our three test institutions. These results are similar in accuracy to the best reported automated techniques from prior studies of laboratory test mapping. Our approach has the advantages of using freely available tools and only requiring local term descriptions as the data substrate.

Rationale for using Maxent and Lucene models

Given a rich corpus of existing mappings established by domain experts, we wanted to explore the validity and performance of a purely data-driven approach to automated LOINC mapping. We used Apache’s Maxent to create a supervised machine learning model and Apache’s Lucene to create an information retrieval model, as these tools are freely available, offer good performance on typical personal computer hardware, and are relatively easy to deploy.

The usual application of Maxent models involves a binary outcome or a small set of outcomes, as in natural language processing tasks such as sentence detection and part-of-speech tagging. In this study, we created a Maxent model with thousands of outcomes, represented by all unique LOINCs contained in the training corpus. We are not aware of prior studies that used Maxent in this manner or in the context of automated mapping.

Lucene is used widely in a variety of applications for document indexing and search engine functions.[31] Since version 5.0 (released December 2010), the search functionality in RELMA, including the Lab Auto Mapper, has been implemented with Lucene. Prior studies have demonstrated that RELMA is a very capable tool for mapping local terms to LOINC.[15-18,32,33] Our application of Lucene differs from RELMA in that we did not directly query the LOINC terminology at all. Whereas RELMA queries against the stylized LOINC names and synonyms included in LOINC, both the Lucene and Maxent models in our approach only queried against words from local term descriptions mapped to LOINC codes. We had hypothesized that the idiosyncratic variation present in a large corpus of local term descriptions might help overcome the challenge of relying on the synonymy in LOINC. Although the synonymy in LOINC is quite good for common abbreviations, the standards development process cannot possibly keep up with all the permutations of abbreviations seen in local test names. For example, just a few of the variants for “Neisseria gonorrhoeae” present in our corpus include: “N.GONORRHOEA”, “N. GONORRHEAE”, “N.GONO”, “Gono”, “N. GONORR.”, “NEISS GONORR”, and “NEISSERIA GONORR”.

Our approach with the Maxent and Lucene models is relatively simple compared with the processing algorithm of the RELMA Lab Auto Mapper or the drug-centric token matching approach employed by Peters et al in mapping drug name variants to RxNorm.[34] The models in our approach were naïve to the semantics of tokens in the test descriptions. The Lab Auto Mapper has functions that try to identify the specimen (e.g. CSF or Serum) and uses the units of measure associated with the test to limit candidate LOINCs to those with a Property attribute consistent with those units. For example, based on an internal mapping table, the Lab Auto Mapper would only return LOINC codes with a Property of Mass Concentration if the local test had associated units of ug/dL. Similarly, the drug-centric token matching approach used by Peters et al [34] attempts to identify the drug name in a local string and applies special processing to it that is not applied to tokens representing other components of the name, such as strength or dose form. An advantage of our data-driven approach is that it did not require any domain-specific tailoring.

Comparing the performance of Maxent versus Lucene

Maxent performed better than Lucene in ranking the correct LOINC first, which we attribute to Maxent’s tendency to overfit the model: Maxent computes high scores for local terms whose words match very closely with those in the training sets. However, both models ranked the correct LOINC among the top five for 90% or more of local terms from the three test institutions when averaged across institutions (91.4% for Maxent and 90.0% for Lucene).

Over the 20 iterations of random sub-sampling using 80/20 splits, Maxent on average identified the correct LOINC for 2.9% (473) of local test terms for which Lucene failed to rank the correct LOINC among the top five. Conversely, Lucene on average identified the correct LOINC for 1.5% (251) of local test terms for which Maxent failed to rank it among the top five. For our analyses on test sets from three institutions, Maxent identified the correct LOINC for 2.8% (101) of local terms for which Lucene failed to rank it among the top five, whereas Lucene identified the correct LOINC for 2.4% (89) of local terms for which Maxent failed to do so. The relatively small number of terms ranked correctly by one model but not the other suggests that the two models perform well on similar kinds of test descriptions.

One important advantage of Maxent over Lucene and Lab Auto Mapper is its normalized score. We used this normalized score to determine a helpful threshold above which the correct LOINC was always ranked first. Using this cutoff score, we found that over 57% of local terms in our three test institutions could be ranked with a high degree of certainty. Such a cutoff score is valuable in separating local terms that can be mapped with little (or no) human review from those that need more extensive review.

Corpus growth and variability in mapping results across institutions

We probed the robustness of our corpus-based approach by analyzing several different test sets and evaluating performance as the corpus grew over time. These aspects are potentially relevant in deciding whether a corpus has reached the critical mass needed for effective modeling. We observed slightly more variation in accuracy when considering entire term sets from each of our three test institutions than in our random 80/20 splits of the corpus. This suggests that institutions’ particular naming patterns can alter mapping success even when the corpus is large. As local term mappings were added to our corpus, the growth rate in unique LOINCs decreased more than the growth rate in unique words in term descriptions. This is a favorable pattern, as it indicates growth in the diversity of words associated with LOINCs already present in the corpus. Our results showed that Maxent’s performance leveled off, but did not decline, with the incremental growth in our corpus over time, whereas there was a slight decrease in Lucene’s performance.

Limitations of a data-driven paradigm and potential future research

The primary drawback of our approach is that its success is limited by the relative completeness of the underlying training corpus. Of the 3,642 local terms in our three test institutions, 46 were mapped to LOINCs with no training data, 10 had words not associated with any LOINC, and 69 had words not associated with the correct LOINC in the training set. Neither Maxent nor Lucene was capable of ranking the correct LOINC for these 125 (3.4%) local terms, either because of limitations in the corpus or because their term descriptions were completely novel. In contrast, Lab Auto Mapper ranked the correct LOINC first for 35 (28%) and among the top five for 45 (36%) of these local terms.

RELMA’s Lab Auto Mapper succeeded where our models failed by directly querying the LOINC terminology. It also uses additional information, such as the units of measure associated with a local term, in its algorithm, and others [5a, 9] have illustrated how extended profiles built from actual test results can be useful in mapping. Our corpus-based approach depends solely on matching words in term descriptions, and thus a global test name enhancement process such as that described by Kim et al [18] may be beneficial. In contrast to the name enhancement process, a major benefit of our approach is that it requires little domain expertise on the front end. Valuable future research would include evaluating the combined strengths of these different approaches; exploring the value of adding other axes, such as units of measure, to the data models; testing alternate algorithms for supervised machine learning; and using information retrieval techniques like “fuzzy search”.

Our study has some other important limitations. We used a single corpus of mapped local terms from institutions in a broad but geographically defined area. Naming conventions used in other institutions may differ from our corpus in important ways that lower the accuracy of mapping with Maxent and Lucene. For instance, we have seen some institutions that use semantically meaningless descriptions such as “1001” in lieu of something that resembles a test name. Clearly, an automated mapping approach like ours would fail to map such local terms. Moreover, significant differences in naming conventions may compromise the ability to normalize term descriptions from training and test data uniformly. Additionally, since our corpus and test sets contained predominantly laboratory terms, we do not know how well data-driven models would generalize to other important clinical measurement variables.

Considerations for practical application

Expert review is a high-cost resource in mapping. By identifying a short, accurate, ranked list of candidate LOINC codes for each local term, we can optimize the process of human review. In settings where a large corpus of existing mappings is available, the Maxent model performed the best of those we evaluated and would be our recommendation for producing this ranked list. By choosing a high Maxent cutoff score (e.g. above 0.46), more than half of the local terms could likely be mapped with little or no human review. If human review of the ranked list reveals that a matching LOINC code is not present, the reviewer can default back to the typical term-specific search using interactive functions of RELMA.

While the core software tools we used in this study (Maxent and Lucene) are available at no cost under open source licenses, the corpus of local test descriptions mapped to LOINC from the INPC is not currently available publicly. Encouraged by the results of this study, the Regenstrief LOINC team recently announced a project to build a shared repository of local tests mapped to LOINC.[35] Because it is open to contributions from the global LOINC community, this new repository has the potential to serve as an important data substrate for future analyses.

Conclusion

Our study shows that a rich corpus of local terms mapped to LOINC contains collective knowledge that can help map terms from different institutions. We developed an automated mapping approach based on supervised machine learning and information retrieval using Apache’s Maxent and Lucene, which are available at no cost. Our approach operates on term descriptions from existing mappings in the corpus. Overall, Maxent ranked the correct LOINC first for 79% and Lucene for 72% of local terms from our three test institutions. Using a cutoff score of 0.46 allowed Maxent to identify over 57% of local terms for which the correct LOINC was always ranked first. Mapping local terms to a vocabulary standard is a necessary but resource-intensive part of integrating data from disparate systems. Accurate and efficient automated mapping methods can help accelerate adoption of vocabulary standards and promote widespread health information exchange.

Contributors

MF and DV conceived and designed the study, collected the data, evaluated the results, and wrote and edited the manuscript. MF created the models and performed the analyses.

Funding

This work was supported in part by grant 5T15 LM007117-14 and contract HHSN2762008000006C from the National Library of Medicine and performed at the Regenstrief Institute, Indianapolis, IN.

Competing Interests

None

Figures

Figure 1: Growth in unique LOINCs mapped to local terms and unique words in local term descriptions as the number of local terms in the corpus has expanded over time.

Figure 2: Results of 20 iterations of repeated random sub-sampling validation showing the percentage of test terms with manually mapped LOINCs ranked first (top one) and among the top five by Maxent and Lucene.

Figure 3: Rank of correct LOINCs and their Maxent score for local laboratory terms from three test institutions.

Figure 4: Performance of Maxent and Lucene when applying the test set against a growing corpus of local terms.

References

1. Chaudhry B, Wang J, Wu S, Maglione M, Mojica W, Roth E, et al. Systematic review: impact of health information technology on quality, efficiency, and costs of medical care. Ann Intern Med. 2006;144.
2. Smith PC, Araya-Guerra R, Bublitz C, et al. Missing clinical information during primary care visits. JAMA. Feb 2 2005;293(5):565-571.
3. Finnell JT, Overhage JM, Grannis S. All health care is not local: an evaluation of the distribution of emergency department care delivered in Indiana. AMIA Annu Symp Proc. 2011;2011:409-16. Epub 2011 Oct 22.
4. 111th Congress of the United States of America. American Recovery and Reinvestment Act of 2009.
5. McDonald CJ, Huff SM, Suico JG, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clin Chem. 2003 Apr;49(4):624-33.
6. International LOINC downloads, linguistic variants in RELMA and translating LOINC. Available at http://loinc.org/international/. Accessed Apr 30 2012.
7. Vreeman DJ, Chiaravalloti MT, Hook J, McDonald CJ. Enabling international adoption of LOINC through translation. J Biomed Inform. 2012 Jan 21. [Epub ahead of print]
8. Baorto DM, Cimino JJ, Parvin CA, Kahn MG. Combining laboratory data sets from multiple institutions using the logical observation identifier names and codes (LOINC). Int J Med Inform. 1998 Jul;51(1):29-37.
9. Department of Health and Human Services. 45 CFR Part 170. Health information technology: Initial set of standards, implementation specifications, and certification criteria for electronic health record technology; Final Rule; published July 28, 2010.
10. Lin MC, Vreeman DJ, McDonald CJ, Huff SM. Correctness of voluntary LOINC mapping for laboratory tests in three large institutions. AMIA Annu Symp Proc. 2010 Nov 13;2010:447-51.
11. Lin MC, Vreeman DJ, Huff SM. Investigating the semantic interoperability of laboratory data exchanged using LOINC codes in three large institutions. AMIA Annu Symp Proc. 2011;2011:805-14. Epub 2011 Oct 22.
12. Vreeman DJ, Stark M, Tomashefski GL, Phillips DR, Dexter PR. Embracing change in a health information exchange. AMIA Annu Symp Proc. 2008 Nov 6:768-72.
13. Li W, Tokars JI, Lipskiy N, Ganesan S. An efficient approach to map LOINC concepts to notifiable conditions. Advances in Disease Surveillance. 2007;4:172.
14. Dugas M, Thun S, Frankewitsch T, Heitmann KU. LOINC codes for hospital information systems documents: a case study. J Am Med Inform Assoc. 2009 May-Jun;16(3):400-3. Epub 2009 Mar 4.
15. Khan AN, Griffith SP, Moore C, et al. Standardizing laboratory data by mapping to LOINC. J Am Med Inform Assoc. 2006 May-Jun;13(3):353-5. Epub 2006 Feb 24.
16. Vreeman DJ, McDonald CJ. Automated mapping of local radiology terms to LOINC. AMIA Annu Symp Proc. 2005:769-73.
17. Vreeman DJ, McDonald CJ. A comparison of Intelligent Mapper and document similarity scores for mapping local radiology terms to LOINC. AMIA Annu Symp Proc. 2006:809-1.
18. Kim H, El-Kareh R, Goel A, Vineet F, Chapman WW. An approach to improve LOINC mapping through augmentation of local test names. J Biomed Inform. 2011 Dec 21. [Epub ahead of print]
19. Lau LM, Johnson K, Monson K, Lam SH, Huff SM. A method for the automated mapping of laboratory results to LOINC. Proc AMIA Symp. 2000:472-6.
20. Zollo KA, Huff SM. Automated mapping of observation codes using extensional definitions. J Am Med Inform Assoc. 2000 Nov-Dec;7(6):586-92.
21. Lin MC, Vreeman DJ, McDonald CJ, Huff SM. Auditing consistency and usefulness of LOINC use among three large institutions - Using version spaces for grouping LOINC codes. J Biomed Inform. 2012 Jan 28. [Epub ahead of print]
22. Sun JY, Sun Y. A system for automated lexical mapping. J Am Med Inform Assoc. 2006 May-Jun;13(3):334-43. Epub 2006 Feb 24.
23. McDonald CJ, Overhage JM, Barnes M, et al. The Indiana network for patient care: a working local health information infrastructure. Health Aff (Millwood). 2005 Sep-Oct;24(5):1214-20.
24. Apache Lucene. Available at http://lucene.apache.org/. Accessed Apr 30 2012.
25. McCandless M, Hatcher E, Gospodnetic O. Lucene in action. Stamford: Manning Publications, 2010.
26. Apache OpenNLP. Available at http://opennlp.apache.org/. Accessed Apr 30 2012.
27. Berger AL, Della Pietra VJ, Della Pietra SA. A maximum entropy approach to natural language processing. Computational Linguistics. 1996 Mar;22(1):39-71.
28. Regenstrief LOINC Mapping Assistant, version 5.6. Available at http://loinc.org/. Accessed Apr 30 2012.
29. RELMA version 5.6 Users’ Manual. Available at http://loinc.org/downloads/relma. Accessed Apr 30 2012.
30. Case JT. Using RELMA. Or…in search of the missing LOINC. Available at http://loinc.org/slideshows/lab-loinc-tutorial. Accessed Apr 30 2012.
31. PoweredBy – Lucene-java Wiki. Available at http://wiki.apache.org/lucene-java/PoweredBy. Accessed Sep 26, 2012.
32. Abhyankar S, Demner-Fushman D, McDonald CJ. Standardizing clinical laboratory data for secondary use. J Biomed Inform. 2012 Aug;45(4):642-50. Epub 2012 May 3.
33. Zunner C, Bürkle T, Prokosch HU, Ganslandt T. Mapping local laboratory interface terms to LOINC at a German university hospital using RELMA V.5: a semi-automated approach. J Am Med Inform Assoc. 2012 Jul 16. [Epub ahead of print]
34. Peters L, Kapusnik-Uner JE, Nguyen T, Bodenreider O. An approximate matching method for clinical drug names. AMIA Annu Symp Proc. 2011;2011:1117-26. Epub 2011 Oct 22.
35. Regenstrief launches Community Mapping Repository and Asks for Contributions of Existing Mappings to LOINC. Available at http://loinc.org/resolveuid/5fff22576bc94db371020ed12dbc5c34. Accessed Sep 26, 2012.