[Figure: sets X and Y and their intersection X ∩ Y]

\[ I(X; Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \]

\[ \frac{|X \cap Y|}{\min(|X|, |Y|)} \]
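The mutual information formula above can be checked numerically on small distributions. The following is a minimal sketch, assuming the joint distribution is supplied as a hash keyed by [x, y] pairs. The method name `mutual_information` and the two toy distributions are illustrative assumptions, not part of the lecture, and the natural logarithm is used because the slide does not state a base.

```ruby
# I(X;Y) = sum over x, y of p(x,y) * log(p(x,y) / (p(x) * p(y))).
# joint: Hash mapping [x, y] => p(x, y); the probabilities are assumed to sum to 1.
def mutual_information(joint)
  px = Hash.new(0.0)  # marginal p(x)
  py = Hash.new(0.0)  # marginal p(y)
  joint.each do |(x, y), p|
    px[x] += p
    py[y] += p
  end
  # Terms with p(x,y) = 0 contribute nothing to the sum.
  joint.sum(0.0) do |(x, y), p|
    p.zero? ? 0.0 : p * Math.log(p / (px[x] * py[y]))
  end
end

# Hypothetical toy distributions over two binary variables.
independent = { [0, 0] => 0.25, [0, 1] => 0.25, [1, 0] => 0.25, [1, 1] => 0.25 }
dependent   = { [0, 0] => 0.5, [1, 1] => 0.5 }

puts mutual_information(independent)  # independent variables give 0.0
puts mutual_information(dependent)    # fully dependent binary variables give log 2
```

Independent variables share no information, so the first value is zero; in the second case knowing X determines Y, and the result equals the entropy of a fair binary variable.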
$ curl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=CDK2"

<?xml version="1.0"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult>
        <Count>3778</Count>
        <RetMax>20</RetMax>
        <RetStart>0</RetStart>
        <IdList>
                 <Id>17904841</Id>
                 <Id>17904366</Id>
                 <Id>17893107</Id>
(...)
</eSearchResult>
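The value needed from this response is `Count`. As a minimal sketch, assuming the response is already held in a string, the fields can be read with REXML. The sample string below is an abbreviated copy of the output above, with the elided part dropped and `IdList` closed so that it parses; it is not a live response.

```ruby
require 'rexml/document'

# Abbreviated sample response (an assumption for illustration, not fetched live).
xml = <<~XML
  <?xml version="1.0"?>
  <eSearchResult>
    <Count>3778</Count>
    <RetMax>20</RetMax>
    <RetStart>0</RetStart>
    <IdList>
      <Id>17904841</Id>
      <Id>17904366</Id>
      <Id>17893107</Id>
    </IdList>
  </eSearchResult>
XML

doc = REXML::Document.new(xml)

# Total number of matching records, and how many IDs were returned at most.
count   = doc.elements['/eSearchResult/Count'].text.to_i
ret_max = doc.elements['/eSearchResult/RetMax'].text.to_i

# PubMed IDs listed in the response.
ids = REXML::XPath.match(doc, '/eSearchResult/IdList/Id').map(&:text)

puts count
puts ret_max
puts ids.inspect
```

Only `Count` matters for the co-occurrence calculation; `RetMax` and the `Id` entries describe the first page of results.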
$ curl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=CDK6"

(...)
<eSearchResult>
        <Count>740</Count>
        <RetMax>20</RetMax>
        <RetStart>0</RetStart>
(...)
</eSearchResult>




$ curl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=CDK2+CDK6"

(...)
<eSearchResult>
        <Count>321</Count>
        <RetMax>20</RetMax>
        <RetStart>0</RetStart>
(...)
</eSearchResult>
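In the last query, `CDK2+CDK6` is the URL-encoded form of the two-word term `CDK2 CDK6`, because `+` stands for a space in a query string. As a sketch, the query URL can be built with Ruby's standard `URI.encode_www_form`, which escapes spaces and other special characters. The helper name `esearch_url` is an assumption for illustration.

```ruby
require 'uri'

# Build an esearch URL for the given term; esearch_url is a hypothetical helper.
def esearch_url(term, db = 'pubmed')
  base = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'
  "#{base}?#{URI.encode_www_form(db: db, term: term)}"
end

puts esearch_url('CDK2')
puts esearch_url('CDK2 CDK6')  # the space is encoded as "+"
```

Building the query string this way avoids hand-written `+` signs and keeps terms with other reserved characters valid.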
\[ \frac{|X \cap Y|}{\min(|X|, |Y|)} = \frac{321}{\min(3778, 740)} = \frac{321}{740} \approx 0.434 \]
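The same calculation can be carried out in a few lines of Ruby. This is a minimal sketch using the three hit counts returned by the queries above; the method name `overlap_ratio` and the toy arrays are illustrative assumptions.

```ruby
# |X ∩ Y| / min(|X|, |Y|), computed from hit counts.
# Returns nil when the smaller count is not positive, to avoid dividing by zero.
def overlap_ratio(x_count, y_count, xy_count)
  smaller = [x_count, y_count].min
  return nil if smaller <= 0
  xy_count.to_f / smaller
end

puts overlap_ratio(3778, 740, 321).round(3)  # 0.434

# The same ratio computed directly from two small sets (hypothetical toy data).
x = [1, 2, 3, 4]
y = [3, 4, 5]
puts overlap_ratio(x.size, y.size, (x & y).size).round(3)  # 0.667
```

Dividing by the smaller set means the ratio reaches 1 whenever one set is contained in the other, regardless of how large the other set is.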
$ ruby simpson.rb CDK2 CDK6

CDK2   CDK6    3778    742    321   0.432614555256065
#!/usr/bin/env ruby

require 'rexml/document'
require 'open-uri'

# Return the number of PubMed records matching the given search term.
def count(gene)
  url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=" + gene
  # URI.open is used because Kernel#open no longer opens URLs in Ruby 3.0 and later.
  source = URI.open(url) { |fp| fp.read }
  doc = REXML::Document.new source
  return doc.elements['/eSearchResult/Count'].text.to_i
end

# Return |X ∩ Y| / min(|X|, |Y|), or nil if either single-gene count is not positive.
def simpson(gene1_count, gene2_count, gene12_count)
  if gene1_count <= 0 || gene2_count <= 0
    return nil
  elsif gene1_count < gene2_count
    return gene12_count.to_f / gene1_count.to_f
  end
  return gene12_count.to_f / gene2_count.to_f
end

# Query PubMed for each gene and for both together, then print the counts and the coefficient.
def main(gene1, gene2)
  gene1_count = count(gene1)
  gene2_count = count(gene2)
  gene12_count = count(gene1 + "+" + gene2)
  s = simpson(gene1_count, gene2_count, gene12_count)
  puts [gene1, gene2, gene1_count, gene2_count, gene12_count, s].join(" ")
end

main(ARGV[0], ARGV[1])
bioinfolec_7th_20071005