SlideShare ist ein Scribd-Unternehmen logo
1 von 16
The Power of Randomization
Example 1: Checking Equality


• Two large files at two different locations.

• Are they identical?
  – By communicating only a small amount of
    information!
Checking Equality
                The Challenge

• Two large numbers N1 and N2 , n bits each

• Communication allowed: m<<n bits

• Possible?
Checking Equality
                   Impossibility


• Suppose the communication is based on N1 alone

• m<<n,
   – Two different N1’s will have the same m-bit communication
     pattern
   – Switch N2 from one to another (YES->NO)
Checking Equality
          Randomized Algorithms


• Communicate N1 mod M for some number M

• If N1 = N2 then you always get YES


• If N1 != N2 then you get YES if M divides N1 - N2
Checking Equality
                    Analysis


• Probability N1 != N2 but M divides N1 - N2 ?

• Probability over what?
     • M and not N1,N2
     • Choose M at random in the range 1..2m
Checking Equality
                      Analysis


• How many factors does N1 - N2 have?
   – N1 - N2 <= 2n, so (2n)1/log n


• If we choose M randomly in the range 1..2 (2n)1/log n
   – Probability N1 != N2 but M divides N1 - N2 <= 1/2
   – So m is ~ n/log n bits (minor gains)
Checking Equality
                Use Prime Numbers

• How many prime factors does N1 - N2 have?
   – N1 - N2 <= 2n, so 2n/log n

• If we choose M to be a random prime in 1..4n

   – There are at least 4n/log 4n > 4n/log(4n) primes

   – Probability N1 != N2 but M divides N1 - N2 <= ~ 1/2

   – So m is ~ log n bits (major gains)
Checking Equality
                   The Solution

• Two large numbers N1 and N2 , n bits each

• log n bits of communication
   – Remainder w.r.t random prime in range 1..4n


• Error Prob < 1/2
Checking Equality
             Reducing Error Prob

• Repeat k times

• Communication is klog n bits

• Error prob < (½)k
Checking Equality
               Example Numbers

• 10GB file, n=1010

• Desired Error Prob 10-30

• Communication 99 * 33 = 3267 bits = 400 bytes


If 10 billion people do 10 billion checks a day, the prob
  that even one of the checks is erroneous is 1/10
  billion
Another Example
                     PCA

• Fit a line thru 0 to a
  collection of points so as
  to maximize sum of
  squares of projections
PCA
                 Random Sampling


• Too many points?

• Pick a random sample
   – The fitting line doesn’t
     change too much?
PCA
             Random Sampling


• How should you sample
  here?
Puzzle
          Checking Matrix Products

• Given three matrices A and BC, check if A=BC?
   – mod p for simplicity


• Matrices are n*n


• Easy to do in n3 time

• Can you do better?
Puzzle
         Checking Matrix Products

• Given three matrices A and BC, check if A=BC?

• Matrices are n*n


• Easy to do in n3 time

• Can you do better?

Weitere ähnliche Inhalte

Ähnlich wie Randomized algorithms

Significance tests
Significance testsSignificance tests
Significance testsJinho Choi
 
Statisticsforbiologists colstons
Statisticsforbiologists colstonsStatisticsforbiologists colstons
Statisticsforbiologists colstonsandymartin
 
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWER
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWERUndecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWER
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWERmuthukrishnavinayaga
 
Undecidable Problems and Approximation Algorithms
Undecidable Problems and Approximation AlgorithmsUndecidable Problems and Approximation Algorithms
Undecidable Problems and Approximation AlgorithmsMuthu Vinayagam
 
Matt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense SlidesMatt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense Slidesmpurkeypile
 
All-Reduce and Prefix-Sum Operations
All-Reduce and Prefix-Sum Operations All-Reduce and Prefix-Sum Operations
All-Reduce and Prefix-Sum Operations Syed Zaid Irshad
 
Module 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsModule 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsCool Guy
 
Recurrence relationclass 5
Recurrence relationclass 5Recurrence relationclass 5
Recurrence relationclass 5Kumar
 

Ähnlich wie Randomized algorithms (20)

Significance tests
Significance testsSignificance tests
Significance tests
 
Unit7
Unit7Unit7
Unit7
 
Statisticsforbiologists colstons
Statisticsforbiologists colstonsStatisticsforbiologists colstons
Statisticsforbiologists colstons
 
Combinatorics.ppt
Combinatorics.pptCombinatorics.ppt
Combinatorics.ppt
 
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWER
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWERUndecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWER
Undecidable Problems - COPING WITH THE LIMITATIONS OF ALGORITHM POWER
 
Brute force method
Brute force methodBrute force method
Brute force method
 
Undecidable Problems and Approximation Algorithms
Undecidable Problems and Approximation AlgorithmsUndecidable Problems and Approximation Algorithms
Undecidable Problems and Approximation Algorithms
 
Pn sequence
Pn sequencePn sequence
Pn sequence
 
densematrix.ppt
densematrix.pptdensematrix.ppt
densematrix.ppt
 
Matt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense SlidesMatt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense Slides
 
unit 4 nearest neighbor.ppt
unit 4 nearest neighbor.pptunit 4 nearest neighbor.ppt
unit 4 nearest neighbor.ppt
 
sorting
sortingsorting
sorting
 
Chap8 slides
Chap8 slidesChap8 slides
Chap8 slides
 
Unit 5
Unit 5Unit 5
Unit 5
 
Unit 5
Unit 5Unit 5
Unit 5
 
A star
A starA star
A star
 
All-Reduce and Prefix-Sum Operations
All-Reduce and Prefix-Sum Operations All-Reduce and Prefix-Sum Operations
All-Reduce and Prefix-Sum Operations
 
Quantization
QuantizationQuantization
Quantization
 
Module 2 Design Analysis and Algorithms
Module 2 Design Analysis and AlgorithmsModule 2 Design Analysis and Algorithms
Module 2 Design Analysis and Algorithms
 
Recurrence relationclass 5
Recurrence relationclass 5Recurrence relationclass 5
Recurrence relationclass 5
 

Mehr von Strand Life Sciences Pvt Ltd (12)

Strand genomics features in CIO review
Strand genomics features in CIO reviewStrand genomics features in CIO review
Strand genomics features in CIO review
 
Rules of a Quantum World
Rules of  a Quantum WorldRules of  a Quantum World
Rules of a Quantum World
 
Least common ancestors in constant time
Least common ancestors in constant timeLeast common ancestors in constant time
Least common ancestors in constant time
 
Introduction to statistics iii
Introduction to statistics iiiIntroduction to statistics iii
Introduction to statistics iii
 
Introduction to statistics ii
Introduction to statistics iiIntroduction to statistics ii
Introduction to statistics ii
 
Introduction to statistics
Introduction to statisticsIntroduction to statistics
Introduction to statistics
 
Dynamic programming for simd
Dynamic programming for simdDynamic programming for simd
Dynamic programming for simd
 
Complex numbers polynomial multiplication
Complex numbers polynomial multiplicationComplex numbers polynomial multiplication
Complex numbers polynomial multiplication
 
Converting High Dimensional Problems to Low Dimensional Ones
Converting High Dimensional Problems to Low Dimensional OnesConverting High Dimensional Problems to Low Dimensional Ones
Converting High Dimensional Problems to Low Dimensional Ones
 
Searching using Quantum Rules
Searching using Quantum RulesSearching using Quantum Rules
Searching using Quantum Rules
 
Suffix arrays
Suffix arraysSuffix arrays
Suffix arrays
 
Alignment of raw reads in Avadis NGS
Alignment of raw reads in Avadis NGSAlignment of raw reads in Avadis NGS
Alignment of raw reads in Avadis NGS
 

Kürzlich hochgeladen

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Kürzlich hochgeladen (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

Randomized algorithms

  • 1. The Power of Randomization
  • 2. Example 1: Checking Equality • Two large files at two different locations. • Are they identical? – By communicating only a small amount of information!
  • 3. Checking Equality The Challenge • Two large numbers N1 and N2 , n bits each • Communication allowed: m<<n bits • Possible?
  • 4. Checking Equality Impossibility • Suppose the communication is based on N1 alone • m<<n, – Two different N1’s will have the same m-bit communication pattern – Switch N2 from one to another (YES->NO)
  • 5. Checking Equality Randomized Algorithms • Communicate N1 mod M for some number M • If N1 = N2 then you always get YES • If N1 != N2 then you get YES if M divides N1 - N2
  • 6. Checking Equality Analysis • Probability N1 != N2 but M divides N1 - N2 ? • Probability over what? • M and not N1,N2 • Choose M at random in the range 1..2m
  • 7. Checking Equality Analysis • How many factors does N1 - N2 have? – N1 - N2 <= 2n, so (2n)1/log n • If we choose M randomly in the range 1..2 (2n)1/log n – Probability N1 != N2 but M divides N1 - N2 <= 1/2 – So m is ~ n/log n bits (minor gains)
  • 8. Checking Equality Use Prime Numbers • How many prime factors does N1 - N2 have? – N1 - N2 <= 2n, so 2n/log n • If we choose M to be a random prime in 1..4n – There are at least 4n/log 4n > 4n/log(4n) primes – Probability N1 != N2 but M divides N1 - N2 <= ~ 1/2 – So m is ~ log n bits (major gains)
  • 9. Checking Equality The Solution • Two large numbers N1 and N2 , n bits each • log n bits of communication – Remainder w.r.t random prime in range 1..4n • Error Prob < 1/2
  • 10. Checking Equality Reducing Error Prob • Repeat k times • Communication is klog n bits • Error prob < (½)k
  • 11. Checking Equality Example Numbers • 10GB file, n=1010 • Desired Error Prob 10-30 • Communication 99 * 33 = 3267 bits = 400 bytes If 10 billion people do 10 billion checks a day, the prob that even one of the checks is erroneous is 1/10 billion
  • 12. Another Example PCA • Fit a line thru 0 to a collection of points so as to maximize sum of squares of projections
  • 13. PCA Random Sampling • Too many points? • Pick a random sample – The fitting line doesn’t change too much?
  • 14. PCA Random Sampling • How should you sample here?
  • 15. Puzzle Checking Matrix Products • Given three matrices A and BC, check if A=BC? – mod p for simplicity • Matrices are n*n • Easy to do in n3 time • Can you do better?
  • 16. Puzzle Checking Matrix Products • Given three matrices A and BC, check if A=BC? • Matrices are n*n • Easy to do in n3 time • Can you do better?