The Power of Randomization
Example 1: Checking Equality


• Two large files at two different locations.

• Are they identical?
  – By communicating only a small amount of
    information!
Checking Equality
                The Challenge

• Two large numbers N1 and N2 , n bits each

• Communication allowed: m<<n bits

• Possible?
Checking Equality
                   Impossibility


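• Suppose the communication is a fixed (deterministic) function of N1 alone

• m<<n,
   – Two different N1’s will have the same m-bit communication
     pattern (only 2^m patterns for 2^n numbers)
   – Switch N2 from one to the other: the answer must change
     (YES->NO), but the communication does not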
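
The pigeonhole argument above can be checked by brute force on a toy size. This is a minimal sketch, assuming a hypothetical deterministic scheme that sends the low 3 bits of an 8-bit number; the names `find_collision` and `low3` are illustrative and not from the slides.

```python
def find_collision(summary, n_bits):
    """Return two distinct n-bit inputs that share the same summary."""
    seen = {}
    for x in range(2 ** n_bits):
        s = summary(x)
        if s in seen:
            return seen[s], x      # pigeonhole: two inputs, one summary
        seen[s] = x
    return None                    # only possible if the summary is injective


def low3(x):
    """Hypothetical m = 3 bit summary of an n = 8 bit number."""
    return x & 0b111


a, b = find_collision(low3, 8)
print(a, b)                        # 0 8
print(low3(a) == low3(b))          # True: the receiver cannot tell them apart
```

With N2 = a, the receiver sees the same 3 bits whether N1 is a or b, so it must answer wrongly in one of the two cases.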
Checking Equality
          Randomized Algorithms


• Communicate N1 mod M for some number M

• If N1 = N2 then you always get YES


• If N1 != N2 then you get YES if M divides N1 - N2
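
A minimal Python sketch of the exchange on this slide, for a fixed M. The function names `fingerprint` and `looks_equal` are assumptions made for illustration.

```python
def fingerprint(n1: int, m: int) -> int:
    """What the first party sends: the remainder of N1 modulo M."""
    return n1 % m


def looks_equal(n2: int, m: int, received: int) -> bool:
    """The second party answers YES iff its own remainder matches."""
    return n2 % m == received


# Equal numbers always produce YES, whatever M is.
print(looks_equal(1_000_003, 7, fingerprint(1_000_003, 7)))   # True

# Different numbers produce a wrong YES exactly when M divides N1 - N2.
n1, n2 = 100, 58                                   # N1 - N2 = 42 = 2 * 3 * 7
print(looks_equal(n2, 7, fingerprint(n1, 7)))      # True: 7 divides 42
print(looks_equal(n2, 5, fingerprint(n1, 5)))      # False: 5 does not divide 42
```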
Checking Equality
                    Analysis


• Probability N1 != N2 but M divides N1 - N2 ?

• Probability over what?
     • M and not N1,N2
     • Choose M at random in the range 1..2^m
Checking Equality
                      Analysis


• How many factors does N1 - N2 have?
   – |N1 - N2| <= 2^n, so roughly (2^n)^(1/log n) = 2^(n/log n)


• If we choose M randomly in the range 1..2·(2^n)^(1/log n)
   – Probability N1 != N2 but M divides N1 - N2 <= 1/2
   – So m is ~ n/log n bits (minor gains)
Checking Equality
                Use Prime Numbers

• How many prime factors does N1 - N2 have?
   – |N1 - N2| <= 2^n, so at most about 2n/log n

• If we choose M to be a random prime in 1..4n

   – There are at least 4n/ln(4n) > 4n/log(4n) primes in this range

   – Probability N1 != N2 but M divides N1 - N2 <= ~ 1/2

   – So m is ~ log n bits (major gains)
Checking Equality
                   The Solution

• Two large numbers N1 and N2 , n bits each

• log n bits of communication
   – Remainder w.r.t random prime in range 1..4n


• Error Prob < 1/2
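
One round of this protocol can be sketched in Python as follows. This is an illustration under stated assumptions, not the slides' own code: the helper names (`primes_up_to`, `one_round`, `false_yes_probability`) are invented here, and the test inputs are chosen so that N1 - N2 is a product of many small primes, which is a bad case for the check.

```python
import random


def primes_up_to(limit):
    """Sieve of Eratosthenes: every prime p with 2 <= p <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def one_round(n1, n2, n_bits, rng):
    """Pick a random prime M in 1..4n and compare N1 mod M with N2 mod M."""
    m = rng.choice(primes_up_to(4 * n_bits))
    return n1 % m == n2 % m


def false_yes_probability(n1, n2, n_bits):
    """Exact chance of a wrong YES: fraction of primes in 1..4n dividing N1 - N2."""
    primes = primes_up_to(4 * n_bits)
    return sum(1 for p in primes if (n1 - n2) % p == 0) / len(primes)


n_bits = 64
# Build a difference that is a product of as many small primes as fit.
diff = 1
for p in primes_up_to(4 * n_bits):
    if diff * p >= 2 ** (n_bits - 1):
        break
    diff *= p

rng = random.Random(0)
n1 = 2 ** (n_bits - 1) + 12345
n2 = n1 - diff
print(one_round(n1, n1, n_bits, rng))           # True: equal inputs always agree
print(false_yes_probability(n1, n2, n_bits))    # stays below 1/2 even in this bad case
```

Only the prime M and the remainder cross the wire, and both are at most 4n, which is where the log n bits come from.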
Checking Equality
             Reducing Error Prob

• Repeat k times

• Communication is k log n bits

• Error prob < (1/2)^k
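
A sketch of the repetition step, assuming the k primes are drawn independently (with replacement) from 1..4n. The names `small_primes` and `repeated_check` are illustrative.

```python
import random


def small_primes(limit):
    """All primes up to limit, by trial division (fine for small limits)."""
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def repeated_check(n1, n2, n_bits, k, rng):
    """Answer YES only if k independent random-prime rounds all agree."""
    primes = small_primes(4 * n_bits)
    for _ in range(k):
        m = rng.choice(primes)
        if n1 % m != n2 % m:
            return False           # a single mismatch proves N1 != N2
    return True                    # all k remainders matched


rng = random.Random(0)
n1 = 2 ** 63 + 12345
n2 = n1 + 30030                    # difference = 2 * 3 * 5 * 7 * 11 * 13
print(repeated_check(n1, n1, 64, 20, rng))   # True: equal inputs never fail
print(repeated_check(n1, n2, 64, 20, rng))   # a wrong YES needs all 20 rounds to miss
```

A NO answer is always correct; only a YES can be wrong, and each extra round at least halves that risk.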
Checking Equality
               Example Numbers

• 10GB file, n = 10^10

• Desired Error Prob 10^-30

• Communication 99 * 33 = 3267 bits, about 400 bytes
  (k = 99 rounds, log n ~ 33 bits each)


If 10 billion people do 10 billion checks a day, the prob
  that even one of the checks is erroneous is 1/10
  billion
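
The arithmetic on this slide can be reproduced in a few lines of Python. The rounding down to 33 and 99 follows the slide's figures; note that (1/2)^99 is about 1.6 * 10^-30, so strictly meeting 10^-30 would take 100 rounds.

```python
import math

n = 10 ** 10                     # size parameter from the slide
target_error = 1e-30             # desired error probability

bits_per_round = math.floor(math.log2(n))             # log n, rounded down
rounds = math.floor(math.log2(1 / target_error))      # k, rounded down
total_bits = bits_per_round * rounds

print(bits_per_round, rounds, total_bits)   # 33 99 3267
print(total_bits / 8)                       # 408.375 bytes, i.e. about 400

# Union bound: 10 billion people, 10 billion checks each, error 10^-30 per check.
checks = 10 ** 10 * 10 ** 10
print(checks * target_error)                # about 1e-10, i.e. 1 in 10 billion
```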
Another Example
                     PCA

• Fit a line through the origin
  to a collection of points so
  as to maximize the sum of
  squares of projections
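
For points in the plane, this fit has a closed form, sketched below in plain Python. The sum of squared projections onto the direction (cos t, sin t) is Sxx cos^2 t + 2 Sxy sin t cos t + Syy sin^2 t, which is maximized at t = atan2(2 Sxy, Sxx - Syy) / 2. The function names are illustrative, and the 2-D restriction is an assumption made to keep the example short.

```python
import math


def best_line_through_origin(points):
    """Unit direction of the line through 0 maximizing the sum of squared projections."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


def sum_sq_projections(points, direction):
    """Objective value for a given unit direction."""
    dx, dy = direction
    return sum((x * dx + y * dy) ** 2 for x, y in points)


pts = [(1, 2), (2, 4), (-1, -2)]              # all on the line y = 2x
d = best_line_through_origin(pts)
print(d[1] / d[0])                            # slope of the fitted line, close to 2
print(sum_sq_projections(pts, d) >= sum_sq_projections(pts, (1.0, 0.0)))   # True
```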
PCA
                 Random Sampling


• Too many points?

• Pick a random sample
   – The fitting line doesn’t
     change too much?
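
A small experiment along these lines, assuming uniform sampling and synthetic 2-D data scattered around the line y = 0.5x (both are assumptions for illustration, not from the slides). For well-spread data like this the fitted direction barely moves; data dominated by a few far-away points may need a different sampling scheme.

```python
import math
import random


def best_angle(points):
    """Angle of the line through 0 maximizing the sum of squared projections."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    return 0.5 * math.atan2(2 * sxy, sxx - syy)


rng = random.Random(0)
# Synthetic data: points near y = 0.5 x with small Gaussian noise in y.
points = []
for _ in range(20_000):
    x = rng.uniform(-1.0, 1.0)
    points.append((x, 0.5 * x + rng.gauss(0.0, 0.1)))

sample = rng.sample(points, 500)          # uniform sample without replacement
full_angle = best_angle(points)
sample_angle = best_angle(sample)
print(abs(full_angle - sample_angle))     # angle change in radians, small here
```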
PCA
             Random Sampling


• How should you sample
  here?
Puzzle
          Checking Matrix Products

• Given three matrices A, B and C, check if A=BC?
   – mod p for simplicity


• Matrices are n*n


• Easy to do in n^3 time

• Can you do better?
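
One well-known randomized answer to this puzzle is Freivalds' check: multiply both sides by a random vector r and compare A r with B (C r), which costs only n^2 time per round. The sketch below assumes p is prime, so a wrong YES happens with probability at most 1/p per round; the function names are illustrative and the slides do not state which solution they intend.

```python
import random


def mat_vec(m, v, p):
    """Matrix-vector product modulo p, in O(n^2) time."""
    return [sum(a * b for a, b in zip(row, v)) % p for row in m]


def probably_equal(a, b, c, p, rounds, rng):
    """Freivalds-style test of A == B*C (mod p) using random vectors."""
    n = len(a)
    for _ in range(rounds):
        r = [rng.randrange(p) for _ in range(n)]
        if mat_vec(a, r, p) != mat_vec(b, mat_vec(c, r, p), p):
            return False           # a mismatch proves A != BC
    return True                    # every round agreed


p = 101
b = [[1, 2], [3, 4]]
c = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]        # the true product BC
bad = [[20, 22], [43, 50]]         # one entry changed

rng = random.Random(0)
print(probably_equal(good, b, c, p, 20, rng))   # True
print(probably_equal(bad, b, c, p, 20, rng))    # a wrong YES needs all 20 rounds to miss
```

As with the equality check, a NO is always correct and repeating rounds drives the chance of a wrong YES down geometrically.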
