Subject: Perspective in Informatics 3 – Fall Semester 2014
Professor: Davood Rafiei
Assignment No.1 HOANG Nguyen Phong
Submitted on November 3rd ID number: 6930-26-1264
Question 1 [30 marks]
• 3.5.1: On the space of nonnegative integers, which of the following functions are
distance measures? If so, prove it; if not, prove that it fails to satisfy one or more of the
axioms.
a) max(x, y) = the larger of x and y.
This function is not a distance measure, because it violates the axiom that d(x, y) = 0 if and
only if x = y (the reflexive property). The other axioms hold:
• Non-negativity holds: on the space of nonnegative integers, max(x, y) is never negative.
• Symmetry holds: max(x, y) = max(y, x), since both return the same larger value.
• The triangle inequality holds: for any third point a, max(x, a) ≥ x and max(a, y) ≥ y, so
max(x, a) + max(a, y) ≥ x + y ≥ max(x, y).
• The reflexive property fails: max(x, x) = x, which is 0 only when x = 0. For instance,
max(1, 1) = 1 ≠ 0.
• Note that max(x, y) is not the L∞-norm distance. The L∞ distance takes the maximum over the
coordinate-wise differences |xi − yi|, so for one-dimensional points it equals |x − y|, not
max(x, y).
b) diff(x, y) = |x − y| (the absolute magnitude of the difference between x and y).
This function is a distance measure, because it satisfies all of the axioms:
• Non-negativity: an absolute value is never negative, so |x − y| ≥ 0.
• Reflexive property: |x − y| = 0 exactly when x and y are the same point.
• Symmetry: |x − y| = |y − x|.
• Triangle inequality: let a be any third point and assume, without loss of generality, that
x ≤ y. The three possible positions of a are checked in the table below:

Position of a    diff(x,a) + diff(a,y) ≥ diff(x,y)                      Check
x ≤ a ≤ y        (a − x) + (y − a) = y − x ≥ y − x                       true
a < x            (x − a) + (y − a) = (y − x) + 2(x − a) ≥ y − x           true (since a < x)
a > y            (a − x) + (a − y) = (y − x) + 2(a − y) ≥ y − x           true (since a > y)

• This function is the L1-norm (Manhattan) distance for x and y in 1 dimension, where it
coincides with the Euclidean (L2-norm) distance.
c) sum(x, y) = x + y.
This function is not a distance measure, since it does not satisfy the reflexive property:
sum(x, x) = 2x, which is positive for every x ≠ 0. For instance, if x and y are the same
point 1, the function returns 2 instead of 0.
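The axiom checks for the three candidate functions can also be run mechanically. The sketch below is a brute-force search over a small range of nonnegative integers. The helper name `failed_axioms` and the bound `limit=12` are illustrative assumptions, and a finite search can only find counterexamples; it does not prove that an axiom holds in general. The axiom labelled "identity" is the one the text calls the reflexive property.

```python
from itertools import product

def failed_axioms(d, limit=12):
    """Return the set of distance axioms that d violates on {0, ..., limit - 1}."""
    pts = range(limit)
    failed = set()
    for x, y in product(pts, repeat=2):
        if d(x, y) < 0:
            failed.add("non-negativity")
        # d(x, y) must be 0 exactly when x == y.
        if (d(x, y) == 0) != (x == y):
            failed.add("identity")
        if d(x, y) != d(y, x):
            failed.add("symmetry")
    for x, y, a in product(pts, repeat=3):
        if d(x, y) > d(x, a) + d(a, y):
            failed.add("triangle inequality")
    return failed

candidates = {
    "max": lambda x, y: max(x, y),
    "diff": lambda x, y: abs(x - y),
    "sum": lambda x, y: x + y,
}
for name, d in candidates.items():
    print(name, sorted(failed_axioms(d)) or "no violation found")
```

On this range only diff passes every check; max and sum both break the identity axiom because max(x, x) = x and sum(x, x) = 2x are nonzero for x > 0.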
• 3.7.2: Let us compute sketches using the following four “random” vectors:
V1= [+1,+1,+1,-1] V2=[+1,+1,-1,+1]
V3=[+1,-1,+1,+1] V4=[-1,+1,+1,+1]
Compute the sketches of the following vectors.
(a) [2,3,4,5]
Random vector           Dot product   Sketch value
V1 = [+1,+1,+1,-1]        4           +1
V2 = [+1,+1,-1,+1]        6           +1
V3 = [+1,-1,+1,+1]        8           +1
V4 = [-1,+1,+1,+1]       10           +1

(b) [-2,3,-4,5]
Random vector           Dot product   Sketch value
V1 = [+1,+1,+1,-1]       -8           -1
V2 = [+1,+1,-1,+1]       10           +1
V3 = [+1,-1,+1,+1]       -4           -1
V4 = [-1,+1,+1,+1]        6           +1

(c) [2,-3,4,-5]
Random vector           Dot product   Sketch value
V1 = [+1,+1,+1,-1]        8           +1
V2 = [+1,+1,-1,+1]      -10           -1
V3 = [+1,-1,+1,+1]        4           +1
V4 = [-1,+1,+1,+1]       -6           -1
For each pair, what is the estimated angle between them, according to the sketches? What are
the true angles?
The following two formulas are used to calculate the estimated and true angles:
• Estimated angle = 180° × (1 − sim), where sim is the fraction of positions in which the
sketches of the two vectors agree.
• True angle = $\arccos\left(\dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}\right)$, that is, the arccosine of the dot product of the two vectors divided by the product of their magnitudes.

Pair        Estimated angle   True angle
∠(a)(b)     90°               ≈ 75° (cos θ = 14/54)
∠(b)(c)     180°              180°
∠(a)(c)     90°               ≈ 105° (cos θ = −14/54)
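The sketches and both kinds of angle for Exercise 3.7.2 can be reproduced with a few lines of code. This is a minimal sketch; the function names (`sketch`, `estimated_angle`, `true_angle`) are illustrative assumptions, and zero dot products are not handled because none occur with these four random vectors.

```python
import math

RANDOM_VECTORS = [(1, 1, 1, -1), (1, 1, -1, 1), (1, -1, 1, 1), (-1, 1, 1, 1)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def sketch(x, random_vectors=RANDOM_VECTORS):
    # +1 for a positive dot product, -1 for a negative one.
    # No dot product in this exercise is zero, so ties are not handled here.
    return [1 if dot(v, x) > 0 else -1 for v in random_vectors]

def estimated_angle(s, t):
    # 180 degrees times the fraction of positions where the sketches disagree.
    agree = sum(p == q for p, q in zip(s, t))
    return 180.0 * (1 - agree / len(s))

def true_angle(x, y):
    cos = dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

a, b, c = (2, 3, 4, 5), (-2, 3, -4, 5), (2, -3, 4, -5)
for name, (x, y) in {"(a)(b)": (a, b), "(b)(c)": (b, c), "(a)(c)": (a, c)}.items():
    print(name, estimated_angle(sketch(x), sketch(y)), round(true_angle(x, y), 1))
```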
• 3.7.3: Suppose we form sketches by using all sixteen of the vectors of length 4, whose
components are each +1 or -1. Compute the sketches of the three vectors in Exercise
3.7.2.
* Where the dot product is 0, the sketch value is chosen at random to be -1 or +1.
Vector a = [2, 3, 4, 5]
Random vector   Components (4 values)   Dot product   Sketch value
v1 -1 -1 -1 -1 -14 -1
v2 -1 -1 -1 1 -4 -1
v3 -1 -1 1 -1 -6 -1
v4 -1 -1 1 1 4 1
v5 -1 1 -1 -1 -8 -1
v6 -1 1 -1 1 2 1
v7 -1 1 1 -1 0 1
v8 -1 1 1 1 10 1
v9 1 -1 -1 -1 -10 -1
v10 1 -1 -1 1 0 -1
v11 1 -1 1 -1 -2 -1
v12 1 -1 1 1 8 1
v13 1 1 -1 -1 -4 -1
v14 1 1 -1 1 6 1
v15 1 1 1 -1 4 1
v16 1 1 1 1 14 1
Vector b = [-2, 3, -4, 5]
Random vector   Components (4 values)   Dot product   Sketch value
v1 -1 -1 -1 -1 -2 -1
v2 -1 -1 -1 1 8 1
v3 -1 -1 1 -1 -10 -1
v4 -1 -1 1 1 0 -1
v5 -1 1 -1 -1 4 1
v6 -1 1 -1 1 14 1
v7 -1 1 1 -1 -4 -1
v8 -1 1 1 1 6 1
v9 1 -1 -1 -1 -6 -1
v10 1 -1 -1 1 4 1
v11 1 -1 1 -1 -14 -1
v12 1 -1 1 1 -4 -1
v13 1 1 -1 -1 0 1
v14 1 1 -1 1 10 1
v15 1 1 1 -1 -8 -1
v16 1 1 1 1 2 1
Vector c = [2, -3, 4, -5]
Random vector   Components (4 values)   Dot product   Sketch value
v1 -1 -1 -1 -1 2 1
v2 -1 -1 -1 1 -8 -1
v3 -1 -1 1 -1 10 1
v4 -1 -1 1 1 0 1
v5 -1 1 -1 -1 -4 -1
v6 -1 1 -1 1 -14 -1
v7 -1 1 1 -1 4 1
v8 -1 1 1 1 -6 -1
v9 1 -1 -1 -1 6 1
v10 1 -1 -1 1 -4 -1
v11 1 -1 1 -1 14 1
v12 1 -1 1 1 4 1
v13 1 1 -1 -1 0 -1
v14 1 1 -1 1 -10 -1
v15 1 1 1 -1 8 1
v16 1 1 1 1 -2 -1
How do the estimates of the angles between each pair compare with the true angles?
Pair        Agreeing positions   Estimated angle   True angle
∠(a)(b)     8/16 = 1/2           90°               ≈ 75°
∠(b)(c)     0/16                 180°              180°
∠(a)(c)     8/16 = 1/2           90°               ≈ 105°
It can be deduced that, even when all 16 random vectors are used, the estimated angles for
these pairs are the same as in Exercise 3.7.2, so they are no closer to the true angles. This
outcome depends on the sketch values that were chosen at random where the dot product is 0.
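Because some dot products are 0, the sixteen-vector estimate depends on the random tie-breaks. The sketch below enumerates all sixteen vectors and counts the positions where two sketches must agree, must disagree, or are undecided because at least one dot product is 0. The names `compare` and `angle_range` are illustrative assumptions, and the tie-breaks are treated as independent choices.

```python
from itertools import product

# All sixteen +/-1 vectors of length 4, generated in the same order as v1..v16.
ALL_VECTORS = list(product((-1, 1), repeat=4))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def compare(x, y):
    """Count (agree, disagree, tie) positions over all sixteen vectors."""
    agree = disagree = tie = 0
    for v in ALL_VECTORS:
        dx, dy = dot(v, x), dot(v, y)
        if dx == 0 or dy == 0:
            tie += 1  # at least one sketch value here is a random choice
        elif (dx > 0) == (dy > 0):
            agree += 1
        else:
            disagree += 1
    return agree, disagree, tie

def angle_range(x, y):
    """Smallest and largest estimated angle over all possible tie-breaks."""
    agree, _, tie = compare(x, y)
    n = len(ALL_VECTORS)
    return 180.0 * (1 - (agree + tie) / n), 180.0 * (1 - agree / n)

a, b, c = (2, 3, 4, 5), (-2, 3, -4, 5), (2, -3, 4, -5)
for name, (x, y) in {"(a)(b)": (a, b), "(b)(c)": (b, c), "(a)(c)": (a, c)}.items():
    print(name, compare(x, y), angle_range(x, y))
```

For pair (a)(b) this gives 8 agreeing, 4 disagreeing and 4 tied positions, so the estimate can fall anywhere between 45° and 90°; the tie-breaks recorded in the tables correspond to the 90° end.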
Question 2 [10 marks] 3.7.4(a): Suppose we form sketches using the four vectors from
Exercise 3.7.2. What are the constraints on a, b, c, and d that will cause the sketch of the vector
[a, b, c, d] to be [+1,+1,+1,+1]? (Write your constraints in as simple a form as possible.)
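The dot products of the four random vectors with [a, b, c, d] can be written in matrix form as
the following equation:

$$
\begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
=
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
$$

The sketch of [a, b, c, d] is [+1, +1, +1, +1] where all of $x_1, x_2, x_3, x_4 \ge 0$.
Writing $M$ for the 4 × 4 matrix above and multiplying both sides on the left by $M^{-1}$:

$$
M^{-1} M \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
= M^{-1} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
\quad\Longrightarrow\quad
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
= M^{-1} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
$$

We have

$$
M^{-1} = \frac{1}{4}
\begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix}
$$

So a, b, c and d are constrained by the following equation:

$$
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
= \frac{1}{4}
\begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
\quad \text{where } x_1, x_2, x_3, x_4 \ge 0
$$

Equivalently, in terms of a, b, c and d:

$$
\begin{cases}
a + b + c - d \ge 0 \\
a + b - c + d \ge 0 \\
a - b + c + d \ge 0 \\
-a + b + c + d \ge 0
\end{cases}
$$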
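The inverse used in this derivation and the resulting constraints can be checked numerically. The sketch below is an illustration only: the helper names `matmul` and `satisfies_constraints` are assumptions, and the restatement "each component is at most the sum of the other three" is simply the four inequalities rearranged.

```python
M = [
    [1, 1, 1, -1],
    [1, 1, -1, 1],
    [1, -1, 1, 1],
    [-1, 1, 1, 1],
]

def matmul(A, B):
    # Plain nested-list matrix product, enough for a 4 x 4 check.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def satisfies_constraints(a, b, c, d):
    # All four dot products with the random vectors must be nonnegative.
    v = (a, b, c, d)
    return all(sum(m * t for m, t in zip(row, v)) >= 0 for row in M)

def each_at_most_sum_of_others(a, b, c, d):
    v = (a, b, c, d)
    return all(t <= sum(v) - t for t in v)

# M * M = 4 * I, which is why the inverse of M is M / 4.
identity4 = [[4 if i == j else 0 for j in range(4)] for i in range(4)]
print(matmul(M, M) == identity4)
print(satisfies_constraints(2, 3, 4, 5), satisfies_constraints(-2, 3, -4, 5))
```

The vector [2, 3, 4, 5] from Exercise 3.7.2 satisfies the constraints, while [-2, 3, -4, 5] does not.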
Question 3 [10 marks]
a) Consider a universe U with n elements, and let R and S be subsets of U both of size m,
chosen uniformly at random.
What is the expected value of the Jaccard similarity of R and S?
The expectation of a random variable X is calculated as $E(X) = \sum_x x \, P(X = x)$.
In this case, the Jaccard similarity of R and S is calculated as:

$$
\mathrm{sim}(R,S) = \frac{|R \cap S|}{|R \cup S|} = \frac{k}{2m-k}
$$

where k (0 ≤ k ≤ m) is the number of common elements between R and S.
Next, the probability that sim(R,S) takes this value is calculated as follows:

$$
P\left(\mathrm{sim}(R,S) = \frac{k}{2m-k}\right) = \frac{C_m^k \, C_{n-m}^{m-k}}{C_n^m}
$$

Since:
• An m-element set is chosen from the n elements of the universal set U in $C_n^m$ ways, so
for a fixed R there are $C_n^m$ equally likely choices of S.
• To create a set S that shares exactly k elements with R, we first take the k common
element(s) from R, which can be done in $C_m^k$ ways. The remaining (m − k) element(s) are
then chosen from the (n − m) elements outside R, which can be done in $C_{n-m}^{m-k}$ ways
(this count is 0 when m − k > n − m).
As a result, the expectation of the Jaccard similarity sim(R,S) is:

$$
E(\mathrm{sim}(R,S)) = \sum_{k=0}^{m} \frac{k}{2m-k} \cdot \frac{C_m^k \, C_{n-m}^{m-k}}{C_n^m}
= \sum_{k=0}^{m} \frac{k}{2m-k} \cdot \frac{\binom{m}{k}\binom{n-m}{m-k}}{\binom{n}{m}}
$$
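The sum for part (a) can be evaluated exactly and compared against a brute-force average over all pairs of m-element subsets. This is a minimal sketch for small n and m; the function names are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def expected_jaccard(n, m):
    """Evaluate sum over k of k/(2m-k) * C(m,k) * C(n-m,m-k) / C(n,m)."""
    total = Fraction(0)
    for k in range(m + 1):
        # math.comb returns 0 when m - k > n - m, so impossible overlaps drop out.
        p = Fraction(comb(m, k) * comb(n - m, m - k), comb(n, m))
        total += Fraction(k, 2 * m - k) * p
    return total

def brute_force_jaccard(n, m):
    """Average the Jaccard similarity over every ordered pair of m-subsets."""
    subsets = [frozenset(s) for s in combinations(range(n), m)]
    total = sum(Fraction(len(r & s), len(r | s)) for r in subsets for s in subsets)
    return total / len(subsets) ** 2

print(expected_jaccard(4, 2), brute_force_jaccard(4, 2))
```

For n = 4 and m = 2 both functions give 7/18.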
b) How does your answer to part (a) change if R and S must include a certain element (say z)
of U?
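Since z belongs to both R and S, the two sets always share at least one element. The remaining
(m − 1) elements of each set are chosen uniformly at random from the other (n − 1) elements of
U. If j of these remaining elements are common (0 ≤ j ≤ m − 1), then |R ∩ S| = j + 1 and
|R ∪ S| = 2m − j − 1, so the answer changes to:

$$
E(\mathrm{sim}(R,S)) = \sum_{j=0}^{m-1} \frac{j+1}{2m-j-1} \cdot \frac{C_{m-1}^{j} \, C_{n-m}^{m-1-j}}{C_{n-1}^{m-1}}
$$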
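One way to check an answer to part (b) is to enumerate, for small n and m, every pair of m-element subsets that contain a fixed element and average their Jaccard similarity. The sketch below does this and compares the result with a closed-form sum in which the forced element is counted once and the other m − 1 elements of each set are drawn from the remaining n − 1 elements. The function names are illustrative assumptions, and element 0 plays the role of z.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def expected_jaccard_forced(n, m):
    """Sum over j (extra common elements) when both sets must contain z."""
    total = Fraction(0)
    for j in range(m):
        p = Fraction(comb(m - 1, j) * comb(n - m, m - 1 - j), comb(n - 1, m - 1))
        total += Fraction(j + 1, 2 * m - j - 1) * p
    return total

def brute_force_forced(n, m):
    """Average Jaccard similarity over all pairs of m-subsets containing 0."""
    subsets = [frozenset((0,) + s) for s in combinations(range(1, n), m - 1)]
    total = sum(Fraction(len(r & s), len(r | s)) for r in subsets for s in subsets)
    return total / len(subsets) ** 2

print(expected_jaccard_forced(4, 2), brute_force_forced(4, 2))
```

For n = 4 and m = 2 both functions give 5/9, which is larger than the unconstrained value because the sets are guaranteed to overlap.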
c) How does your answer to part (a) change if R and S must be disjoint?
It means k = 0 with probability 1, so |R ∩ S| = 0 and the answer changes to:

$$
E(\mathrm{sim}(R,S)) = \frac{0}{2m - 0} = 0
$$