3. Bitcoin Data Statistics
Blockchain Size: 72.10 GB
Number of Blocks = 415263
Number of BlockFiles = 539
Number of Valid Blocks = 415259
Number of OrphanedBlocks in maps = 4
Number of Inputs = 351279196
Number of Outputs = 390680170
Number of Transactions = 134244597
Total Bitcoin volume = 3120487258.90 (312048725890043620 satoshis)
Average tx per block = 323.28
Average inputs per tx = 2.62
Average outputs per tx = 2.91
Average output value = 7.99
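The averages above follow directly from the raw counts. A minimal sketch of the arithmetic, assuming the per-block average uses the total block count and the average output value is expressed in BTC (1 BTC = 100,000,000 satoshis); the variable names are illustrative only:

```python
# Raw counts copied from the statistics above.
num_blocks = 415263
num_inputs = 351279196
num_outputs = 390680170
num_transactions = 134244597
total_satoshis = 312048725890043620

SATOSHIS_PER_BTC = 100_000_000

# Each average is a simple ratio of two counts.
avg_tx_per_block = num_transactions / num_blocks
avg_inputs_per_tx = num_inputs / num_transactions
avg_outputs_per_tx = num_outputs / num_transactions

# Total volume divided by the number of outputs, converted to BTC.
avg_output_value_btc = total_satoshis / num_outputs / SATOSHIS_PER_BTC

print(f"Average tx per block   = {avg_tx_per_block:.2f}")
print(f"Average inputs per tx  = {avg_inputs_per_tx:.2f}")
print(f"Average outputs per tx = {avg_outputs_per_tx:.2f}")
print(f"Average output value   = {avg_output_value_btc:.2f}")
```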
6. Bayesian Regression + Latent Source Model
For Bitcoin Market Price Prediction
k=1,2,..., K Latent Sources
y= Price Increase/Decrease/Same
Empirical:
Theoretical:
7. Bitcoin Market Capitalization
The total USD value of the bitcoin supply in circulation, as calculated by the daily average market price across major exchanges.
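A minimal sketch of this definition, assuming an unweighted mean of the daily prices across exchanges (the actual data source may weight exchanges by volume). The function name, the prices, and the supply figure are hypothetical and for illustration only:

```python
def market_capitalization(exchange_prices_usd, circulating_supply_btc):
    """Market cap = average price across exchanges * circulating supply."""
    # Assumption: a simple unweighted mean of the per-exchange daily prices.
    average_price = sum(exchange_prices_usd) / len(exchange_prices_usd)
    return average_price * circulating_supply_btc


# Hypothetical daily average prices (USD) from three exchanges
# and a hypothetical circulating supply (BTC).
example_prices = [400.0, 410.0, 420.0]
example_supply = 15_000_000

example_cap = market_capitalization(example_prices, example_supply)
print(f"Market capitalization: {example_cap:,.0f} USD")
```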
12. Transaction Graph
V: set of transactions {t1, t2, t3, ..., tn}
E: set of directed edges {e1, e2, ..., em}
ex = (t1, t2) => an output of t1 is used as an input of t2
In degree: d+_tx(t): the number of inputs of the transaction
Out degree: d-_tx(t): the number of outputs of the transaction
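The transaction graph defined above can be sketched with plain dictionaries. The input format (a mapping from each transaction to the transactions whose outputs it spends) and the toy data are assumptions for illustration. Note that degrees counted from edges only see spent outputs, so they can be lower than the raw number of outputs of a transaction.

```python
from collections import defaultdict

# Hypothetical toy data: each transaction maps to the list of
# transactions whose outputs it spends.
spends = {
    "t1": [],            # e.g. a coinbase transaction with no source
    "t2": ["t1"],
    "t3": ["t1", "t2"],
}


def build_transaction_graph(spends):
    """Return directed edges (src, dst): an output of src is an input of dst."""
    edges = []
    for tx, sources in spends.items():
        for src in sources:
            edges.append((src, tx))
    return edges


def degrees(edges):
    """Count incoming and outgoing edges per transaction."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


edges = build_transaction_graph(spends)
in_deg, out_deg = degrees(edges)
print(edges)
print(dict(in_deg), dict(out_deg))
```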
13. Public key Graph - G: (V,E)
V: set of public keys {pk1, pk2, pk3, ..., pkn}
E: set of directed edges {e1, e2, ..., em}
ex = (pk1, pk2) => flow of money from pk1 to pk2
In count: d+_addr(pk): number of times pk has appeared as an output in a transaction
Out count: d-_addr(pk): number of times pk has appeared as an input in a transaction
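A minimal sketch of building the public key graph from a list of transactions. It assumes one possible reading of "flow of money": every input key of a transaction gets an edge to every output key of the same transaction. The record layout and the toy data are hypothetical.

```python
from collections import Counter

# Hypothetical toy transactions: public keys on the input and output side.
transactions = [
    {"inputs": ["pk1"], "outputs": ["pk2", "pk3"]},
    {"inputs": ["pk2", "pk3"], "outputs": ["pk4"]},
]


def build_public_key_graph(transactions):
    """Return (edges, in_count, out_count) for the public key graph."""
    edges = set()
    in_count = Counter()   # times a key appears as an output
    out_count = Counter()  # times a key appears as an input
    for tx in transactions:
        for pk in tx["outputs"]:
            in_count[pk] += 1
        for pk in tx["inputs"]:
            out_count[pk] += 1
        # Assumption: money flows from every input key to every output key.
        for src in tx["inputs"]:
            for dst in tx["outputs"]:
                edges.add((src, dst))
    return edges, in_count, out_count


pk_edges, in_count, out_count = build_public_key_graph(transactions)
print(sorted(pk_edges))
```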
14. User Clustering Heuristics
Heuristic 1
If two (or more) addresses are inputs to the same transaction, they are
controlled by the same user.
Heuristic 2
The one-time change address is controlled by the same user as the input
addresses.
What is a change address?
What is a one-time change address?
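Heuristic 1 can be sketched with a union-find (disjoint-set) structure: every transaction merges all of its input addresses into one cluster. This is only an illustration of the first heuristic, not of change-address detection; the function name and the toy input are hypothetical.

```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of a transaction.

    `transactions` is a list of input-address lists, one per transaction.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own cluster root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)
        # All inputs of one transaction belong to the same user.
        for addr in inputs[1:]:
            union(inputs[0], addr)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())


# Hypothetical toy input: "a" and "b" co-spend, then "b" and "c" co-spend.
clusters = cluster_addresses([["a", "b"], ["b", "c"], ["d"]])
print(sorted(sorted(c) for c in clusters))
```

Because the merges are transitive, "a" and "c" end up in the same cluster even though they never appear in the same transaction.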
15. Unsupervised Learning + Other Clustering Method
K-means clustering with PCA
C-means Clustering with Fuzzy Logic: similar to K-means, but allows for partial
membership to clusters.
Hierarchical Clustering: Agglomerative hierarchical clustering involves starting
with many small clusters and repeatedly merging the most similar. Similarity of
clusters may be measured using several different metrics.
CURE Clustering Algorithm: a hierarchical clustering algorithm which is
more adept at handling extreme points.
Approximate and/or parallelized versions of the above algorithms (time
permitting)
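As an illustration of the agglomerative approach described above, here is a minimal pure-Python sketch on one-dimensional values. It assumes single-linkage (minimum pairwise distance) as the similarity metric, which is only one of the possible choices; the function name and sample values are hypothetical.

```python
def agglomerative(points, num_clusters):
    """Merge the two closest clusters until `num_clusters` remain."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]

    def single_link(a, b):
        # Assumption: cluster distance = smallest pairwise distance.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i and drop j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical feature values, e.g. one scalar per address.
result = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], num_clusters=3)
print(result)
```

This brute-force version rescans all cluster pairs at every merge, so it is only suitable for small inputs; the approximate or parallelized variants mentioned above would be needed at blockchain scale.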
16. Tribeflow
Takes user trajectories
Infers a set of latent environments
That best describe the user trajectories through short random walks
Infers user preference: after a short walk, performs a weighted jump between environments
Infers the relationship between short trajectories and latent environments
Performs accurate personalized next-item prediction
Infers the posterior over the latent environment the user is currently surfing
17. Inferring Tribeflow Model from Data
The number of environments K > 1 from the data
K semi-Markov transition probability matrices corresponding to random walks
over a finite set of graphs {G_M : M = 1, 2, …, K}
A distribution of user environment preferences
21. TribeFlow Rough Data Format For Transaction sequence
Time User_ID From_BTC_ID TO_BTC_ID
Problem: There is a sequence of transactions from any source bitcoin address
to a destination bitcoin address, but a single sequence can involve multiple
users. How should the data be formatted? Moreover, it is not yet clear what kind
of information we are trying to gain from these trajectories. Needs more thought.
Another idea: the user's BTC holding sequence. How much bitcoin is held by
the user over time? Can we find the flow of bitcoin from user to user? Is this
important?
TimeOfATransaction User_ID BTC_BeforeTransaction BTC_AfterTransaction
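A minimal sketch of how the holding-sequence rows could be derived, assuming user-level transfers are already available (for example after address clustering) as (time, from_user, to_user, amount) tuples with amounts in satoshis. The input layout, function name, and toy data are hypothetical.

```python
def holding_sequence(transfers, user_id, initial_balance=0):
    """Build (time, user_id, balance_before, balance_after) rows for one user."""
    rows = []
    balance = initial_balance
    for time, sender, receiver, amount in sorted(transfers, key=lambda r: r[0]):
        delta = 0
        if receiver == user_id:
            delta += amount
        if sender == user_id:
            delta -= amount
        if sender != user_id and receiver != user_id:
            continue  # transfer does not involve this user
        rows.append((time, user_id, balance, balance + delta))
        balance += delta
    return rows


# Hypothetical user-level transfers: (time, from_user, to_user, satoshis).
transfers = [
    (1, "u1", "u2", 5),
    (2, "u2", "u3", 2),
    (3, "u3", "u1", 1),
]
rows = holding_sequence(transfers, "u2")
for row in rows:
    print(*row)
```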
22. Error When Running TribeFlow
  File "main.py", line 135, in <module>
    main()
  File "main.py", line 123, in main
    args.num_batches, True, from_=from_, to=to)
  File "/scratch2/azehady/projects/tribeflow-master/tribeflow/dynamic.py", line 400, in fit
    kernel.update_state(P)
  File "tribeflow/kernels/eccdf.pyx", line 71, in tribeflow.kernels.eccdf.ECCDFKernel.update_state (tribeflow/kernels/eccdf.c:2526)
    assert P.shape[0] == self.P.shape[0]
AssertionError