Hardware-Enhanced Association Rule Mining
with Hashing and Pipelining
Ying-Hsiang Wen, Jen-Wei Huang, and Ming-Syan Chen, Fellow, IEEE
Abstract—Generally speaking, to implement Apriori-based association rule mining in hardware, one has to load candidate itemsets
and a database into the hardware. Since the capacity of the hardware architecture is fixed, if the number of candidate itemsets or the
number of items in the database is larger than the hardware capacity, the items are loaded into the hardware separately. The time
complexity of those steps that need to load candidate itemsets or database items into the hardware is in proportion to the number of
candidate itemsets multiplied by the number of items in the database. Too many candidate itemsets and a large database would create
a performance bottleneck. In this paper, we propose a HAsh-based and PiPelIned (abbreviated as HAPPI) architecture for hardware-
enhanced association rule mining. We apply the pipeline methodology in the HAPPI architecture to compare itemsets with the
database and collect useful information for reducing the number of candidate itemsets and items in the database simultaneously.
When the database is fed into the hardware, candidate itemsets are compared with the items in the database to find frequent itemsets.
At the same time, trimming information is collected from each transaction. In addition, itemsets are generated from transactions and
hashed into a hash table. The useful trimming information and the hash table enable us to reduce the number of items in the database
and the number of candidate itemsets. Therefore, we can effectively reduce the frequency of loading the database into the hardware.
As such, HAPPI solves the bottleneck problem in Apriori-based hardware schemes. We also derive some properties to investigate the
performance of this hardware implementation. As shown by the experiment results, HAPPI significantly outperforms the previous
hardware approach and the software algorithm in terms of execution time.
Index Terms—Hardware enhanced, association rule.
1 INTRODUCTION
DATA mining technology is now used in a wide variety of
fields. Applications include the analysis of customer
transaction records, web site logs, credit card purchase
information, call records, to name a few. The interesting
results of data mining can provide useful information such
as customer behavior for business managers and researchers. One of the most important data mining applications is
association rule mining [11], which can be described as
follows: Let $I = \{i_1, i_2, \ldots, i_n\}$ denote a set of items; let $D$ denote a set of database transactions, where each transaction $T$ is a set of items such that $T \subseteq I$; and let $X$ denote a set of items, called an itemset. A transaction $T$ contains $X$ if and only if $X \subseteq T$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$. The rule $X \Rightarrow Y$ has support $s$ percent in the transaction set $D$ if $s$ percent of the transactions in $D$ contain $X \cup Y$. The rule $X \Rightarrow Y$ holds in the transaction set $D$ with confidence $c$ percent if $c$ percent of the transactions in $D$ that contain $X$ also contain $Y$. The support of the rule $X \Rightarrow Y$ is given by
$$s\% = \frac{|\{T \in D \mid X \cup Y \subseteq T\}|}{|D|} \times 100\%,$$
where $|\cdot|$ indicates the number of transactions. The confidence of the rule $X \Rightarrow Y$ is given by
$$c\% = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)} \times 100\%.$$
A typical example of an association rule is that 80 percent of
customers who purchase beef steak and goose liver paste
would also prefer to buy bottles of red wine. Once we have
found all frequent itemsets that meet the minimum support
requirement, calculation of confidence for each rule is
trivial. Therefore, we only need to focus on the methods of
finding the frequent itemsets in the database. The Apriori
[2] approach was the first to address this issue. Apriori
finds frequent itemsets by scanning a database to check the
frequencies of candidate itemsets, which are generated by
merging frequent subitemsets. However, Apriori-based algorithms suffer from a performance bottleneck when there are too many candidate itemsets. DHP [16] proposed a hash
table scheme, which effectively reduces the number of
candidate itemsets. In addition, several mining techniques,
such as TreeProjection [1], the FP-growth algorithm [12],
partitioning [18], sampling [19], and the Hidden Markov
Model [5] have also received a significant amount of
research attention.
With the increasing amount of data, it is important to
develop more efficient algorithms to extract knowledge
from the data. However, the volume of data is increasing much faster than CPU execution speeds, which
has a strong influence on the performance of software
algorithms. Several works [7], [8] have proposed parallel
computing schemes to execute operations simultaneously
. The authors are with the National Taiwan University, 106, No. 1, Sec. 4,
Roosevelt Road, Taipei, Taiwan.
E-mail: {winshung, jwhuang}@arbor.ee.ntu.edu.tw,
mschen@cc.ee.ntu.edu.tw.
Manuscript received 25 Feb. 2007; revised 9 Aug. 2007; accepted 8 Oct. 2007;
published online 11 Feb. 2008.
For information on obtaining reprints of this article, please send e-mail to:
tkde@computer.org, and reference IEEECS Log Number TKDE-0086-0207.
Digital Object Identifier no. 10.1109/TKDE.2008.39.
on multiprocessors. The performance, however, does not improve linearly as the number of parallel nodes grows.
Therefore, some researchers have tried to use hardware
devices to accomplish data mining tasks. In [15], Liu et al.
proposed a parallel matrix hardware architecture, which
can efficiently generate candidate 2-itemsets, for high-
throughput data stream applications. Baker and Prasanna
[3], [4] designed scalable hardware for association rule
mining by utilizing the systolic array proposed in [13] and
[14]. The architecture utilizes parallel computing techniques
to execute a large number of pattern matching operations at
the same time. Other hardware architectures [6], [9], [10],
[20] have been designed to speed up the K-means clustering
algorithm.
Generally speaking, Apriori-based hardware schemes
require loading the candidate itemsets and the database into
the hardware. Since the capacity of the hardware is fixed, if
the number of items in the database is larger than the
hardware capacity, the data items must be loaded separately. Therefore, the process of comparing candidate itemsets with the database needs to be executed several times.
Similarly, if the number of candidate itemsets is larger than
the capacity of the hardware, the pattern matching procedure has to be separated into many rounds. Clearly, it is costly for any hardware design to load the candidate itemsets and the database into the hardware multiple times.
Since the time complexity of those steps that need to load
candidate itemsets or database items into the hardware is in
proportion to the number of candidate itemsets and the
number of items in the database, this procedure is very time
consuming. In addition, numerous candidate itemsets and a
huge database may cause a bottleneck in the system.
In this paper, we propose a HAsh-based and PiPelIned
(abbreviated as HAPPI) architecture for hardware-enhanced
association rule mining. That is, we identify certain parts of
the mining process that are suitable for, and will benefit from,
hardware implementation and perform hardware-enhanced
mining. Explicitly, we incorporate the pipeline methodology
into the HAPPI architecture to compare itemsets and collect
useful information that enables us to reduce the number of
candidate itemsets and items in the database simultaneously. As shown in Fig. 1, there are three hardware
modules in our system. First, when the database is fed into
the hardware, the candidate itemsets are compared with the
items in the database by the systolic array. Candidate
itemsets that have a higher frequency than the minimum
support value are viewed as frequent itemsets. Second, we
determine, at the same time, the frequency with which each item in the transactions occurs in the candidate itemsets. These frequencies are called trimming information. Based on this information, the trimming filter eliminates infrequent items from the transactions, since they are not useful for generating frequent itemsets. Third, we generate
itemsets from transactions and hash them into the hash
table, which is then used to filter out unnecessary candidate
itemsets. After the hardware compares candidate itemsets
with the items in the database, the trimming information is
collected and the hash table is built. The useful information
helps us to reduce the number of items in the database and
the number of candidate itemsets. Based on the trimming
information, items are trimmed if their corresponding
occurrence frequencies are not larger than the length of the
current candidate itemsets. In addition, after the candidate
itemsets are generated by merging frequent subitemsets,
they are sent to the hash table filter. If the number of itemsets
in the corresponding bucket of the hash table is less than the
minimum support, the candidate itemsets are pruned. As
such, HAPPI solves the bottleneck problem mentioned
earlier by the cooperation of these three hardware modules.
To achieve these goals, we devise the following five
procedures in the HAPPI architecture: support counting,
transaction trimming, hash table building, candidate generation, and candidate pruning. Moreover, we derive several
formulas to decide the optimal design in order to reduce the
overhead induced by the pipeline scheme and the ideal
number of hardware modules to achieve the best utilization.
The execution time between sequential processing and
pipeline processing is also analyzed in this paper.
We conduct several experiments to evaluate the perfor-
mance of the HAPPI architecture. In addition, we implement
the work of Baker and Prasanna [3] and a software algorithm
DHP [16] for comparison purposes. The experiment results
show that HAPPI outperforms the previous approach on
execution time significantly, especially when the number of
items in the database is large and the minimum support
value increases. Moreover, the performance of HAPPI is
better than that of the previous approach [3] when the
systolic array contains different numbers of hardware cells.
In fact, by using only 25 hardware cells in the systolic array,
we can achieve the same performance as more than 800
hardware cells in the previous approach. The advantage of the HAPPI architecture is that it delivers more computing power at a lower hardware cost for mining association rules. The scale-up experiment also shows
that HAPPI outperforms the previous approach on different
numbers of transactions in the database. Indeed, our
architecture is a good example of the methodology of performance enhancement by hardware. We implement our architecture on a commercial FPGA board, and it can easily be realized in a custom ASIC. With the progress in IC process technology, the performance of HAPPI will be further improved. In view of the fast increase in the
amount of data in various emerging mining applications
(e.g., network application mining, data stream mining, and
bioinformatics data mining), it is envisioned that hardware-
enhanced mining is an important research direction to
explore for future data mining tasks.
The remainder of the paper is organized as follows: We
discuss related works in Section 2. The preliminaries are
presented in Section 3. The HAPPI architecture is described
in Section 4. Next, we show several experiments conducted
on HAPPI in Section 5. Finally, we present our conclusions
in Section 6.
Fig. 1. System architecture.
2 RELATED WORKS
In this section, we discuss two previous works that use a
systolic array architecture to enhance the performance of
data mining.
The Systolic Process Array (SPA) architecture is pro-
posed in [10] to perform K-means clustering. SPA accelerates the processing speed by utilizing several hardware
cells to calculate the distances in parallel. Each cell
corresponds to a cluster and stores the centroid of the
cluster in local memory. The data flows linked by each cell
include the data object, the minimum distance between the
object and its closest centroid, and the closest centroid of the
object. The cell computes the distance between the centroid
and the input data object. Based on the resulting distance,
the cell updates the minimum distance and the closest
centroid of the data object. Therefore, the system can obtain
the closest centroid of each object from SPA.
The centroids are recomputed and updated by the system,
and the new centroids are sent to the cells. The system
continuously updates clustering results.
In [3], the authors implemented a systolic array with
several hardware cells to speed up the Apriori algorithm.
Each cell performs an ALU (larger than, smaller than, or
equal to) operation, which compares the incoming item
with items in the memory of the cell. This operation
generates frequent itemsets by comparing candidate itemsets with the items in the database. Since all the cells can execute their own operations simultaneously, the performance of the architecture is better than that of a single
processor. However, the number of cells in the systolic
array is fixed. If the number of candidate itemsets is larger
than the number of hardware cells, the pattern matching
procedure has to be separated into many rounds, and it is costly to load the candidate itemsets and the database into the hardware multiple times. As reported in [3], the
performance is only about four times faster than some
software algorithms. Hence, there is much room to improve
the execution time.
3 PRELIMINARIES
The hash table scheme proposed in DHP [16] improves the
performance of Apriori-based algorithms by filtering out
infrequent candidate itemsets. In addition, DHP employs an
effective pruning scheme to eliminate infrequent items in
transactions. We summarize these two schemes below.
In the hash table scheme, a hash function is applied to all of the candidate k-itemsets, which are generated by merging frequent subitemsets. Each candidate k-itemset is mapped to a hash value, and itemsets with the same hash value fall into the same bucket of the hash table. If the count recorded in a bucket is less than the minimum support threshold, the occurrence frequency of every candidate itemset in that bucket must also be less than the minimum support threshold. As a result, these candidate itemsets cannot be frequent and are removed from the system. On the other hand, if the count in a bucket reaches the minimum support threshold, the itemsets in that bucket are carried to the real frequency testing process, which scans the database.
The hash table for filtering candidate k-itemsets Hk is
built by hashing the k-itemsets generated by each transaction. A hash table contains n buckets, where n is an arbitrary number. When an itemset is hashed to bucket i, the number of itemsets in that bucket is increased by one. The number of itemsets in each bucket represents the accumulated frequency of the itemsets whose hash values are
assigned to that bucket. After candidate k-itemsets have
been generated, they are hashed and assigned to buckets of
Hk. If the number of itemsets in a bucket is less than the
minimum support, candidate itemsets in this bucket are
removed. The example in Fig. 2 demonstrates how to build
H2 and how to use it to filter out candidate 2-itemsets. After
we scan the transaction TID = 100, ⟨AC⟩, ⟨AD⟩, and ⟨CD⟩ are hashed to the buckets. According to the hash function shown in Fig. 2, the hash values of ⟨AC⟩, ⟨AD⟩, and ⟨CD⟩ are 6, 0, and 6, respectively. As a result, the numbers of itemsets in the buckets indexed by 6, 0, and 6 are increased by one. After all the transactions in the database have been scanned, the frequent 1-itemsets are found, i.e., L1 = {A, B, C, E}. In addition, the numbers of itemsets in the buckets of H2 are ⟨3, 1, 2, 0, 3, 1, 3⟩, and the minimum support frequency is 2. Thus, the candidate 2-itemsets in buckets 1, 3, and 5 should be pruned. If we generate candidate 2-itemsets from L1 * L1 directly, the original set of candidate 2-itemsets C2 is
{⟨AB⟩, ⟨AC⟩, ⟨AE⟩, ⟨BC⟩, ⟨BE⟩, ⟨CE⟩}.
After filtering out unnecessary candidate itemsets by checking H2, the new C2' becomes
{⟨AC⟩, ⟨BC⟩, ⟨BE⟩, ⟨CE⟩}.
Therefore, the number of candidate itemsets can be
reduced.
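The following Python sketch mimics this hash-filtering step in software. It is only an illustration: the hash function is an assumption chosen so that its bucket values (6, 0, and 6 for ⟨AC⟩, ⟨AD⟩, and ⟨CD⟩) agree with the ones quoted above, and the transaction list is likewise a plausible reconstruction of the running example rather than data printed in the text.

```python
from itertools import combinations

# Sketch of DHP-style hash filtering for candidate 2-itemsets. The hash
# function is an assumption (items A..E numbered 1..5); DHP and the HAPPI
# hardware may use a different bucket mapping.
NUM_BUCKETS = 7
ORDER = {c: i for i, c in enumerate("ABCDE", start=1)}

def bucket(pair):
    x, y = sorted(ORDER[i] for i in pair)
    return (x * 10 + y) % NUM_BUCKETS

transactions = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
min_support_count = 2

# Build H2 by hashing every 2-itemset generated from each transaction.
h2 = [0] * NUM_BUCKETS
for t in transactions:
    for pair in combinations(sorted(t), 2):
        h2[bucket(pair)] += 1
print(h2)          # [3, 1, 2, 0, 3, 1, 3]

# Prune candidate 2-itemsets whose bucket count is below the threshold.
frequent_1 = ["A", "B", "C", "E"]
candidates = list(combinations(frequent_1, 2))
pruned = [c for c in candidates if h2[bucket(c)] >= min_support_count]
print(pruned)      # [('A', 'C'), ('B', 'C'), ('B', 'E'), ('C', 'E')]
```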
The pruning scheme, which filters out infrequent items in the transactions, will be implemented in hardware. The theoretical background of the pruning scheme is based on the following two theorems, presented in [17]:
Theorem 1. A transaction can only be used to support the set of frequent (k + 1)-itemsets if it consists of at least (k + 1) candidate k-itemsets.
Theorem 2. An item in a transaction can be trimmed if it does
not appear in at least k of the candidate k-itemsets contained
in the transaction.
Fig. 2. The process of building H2 and using H2 to filter out C2.
Based on Theorem 2, whether an item can be trimmed or
not depends on how many candidate itemsets in the current
transaction contain this item. The transaction trimming
module is based on the frequencies of all candidate itemsets
in an individual transaction. Therefore, we can handle
every transaction independently, regardless of other transactions in the database. A counter array a[] is used to record the number of times that each item in a transaction occurs in the candidate k-itemsets. That is, counter a[i] represents the frequency of the ith item in the transaction. If a candidate k-itemset is a subset of the transaction, the counters of the corresponding items that appear in this candidate itemset are increased by one. After comparing with all the candidate k-itemsets, if the value of a counter is less than k, the item in the transaction is trimmed, as shown in Fig. 3. For example, transaction TID = 100 has three items {A, C, D}. Counter a[0] represents A, a[1] represents C, and a[2] represents D. Transaction TID = 300 has four items {A, B, C, E}. Counter a[0] corresponds to A, a[1] corresponds to B, a[2] corresponds to C, and a[3] corresponds to E. After the comparison with the set of candidate 2-itemsets, the values of the counter array for TID = 100 are ⟨1, 1, 0⟩ and the values of the counter array for TID = 300 are ⟨1, 2, 3, 2⟩. Since all values of the counter array for TID = 100 are less than 2, all corresponding items are trimmed from transaction TID = 100. On the other hand, because the value of a[0] for TID = 300 is less than 2, item A is trimmed from that transaction. Therefore, transaction TID = 300 becomes {B, C, E}.
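The counter-array trimming of Theorem 2 can be sketched in software as follows; the function and variable names are illustrative, and the candidate set used is the filtered C2' from the example above.

```python
# Counter-array trimming per Theorem 2 (software sketch).
def trim(transaction, candidates, k):
    items = list(transaction)
    counts = [0] * len(items)                  # the counter array a[]
    for cand in candidates:
        if set(cand) <= set(items):            # candidate contained in the transaction
            for i, item in enumerate(items):
                if item in cand:
                    counts[i] += 1
    return [item for item, c in zip(items, counts) if c >= k]

c2_filtered = [{"A", "C"}, {"B", "C"}, {"B", "E"}, {"C", "E"}]
print(trim(["A", "C", "D"], c2_filtered, 2))       # [] -- every item is trimmed
print(trim(["A", "B", "C", "E"], c2_filtered, 2))  # ['B', 'C', 'E']
```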
4 HAPPI ARCHITECTURE
As noted earlier, Apriori-based hardware schemes have to
load candidate itemsets and the database into the hardware
to execute the comparison process. Too many candidate
itemsets and a huge database would cause a performance
bottleneck. To solve this problem, we propose the HAPPI
architecture to deal with efficient hardware-enhanced
association rule mining. We incorporate the pipeline
methodology into the HAPPI architecture to perform
pattern matching and collect useful information to reduce
the number of candidate itemsets and items in the database
simultaneously. In this way, HAPPI effectively solves the
bottleneck problem.
In Section 4.1, we introduce our system architecture. In
Section 4.2, the pipeline scheme of the HAPPI architecture is
presented. The transaction trimming scheme is given in
Section 4.3. Then, we describe the hardware design of hash
table filter in Section 4.4. Finally, we derive some properties
for performance evaluation in Section 4.5.
4.1 System Architecture
As shown in Fig. 4, the HAPPI architecture consists of a
systolic array, a trimming filter, and a hash table filter.
There are several hardware cells in the systolic array. Each
cell can perform the comparison operation. Based on the
comparison results, the cells update the support counters of
candidate itemsets and the occurrence frequencies of items
in the trimming information. A trimming filter then
removes infrequent items in the transactions according to
the trimming information. In addition, we build a hash
table by hashing itemsets generated by each transaction.
The hash table filter then prunes unsuitable candidate
itemsets.
To find frequent k-itemsets and generate candidate (k + 1)-itemsets efficiently, we devise five procedures in the
HAPPI architecture using the three hardware modules: the
systolic array, the trimming filter, and the hash table filter.
The procedures are support counting, transaction trimming,
hash table building, candidate generation, and candidate
pruning. The work flow is shown in Fig. 5. The support
counting procedure finds frequent itemsets by comparing
candidate itemsets with transactions in the database. By
loading candidate k-itemsets and streaming transactions into
Fig. 3. An example of transaction trimming.
Fig. 4. The HAPPI architecture: (a) systolic array, (b) trimming filter, and
(c) hash table filter.
the systolic array, the frequencies with which candidate itemsets occur in the transactions can be determined. Note that if the
number of candidate itemsets is larger than the number of
hardware cells in the systolic array, the candidate itemsets
are separated into several groups. Some of the candidate
itemsets are loaded into the hardware cells and the database
is fed into the systolic array. Afterward, the other candidate
itemsets are loaded into the systolic array one by one. To
complete the comparison with all the candidate itemsets, the
database has to be examined several times. To reduce the
overhead of repeated loading, we design two additional
hardware modules, namely, a trimming filter and a hash
table filter. Infrequent items in the database are eliminated
by the trimming filter, and the number of candidate itemsets
is reduced by the hash table filter. Therefore, the time
required for the support counting procedure can be effectively
reduced.
After all the candidate k-itemsets have been compared
with the transactions, their frequencies are sent back to the
system. The frequent k-itemsets can be obtained from the
candidate k-itemsets whose occurrence frequencies are
larger than the minimum support. While the transactions
are being compared with the candidate itemsets, the
corresponding trimming information is collected: the number of times each item in a transaction appears in the candidate itemsets is recorded and updated in the trimming information. After all the candidate itemsets have been compared with the database, the occurrence frequencies and
the corresponding transactions are then transmitted to the
trimming filter, and infrequent items are trimmed from
the transactions according to the occurrence frequencies in
the trimming information. Then, the hash table building
procedure generates (k + 1)-itemsets from the trimmed transactions. These (k + 1)-itemsets are hashed into the hash table for processing. Next, the candidate generation procedure is also executed by the systolic array. The frequent k-itemsets are fed into the systolic array for comparison with other frequent k-itemsets, and the candidate (k + 1)-itemsets are generated by the systolic injection and stalling techniques similar to those in [3]. The candidate pruning procedure uses the hash table to filter out candidate (k + 1)-itemsets that cannot be frequent. Then, the procedure reverts to the support counting procedure. The pruned candidate (k + 1)-itemsets are
loaded into the systolic array for comparison with
transactions that have been trimmed already. The above
five processes are executed repeatedly until all frequent
itemsets have been found.
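To summarize, the sketch below models one round of the five procedures in plain Python. It is a software analogue only: in HAPPI these steps are carried out by the systolic array, the trimming filter, and the hash table filter, and the hash function, bucket count, and simplified candidate generation here are assumptions made for illustration.

```python
from itertools import combinations

# Software analogue of one HAPPI round. In hardware, steps 1-2 run on the
# systolic array and trimming filter, step 3 on the hash table filter, and
# steps 4-5 again on the systolic array and hash table filter.
NUM_BUCKETS = 101                            # illustrative; the paper uses 65,536

def bucket(itemset):
    return hash(tuple(sorted(itemset))) % NUM_BUCKETS   # placeholder hash function

def happi_round(candidates_k, database, k, min_count):
    # 1) Support counting, collecting trimming information at the same time.
    counts = {c: 0 for c in candidates_k}
    trim_info = []
    for t in database:
        occ = {item: 0 for item in t}
        for c in candidates_k:
            if c <= t:                       # candidate contained in the transaction
                counts[c] += 1
                for item in c:
                    occ[item] += 1
        trim_info.append(occ)
    frequent_k = [c for c in candidates_k if counts[c] >= min_count]

    # 2) Transaction trimming (Theorem 2): drop items appearing in < k candidates.
    database = [{i for i in t if info[i] >= k}
                for t, info in zip(database, trim_info)]

    # 3) Hash table building from the (k+1)-itemsets of the trimmed transactions.
    h_next = [0] * NUM_BUCKETS
    for t in database:
        for comb in combinations(sorted(t), k + 1):
            h_next[bucket(comb)] += 1

    # 4) Candidate generation by merging frequent k-itemsets
    #    (simplified here: keep any union of size k+1).
    cand_next = {frozenset(a | b) for a in frequent_k for b in frequent_k
                 if len(a | b) == k + 1}

    # 5) Candidate pruning with the hash table filter.
    cand_next = [c for c in cand_next if h_next[bucket(c)] >= min_count]
    return frequent_k, cand_next, database

db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
c1 = [frozenset({i}) for i in "ABCDE"]
print(happi_round(c1, db, 1, 2))
```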
4.2 Pipeline Design
We observe that the transaction trimming and the hash table
building procedures are blocked by the support counting
procedure. The transaction trimming procedure has to
obtain trimming information to execute the trimming
process. However, this process cannot be completed until
the support counting procedure compares all the transac-
tions with all the candidate itemsets. In addition, the hash
table building procedure has to get the trimmed transactions
from the trimming filter after all the transactions have been
trimmed. This problem can be resolved by applying the
pipeline scheme, which utilizes the three hardware modules
simultaneously in the HAPPI framework. First, we divide
the database into Npipe parts. One part of the transactions in
the database is streamed into the systolic array and the
support counting process is performed on all candidate
itemsets. After comparing the transactions with all the
candidate itemsets, the transactions and their trimming
information are passed to the trimming filter first. The
systolic array then processes the next group of transactions.
After items have been trimmed from a transaction by the
trimming filter, the transaction is passed to the hash table
filter, as shown in Fig. 6, and the trimming filter can deal
with the next transaction. In this way, all the hardware
modules can be utilized simultaneously. Although the
pipelined architecture improves the system’s performance,
it increases the computational overhead because the candidate itemsets must be loaded into the systolic array multiple times.
The performance of the pipeline scheme and the improved
design of the HAPPI architecture are discussed in Section 4.5.
4.3 Transaction Trimming
While the support counting procedure is being executed, the
whole database is streamed into the systolic array. However,
not all the transactions are useful for generating frequent
itemsets. Therefore, we filter out items in the transactions
according to Theorem 2 so that the database is reduced. In
the HAPPI architecture, the trimming information records
the frequency of each item in a transaction that appears in
the candidate itemsets. The support counting and trimming
information collecting operations are similar since both need to compare candidate itemsets with transactions.
Therefore, in addition to transactions in the database, their
corresponding trimming information is also fed into the
systolic array in another pipe, while the support counting
process is being executed. As shown in Fig. 7, a trimming
vector is embedded in each hardware cell of the systolic
array to record items that are matched with candidate
Fig. 5. The procedure flow of one round.
Fig. 6. A diagram of the pipeline procedures.
itemsets. The ith flag in the trimming vector is set to true if
the ith item in the transaction matches the candidate itemset.
After comparing the candidate itemset with all the items in a
transaction, if the candidate itemset is a subset of the
transaction, the corresponding incoming trimming information will be accumulated according to the trimming vector.
Since transactions and trimming information are input in
different pipes, support counters and trimming information
can be updated simultaneously in a hardware cell.
In Fig. 7a, the candidate itemset ⟨BC⟩ is stored in the candidate memory, and a transaction {A, B, C, D, E} is about to be fed into the cell. The resultant trimming vector after comparing ⟨BC⟩ with all the items in the transaction is shown in Fig. 7b. Because items B and C match the candidate itemset, the trimming vector becomes ⟨0, 1, 1, 0, 0⟩. Meanwhile, the corresponding trimming information is fed into the trimming register, and the trimming information is updated from ⟨0, 1, 1, 0, 1⟩ to ⟨0, 2, 2, 0, 1⟩.
After passing through the systolic array, transactions
and their corresponding trimming information are passed
to the trimming filter. The filter trims off items whose
frequencies are less than k. As the example in Fig. 8 shows,
the trimming information of the transaction {A, B, C, D, E} is ⟨2, 2, 2, 1, 2⟩ and the current k is 2. Therefore, item D should be trimmed, and the new transaction becomes {A, B, C, E}. In this way, the size of the database can be
reduced. The trimmed transactions are sent to the hash
table filter module for hash table building.
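A minimal software analogue of the per-cell behavior described above is sketched below; the names are illustrative, and the hardware processes the transaction item by item as it streams through the cell rather than as a Python list.

```python
# One systolic cell (software sketch): build the trimming vector for a
# candidate itemset and accumulate it into the incoming trimming information
# when the candidate is contained in the transaction.
def cell_update(candidate, transaction, trimming_info):
    trimming_vector = [1 if item in candidate else 0 for item in transaction]
    if sum(trimming_vector) == len(candidate):     # candidate is a subset
        trimming_info = [a + b for a, b in zip(trimming_info, trimming_vector)]
    return trimming_info

transaction = ["A", "B", "C", "D", "E"]
incoming = [0, 1, 1, 0, 1]                          # trimming info from earlier candidates
print(cell_update({"B", "C"}, transaction, incoming))   # [0, 2, 2, 0, 1]
```

Run on the example of Fig. 7, it reproduces the trimming vector ⟨0, 1, 1, 0, 0⟩ and the updated trimming information ⟨0, 2, 2, 0, 1⟩.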
4.4 Hash Table Filtering
To build a hardware hash table filter, we use a hash value
generator and a hash table updating module. The former
generates all the k-itemset combinations of the transactions
and puts the k-itemsets into the hash function to create the
corresponding hash values. As shown in Fig. 9, the hash
value generator comprises a transaction memory, a state
machine, an index array, and a hash function. The
transaction memory stores all the items of a transaction.
The state machine is the controller that generates control
signals of different lengths (k = 2, 3, ...) flexibly. Then, the
control signals are fed into the index array. To generate a
k-itemset, the first k entries in the index array are utilized.
The values in the index array are the indices of the
transaction memory. The item selected by the ith entry of
the index array is the ith item in a k-itemset. By changing
the values in the index array, the state machine can generate
different combinations of k-itemsets from the transaction.
The procedure starts by loading a transaction into the
transaction memory. Then, the values in the index array are
reset, and the state machine starts to generate control
signals. The values in the index array are changed by the
different states. Each item in the generated itemset is passed
to the hash function through the multiplexer. The hash
function takes some bits from the incoming k-itemsets to
calculate the hash values.
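The enumeration performed by the index array can be mimicked in software as follows (a hedged sketch: the hardware uses a state machine rather than a loop, and the hash function shown is only a placeholder for the bit-selection hash).

```python
# Enumerate the k-itemset combinations of a transaction with an index array,
# then hash each one. The hash function is a placeholder.
def k_itemsets(transaction, k):
    n = len(transaction)
    index = list(range(k))                 # index array, initialized to 0 .. k-1
    while True:
        yield tuple(transaction[i] for i in index)
        pos = k - 1                        # advance to the next combination
        while pos >= 0 and index[pos] == n - k + pos:
            pos -= 1
        if pos < 0:
            return
        index[pos] += 1
        for j in range(pos + 1, k):
            index[j] = index[j - 1] + 1

def hash_value(itemset, num_buckets=65536):
    return hash(itemset) % num_buckets     # placeholder for the bit-selection hash

for itemset in k_itemsets(["A", "C", "E", "F", "G"], 3):
    print(itemset, hash_value(itemset))
```

Applied to the transaction {A, C, E, F, G} with k = 3, the generator yields ⟨ACE⟩, ⟨ACF⟩, ⟨ACG⟩, ⟨AEF⟩, ⟨AEG⟩, and so on, matching the example that follows.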
Fig. 7. An example of streaming a transaction and the corresponding
trimming information into the cell. (a) Stream a transaction into the cell.
(b) Stream trimming information into the cell.
Fig. 8. The trimming filter.
Fig. 9. The hash value generator.
Consider the example in Fig. 9. We assume the current k is 3, so the first three entries in the index array are used in this case. The transaction {A, C, E, F, G} is loaded into the transaction memory. The values in the index array are initialized to 0, 1, and 2, respectively, so that the first itemset generated is ⟨ACE⟩. Then, the state machine changes the values in the index array. The following values in the index array will be ⟨0, 1, 3⟩, ⟨0, 1, 4⟩, ⟨0, 2, 3⟩, ⟨0, 2, 4⟩, and so on. Therefore, the corresponding itemsets are ⟨ACF⟩, ⟨ACG⟩, ⟨AEF⟩, ⟨AEG⟩, and so on.
The hash values generated by the hash value generator
are passed to the hash table updating module. To speed up
the process of hash table building, we utilize Nparallel hash
value generators so that the hash values can be generated
simultaneously. In addition, the hash table is divided into
several parts to increase the throughput of hash table
building. Each part of the hash table contains a range of
hash values, and the controller passes the incoming hash
values to the buffer they belong to. These hash values are
taken as indexes of the hash table to accumulate the values
in the table, as shown in Fig. 10. There are four parallel hash
value generators. The size of the whole hash table is 65,536,
and it is divided into four parts. Thus, the range of each part
is 16,384. If the incoming hash value is 5, it belongs to the
first part of the hash table. The controller would pass the
value to buffer 1. If there are parallel accesses to the hash table at the same time, only one access can be executed; the others are delayed and handled as soon as possible. The delayed itemsets are stored in the buffer temporarily. Whenever the access port of the hash table is free, the delayed itemsets are put into the hash table.
After all the candidate k-itemsets have been generated,
they are pruned by the hash table filter. Each candidate
itemset is hashed by the hash function. By querying the
number of itemsets in the bucket with the corresponding
hash value, the candidate itemset is pruned if the number of
itemsets in the bucket does not meet the minimum support
criteria. Therefore, the number of the candidate itemsets can
be reduced effectively with the help of the hash table filter.
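A rough software analogue of the partitioned update is given below; the bucket count matches the 65,536-bucket table mentioned above, but the buffer handling and the sample hash values are illustrative assumptions.

```python
# Sketch of the partitioned hash table update: route each incoming hash value
# to the buffer of the partition it falls in, then drain the buffers into the
# table. Buffer handling and the sample values are illustrative.
NUM_BUCKETS = 65536
NUM_PARTS = 4
PART_SIZE = NUM_BUCKETS // NUM_PARTS        # 16,384 buckets per partition

hash_table = [0] * NUM_BUCKETS
buffers = [[] for _ in range(NUM_PARTS)]

def route(hash_value):
    buffers[hash_value // PART_SIZE].append(hash_value)

def drain():
    # Each partition has its own access port, so in hardware the four drains
    # can proceed concurrently; here they are sequential.
    for buf in buffers:
        while buf:
            hash_table[buf.pop()] += 1

for h in [5, 40000, 5, 16384, 65535]:
    route(h)
drain()
print(hash_table[5], hash_table[16384], hash_table[40000], hash_table[65535])  # 2 1 1 1
```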
4.5 Performance Analysis
In this section, we derive some properties of our system with
and without the pipeline scheme to investigate the total
execution time. Suppose the number of candidate k-itemsets is $N_{cand-k}$ and the number of frequent k-itemsets is $N_{freq-k}$. There are $N_{cell}$ hardware cells in the systolic array, $|T|$ represents the average number of items in a transaction, and $|D|$ is the total number of items in the database. As shown in Fig. 5, the time needed to find frequent k-itemsets and candidate (k + 1)-itemsets includes the time required for
support counting, transaction trimming, hash table building,
candidate generation, and candidate pruning.
First, the execution time of the support counting procedure is related to the number of times the candidate itemsets and the database are loaded into the systolic array. That is, if $N_{cand-k}$ is larger than $N_{cell}$, the candidate itemsets and the database must be input into the systolic array $\lceil N_{cand-k}/N_{cell} \rceil$ times. Each time, at most $N_{cell}$ candidate itemsets are loaded into the systolic array, so the number of items in these candidate k-itemsets is at most $k \times N_{cell}$. In addition, all items in the database need $|D|$ cycles to be streamed into the systolic array. Therefore, the execution cycle of the support counting procedure is at most
$$t_{sup} = \lceil N_{cand-k}/N_{cell} \rceil \times (k \times N_{cell} + |D|).$$
Second, the transaction trimming procedure eliminates
infrequent items and receives incoming items at the same
time. A transaction item and the corresponding trimming
information are fed into the trimming filter during each
cycle. After the whole database has been passed through the
trimming filter, the transaction trimming procedure is
finished. Thus, the execution cycle depends on the number
of items in the database:
$$t_{trim} = |D|.$$
Third, the hash table building procedure consists of the hash value generation and the hash table updating processes. Because these processes can be executed simultaneously, the execution time is determined by the process that generates the hash values. The execution time of hash value generation consists of the time taken by transaction loading and by hash value generation from transactions. The overall transaction loading time is $|D|$ cycles. In addition, there are $|T|$ items in a transaction on average. Thus, the number of (k + 1)-itemset combinations from a transaction is $C^{|T|}_{k+1}$, and each (k + 1)-itemset requires $(k + 1)$ cycles to be generated. The average number of transactions in the database is $|D|/|T|$. Therefore, the execution time of hash value generation from the whole database is $(k + 1) \times C^{|T|}_{k+1} \times \frac{|D|}{|T|}$ cycles. Because we have designed a parallel architecture, the procedure can be executed by $N_{parallel}$ hardware modules simultaneously. The execution cycle of the hash table building procedure is
$$t_{hash} = |D| + \left( (k + 1) \times C^{|T|}_{k+1} \times \frac{|D|}{|T|} \right) \times \frac{1}{N_{parallel}}.$$
The fourth procedure is candidate generation. Frequent k-itemsets are compared with other frequent k-itemsets to generate candidate (k + 1)-itemsets. The execution time of candidate generation consists of the time required to load frequent k-itemsets (at most $k \times N_{cell}$ each time), the time taken to pass frequent k-itemsets ($k \times N_{freq-k}$) through the systolic array, and the time needed to generate candidate (k + 1)-itemsets ($N_{cand-(k+1)}$). Similar to the support counting procedure, if $N_{freq-k}$ is larger than $N_{cell}$, the frequent k-itemsets have to be separated into several groups and the
Fig. 10. The parallel hash table building module.
comparison process has to be executed several times. Thus, the execution cycle is at most
$$t_{candidate\ generation} = \lceil N_{freq-k}/N_{cell} \rceil \times (k \times N_{cell} + k \times N_{freq-k}) + N_{cand-(k+1)}.$$
Finally, the candidate pruning procedure has to hash the candidate (k + 1)-itemsets and query the hash table $H_{k+1}$. Each (k + 1)-itemset requires $(k + 1)$ cycles to be generated, and the hash table can accept one input query during each cycle. Thus, the execution time needed to hash the candidate (k + 1)-itemsets ($(k + 1) \times N_{cand-(k+1)}$) and query the hash table ($N_{cand-(k+1)}$) is
$$t_{candidate\ pruning} = (k + 1) \times N_{cand-(k+1)} + N_{cand-(k+1)}.$$
Since the size of the database that we consider is much
larger than the number of the candidate k-itemsets, we can
neglect the execution time of candidate generation and
pruning. Therefore, the time required for one round of the
sequential execution tseq is the sum of the time taken by the
support counting, transaction trimming, and hash table
building procedures, as shown in Property 1.
Property 1. $t_{seq} = t_{sup} + t_{trim} + t_{hash}$.
The pipeline scheme incorporated in the HAPPI architecture divides the database into $N_{pipe}$ parts and inputs them into the three modules. However, this scheme causes some overhead $t_{overhead}$ because the candidate itemsets have to be reloaded multiple times in the support counting procedure:
$$t_{overhead} = \lceil N_{cand-k}/N_{cell} \rceil \times (k \times N_{cell}) \times N_{pipe}.$$
Therefore, the support counting procedure has to account for $t_{overhead}$. The execution time of the support counting procedure in the pipeline scheme becomes
$$t'_{sup} = t_{sup} + t_{overhead}.$$
The execution time of the pipeline scheme, $t_{pipe}$, is analyzed according to the following two cases:

Case 1. If the execution time of the support counting procedure is longer than that of the hash table building procedure, the other procedures finish their operations before the support counting procedure. However, the transaction trimming and hash table building procedures have to wait for the last part of the data from the support counting procedure. Therefore, the total execution time is $t'_{sup}$ plus the time required to process the last part of the database with the trimming filter and the hash table filter:
$$t_{pipe} = t'_{sup} + (t_{trim} + t_{hash}) \times \frac{1}{N_{pipe}}.$$
Case 2. If the execution time of the support counting procedure is less than that of the hash table building procedure, the other procedures are completed before the hash table building procedure. Since the hash table building procedure has to wait for the data from the support counting and transaction trimming procedures, the total execution time is $t_{hash}$ plus the time required to process the first part of the database with the support counting procedure and the trimming filter:
$$t_{pipe} = (t'_{sup} + t_{trim}) \times \frac{1}{N_{pipe}} + t_{hash}.$$
Summarizing the above two cases, the execution time $t_{pipe}$ can be presented as Property 2.

Property 2.
$$t_{pipe} = \max(t'_{sup}, t_{hash}) + \min(t'_{sup}, t_{hash}) \times \frac{1}{N_{pipe}} + t_{trim} \times \frac{1}{N_{pipe}}.$$
To achieve the minimal value of $t_{pipe}$, we consider the following two cases:

1. If $t'_{sup}$ is larger than $t_{hash}$, the execution time $t_{pipe}$ is dominated by $t'_{sup}$. To decrease the value of $t'_{sup}$, we can increase the number of hardware cells in the systolic array until $t'_{sup}$ is equal to $t_{hash}$. Therefore, the optimal value for $N_{pipe}$ to reach the minimal $t_{pipe}$ is
$$N_{pipe} = \sqrt{\frac{1}{\lceil N_{cand-k}/N_{cell} \rceil \times k \times N_{cell}} \times (t_{trim} + t_{hash})}.$$
2. If $t'_{sup}$ is smaller than $t_{hash}$, the execution time $t_{pipe}$ is mainly taken up by $t_{hash}$. To decrease the value of $t_{hash}$, we can increase $N_{parallel}$ until $t_{hash}$ is equal to $t'_{sup}$. As a result, the optimal value for $N_{pipe}$ to achieve the minimal $t_{pipe}$ in this case is
$$N_{pipe} = \frac{1}{\lceil N_{cand-k}/N_{cell} \rceil \times k \times N_{cell}} \times (t_{hash} - t_{sup}).$$
To decide the values of $N_{cell}$ and $N_{parallel}$ in the HAPPI architecture, we have to know the value of $N_{cand-k}$. However, $N_{cand-k}$ varies with different values of $k$. Therefore, we decide these values according to our experimental experience. Generally speaking, for many types of real-world data, the number of candidate 2-itemsets is the largest. Thus, we focus on the case $k = 2$, since its execution time is the largest of all the processes. For a data set with $|T| = 10$ and $|D| = 1$ million, if the minimum support is 0.2 percent, $N_{cand-k}$ is about 3,000. Assume that there are 500 hardware cells in the systolic array. To accelerate the hash table building procedure, we can increase $N_{parallel}$. Based on Property 2, the best value of $N_{parallel}$ is 4. In addition, after the transactions are trimmed by the trimming filter, we can get the current number of items in the database. Also, $N_{cand-k}$ can be obtained after the candidate k-itemsets are pruned by the hash table filter. Therefore, we can calculate the values of $t_{sup}$ and $t_{hash}$ before starting the support counting and the hash table building procedures. Since $t_{overhead}$ is less than the size of the database under consideration, $t_{sup}$ can be viewed as $t'_{sup}$. Based on the formulas derived above, we can get the best value of $N_{pipe}$ to minimize $t_{pipe}$. When the support counting procedure is dealing with candidate 2-itemsets and the hash table building procedure is about to build $H_3$, we divide the database into 30 parts, i.e., $N_{pipe} = 30$. By applying the pipeline scheme to these three hardware modules, the hardware utilization increases and the waste due to blocking is reduced.
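The cycle-count formulas above can also be evaluated numerically. The sketch below plugs the rough parameter values quoted in this section into Properties 1 and 2; it is meant only to show how the formulas are used, not to reproduce the exact figures of the hardware implementation.

```python
import math

# Evaluate the cycle-count formulas of Section 4.5 (software sketch). The
# parameter values below are the rough ones quoted in the text; they are used
# only to illustrate how Properties 1 and 2 are applied.
def t_sup(n_cand, n_cell, k, d):
    return math.ceil(n_cand / n_cell) * (k * n_cell + d)

def t_hash(d, t_avg, k, n_parallel):
    return d + (k + 1) * math.comb(t_avg, k + 1) * (d / t_avg) / n_parallel

def t_seq(n_cand, n_cell, k, d, t_avg, n_parallel):
    # Property 1: sequential execution time (t_trim = |D| = d).
    return t_sup(n_cand, n_cell, k, d) + d + t_hash(d, t_avg, k, n_parallel)

def t_pipe(n_cand, n_cell, k, d, t_avg, n_parallel, n_pipe):
    # Property 2: pipelined execution time, including the reloading overhead.
    overhead = math.ceil(n_cand / n_cell) * k * n_cell * n_pipe
    sup = t_sup(n_cand, n_cell, k, d) + overhead          # t'_sup
    hsh = t_hash(d, t_avg, k, n_parallel)
    return max(sup, hsh) + (min(sup, hsh) + d) / n_pipe

args = dict(n_cand=3000, n_cell=500, k=2, d=1_000_000, t_avg=10, n_parallel=4)
print("t_seq  =", t_seq(**args))
print("t_pipe =", t_pipe(**args, n_pipe=30))
```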
5 EXPERIMENT RESULTS
In this section, we conduct several experiments on a
number of synthetic data sets to evaluate the performance
of the HAPPI architecture. We also implement an approach
mainly based on [3], abbreviated as Direct Comparison
(DC) method, for comparison purposes. Although a hardware algorithm was also proposed in [4], its performance improvement is found to be much smaller than that of DC. HAPPI, in turn, outperforms DC by orders of magnitude. Moreover, we implement a software algorithm, DHP
[16], denoted by SW_DHP, as the baseline. The software
algorithm is executed on a PC with a 3-GHz Pentium 4 CPU and 1 Gbyte of RAM.
Both HAPPI and DC are implemented on the Altera
Stratix 1S40 FPGA board with a 50-MHz clock rate and 10 Mbytes of SDRAM. The hardware modules are coded in
Verilog. We use ModelSim to simulate Verilog codes and
verify the functions of our design. In addition, we use Altera
Quartus II IDE to build hardware modules and synthesize
modules into hardware circuits. Finally, the hardware
circuit image is sent to the FPGA board. There is a synthesized Nios II CPU on the FPGA board, on which we implement a software program to verify the results and record the hardware execution time. First, the program downloads data from the database into the memory on the FPGA. Then, the data is fed into the hardware modules. The bandwidth
consumed from the memory to the chip is 16 bits/cycle in
our design. Since the data is transferred on the bus, the
maximum bandwidth is limited by the bandwidth of the bus
on the FPGA, which is generally 32 bits/cycle. The bandwidth can be upgraded to 64 bits/cycle on some modern FPGA boards. After the execution of the hardware modules, we can
acquire outcomes and execution cycles. The following
experimental results are based on execution cycles on the FPGA board. In addition, in our hardware design, the critical path is in the hash table building module. There is much combinational logic in this module, so the synthesized hardware core is complex. This bounds the maximum clock frequency of our hardware design to 58.6 MHz. However, since the operating clock frequency of the Altera Stratix 1S40 board is 50 MHz, our design meets the hardware requirement. In our future work, we are going to increase
the clock frequency of the hardware architecture. We will
try to optimize the bottleneck module.
In the hardware implementation of the HAPPI architec-
ture, Nparallel is set to 4. The number of hardware cells in the systolic array is 500, there are 65,536 buckets in the hash table, and Npipe is assigned according to the methodology
in Section 4.5. The method used to generate synthetic data is
described in Section 5.1. The performance comparison of
several schemes in the HAPPI architecture and DC are
discussed in Section 5.2. Section 5.3 presents the performance analysis of different distributions of frequent itemsets. Finally, in Section 5.4, the results of some scale-up
experiments are discussed.
5.1 Generation of Synthetic Data
To obtain reliable experimental results, we employ similar
methods to those used in [16] to generate synthetic data sets.
These data sets are generated with the following parameters:
T represents the average number of items in a transaction of
the database, I denotes the average length of the maximal potentially frequent itemsets, D is the number of transactions in the database, L is the number of maximal potentially
frequent itemsets, and N is the number of different items in
the database. Table 1 summarizes the parameters used in our
experiments. In the following experiment data sets, L is set
to 3,000. To evaluate the performance of HAPPI, we conduct
several experiments with different data sets. We use TxIyDz
to represent that T ¼ x, I ¼ y, and D ¼ z. In addition, the
sensitivity analysis and the scale-up experiments are also
explored with different data sets. Note that the y-axis of the
following figures is the execution cycle in logarithmic scale.
5.2 Performance Evaluation
Initially, we conduct experiments to evaluate the performance of several schemes in the HAPPI architecture and
DC. The testing data sets are T10I4D100 with different
numbers of items in the database. The minimum support is
set to 0.5 percent. As shown in Fig. 11, the four different
schemes are
1. the DC scheme,
2. the systolic array with a trimming filter,
3. the combined scheme made up of the systolic array,
the trimming filter and the hash table filter, and
4. the overall HAPPI architecture with the pipeline
design.
TABLE 1
Summary of the Parameters Used
Fig. 11. The execution cycles of several schemes.
With the trimming filter, the execution time improves by about 10 percent to 70 percent compared to DC. As the
number of different items in the database increases, the
improvement due to the trimming filter increases. The
reason is that, if the number of different items grows, the
number of infrequent itemsets also increases. Therefore, the
filtering rate by utilizing the trimming filter is more
remarkable. Moreover, the combined scheme with the help of the hash table filter is about 25 to 51 times faster than DC. The HAPPI architecture with the pipeline
scheme is 47-122 times better than DC. Note that the main
improvement of the execution time results from the hash
table filter. We not only implemented an efficient hardware
module for the hash table filter in the HAPPI architecture
but also designed the pipeline scheme to let the hash table
filter work together with other hardware modules. The
pipeline and the parallel design are two helpful properties
of the hardware architecture. Therefore, we utilize these
hardware design skills to accelerate the overall system. As
shown in Fig. 11, although the combined scheme already provides a large performance boost, the overall HAPPI architecture with the pipeline design improves the performance further. In summary, the HAPPI architecture
outperforms DC, especially when the number of different
items in the database is large.
5.3 Sensitivity Analysis
In the second experiment, we generate several data sets
with different distributions of frequent itemsets to examine
the sensitivity of the HAPPI architecture. The experiment
results of several synthetic data sets with various minimum supports are shown in Figs. 12 and 13. The results
show that no matter what combination of the parameters
T and I is used, the HAPPI architecture consistently
outperforms DC and SW_DHP. Specifically, the execution
time of HAPPI is less than that of DC and that of SW_DHP
by several orders of magnitude. The margin grows as the
minimum support increases. As Fig. 12 shows, HAPPI has
a better performance enhancement ratio to DC on the
T5I2D100K data set than on the T20I8D100K data set. The
reason is that the hash table filter is especially effective in
eliminating infrequent candidate 2-itemsets, as reported in
the experimental results of DHP [16]. Thus, SW_DHP also
has better performance with these two data sets. In
addition, the execution time $t_{pipe}$ is mainly related to the number of times, $\lceil N_{cand-k}/N_{cell} \rceil$, that the database is reloaded. Since $N_{cand-k}$ can be substantially reduced
when k is small, the overall performance enhancement is
remarkable. Since the average size of the maximal
potentially frequent itemsets of the data set T20I8D100K
is 8, the performance of the HAPPI architecture is only
2.65 times faster. However, most data in the real world
contains short frequent itemsets. That is, T and I are small.
It is noted that DC provides only a small improvement over SW_DHP on short frequent itemsets, while HAPPI performs considerably better. Therefore, the HAPPI
architecture can perform well with real-world data sets.
Fig. 13 demonstrates that the improvement of the HAPPI
architecture over DC becomes more noticeable with increasing minimum support. This is because long itemsets are eliminated when the minimum support is large.
Therefore, the improvement due to the hash table filter
increases. In comparison with DC, the overall performance is
outstanding.
5.4 Scale-up Experiment
According to the performance analysis in Section 4.5, the
execution time $t_{pipe}$ is mainly related to the number of times, $\lceil N_{cand-k}/N_{cell} \rceil$, that the database is reloaded into the systolic array. Therefore, if $N_{cell}$ increases, less time is needed to stream the database and the overall execution time decreases. Fig. 14 illustrates the scaling
performance of the HAPPI architecture and DC, where
the y-axis is also in logarithmic scale. The execution cycles of
both HAPPI and DC decrease linearly with the increasing number of hardware cells. We observe that HAPPI outperforms DC for different numbers of hardware cells in the
systolic array. The most important result is that we only
utilize 25 hardware cells in the systolic array but achieve the
same performance as the 800 hardware cells used in DC.
The benefit of the HAPPI architecture is more computing
power with lower costs for data mining in hardware design.
We also conduct experiments with different numbers of
transactions in the synthetic data sets to explore the
scalability of the HAPPI architecture. The generated data
sets are T10I4, and the minimum support is set to
0.5 percent. As shown in Fig. 15, the execution time of
HAPPI increases linearly as the number of transactions in
the synthetic data sets increases. HAPPI outperforms DC for different numbers of transactions in the
database. Furthermore, Fig. 15 shows the good scalability of
HAPPI and DC. This feature is especially important because
Fig. 12. The execution time of data sets with different T and I.
Fig. 13. The execution cycles with various minimum supports.
the size of applications is growing much faster than CPU speeds. Thus, hardware-enhanced data mining techniques are imperative.
6 CONCLUSION
In this work, we have proposed the HAPPI architecture for
hardware-enhanced association rule mining. The bottleneck
of Apriori-based hardware schemes is related to the number of candidate itemsets and the size of the database. To solve this problem, we apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and collect useful information to reduce the number of candidate itemsets and items in the database simultaneously. HAPPI can prune infrequent items in the transactions and reduce the size of the database gradually by
utilizing the trimming filter. In addition, HAPPI can
effectively eliminate infrequent candidate itemsets with
the help of the hash table filter. Therefore, the bottleneck of
Apriori-based hardware schemes can be solved by the
HAPPI architecture. Moreover, we derive some properties
to analyze the performance of HAPPI. We conduct a
sensitivity analysis of various parameters to show many
insights into the HAPPI architecture. HAPPI outperforms
the previous approach, especially with the increasing
number of different items in the database, and with the
increasing minimum support values. Also, HAPPI increases
computing power and saves the costs of data mining in
hardware design as compared to the previous approach.
Furthermore, HAPPI possesses good scalability.
ACKNOWLEDGMENTS
The authors would like to thank Wen-Tsai Liao at Realtek
for his helpful comments to improve this paper. The work
was supported in part by the National Science Council of
Taiwan under Contract NSC93-2752-E-002-006-PAE.
REFERENCES
[1] R. Agarwal, C. Aggarwal, and V. Prasad, “A Tree Projection
Algorithm for Generation of Frequent Itemsets,” J. Parallel and
Distributed Computing, 2000.
[2] R. Agrawal and R. Srikant, “Fast Algorithms for Mining
Association Rules,” Proc. 20th Int’l Conf. Very Large Databases
(VLDB), 1994.
[3] Z.K. Baker and V.K. Prasanna, “Efficient Hardware Data Mining
with the Apriori Algorithm on FPGAS,” Proc. 13th Ann. IEEE
Symp. Field-Programmable Custom Computing Machines (FCCM),
2005.
[4] Z.K. Baker and V.K. Prasanna, “An Architecture for Efficient
Hardware Data Mining Using Reconfigurable Computing Sys-
tems,” Proc. 14th Ann. IEEE Symp. Field-Programmable Custom
Computing Machines (FCCM ’06), pp. 67-75, Apr. 2006.
[5] C. Besemann and A. Denton, “Integration of Profile Hidden
Markov Model Output into Association Rule Mining,” Proc. 11th
ACM SIGKDD Int’l Conf. Knowledge Discovery in Data Mining (KDD
’05), pp. 538-543, 2005.
[6] C.W. Chen, J. Luo, and K.J. Parker, “Image Segmentation via
Adaptive K-Mean Clustering and Knowledge-Based Morphologi-
cal Operations with Biomedical Applications,” IEEE Trans. Image
Processing, vol. 7, no. 12, pp. 1673-1683, 1998.
[7] S.M. Chung and C. Luo, “Parallel Mining of Maximal Frequent
Itemsets from Databases,” Proc. 15th IEEE Int’l Conf. Tools with
Artificial Intelligence (ICTAI), 2003.
[8] S. Cong, J. Han, J. Hoeflinger, and D. Padua, “A Sampling-Based
Framework for Parallel Data Mining,” Proc. 10th ACM SIGPLAN
Symp. Principles and Practice of Parallel Programming (PPoPP ’05),
June 2005.
[9] M. Estlick, M. Leeser, J. Szymanski, and J. Theiler, “Algorithmic
Transformations in the Implementation of K-Means Clustering on
Reconfigurable Hardware,” Proc. Ninth Ann. IEEE Symp. Field-
Programmable Custom Computing Machines (FCCM), 2001.
[10] M. Gokhale, J. Frigo, K. McCabe, J. Theiler, C. Wolinski, and D.
Lavenier, “Experience with a Hybrid Processor: K-Means Cluster-
ing,” J. Supercomputing, pp. 131-148, 2003.
[11] J. Han and M. Kamber, Data Mining: Concepts and Techniques.
Morgan Kaufmann, 2001.
[12] J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without
Candidate Generation,” Proc. ACM SIGMOD ’00, pp. 1-12, May
2000.
[13] H. Kung and C. Leiserson, “Systolic Arrays for VLSI,” Proc. Sparse
Matrix, 1976.
[14] N. Ling and M. Bayoumi, Specification and Verification of Systolic
Arrays. World Scientific Publishing, 1999.
[15] W.-C. Liu, K.-H. Liu, and M.-S. Chen, “High Performance Data
Stream Processing on a Novel Hardware Enhanced Framework,”
Proc. 10th Pacific-Asia Conf. Knowledge Discovery and Data Mining
(PAKDD ’06), Apr. 2006.
[16] J.S. Park, M.-S. Chen, and P.S. Yu, “An Effective Hash Based
Algorithm for Mining Association Rules,” Proc. ACM SIGMOD
’95, pp. 175-186, May 1995.
[17] J.S. Park, M.-S. Chen, and P.S. Yu, “Using a Hash-Based Method
with Transaction Trimming for Mining Association Rules,” IEEE
Trans. Knowledge and Data Eng., vol. 9, no. 5, pp. 813-825, Sept./
Oct. 1997.
[18] A. Savasere, E. Omiecinski, and S. Navathe, “An Efficient
Algorithm for Mining Association Rules in Large Databases,”
Proc. 21st Int’l Conf. Very Large Databases (VLDB ’95), pp. 432-444,
Sept. 1995.
[19] H. Toivonen, “Sampling Large Databases for Association Rules,”
Proc. 22nd Int’l Conf. Very Large Databases (VLDB ’96), pp. 134-145,
1996.
Fig. 14. The execution cycles with different numbers of hardware units.
Fig. 15. The execution cycles with various numbers of transactions, z(K).
[20] C. Wolinski, M. Gokhale, and K. McCabe, “A Reconfigurable
Computing Fabric,” Proc. Int’l Conf. Eng. of Reconfigurable Systems
and Algorithms (ERSA), 2004.
Ying-Hsiang Wen received the BS degree in
computer science from the National Chiao Tung
University and the MS degree in electrical
engineering from the National Taiwan University, Taipei, in 2006. His
research interests include data mining, video streaming, and multimedia
SoC design.
Jen-Wei Huang received the BS degree in
electrical engineering from the National Taiwan
University, Taipei, in 2002, where he is currently
working toward the PhD degree in computer
science. His research interests include data mining, mobile computing,
and bioinformatics, with particular emphasis on web mining, incremental
mining, mining data streams, time series issues, and sequential pattern
mining. In addition, some of his research is on mining general temporal
association rules, sequential clustering, data broadcasting, progressive
sequential pattern mining, and bioinformatics.
Ming-Syan Chen received the BS degree in
electrical engineering from the National Taiwan
University, Taipei, and the MS and PhD degrees
in computer, information, and control engineering from the University of
Michigan, Ann Arbor,
in 1985 and 1988, respectively. He was the
chairman of the Graduate Institute of Communication Engineering (GICE),
National Taiwan
University, from 2003 to 2006. He is currently a
distinguished professor jointly appointed by the
Electrical Engineering Department, Computer Science and Information
Engineering Department, and GICE, National Taiwan University. He
was a research staff member at IBM T.J. Watson Research Center, New
York, from 1988 to 1996. He served as an associate editor of the IEEE
Transactions on Knowledge and Data Engineering from 1997 to 2001
and is currently on the editorial board of the Very Large Data Base
(VLDB) Journal and Knowledge and Information Systems. His research
interests include database systems, data mining, mobile computing
systems, and multimedia networking. He has published more than
240 papers in his research areas. He is a recipient of the National
Science Council (NSC) Distinguished Research Award, Pan Wen Yuan
Distinguished Research Award, Teco Award, Honorary Medal of
Information, and K.-T. Li Research Breakthrough Award for his research
work, as well as the IBM Outstanding Innovation Award for his
contribution to a major database product. He also received numerous
awards for his teaching, inventions, and patent applications. He is a
fellow of the ACM and the IEEE.
The confidence of the rule X¼)Y is given by c percent ¼ suppððX [ Y Þ suppðXÞ Ã 100 percent: A typical example of an association rule is that 80 percent of customers who purchase beef steak and goose liver paste would also prefer to buy bottles of red wine. Once we have found all frequent itemsets that meet the minimum support requirement, calculation of confidence for each rule is trivial. Therefore, we only need to focus on the methods of finding the frequent itemsets in the database. The Apriori [2] approach was the first to address this issue. Apriori finds frequent itemsets by scanning a database to check the frequencies of candidate itemsets, which are generated by merging frequent subitemsets. However, Apriori-based algorithms have undergone bottlenecks because they have too many candidate itemsets. DHP [16] proposed a hash table scheme, which effectively reduces the number of candidate itemsets. In addition, several mining techniques, such as TreeProjection [1], the FP-growth algorithm [12], partitioning [18], sampling [19], and the Hidden Markov Model [5] have also received a significant amount of research attention. With the increasing amount of data, it is important to develop more efficient algorithms to extract knowledge from the data. However, the volume of data size is increasing much faster than CPU execution speeds, which has a strong influence on the performance of software algorithms. Several works [7], [8] have proposed parallel computing schemes to execute operations simultaneously 784 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 20, NO. 6, JUNE 2008 . The authors are with the National Taiwan University, 106, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan. E-mail: {winshung, jwhuang}@arbor.ee.ntu.edu.tw, mschen@cc.ee.ntu.edu.tw. Manuscript received 25 Feb. 2007; revised 9 Aug. 2007; accepted 8 Oct. 2007; published online 11 Feb. 2008. For information on obtaining reprints of this article, please send e-mail to: tkde@computer.org, and reference IEEECS Log Number TKDE-0086-0207. Digital Object Identifier no. 10.1109/TKDE.2008.39. 1041-4347/08/$25.00 ß 2008 IEEE Published by the IEEE Computer Society
  • 2. on multiprocessors. The performance, however, cannot improve linearly as the number of the parallel nodes grows. Therefore, some researchers have tried to use hardware devices to accomplish data mining tasks. In [15], Liu et al. proposed a parallel matrix hardware architecture, which can efficiently generate candidate 2-itemsets, for high- throughput data stream applications. Baker and Prasanna [3], [4] designed scalable hardwares for association rule mining by utilizing the systolic array proposed in [13] and [14]. The architecture utilizes parallel computing techniques to execute a large number of pattern matching operations at the same time. Other hardware architectures [6], [9], [10], [20] have been designed to speed up the K-means clustering algorithm. Generally speaking, Apriori-based hardware schemes require loading the candidate itemsets and the database into the hardware. Since the capacity of the hardware is fixed, if the number of items in the database is larger than the hardware capacity, the data items must be loaded sepa- rately. Therefore, the process of comparing candidate item- sets with the database needs to be executed several times. Similarly, if the number of candidate itemsets is larger than the capacity of the hardware, the pattern matching proce- dure has to be separated into many rounds. Clearly, it is infeasible for any hardware design to load the candidate itemsets and the database into hardware for multiple times. Since the time complexity of those steps that need to load candidate itemsets or database items into the hardware is in proportion to the number of candidate itemsets and the number of items in the database, this procedure is very time consuming. In addition, numerous candidate itemsets and a huge database may cause a bottleneck in the system. In this paper, we propose a HAsh-based and PiPelIned (abbreviated as HAPPI) architecture for hardware-enhanced association rule mining. That is, we identify certain parts of the mining process that is suitable and will benefit from hardware implementation and perform hardware-enhanced mining. Explicitly, we incorporate the pipeline methodology into the HAPPI architecture to compare itemsets and collect useful information that enables us to reduce the number of candidate itemsets and items in the database simulta- neously. As shown in Fig. 1, there are three hardware modules in our system. First, when the database is fed into the hardware, the candidate itemsets are compared with the items in the database by the systolic array. Candidate itemsets that have a higher frequency than the minimum support value are viewed as frequent itemsets. Second, we determine the frequency that each item occurs in the candidate itemsets in the transactions at the same time. These frequencies are called trimming information. From this information, infrequent items in the transactions can be eliminated since they are not useful in generating frequent itemsets through the trimming filter. Third, we generate itemsets from transactions and hash them into the hash table, which is then used to filter out unnecessary candidate itemsets. After the hardware compares candidate itemsets with the items in the database, the trimming information is collected and the hash table is built. The useful information helps us to reduce the number of items in the database and the number of candidate itemsets. 
Based on the trimming information, items are trimmed if their corresponding occurrence frequencies are not larger than the length of the current candidate itemsets. In addition, after the candidate itemsets are generated by merging frequent subitemsets, they are sent to the hash table filter. If the number of itemsets in the corresponding bucket of the hash table is less than the minimum support, the candidate itemsets are pruned. As such, HAPPI solves the bottleneck problem mentioned earlier by the cooperation of these three hardware modules. To achieve these goals, we devise the following five procedures in the HAPPI architecture: support counting, transaction trimming, hash table building, candidate gen- eration, and candidate pruning. Moreover, we derive several formulas to decide the optimal design in order to reduce the overhead induced by the pipeline scheme and the ideal number of hardware modules to achieve the best utilization. The execution time between sequential processing and pipeline processing is also analyzed in this paper. We conduct several experiments to evaluate the perfor- mance of the HAPPI architecture. In addition, we implement the work of Baker and Prasanna [3] and a software algorithm DHP [16] for comparison purposes. The experiment results show that HAPPI outperforms the previous approach on execution time significantly, especially when the number of items in the database is large and the minimum support value increases. Moreover, the performance of HAPPI is better than that of the previous approach [3] when the systolic array contains different numbers of hardware cells. In fact, by using only 25 hardware cells in the systolic array, we can achieve the same performance as more than 800 hardware cells in the previous approach. The advantages of the HAPPI architecture are that it has more computing power and saves the space costs for mining association rules in hardware design. The scale-up experiment also shows that HAPPI outperforms the previous approach on different numbers of transactions in the database. Indeed, our architecture is a good example to demonstrate the metho- dology of performance enhancement by hardware. We implement our architecture on a commercial FPGA board. It is easily to be realized in a custom ASIC. With the progress in IC process technology, the performance of HAPPI will further be improved. In view of the fast increase in the amount of data in various emerging mining applications (e.g., network application mining, data stream mining, and bioinformatics data mining), it is envisioned that hardware- enhanced mining is an important research direction to explore for future data mining tasks. The remainder of the paper is organized as follows: We discuss related works in Section 2. The preliminaries are presented in Section 3. The HAPPI architecture is described in Section 4. Next, we show several experiments conducted on HAPPI in Section 5. Finally, we present our conclusions in Section 6. WEN ET AL.: HARDWARE-ENHANCED ASSOCIATION RULE MINING WITH HASHING AND PIPELINING 785 Fig. 1. System architecture.
  • 3. 2 RELATED WORKS In this section, we discuss two previous works that use a systolic array architecture to enhance the performance of data mining. The Systolic Process Array (SPA) architecture is pro- posed in [10] to perform K-means clustering. SPA accel- erates the processing speed by utilizing several hardware cells to calculate the distances in parallel. Each cell corresponds to a cluster and stores the centroid of the cluster in local memory. The data flows linked by each cell include the data object, the minimum distance between the object and its closest centroid, and the closest centroid of the object. The cell computes the distance between the centroid and the input data object. Based on the resulting distance, the cell updates the minimum distance and the closest centroid of the data object. Therefore, the system can obtain the closest centroid of each object, respectively, from SPA. The centroids are recomputed and updated by the system, and the new centroids are sent to the cells. The system continuously updates clustering results. In [3], the authors implemented a systolic array with several hardware cells to speed up the Apriori algorithm. Each cell performs an ALU (larger than, smaller than, or equal to) operation, which compares the incoming item with items in the memory of the cell. This operation generates frequent itemsets by comparing candidate item- sets with the items in the database. Since all the cells can execute their own operations simultaneously, the perfor- mance of the architecture is better than that of a single processor. However, the number of cells in the systolic array is fixed. If the number of candidate itemsets is larger than the number of hardware cells, the pattern matching procedure has to be separated into many rounds. It is infeasible to load candidate itemsets and the database into the hardware for multiple times. As reported in [3], the performance is only about four times faster than some software algorithms. Hence, there is much room to improve the execution time. 3 PRELIMINARIES The hash table scheme proposed in DHP [16] improves the performance of Apriori-based algorithms by filtering out infrequent candidate itemsets. In addition, DHP employs an effective pruning scheme to eliminate infrequent items in transactions. We summarize these two schemes below. In the hash table scheme, a hash function is applied to all of candidate k-itemsets generated by frequent subitemsets. Each candidate k-itemset is mapped to a hash value, and itemsets with the same hash value are put into the same bucket of the hash table. If the number of the candidate itemsets in the bucket is less than the minimum support threshold, the number of these candidate itemsets in the database is less than the minimum support threshold. As a result, these candidate itemsets cannot be frequent and are removed from the system. On the other hand, if the number of the candidate itemsets in the bucket is larger than the minimum support threshold, the itemsets are carried to real frequency testing process by scanning the database. The hash table for filtering candidate k-itemsets Hk is built by hashing the k-itemsets generated by each transac- tion. A hash table contains n buckets, where n is an arbitrary number. When an itemset is hashed to the bucket i, the number of itemsets in the bucket is increased by one. The number of itemsets in each bucket represents the accumu- lated frequency of the itemsets whose hash values are assigned to that bucket. 
After candidate k-itemsets have been generated, they are hashed and assigned to buckets of Hk. If the number of itemsets in a bucket is less than the minimum support, candidate itemsets in this bucket are removed. The example in Fig. 2 demonstrates how to build H2 and how to use it to filter out candidate 2-itemsets. After we scan the transaction TID ¼ 100, AC , AD , and CD are hashed to the buckets. According to the hash function shown in Fig. 2, the hash values of AC , AD , and CD are 6, 0, and 6, respectively. As a result, the number of itemsets in the buckets indexed by 6, 0, and 6 is increased by one. After all the transactions in the database have been scanned, frequent 1-itemsets are found, i.e., L1 ¼ fA; B; C; Eg. In addition, the number of itemsets in the buckets of H2 are 3; 1; 2; 0; 3; 1; 3 , and the minimum support frequency is 2. Thus, the candidate 2-itemsets in buckets 1, 3, and 5 should be pruned. If we generate candidate 2-itemsets from L1 Ã L1 directly, the original set of candidate 2-itemsets C2 is f AB ; AC ; AE ; BC ; BE ; CE g: After filtering out unnecessary candidate itemsets by checking H2, the new C0 2 becomes f AC ; BC ; BE ; CE g: Therefore, the number of candidate itemsets can be reduced. The pruning scheme which is able to filter out infrequent items in the transactions will be implemented in hardware. The theoretical backgrounds of the pruning scheme are based on the following two theorems which were presented in [17]: Theorem 1. A transaction can only be used to support the set of frequent ðk þ 1Þ-itemsets if it consists of at least ðk þ 1Þ candidate k-itemsets. Theorem 2. An item in a transaction can be trimmed if it does not appear in at least k of the candidate k-itemsets contained in the transaction. 786 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 20, NO. 6, JUNE 2008 Fig. 2. The process of building H2 and using H2 to filter out C2.
  • 4. Based on Theorem 2, whether an item can be trimmed or not depends on how many candidate itemsets in the current transaction contain this item. The transaction trimming module is based on the frequencies of all candidate itemsets in an individual transaction. Therefore, we can handle every transaction independently regardless of other trans- actions in the database. A counter array a½ Š is used to record the number of times that each item in a transaction occurs in the candidate k-itemsets. That is, counter a½iŠ represents the frequency of ith item in the transaction. If a candidate k-itemset is a subset of the transaction, the numbers in the counter array of the corresponding items that appear in this candidate itemset are increased by one. After comparing with all the candidate k-itemsets, if the value of a counter is less than k, the item in the transaction is trimmed, as shown in Fig. 3. For example, transaction TID ¼ 100 has three items fA; C; Dg. Counter a½0Š represents A, a½1Š represents C, and a½2Š represents D. Transaction TID ¼ 300 has four items fA; B; C; Eg. Counter a½0Š corresponds to A, a½1Š corresponds to B, a½2Š corresponds to C, and a½3Š corre- sponds to E, respectively. After the comparison with the set of candidate 2-itemsets, the values of the counter array for TID ¼ 100 are 1; 1; 0 and the values of the counter array for TID ¼ 300 are 1; 2; 2; 2 . Since all values of the counter array for TID ¼ 100 are less than 2, all correspond- ing items are trimmed from the transaction TID ¼ 100. On the other hand, because the value of a½0Š for TID ¼ 300 is less than 2, item A is trimmed from the transaction. Therefore, transaction TID ¼ 300 becomes fB; C; Eg. 4 HAPPI ARCHITECTURE As noted earlier, Apriori-based hardware schemes have to load candidate itemsets and the database into the hardware to execute the comparison process. Too many candidate itemsets and a huge database would cause a performance bottleneck. To solve this problem, we propose the HAPPI architecture to deal with efficient hardware-enhanced association rule mining. We incorporate the pipeline methodology into the HAPPI architecture to perform pattern matching and collect useful information to reduce the number of candidate itemsets and items in the database simultaneously. In this way, HAPPI effectively solves the bottleneck problem. In Section 4.1, we introduce our system architecture. In Section 4.2, the pipeline scheme of the HAPPI architecture is presented. The transaction trimming scheme is given in Section 4.3. Then, we describe the hardware design of hash table filter in Section 4.4. Finally, we derive some properties for performance evaluation in Section 4.5. 4.1 System Architecture As shown in Fig. 4, the HAPPI architecture consists of a systolic array, a trimming filter, and a hash table filter. There are several hardware cells in the systolic array. Each cell can perform the comparison operation. Based on the comparison results, the cells update the support counters of candidate itemsets and the occurrence frequencies of items in the trimming information. A trimming filter then removes infrequent items in the transactions according to the trimming information. In addition, we build a hash table by hashing itemsets generated by each transaction. The hash table filter then prunes unsuitable candidate itemsets. 
To find frequent k-itemsets and generate candidate ðk þ 1Þ-itemsets efficiently, we devise five procedures in the HAPPI architecture using the three hardware modules: the systolic array, the trimming filter, and the hash table filter. The procedures are support counting, transaction trimming, hash table building, candidate generation, and candidate pruning. The work flow is shown in Fig. 5. The support counting procedure finds frequent itemsets by comparing candidate itemsets with transactions in the database. By loading candidate k-itemsets and streaming transactions into WEN ET AL.: HARDWARE-ENHANCED ASSOCIATION RULE MINING WITH HASHING AND PIPELINING 787 Fig. 3. An example of transaction trimming. Fig. 4. The HAPPI architecture: (a) systolic array, (b) trimming filter, and (c) hash table filter.
  • 5. the systolic array, the frequencies that candidate itemsets occur in the transactions can be determined. Note that if the number of candidate itemsets is larger than the number of hardware cells in the systolic array, the candidate itemsets are separated into several groups. Some of the candidate itemsets are loaded into the hardware cells and the database is fed into the systolic array. Afterward, the other candidate itemsets are loaded into the systolic array one by one. To complete the comparison with all the candidate itemsets, the database has to be examined several times. To reduce the overhead of repeated loading, we design two additional hardware modules, namely, a trimming filter and a hash table filter. Infrequent items in the database are eliminated by the trimming filter, and the number of candidate itemsets is reduced by the hash table filter. Therefore, the time required for support counting procedure can be effectively reduced. After all the candidate k-itemsets have been compared with the transactions, their frequencies are sent back to the system. The frequent k-itemsets can be obtained from the candidate k-itemsets whose occurrence frequencies are larger than the minimum support. While the transactions are being compared with the candidate itemsets, the corresponding trimming information is collected. The occurrence frequency of each item, which is contained in the candidate itemsets in the transactions, is recorded and updated to the trimming information. After comparing candidate itemsets with the database, the trimming information is collected. The occurrence frequencies and the corresponding transactions are then transmitted to the trimming filter, and infrequent items are trimmed from the transactions according to the occurrence frequencies in the trimming information. Then, the hash table building procedure generates ðk þ 1Þ-itemsets from the trimmed transactions. These ðk þ 1Þ-itemsets are hashed into the hash table for processing. Next, the candidate generation procedure is also executed by the systolic array. The frequent k-itemsets are fed into the systolic array for comparison with other frequent k-itemsets. The candidate ðk þ 1Þ-itemsets are generated by the systolic injection and stalling techniques similar to [3]. The candidate pruning procedure uses the hash table to filter candidate ðk þ 1Þ-itemsets that are not possible to be frequent itemsets. Then, the procedure reverts to the support counting procedure. The pruned candidate ðk þ 1Þ-itemsets are loaded into the systolic array for comparison with transactions that have been trimmed already. The above five processes are executed repeatedly until all frequent itemsets have been found. 4.2 Pipeline Design We observe that the transaction trimming and the hash table building procedures are blocked by the support counting procedure. The transaction trimming procedure has to obtain trimming information to execute the trimming process. However, this process cannot be completed until the support counting procedure compares all the transac- tions with all the candidate itemsets. In addition, the hash table building procedure has to get the trimmed transactions from the trimming filter after all the transactions have been trimmed. This problem can be resolved by applying the pipeline scheme, which utilize the three hardware modules simultaneously in the HAPPI framework. First, we divide the database into Npipe parts. 
One part of the transactions in the database is streamed into the systolic array and the support counting process is performed on all candidate itemsets. After comparing the transactions with all the candidate itemsets, the transactions and their trimming information are passed to the trimming filter first. The systolic array then processes the next group of transactions. After items have been trimmed from a transaction by the trimming filter, the transaction is passed to the hash table filter, as shown in Fig. 6, and the trimming filter can deal with the next transaction. In this way, all the hardware modules can be utilized simultaneously. Although the pipelined architecture improves the system’s performance, it increases the computational overhead because of multiple times of loading candidate itemsets into the systolic array. The performance of the pipeline scheme and the improved design of the HAPPI architecture are discussed in Section 4.5. 4.3 Transaction Trimming While the support counting procedure is being executed, the whole database is streamed into the systolic array. However, not all the transactions are useful for generating frequent itemsets. Therefore, we filter out items in the transactions according to Theorem 2 so that the database is reduced. In the HAPPI architecture, the trimming information records the frequency of each item in a transaction that appears in the candidate itemsets. The support counting and trimming information collecting operations are similar since they all need to compare candidate itemsets with transactions. Therefore, in addition to transactions in the database, their corresponding trimming information is also fed into the systolic array in another pipe, while the support counting process is being executed. As shown in Fig. 7, a trimming vector is embedded in each hardware cell of the systolic array to record items that are matched with candidate 788 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 20, NO. 6, JUNE 2008 Fig. 5. The procedure flow of one round. Fig. 6. A diagram of the pipeline procedures.
  • 6. itemsets. The ith flag in the trimming vector is set to true if the ith item in the transaction matches the candidate itemset. After comparing the candidate itemset with all the items in a transaction, if the candidate itemset is a subset of the transaction, the incoming corresponding trimming informa- tion will be accumulated according to the trimming vector. Since transactions and trimming information are input in different pipes, support counters and trimming information can be updated simultaneously in a hardware cell. In Fig. 7a, the candidate itemset BC is stored in the candidate memory, and a transaction fA; B; C; D; Eg is about to be fed into the cell. The resultant trimming vector after comparing BC with all the items in the transac- tion is shown in Fig. 7b. Because items B and C are matched with the candidate itemset, the trimming vector becomes 0; 1; 1; 0; 0 . Meanwhile, the corresponding trimming information is fed into the trimming register, and the trimming information is updated from 0; 1; 1; 0; 1 to 0; 2; 2; 0; 1 . After passing through the systolic array, transactions and their corresponding trimming information are passed to the trimming filter. The filter trims off items whose frequencies are less than k. As the example in Fig. 8 shows, the trimming information of the transaction fA; B; C; D; Eg is 2; 2; 2; 1; 2 and the current k is 2. Therefore, the item D should be trimmed. The new transaction becomes fA; B; C; Dg. In this way, the size of the database can be reduced. The trimmed transactions are sent to the hash table filter module for hash table building. 4.4 Hash Table Filtering To build a hardware hash table filter, we use a hash value generator and hash table updating module. The former generates all the k-itemset combinations of the transactions and puts the k-itemsets into the hash function to create the corresponding hash values. As shown in Fig. 9, the hash value generator comprises a transaction memory, a state machine, an index array, and a hash function. The transaction memory stores all the items of a transaction. The state machine is the controller that generates control signals of different lengths ðk ¼ 2; 3 . . .Þ flexibly. Then, the control signals are fed into the index array. To generate a k-itemset, the first k entries in the index array are utilized. The values in the index array are the indices of the transaction memory. The item selected by the ith entry of the index array is the ith item in a k-itemset. By changing the values in the index array, the state machine can generate different combinations of k-itemsets from the transaction. The procedure starts by loading a transaction into the transaction memory. Then, the values in the index array are reset, and the state machine starts to generate control signals. The values in the index array are changed by the different states. Each item in the generated itemset is passed to the hash function through the multiplexer. The hash function takes some bits from the incoming k-itemsets to calculate the hash values. WEN ET AL.: HARDWARE-ENHANCED ASSOCIATION RULE MINING WITH HASHING AND PIPELINING 789 Fig. 7. An example of streaming a transaction and the corresponding trimming information into the cell. (a) Stream a transaction into the cell. (b) Stream trimming information into the cell. Fig. 8. The trimming filter. Fig. 9. The hash value generator.
  • 7. Consider the example in Fig. 9. We assume the current k is 3. The first three index entries in the index array are used in this case. The transaction fA; C; E; F; Gg is loaded into the transaction memory. The values in the index array are initiated to 0, 1, and 2, respectively, so that the first itemset generated is ACE . Then, the state machine changes the values in the index array. The following numbers in the index array will be 0; 1; 3 , 0; 1; 4 , 0; 2; 3 , 0; 2; 4 , to name a few. Therefore, the corresponding itemsets are ACF , ACG , AEF , AEG , and so on. The hash values generated by the hash value generator are passed to the hash table updating module. To speed up the process of hash table building, we utilize Nparallel hash value generators so that the hash values can be generated simultaneously. In addition, the hash table is divided into several parts to increase the throughput of hash table building. Each part of the hash table contains a range of hash values, and the controller passes the incoming hash values to the buffer they belong to. These hash values are taken as indexes of the hash table to accumulate the values in the table, as shown in Fig. 10. There are four parallel hash value generators. The size of the whole hash table is 65,536, and it is divided into four parts. Thus, the range of each part is 16,384. If the incoming hash value is 5, it belongs to the first part of the hash table. The controller would pass the value to buffer 1. If there are parallel accesses to the hash table at the same time, only one access can be executed. The others will be delayed and be handled as soon as possible. The delayed itemsets are stored in the buffer temporally. Whenever the access port of hash table is free, the delayed itemsets are put into the hash table. After all the candidate k-itemsets have been generated, they are pruned by the hash table filter. Each candidate itemset is hashed by the hash function. By querying the number of itemsets in the bucket with the corresponding hash value, the candidate itemset is pruned if the number of itemsets in the bucket does not meet the minimum support criteria. Therefore, the number of the candidate itemsets can be reduced effectively with the help of the hash table filter. 4.5 Performance Analysis In this section, we derive some properties of our system with and without the pipeline scheme to investigate the total execution time. Suppose the number of candidate k-itemsets is NcandÀk and the number of frequent k-itemsets is NfreqÀk: There are Ncell hardware cells in the systolic array. jTj represents the average number of items in a transaction, and jDj is the total number of items in the database. As shown in Fig. 5, the time needed to find frequent k-itemsets and candidate ðk þ 1Þ-itemsets includes the time required for support counting, transaction trimming, hash table building, candidate generation, and candidate pruning. First, the execution time of the support counting procedure is related to the number of times candidate itemsets and the database are loaded into the systolic array. That is, if NcandÀk is larger than Ncell, the candidate itemsets and the database must be input into the systolic array dNcandÀk=Ncelle times. Each time there are, at most, Ncell candidate itemsets loaded into the systolic array. The number of items in these candidate k-itemsets is, at most, k à Ncell. In addition, all items in the database need jDj cycles to be streamed into the systolic array. 
4.5 Performance Analysis

In this section, we derive some properties of our system, with and without the pipeline scheme, to investigate the total execution time. Suppose the number of candidate k-itemsets is N_cand-k and the number of frequent k-itemsets is N_freq-k. There are N_cell hardware cells in the systolic array, |T| represents the average number of items in a transaction, and |D| is the total number of items in the database. As shown in Fig. 5, the time needed to find the frequent k-itemsets and the candidate (k+1)-itemsets includes the time required for support counting, transaction trimming, hash table building, candidate generation, and candidate pruning.

First, the execution time of the support counting procedure is related to the number of times the candidate itemsets and the database are loaded into the systolic array. That is, if N_cand-k is larger than N_cell, the candidate itemsets and the database must be input into the systolic array ⌈N_cand-k / N_cell⌉ times. Each time, at most N_cell candidate itemsets are loaded into the systolic array, so the number of items in these candidate k-itemsets is at most k × N_cell. In addition, all items in the database need |D| cycles to be streamed into the systolic array. Therefore, the execution cycle of the support counting procedure is at most

t_sup = ⌈N_cand-k / N_cell⌉ × (k × N_cell + |D|).

Second, the transaction trimming procedure eliminates infrequent items and receives incoming items at the same time. A transaction item and the corresponding trimming information are fed into the trimming filter during each cycle. After the whole database has passed through the trimming filter, the transaction trimming procedure is finished. Thus, the execution cycle depends on the number of items in the database:

t_trim = |D|.

Third, the hash table building procedure consists of the hash value generation and the hash table updating processes. Because the two processes can be executed simultaneously, the execution time is determined by the process that generates the hash values, which consists of the time taken by transaction loading and by hash value generation from the transactions. The overall transaction loading time is |D| cycles. In addition, there are |T| items in a transaction on average, so the number of (k+1)-itemset combinations from a transaction is C(|T|, k+1), and each (k+1)-itemset requires (k+1) cycles to be generated. The average number of transactions in the database is |D| / |T|. Therefore, the execution time of hash value generation for the whole database is (k+1) × C(|T|, k+1) × (|D| / |T|) cycles. Because we have designed a parallel architecture, the procedure can be executed by N_parallel hardware modules simultaneously. The execution cycle of the hash table building procedure is

t_hash = |D| + (k+1) × C(|T|, k+1) × (|D| / |T|) × (1 / N_parallel).

Fig. 10. The parallel hash table building module.
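The three dominant cycle counts above translate directly into a small Python cost model, which is convenient for back-of-the-envelope estimates; the parameter names (n_cand_k for N_cand-k, d for |D|, t_avg for |T|, and so on) are ours, and the model is only as accurate as the formulas it restates.

```python
from math import ceil, comb

def t_sup(n_cand_k, n_cell, k, d):
    # Support counting: one database pass per group of N_cell candidate k-itemsets.
    return ceil(n_cand_k / n_cell) * (k * n_cell + d)

def t_trim(d):
    # Transaction trimming: a single pass over the |D| items of the database.
    return d

def t_hash(d, t_avg, k, n_parallel):
    # Hash table building: stream the database once, plus (k+1) cycles for each
    # of the C(|T|, k+1) combinations of every transaction, spread over
    # N_parallel hash value generators.
    return d + (k + 1) * comb(t_avg, k + 1) * (d / t_avg) / n_parallel
```

For example, t_sup(3000, 500, 2, 10**6) estimates the support counting cycles for the k = 2 setting discussed later in this section (treating |D| = 1 million as the item count).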
The fourth procedure is candidate generation. Frequent k-itemsets are compared with other frequent k-itemsets to generate candidate (k+1)-itemsets. The execution time of candidate generation consists of the time required to load the frequent k-itemsets (at most k × N_cell each time), the time taken to pass the frequent k-itemsets (k × N_freq-k) through the systolic array, and the time needed to generate the candidate (k+1)-itemsets (N_cand-(k+1)). Similar to the support counting procedure, if N_freq-k is larger than N_cell, the frequent k-itemsets have to be separated into several groups and the comparison process has to be executed several times. Thus, the execution cycle is at most

t_candidate_generation = ⌈N_freq-k / N_cell⌉ × (k × N_cell + k × N_freq-k) + N_cand-(k+1).

Finally, the candidate pruning procedure has to hash the candidate (k+1)-itemsets and query the hash table H_{k+1}. Each (k+1)-itemset requires (k+1) cycles to be hashed, and the hash table can accept one input query during each cycle. Thus, the execution time needed to hash the candidate (k+1)-itemsets ((k+1) × N_cand-(k+1)) and to query the hash table (N_cand-(k+1)) is

t_candidate_pruning = (k+1) × N_cand-(k+1) + N_cand-(k+1).

Since the size of the database that we consider is much larger than the number of candidate k-itemsets, we can neglect the execution time of candidate generation and pruning. Therefore, the time required for one round of the sequential execution, t_seq, is the sum of the time taken by the support counting, transaction trimming, and hash table building procedures, as shown in Property 1.

Property 1. t_seq = t_sup + t_trim + t_hash.

The pipeline scheme incorporated in the HAPPI architecture divides the database into N_pipe parts and inputs them into the three modules. However, this scheme causes some overhead, t_overhead, because the candidate itemsets are reloaded multiple times in the support counting procedure:

t_overhead = ⌈N_cand-k / N_cell⌉ × (k × N_cell) × N_pipe.

Therefore, the support counting procedure has to account for t_overhead, and its execution time in the pipeline scheme becomes

t'_sup = t_sup + t_overhead.

The execution time of the pipeline scheme, t_pipe, is analyzed according to the following two cases:

Case 1. If the execution time of the support counting procedure is longer than that of the hash table building procedure, the other procedures finish their operations before the support counting procedure. However, the transaction trimming and hash table building procedures still have to process the last part of the data produced by the support counting procedure. Therefore, the total execution time is t'_sup plus the time required to process the last part of the database with the trimming filter and the hash table filter:

t_pipe = t'_sup + (t_trim + t_hash) × (1 / N_pipe).

Case 2. If the execution time of the support counting procedure is less than that of the hash table building procedure, the other procedures are completed before the hash table building procedure. Since the hash table building procedure has to wait for the first part of the data from the support counting and transaction trimming procedures, the total execution time is t_hash plus the time required to process the first part of the database with the support counting module and the trimming filter:

t_pipe = (t'_sup + t_trim) × (1 / N_pipe) + t_hash.

Summarizing the above two cases, the execution time t_pipe can be presented as Property 2.

Property 2. t_pipe = max(t'_sup, t_hash) + min(t'_sup, t_hash) × (1 / N_pipe) + t_trim × (1 / N_pipe).
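Property 2 can likewise be written down as a single Python function (again with our own variable names), which makes it easy to see how the choice of N_pipe trades the candidate reload overhead inside t'_sup against the 1/N_pipe terms.

```python
from math import ceil

def t_pipe(t_sup_cycles, t_trim_cycles, t_hash_cycles,
           n_cand_k, n_cell, k, n_pipe):
    # Candidate reload overhead grows with the number of pipeline slices.
    overhead = ceil(n_cand_k / n_cell) * k * n_cell * n_pipe
    t_sup_p = t_sup_cycles + overhead   # t'_sup
    # The slower of support counting and hash table building dominates; the
    # faster one and the trimming pass contribute only one slice's worth of work.
    return (max(t_sup_p, t_hash_cycles)
            + min(t_sup_p, t_hash_cycles) / n_pipe
            + t_trim_cycles / n_pipe)
```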
To achieve the minimal value of t_pipe, we consider the following two cases:

1. If t'_sup is larger than t_hash, the execution time t_pipe is dominated by t'_sup. To decrease the value of t'_sup, we can increase the number of hardware cells in the systolic array until t'_sup is equal to t_hash. Therefore, the optimal value of N_pipe to reach the minimal t_pipe is

N_pipe = sqrt( (t_trim + t_hash) / (⌈N_cand-k / N_cell⌉ × k × N_cell) ).

2. If t'_sup is smaller than t_hash, the execution time t_pipe is mainly taken up by t_hash. To decrease the value of t_hash, we can increase N_parallel until t_hash is equal to t'_sup. As a result, the optimal value of N_pipe to achieve the minimal t_pipe in this case is

N_pipe = (t_hash − t_sup) / (⌈N_cand-k / N_cell⌉ × k × N_cell).

To decide the values of N_cell and N_parallel in the HAPPI architecture, we have to know the value of N_cand-k. However, N_cand-k varies with k, so we decide these values according to our experimental experience. Generally speaking, for many types of real-world data, the number of candidate 2-itemsets is the largest. Thus, we focus on the case k = 2, since its execution time is the largest among all the passes. For a data set with |T| = 10 and |D| = 1 million, if the minimum support is 0.2 percent, N_cand-k is about 3,000. Assume that there are 500 hardware cells in the systolic array. To accelerate the hash table building procedure, we can increase N_parallel; based on Property 2, the best value of N_parallel is 4. In addition, after the transactions are trimmed by the trimming filter, we know the current number of items in the database, and N_cand-k can be obtained after the candidate k-itemsets are pruned by the hash table filter. Therefore, we can calculate the values of t_sup and t_hash before starting the support counting and the hash table building procedures. Since t_overhead is much smaller than the size of the database under consideration, t_sup can be viewed as t'_sup. Based on the formulas derived above, we can obtain the best value of N_pipe that minimizes t_pipe.
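The two optimal values of N_pipe can be combined into a small helper. As in the text, t_sup is used in place of t'_sup when deciding which case applies, since the overhead term is comparatively small; this is a sketch with our own names, and it simply rounds to an integer number of database parts.

```python
from math import ceil, sqrt

def best_n_pipe(t_sup_cycles, t_trim_cycles, t_hash_cycles, n_cand_k, n_cell, k):
    # Per-slice cost of reloading the candidate itemsets into the systolic array.
    c = ceil(n_cand_k / n_cell) * k * n_cell
    if t_sup_cycles >= t_hash_cycles:
        # Case 1: support counting dominates.
        n = sqrt((t_trim_cycles + t_hash_cycles) / c)
    else:
        # Case 2: hash table building dominates.
        n = (t_hash_cycles - t_sup_cycles) / c
    return max(1, round(n))
```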
When the support counting procedure is dealing with candidate 2-itemsets and the hash table building procedure is about to build H3, we divide the database into 30 parts, i.e., N_pipe = 30. By applying the pipeline scheme to these three hardware modules, the hardware utilization increases and the waste due to blocking is reduced.

5 EXPERIMENT RESULTS

In this section, we conduct several experiments on a number of synthetic data sets to evaluate the performance of the HAPPI architecture. We also implement an approach based mainly on [3], abbreviated as the Direct Comparison (DC) method, for comparison purposes. Although a hardware algorithm was also proposed in [4], its performance improvement was found to be much smaller than that of DC, whereas HAPPI outperforms DC by orders of magnitude. Moreover, we implement the software algorithm DHP [16], denoted by SW_DHP, as the baseline. The software algorithm is executed on a PC with a 3-GHz Pentium 4 CPU and 1 Gbyte of RAM. Both HAPPI and DC are implemented on the Altera Stratix 1S40 FPGA board with a 50-MHz clock rate and 10 Mbytes of SDRAM.

The hardware modules are coded in Verilog. We use ModelSim to simulate the Verilog code and verify the functions of our design. In addition, we use the Altera Quartus II IDE to build the hardware modules and synthesize them into hardware circuits. Finally, the hardware circuit image is sent to the FPGA board. There is a synthesized CPU on the FPGA board, and we implement a software program on NIOS II to verify the results and to record the hardware execution time. At first, the program downloads data from the database into the memory on the FPGA. Then, the data is fed into the hardware modules. The bandwidth consumed from the memory to the chip is 16 bits/cycle in our design. Since the data is transferred on the bus, the maximum bandwidth is limited by the bandwidth of the bus on the FPGA, which is generally 32 bits/cycle; it can be upgraded to 64 bits/cycle on some modern FPGA boards. After the execution of the hardware modules, we can acquire the outcomes and the execution cycles. The following experimental results are based on execution cycles on the FPGA board.

In our hardware design, the critical path is in the hash table building module. There are many logic combinations in this module, so the synthesized hardware core is complex. This bounds the maximum clock frequency of our design to 58.6 MHz. However, since the operating clock frequency on the Altera Stratix 1S40 is 50 MHz, our design meets the hardware requirement. In future work, we will try to optimize this bottleneck module and increase the clock frequency of the hardware architecture.

In the hardware implementation of the HAPPI architecture, N_parallel is set to 4, the number of hardware cells in the systolic array is 500, there are 65,536 buckets in the hash table, and N_pipe is assigned according to the methodology in Section 4.5. The method used to generate synthetic data is described in Section 5.1. The performance comparison of several schemes in the HAPPI architecture and DC is discussed in Section 5.2. Section 5.3 presents the performance analysis of different distributions of frequent itemsets. Finally, in Section 5.4, the results of some scale-up experiments are discussed.

5.1 Generation of Synthetic Data

To obtain reliable experimental results, we employ methods similar to those used in [16] to generate the synthetic data sets.
These data sets are generated with the following parameters: T represents the average number of items in a transaction of the database, I denotes the average length of the maximal potentially frequent itemsets, D is the number of transactions in the database, L is the number of maximal potentially frequent itemsets, and N is the number of different items in the database. Table 1 summarizes the parameters used in our experiments. In the following data sets, L is set to 3,000. To evaluate the performance of HAPPI, we conduct several experiments with different data sets, and we use TxIyDz to denote a data set with T = x, I = y, and D = z. In addition, sensitivity analysis and scale-up experiments are also explored with different data sets. Note that the y-axis of the following figures shows the execution cycles in logarithmic scale.

TABLE 1. Summary of the Parameters Used

5.2 Performance Evaluation

Initially, we conduct experiments to evaluate the performance of several schemes in the HAPPI architecture and DC. The testing data sets are T10I4D100 with different numbers of items in the database, and the minimum support is set to 0.5 percent. As shown in Fig. 11, the four schemes are

1. the DC scheme,
2. the systolic array with a trimming filter,
3. the combined scheme made up of the systolic array, the trimming filter, and the hash table filter, and
4. the overall HAPPI architecture with the pipeline design.

Fig. 11. The execution cycles of several schemes.
With the trimming filter, the performance improves by about 10 percent to 70 percent compared with DC. As the number of different items in the database increases, the improvement due to the trimming filter grows. The reason is that, if the number of different items grows, the number of infrequent itemsets also increases; therefore, the filtering effect of the trimming filter becomes more remarkable. Moreover, the execution cycle of the combined scheme, with the help of the hash table filter, is about 25 to 51 times faster, and the HAPPI architecture with the pipeline scheme is 47 to 122 times faster than DC. Note that the main improvement in execution time comes from the hash table filter. We not only implemented an efficient hardware module for the hash table filter in the HAPPI architecture but also designed the pipeline scheme so that the hash table filter works together with the other hardware modules. Pipelining and parallelism are two helpful properties of a hardware architecture, and we utilize these hardware design techniques to accelerate the overall system. As shown in Fig. 11, although the combined scheme already provides a substantial performance boost, the overall HAPPI architecture with the pipeline design improves the performance further. In summary, the HAPPI architecture outperforms DC, especially when the number of different items in the database is large.

5.3 Sensitivity Analysis

In the second experiment, we generate several data sets with different distributions of frequent itemsets to examine the sensitivity of the HAPPI architecture. The experimental results for several synthetic data sets with various minimum supports are shown in Figs. 12 and 13. The results show that, no matter what combination of the parameters T and I is used, the HAPPI architecture consistently outperforms DC and SW_DHP. Specifically, the execution time of HAPPI is less than that of DC and that of SW_DHP by several orders of magnitude, and the margin grows as the minimum support increases. As Fig. 12 shows, HAPPI has a better performance enhancement ratio over DC on the T5I2D100K data set than on the T20I8D100K data set. The reason is that the hash table filter is especially effective in eliminating infrequent candidate 2-itemsets, as reported in the experimental results of DHP [16]; thus, SW_DHP also performs better on these data sets. In addition, the execution time t_pipe is mainly related to the number of times, ⌈N_cand-k / N_cell⌉, that the database is reloaded. Since N_cand-k can be substantially reduced when k is small, the overall performance enhancement is remarkable. Since the average size of the maximal potentially frequent itemsets in the data set T20I8D100K is 8, the performance of the HAPPI architecture is only 2.65 times faster. However, most data in the real world contains short frequent itemsets, that is, T and I are small. It is noted that DC provides only a small improvement over SW_DHP on short frequent itemsets, while HAPPI performs much better. Therefore, the HAPPI architecture can perform well on real-world data sets.

Fig. 13 demonstrates that the improvement of the HAPPI architecture over DC becomes more noticeable with increasing minimum support. This is because more long itemsets are eliminated with a large minimum support, so the improvement due to the hash table filter increases. In comparison with DC, the overall performance is outstanding.

Fig. 12. The execution time of data sets with different T and I.
Fig. 13. The execution cycles with various minimum supports.
5.4 Scale-up Experiment

According to the performance analysis in Section 4.5, the execution time t_pipe is mainly related to the number of times, ⌈N_cand-k / N_cell⌉, that the database is reloaded into the systolic array. Therefore, if N_cell increases, fewer passes over the database are needed and the overall execution time decreases. Fig. 14 illustrates the scaling performance of the HAPPI architecture and DC, where the y-axis is also in logarithmic scale. The execution cycles of both HAPPI and DC decrease linearly as the number of hardware cells increases, and HAPPI outperforms DC for all numbers of hardware cells in the systolic array. The most important result is that we utilize only 25 hardware cells in the systolic array but achieve the same performance as the 800 hardware cells used in DC. The benefit of the HAPPI architecture is thus more computing power at a lower hardware cost for data mining.

Fig. 14. The execution cycles with different numbers of hardware units.
Fig. 15. The execution cycles with various numbers of transactions, z (K).

We also conduct experiments with different numbers of transactions in the synthetic data sets to explore the scalability of the HAPPI architecture. The generated data sets are T10I4, and the minimum support is set to 0.5 percent. As shown in Fig. 15, the execution time of HAPPI increases linearly as the number of transactions in the synthetic data sets increases, and HAPPI outperforms DC for all numbers of transactions in the database. Furthermore, Fig. 15 shows the good scalability of both HAPPI and DC. This feature is especially important because
the size of applications is growing much faster than the speed of CPUs; thus, hardware-enhanced data mining techniques are imperative.

6 CONCLUSION

In this work, we have proposed the HAPPI architecture for hardware-enhanced association rule mining. The bottleneck of Apriori-based hardware schemes is related to the number of candidate itemsets and the size of the database. To solve this problem, we apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and collect useful information that reduces the number of candidate itemsets and the number of items in the database simultaneously. HAPPI prunes infrequent items in the transactions and gradually reduces the size of the database by utilizing the trimming filter. In addition, HAPPI effectively eliminates infrequent candidate itemsets with the help of the hash table filter. Therefore, the bottleneck of Apriori-based hardware schemes can be resolved by the HAPPI architecture. Moreover, we derive some properties to analyze the performance of HAPPI and conduct a sensitivity analysis of various parameters to provide insights into the HAPPI architecture. HAPPI outperforms the previous approach, especially as the number of different items in the database and the minimum support increase. Also, compared with the previous approach, HAPPI increases computing power and saves hardware costs for data mining. Furthermore, HAPPI possesses good scalability.

ACKNOWLEDGMENTS

The authors would like to thank Wen-Tsai Liao at Realtek for his helpful comments to improve this paper. The work was supported in part by the National Science Council of Taiwan under Contract NSC93-2752-E-002-006-PAE.

REFERENCES

[1] R. Agarwal, C. Aggarwal, and V. Prasad, "A Tree Projection Algorithm for Generation of Frequent Itemsets," J. Parallel and Distributed Computing, 2000.
[2] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. 20th Int'l Conf. Very Large Databases (VLDB), 1994.
[3] Z.K. Baker and V.K. Prasanna, "Efficient Hardware Data Mining with the Apriori Algorithm on FPGAs," Proc. 13th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2005.
[4] Z.K. Baker and V.K. Prasanna, "An Architecture for Efficient Hardware Data Mining Using Reconfigurable Computing Systems," Proc. 14th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '06), pp. 67-75, Apr. 2006.
[5] C. Besemann and A. Denton, "Integration of Profile Hidden Markov Model Output into Association Rule Mining," Proc. 11th ACM SIGKDD Int'l Conf. Knowledge Discovery in Data Mining (KDD '05), pp. 538-543, 2005.
[6] C.W. Chen, J. Luo, and K.J. Parker, "Image Segmentation via Adaptive K-Mean Clustering and Knowledge-Based Morphological Operations with Biomedical Applications," IEEE Trans. Image Processing, vol. 7, no. 12, pp. 1673-1683, 1998.
[7] S.M. Chung and C. Luo, "Parallel Mining of Maximal Frequent Itemsets from Databases," Proc. 15th IEEE Int'l Conf. Tools with Artificial Intelligence (ICTAI), 2003.
[8] S. Cong, J. Han, J. Hoeflinger, and D. Padua, "A Sampling-Based Framework for Parallel Data Mining," Proc. 10th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP '05), June 2005.
[9] M. Estlick, M. Leeser, J. Szymanski, and J. Theiler, "Algorithmic Transformations in the Implementation of K-Means Clustering on Reconfigurable Hardware," Proc. Ninth Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2001.
[10] M. Gokhale, J. Frigo, K. McCabe, J. Theiler, C. Wolinski, and D. Lavenier, "Experience with a Hybrid Processor: K-Means Clustering," J. Supercomputing, pp. 131-148, 2003.
[11] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann, 2001.
[12] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proc. ACM SIGMOD '00, pp. 1-12, May 2000.
[13] H. Kung and C. Leiserson, "Systolic Arrays for VLSI," Proc. Sparse Matrix, 1976.
[14] N. Ling and M. Bayoumi, Specification and Verification of Systolic Arrays. World Scientific Publishing, 1999.
[15] W.-C. Liu, K.-H. Liu, and M.-S. Chen, "High Performance Data Stream Processing on a Novel Hardware Enhanced Framework," Proc. 10th Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD '06), Apr. 2006.
[16] J.S. Park, M.-S. Chen, and P.S. Yu, "An Effective Hash Based Algorithm for Mining Association Rules," Proc. ACM SIGMOD '95, pp. 175-186, May 1995.
[17] J.S. Park, M.-S. Chen, and P.S. Yu, "Using a Hash-Based Method with Transaction Trimming for Mining Association Rules," IEEE Trans. Knowledge and Data Eng., vol. 9, no. 5, pp. 813-825, Sept./Oct. 1997.
[18] A. Savasere, E. Omiecinski, and S. Navathe, "An Efficient Algorithm for Mining Association Rules in Large Databases," Proc. 21st Int'l Conf. Very Large Databases (VLDB '95), pp. 432-444, Sept. 1995.
[19] H. Toivonen, "Sampling Large Databases for Association Rules," Proc. 22nd Int'l Conf. Very Large Databases (VLDB '96), pp. 134-145, 1996.
[20] C. Wolinski, M. Gokhale, and K. McCabe, "A Reconfigurable Computing Fabric," Proc. Int'l Conf. Eng. of Reconfigurable Systems and Algorithms (ERSA), 2004.

Ying-Hsiang Wen received the BS degree in computer science from the National Chiao Tung University and the MS degree in electrical engineering from the National Taiwan University, Taipei, in 2006. His research interests include data mining, video streaming, and multimedia SoC design.

Jen-Wei Huang received the BS degree in electrical engineering from the National Taiwan University, Taipei, in 2002, where he is currently working toward the PhD degree in computer science. He is familiar with the data mining area. His research interests include data mining, mobile computing, and bioinformatics; among these, web mining, incremental mining, mining data streams, time series issues, and sequential pattern mining are his special interests. In addition, some of his research is on mining general temporal association rules, sequential clustering, data broadcasting, progressive sequential pattern mining, and bioinformatics.

Ming-Syan Chen received the BS degree in electrical engineering from the National Taiwan University, Taipei, and the MS and PhD degrees in computer, information, and control engineering from the University of Michigan, Ann Arbor, in 1985 and 1988, respectively. He was the chairman of the Graduate Institute of Communication Engineering (GICE), National Taiwan University, from 2003 to 2006. He is currently a distinguished professor jointly appointed by the Electrical Engineering Department, the Computer Science and Information Engineering Department, and GICE, National Taiwan University. He was a research staff member at the IBM T.J. Watson Research Center, New York, from 1988 to 1996. He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering from 1997 to 2001 and is currently on the editorial board of the Very Large Data Base (VLDB) Journal and Knowledge and Information Systems. His research interests include database systems, data mining, mobile computing systems, and multimedia networking. He has published more than 240 papers in his research areas. He is a recipient of the National Science Council (NSC) Distinguished Research Award, the Pan Wen Yuan Distinguished Research Award, the Teco Award, the Honorary Medal of Information, and the K.-T. Li Research Breakthrough Award for his research work, as well as the IBM Outstanding Innovation Award for his contribution to a major database product. He also received numerous awards for his teaching, inventions, and patent applications. He is a fellow of the ACM and the IEEE.